1 Introduction

In this work we consider a distribution dependent SDE (henceforth DDSDE) of the form

$$\begin{aligned} X_t = \xi + \int _0^t B_s(X_s,\mathcal {L}(X_s))\mathop {}\!\text {d}s + W_t \end{aligned}$$
(1.1)

where \(B : \mathbb {R}_+\times \mathbb {R}^d \times \mathcal {P}(\mathbb {R}^d)\rightarrow \mathbb {R}^d\), \(\xi \) is an \(\mathbb {R}^d\)-valued random variable and W is an \(\mathbb {R}^d\)-valued stochastic process independent of \(\xi \). The drift B and the law of \((\xi ,W)\) are prescribed, while the process X is the unknown and \(\mathcal {L}(X_t)\) denotes the law of its marginal at time t.

Usually in the literature W is sampled as a standard Brownian motion; in this case the DDSDE is also called a McKean–Vlasov SDE, after the pioneering work [34] where it was first introduced.

The importance of McKean–Vlasov equations is due to their connection to systems of N particles subject to a mean field interaction of the form

$$\begin{aligned} X^{i,N}_t = \xi ^i + \int _0^t B_s\big (X^{i,N}_s, L^N\big (X^{(N)}_s\big )\big ) \mathop {}\!\text {d}s + W^{i}_t, \quad L^N\big (X^{(N)}_t\big ):=\frac{1}{N}\sum _{i=1}^N \delta _{X^{i,N}_t} \end{aligned}$$
(1.2)

where \((\xi ^i,W^i)\) are typically taken to be i.i.d. copies of \((\xi ,W)\) and \(L^N\big (X^{(N)}_t\big )\) stands for the empirical measure of the system at time t. One expects the DDSDE (1.1) to be the mean field limit of (1.2) in the sense that, as N goes to infinity, \(L^N\big (X^{(N)}_t\big )\) converges weakly to \(\mathcal {L}(X_t)\) with probability 1.

Another feature of DDSDEs in the Brownian noise case is their connection to nonlinear Fokker–Planck PDEs (also called McKean–Vlasov equations) of the form

$$\begin{aligned} \partial _t \rho + \nabla \cdot (B_t(\,\cdot \,, \rho )\, \rho ) = \frac{1}{2} \Delta \rho , \quad \rho _0=\mathcal {L}(\xi ), \end{aligned}$$
(1.3)

which describe the evolution of the marginal \(\rho _t=\mathcal {L}(X_t)\); in particular, both (1.1) and (1.3) provide a macroscopic, compact description of the system (1.2), allowing one to reduce its complexity. For this reason, DDSDEs have found applications in numerous fields, see the review [27] and the references therein; let us also mention their connection to mean-field games [31].

Classical results concerning the well-posedness of the DDSDE (1.1) and the mean-field limit property go back to Sznitman [44] and Gärtner [20]; in recent years the field has witnessed substantial contributions both from the analytic and probabilistic communities. On the one hand, new methods based on entropy inequalities [6, 14, 28] and modulated energy methods [41, 42] have allowed for the rigorous derivation of mean field limits for fairly singular B; on the other hand, DDSDEs with irregular drifts are related to the flourishing field of regularization by noise phenomena. The latter topic was initiated by Zvonkin [49] and Veretennikov [48] in the case of standard SDEs, see [13] for a general overview; recently many authors have applied similar techniques in the DDSDE case, see for instance [5, 10, 25, 35, 40].

Contrary to the previously mentioned works, here we will study DDSDEs in which W is sampled as a fractional Brownian motion (fBm for short) of Hurst parameter \(H\in (0,1)\). Our main reasons for doing so are the following:

  1. 1.

    It was shown in [11], revisiting the ideas of Tanaka [45], that for Lipschitz B the mean-field limit of (1.2) to (1.1) holds for any choice of the process W, regardless of it being Markov or a semimartingale. In particular the DDSDE has a physical meaning and still provides a compact description of a much more complex system of interacting particles.

  2. 2.

    Several regularization by noise results for standard SDEs are available for W sampled as an fBm (or similar fractional processes), see [1, 3, 9, 32, 37] for a short selection.

In light of Point 2. above, it is natural to expect similar results to hold for DDSDEs with singular (possibly even distributional in space) drifts and W sampled as an fBm; by Point 1., they are relevant in the study of particle systems with singular interactions (for instance with a discontinuity at the origin, as typical of Coulomb and Riesz-type potentials).

Let us mention that there is a certain degree of arbitrariness in choosing W to be sampled as an fBm, as one could consider other non-Markovian, non-martingale processes. We believe our choice to be simple enough while at the same time representing what one might expect for a larger class of processes (e.g. Gaussian processes satisfying a local non-determinism condition). In this sense, this work also serves as a comparison to the results from [19], where we explored in detail the DDSDE (1.1) in the opposite regime where no assumption whatsoever is imposed on W, thus no regularization can be observed.

Despite the above motivations, singular DDSDEs driven by fBm (or similar fractional processes) have so far not received the same attention as their Brownian counterparts; to the best of our knowledge, the only previous work treating this kind of equation is [4]. After the first version of this manuscript came out, a different approach based on relative entropy methods was proposed in [21].

One possible reason for this lies in the substantial new difficulties presented by such equations: fBm with parameter \(H\ne 1/2\) is neither a Markov process nor a semimartingale, so techniques based on Itô calculus are not applicable. This includes in particular the connection to parabolic semigroups, the martingale problem formulation and the use of the Zvonkin transform (or Itô–Tanaka trick), all techniques used extensively in the aforementioned works in the Brownian case. It also prevents the use of standard arguments, which typically rely on establishing uniqueness of the law \(\rho _t=\mathcal {L}(X_t)\) through PDE analysis of (1.3) and then fixing the law in the DDSDE and treating it as a standard SDE.

Treating DDSDEs driven by fBm thus requires a novel set of tools and ideas; our strategy in this paper builds on the work of Catellier and Gubinelli [9], which represented a major breakthrough in the study of standard SDEs driven by fBm of the form

$$\begin{aligned} X_t = \xi + \int _0^t b_s(X_s)\text {d}s + W_t. \end{aligned}$$
(1.4)

Therein the authors develop a pathwise approach to the equation, based on nonlinear Young integrals and Girsanov transform, that allows one to give meaning to (1.4) and establish its path-by-path uniqueness, for drifts b of poor regularity, possibly even distributional. Their results and techniques have been revisited in subsequent works [17, 18, 22, 23]; in general it suffices to require

$$\begin{aligned} b \in {\left\{ \begin{array}{ll} L^q_T B^\alpha _{\infty ,\infty } &{} \text {with } \alpha>1-\frac{1}{2H}+\frac{1}{Hq}\quad \ \, \text {if } H\le 1/2\\ C^{\alpha H}_T C^0_x \cap C^0_T C^\alpha _x \quad &{}\text {with } \alpha>1-\frac{1}{2H}\qquad \qquad \text {if } H>1/2 \end{array}\right. } \end{aligned}$$
(1.5)

see for instance Theorem 15 and Corollary 2 from [17]. Here \(B^\alpha _{\infty ,\infty }\) denote Besov-Hölder spaces; see Sect. 1.1 below for the relevant definitions and notations in use throughout the article.

For the sake of exposition, let us ignore for the moment the additional time regularity required in (1.5) in the case \(H>1/2\), since it is mostly of a technical nature; then condition (1.5) roughly amounts to the drift b enjoying a spatial regularity \(B^{\alpha }_{\infty ,\infty }\) with \(\alpha >1-1/(2H)\). Observe that for all \(H\in (0,1)\) this includes values \(\alpha <1/2\), while for \(H<1/2\) we are even allowed to take \(\alpha <0\), namely distributional b. To the best of our knowledge, no work after [9] has improved on the allowed range of \(\alpha \).
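
To make the threshold concrete, one may evaluate \(1-1/(2H)\) at a few sample values of H; the three values below are chosen purely for illustration and the time integrability exponent q is ignored (formally \(q=\infty \)):

$$\begin{aligned} H=\tfrac{3}{4}:\ \alpha>\tfrac{1}{3}, \qquad H=\tfrac{1}{2}:\ \alpha>0, \qquad H=\tfrac{1}{4}:\ \alpha >-1. \end{aligned}$$

Thus the rougher the noise, the more singular the admissible drift: for \(H=3/4\) a Hölder exponent slightly above 1/3 suffices, while for \(H=1/4\) distributions of regularity just above \(-1\) are allowed.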

With the above theory at hand, we can interpret the DDSDE (1.1) by rewriting it as

$$\begin{aligned} X_t = \xi + \int _0^t \bar{b}_s(X_s)\mathop {}\!\text {d}s + W_t, \quad \bar{b}_t(\cdot ):= B_t(\cdot ,\mathcal {L}(X_t)); \end{aligned}$$

namely, X solves the SDE with drift \(\bar{b}\), in the Catellier-Gubinelli sense, where \(\bar{b}\) depends in a nontrivial way on the law of X itself. This interpretation comes with a natural fixed point formulation: given a process X, we can associate to it a “flow of measures” \(\mu _t=\mathcal {L}(X_t)\) and a drift \(b^\mu _t := B_t(\cdot , \mu _t)\), then solve the associated SDE, which gives a new process \(Y=\mathcal {I}(X)\); thus X is a solution to (1.1) if and only if it is a fixed point for \(\mathcal {I}\).

Alternatively, one could start with the flow of measures \(\mu _\cdot =\{\mu _t\}_{t\in [0,T]}\) and set up the fixed point procedure for this object, by defining \(\mathcal {J}(\mu _\cdot )_t=\mathcal {L}(X_t)\), where X is the solution to the SDE (1.4) with drift \(b^\mu \). These two interpretations are in fact equivalent: once \(\mu _\cdot \) is completely determined, the DDSDE reduces to a standard SDE with fixed drift \(b^\mu \), to which the previous results can be applied; see Lemma 4.4 for more details. Throughout the article we will exploit both interpretations whenever useful.

Given the above interpretation, we need two main ingredients to develop a solution theory:

  1. 1.

    Firstly, B must have the properties that \(b^\mu \) satisfies (1.5) for any \(\mu _\cdot \) of interest and that the solution-to-drift map \(X(\mapsto \mu _\cdot )\mapsto b^\mu \) is Lipschitz in a suitable topology.

  2. 2.

    Secondly, we must develop stability estimates for the drift-to-solution map \(b\mapsto Y\), in an appropriate topology that complements the stability of \(\mu \mapsto b^\mu \).

Once these points are established, the contractivity of the overall map \(X\mapsto b^\mu \mapsto \mathcal {I}(X)\) follows.

There are however major problems with the program outlined above; to describe them without too many technicalities, let us consider here the most relevant case \(B(\mu )=f*\mu +g\) for time homogeneous \(f,g\in B^\alpha _{\infty ,\infty }\), \(\alpha >1-1/(2H)\). In this case, the map \(\mu \mapsto b^{\mu }\) is naturally Lipschitz in the total variation topology, in the sense that

$$\begin{aligned} \Vert B(\mu ^1)-B(\mu ^2)\Vert _{B^{\alpha }_{\infty ,\infty }} \lesssim \Vert \mu ^1-\mu ^2\Vert _{TV}; \end{aligned}$$

however due to the lack of an underlying parabolic PDE (1.3) (and the associated maximum principle) in the fBm setting, it is not obvious how to control the drift-to-solution map \(b\mapsto Y\) in this topology, i.e. how to bound \(\Vert \mathcal {L}(Y^1_t)-\mathcal {L}(Y^2_t)\Vert _{TV}\) as a function of \(\Vert b^1-b^2\Vert _{B^\alpha _{\infty ,\infty }}\).
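
The total variation bound above can be checked in a few lines. The following sketch assumes that the Littlewood–Paley blocks \(\Delta _n\) are convolution operators (as in the definition recalled in Sect. 1.1 below) and that \(\Vert \cdot \Vert _{TV}\) denotes the total mass of a signed measure. Since g cancels in the difference and \(\Delta _n\) commutes with convolution, Young's inequality gives, for every \(n\ge -1\),

$$\begin{aligned} \Vert \Delta _n (B(\mu ^1)-B(\mu ^2))\Vert _{L^\infty } = \Vert (\Delta _n f)*(\mu ^1-\mu ^2)\Vert _{L^\infty } \le \Vert \Delta _n f\Vert _{L^\infty }\, \Vert \mu ^1-\mu ^2\Vert _{TV}. \end{aligned}$$

Multiplying by \(2^{\alpha n}\) and taking the supremum over n yields \(\Vert B(\mu ^1)-B(\mu ^2)\Vert _{B^\alpha _{\infty ,\infty }}\le \Vert f\Vert _{B^\alpha _{\infty ,\infty }} \Vert \mu ^1-\mu ^2\Vert _{TV}\), so that under these conventions the implicit constant can be taken to be \(\Vert f\Vert _{B^\alpha _{\infty ,\infty }}\).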

One of the main intuitions of the current work, which allows us to overcome this difficulty, is the understanding that although the regularity \(B^\alpha _{\infty ,\infty }\) is needed in order to solve the SDE (1.4), one may establish stability estimates in the weaker norm \(B^{\alpha -1}_{\infty ,\infty }\). Roughly speaking, given two solutions \(X^1,X^2\) to (1.4) associated to different initial data and drifts \((\xi ^i,b^i)\), for any \(p\in [1,\infty )\) we have

$$\begin{aligned} \mathbb {E}\Big [\sup _{t\in [0,T]} |X^1_t-X^2_t|^p\Big ]^{1/p} \lesssim \mathbb {E}\big [\,|\xi ^1-\xi ^2|^p\big ]^{1/p} + \Vert b^1-b^2\Vert _{B^{\alpha -1}_{\infty ,\infty }} \end{aligned}$$
(1.6)

see Theorem 3.13 and Corollary 3.17 for the rigorous statements. This property is naturally analogous to standard ODE theory, where solvability requires b Lipschitz, but stability estimates are in the supremum norm.

In our setting, it implies that B only needs to enjoy some multiscale regularity of the form

$$\begin{aligned} \Vert B(\mu )\Vert _{B^\alpha _{\infty ,\infty }}\lesssim 1, \quad \Vert B(\mu ^1)-B(\mu ^2)\Vert _{B^{\alpha -1}_{\infty ,\infty }} \lesssim d(\mu ^1,\mu ^2) \end{aligned}$$

for another notion of distance \(d(\mu ^1,\mu ^2)\), possibly different from the total variation one. The right choice for d turns out to be the family of p-Wasserstein distances \(d_p(\mu ^1,\mu ^2)\), which complements the bound (1.6) thanks to the basic property \(d_p(\mathcal {L}(X^1_t),\mathcal {L}(X^2_t))\le \mathbb {E}[|X^1_t-X^2_t|^p]^{1/p}\).
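
The basic property just quoted follows directly from the definition of \(d_p\) recalled in Sect. 1.1.3: if \(X^1\) and \(X^2\) are defined on the same probability space, then the joint law \(m=\mathcal {L}(X^1_t,X^2_t)\) is an admissible coupling of the two marginals, hence

$$\begin{aligned} d_p\big (\mathcal {L}(X^1_t),\mathcal {L}(X^2_t)\big )^p \le \int _{\mathbb {R}^d\times \mathbb {R}^d} |x-y|^p\, m(\mathop {}\!\text {d}x,\mathop {}\!\text {d}y) = \mathbb {E}\big [|X^1_t-X^2_t|^p\big ]. \end{aligned}$$

Purely schematically, and suppressing all time dependence in the norms, this can be combined with (1.6) (for a common initial datum) and with the second multiscale bound. Writing \(Y^i\) for the solution to (1.4) with drift \(b^{\mu ^i}\), one expects a chain of the form

$$\begin{aligned} \sup _{t\in [0,T]} d_p\big (\mathcal {L}(Y^1_t),\mathcal {L}(Y^2_t)\big ) \le \mathbb {E}\Big [\sup _{t\in [0,T]} |Y^1_t-Y^2_t|^p\Big ]^{1/p} \lesssim \Vert b^{\mu ^1}-b^{\mu ^2}\Vert _{B^{\alpha -1}_{\infty ,\infty }} \lesssim \sup _{t\in [0,T]} d_p(\mu ^1_t,\mu ^2_t). \end{aligned}$$

This only indicates why \(\mathcal {J}\) should be Lipschitz; turning it into a genuine contraction presumably requires additional care (for instance restricting to short time intervals), which this heuristic does not address.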

Overall, the newly found stability estimate (1.6) and the use of Wasserstein distance allow us to fulfill Points 1.-2. outlined above and to solve the DDSDE (2.1) for a large class of drifts B, see Theorems 2.4 and 2.5 for the precise statements; this includes the case \(B(\mu )=f*\mu +g\) mentioned above.
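
To see why the convolutional case fits this framework, one can argue along the following lines. The sketch assumes time homogeneous \(f,g\in B^\alpha _{\infty ,\infty }\) and takes for granted the Bernstein-type estimate \(\Vert f(\cdot -y)-f(\cdot -z)\Vert _{B^{\alpha -1}_{\infty ,\infty }}\lesssim |y-z|\,\Vert f\Vert _{B^\alpha _{\infty ,\infty }}\), expressing that a difference of translates costs one derivative. For any coupling \(m\in \Pi (\mu ^1,\mu ^2)\), with the notation of Sect. 1.1.3, we have

$$\begin{aligned} B(\mu ^1)-B(\mu ^2) = \int _{\mathbb {R}^d\times \mathbb {R}^d} \big [f(\cdot -y)-f(\cdot -z)\big ]\, m(\mathop {}\!\text {d}y,\mathop {}\!\text {d}z), \end{aligned}$$

and therefore

$$\begin{aligned} \Vert B(\mu ^1)-B(\mu ^2)\Vert _{B^{\alpha -1}_{\infty ,\infty }} \lesssim \Vert f\Vert _{B^\alpha _{\infty ,\infty }} \int _{\mathbb {R}^d\times \mathbb {R}^d} |y-z|\, m(\mathop {}\!\text {d}y,\mathop {}\!\text {d}z). \end{aligned}$$

Taking the infimum over m bounds the right-hand side by \(\Vert f\Vert _{B^\alpha _{\infty ,\infty }}\, d_1(\mu ^1,\mu ^2)\le \Vert f\Vert _{B^\alpha _{\infty ,\infty }}\, d_p(\mu ^1,\mu ^2)\). The uniform bound is simpler: since \(\mu \) is a probability measure, \(\Vert f*\mu +g\Vert _{B^\alpha _{\infty ,\infty }}\le \Vert f\Vert _{B^\alpha _{\infty ,\infty }}+\Vert g\Vert _{B^\alpha _{\infty ,\infty }}\).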

For the sake of this preliminary discussion we have ignored the time regularity requirement in (1.5), but it does indeed play a relevant role, making the proofs a bit more technical and requiring us to treat the cases \(H>1/2\) and \(H\le 1/2\) slightly differently; see Sect. 4 for more details.

Let us stress that, since we are not allowed to use the same tools as in the Brownian setting, our results are not optimal for the choice \(H=1/2\), sharper ones being available for instance in [25, 40]. Nevertheless, they still provide some new insights, with the stability estimate (1.6) being new in this setting as well. This also partially answers the ongoing debate from [25, 26, 40] on whether the drift B should be taken Lipschitz continuous in the measure argument \(\mu \) w.r.t. the total variation distance, the Wasserstein one or a weighted mix of the two: the use of Wasserstein distance allows the drift to be Lipschitz continuous in the different regularity scale \(B^{\alpha -1}_{\infty ,\infty }\), whose regularity index is strictly negative for \(\alpha \in (0,1)\), a regime admissible in (1.5) for \(H=1/2\).

A major open problem coming from this work is the mean-field convergence (and associated propagation of chaos property) of the particle system (1.2) to (1.1), for the class of singular drifts for which we establish well-posedness of the DDSDE in Theorem 2.4. Our techniques are currently not enough to give a full answer; recently, several authors have investigated the Brownian setting using alternative tools based on Girsanov theorem and Large Deviations, see [24, 29, 30, 46]. Contrary to Itô calculus, these tools are available for fBm as well, thus we hope they may be of help in future investigations.

Another interesting question posed by the current work is whether our results can be further improved, in the sense of allowing values of \(\alpha <1-1/(2H)\), at least in some special cases. Theorems 2.6 and 2.7 suggest an affirmative answer for convolutional drifts \(B(\mu )=b*\mu \), see also the discussion at the beginning of Sect. 5; this is in analogy with the Brownian case, where standard SDE theory requires roughly \(b\in L^\infty _x\), but the nonlinear PDE (1.3) can be solved for roughly \(b\in W^{-1,\infty }_x\).

We conclude this introduction with the structure of the paper. In Sect. 1.1 we introduce all relevant notations adopted in the paper and recall some well-known facts. Section 2 contains all our main results and Sect. 2.1 relevant examples of drifts B satisfying them. We present in detail the Catellier–Gubinelli theory of SDEs driven by fBm in Sect. 3, where we prove our main stability results (Theorem 3.13 and Corollary 3.17 from Sect. 3.3) as well as some new auxiliary results on the regularity of the law of solutions (Sect. 3.4). Sections 4 and 5 contain the proofs of our main results: Theorems 2.4 and 2.5 are proved in Sect. 4, Theorems 2.6 and 2.7 in Sect. 5. Finally, we have included in Appendix A a collection of useful analytic lemmas used throughout the paper.

1.1 Notations, conventions and well-known facts

Throughout the article we will always work on a finite time interval [0, T], although arbitrarily large; we will never deal with estimates on the infinite interval \([0,+\infty )\). We write \(a\lesssim b\) whenever there exists a constant \(C>0\) such that \(a\le C b\). To stress the dependence \(C=C(\lambda )\) on a particular parameter \(\lambda \), we will write \(a\lesssim _\lambda b\). For \(p \in [1,\infty ]\) and where it will not cause confusion, we write \(p'\) to denote the dual exponent to p, that is \(1/p+1/p'=1\), with the interpretation \(p=1\iff p'=\infty \).

Throughout the article, whenever not mentioned explicitly, we will consider an underlying probability space \((\Omega ,\mathcal {F},\mathbb {P})\); any \(\sigma \)-algebra appearing is assumed to be \(\mathbb {P}\)-complete. If \(\Omega \) has a topological structure, then \(\mathcal {B}(\Omega )\) denotes its Borel \(\sigma \)-algebra (again up to \(\mathbb {P}\)-completion).

We denote by \(\mathbb {E}_\mathbb {P}\), or simply \(\mathbb {E}\), expectation w.r.t. \(\mathbb {P}\). Given a Banach space E and \(p\in [1,\infty ]\), we will frequently consider E-valued random variables X in the space \(L^p_\Omega E:=L^p(\Omega ,\mathcal {F},\mathbb {P};E)\), with norm \(\Vert X\Vert _{L^p_\Omega } = \mathbb {E}[ \Vert X\Vert _E^p]^{1/p}\) (essential supremum if \(p=\infty \)).

We denote by \(\mathcal {L}_\mathbb {P}(X)\), or simply \(\mathcal {L}(X)\), the law of X on E, namely the pushforward measure \(\mathbb {P}\circ X^{-1} = X \sharp \mathbb {P}\); more generally, we adopt the notation \(F\sharp \mu \) for the pushforward of a measure \(\mu \) under a measurable map F. Given a measure \(\mu \in \mathcal {P}(C_T)\), we mention in particular the pushforward \(\mu _t := e_t \sharp \mu \) where \(e_t (h) = h_t\) denotes the evaluation map, \(e_t:C_T\rightarrow \mathbb {R}^d\).

1.1.1 Function spaces on [0, T]

Given a metric space \((M,d_M)\), we denote by \(C_T M = C([0,T];M)\) the space of all continuous functions \(f:[0,T]\rightarrow M\); for \(\gamma \in (0,1)\), we set \(C^\gamma _T M = C^\gamma ([0,T];M)\) to be the subset of \(\gamma \)-Hölder continuous functions, namely

$$\begin{aligned} \llbracket f\rrbracket _{\gamma ,M}:=\sup _{s\ne t\in [0,T]} \frac{d_M(f_t,f_s)}{|t-s|^\gamma }<\infty . \end{aligned}$$

If \((E,\Vert \cdot \Vert _E)\) is a Banach space, then \(C_T E\) and \(C^\gamma _T E\) are Banach spaces with norms

$$\begin{aligned} \Vert f\Vert _{C_T E} =\sup _{t\in [0,T]} \Vert f_t\Vert _E, \quad \Vert f\Vert _{C^\gamma _T E} = \Vert f\Vert _{C_T E} + \llbracket f\rrbracket _{\gamma ,E}. \end{aligned}$$

In the case \(E=\mathbb {R}^n\) for some \(n\in \mathbb {N}\), whenever it does not create confusion we will simply use \(C_T\), \(C^\gamma _T\) and \(\Vert f\Vert _\gamma \) in place of \(C_T \mathbb {R}^n\), \(C^\gamma _T \mathbb {R}^n\), \(\Vert f\Vert _{C^\gamma _T}\); moreover for any \([a,b]\subset [0,T]\) we set

$$\begin{aligned} \llbracket f\rrbracket _{\gamma ,[a,b]}:=\sup _{s\ne t\in [a,b]} \frac{|f_t-f_s|}{|t-s|^\gamma }. \end{aligned}$$

Given a Banach space E and \(q\in [1,\infty ]\), we denote by \(L^q_T E = L^q(0,T;E)\) the Bochner–Lebesgue space of strongly measurable \(f:[0,T]\rightarrow E\) such that

$$\begin{aligned} \Vert f\Vert _{L^q_T E} = \bigg ( \int _0^T \Vert f_t\Vert _E^q \mathop {}\!\text {d}t\bigg )^{\frac{1}{q}} <\infty \end{aligned}$$

with usual modification for \(q=\infty \); as before we write \(L^q_T\) for \(L^q_T \mathbb {R}^n\).

1.1.2 Function spaces on \(\mathbb {R}^d\)

Given \(d, m\in \mathbb {N}\), we denote by \(C^0(\mathbb {R}^d;\mathbb {R}^m)\) the space of continuous, bounded functions \(f:\mathbb {R}^d\rightarrow \mathbb {R}^m\), endowed with the supremum norm \(\Vert f\Vert _{C^0_x}\); whenever it does not create confusion we will simply write \(C^0_x\). \(C^\infty _c = C^\infty _c(\mathbb {R}^d;\mathbb {R}^m)\), \(C^n_x = C^n(\mathbb {R}^d;\mathbb {R}^m)\) denote respectively compactly supported smooth functions and n-times differentiable functions with continuous, bounded derivatives up to order n; \(\mathcal {S}=\mathcal {S}(\mathbb {R}^d;\mathbb {R}^m)\) denotes the space of Schwartz functions, \(\mathcal {S}'\) its dual. Given f, we denote by Df its Jacobian, i.e. the collection of first order derivatives \((\partial _j f_i)_{i,j}\), possibly interpreted in the distributional sense. For \(\alpha \in (0,1)\), \(C^\alpha _x=C^\alpha (\mathbb {R}^d;\mathbb {R}^m)\) stands for the Banach space of Hölder continuous functions, with norm

$$\begin{aligned} \Vert f\Vert _{C^\alpha _x}:= \Vert f\Vert _{C^0_x} + \llbracket f\rrbracket _{C^\alpha _x}, \quad \llbracket f\rrbracket _{C^\alpha _x}:=\sup _{x\ne y\in \mathbb {R}^d} \frac{|f(x)-f(y)|}{|x-y|^\alpha }. \end{aligned}$$

The definition of \(C^\alpha _x\) extends canonically to \(\alpha \in (1,+\infty )\) by imposing that \(f\in C^\alpha _x\) if \(f\in C^{\lfloor \alpha \rfloor }_x\) and its derivatives of order \(\lfloor \alpha \rfloor \) belong to \(C^{\alpha -\lfloor \alpha \rfloor }_x\), where \(\lfloor \alpha \rfloor \) denotes the integer part of \(\alpha \). We denote by \(C^\alpha _{loc}=C^\alpha _{loc}(\mathbb {R}^d;\mathbb {R}^m)\) the vector space of all continuous \(f:\mathbb {R}^d\rightarrow \mathbb {R}^m\) such that \(\varphi f\in C^\alpha _x\) for all \(\varphi \in C^\infty _c\); we say that \(f^n\rightarrow f\) in \(C^\alpha _{loc}\) if \(\varphi f^n\rightarrow \varphi f\) in \(C^\alpha _x\) for all \(\varphi \in C^\infty _c\).

Given \(\alpha \in \mathbb {R}\) and \(p\in [1,\infty ]\), we denote by \(B^\alpha _{p,p}=B^\alpha _{p,p}(\mathbb {R}^d;\mathbb {R}^m)\) the associated (inhomogeneous) Besov space, given by distributions \(f\in \mathcal {S}'\) such that

$$\begin{aligned} \Vert f\Vert _{B^\alpha _{p,p}} := {\left\{ \begin{array}{ll} \bigg ( \sum _{n=-1}^{+\infty } 2^{\alpha n p} \Vert \Delta _n f\Vert _{L^p}^p\bigg )^{\frac{1}{p}}<\infty \quad &{}\text {for } p\in [1,\infty ) \\ \sup _{n\ge -1} 2^{\alpha n}\Vert \Delta _nf\Vert _{L^\infty }<\infty \quad &{}\text {for } p=\infty \end{array}\right. } \end{aligned}$$

where \(\Delta _n\) denotes the Littlewood–Paley blocks associated to a partition of unity. We refer to the monograph [2] for details on Besov spaces; throughout the paper we will frequently employ their properties, like Besov embeddings, Bernstein estimates for \(\Delta _n f\) or the regularity of \(f*g\) for f, g in different Besov spaces. Let us also mention that, although the Littlewood–Paley definition will be the most relevant for our purposes, Besov spaces admit alternative equivalent characterizations based on either interpolation or Gagliardo–Nirenberg type integral seminorms, see for instance [33]. For \(\alpha \in \mathbb {R}_+\setminus \mathbb {N}\), the spaces \(C^\alpha _x\) and \(B^\alpha _{\infty ,\infty }\) coincide; however for clarity we will continue to write \(C^{\alpha }_x\) for \(\alpha \ge 0\) and \(B^\alpha _{\infty ,\infty }\) otherwise.

The notations from this section and the previous one can be combined to define \(C^\gamma _T C^\alpha _x\), \(L^q_T B^\alpha _{p,p}\), etc.; similarly, we define \(C^\gamma _T C^\alpha _{loc}\) to be the vector space of all \(f:[0,T]\times \mathbb {R}^d\rightarrow \mathbb {R}^m\) such that \(\varphi f\in C^\gamma _T C^\alpha _x\) for all \(\varphi \in C^\infty _c\), with convergence \(f^n\rightarrow f\) in \(C^\gamma _T C^\alpha _{loc}\) if \(\varphi f^n\rightarrow \varphi f\) in \(C^\gamma _T C^\alpha _x\). Given a function f of time and space, Df always denotes its Jacobian in the space variable only.

1.1.3 Probability measures and Wasserstein distance

Given a separable Banach space E, we denote by \(\mathcal {P}(E)\) the set of probability measures over E; we write \(\mu ^n\rightharpoonup \mu \) for weak convergence of measures, in the sense of testing against continuous bounded functions.

Given \(\mu ,\nu \in \mathcal {P}(E)\), \(\Pi (\mu ,\nu )\) stands for the set of all possible couplings of \((\mu ,\nu )\), i.e. the subset of \(\mathcal {P}(E\times E)\) with first and second marginals given respectively by \(\mu \) and \(\nu \). For any \(p\in [1,\infty )\), we define

$$\begin{aligned} d_p(\mu ,\nu ):=\inf _{m\in \Pi (\mu ,\nu )} \bigg (\int _{E\times E} \Vert x-y\Vert ^p_{E}\, m(\mathop {}\!\text {d}x,\mathop {}\!\text {d}y)\bigg )^{1/p} \end{aligned}$$

which is a well defined quantity (possibly taking value \(+\infty \)). By [47, Theorem 4.1], an optimal coupling \(\bar{m}\in \Pi (\mu ,\nu )\) realizing the above infimum always exists.

Similarly we define \(\mathcal {P}_p(E)\) to be the set of p-integrable probability measures; that is, \(\mu \in \mathcal {P}_p(E)\) if \(\mu \in \mathcal {P}(E)\) and

$$\begin{aligned} \Vert \mu \Vert _p := \bigg ( \int _{E} \Vert x\Vert _E^p \, \mu (\mathop {}\!\text {d}x)\bigg )^{1/p}<\infty . \end{aligned}$$

It is well known that \(d_p(\mu ,\nu )<\infty \) for \(\mu ,\nu \in \mathcal {P}_p(E)\) and that \((\mathcal {P}_p(E),d_p)\) is a complete metric space, usually referred to as the p-Wasserstein space on E; let us stress however that our definition of \(d_p(\mu ,\nu )\) holds for all \(\mu ,\nu \in \mathcal {P}(E)\). We recall that, given a sequence \(\{\mu ^n\}_n\subset \mathcal {P}_p(E)\), \(d_p(\mu ^n,\mu )\rightarrow 0\) is equivalent to \(\mu ^n\rightharpoonup \mu \) weakly and \(\Vert \mu ^n\Vert _p\rightarrow \Vert \mu \Vert _p\), see [47, Theorem 6.9].
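
A simple example, included here only for illustration, highlights the difference between \(d_p\) and the total variation distance. For Dirac masses at two points \(x\ne y\) in E the only coupling is \(\delta _{(x,y)}\), hence

$$\begin{aligned} d_p(\delta _x,\delta _y) = \Vert x-y\Vert _E, \qquad \Vert \delta _x-\delta _y\Vert _{TV} = 2, \end{aligned}$$

where we assume the convention that \(\Vert \cdot \Vert _{TV}\) is the total mass of a signed measure (some authors normalise it so that the value is 1 instead). In particular \(\delta _{x_n}\rightarrow \delta _x\) in \(d_p\) whenever \(x_n\rightarrow x\) in E, whereas convergence in total variation fails unless \(x_n=x\) eventually.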

Given \(\mu \in \mathcal {P}(\mathbb {R}^d)\), with a slight abuse of notation we will write \(\mu \in L^q(\mathbb {R}^d)\) (or simply \(L^q_x\)) for \(q\in [1,\infty ]\) to indicate that \(\mu \) admits a density \(\mu (\mathop {}\!\text {d}x)=\rho (x)\mathop {}\!\text {d}x\) with respect to the d-dimensional Lebesgue measure, such that \(\rho \in L^q_x\).

1.1.4 Fractional Brownian motion

A real valued continuous process \(\{W_t,\, t\in [0,T]\}\) is a fractional Brownian motion (fBm) with Hurst parameter \(H\in (0,1)\) if it is a centered Gaussian process with covariance function

$$\begin{aligned} \mathbb {E}[W_t W_s] = \frac{1}{2}\big (|t|^{2H}+|s|^{2H}-|t-s|^{2H}\big ); \end{aligned}$$

an \(\mathbb {R}^d\)-valued process W is a d-dimensional fBm if its components are independent 1-dimensional fBms. All the results we are going to recall here are classical and can be found in [36, 39].

For \(H=1/2\), fBm corresponds to classical Brownian motion (Bm), but for \(H\ne 1/2\) it is not a semimartingale nor a Markov process; its trajectories are \(\mathbb {P}\)-a.s. in \(C^{H-\varepsilon }_T\) for any \(\varepsilon >0\).
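
Both facts can be read off the covariance function; the following short computation is standard and uses only the formula above together with Gaussianity. For \(s,t\in [0,T]\),

$$\begin{aligned} \mathbb {E}\big [|W_t-W_s|^2\big ] = \mathbb {E}[W_t^2]+\mathbb {E}[W_s^2]-2\,\mathbb {E}[W_tW_s] = t^{2H}+s^{2H}-\big (t^{2H}+s^{2H}-|t-s|^{2H}\big ) = |t-s|^{2H}. \end{aligned}$$

Since \(W_t-W_s\) is a centered Gaussian variable, it follows that \(\mathbb {E}[|W_t-W_s|^p]=c_p|t-s|^{pH}\) for every \(p\ge 1\), with \(c_p\) the p-th absolute moment of a standard Gaussian, and the Kolmogorov continuity criterion then yields the stated \(C^{H-\varepsilon }_T\) regularity. For \(H=1/2\) the covariance reduces to \(\frac{1}{2}(t+s-|t-s|)=\min (s,t)\), which is the covariance of Brownian motion.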

Given an fBm W of parameter H on a probability space \((\Omega ,\mathcal {F},\mathbb {P})\), it is always possible to construct a standard Bm B on it such that the following canonical representation holds:

$$\begin{aligned} W_t = \int _0^t K_H(t,s) \mathop {}\!\text {d}B_s \end{aligned}$$

where \(K_H\) is a Volterra-type kernel and B and W generate the same filtration. Given a filtration \(\{\mathcal {F}_t\}_{t\in [0,T]}\), we say that W is an \(\mathcal {F}_t\)-fBm if the associated B is an \(\mathcal {F}_t\)-Bm in the classical sense.

Closely related to the canonical representation are a version of Girsanov theorem for fBm (see e.g. [37, Theorem 2]) and the strong local non-determinism (LND) of fBm: for any \(H\in (0,1)\) there exists \(c_H>0\) such that

$$\begin{aligned} \text {Var}[W_t\big | \mathcal {F}_s] \ge c_H |t-s|^{2H} I_d \quad \forall \, t>s. \end{aligned}$$

The LND property plays a key role in establishing the regularising features of W, cf. [16, 23].

2 Main results

Let us recall that the focus here is an abstract DDSDE of the form

$$\begin{aligned} X_t = \xi + \int _0^t B_s(X_s,\mu _s)\text {d}s + W_t,\quad \mu _t=\mathcal {L}(X_t) \quad \forall \, t\in [0,T] \end{aligned}$$
(2.1)

where \(\mathcal {L}(\xi )=\mu _0\), \(\xi \) is independent of W and W is sampled as an fBm of parameter \(H\in (0,1)\).

We want to identify general conditions for measurable drifts \(B:[0,T]\times \mathcal {P}_p(\mathbb {R}^d)\rightarrow B^\alpha _{\infty ,\infty }\), \(\alpha \in \mathbb {R}\), such that we can develop a solution theory for (2.1). As explained in the introduction, our strategy consists in setting up a fixed point for \(\mu \mapsto b^\mu _t:=B_t(\mu _t)\mapsto X\mapsto {\tilde{\mu }}_t:=\mathcal {L}(X_t)\).

To this end, the assumptions on B should enforce two facts: first, for any flow of measures \(\mu \in C_T \mathcal {P}_p\), the associated drift \(b^\mu _t:=B_t(\mu _t)\) is regular enough to solve (1.4), namely \(b^\mu \) must satisfy condition (1.5); second, the map \(\mu \mapsto b^\mu \) should be stable in suitable topologies. Last but not least, the eligible B should include cases of particular interest (most notably \(B(\mu )=b*\mu \)), see Sect. 2.1 below.

Corresponding to the above requirements, for \(H > 1/2\) we define the following space:

Definition 2.1

For \(\alpha ,\beta \in (0,1)\) and \(p\in [1,\infty )\), let \(\mathcal {H}^{\beta ,\alpha }_p\) denote the class of continuous functions \(B:[0,T]\times \mathbb {R}^d\times \mathcal {P}_p(\mathbb {R}^d)\rightarrow \mathbb {R}^d\) satisfying the following condition: there exists \(C>0\) such that

  1. i.

    For all \((t,x,\mu )\in [0,T]\times \mathbb {R}^d\times \mathcal {P}_p(\mathbb {R}^d)\), \(|B_t(x,\mu )|\le C\).

  2. ii.

    For all \((s,t)\in [0,T]^2\), \((x,y)\in (\mathbb {R}^d)^2\) and \((\mu ,\nu )\in \mathcal {P}_p(\mathbb {R}^d)\times \mathcal {P}_p(\mathbb {R}^d)\), we have

    $$\begin{aligned} |B_t(x,\mu )-B_s(y,\nu )| \le C (|t-s|^{\alpha \beta }+|x-y|^\alpha +d_p(\mu ,\nu )^\alpha ). \end{aligned}$$
  3. iii.

    For all \(t\in [0,T]\) and \(\mu ,\nu \in \mathcal {P}_p(\mathbb {R}^d)\)

    $$\begin{aligned} \Vert B_t(\cdot ,\mu )-B_t(\cdot ,\nu )\Vert _{B^{\alpha -1}_{\infty ,\infty }} \le C d_p(\mu ,\nu ). \end{aligned}$$

Whenever it does not create confusion, we will simply denote by \(\Vert B\Vert \) the optimal constant C.

Corresponding to the above requirements, for \(H \le 1/2\) we define the following space:

Definition 2.2

For \(\alpha \in \mathbb {R}\), \(p\in [1,\infty )\) and \(q\in [1,\infty ]\), let \(\mathcal {G}^{q,\alpha }_{p}\) denote the class of measurable functions \(B:[0,T]\times \mathcal {P}_p(\mathbb {R}^d)\rightarrow B^\alpha _{\infty ,\infty }\) satisfying the following condition: there exists \(h\in L^q_T\) such that

  1. i.

    For all \((t,\mu )\in [0,T]\times \mathcal {P}_p(\mathbb {R}^d)\), we have \(\Vert B_t(\mu )\Vert _{B^\alpha _{\infty ,\infty }} \le h_t\).

  2. ii.

    For all \((t,\mu ,\nu )\in [0,T]\times \mathcal {P}_p(\mathbb {R}^d)\times \mathcal {P}_p(\mathbb {R}^d)\), we have \(\Vert B_t(\mu )-B_t(\nu )\Vert _{B^{\alpha -1}_{\infty ,\infty }} \le h_t d_p(\mu ,\nu )\).

Whenever it does not create confusion, we will simply denote by \(\Vert B\Vert \) the optimal constant \(\Vert h\Vert _{L^q_T}\).

Remark 2.3

It is readily checked that for \(\alpha \le {{\tilde{\alpha }}}\), \(p\ge {\tilde{p}}\) and \(q\le {\tilde{q}}\) we have \(\mathcal {G}^{{\tilde{q}},{\tilde{\alpha }}}_{{\tilde{p}}}\subset \mathcal {G}^{q,\alpha }_p\). Similarly, for \(\alpha \le {\tilde{\alpha }}\), \(\beta \le {\tilde{\beta }}\) and \(p\ge {\tilde{p}}\) it holds \(\mathcal {H}^{{\tilde{\beta }},{\tilde{\alpha }}}_{{\tilde{p}}} \subset \mathcal {H}^{\beta ,\alpha }_p\).
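
The first inclusion can be verified along the following lines; this is only a sketch, relying on the standing assumption that T is finite. For \({\tilde{p}}\le p\), Jensen's inequality gives \(\mathcal {P}_p\subset \mathcal {P}_{{\tilde{p}}}\) and \(d_{{\tilde{p}}}\le d_p\), so any \(B\in \mathcal {G}^{{\tilde{q}},{\tilde{\alpha }}}_{{\tilde{p}}}\) with associated \(h\in L^{{\tilde{q}}}_T\) can be restricted to \(\mathcal {P}_p\) and satisfies, for \(\alpha \le {\tilde{\alpha }}\),

$$\begin{aligned} \Vert B_t(\mu )-B_t(\nu )\Vert _{B^{\alpha -1}_{\infty ,\infty }} \lesssim \Vert B_t(\mu )-B_t(\nu )\Vert _{B^{{\tilde{\alpha }}-1}_{\infty ,\infty }} \le h_t\, d_{{\tilde{p}}}(\mu ,\nu ) \le h_t\, d_p(\mu ,\nu ), \end{aligned}$$

where the first step is the Besov embedding \(B^{{\tilde{\alpha }}-1}_{\infty ,\infty }\subset B^{\alpha -1}_{\infty ,\infty }\). The same embedding handles condition i., and \(h\in L^{{\tilde{q}}}_T\subset L^q_T\) for \(q\le {\tilde{q}}\) by Hölder's inequality on [0, T]; the embedding constant can be absorbed by replacing h with a constant multiple.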

Roughly speaking, we say that X is a solution to the DDSDE (2.1) if, setting \(b^\mu _t:=B_t(\mathcal {L}(X_t))\), then X is a solution to the standard SDE (1.4) associated to \(b^\mu \), being interpreted in the Catellier–Gubinelli sense whenever \(b^\mu \) is singular; the pathwise theory for singular SDEs will be recalled in detail in Sect. 3. All the concepts of strong existence, pathwise uniqueness and uniqueness in law for DDSDEs then follow from the standard ones, see Definition 4.2 from Sect. 4.2.

Our first main result is the well-posedness of DDSDE (2.1) under suitable conditions on B; it can be seen as an extension of [17,   Theorem 15] to the distribution dependent case. The proof of the following theorem is given in Sects. 4.1 and 4.2.

Theorem 2.4

Let \(H>1/2\) and let \(B\in \mathcal {H}^{H,\alpha }_p\) for parameters

$$\begin{aligned} \alpha>1-\frac{1}{2H}>0,\quad p\in [1,\infty ). \end{aligned}$$
(2.2)

Then for any \(\mu _0\in \mathcal {P}_p(\mathbb {R}^d)\), strong existence, pathwise uniqueness and uniqueness in law hold for the DDSDE (2.1).

Similarly, let \(H\le 1/2\) and let \(B\in \mathcal {G}^{q,\alpha }_p\) for parameters

$$\begin{aligned} \alpha >1+\frac{1}{Hq}-\frac{1}{2H},\quad \alpha \in \mathbb {R}, \quad q\in (2,\infty ], \quad p\in [1,\infty ). \end{aligned}$$
(2.3)

Then for any \(\mu _0\in \mathcal {P}_p(\mathbb {R}^d)\), strong existence, pathwise uniqueness and uniqueness in law hold for the DDSDE (2.1).
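As a quick sanity check on the parameter ranges, the thresholds for \(\alpha \) in (2.2) and (2.3) can be tabulated numerically. The following Python sketch is purely illustrative; the helper names `drift_regularity_threshold` and `is_admissible` are hypothetical and introduced only for this example.

```python
import math

def drift_regularity_threshold(H, q=math.inf):
    """Lower bound on alpha from (2.2) if H > 1/2, and from (2.3) if H <= 1/2."""
    if not 0.0 < H < 1.0:
        raise ValueError("H must lie in (0, 1)")
    if H > 0.5:
        # (2.2): alpha > 1 - 1/(2H); q plays no role in this regime.
        return 1.0 - 1.0 / (2.0 * H)
    if not q > 2:
        raise ValueError("(2.3) requires q in (2, infinity]")
    inv_q = 0.0 if math.isinf(q) else 1.0 / q
    # (2.3): alpha > 1 + 1/(Hq) - 1/(2H).
    return 1.0 + inv_q / H - 1.0 / (2.0 * H)

def is_admissible(alpha, H, q=math.inf):
    """True if alpha lies strictly above the threshold for the given (H, q)."""
    return alpha > drift_regularity_threshold(H, q)
```

For instance, in the Brownian case \(H=1/2\) with \(q=\infty \) the threshold equals 0, while for \(H=1/4\) and \(q=\infty \) it equals \(-1\), so that genuinely distributional drifts are admissible.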

Given a DDSDE (2.1), we will consider either \((\xi ,B)\) or \((\mu _0,B)\) to be the data of the problem, where we recall that \(\mathcal {L}(\xi )=\mu _0\). As already mentioned in the introduction, the solution X is entirely determined by the associated flow of measures \(\mu \in C_T\mathcal {P}_p\) given by \(\mu _t=\mathcal {L}(X_t)\): once this is known, the drift \(b^\mu _t=B_t(\mu _t)\) is determined as well and so we can reconstruct the strong solution X (or construct another copy of it on any probability space of interest). For this reason, it is quite useful to regard \(\mu \in C_T\mathcal {P}_p\) to be itself a solution to the DDSDE; the exact equivalence between \(\mu \) and X will be discussed rigorously in Lemma 4.4 from Sect. 4.2.

The next theorem, proved in Sect. 4.3, provides stability estimates for the data-to-solution map \((\mu _0,B) \mapsto \mu \) (respectively \((\xi ,B)\mapsto X\)), showing that it is locally Lipschitz.

Theorem 2.5

Let \(p\in [1,\infty )\). Then the following hold:

  1. i.

    For \(H>1/2\), let \(B^1,\,B^2\) be drifts in \(\mathcal {H}^{H,\alpha }_p\) with parameters satisfying (2.2) and let \(M>0\) be a constant such that \(\Vert B^i\Vert \le M\) for \(i=1,2\). Then there exists a constant \(C=C(\alpha ,H,T,M,p)\) such that, for any \(\mu _0^i\in \mathcal {P}_p(\mathbb {R}^d)\), the associated solutions \(\mu ^i\in C_T \mathcal {P}_p\) satisfy

    $$\begin{aligned} \sup _{t\in [0,T]} d_p(\mu ^1_t,\mu ^2_t) \le C \big (d_p(\mu ^1_0,\mu ^2_0) + \Vert B^1-B^2\Vert _{\infty }\big ), \end{aligned}$$
    (2.4)

    where

    $$\begin{aligned} \Vert B^1-B^2\Vert _{\infty }:=\sup _{(t,\mu )\in [0,T]\times \mathcal {P}_p} \Vert B^1(t,\mu )-B^2(t,\mu )\Vert _{B^{\alpha -1}_{\infty ,\infty }}. \end{aligned}$$

    If \(X^1,X^2\) are two associated solutions, in the sense of stochastic processes, defined on the same probability space, then there exists \(\gamma >1/2\) such that

    $$\begin{aligned} \mathbb {E}\Big [\Vert X^1-X^2 \Vert _{\gamma ;[0,T]}^p\Big ]^{1/p} \le C\big ( \Vert \xi ^1-\xi ^2\Vert _{L^p_\Omega } + \Vert B^1-B^2\Vert _\infty \big ). \end{aligned}$$
    (2.5)
  2. ii.

    For \(H\le 1/2\), let \(B^1,\,B^2\) be drifts in \(\mathcal {G}^{q,\alpha }_p\) with parameters satisfying (2.3) and let \(M>0\) be a constant such that \(\Vert B^i\Vert \le M\) for \(i=1,2\). Then there exists a constant \(C=C(\alpha ,H,T,M,p,q)\) such that, for any \(\mu _0^i\in \mathcal {P}_p(\mathbb {R}^d)\), the associated solutions \(\mu ^i\in C_T \mathcal {P}_p\) satisfy

    $$\begin{aligned} \sup _{t\in [0,T]} d_p(\mu ^1_t,\mu ^2_t) \le C \big (d_p(\mu ^1_0,\mu ^2_0) + \Vert B^1-B^2\Vert _{q,\infty }\big ), \end{aligned}$$
    (2.6)

    where

    $$\begin{aligned} \Vert B^1-B^2\Vert _{q,\infty }:=\bigg ( \int _0^T \sup _{\mu \in \mathcal {P}_p} \Vert B^1(t,\mu )-B^2(t,\mu )\Vert _{B^{\alpha -1}_{\infty ,\infty }}^{q} \mathop {}\!\text {d}t\bigg )^{1/q}. \end{aligned}$$

    If \(X^1,X^2\) are two associated solutions, in the sense of stochastic processes, defined on the same probability space, then there exists \(\gamma >1/2\) such that

    $$\begin{aligned} \mathbb {E}\Big [\Vert X^1-X^2\Vert _{\gamma ;[0,T]}^p\Big ]^{1/p} \le C\big ( \Vert \xi ^1-\xi ^2\Vert _{L^p_\Omega } + \Vert B^1-B^2\Vert _{q,\infty }\big ). \end{aligned}$$
    (2.7)

As the settings of Theorems 2.4 and 2.5 are very general, they do not allow one to exploit any specific structure of the DDSDE under consideration to obtain sharper results. A prototypical example of such structure, which arises in many practical applications, is given by convolutional drifts \(B_t(x,\mu ):= (b_t*\mu )(x)\). The associated DDSDE takes the form

$$\begin{aligned} X_t = \xi + \int _0^t (b_s *\mathcal {L}(X_s))(X_s)\mathop {}\!\text {d}s + W_t \quad \forall \, t\in [0,T]. \end{aligned}$$
(2.8)

As before we allow the drift b to be distributional, at least of the form \(b \in L^1_T B^\alpha _{p,p}\) for some \(\alpha \in \mathbb {R}\), \(p\in [1,\infty ]\); at this stage pointwise evaluation of \(b_s*\mathcal {L}(X_s)\) is not meaningful, so we again interpret the equation in the Catellier–Gubinelli sense.
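When b is instead a smooth bounded kernel, (2.8) can be approximated directly by an interacting particle system in which the law \(\mathcal {L}(X_s)\) is replaced by the empirical measure of N particles. The following Python sketch illustrates this in dimension \(d=1\) under simplifying assumptions that are not part of the results above: the noise is taken to be Brownian (the special case \(H=1/2\)), the kernel is the hypothetical choice \(b(z)=-\tanh (z)\), and a plain Euler scheme is used. The function name `simulate_particles` is introduced only for this example.

```python
import numpy as np

def simulate_particles(b, xi, T=1.0, n_steps=200, rng=None):
    """Euler scheme for N particles with drift given by the empirical convolution.

    b  : vectorised kernel R -> R (assumed smooth and bounded)
    xi : array of shape (N,), samples of the initial law mu_0
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(xi, dtype=float)
    dt = T / n_steps
    for _ in range(n_steps):
        # (b * L^N)(x_i) = (1/N) * sum_j b(x_i - x_j)
        drift = b(x[:, None] - x[None, :]).mean(axis=1)
        # Brownian increments: an assumption, corresponding to H = 1/2
        x = x + drift * dt + np.sqrt(dt) * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(0)
xi = rng.standard_normal(500)                      # mu_0 = N(0, 1), illustrative
x_T = simulate_particles(lambda z: -np.tanh(z), xi, rng=rng)
```

The empirical measure of `x_T` then serves as a proxy for \(\mathcal {L}(X_T)\); for general \(H\ne 1/2\) the independent Gaussian increments would have to be replaced by correlated fBm increments.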

The heuristic idea behind the next results is that we can use the convolutional structure in a recursive way: assuming we are given a solution X with sufficiently regular law \(\mathcal {L}(X_\cdot )\), this in turn leads to an improved regularity for the effective drift \(b_\cdot *\mathcal {L}(X_\cdot )\), compared to the original b. The argument can be made rigorous by establishing a priori estimates and working with smooth approximations; as a result, we are able to establish well-posedness for (2.8) in situations where the general Theorem 2.4 does not apply.

In both results we are going to present, we will need some additional regularity for the initial data \(\mu _0\), in the form of an integrability assumption. This is because, as explained in the introduction, the lack of an underlying parabolic PDE prevents us from proving a smoothing effect at strictly positive times analogous to that of parabolic equations; rather, in order to develop a priori estimates, we will show that such integrability is propagated by the dynamics.

The next result shows existence and uniqueness of solutions to (2.8) in a suitable class, under an additional condition on \(\operatorname {div} b\), which is by now quite standard since the pioneering work [12]. The proof of the next theorem is given in Sect. 5.1.

Theorem 2.6

Let \(H\in (0,1)\), \(q \in (2,\infty ]\), \(p\in [1,\infty ]\), \(p'\) its conjugate exponent. Assume that \(\operatorname {div} b \in L^1_T L^\infty _x\) and either:

  1. i.

    if \(H>1/2\), then \(b\in C^{\alpha H}_T L^p_x\cap C^0_T B^\alpha _{p,p}\) for some \(\alpha >1-\frac{1}{2H}\);

  2. ii.

    if \(H \le 1/2\), then \(b \in L^q_T B^\alpha _{p,p}\) with \(1>\alpha >1-\frac{1}{2H}+\frac{1}{Hq}\).

Then for any \(\mu _0\in L^{p'}_x\) there exists a strong solution to (2.8), which satisfies

$$\begin{aligned} \sup _{t\in [0,T]} \Vert \mathcal {L}(X_t)\Vert _{L^{p'}_x}<\infty ; \end{aligned}$$
(2.9)

moreover uniqueness holds, both pathwise and in law, in the class of solutions satisfying (2.9).

Our second result in the convolutional case is established under \(L^q_T L^p_x\)-type assumptions on b; here, instead of relying on a bound for \(\operatorname {div} b\), we exploit Girsanov-based arguments to establish integrability of \(\mathcal {L}(X_t)\). This technique however only works in the regime \(H\le 1/2\). Section 5.2 contains the proof of the following result.

Theorem 2.7

Let \(d\ge 2\), \(H \le 1/2\), \((r,p,q) \in [1,\infty )^2 \times (2,\infty ]\) be such that

$$\begin{aligned} r>\frac{d}{d-1}, \quad \frac{1}{q}+\frac{Hd}{p}<\frac{1}{2}. \end{aligned}$$
(2.10)

Then for any \(b \in L^q_T L^{p}_x\) and \(\mu _0 \in L^r_x\), there exists a strong solution to (2.8), which satisfies

$$\begin{aligned} \sup _{t\in [0,T]} \Vert \mathcal {L}(X_t)\Vert _{L^{{\tilde{r}}}_x}<\infty \quad \forall \, {\tilde{r}}<r; \end{aligned}$$
(2.11)

moreover uniqueness holds, both pathwise and in law, in the class of solutions satisfying (2.11).

Remark 2.8

Condition (2.10) can be generalized in a way that allows values \(r\le d/(d-1)\) and that applies for \(d=1\), see Theorem 5.9 in Sect. 5.2 for more details. We warn the reader not to interpret Theorems 2.6 and 2.7 as full pathwise uniqueness (resp. uniqueness in law) statements: in general they do not exclude the existence of irregular solutions X which do not satisfy condition (2.9) (resp. (2.11)). However, as the proofs show, any solution constructed as the limit of solutions associated to smooth drifts \(b^n\rightarrow b\) does satisfy (2.9) (resp. (2.11)), thus it is the only physical solution to the DDSDE (2.1).

2.1 Examples

To illustrate the variety of situations to which Theorems 2.4 and 2.5 apply, we provide here several examples of functions contained in \(\mathcal {G}^{q,\alpha }_p\) and \(\mathcal {H}^{\beta ,\alpha }_p\).

Example 2.9

Let \(\alpha \in \mathbb {R}\) and let \(b:[0,T]\times \mathbb {R}^d \rightarrow B^\alpha _{\infty ,\infty }\) be a measurable map; for any \(y\in \mathbb {R}^d\) we write \(b_t(\cdot ,y):=b_t(y)(\cdot )\). Suppose there exists \(h\in L^q_T\) for some \(q\in [1,\infty ]\) such that

$$\begin{aligned} \Vert b_t(\cdot ,y)\Vert _{B^\alpha _{\infty ,\infty }}\le & {} h_t,\quad \Vert b_t(\cdot ,y)-b_t(\cdot ,y')\Vert _{B^{\alpha -1}_{\infty ,\infty }}\\\le & {} h_t\, |y-y'|\quad \forall \, t\in [0,T],\, (y,y')\in \mathbb {R}^{2d}. \end{aligned}$$

Define now another measurable map \(B:[0,T]\times \mathcal {P}(\mathbb {R}^d)\rightarrow B^\alpha _{\infty ,\infty }\) by

$$\begin{aligned} B_t(\cdot ,\mu ):= \int _{\mathbb {R}^d}b_t(\cdot ,y)\,\mu (\mathop {}\!\text {d}y), \quad \forall \,(t,\mu )\in [0,T]\times \mathcal {P}_p(\mathbb {R}^d) \end{aligned}$$

where the integral is meaningful in the Bochner sense; then \(B\in \mathcal {G}^{q,\alpha }_p\) for any \(p\in [1,\infty )\).

Indeed, by the hypothesis on b, it is readily checked that

$$\begin{aligned} \Vert B_t(\cdot ,\mu )\Vert _\alpha \le \int _{\mathbb {R}^d}\Vert b_t(\cdot ,y)\Vert _\alpha \,\mu (\mathop {}\!\text {d}y) \le h_t \quad \forall \, t\in [0,T], \mu \in \mathcal {P}(\mathbb {R}^d); \end{aligned}$$

given \(\mu ,\nu \in \mathcal {P}(\mathbb {R}^d)\), let \(m\in \mathcal {P}(\mathbb {R}^{2d})\) be an optimal coupling for \(d_1(\mu ,\nu )\), then

$$\begin{aligned} \Vert B_t(\cdot ,\mu )-B_t(\cdot ,\nu )\Vert _{B^{\alpha -1}_{\infty ,\infty }}\le & {} \int _{\mathbb {R}^{2d}}\Vert b_t(\cdot ,y)-b_t(\cdot ,y')\Vert _{B^{\alpha -1}_{\infty ,\infty }}\, m(\text {d}y,\text {d}y')\\\le & {} h_t \int _{\mathbb {R}^{2d}} |y-y'|\, m(\text {d}y,\text {d}y') \end{aligned}$$

which implies that

$$\begin{aligned} \Vert B_t(\cdot ,\mu )-B_t(\cdot ,\nu )\Vert _{B^{\alpha -1}_{\infty ,\infty }} \le h_t\, d_1(\mu ,\nu ) \le h_t\, d_p(\mu ,\nu ) \quad \forall \, p\in [1,\infty ). \end{aligned}$$
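The coupling argument above can be illustrated numerically for empirical measures on \(\mathbb {R}\). The sketch below makes several simplifying assumptions not present in the example: the Besov norm \(B^{\alpha -1}_{\infty ,\infty }\) is replaced by the supremum over a finite grid of points x, the kernel is the hypothetical choice \(b(x,y)=\sin (x-y)\) (bounded by 1 and 1-Lipschitz in y), and \(\mu ,\nu \) are empirical measures with the same number of atoms, for which the sorted matching provides a coupling that is optimal for \(d_1\) in one dimension.

```python
import numpy as np

def wasserstein1_empirical(y, z):
    """d_1 between two empirical measures on R with the same number of atoms."""
    return np.mean(np.abs(np.sort(y) - np.sort(z)))

def B(x, atoms, b):
    """B(x, mu) = int b(x, y) mu(dy), with mu the empirical measure of `atoms`."""
    return b(x[:, None], atoms[None, :]).mean(axis=1)

rng = np.random.default_rng(1)
y = rng.normal(0.0, 1.0, size=400)       # atoms of mu
z = rng.normal(0.5, 2.0, size=400)       # atoms of nu
x = np.linspace(-5.0, 5.0, 201)          # evaluation grid
b = lambda x, y: np.sin(x - y)           # |b| <= 1, 1-Lipschitz in y

gap = np.max(np.abs(B(x, y, b) - B(x, z, b)))   # sup_x |B(x,mu) - B(x,nu)|
d1 = wasserstein1_empirical(y, z)
```

In this setting `gap` is bounded by `d1`, in agreement with the estimate with \(h_t\equiv 1\).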

Example 2.10

Given \(\alpha , \beta \in (0,1)\), assume that \(b:[0,T]\times \mathbb {R}^d\times \mathbb {R}^d\rightarrow \mathbb {R}^d\) satisfies

$$\begin{aligned} |b_t(x,y)| \le C,\quad |b_t(x,y)-b_s(x',y')| \le C(|t-s|^{\alpha \beta } + |x-x'|^\alpha + |y-y'|^\alpha ) \end{aligned}$$

for some \(C>0\), uniformly over \(s,\,t,\, x,\,x',\,y,\,y'\); we can identify b with the map \(b:[0,T]\times \mathbb {R}^d\rightarrow C^\alpha _x=B^{\alpha }_{\infty ,\infty }\) given by \((t,y)\mapsto b_t(\cdot ,y)\). Assume additionally that for the same constant C it holds

$$\begin{aligned} \Vert b_t(\cdot ,y)-b_t(\cdot ,y')\Vert _{B^{\alpha -1}_{\infty ,\infty }} \le C\, |y-y'| \end{aligned}$$

and define \(B:[0,T]\times \mathbb {R}^d\times \mathcal {P}(\mathbb {R}^d)\rightarrow \mathbb {R}^d\) by

$$\begin{aligned} B_t(x,\mu ):=\int _{\mathbb {R}^d} b_t(x,y)\,\mu (\mathop {}\!\text {d}y). \end{aligned}$$

Then \(B\in \mathcal {H}^{\beta ,\alpha }_p\) for any \(p\in [1,\infty )\). The verification of Conditions i. and iii. of Definition 2.1 is identical to that of Example 2.9, so we only need to focus on Condition ii. for \(p=1\).

Given \(\mu ,\,\nu \in \mathcal {P}(\mathbb {R}^d)\), let m be an optimal coupling for \(d_1(\mu ,\nu )\), then

$$\begin{aligned} |B_t(x,\mu )-B_s(x',\nu )|&= \bigg |\int _{\mathbb {R}^d} b_t(x,y)\mu (\mathop {}\!\text {d}y)-\int _{\mathbb {R}^d} b_s(x',y')\nu (\mathop {}\!\text {d}y')\bigg |\\&\le \int _{\mathbb {R}^{2d}} |b_t(x,y)-b_s(x',y')|\, m(\mathop {}\!\text {d}y,\mathop {}\!\text {d}y')\\&\le C \bigg (|t-s|^{\alpha \beta } + |x-x'|^\alpha + \int _{\mathbb {R}^{2d}} |y-y'|^\alpha \, m(\mathop {}\!\text {d}y,\mathop {}\!\text {d}y')\bigg )\\&\le C \big ( |t-s|^{\alpha \beta } + |x-x'|^\alpha + d_1(\mu ,\nu )^\alpha \big ) \end{aligned}$$

where in the last step we used Jensen’s inequality and the optimality of m.
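The last two steps of the above chain can be checked numerically for empirical measures on \(\mathbb {R}\). The following sketch rests on illustrative assumptions: the kernel is the hypothetical choice \(b(x,y)=\min (|x-y|^\alpha ,1)\), which is bounded by 1 and \(\alpha \)-Hölder in each variable with constant 1, time dependence is dropped, and the coupling m is the sorted matching of the atoms.

```python
import numpy as np

alpha = 0.4
b = lambda x, y: np.minimum(np.abs(x - y) ** alpha, 1.0)

rng = np.random.default_rng(2)
y = np.sort(rng.normal(size=1000))        # atoms of mu, sorted
z = np.sort(rng.exponential(size=1000))   # atoms of nu, sorted
cost = np.abs(y - z)                      # |y - y'| under the sorted coupling m

x = np.linspace(-3.0, 3.0, 121)
B_mu = b(x[:, None], y[None, :]).mean(axis=1)
B_nu = b(x[:, None], z[None, :]).mean(axis=1)

gap = np.max(np.abs(B_mu - B_nu))         # sup_x |B(x,mu) - B(x,nu)|
holder_cost = np.mean(cost ** alpha)      # int |y-y'|^alpha dm
jensen_bound = np.mean(cost) ** alpha     # (int |y-y'| dm)^alpha
```

The three quantities are ordered as `gap <= holder_cost <= jensen_bound`, the second inequality being Jensen's inequality for the concave map \(r\mapsto r^\alpha \).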

Example 2.11

Consider now \(B_t(\cdot ,\mu ):=b_t*\mu \), where \(b\in L_T^q B^\alpha _{\infty ,\infty }\) for some \(\alpha \in \mathbb {R}\); then \(B\in \mathcal {G}^{q,\alpha }_p\) for any \(p\in [1,\infty )\).

Indeed, the verification of Condition i. from Definition 2.2 is the same as in Example 2.9, where now we can take \(h_\cdot = \Vert b_\cdot \Vert _{B^\alpha _{\infty ,\infty }}\in L^q_T\). Moreover by Lemma A.7 in Appendix A, for any \(\mu ,\nu \in \mathcal {P}(\mathbb {R}^d)\) and any \(p\in [1,\infty )\) it holds

$$\begin{aligned} \Vert B_t(\cdot ,\mu )-B_t(\cdot ,\nu )\Vert _{B^{\alpha -1}_{\infty ,\infty }} = \Vert b_t *(\mu -\nu )\Vert _{B^{\alpha -1}_{\infty ,\infty }} \lesssim \Vert b_t\Vert _{B^\alpha _{\infty ,\infty }}\, d_p(\mu ,\nu ). \end{aligned}$$

Similarly, given \(\alpha ,\beta \in (0,1)\), let \(b\in C^{\alpha \beta }_T C^0_x\cap C_T C^\alpha _x\) and set \(B_t(\cdot ,\mu )=b_t*\mu \); then \(B\in \mathcal {H}^{\beta ,\alpha }_p\) for any \(p\in [1,\infty )\).

The verification of Conditions i. and ii. from Definition 2.1 follows from Example 2.10, as we can simply set \({\tilde{b}}_t(x,y):=b_t(x-y)\) and apply the calculations therein to \({\tilde{b}}\). Condition iii. instead follows as above from an application of Lemma A.7.

Finally, let us point out that all the computations carry over to the case \(B_t(\cdot ,\mu )=b^1_t*\mu +b^2_t\) for \(b^i\in L^q_T B^\alpha _{\infty ,\infty }\) (resp. \(b^i\in C^{\alpha \beta }_T C^0_x\cap C_T C^\alpha _x\)).

Example 2.12

Let \(b:[0,T]\times \mathbb {R}^d\rightarrow B^\alpha _{\infty ,\infty }\) be as in Example 2.9 and \(\phi :\mathbb {R}^d\rightarrow \mathbb {R}^d\) be a globally Lipschitz map with constant \(\llbracket \phi \rrbracket _{Lip}\); define \(B:[0,T]\times \mathcal {P}_1(\mathbb {R}^d)\rightarrow B^\alpha _{\infty ,\infty }\) by

$$\begin{aligned} B_t(\cdot ,\mu )= b_t(\cdot , \langle \phi ,\mu \rangle ), \quad \text {where } \langle \phi ,\mu \rangle :=\int _{\mathbb {R}^d} \phi (x)\, \mu (\mathop {}\!\text {d}x). \end{aligned}$$

Then \(B\in \mathcal {G}^{q,\alpha }_p\) for any \(p\in [1,\infty )\). Similarly, given b as in Example 2.10, with B defined as above, it is easy to verify that \(B\in \mathcal {H}^{\beta ,\alpha }_p\) for any \(p\in [1,\infty )\).

As a prototypical example, one may consider \(b\in B^\alpha _{\infty ,\infty }\) and define

$$\begin{aligned} B_t(\cdot ,\mu )=B(\cdot ,\mu ):=b(\cdot -\langle x,\mu \rangle ) \quad \text {where } \langle x,\mu \rangle :=\int _{\mathbb {R}^d} x\,\mu (\mathop {}\!\text {d}x) \end{aligned}$$

in which case, similarly to before, it holds \(B\in \mathcal {G}^{q,\alpha }_p\) for any \(q\in [1,\infty ]\) (resp. \(B\in \mathcal {H}^{\beta ,\alpha }_p\) for any \(\beta \in (0,1)\)) and \(p\in [1,\infty )\).

We highlight that this class of examples is quite important since B is only defined on \(\mathcal {P}_1(\mathbb {R}^d)\) and not on the whole \(\mathcal {P}(\mathbb {R}^d)\), thus making the use of other notions of distance between measures (e.g. total variation norm) more difficult to handle. It can be further generalized to the case \(\phi :\mathbb {R}^d\rightarrow \mathbb {R}^m\) for another \(m\in \mathbb {N}\) (namely, B is determined by m statistics associated to \(\mu \)) or to dependence on p-moments like \(B_t(\cdot ,\mu )=b_t(\cdot ,\Vert \mu \Vert _p)\) for \(\mu \in \mathcal {P}_p(\mathbb {R}^d)\); for \(p>1\) we can also allow \(\phi \) to grow more than linearly at infinity.

3 SDEs driven by fBm

In this section we revisit the theory of singular SDEs driven by fBm, in order to derive useful estimates to apply later to the DDSDE setting. Sections 3.1 and 3.2 serve as a recap of key facts, respectively the pathwise meaning of singular SDEs and the regularising properties of fractional Brownian motion. Sections 3.3 and 3.4 instead provide novel results, Theorem 3.13 being the most important for our purposes.

Although the material of Sects. 3.1–3.2 is strongly based on the previous works [9, 15, 17, 23], we felt obliged to provide the proofs of several key results for technical but rather important reasons. On the one hand, the aforementioned works are focused entirely on a pathwise setting, never establishing clear probabilistic concepts of solutions (cf. Definitions 3.4–3.5 below); on the other hand, singular drifts \(b\in L^q_T B^\alpha _{\infty ,\infty }\) were previously treated in [9] only in the autonomous case, and in [17] only when they are compactly supported in space. As neither option fits our setting nicely (consider drifts of the form \(b=\tilde{b}*\mu _t\)), we extend the results therein to suit our analysis of DDSDEs.

3.1 Pathwise SDEs as nonlinear Young equations

Consider a standard SDE of the form

$$\begin{aligned} X_t=X_0+\int _0^t b(s,X_s)\mathop {}\!\text {d}s + W_t,\quad \forall \, t\in [0,T], \end{aligned}$$
(3.1)

where \(b\in L^1_T B^\alpha _{\infty ,\infty }\) with \(\alpha \in \mathbb {R}\) and W is an \(\mathbb {R}^d\)-valued fractional Brownian motion.

When \(\alpha >0\), the SDE has a classical meaning; it can be solved pathwise by standard ODE theory if b is regular enough, e.g. \(\alpha >1\). We will say that b is a distributional drift (sometimes distributional field) if instead \(\alpha <0\), in which case pointwise evaluation is not allowed, and we cannot give meaning to the integral appearing in (3.1) in the classical Lebesgue sense.

To deal with distributional drifts, we will employ the nonlinear Young integral framework, first developed in [9]; to present it, we first need the concept of averaged field.

Let us give a heuristic motivation before going into technical details. In the regular regime \(\alpha >0\), if X is a solution to (3.1), by the change of variables \(\theta _t:=X_t-W_t\) we find that \(\theta \) solves

$$\begin{aligned} \theta _t=\theta _0+\int _0^t b(s,\theta _s+W_s)\mathop {}\!\text {d}s. \end{aligned}$$
(3.2)

Closely related to the above integral is the averaging of the field b along the curve W, namely the space-time function

$$\begin{aligned} T^W b(t,x):=\int _0^t b(s,x+W_s)\mathop {}\!\text {d}s \end{aligned}$$
(3.3)

which we call an averaged field; we will write \(T^W_{s,t}b(x):=T^W b(t,x)-T^W b(s,x)\).
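To get a feeling for the object \(T^W b\), one can approximate it numerically along a simulated path. The following Python sketch is only illustrative and relies on assumptions not made in the text: \(d=1\), the field is the autonomous, bounded and discontinuous choice \(b=\mathrm {sign}\), the fBm path is sampled on a uniform grid by a Cholesky factorisation of its covariance, and the time integral in (3.3) is replaced by left-point Riemann sums. The function names `sample_fbm` and `averaged_field` are hypothetical.

```python
import numpy as np

def sample_fbm(H, n, T=1.0, rng=None):
    """Sample fBm of Hurst parameter H at times k*T/n, k = 0..n, with W_0 = 0."""
    rng = np.random.default_rng() if rng is None else rng
    t = np.linspace(T / n, T, n)
    # Covariance E[W_s W_t] = (s^{2H} + t^{2H} - |t-s|^{2H}) / 2
    cov = 0.5 * (t[:, None] ** (2 * H) + t[None, :] ** (2 * H)
                 - np.abs(t[:, None] - t[None, :]) ** (2 * H))
    path = np.linalg.cholesky(cov) @ rng.standard_normal(n)
    return np.concatenate([[0.0], path])

def averaged_field(b, W, xs, T=1.0):
    """Left-point Riemann sums for T^W b(t_k, x) = int_0^{t_k} b(x + W_s) ds."""
    n = len(W) - 1
    dt = T / n
    vals = b(xs[None, :] + W[:-1, None])     # shape (n, len(xs))
    out = np.zeros((n + 1, len(xs)))
    out[1:] = np.cumsum(vals, axis=0) * dt
    return out

rng = np.random.default_rng(3)
W = sample_fbm(H=0.3, n=500, rng=rng)
xs = np.linspace(-2.0, 2.0, 81)
TWb = averaged_field(np.sign, W, xs)         # TWb[k, i] ~ T^W b(t_k, xs[i])
```

Plotting `TWb[-1]` against `xs` gives a profile that is visibly more regular than the jump of b at the origin, in line with the regularising effect discussed in this section.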

As long as b is at least measurable and bounded, both integrals appearing in (3.2) and (3.3) are well defined. However, for distributional b, while equation (3.2) breaks down, the averaged field \(T^W b\) is still meaningful in the distributional sense, see [17,   Section 3.1]; moreover, depending on the properties of W, \(T^W b\) might even be continuous or (higher order) differentiable in the spatial variable.

The fundamental intuition of [9] is that the regularity of \(T^W b\) can be used to give meaning to (3.2), thus also to (3.1), by reformulating the SDE as a nonlinear Young equation.

As the next statement shows, given any space-time function \(A:[0,T]\times \mathbb {R}^d\rightarrow \mathbb {R}^d\) and path \(\theta :[0,T]\rightarrow \mathbb {R}^d\) of suitable regularity, it is possible to give meaning to \(\int _0^t \partial _t A(s,\theta _s)\mathop {}\!\text {d}s\) even when \(\partial _t A\) is no longer well defined.

Proposition 3.1

Let \(\gamma >1/2\) and consider a function \(A\in C^\gamma _T C^1_{loc}\) and a path \(\theta \in C^\gamma _T\). Then for any interval \([s,t]\subset [0,T]\) and any sequence of partitions \(\mathcal {D}_n\) of \([s,t]\) with mesh converging to zero, the following limit exists and is independent of the chosen sequence:

$$\begin{aligned} \int _s^t A(\mathop {}\!\text {d}u,\theta _u):= \lim _{n\rightarrow \infty } \sum _{[u,v]\in \mathcal {D}_n} A_{u,v}(\theta _u). \end{aligned}$$

We will refer to it as a nonlinear Young integral. Furthermore:

  1. i.

    The integral is additive: \(\int _s^tA(\mathop {}\!\text {d}u,\theta _u)=\int _s^rA(\mathop {}\!\text {d}u,\theta _u)+\int _r^t A(\mathop {}\!\text {d}u,\theta _u)\) for any \(r\in [s,t]\).

  2. ii.

    If \(\partial _t A\) exists and is continuous, then \(\int _0^t A(\mathop {}\!\text {d}u,\theta _u)=\int _0^t \partial _u A(u,\theta _u)\mathop {}\!\text {d}u\).

  3. iii.

    The map from \(C^\gamma _T C^1_{loc}\times C^\gamma _T\) to \(C^\gamma _T\) given by \((A,\theta )\mapsto \int _0^\cdot A(\mathop {}\!\text {d}u,\theta _u)\) is linear in A and continuous in both variables. Namely, if \(A^n\rightarrow A\) in \(C^\gamma _T C^1_{loc}\) and \(\theta ^n\rightarrow \theta \) in \(C^\gamma _T\), then \(\int _0^\cdot A^n(\mathop {}\!\text {d}s,\theta ^n_s)\rightarrow \int _0^\cdot A(\mathop {}\!\text {d}s,\theta _s)\) in \(C^\gamma _T\).

Proof

The statement is a particular subcase of [15,   Theorem 2.7]. \(\square \)
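The Riemann-type sums defining the nonlinear Young integral are straightforward to implement. The sketch below checks Point ii. of Proposition 3.1 on a toy example, with the hypothetical choices \(A(t,x)=t^2x\) and \(\theta _u=u\) in dimension one, for which \(\int _0^1 \partial _u A(u,\theta _u)\mathop {}\!\text {d}u=\int _0^1 2u^2\mathop {}\!\text {d}u=2/3\); the function name `nonlinear_young_integral` is introduced only for illustration.

```python
import numpy as np

def nonlinear_young_integral(A, theta, s, t, n):
    """Sum of A_{u,v}(theta_u) = A(v, theta_u) - A(u, theta_u) over a uniform partition."""
    grid = np.linspace(s, t, n + 1)
    u, v = grid[:-1], grid[1:]
    x = theta(u)                       # path evaluated at left endpoints
    return np.sum(A(v, x) - A(u, x))

A = lambda t, x: t ** 2 * x            # smooth in time: d/dt A(t, x) = 2 t x
theta = lambda u: u                    # a smooth path

approx = nonlinear_young_integral(A, theta, 0.0, 1.0, n=10_000)
exact = 2.0 / 3.0                      # int_0^1 2 u * u du
```

The same routine applies verbatim when A is only Hölder continuous in time, which is the case of interest here, although in that regime no closed-form comparison value is available.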

We provided the statement only for \(A\in C^\gamma _T C^1_{loc}\) as this setting is sufficient for our purposes, but let us mention that the theory is more general and allows one to consider \(A\in C^\gamma _T C^\nu _{loc}\), \(\theta \in C^\rho _T\) for \(\gamma +\nu \rho >1\). With the above result at hand, we can now define nonlinear Young equations.

Definition 3.2

Let \(A\in C^\gamma _T C^1_{loc}\) with \(\gamma >1/2 \), \(\theta _0\in \mathbb {R}^d\); we say that \(\theta \) is a solution to the nonlinear Young equation associated to \((\theta _0,A)\) if \(\theta \in C^\gamma _T\) and

$$\begin{aligned} \theta _t=\theta _0+\int _0^t A(\mathop {}\!\text {d}s,\theta _s)\quad \forall \, t\in [0,T]. \end{aligned}$$
(3.4)

For later use, we provide the following technical lemma; loosely speaking it shows that solutions to nonlinear Young equations have a closure property.

Lemma 3.3

Let \(\gamma >1/2\), \(A\in C^\gamma _T C^1_{loc}\) and \(\{A^n\}_{n\in \mathbb {N}}\) be a sequence converging to A in \(C^\gamma _T C^1_{loc}\); suppose that for each n there exists a solution \(\theta ^n\) associated to \((\theta _0,A^n)\) and that \(\theta ^n\rightarrow \theta \) in \(C^\gamma _T\). Then \(\theta \) solves the nonlinear Young equation associated to \((\theta _0,A)\).

Proof

This is a direct consequence of Point iii. of Proposition 3.1. By assumption

$$\begin{aligned} \theta ^n_t = \theta _0 + \int _0^t A^n(\mathop {}\!\text {d}s,\theta ^n_s) \quad \forall \, t\in [0,T],\,n\in \mathbb {N}\end{aligned}$$

and we can pass to the limit on both sides thanks to the continuity of \((\theta ,A)\mapsto \int _0^\cdot A(\mathop {}\!\text {d}s,\theta _s)\). \(\square \)

We are now ready to explain what it means for X to be a solution to (3.1) when b is distributional but \(T^W b\) is regular enough: roughly speaking, we impose the condition \(X=\theta + W\), where \(\theta \) solves the nonlinear YDE associated to \(A=T^W b\), which is the natural extension of (3.2). Although so far we have always dealt with a stochastic process W, this is a pathwise notion of solution, in the sense that for any fixed realization of \(W(\omega )\) such that \(T^{W(\omega )} b\in C^\gamma _T C^1_{loc}\) we have an analytically well-defined equation of the form (3.4). This is encoded in the next definition, inspired by [18,   Section 4.3], which contains a more in-depth discussion of various related concepts.

Definition 3.4

Let \((\Omega ,\mathcal {F},\mathbb {P})\) be a probability space, \((\xi ,W)\) an \(\mathbb {R}^d\times C_T\)-valued random variable defined on it and let b be a distributional field. We say that another \(C_T\)-valued random variable X on \((\Omega ,\mathcal {F},\mathbb {P})\) is a pathwise solution to the SDE (3.1) associated to \((b,\xi , W)\) if there exists \(\Omega '\subset \Omega \) with \(\mathbb {P}(\Omega ')=1\) and a deterministic \(\gamma >1/2\) such that for all \(\omega \in \Omega '\) the following hold:

  1. i.

    \(T^{W(\omega )} b \in C^\gamma _T C^1_{loc}\);

  2. ii.

    \(\theta (\omega ):= X(\omega )-W(\omega )\in C^\gamma _T\);

  3. iii.

    \(\theta (\omega )\) satisfies the nonlinear Young equation

    $$\begin{aligned} \theta _t(\omega ) = \xi (\omega ) - W_0(\omega ) + \int _0^t T^{W(\omega )} b(\mathop {}\!\text {d}s, \theta _s(\omega )) \quad \forall \, t\in [0,T]. \end{aligned}$$
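For a fixed realisation of W, the nonlinear Young equation in Point iii. suggests the explicit scheme \(\theta _{k+1}=\theta _k+T^W_{t_k,t_{k+1}}b(\theta _k)\). The following Python sketch implements it under simplifying assumptions: \(d=1\), the drift is the smooth autonomous choice \(b=\sin \) (so that the classical formulation (3.2) is available for comparison), the path W is a Brownian-type random walk on a fine grid, and the increments of \(T^W b\) are computed by Riemann sums on that grid. The function name `euler_young` is hypothetical.

```python
import numpy as np

def euler_young(b, W, theta0, block, T=1.0):
    """theta_{k+1} = theta_k + T^W_{t_k, t_{k+1}} b(theta_k) on a coarse grid.

    W     : path sampled on a fine uniform grid of n = len(W) - 1 steps
    block : number of fine steps per coarse step
    """
    n = len(W) - 1
    h = T / n
    theta = [theta0]
    for k in range(0, n, block):
        # increment of the averaged field over one coarse step, frozen at theta_k
        incr = h * np.sum(b(theta[-1] + W[k:k + block]))
        theta.append(theta[-1] + incr)
    return np.array(theta)

rng = np.random.default_rng(4)
n = 4096
W = np.concatenate([[0.0], np.cumsum(rng.standard_normal(n)) * np.sqrt(1.0 / n)])

coarse = euler_young(np.sin, W, theta0=0.2, block=64)   # 64 coarse steps
fine = euler_young(np.sin, W, theta0=0.2, block=1)      # plain Euler scheme for (3.2)
X_T = coarse[-1] + W[-1]                                 # X = theta + W at time T
```

Since \(|b|\le 1\) and b is 1-Lipschitz, a discrete Gronwall argument shows that the two terminal values differ by at most \((e-1)\) times the coarse step size, whatever the path W. The point of the scheme is that it only requires increments of \(T^W b\), which remain meaningful for distributional b.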

The following definition relates standard probabilistic notions of weak and strong solutions and of uniqueness to the notion of pathwise existence given in Definition 3.4.

Definition 3.5

Let b be a distributional field, \(\nu \in \mathcal {P}(\mathbb {R}^d\times C_T)\). A tuple \((\Omega ,\mathcal {F},\mathbb {P}; X,\xi , W)\) given by a probability space \((\Omega ,\mathcal {F},\mathbb {P})\) and a \(C_T\times \mathbb {R}^d\times C_T\)-valued random variable is a weak solution to the SDE (3.1) associated to \((b,\nu )\) if \(\mathcal {L}_\mathbb {P}(\xi ,W)=\nu \) and X is a pathwise solution associated to \((b,\xi ,W)\) in the sense of Definition 3.4. We say that X is a strong solution if it is adapted to the filtration \(\mathcal {F}_t=\sigma \{\xi , W_s\,|\, s\le t\}\). Weak uniqueness holds for the SDE associated to \((b,\nu )\) if any given weak solutions \((\Omega ^i ,\mathcal {F}^i,\mathbb {P}^i;X^i,\xi ^i,W^i)\), \(i=1,2\), associated to the same data \((b,\nu )\), satisfy \(\mathcal {L}_{\mathbb {P}^1}(X^1)=\mathcal {L}_{\mathbb {P}^2}(X^2)\). Similarly, pathwise uniqueness holds if any two given solutions \((X^i,\xi ,W)\) defined on the same probability space, w.r.t. the same \((b,\xi ,W)\), satisfy \(X^1=X^2\) \(\mathbb {P}\)-a.s.

In line with the above definition, we will use the standard terminology that weak (resp. strong) existence holds for the SDE associated to \((b,\nu )\) to mean that we can construct a weak (resp. strong) solution \((\Omega ,\mathcal {F},\mathbb {P};X,\xi ,W)\). In particular, if strong existence holds, then \((\Omega ,\mathcal {F},\mathbb {P})\) can be chosen to be the canonical space, namely with \(\Omega =\mathbb {R}^d\times C_T\), \(\mathbb {P}=\nu \) and \(\mathcal {F}\) the completion of \(\mathcal {B}(\mathbb {R}^d\times C_T)\) under \(\nu \).

Remark 3.6

If \(b\in C_T C^1_{loc}\), then any classical solution to (3.1) is of the form \(X=W+\theta \) for \(\theta \in C^1_T\); moreover in this case \(T^W b\in C^1_T C^1_{loc}\) and \(\partial _t T^W b(t,x) = b(t,x+W_t)\). It then follows from Point ii. of Proposition 3.1 that in this setting the concept of pathwise solution from Definition 3.4 is equivalent to the standard one. Moreover for \(b\in C_T C^1_{loc}\) standard ODE theory guarantees pathwise uniqueness, uniqueness in law and strong existence of solutions for the SDE associated to \((b,\nu )\) for any choice of \(\nu \in \mathcal {P}(\mathbb {R}^d\times C_T)\).

The next lemma provides a simple condition to establish uniqueness of solutions to (3.1).

Lemma 3.7

Let \((\Omega ,\mathcal {F},\mathbb {P})\) be a probability space, \((X,\xi ,W)\) be a triple defined on it such that X solves the SDE associated to \((b,\xi ,W)\) in the sense of Definition 3.4. If \(T^{X(\omega )} b\in C^\gamma _T C^1_{loc}\) for \(\mathbb {P}\)-a.e. \(\omega \), then any other solution \({\tilde{X}}\) defined on the same probability space and associated to \((b,\xi ,W)\) must coincide with it, in the sense that \(X={\tilde{X}}\) \(\mathbb {P}\)-a.s.

Proof

The statement is a useful rewriting of [17,   Remark 15]. \(\square \)

Let us stress that, even when the assumptions of Lemma 3.7 are met, pathwise uniqueness does not immediately follow, unless one can additionally show that X is a strong solution.

3.2 Regularity of averaged fields and Girsanov transform for fBm

In Sect. 3.1 we have treated the SDE (3.1) in full generality, but in the remainder of Sect. 3 we will deal with a slightly more specific setting. We will always take W to be an \(\mathbb {R}^d\)-valued fBm of parameter \(H\in (0,1)\) and \(\xi \) to be random initial data independent of it; in particular \(W_0\equiv 0\) and \(\nu = \mathcal {L}(\xi ,W)=\mathcal {L}(\xi )\otimes \mathcal {L}(W)=\mu _0\otimes \mu ^H\) for some \(\mu _0\in \mathcal {P}(\mathbb {R}^d)\), where \(\mu ^H\in \mathcal {P}(C_T)\) denotes the law of fBm of parameter \(H\in (0,1)\). Therefore for fixed H we can regard the data of the problem to be the pair \((\mu _0,b)\); if the initial data \(\xi =x_0\in \mathbb {R}^d\) is deterministic, with a slight abuse we will write \((x_0,b)\) in place of \((\delta _{x_0},b)\).

We begin by showing the \(\mathbb {P}\)-a.s. regularity of averaged fields \(T^W b\) for W sampled as an fBm. We continue to make use of the intuitive notation

$$\begin{aligned} \int _0^t b(r,x+W_r)\mathop {}\!\text {d}r = T^W b(t,x), \end{aligned}$$

despite the fact that in general these objects will not be defined as Lebesgue integrals; rather they are random variables defined on \((\Omega ,\mathcal {F},\mathbb {P})\) constructed as the unique limits of \(\int _0^t b^n(r,x+W_r)\mathop {}\!\text {d}r\) for any sequence \(b^n\rightarrow b\) in appropriate topologies. More precisely, for \(b\in L^q_T B^\alpha _{\infty ,\infty }\) satisfying the assumptions of Proposition 3.8 below, one can consider any sequence \(b^n\) of smooth bounded fields such that \(b^n\rightarrow b\) in \(L^q_T B^{\alpha -\varepsilon }_{\infty ,\infty }\) for all \(\varepsilon >0\).

Proposition 3.8

Let \(b\in L^q_T B^\alpha _{\infty ,\infty }\) with \(\alpha <0\), \(q\in (2,\infty ]\), W be a fBm of parameter \(H\in (0,1)\); suppose \((\alpha ,q)\) satisfy

$$\begin{aligned} \gamma :=1-\frac{1}{q}+\alpha H>\frac{1}{2}. \end{aligned}$$
(3.5)

Then for any \({\tilde{\gamma }}<\gamma \) there exists an increasing function K (depending on d, T and the above parameters) such that

(3.6)

uniformly over \(x\in \mathbb {R}^d\) and \(b\in L^q_T B^\alpha _{\infty ,\infty }\), \(b\ne 0\).

Proof

As the proof follows quite closely the ones given in [17,   Section 3.3], we only provide a sketch. Let b be smooth and compactly supported, otherwise one can argue by density; up to reasoning componentwise, scaling and shifting, we can assume \(x=0\), \(b\in C^\infty _c(\mathbb {R}^d)\) and \(\Vert b\Vert _{L^q_T B^\alpha _{\infty ,\infty }} = 1\), so it will never appear in the computations in the sequel.

Set \(W^{(2)}_{s,t}=\mathbb {E}[W_t|\mathcal {F}_s]\) for \(\mathcal {F}_s=\sigma \{W_r: r\le s\}\), then by [17,   Lemma 5] there exist \(c_H,\,\tilde{c}_H>0\) such that

$$\begin{aligned} \int _s^t b(r,W_r)\mathop {}\!\text {d}r= & {} \int _s^t P_{\tilde{c}_H |r-s|^{2H}} b(r,W^{(2)}_{s,r})\mathop {}\!\text {d}r \nonumber \\&\quad + c_H\int _s^t \int _u^t P_{\tilde{c}_H |r-u|^{2H}} \nabla b(r,W^{(2)}_{u,r})\, |r-u|^{H-1/2} \mathop {}\!\text {d}r \cdot \mathop {}\!\text {d}B_u \nonumber \\ \end{aligned}$$
(3.7)

where \(P_t\) denotes the Gaussian heat kernel, and \(B_t\) is a standard Brownian motion in \(\mathbb {R}^d\). In the following we will drop the constants \(c_H,\tilde{c}_H\), as they do not play any significant role. Thus for any fixed \(s<t\), it holds

$$\begin{aligned} \int _s^t b(r,W_r)\mathop {}\!\text {d}r = I^1_{s,t} + I^2_{s,t} = I^1_{s,t} + \int _s^t J_{u,t}\cdot \mathop {}\!\text {d}B_u \end{aligned}$$

where

$$\begin{aligned} I^1_{s,t} := \int _s^t P_{|r-s|^{2H}} b(r,W^{(2)}_{s,r})\mathop {}\!\text {d}r, \quad J_{u,t} :=\int _u^t P_{|r-u|^{2H}} \nabla b(r,W^{(2)}_{u,r}) |r-u|^{H-1/2} \mathop {}\!\text {d}r. \end{aligned}$$

Let us show how to obtain exponential estimates for \(I^2\), the ones for \(I^1\) being similar. Going through analogous computations to [17,   Theorem 4], invoking heat kernel type estimates, it holds

$$\begin{aligned} |J_{u,t}|&\lesssim \int _u^t \Vert P_{|r-u|^{2H}} \nabla b_r \Vert _{L^\infty _x}\, |r-u|^{H-1/2} \mathop {}\!\text {d}r\\&\lesssim \int _u^t \Vert b_r\Vert _{B^\alpha _{\infty ,\infty }} |r-u|^{H\alpha -1/2} \mathop {}\!\text {d}r\\&\lesssim \Vert b\Vert _{L^q_T B^\alpha _{\infty ,\infty }} |t-u|^{1/2 - 1/q + H\alpha } = |t-u|^{\gamma -1/2}. \end{aligned}$$
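The exponent bookkeeping in the above chain of inequalities can be double-checked with exact rational arithmetic. The following Python sketch tracks the powers of \(|r-u|\) and \(|t-u|\) only; the function name `exponents` and the sample values \(H=1/4\), \(\alpha =-1/2\), \(q=4\) are illustrative choices.

```python
from fractions import Fraction as F

def exponents(H, alpha, q_inv):
    """Powers appearing in the estimate of J_{u,t}; q_inv stands for 1/q."""
    heat = H * (alpha - 1)                    # heat kernel bound on P_{|r-u|^{2H}} grad b_r
    integrand = heat + (H - F(1, 2))          # multiplied by |r-u|^{H-1/2}
    after_holder = integrand + (1 - q_inv)    # Hoelder in time with exponent q' = q/(q-1)
    gamma = 1 - q_inv + alpha * H             # definition (3.5)
    return integrand, after_holder, gamma

H, alpha, q_inv = F(1, 4), F(-1, 2), F(1, 4)
integrand, after_holder, gamma = exponents(H, alpha, q_inv)
```

For these values one finds `integrand == H * alpha - 1/2`, `after_holder == gamma - 1/2` and `gamma == 5/8`. The time integral in the second line is finite precisely when `integrand / (1 - q_inv) > -1`, which is equivalent to the condition \(\gamma >1/2\) in (3.5).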

Applying the Burkholder-Davis-Gundy inequality, with the optimal asymptotic behaviour of the constant for large p, we deduce that

$$\begin{aligned} \mathbb {E}[|I^2_{s,t}|^p]^{1/p} \lesssim \sqrt{p}\, \mathbb {E}\bigg [ \bigg (\int _s^t |J_{u,t}|^2\mathop {}\!\text {d}u\bigg )^{p/2}\bigg ]^{1/p} \lesssim \sqrt{p} |t-s|^\gamma \end{aligned}$$
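The last inequality can be checked directly. The following sketch only uses the deterministic bound \(|J_{u,t}|\lesssim |t-u|^{\gamma -1/2}\) obtained above, together with \(\gamma >1/2\):

$$\begin{aligned} \int _s^t |J_{u,t}|^2\,\text {d}u \lesssim \int _s^t |t-u|^{2\gamma -1}\,\text {d}u = \frac{|t-s|^{2\gamma }}{2\gamma }. \end{aligned}$$

Since this bound is deterministic, raising it to the power \(p/2\), taking expectation and then the p-th root yields \(|t-s|^\gamma \) up to a constant that does not depend on p; the only dependence on p is therefore through the factor \(\sqrt{p}\) coming from the Burkholder-Davis-Gundy constant.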

Putting everything together, there exists a constant \(C>0\) such that for any \(\eta >0\)

and by Stirling’s approximation the last series is convergent for any \(\eta < (C e)^{-1}\). Together with similar estimates for \(I^1\), we conclude that there exists \(\tilde{\eta }>0\) sufficiently small and \(C>0\) such that

The above estimate together with [17,   Lemma 18] implies that for any \(\tilde{\gamma }<\gamma \) there exist \(\bar{\eta }>0\) and \(\kappa >0\) such that

(3.8)
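The series argument invoked above can be sketched as follows. We assume here that the moment bound is available in the form \(\mathbb {E}[|I^2_{s,t}|^p]^{1/p}\le c\sqrt{p}\,|t-s|^\gamma \) for all \(p\ge 2\), where c is a name introduced only for this illustration. Expanding the exponential and exchanging sum and expectation by monotone convergence,

$$\begin{aligned} \mathbb {E}\bigg [\exp \bigg (\eta \,\frac{|I^2_{s,t}|^2}{|t-s|^{2\gamma }}\bigg )\bigg ]&= \sum _{n\ge 0} \frac{\eta ^n}{n!}\, \frac{\mathbb {E}[|I^2_{s,t}|^{2n}]}{|t-s|^{2\gamma n}} \le 1+\sum _{n\ge 1} \frac{\eta ^n}{n!}\, \big (c\sqrt{2n}\big )^{2n} = 1+\sum _{n\ge 1} \frac{(2c^2\eta )^n\, n^n}{n!}\\&\le 1+\sum _{n\ge 1} \frac{(2c^2 e\,\eta )^n}{\sqrt{2\pi n}}, \end{aligned}$$

where the last step uses the Stirling lower bound \(n!\ge \sqrt{2\pi n}\,(n/e)^n\). The series converges as soon as \(2c^2e\,\eta <1\), which is a condition of the form \(\eta <(Ce)^{-1}\) with \(C=2c^2\).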

It remains to show that we can improve the above inequality by allowing any value \(\eta >0\), so that we reach (3.6). To do so, we will resort to an interpolation trick, similar in style to techniques already applied in [17,   Theorem 15], [9,   Corollary 4.6].

First, observe that if \(\alpha ,q,H\) satisfy (3.5) and we fix \({\tilde{\gamma }}<\gamma \), then we can find \(\varepsilon >0\) sufficiently small so that \(\gamma ^\varepsilon = 1-1/q+(\alpha -\varepsilon )H>1/2\) and \({\tilde{\gamma }} < \gamma ^\varepsilon \); then by estimate (3.8) (for \(\alpha -\varepsilon \) in place of \(\alpha \)) and linearity, there exist \(\bar{\eta }>0\) and \(\kappa >0\) such that

(3.9)

As before we can assume \(\Vert b\Vert _{L^q_T B^\alpha _{\infty ,\infty }}=1\) and we fix \(\varepsilon >0\) as above. Then for any \(N\in \mathbb {N}\) we can decompose b as

$$\begin{aligned} b_t = b^{1,N}_t + b^{2,N}_t,\quad b^{1,N}_t = \sum _{j\le N} \Delta _j b_t,\quad b^{2,N}_t =\sum _{j>N} \Delta _j b_t \end{aligned}$$

where \(\Delta _j\) denote Littlewood-Paley blocks. There exists \(C>0\) such that

$$\begin{aligned} \Vert b^{1,N}\Vert _{L^q_T C^0_x} \le C\, 2^{-N\alpha }, \quad \Vert b^{2,N}\Vert _{L^q_T B^{\alpha -\varepsilon }_{\infty ,\infty }} \le C\, 2^{-N\varepsilon }. \end{aligned}$$
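Both bounds follow from the definition of the Besov norm. The sketch below assumes \(\alpha <0\) (as the factor \(2^{-N\alpha }\) suggests), the normalisation \(\Vert b\Vert _{L^q_T B^\alpha _{\infty ,\infty }}=1\), and the convention that the blocks are indexed by \(j\ge -1\) with \(\Delta _j\Delta _k=0\) for \(|j-k|\ge 2\). For fixed t, since \(\Vert \Delta _j b_t\Vert _{L^\infty _x}\le 2^{-j\alpha }\Vert b_t\Vert _{B^\alpha _{\infty ,\infty }}\),

$$\begin{aligned} \Vert b^{1,N}_t\Vert _{C^0_x}&\le \sum _{-1\le j\le N} 2^{-j\alpha }\, \Vert b_t\Vert _{B^\alpha _{\infty ,\infty }} \le \frac{2^{-N\alpha }}{1-2^{\alpha }}\, \Vert b_t\Vert _{B^\alpha _{\infty ,\infty }},\\ \Vert b^{2,N}_t\Vert _{B^{\alpha -\varepsilon }_{\infty ,\infty }}&\lesssim \sup _{j\ge N} 2^{j(\alpha -\varepsilon )}\, \Vert \Delta _j b_t\Vert _{L^\infty _x} \le \sup _{j\ge N} 2^{-j\varepsilon }\, \Vert b_t\Vert _{B^\alpha _{\infty ,\infty }} = 2^{-N\varepsilon }\, \Vert b_t\Vert _{B^\alpha _{\infty ,\infty }}. \end{aligned}$$

Taking the \(L^q_T\)-norm in time on both sides gives the two stated inequalities.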

Now for a given \(\eta >0\), choose \(N=N(\eta )\in \mathbb {N}\) such that \(\eta \le C^{-2} 2^{2N\varepsilon -1} \bar{\eta }\) and decompose b as above; w.l.o.g. we may assume that \(b^{2,N}\ne 0\), otherwise the stated estimate is trivial. Clearly under (3.5) it holds that \({\tilde{\gamma }}\le 1-1/q\), therefore setting \(\beta =1-1/q-{\tilde{\gamma }}\) we have

where the estimate is deterministic; combining it with (3.9) applied to \({\tilde{b}}=b^{2,N}\), we get

where the estimate now holds for all \(\eta \ge 0\). \(\square \)
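The role of the choice of \(N(\eta )\) in the last step can be sketched as follows. We keep \(x=0\) and \(\Vert b\Vert _{L^q_T B^\alpha _{\infty ,\infty }}=1\) as in the proof, and use the shorthand \(a:=\llbracket \int _0^\cdot b^{1,N}(r,W_r)\text {d}r\rrbracket _{\tilde{\gamma }}\), \(c:=\llbracket \int _0^\cdot b^{2,N}(r,W_r)\text {d}r\rrbracket _{\tilde{\gamma }}\), which is introduced here only for illustration. Then

$$\begin{aligned} a&\le \sup _{s<t} \frac{1}{|t-s|^{\tilde{\gamma }}} \int _s^t \Vert b^{1,N}_r\Vert _{C^0_x}\,\text {d}r \le T^\beta \, \Vert b^{1,N}\Vert _{L^q_T C^0_x} \le C\, T^\beta \, 2^{-N\alpha },\\ \exp \big (\eta \,(a+c)^2\big )&\le \exp \big (2\eta \, a^2\big )\, \exp \big (2\eta \, c^2\big ),\\ 2\eta \, c^2&= 2\eta \, \Vert b^{2,N}\Vert ^2_{L^q_T B^{\alpha -\varepsilon }_{\infty ,\infty }}\, \bigg (\frac{c}{\Vert b^{2,N}\Vert _{L^q_T B^{\alpha -\varepsilon }_{\infty ,\infty }}}\bigg )^2 \le \bar{\eta }\, \bigg (\frac{c}{\Vert b^{2,N}\Vert _{L^q_T B^{\alpha -\varepsilon }_{\infty ,\infty }}}\bigg )^2, \end{aligned}$$

where the first line uses Hölder's inequality in time and \(\beta =1-1/q-{\tilde{\gamma }}\ge 0\), while the last inequality is exactly the requirement \(\eta \le C^{-2}2^{2N\varepsilon -1}\bar{\eta }\). The factor \(\exp (2\eta a^2)\) is thus a deterministic constant depending on \(\eta \) through \(N(\eta )\), and the expectation of \(\exp (2\eta c^2)\) is controlled by an exponential estimate at the fixed level \(\bar{\eta }\) for the normalised drift \(b^{2,N}/\Vert b^{2,N}\Vert _{L^q_T B^{\alpha -\varepsilon }_{\infty ,\infty }}\).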

Corollary 3.9

Let \(b\in L^q_T B^\alpha _{\infty ,\infty }\) with \(\alpha <0\), \(q\in (2,\infty ]\), W be a fBm of parameter \(H\in (0,1)\) and let \(\rho \in (0,1]\); suppose \((\alpha ,\rho ,q)\) satisfy

$$\begin{aligned} \gamma :=1-\frac{1}{q}+(\alpha -\rho ) H>\frac{1}{2}. \end{aligned}$$
(3.10)

Then for any \({\tilde{\gamma }}<\gamma \) there exists an increasing function K (depending on d, T and the previous parameters) such that

$$\begin{aligned} \mathbb {E}\Bigg [\exp \bigg (\eta \, \bigg | \frac{\llbracket \int _0^\cdot b(r,x+W_r)\text {d}r-\int _0^\cdot b(r,y+W_r)\text {d}r\rrbracket _{\tilde{\gamma }}}{\Vert b\Vert _{L^q_T B^\alpha _{\infty ,\infty }} |x-y|^{\rho }}\bigg |^2\bigg )\Bigg ]\le K(\eta )\quad \forall \, \eta \ge 0\nonumber \\ \end{aligned}$$
(3.11)

uniformly over \(x\ne y\in \mathbb {R}^d\) and \(b\in L^q_T B^\alpha _{\infty ,\infty }\), \(b\ne 0\); as a consequence, for any \(\varepsilon >0\), \(\mathbb {P}\)-a.s. \(T^W b \in C^{\gamma -\varepsilon }_T C^{\rho -\varepsilon }_{loc}\). Suppose now \(b\in L^q_T B^\alpha _{\infty ,\infty }\) with \(\alpha <1\), \(q\in (2,\infty ]\) satisfying

$$\begin{aligned} \alpha -\frac{1}{Hq}>1-\frac{1}{2H}, \end{aligned}$$
(3.12)

then the following hold:

  i. There exists \({\tilde{\gamma }}>1/2\) such that \(\mathbb {P}\)-a.s. \(T^W b\in C^{{\tilde{\gamma }}}_T C^1_{loc}\).

  ii. There exists \({\tilde{\gamma }}>1/2\) such that for any \(b^1,b^2\in L^q_T B^\alpha _{\infty ,\infty }\) and any \(n\in \mathbb {N}\)

  iii. If \(H<1/2\), \(\alpha <0\), there exists \({\tilde{\gamma }}>H+1/2\) and an increasing function K such that

Proof

Given b as above, \(x\ne y\) fixed, define \({\tilde{b}}(t,\cdot )= |x-y|^{-\rho }\, [b(t,x+\cdot )-b(t,y+\cdot )]\); by properties of Besov spaces

$$\begin{aligned} \Vert \tilde{b}\Vert _{L^q_T B^{\alpha -\rho }_{\infty ,\infty }} \lesssim \Vert b \Vert _{L^q_T B^\alpha _{\infty ,\infty }}. \end{aligned}$$
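The Besov property used here can be verified blockwise. The following is a sketch assuming Bernstein's inequality \(\Vert \nabla \Delta _j f\Vert _{L^\infty _x}\lesssim 2^j\Vert \Delta _j f\Vert _{L^\infty _x}\) and the elementary bound \(\min \{1,a\}\le a^\rho \) for \(a\ge 0\), \(\rho \in (0,1]\). For any \(j\ge -1\) and fixed t,

$$\begin{aligned} \Vert \Delta _j [b_t(x+\cdot )-b_t(y+\cdot )]\Vert _{L^\infty _x}&\le \min \big \{ 2\Vert \Delta _j b_t\Vert _{L^\infty _x},\, |x-y|\, \Vert \nabla \Delta _j b_t\Vert _{L^\infty _x}\big \}\\&\lesssim \min \{1, 2^j|x-y|\}\, \Vert \Delta _j b_t\Vert _{L^\infty _x} \le 2^{j\rho }\, |x-y|^\rho \, \Vert \Delta _j b_t\Vert _{L^\infty _x}. \end{aligned}$$

Multiplying by \(2^{j(\alpha -\rho )}|x-y|^{-\rho }\) and taking the supremum over j gives \(\Vert {\tilde{b}}_t\Vert _{B^{\alpha -\rho }_{\infty ,\infty }}\lesssim \Vert b_t\Vert _{B^\alpha _{\infty ,\infty }}\), and the claim follows by taking the \(L^q_T\)-norm in time.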

Inequality (3.11) follows from (3.6) applied to \({\tilde{b}}\), since by assumption (3.10) \({\tilde{\alpha }}=\alpha -\rho \) satisfies (3.5); the fact that \(T^W b\) belongs to \(C^{\gamma -\varepsilon }_T C^{\rho -\varepsilon }_{loc}\) is a consequence of the Garsia-Rodemich-Rumsay lemma.

We now assume (3.12) holds and prove points i.-iii.

If \(b\in L^q_T B^\alpha _{\infty ,\infty }\), then \(D_x b\in L^q_T B^{\alpha -1}_{\infty ,\infty }\) with \({\tilde{\alpha }}=\alpha -1\) satisfying (3.5), so we can find \(\rho >0\) small enough such that \(({\tilde{\alpha }},\rho )\) satisfy (3.10) as well. It follows that \(T^W D_x b=D_x T^W b\in C^{\gamma -\varepsilon }_T C^0_{loc}\), namely \(T^W b\in C^{\gamma -\varepsilon }_T C^1_{loc}\), for any \(\varepsilon >0\), showing i.
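The relation between (3.12) and the other parameter conditions is a one-line computation. In the sketch below we assume that condition (3.5) reads \(1-1/q+\alpha H>1/2\), as suggested by (3.10) with \(\rho =0\). Multiplying (3.12) by \(H>0\) and rearranging,

$$\begin{aligned} \alpha -\frac{1}{Hq}>1-\frac{1}{2H} \quad \Longleftrightarrow \quad \alpha H-\frac{1}{q}>H-\frac{1}{2} \quad \Longleftrightarrow \quad 1-\frac{1}{q}+(\alpha -1)H>\frac{1}{2} \quad \Longleftrightarrow \quad 1-\frac{1}{q}+\alpha H>H+\frac{1}{2}. \end{aligned}$$

The third form is (3.5) for \({\tilde{\alpha }}=\alpha -1\); since the inequality is strict, it remains true with \(\alpha -1-\rho \) in place of \(\alpha -1\) for all sufficiently small \(\rho >0\), which is (3.10) for the pair \(({\tilde{\alpha }},\rho )\).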

For \(\tau =T\), the statement in part ii. is again a consequence of (3.6) (for \(x=0\) and \({\tilde{\alpha }}=\alpha -1\)) and the linearity of \(b\mapsto T^W b\). For general \(\tau \in [0,T]\), define \(\tilde{b}^i_t= b^i_t\, \mathbbm {1}_{[0,\tau ]}(t)\) and observe that

the estimate for general \(\tau \) thus follows applying the one for \(\tau =T\) to \({\tilde{b}}^i\).

Finally, in order to prove iii. it is enough to show that

$$\begin{aligned} \gamma = 1-\frac{1}{q}+\alpha H > H + 1/2, \end{aligned}$$

as in that case we can find \(\tilde{\gamma }\in (H+1/2,\gamma )\) such that (3.6) holds. But the above condition on \(\gamma \) is exactly (3.12). \(\square \)

In order to apply Lemma 3.7, we need some information on the pathwise properties of weak solutions X. From this perspective, techniques based on Girsanov theorem are very natural, as they suggest that \(T^X b\) may have the same regularity as \(T^W b\). As already mentioned, Girsanov transform holds for fBm, see [37]; sufficient conditions in order to apply it in our context (in particular to check that Novikov condition is satisfied) can be found in [17,   Section 4.2.2], to which we also refer for more details on the explicit formula for \(\mathop {}\!\text {d}\mathbb {P}/ \mathop {}\!\text {d}\mathbb {Q}\).

Proposition 3.10

Let \((\Omega ,\mathcal {F},\{\mathcal {F}_t\}_{t\ge 0},\mathbb {P})\) be a filtered probability space, W be an \(\mathcal {F}_t\)-fBm of parameter \(H\in (0,1)\) and h be an \(\mathcal {F}_t\)-adapted process with trajectories in \(C^\gamma _T\), \(\gamma >H+1/2\), such that \(h_0=0\) and

$$\begin{aligned} \mathbb {E}_\mathbb {P}[\exp (\eta \llbracket h\rrbracket _{\gamma }^2)] \le K(\eta )<\infty \quad \forall \, \eta \in \mathbb {R}. \end{aligned}$$

Then there exists another probability measure \(\mathbb {Q}\), given by Girsanov theorem, such that \(h+W\) is distributed as an \(\mathcal {F}_t\)-fBm under \(\mathbb {Q}\). Moreover \(\mathbb {P}\) and \(\mathbb {Q}\) are equivalent and it holds

$$\begin{aligned} \mathbb {E}_\mathbb {Q}\Big [ \Big ( \frac{\mathop {}\!\text {d}\mathbb {P}}{\mathop {}\!\text {d}\mathbb {Q}}\Big )^n + \Big (\frac{\mathop {}\!\text {d}\mathbb {Q}}{\mathop {}\!\text {d}\mathbb {P}} \Big )^n\Big ] <\infty \quad \forall \, n\in \mathbb {N}\end{aligned}$$
(3.13)

where the above estimate only depends on the function K.

Proof

Follows almost exactly as the proof of [17,   Theorem 14]. \(\square \)

Remark 3.11

For \(H\le 1/2\) and \(b\in L^q_T B^\alpha _{\infty ,\infty }\) with \((\alpha ,q,H)\) satisfying (3.12), it follows from Corollary 3.9 and Proposition 3.10 that we can construct a weak solution \((\Omega ,\mathcal {F},\mathbb {P};X,W)\) to the SDE associated to \((x_0,b)\), with the property that there exists a measure \(\mathbb {Q}\) equivalent to \(\mathbb {P}\) such that \(\mathcal {L}_\mathbb {Q}(X)=\mathcal {L}_\mathbb {P}(x_0+W)\); moreover all the moments of \(\mathop {}\!\text {d}\mathbb {P}/\mathop {}\!\text {d}\mathbb {Q}\) and \(\mathop {}\!\text {d}\mathbb {Q}/\mathop {}\!\text {d}\mathbb {P}\) can be controlled in a way that depends on \(\Vert b\Vert _{L^q_T B^\alpha _{\infty ,\infty }}\) but not on the specific \((x_0,b)\). In particular, the estimates can be performed uniformly over \(x_0\in \mathbb {R}^d\) and \(\Vert b\Vert _{L^q_T B^\alpha _{\infty ,\infty }}\le M\) for a fixed parameter \(M>0\).

Similarly, in the case \(H>1/2\), given \(b\in E=C^{\alpha H}_T C^0_x\cap C^0_T C^\alpha _x\) for some \(\alpha >1-1/(2H)\), using the regularity of fBm trajectories it’s easy to check that the map \(t\mapsto b(t,x_0+W_t)\) belongs \(\mathbb {P}\)-a.s. to \(C^{\alpha H-\varepsilon }_T\) for any \(\varepsilon >0\). Furthermore, reasoning as in the proof of [17,   Theorem 15], it can be shown that there exists \(\gamma >H+1/2\) and an increasing function K such that

$$\begin{aligned} \mathbb {E}\bigg [\exp \bigg (\eta \Big \Vert \int _0^\cdot b(r,x_0+W_r)\mathop {}\!\text {d}r\Big \Vert _{\gamma }^2\bigg )\bigg ]\le K(\eta )<\infty \quad \forall \, \eta \ge 0. \end{aligned}$$

Therefore also in this case we can apply Proposition 3.10 to construct weak solutions to the SDE. Moreover the function K only depends on \(\Vert b\Vert _E\), therefore as before all estimates are uniform over \(x_0\in \mathbb {R}^d\) and \(b\in E\) with \(\Vert b\Vert _E\le M\), M fixed parameter.

In both cases, if in addition b is smooth, then the weak solution constructed in this way necessarily coincides with the unique strong one; thus the above reasoning also provides uniform estimates for the solutions associated to smooth drifts.

3.3 Stability estimates for SDEs

In light of the above results, in the remainder of Sect. 3 we will always impose the following assumption on the drift b.

Assumption 3.12

Given \(H\in (0,1)\), b satisfies one of the following:

  • If \(H>1/2\), then \(b\in C^{\alpha H}_T C^0_x\cap C^0_T C^\alpha _x\) for some

    $$\begin{aligned} \alpha >1-\frac{1}{2H}; \end{aligned}$$

    equivalently, there exists a constant \(C>0\) such that, for all \((s,t,x,y)\in [0,T]^2\times \mathbb {R}^{2d}\),

    $$\begin{aligned} |b(t,x)|\le C, \quad |b(t,x)-b(s,y)|\le C(|t-s|^{\alpha H} + |x-y|^\alpha ). \end{aligned}$$
  • If \(H\le 1/2\), then \(b\in L^q_T B^\alpha _{\infty ,\infty }\) for some \((\alpha ,q)\) satisfying (3.12).

In both cases we will use the notation \(\Vert b\Vert _E\) for \(E=C^{\alpha H}_T C^0_x\cap C^0_T C^\alpha _x\) when \(H>1/2\), respectively \(E=L^q_T B^\alpha _{\infty ,\infty }\) when \(H\le 1/2\).

We are now ready to present the main result of this section. Its novelty, compared to previous results like those in [9, 17], lies in the comparison of two solutions driven by different drifts \(b^i\), which gives rise to the term \(\Vert b^1-b^2\Vert _{L^q_T B^{\alpha -1}_{\infty ,\infty }}\) (and not \(\Vert b^1-b^2\Vert _{L^q_T B^{\alpha }_{\infty ,\infty }}\)!) appearing in (3.15). On a more technical level, we drop slightly restrictive assumptions from previous works, like working with autonomous drifts as in [9], or compactly supported in space ones like in [17]; let us also point out that we fill a gap in the statements of Theorem 4.23 and Corollary 4.24 from [17], which do not cover the case of \(b\in L^q_t B^\alpha _{\infty ,\infty }\) with \(q<\infty \) and \(\alpha >0\).

Theorem 3.13

Let W be an fBm of parameter \(H\in (0,1)\) and let b satisfy Assumption 3.12. Then for any \(x_0\in \mathbb {R}^d\) strong existence, pathwise uniqueness and uniqueness in law hold for the SDE

$$\begin{aligned} X_t = x_0 + \int _0^t b(r,X_r)\text {d}r + W_t \end{aligned}$$
(3.14)

in the sense of Definition 3.5. Given \(x_0^i\in \mathbb {R}^d\) and \(b^i\) satisfying Assumption 3.12, \(i=1,2\), denote by \(X^i\) the solutions associated to \((x^i_0,b^i)\) and let \(M>0\) be a constant such that \(\Vert b^i\Vert _E\le M\) for \(i=1,2\). Let \((\alpha ,{\tilde{q}})\) be another pair satisfying (3.12) with the same \(\alpha \) as in Assumption 3.12 and \({\tilde{q}}\le q\). Then there exists \(\gamma >1/2\) with the following property: for any \(p\in [1,\infty )\) there exists a constant \(C>0\) (depending on \(\gamma , p, M, T, d, {\tilde{q}}\) and the parameters appearing in Assumption 3.12) such that

$$\begin{aligned} \mathbb {E}\Big [\Vert X^1-X^2\Vert ^p_{\gamma ;[0,\tau ]} \Big ]^{1/p} \le C \Big (|x^1_0-x^2_0|+\Vert b^1-b^2\Vert _{L^{{\tilde{q}}}(0,\tau ; B^{\alpha -1}_{\infty ,\infty })}\Big )\quad \forall \, \tau \in [0,T]. \nonumber \\ \end{aligned}$$
(3.15)

Proof

We will only treat the case \(H\le 1/2\), the other one being almost identical.

Let us first assume \(b^i\) to be smooth functions and show that (3.15) holds; in this case by Remark 3.6 strong existence and uniqueness hold automatically. Moreover by Remark 3.11, there exist probability measures \(\mathbb {Q}^i\) equivalent to \(\mathbb {P}\) such that \(\mathcal {L}_{\mathbb {Q}^i}(X^i)=\mathcal {L}_\mathbb {P}(x^i_0+W)\), with moment estimates depending on M but not on \((x^i_0,b^i)\); the solutions decompose as \(X^i=x^i_0+h^i+W\) with \(h^i_0=0\), \(h^i\in C^{{\tilde{\gamma }}}_T\) with \(\tilde{\gamma }>H+1/2\) and such that

$$\begin{aligned} \mathbb {E}\big [\exp \big (\eta \Vert h^i\Vert _{\gamma }^2\big )\big ]\le K(\eta )<\infty \quad \forall \, \eta \ge 0 \end{aligned}$$

where again K depends on M but not on the specific \((x^i_0,b^i)\).

For any \(\lambda \in [0,1]\), let us define \(x^\lambda _0:=x^2_0+\lambda (x^1_0-x^2_0)\), \(h^\lambda :=h^2+\lambda (h^1-h^2)\), so that \(X^2+\lambda (X^1-X^2) = x^\lambda _0+h^\lambda +W\). By Taylor expansion and elementary addition and subtraction, the difference \(Y=X^1-X^2\) satisfies

$$\begin{aligned} Y_t = x^1_0-x^2_0 + \int _0^t \bigg (\int _0^1 D b^1 (r, x^\lambda _0+h^\lambda _r+W_r)\,\text {d}\lambda \bigg )\cdot Y_r\,\text {d}r + \int _0^t (b^1-b^2)(r,X^2_r)\,\text {d}r; \end{aligned}$$

let us define

$$\begin{aligned} A_t := \int _0^t \int _0^1 D b^1 (r, x^\lambda _0+h^\lambda _r+W_r)\text {d}\lambda \text {d}r,\quad \psi _t := \int _0^t (b^1-b^2)(r,X^2_r)\text {d}r. \end{aligned}$$

In order to get estimates for Y, it turns out to be useful to reinterpret the above equation as a linear Young differential equation of the form

$$\begin{aligned} Y_t=Y_0+\int _0^t A_{\mathop {}\!\text {d}s} Y_s+\psi _t. \end{aligned}$$
(3.16)

Indeed for any \(\gamma >1/2\), we can apply [15,   estimate (3.16), Theorem 3.9] to obtain the existence of a constant \(C=C(\gamma )\) such that for any \(\tau \le T\) it holds

$$\begin{aligned} \Vert Y\Vert _{\gamma ;[0,\tau ]}&\le C \exp (C\tau (1+\llbracket A\rrbracket _{\gamma ;[0,\tau ]}^2)) (|Y_0|+(1+\tau ^\gamma ) \llbracket \psi \rrbracket _{\gamma ;[0,\tau ]})\\&\lesssim _T \exp (C T \llbracket A\rrbracket _{\gamma }^2) (|x_0^1-x_0^2|+\llbracket \psi \rrbracket _{\gamma ;[0,\tau ]}) \end{aligned}$$

and so our task reduces to finding estimates for quantities of the form

$$\begin{aligned} \mathbb {E}_\mathbb {P}\big [ \exp (\eta \llbracket A\rrbracket _\gamma ^2)\big ],\quad \mathbb {E}_\mathbb {P}\big [\Vert \psi \Vert _{\gamma ;[0,\tau ]}^p\big ]. \end{aligned}$$

We start by estimating \(\psi \), which is the simplest term. Recalling that \(\mathcal {L}_{\mathbb {Q}^2}(X^2)=\mathcal {L}_\mathbb {P}(x^2_0+W)\), by Point ii. of Corollary  3.9 and Cauchy inequality we can find \(\gamma >1/2\) such that, for any \(p\ge 1\),

In order to get estimates for A, observe first of all that by convexity of \(z\mapsto \exp (\eta z^2)\), it holds

$$\begin{aligned} \mathbb {E}\big [\exp \big (\eta \Vert h^\lambda \Vert _{\gamma }^2\big )\big ] \le \lambda \mathbb {E}\big [\exp \big (\eta \Vert h^1\Vert _{\gamma }^2\big )\big ] + (1-\lambda ) \mathbb {E}\big [\exp \big (\eta \Vert h^2\Vert _{\gamma }^2\big )\big ] \le K(\eta )<\infty \end{aligned}$$

where the estimate is uniform in \(\lambda \) and \(\eta \); therefore by Proposition 3.10, for any \(\lambda \) there exists a probability \(\mathbb {Q}^\lambda \) equivalent to \(\mathbb {P}\) such that \(\mathcal {L}_{\mathbb {Q}^\lambda }(h^\lambda +W)=\mathcal {L}_\mathbb {P}(W)\); moreover estimates of the form (3.13) only depend on K and thus on M, but not \((x^i_0,b^i)\). Therefore by Jensen’s inequality and Proposition 3.8, we can find \(\gamma >1/2\) such that, for any \(\eta \ge 0\), it holds that

Putting everything together, we have obtained

$$\begin{aligned} \mathbb {E}_\mathbb {P}\big [\Vert Y\Vert _{\gamma ;[0,\tau ]}^p\big ]&\lesssim \mathbb {E}_\mathbb {P}\big [ \exp (p C T\llbracket A\rrbracket _\gamma ^2) (|Y_0|^p+\llbracket \psi \rrbracket _{\gamma ;[0,\tau ]}^p)\big ]\\&\lesssim \mathbb {E}_\mathbb {P}\big [ \exp (2pCT\llbracket A\rrbracket _\gamma ^2)\big ]^{1/2} \left( |Y_0|^p+\mathbb {E}_\mathbb {P}\big [ \llbracket \psi \rrbracket _{\gamma ;[0,\tau ]}^{2p}\big ]^{1/2}\right) \\&\lesssim _{p,T,M} |x_0^1-x_0^2|^p + \Vert b^1-b^2 \Vert _{L^{{\tilde{q}}}(0,\tau ;B^{\alpha -1}_{\infty ,\infty })}^p \end{aligned}$$

which proves (3.15) for smooth \(b^i\).

Assume now we are given \(x_0\in \mathbb {R}^d\) and b satisfying Assumption 3.12; we can find \(\tilde{q}\le q\), \(\tilde{q}<\infty \) such that \((\alpha ,{\tilde{q}})\) satisfy (3.12) and a sequence \(\{b^n\}_n\) of smooth drifts s.t. \(\Vert b^n\Vert _E\le \Vert b\Vert _E\) for all \(n\ge 1\) and \(b^n\rightarrow b\) in \(L^{{\tilde{q}}}_T B^{\alpha -\varepsilon }_{\infty ,\infty }\) for any \(\varepsilon >0\) (for instance set \(b^n = b*\psi ^n\) with \(\{\psi ^n\}_{n\ge 1}\) a standard family of mollifiers). Let \(X^n\) be the unique solutions to (3.14) associated to \((x_0,b^n)\), then by (3.15) it holds

$$\begin{aligned} \mathbb {E}\big [\Vert X^n-X^m\Vert _\gamma ^p\big ]^{1/p} \lesssim \Vert b^n-b^m\Vert _{L^{{\tilde{q}}}_T B^{\alpha -1}_{\infty ,\infty }} \end{aligned}$$

showing that the random variables \(\theta ^n=X^n-W\) are a Cauchy sequence in \(L^p_\Omega C^\gamma _T\). Therefore they converge to a unique limit \(\theta \), which is adapted to the filtration \(\mathcal {F}_t=\sigma \{W_s:s\le t\}\) since \(\theta ^n\) are so. Similarly the \(X^n\) converge to \(X=\theta +W\) which is adapted.

The estimates from Corollary 3.9, the linearity of \(b\mapsto T^W b\) and the property \(b^n\rightarrow b\) in \(L^{{\tilde{q}}}_T B^{\alpha -\varepsilon }_{\infty ,\infty }\) together imply that \(\mathbb {P}\)-a.s. \(T^W b^n\rightarrow T^W b\) in \(C^\gamma _T C^1_{loc}\). Since we have \(\mathbb {P}\)-a.s. \(\theta ^n\rightarrow \theta \) in \(C^\gamma _T\) as well, we can invoke the closure property of nonlinear Young equations (Lemma 3.3) to deduce that \(X=\theta + W\) is a pathwise solution to (3.14) in the sense of Definition 3.4.

Furthermore, by Fatou’s lemma

$$\begin{aligned} \mathbb {E}_\mathbb {P}\big [ \exp (\eta \llbracket \theta \rrbracket _{\tilde{\gamma }}^2)\big ] \le \liminf _{n\rightarrow \infty } \mathbb {E}_\mathbb {P}\big [ \exp (\eta \llbracket \theta ^n\rrbracket _{\tilde{\gamma }}^2)\big ] \le K(\eta )<\infty \quad \forall \eta \ge 0; \end{aligned}$$

it follows that Girsanov can be applied to \(X=\theta +W=x_0+h+W\) to deduce that X is distributed as \(x_0+ W\) under another probability measure equivalent to \(\mathbb {P}\). In particular, \(\mathbb {P}\)-a.s. it must hold \(T^X b\in C^\gamma _T C^1_{loc}\). To summarise, X is a strong solution (so that a copy of it can be constructed on any probability space supporting the measure \(\mu ^H\)) such that \(T^X b\in C^\gamma _T C^1_{loc}\), which implies by Lemma 3.7 that pathwise uniqueness must hold. This also implies that the law of any solution coincides with the one constructed by Girsanov theorem, from which uniqueness in law follows.

The extension of inequality (3.15) to any pair of solutions \(X^i\) associated to distributional drifts \(b^i\) is now a direct consequence of the approximation argument. \(\square \)

Remark 3.14

At the price of making the statement of Theorem 3.13 slightly more technical, we have allowed the presence of the additional parameter \({\tilde{q}}\le q\) to handle \(q=\infty \). Indeed finding approximation sequences in \(L^\infty _T B^{\alpha -1}_{p,p}\) can be a hard task since this is not a separable space; the use of \(L^{\tilde{q}}_T B^{\alpha -1}_{p,p}\) with \({\tilde{q}}<\infty \) will also be useful later in the proofs in Sect. 4.3.

Remark 3.15

Theorem 3.13 gives us the information that, for drifts b satisfying Assumption 3.12, the nonlinear Young interpretation of the SDE is the only physical one. Namely, any other solution concept sharing the fundamental property of being the limit of solutions associated to smooth drifts \(b^n\rightarrow b\) will coincide with ours. The statement of Theorem 3.13 can be further strengthened to establish path-by-path uniqueness, see [9], however we will not need this for our purposes.

Remark 3.16

Although we have proved the stability estimate (3.15) in order to apply it to DDSDEs, it is of interest on its own. Indeed it can be applied to construct the stochastic flow associated to SDE (3.14), or to develop numerical schemes for distributional drifts b by first approximating them by smoother \(b^n\). We leave both applications for future research.

The next lemmas extend the previous results to the case of random initial data.

Corollary 3.17

Given \(H\in (0,1)\) and b satisfying Assumption 3.12, strong existence, uniqueness in law and pathwise uniqueness also hold for random initial data \(X_0=\xi \) independent of W. Assume \(b^i\) are drifts satisfying the assumptions of Theorem 3.13 and \((\xi ^1,\xi ^2)\in L^p(\Omega ;\mathbb {R}^{2d})\) is independent of W, then the solutions \(X^i\) associated to \((\xi ^i,b^i)\) satisfy

$$\begin{aligned}&\mathbb {E}_\mathbb {P}\big [\Vert X^1-X^2\Vert ^p_{\gamma ;[0,\tau ]} \big ]^{1/p}\nonumber \\&\quad \le C \Big (\mathbb {E}_\mathbb {P}[|\xi ^1-\xi ^2|^p]^{1/p}+\Vert b^1-b^2\Vert _{L^{{\tilde{q}}}(0,\tau ; B^{\alpha -1}_{\infty ,\infty })}\Big )\quad \forall \,\tau \in [0,T]. \end{aligned}$$
(3.17)

where the constant C and the parameters \(\gamma ,\,\alpha ,\,{\tilde{q}}\) are the same as in (3.15). Moreover, denoting by \(\mu ^i_t = \mathcal {L}(X^i_t)\) the laws of the unique solutions \(X^i\) associated to \((\mu _0^i,b^i)\) with \(\mathcal {L}(\xi ^i,W)=\mu _0^i\otimes \mathcal {L}(W)\), it holds

$$\begin{aligned} \sup _{t\in [0,\tau ]} d_p (\mu ^1_t,\mu ^2_t) \le C \Big (d_p(\mu ^1_0,\mu ^2_0) + \Vert b^1-b^2 \Vert _{L^{{\tilde{q}}}(0,\tau ; B^{\alpha -1}_{\infty ,\infty })}\Big )\quad \forall \, \tau \in [0,T]. \nonumber \\ \end{aligned}$$
(3.18)

Proof

Strong existence and pathwise uniqueness for random initial data follow from those for deterministic ones by classical arguments. Given a probability space \((\Omega ,\mathcal {F},\mathbb {P})\) with \((W,\xi ^1,\xi ^2)\) defined on it and drifts \((b^1,b^2)\), we can condition on the variables \((\xi ^1,\xi ^2)\) independent of W and apply estimate (3.15) to deduce that

$$\begin{aligned} \mathbb {E}_\mathbb {P}\big [\Vert X^1-X^2\Vert ^p_{\gamma ;[0,\tau ]} \big \vert \, \xi ^1,\xi ^2 \big ]^{1/p} \le C \Big ( |\xi ^1-\xi ^2|+\Vert b^1-b^2\Vert _{L^{{\tilde{q}}}(0,\tau ; B^{\alpha -1}_{\infty ,\infty })}\Big ); \end{aligned}$$

inequality (3.17) follows taking the \(L^p_\Omega \)-norm on both sides, using the tower property of conditional expectation.
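In more detail, the passage from the conditional estimate to (3.17) can be sketched as follows, with the shorthand \(Z:=\Vert X^1-X^2\Vert _{\gamma ;[0,\tau ]}\) and \(D:=\Vert b^1-b^2\Vert _{L^{{\tilde{q}}}(0,\tau ; B^{\alpha -1}_{\infty ,\infty })}\) introduced here only for illustration:

$$\begin{aligned} \mathbb {E}_\mathbb {P}[Z^p]^{1/p} = \mathbb {E}_\mathbb {P}\big [\mathbb {E}_\mathbb {P}[Z^p\,\vert \, \xi ^1,\xi ^2]\big ]^{1/p} \le C\, \mathbb {E}_\mathbb {P}\big [(|\xi ^1-\xi ^2|+D)^p\big ]^{1/p} \le C\, \big (\mathbb {E}_\mathbb {P}[|\xi ^1-\xi ^2|^p]^{1/p}+D\big ), \end{aligned}$$

where the first identity is the tower property, the middle inequality is the conditional estimate raised to the power p, and the last step is Minkowski's inequality in \(L^p_\Omega \) applied to the sum of \(|\xi ^1-\xi ^2|\) and the deterministic constant D.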

Now assume we are given a pair \((\mu _0^1,\mu _0^2)\in \mathcal {P}(\mathbb {R}^d)\times \mathcal {P}(\mathbb {R}^d)\) and let \(m\in \Pi (\mu ^1_0,\mu ^2_0)\) be an optimal coupling for them. On the canonical space \(\Omega =\mathbb {R}^{2d}\times C_T\), endowed with \(\mathbb {P}=m\otimes \mu ^H\), we can construct random variables \((\xi ^1,\xi ^2,W)\) and solutions \(X^i\) associated to \((\xi ^i,b^i)\), in such a way that \(\mathcal {L}_\mathbb {P}(\xi ^1,\xi ^2)=m\), \(\mathbb {E}[|\xi ^1-\xi ^2|^p]^{1/p}=d_p(\mu ^1_0,\mu ^2_0)\). But then by the definition of \(d_p\) it must hold \(d_p(\mu ^1_t,\mu ^2_t) \le \Vert X^1_t-X^2_t\Vert _{L^p_\Omega }\) and so estimate (3.18) follows from (3.17) applied in this setting. \(\square \)

Corollary 3.18

Let \(H\in (0,1)\), b satisfying Assumption 3.12, \(\xi \) random initial data independent of W and X be the solution associated to \((\xi ,b)\). Then there exists another probability measure \(\mathbb {Q}\) equivalent to \(\mathbb {P}\) such that \(\mathcal {L}_\mathbb {Q}(X_\cdot )=\mathcal {L}_\mathbb {P}(\xi +W_\cdot )\); moreover

$$\begin{aligned} \mathbb {E}_\mathbb {Q}\Big [ \Big (\frac{\text {d}\mathbb {P}}{\text {d}\mathbb {Q}}\Big )^n + \Big (\frac{\text {d}\mathbb {Q}}{\text {d}\mathbb {P}}\Big )^n\Big ] <\infty \quad \forall \, n\in \mathbb {N}. \end{aligned}$$

Proof

It suffices to work on the canonical space \((\Omega ,\mathcal {F},\mathbb {P})\) with \(\Omega =\mathbb {R}^d\times C_T\ni (x,\omega )\), \(\mathbb {P}=\mu _0\otimes \mu ^H\) where \(\mu _0:=\mathcal {L}(\xi )\). For any \(x\in \mathbb {R}^d\), denote by \(\omega \mapsto X^x(\omega )\) the unique strong solution associated to \((x,b)\), so that \((x,\omega )\mapsto X^x(\omega )\) gives the solution to the SDE with initial distribution \(\mu _0\). Recall from Proposition 3.10 that for any \(x\in \mathbb {R}^d\), there exists a probability measure on \(C_T\) denoted by \(\mathbb {Q}^x\), equivalent to \(\mu ^H\), such that \(\mathcal {L}_{\mathbb {Q}^x} (X^x)=\mathcal {L}_{\mu ^H}(x+W)\); therefore for any measurable \(F:\Omega \rightarrow \mathbb {R}\)

$$\begin{aligned} \mathbb {E}_\mathbb {P}[F(\xi + W)]&= \int _{\mathbb {R}^d} \int _{C_T} F(x+\omega )\, \mu ^H(\mathop {}\!\text {d}\omega ) \mu _0(\mathop {}\!\text {d}x)\\&= \int _{\mathbb {R}^d} \int _{C_T} F(X^x(\omega )) \mathbb {Q}^x(\mathop {}\!\text {d}\omega ) \mu _0(\mathop {}\!\text {d}x). \end{aligned}$$

Thus if we define a probability measure \(\mathbb {Q}\) on \(\Omega =\mathbb {R}^d\times C_T\) by

$$\begin{aligned} \mathbb {Q}(E_1\times E_2)=\int _{E_1} \mathbb {Q}^x(E_2) \mu _0(\mathop {}\!\text {d}x) \quad \forall \, E_1\in \mathcal {B}(\mathbb {R}^d),\, E_2\in \mathcal {B}(C_T), \end{aligned}$$

it must hold that \(\mathcal {L}_\mathbb {Q}(X)=\mathcal {L}_\mathbb {P}(\xi +W)\); since \(\mu ^H\ll \mathbb {Q}^x\) for every x, \(\mathbb {P}=\mu _0\otimes \mu ^H\ll \mathbb {Q}\) with Radon-Nikodym derivative given by

$$\begin{aligned} \frac{\mathop {}\!\text {d}\mathbb {P}}{\mathop {}\!\text {d}\mathbb {Q}}(x,\omega ) = \frac{\mathop {}\!\text {d}\mu ^H}{\mathop {}\!\text {d}\mathbb {Q}^x} (\omega )\quad \text {for }\mathbb {P}\text {-a.e. } (x,\omega ). \end{aligned}$$

Exploiting the bounds from Proposition 3.10 (which for given b are uniform in \(x\in \mathbb {R}^d\)) we find

$$\begin{aligned} \mathbb {E}_\mathbb {Q}\Big [ \Big (\frac{\text {d}\mathbb {P}}{\text {d}\mathbb {Q}}\Big )^n + \Big (\frac{\text {d}\mathbb {Q}}{\text {d}\mathbb {P}}\Big )^n\Big ]&= \int _{\mathbb {R}^d} \int _{C_T} \bigg [\Big ( \frac{\mathop {}\!\text {d}\mu ^H}{\mathop {}\!\text {d}\mathbb {Q}^x}(\omega ) \Big )^n + \Big ( \frac{\mathop {}\!\text {d}\mathbb {Q}^x}{\mathop {}\!\text {d}\mu ^H}(\omega ) \Big )^n\bigg ]\, \mathbb {Q}^x(\mathop {}\!\text {d}\omega ) \mu _0(\mathop {}\!\text {d}x)\\&\lesssim _{n,b} \int _{\mathbb {R}^d} 1\, \mu _0(\mathop {}\!\text {d}x) <\infty \end{aligned}$$

providing the conclusion. \(\square \)

Remark 3.19

It follows from the above that for any \(p\in [1,\infty )\) and any \(\varepsilon >0\)

$$\begin{aligned} \mathbb {E}_\mathbb {P}\big [\llbracket X\rrbracket _{H-\varepsilon }^p\big ] \le \mathbb {E}_\mathbb {Q}\big [\llbracket X\rrbracket _{H-\varepsilon }^{2p}\big ]^{1/2} \,\mathbb {E}_\mathbb {Q}\bigg [ \Big (\frac{\text {d}\mathbb {P}}{\text {d}\mathbb {Q}}\Big )^{2p}\bigg ]^{1/2} \lesssim \mathbb {E}_\mathbb {P}\big [\llbracket W\rrbracket _{H-\varepsilon }^{2p}\big ]^{1/2}<\infty . \end{aligned}$$

In particular, if \(\xi \in \mathcal {P}_{{\tilde{p}}}\) for another \({\tilde{p}}\in [1,\infty )\), then \(\mathbb {E}_\mathbb {P}[\Vert X\Vert _{H-\varepsilon }^{{\tilde{p}}}]<\infty \). As in the case of Remark 3.11, for fixed \(\xi \) the estimate can be performed uniformly over \(\Vert b\Vert _E \le M\).

3.4 Regularity of the solution laws

Although our main interest is the study of DDSDEs, our analysis also yields results on the regularity of the law \(\mathcal {L}(X_t)\) for the solution to a standard SDE with singular drift. The method is quite simple but appears to be new and relies neither on PDE techniques nor on Malliavin calculus; rather we exploit the Girsanov transform and the averaging estimates for fBm, in combination with duality arguments.

Proposition 3.20

Let b satisfy Assumption 3.12, X be the solution associated to \((\xi ,b)\) for random initial \(\xi \) independent of W. Then \(\mathcal {L}(X_\cdot )\in L^{\bar{q}}_T B^{\bar{\alpha }}_{1,1}\) for all \((\bar{\alpha },\bar{q})\in (0,\infty )\times (1,2)\) satisfying

$$\begin{aligned} \bar{\alpha } < \frac{1}{H} \Big (\frac{1}{\bar{q}}-\frac{1}{2}\Big ). \end{aligned}$$
(3.19)

Proof

Observe that if \((\bar{\alpha }, \bar{q}, H)\) satisfy (3.19), then we can find \(\varepsilon >0\) small enough so that \((-\bar{\alpha }-2\varepsilon ,\bar{q}',H)\) satisfy the assumptions of Proposition 3.8, where \(\bar{q}'\) denotes the conjugate of \(\bar{q}\). By Corollary 3.18, there exists an equivalent measure \(\mathbb {Q}\) such that \(\mathcal {L}_\mathbb {Q}(X)=\mathcal {L}_\mathbb {P}(\xi +W)\), therefore for any \(f\in L^{\bar{q}'}_T B^{-\bar{\alpha }-2\varepsilon }_{\infty ,\infty }\) it holds

$$\begin{aligned} \bigg |\int _0^T \langle f_s, \mathcal {L}_\mathbb {P}(X_s)\rangle \mathop {}\!\text {d}s\bigg |&\le \mathbb {E}_\mathbb {P}\bigg [\Big |\int _0^T f(s,X_s)\mathop {}\!\text {d}s\Big |\bigg ]\\&\le \mathbb {E}_\mathbb {Q}\bigg [\Big (\frac{\mathop {}\!\text {d}\mathbb {P}}{\mathop {}\!\text {d}\mathbb {Q}}\Big )^2\bigg ]^{1/2}\, \mathbb {E}_\mathbb {Q}\bigg [\Big |\int _0^T f(s,X_s)\mathop {}\!\text {d}s\Big |^2\bigg ]^{1/2}\\&\lesssim \bigg (\int _{\mathbb {R}^d} \mathbb {E}_{\mu ^H}\Big [\Big |\int _0^T f(s,x+\omega _s)\mathop {}\!\text {d}s\Big |^2\Big ] \mu _0(\mathop {}\!\text {d}x) \bigg )^{1/2}\\&\lesssim \Vert f\Vert _{L^{\bar{q}'}_T B^{-\bar{\alpha }-2\varepsilon }_{\infty ,\infty }}. \end{aligned}$$

where in the last passage we used the fact that estimate (3.6) is uniform in \(x\in \mathbb {R}^d\).
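The choice of parameters at the start of the proof can be checked explicitly. The sketch below assumes that the hypotheses of Proposition 3.8 on a triple \((\alpha ,q,H)\) are \(\alpha <0\), \(q\in (2,\infty ]\) and \(1-1/q+\alpha H>1/2\), as suggested by (3.10). For \((\alpha ,q)=(-\bar{\alpha }-2\varepsilon ,\bar{q}')\), using \(1-1/\bar{q}'=1/\bar{q}\),

$$\begin{aligned} 1-\frac{1}{\bar{q}'}-(\bar{\alpha }+2\varepsilon )H>\frac{1}{2} \quad \Longleftrightarrow \quad \frac{1}{\bar{q}}-\frac{1}{2}>(\bar{\alpha }+2\varepsilon )H \quad \Longleftrightarrow \quad \bar{\alpha }+2\varepsilon <\frac{1}{H}\Big (\frac{1}{\bar{q}}-\frac{1}{2}\Big ). \end{aligned}$$

By (3.19) the last inequality holds for all sufficiently small \(\varepsilon >0\); moreover \(\bar{q}\in (1,2)\) is equivalent to \(\bar{q}'\in (2,\infty )\) and \(-\bar{\alpha }-2\varepsilon <0\) since \(\bar{\alpha }>0\).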

Using the embedding \(B^{-\bar{\alpha }-\varepsilon }_{p,p}\hookrightarrow B^{-\bar{\alpha }-\varepsilon -d/p}_{\infty ,\infty }\hookrightarrow B^{-\bar{\alpha }-2\varepsilon }_{\infty ,\infty }\) for \(p<\infty \) big enough, by the duality \((L^{\bar{q}'}_T B^{-s}_{p,p})^*\simeq L^{\bar{q}}_T B^s_{p',p'}\) we deduce that \(\mathcal {L}(X_\cdot )\in L^{\bar{q}}_T B^{\bar{\alpha }+\varepsilon }_{p',p'}\). Thus \(h:=(I-\Delta )^{\bar{\alpha }/2} \mathcal {L}(X_\cdot )\in L^{\bar{q}}_T B^\varepsilon _{p',p'} \hookrightarrow L^{\bar{q}}_T L^{p'}_x\); in order to conclude, it is enough to show that \(h\in L^{\bar{q}}_T L^1_x\). Observe that \(h\in L^1_{loc}([0,T]\times \mathbb {R}^d)\) and for any \(\varphi \in L^{\bar{q}'}_T L^\infty _x\) it holds

$$\begin{aligned} \int _0^T \langle \varphi _s, h_s\rangle \mathop {}\!\text {d}s&= \int _0^T \langle (I-\Delta )^{\bar{\alpha }/2} \varphi _s, \mathcal {L}(X_s)\rangle \mathop {}\!\text {d}s \lesssim \Vert (I-\Delta )^{\bar{\alpha }/2} \varphi \Vert _{L^{\bar{q}'}_T B^{-\bar{\alpha }}_{\infty ,\infty }}\\&\lesssim \Vert \varphi \Vert _{L^{\bar{q}'}_T L^\infty _x}; \end{aligned}$$

the conclusion then follows from an application of Lemma A.1 from Appendix A. \(\square \)

Proposition 3.21

Let \(X,\,b,\,\xi \) be as in Proposition 3.20. Then \(\mathcal {L}(X_\cdot )\in L^q_T L^p_x\) for all \((q,p)\in (1,\infty )^2\) satisfying

$$\begin{aligned} \frac{1}{q}+\frac{Hd}{p}>Hd. \end{aligned}$$
(3.20)

If in addition \(Hd<1\), then \(\mathcal {L}(X_\cdot )\in L^q_T L^\infty _x\) for all \(q\in (1,\infty )\) satisfying \(q<(Hd)^{-1}\).

Proof

Observe that \((q,p)\in (1,\infty )^2\) satisfy (3.20) if and only if the conjugates \((q',p')\) satisfy

$$\begin{aligned} \frac{1}{q'} + \frac{H d}{p'}<1. \end{aligned}$$
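
For the reader's convenience, this equivalence can be checked by a direct computation, sketched here using only the relations \(1/q'=1-1/q\) and \(1/p'=1-1/p\):

$$\begin{aligned} \frac{1}{q'}+\frac{Hd}{p'} = \Big (1-\frac{1}{q}\Big ) + Hd\,\Big (1-\frac{1}{p}\Big ) = 1 + Hd - \Big (\frac{1}{q}+\frac{Hd}{p}\Big ), \end{aligned}$$

so the left-hand side is strictly smaller than 1 precisely when \(1/q+Hd/p>Hd\), namely when (3.20) holds.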

By [32,   Lemma 6.4] (more precisely equation (6.11) right after the proof therein) and estimates based on Girsanov theorem analogous to the proof of Proposition 3.20, we deduce that

$$\begin{aligned} \bigg |\int _0^T \langle f_s,\mathcal {L}(X_s)\rangle \mathop {}\!\text {d}s\bigg | \le \mathbb {E}\bigg [\Big |\int _0^T f(s,X_s)\mathop {}\!\text {d}s\Big |\bigg ] \lesssim \Vert f\Vert _{L^{q'}_T L^{p'}_x}; \end{aligned}$$

therefore by duality \(\mathcal {L}(X_\cdot )\in L^q_T L^p_x\) if \((q,p)\in (1,\infty )^2\). Taking \(p=\infty ,\, q<(Hd)^{-1}\) (correspondingly \(p'=1, 1/q'<1-Hd\)), we obtain

$$\begin{aligned} \bigg |\int _0^T \langle f_s,\mathcal {L}(X_s)\rangle \mathop {}\!\text {d}s\bigg | \lesssim \Vert f\Vert _{L^{q'}_T L^1_x} \quad \forall \, f\in L^{q'}_T L^1_x \end{aligned}$$

and so we can conclude by Lemma A.2 in the Appendix that in this case \(\mathcal {L}(X_\cdot )\in L^q_T L^\infty _x\). \(\square \)

4 Proofs of the main results

We split the proof of Theorem 2.4 into two sections, which deal respectively with the cases \(H>1/2\) and \(H\le 1/2\); the proof of Theorem 2.5 is presented in Sect. 4.3 instead.

We recall to the reader that in this section we will be dealing with DDSDEs of the form

$$\begin{aligned} X_t = \xi + \int _0^t B_s(X_s,\mathcal {L}(X_s))\,\mathop {}\!\text {d}s + W_t \quad \forall \, t\in [0,T] \end{aligned}$$
(4.1)

with the drift B belonging to either \(\mathcal {G}^{q,\alpha }_p\) or \(\mathcal {H}^{\beta ,\alpha }_p\) (cf. Definitions 2.2 and 2.1, respectively) depending on the value of \(H\in (0,1)\). The variable \(\xi \) is independent of W and with prescribed law \(\mu _0\in \mathcal {P}(\mathbb {R}^d)\), thus depending on the context we will treat both \((\xi ,B)\) and \((\mu _0,B)\) as the data of the problem.

4.1 The case \(H>1/2\)

In this regime we will always consider drifts \(B\in \mathcal {H}^{\beta ,\alpha }_p\) with \(\alpha ,\beta >0\) and \(p\in [1,\infty )\). In particular here \(B:[0,T]\times \mathbb {R}^d\times \mathcal {P}_p(\mathbb {R}^d)\rightarrow \mathbb {R}^d\) is bounded and uniformly continuous in all of its arguments; in this sense, although the concept of solution introduced in Sect. 3 does include the standard one by Remark 3.6, we do not employ it here.

Rather, we will simply say that a tuple \((X,\xi ,W)\), defined on a probability space \((\Omega ,\mathcal {F},\mathbb {P})\), such that \(\mathcal {L}_\mathbb {P}(\xi ,W)=\mu _0\otimes \mu ^H\), is a solution to (4.1) if \(\mathcal {L}(X_t)\in \mathcal {P}_p(\mathbb {R}^d)\) for all \(t\in [0,T]\) and the integral equation (4.1) holds \(\mathbb {P}\)-a.s. The concepts of strong existence, pathwise uniqueness and uniqueness in law immediately carry over from the usual ones for SDEs.

Proposition 4.1

Suppose \(B\in \mathcal {H}^{H,\alpha }_{p}\) with

$$\begin{aligned} H>\frac{1}{2}, \quad \alpha >1-\frac{1}{2H},\quad p\in [1,\infty ). \end{aligned}$$

Then for any \(\mu _0\in \mathcal {P}_p(\mathbb {R}^d)\) strong existence, pathwise uniqueness and uniqueness in law hold for the DDSDE (4.1) with data \((\mu _0,B)\).

Proof

We divide the proof into several steps.

Step 1: weak existence. By hypothesis \(B:[0,T]\times \mathbb {R}^d\times \mathcal {P}_p(\mathbb {R}^d)\rightarrow \mathbb {R}^d\) is a uniformly continuous, bounded map; existence of weak solutions on [0, T] then follows from [19,   Proposition 3.10].

Step 2: any weak solution is a strong one. Let X be a weak solution of the DDSDE w.r.t. \((\xi ,W)\) on a probability space \((\Omega ,\mathcal {F},\mathbb {P})\). Then setting \(\mu _t=\mathcal {L}(X_t)\), \(b^\mu (t,x)=B_t(x,\mu _t)\), X solves the SDE associated to \(b^\mu \), which satisfies \(|b^\mu (t,x)|\le \Vert B\Vert \) uniformly over \((t,x)\). As a consequence

$$\begin{aligned} d_p(\mu _s,\mu _t)&\le \Vert X_t-X_s\Vert _{L^p_\Omega } \le \int _s^t \Vert b^\mu (r,X_r)\Vert _{L^p_\Omega } \mathop {}\!\text {d}r + \Vert W_t-W_s\Vert _{L^p_\Omega } \nonumber \\&\lesssim _{T,p} (1+\Vert B\Vert ) |t-s|^H \end{aligned}$$
(4.2)

where we repeatedly applied Minkowski’s inequality; by assumption

$$\begin{aligned} |b^\mu (t,x)-b^\mu (s,y)|&= |B_t(x,\mu _t)-B_s(y,\mu _s)|\\&\le \Vert B\Vert \,\big (|t-s|^{\alpha H} + |x-y|^\alpha + d_p(\mu _t,\mu _s)^\alpha \big )\\&\lesssim _{T,p} (1+\Vert B\Vert ^2)\,\big ( |t-s|^{\alpha H} + |x-y|^\alpha \big ), \end{aligned}$$

where we applied (4.2) to obtain the last inequality. Namely, \(b^\mu \) satisfies Assumption 3.12, implying that strong existence and pathwise uniqueness hold for the associated SDE; therefore X is adapted to \((\xi ,W)\).
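
For completeness, here is a sketch of how the constants in (4.2) and in the last display arise. It assumes that W is a fractional Brownian motion of Hurst parameter H, so that \(W_t-W_s\) is a centred Gaussian vector with covariance \(|t-s|^{2H} I_d\), and that \(\alpha \in (0,1]\); the symbol \(c_{p,d}\) is introduced here only for illustration. For \(0\le s\le t\le T\),

$$\begin{aligned} \int _s^t \Vert b^\mu (r,X_r)\Vert _{L^p_\Omega } \mathop {}\!\text {d}r \le \Vert B\Vert \, |t-s| \le T^{1-H} \Vert B\Vert \, |t-s|^H, \quad \Vert W_t-W_s\Vert _{L^p_\Omega } = c_{p,d}\, |t-s|^H, \quad c_{p,d}:=\mathbb {E}\big [|N|^p\big ]^{1/p},\ N\sim \mathcal {N}(0,I_d). \end{aligned}$$

Inserting (4.2) into the Hölder bound for B then gives

$$\begin{aligned} \Vert B\Vert \, d_p(\mu _t,\mu _s)^\alpha \lesssim _{T,p} \Vert B\Vert \,(1+\Vert B\Vert )^\alpha \, |t-s|^{\alpha H} \le (1+\Vert B\Vert )^2\, |t-s|^{\alpha H} \le 2\,(1+\Vert B\Vert ^2)\, |t-s|^{\alpha H}, \end{aligned}$$

which accounts for the factor \(1+\Vert B\Vert ^2\) appearing above.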

Step 3: reduction to the canonical space. As we are dealing with a strong solution X, we can regard it as a random variable on the canonical space \((\Omega , \mathcal {F}, \mathbb {P})\) with \(\Omega =\mathbb {R}^d\times C_T\), \(\mathbb {P}=\mu _0\otimes \mu ^H\), \(\mathcal {F}\) the \(\mathbb {P}\)-completion of \(\mathcal {B}(\mathbb {R}^d\times C_T)\). Applying the same reasoning to any pair of weak (thus strong) solutions \(X^1,\, X^2\), possibly defined on different probability spaces, we can construct a coupling \((\tilde{X}^1,\tilde{X}^2)\) of solutions defined on the canonical space and w.r.t. the same random variables \((\xi ,W)\). If we show that \(\tilde{X}^1\equiv \tilde{X}^2\), then the equality \(\mathcal {L}(X^1)=\mathcal {L}(X^2)\) follows.

Step 4: pathwise uniqueness on the canonical space. Let us drop the tilde and adopt the notations \(\mu ^i_t = \mathcal {L}(X^i_t)\), \(b^i(t,x):=B_t(x,\mu ^i_t)\). It follows from the computations of Step 2 that we can find \(M\sim 1+\Vert B\Vert ^2\) such that \(b^i\) satisfy Assumption 3.12 with \(\Vert b^i\Vert _E \le M\). We can therefore apply estimate (3.17) for the choice \({\tilde{q}}=q=\infty \), together with \(X^1_0=X^2_0=\xi \), to find constants \(\gamma >1/2\), \(C>0\) such that

$$\begin{aligned} \mathbb {E}\Big [\Vert X^1-X^2\Vert _{\gamma ;[0,\tau ]}^p\Big ]^{1/p} \le C \sup _{t\in [0,\tau ]} \Vert b^1(t,\cdot )-b^2(t,\cdot )\Vert _{B^{\alpha -1}_{\infty ,\infty }}\quad \forall \, \tau \in [0,T]. \end{aligned}$$

By assumption \(B\in \mathcal {H}^{H,\alpha }_p\), therefore

$$\begin{aligned} \Vert b^1(t,\cdot )-b^2(t,\cdot )\Vert _{B^{\alpha -1}_{\infty ,\infty }} \le \Vert B\Vert \, d_p(\mu ^1_t,\mu ^2_t); \end{aligned}$$

combining everything, using again \(X^1_0=X^2_0\), for any \(\tau \in [0,T]\) it holds

$$\begin{aligned} \sup _{t\in [0,\tau ]} d_p(\mu ^1_t,\mu ^2_t)&\le \tau ^\gamma \, \mathbb {E}\Big [ \Vert X^1-X^2\Vert _{\gamma ; [0,\tau ]}^p\Big ]^{1/p} \le C \Vert B\Vert \, \tau ^\gamma \sup _{t\in [0,\tau ]} d_p(\mu ^1_t,\mu ^2_t). \end{aligned}$$

Choosing \({\bar{\tau }}\) small enough so that \(C \Vert B\Vert \, {\bar{\tau }}^\gamma <1\), we conclude that \(\mu ^1_t=\mu ^2_t\) for all \(t\in [0,{\bar{\tau }}]\) and so that \(\mathbb {E}[ \Vert X^1-X^2\Vert _{\gamma ;[0,{\bar{\tau }}]}]=0\), i.e. \(\mathbb {P}\)-a.s. \(X^1\equiv X^2\) on \([0,{\bar{\tau }}]\). In light of this, choosing now \(\tau =2{\bar{\tau }}\), going through similar computations we have

$$\begin{aligned} \sup _{t\in [0,\tau ]} d_p(\mu ^1_t,\mu ^2_t)&= \sup _{t\in [{\bar{\tau }},2{\bar{\tau }}]} d_p(\mu ^1_t,\mu ^2_t) \le {\bar{\tau }}^\gamma \, \mathbb {E}\Big [ \Vert X^1-X^2\Vert _{\gamma ; [{\bar{\tau }},2{\bar{\tau }}]}^p\Big ]^{1/p}\\&\le C \Vert B\Vert {\bar{\tau }}^\gamma \sup _{t\in [0,\tau ]} d_p(\mu ^1_t,\mu ^2_t) \end{aligned}$$

implying that the solutions also coincide on \([0,2{\bar{\tau }}]\). Iterating the reasoning for \(\tau =n{\bar{\tau }}\) until we cover [0, T] gives the conclusion. \(\square \)

4.2 The case \(H\le 1/2\)

In this case we can allow the drift to be singular, i.e. take values in \(B^\alpha _{\infty ,\infty }\) with \(\alpha <0\). We start by defining what we mean by solution to the DDSDE in this case.

Definition 4.2

Let \((\Omega ,\mathcal {F},\mathbb {P})\) be a probability space, \((X,\xi ,W)\) be a \(C_T\times \mathbb {R}^d\times C_T\)-valued random variable defined on it with \(\mathcal {L}_\mathbb {P}(\xi ,W)=\mathcal {L}(\xi )\otimes \mu ^H\); let \(B:[0,T]\times \mathcal {P}_p\rightarrow \mathcal {S}'\) be a measurable map for some \(p\in [1,\infty )\). We say that X is a solution to the DDSDE (4.1) associated to \((\xi ,B)\) if \(\mu _t:=\mathcal {L}_\mathbb {P}(X_t)\in \mathcal {P}_p\) for all \(t\in [0,T]\) and setting \(b^\mu (t,\cdot )=B_t(\mu _t)(\cdot )\), X is a pathwise solution to the SDE

$$\begin{aligned} X_t= \xi + \int _0^t b^\mu (s,X_s)\,\mathop {}\!\text {d}s + W_t, \end{aligned}$$

associated to \((b^\mu ,\xi ,W)\) in the sense of Definition 3.4. All the concepts of strong solution, pathwise uniqueness and uniqueness in law are similarly readapted from those of Definition 3.5.

As before, we will consider both \((\xi ,B)\) and \((\mu _0,B)\) to be the data of the problem, depending on whether we are focusing on solutions on a prescribed probability space or on their laws.

Assume now \(B\in \mathcal {G}^{q,\alpha }_p\) with

$$\begin{aligned} H\le \frac{1}{2}, \quad \alpha >1+\frac{1}{qH}-\frac{1}{2H}, \quad q\in (2,\infty ], \quad p\in [1,\infty ); \end{aligned}$$
(4.3)

then to any \(\mu _\cdot \in C_T \mathcal {P}_p\) we can associate a singular drift \(b^\mu (t,\cdot )=B(t,\mu _t)(\cdot )\) such that

$$\begin{aligned} \Vert b^\mu (t,\cdot )\Vert _{B^\alpha _{\infty ,\infty }} \le h_t, \end{aligned}$$

where \(h\in L^q_T\) is the function associated to B from Definition 2.2. Thus \(b^\mu \) satisfies Assumption 3.12 and the associated SDE has a unique solution X by Corollary 3.17; if in addition \(\mu _0\in \mathcal {P}_p\), then by Remark 3.19 the map \(t\mapsto \mathcal {L}(X_t)\) belongs to \(C_T \mathcal {P}_p\).

Thus for fixed \(\mu _0\), setting \(\mathcal {I}^{\mu _0}(\mu )_\cdot =\mathcal {L}(X_\cdot )\), we can define a map \(\mathcal {I}^{\mu _0}\) from \(C_T \mathcal {P}_p\) to itself; this map comes with an alternative notion of solution to the DDSDE.

Definition 4.3

Assume \(B\in \mathcal {G}^{q,\alpha }_p\) with parameters satisfying (4.3), \(\mu _0\in \mathcal {P}_p\); we say that a flow of measures \(\mu \in C_T \mathcal {P}_p\) is a solution to the DDSDE associated to \((\mu _0,B)\) if it satisfies \(\mathcal {I}^{\mu _0}(\mu )=\mu \).

The next lemma clarifies the relation between Definitions 4.2 and 4.3.

Lemma 4.4

Let \(B\in \mathcal {G}^{q,\alpha }_p\) with parameters satisfying (4.3), \(\mu _0\in \mathcal {P}_p\). The following hold:

  i. if X is a weak solution to (4.1), then \(\mu _t=\mathcal {L}(X_t)\) is a fixed point for \(\mathcal {I}^{\mu _0}\);

  ii. if \(\mu \) is a fixed point for \(\mathcal {I}^{\mu _0}\), then there exists a strong solution X to (4.1);

  iii. if there exists at most one fixed point for \(\mathcal {I}^{\mu _0}\), then pathwise uniqueness and uniqueness in law hold for (4.1).

Proof

Point i. immediately follows from the definitions. To see Point ii., assume \(\mathcal {I}^{\mu _0}(\mu )_\cdot =\mu _\cdot \) and set \(b^\mu (t,\cdot )= B_t(\mu _t)(\cdot )\); then \(b^\mu \in L^q_T B^\alpha _{\infty ,\infty }\), so by the results of Sect. 3, we can construct a strong solution X to the SDE associated to \((\mu _0,b^\mu )\). But then by definition of \(\mathcal {I}^{\mu _0}\) it holds \(\mathcal {L}(X_t)=\mu _t\) and so X solves the DDSDE. It remains to show Point iii.; assume \(X^i\) are two solutions and set \(\mu ^i_\cdot =\mathcal {L}(X^i_\cdot )\). Then by Point i., \(\mu ^i\) are both fixed points for \(\mathcal {I}^{\mu _0}\), so \(\mu ^1=\mu ^2\) and \(b^{\mu ^1}=b^{\mu ^2}\). But then \(X^i\) both solve the SDE associated to \(b^{\mu ^1}\), for which uniqueness holds both pathwise and in law, so the conclusion follows. \(\square \)

It follows from the above that, in order to show strong existence, pathwise uniqueness and uniqueness in law for the DDSDE (4.1) in the sense of Definition 4.2, it’s enough to show that there exists exactly one solution \(\mu \in C_T \mathcal {P}_p\) in the sense of Definition 4.3.

Proposition 4.5

Let \(B\in \mathcal {G}^{q,\alpha }_p\) with parameters satisfying (4.3); then for any \(\mu _0\in \mathcal {P}_p(\mathbb {R}^d)\) strong existence, pathwise uniqueness and uniqueness in law hold for the DDSDE (4.1) associated to \((\mu _0,B)\).

Proof

Define the map \(\mathcal {I}^{\mu _0}:C_T\mathcal {P}_p\rightarrow C_T\mathcal {P}_p\) associated to \((\mu _0,B)\) as before; in order to show that there exists exactly one fixed point to \(\mathcal {I}^{\mu _0}\), it’s enough to establish its contractivity.

Given \(\mu ^i \in C_T\mathcal {P}_p\), \(i=1,2\), set \(b^i: = b^{\mu ^i} = B(t,\mu ^i_t)\); denote by \(X^i\) two solutions, defined on the same probability space and with respect to the same data \((\xi ,W)\), to the SDEs associated to \((\xi ,b^i)\), where \(\mathcal {L}(\xi )=\mu _0\). By definition of \(\mathcal {G}^{q,\alpha }_p\), there exists \(h\in L^q_T\), such that for any \(\tau \in (0,T]\) we have

$$\begin{aligned} \Vert b^1_t-b^2_t\Vert _{B^{\alpha -1}_{\infty ,\infty }}\le h_t\, \sup _{s \in [0,\tau ]} d_p(\mu ^1_s,\mu ^2_s) \quad \text {for a.e. } t\in [0,\tau ]. \end{aligned}$$

Applying Corollary 3.17, using the fact that \(X^1_0=X^2_0=\xi \), we can find \(\gamma >1/2\) and \(C>0\) such that for any \(\tau \in (0,T]\), we have

$$\begin{aligned} \sup _{t\in [0,\tau ]} d_p(\mathcal {I}^{\mu _0}(\mu ^1)_t,\mathcal {I}^{\mu _0}(\mu ^2)_t)&\le \sup _{t\in [0,\tau ]} \mathbb {E}_{\mathbb {P}}[|X^1_t-X^2_t|^p]^{\frac{1}{p}}\\&\le \tau ^\gamma \, \mathbb {E}_{\mathbb {P}}\big [\Vert X^1-X^2\Vert _{\gamma ;[0,\tau ]}^p\big ]^{\frac{1}{p}}\\&\le C\tau ^\gamma \left( \int _0^\tau \Vert b^1_t-b^2_t\Vert _{B^{\alpha -1}_{\infty ,\infty }}^{q}\mathop {}\!\text {d}t\right) ^{\frac{1}{q}}\\&\le C \Vert h\Vert _{L^q_T}\, \tau ^\gamma \, \sup _{t\in [0,\tau ]}d_p(\mu ^1_t,\mu ^2_t). \end{aligned}$$

Choosing \({\bar{\tau }}>0\) sufficiently small such that \(C\Vert h\Vert _{L^q_T}\,{\bar{\tau }}^\gamma <1\), we find that \(\mathcal {I}^{\mu _0}\) is a contraction from \(C([0,{\bar{\tau }}];\mathcal {P}_p)\) to itself, so therein there exists a unique fixed point \(\bar{\mu } = \mathcal {I}^{\mu _0}(\bar{\mu })\); it remains to show that we can uniquely extend this fixed point to the whole interval [0, T].

To do this, the classical argument for SDEs would require restarting the equation at \(t={\bar{\tau }}\); however this is not possible here, as the fractional Brownian motion is not a Markov process. Instead, we can exploit the fact that \({\bar{\tau }}\) only depends on \(C\Vert h\Vert _{L^q_T}\), and not on the history of the paths \(X^i\) or of \(\mu ^i\), to give the following alternative reasoning.

Given \({\bar{\tau }}\), \(\bar{\mu }\in C([0,{\bar{\tau }}];\mathcal {P}_p)\) as above, consider \(E := \{ \mu _{\,\cdot \,} \in C([0,2{\bar{\tau }}];\mathcal {P}_p) \,:\, \mu |_{[0,{\bar{\tau }}]} =\bar{\mu } \}\), which is a closed subset of \(C([0,2{\bar{\tau }}];\mathcal {P}_p)\) and thus a complete metric space with the same metric. Since \(\bar{\mu }\) is a fixed point on \([0,{\bar{\tau }}]\), \(\mathcal {I}^{\mu _0}\) leaves E invariant; for any \(\mu ^i \in E\), arguing as above it holds

$$\begin{aligned} \sup _{t \in [0,2{\bar{\tau }}]} d_p(\mathcal {I}^{\mu _0}(\mu ^1)_t,\mathcal {I}^{\mu _0}(\mu ^2)_t)&= \sup _{t\in [{\bar{\tau }},2{\bar{\tau }}]} d_p(\mathcal {I}^{\mu _0}(\mu ^1)_t,\mathcal {I}^{\mu _0}(\mu ^2)_t)\\&\le {\bar{\tau }}^\gamma \, \mathbb {E}_{\mathbb {P}}\big [\Vert X^1-X^2\Vert _{\gamma ;[{\bar{\tau }},2{\bar{\tau }}]}^p\big ]^{\frac{1}{p}}\\&\le {\bar{\tau }}^\gamma C\Vert h\Vert _{L^q_T}\, \sup _{t \in [0,2{\bar{\tau }}]}d_p(\mu ^1_t,\mu ^2_t). \end{aligned}$$

It follows that \(\mathcal {I}^{\mu _0}\) is a contraction on E and admits a unique fixed point on it, which is necessarily the only possible extension of \({\bar{\mu }}\) on \([0,2{\bar{\tau }}]\). Repeating the argument on \([0,n{\bar{\tau }}]\) as many times as necessary to cover [0, T] concludes the proof. \(\square \)
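
One possible way to make the choice of \({\bar{\tau }}\) quantitative, given here as a sketch rather than as part of the proof, is to take \({\bar{\tau }}:=\min \{T,(2C\Vert h\Vert _{L^q_T})^{-1/\gamma }\}\), assuming \(\Vert h\Vert _{L^q_T}>0\) and that C is the constant from the contraction estimate above, independent of \(\tau \). Then \(C\Vert h\Vert _{L^q_T}\,{\bar{\tau }}^\gamma \le 1/2\) and, for any starting flow \(\mu ^{(0)}\in C([0,{\bar{\tau }}];\mathcal {P}_p)\), the Picard iterates \(\mu ^{(n+1)}:=\mathcal {I}^{\mu _0}(\mu ^{(n)})\) (a notation introduced here only for illustration) satisfy the standard a priori bound of the Banach fixed point theorem

$$\begin{aligned} \sup _{t\in [0,{\bar{\tau }}]} d_p\big (\mu ^{(n)}_t,\bar{\mu }_t\big ) \le 2^{1-n} \sup _{t\in [0,{\bar{\tau }}]} d_p\big (\mu ^{(1)}_t,\mu ^{(0)}_t\big ) \quad \forall \, n\ge 1, \end{aligned}$$

while the number of intervals of length \({\bar{\tau }}\) needed to cover [0, T] is \(\lceil T/{\bar{\tau }}\rceil \).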

4.3 Stability estimates for DDSDEs

The purpose of this section is to provide the proof of Theorem 2.5, which loosely speaking establishes Lipschitz dependence of the solutions \(\mu ^i\in C_T \mathcal {P}_p\) in terms of the data \((\mu ^i_0,B^i)\) for \(i=1,2\).

We assume that we are given drifts \(B^i\) belonging to \(\mathcal {H}^{H,\alpha }_p\) for parameters satisfying (2.2) when \(H>1/2\), respectively \(B^i\in \mathcal {G}^{q,\alpha }_p\) for parameters satisfying (2.3) when \(H\le 1/2\); in both cases we denote the optimal constants by \(\Vert B^i\Vert \). Given \(\mu _0^i\in \mathcal {P}_p\), we denote by \(\mu ^i\in C_T \mathcal {P}_p\) the unique solutions associated to \((\mu ^i_0,B^i)\), whose existence is granted by Theorem 2.4.

Finally, for \(\alpha ,\,q\) given as above, let us recall the notation introduced in Theorem 2.5:

$$\begin{aligned} \Vert B^1-B^2\Vert _{\infty }:=\sup _{(t,\nu )\in [0,T]\times \mathcal {P}_p} \Vert B^1_t(\nu )-B^2_t(\nu )\Vert _{B^{\alpha -1}_{\infty ,\infty }} \end{aligned}$$

and

$$\begin{aligned} \Vert B^1-B^2\Vert _{q,\infty }:=\bigg ( \int _0^T \sup _{\nu \in \mathcal {P}_p} \Vert B^1_t(\nu )-B^2_t(\nu )\Vert _{B^{\alpha -1}_{\infty ,\infty }}^{q} \mathop {}\!\text {d}t\bigg )^{1/q}. \end{aligned}$$

Proof of Theorem 2.5

Let \(\mu ^i\) be the solutions as above and set \(b^i_t:=B^i(t,\mu ^i_t)\). Recall from the proofs of Propositions 4.1 and 4.5 that if \(\Vert B^i\Vert \le M\), then \(\Vert b^i\Vert _E \le C(M)\), E being suitable spaces for which Assumption 3.12 is met; so we are in a position to apply estimates from Sect. 3.3. First observe that, by addition and subtraction of \(B^1_t(\mu ^2_t)\), we have

$$\begin{aligned} \Vert b^1_t-b^2_t\Vert _{B^{\alpha -1}_{\infty ,\infty }} \le \Vert B^1_t(\mu ^1_t)-B^1_t(\mu ^2_t)\Vert _{B^{\alpha -1}_{\infty ,\infty }} + \sup _{\nu \in \mathcal {P}_p} \Vert B^1_t(\nu )-B^2_t(\nu )\Vert _{B^{\alpha -1}_{\infty ,\infty }}. \nonumber \\ \end{aligned}$$
(4.4)

The argument slightly differs in the \(H>\frac{1}{2}\) and \(H\le \frac{1}{2}\) cases, so we will handle them separately.

We begin with \(H>\frac{1}{2}\). Let us choose \(\tilde{q}<\infty \) big enough so that \((\alpha ,{\tilde{q}},H) \) satisfies (3.12); then we can apply estimate (3.18) to obtain

$$\begin{aligned} \sup _{t\in [0,\tau ]} d_p(\mu ^1_t,\mu ^2_t)^{{\tilde{q}}} \lesssim _{M,{\tilde{q}},T} d_p(\mu ^1_0,\mu ^2_0)^{{\tilde{q}}} + \left( \int _0^\tau \Vert b^1_s-b^2_s\Vert _{B^{\alpha -1}_{\infty ,\infty }}^{{\tilde{q}}} \mathop {}\!\text {d}s\right) ; \end{aligned}$$
(4.5)

on the other hand, by estimate (4.4) and the assumption \(B^1\in \mathcal {H}^{H,\alpha }_p\) with \(\Vert B^1\Vert \le M\), it holds

$$\begin{aligned} \Vert b^1_s-b^2_s\Vert _{B^{\alpha -1}_{\infty ,\infty }} \le M \sup _{r\in [0,s]} d_p(\mu ^1_r,\mu ^2_r)+ \Vert B^1-B^2\Vert _{\infty }. \end{aligned}$$

Putting everything together, setting \(f_t:=\sup _{s\in [0,t]} d_p(\mu ^1_s,\mu ^2_s)^{\tilde{q}}\), we obtain

$$\begin{aligned} f_t \lesssim _{{\tilde{q}}} f_0 + \int _0^t M^{{\tilde{q}}} f_s \mathop {}\!\text {d}s + T \Vert B^1-B^2\Vert _{\infty }^{{\tilde{q}}} \quad \forall \, t\in [0,T]; \end{aligned}$$

applying Grönwall to f and taking the power \(1/{\tilde{q}}\) on both sides readily gives

$$\begin{aligned} \sup _{t\in [0,T]} d_p(\mu ^1_t,\mu ^2_t) \lesssim d_p(\mu ^1_0,\mu ^2_0) + \Vert B^1-B^2\Vert _\infty \end{aligned}$$

which is exactly the desired estimate (2.4).
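
To spell out the Grönwall step, here is a sketch in which \(K\ge 1\) denotes the implicit constant in the inequality for f (a symbol introduced here only for illustration; it may depend on \({\tilde{q}}\), M and T). Since f is nondecreasing and bounded on [0, T], the inequality \(f_t\le K f_0 + K T \Vert B^1-B^2\Vert _\infty ^{{\tilde{q}}} + K M^{{\tilde{q}}}\int _0^t f_s\mathop {}\!\text {d}s\) yields

$$\begin{aligned} f_T \le K e^{K M^{{\tilde{q}}} T}\, \big ( f_0 + T\, \Vert B^1-B^2\Vert _\infty ^{{\tilde{q}}}\big ), \end{aligned}$$

and, using \(f_0=d_p(\mu ^1_0,\mu ^2_0)^{{\tilde{q}}}\) together with \((a+b)^{1/{\tilde{q}}}\le a^{1/{\tilde{q}}}+b^{1/{\tilde{q}}}\) for \(a,b\ge 0\),

$$\begin{aligned} \sup _{t\in [0,T]} d_p(\mu ^1_t,\mu ^2_t) = f_T^{1/{\tilde{q}}} \le K^{1/{\tilde{q}}} e^{K M^{{\tilde{q}}} T/{\tilde{q}}}\, \big ( d_p(\mu ^1_0,\mu ^2_0) + T^{1/{\tilde{q}}}\, \Vert B^1-B^2\Vert _\infty \big ). \end{aligned}$$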

Suppose now \(X^i\) are solutions defined on the same probability space, then combining estimate (3.17) with the ones above we find

$$\begin{aligned} \mathbb {E}[\Vert X^1-X^2\Vert _\gamma ^p]^{1/p}&\lesssim \Vert X^1_0-X^2_0\Vert _{L^p_\Omega } + \sup _{t\in [0,T]} \Vert b^1_t-b^2_t\Vert _{B^{\alpha -1}_{\infty ,\infty }}\\&\lesssim _M \Vert X^1_0-X^2_0\Vert _{L^p_\Omega } + \sup _{t\in [0,T]} d_p(\mu ^1_t,\mu ^2_t) + \Vert B^1-B^2\Vert _\infty \\&\lesssim \Vert X^1_0-X^2_0\Vert _{L^p_\Omega } + d_p(\mu ^1_0,\mu ^2_0) + \Vert B^1-B^2\Vert _\infty \end{aligned}$$

and the conclusion readily follows from \(d_p(\mu ^1_0,\mu ^2_0)\le \Vert X^1_0-X^2_0\Vert _{L^p_\Omega }\).

We now move on to the case \(H\le \frac{1}{2}\); for \(q=\infty \) the proof is the same as above, so we can assume w.l.o.g. \(q<\infty \) here. For \(B^i\in \mathcal {G}^{q,\alpha }_p\), it follows again by (4.4) that

$$\begin{aligned} \Vert b^1_t-b^2_t\Vert _{B^{\alpha -1}_{\infty ,\infty }} \le h^1_t \sup _{r\in [0,t]} d_p(\mu ^1_r,\mu ^2_r)+\sup _{\nu \in \mathcal {P}_p}\Vert B^1_t(\nu )-B^2_t(\nu )\Vert _{B^{\alpha -1}_{\infty ,\infty }}, \end{aligned}$$
(4.6)

where we recall that \(h^i\in L^q_T\) are the functions associated to \(B^i\) given in Definition 2.2. Following the same strategy as before, by (4.6) and (4.5), \(f_t:=\sup _{s\in [0,t]} d_p(\mu ^1_s,\mu ^2_s)^q\) satisfies

$$\begin{aligned} f_t \lesssim f_0+\int _0^t |h^1_s|^q f_s\mathop {}\!\text {d}s + \Vert B^1-B^2\Vert _{q,\infty }^q \quad \forall \, t\in [0,T]; \end{aligned}$$

by this inequality and the assumption \(\Vert h^1\Vert _{L^q_T} = \Vert B^1\Vert \le M\), we conclude again by Grönwall’s inequality that

$$\begin{aligned} \sup _{t\in [0,T]} d_p(\mu ^1_t,\mu ^2_t) \lesssim _M d_p(\mu ^1_0,\mu ^2_0) + \Vert B^1-B^2\Vert _{q,\infty } \end{aligned}$$

which gives estimate (2.6). The statement for \(\mathbb {E}[\Vert X^1-X^2\Vert ^p_\gamma ]\) now follows exactly as in the case \(H>1/2\). \(\square \)

We conclude this section with the application of Theorem 2.5 to a particularly relevant case.

Example 4.6

Let \(H\in (0,1)\), \(\alpha >1-1/(2H)\) and set

$$\begin{aligned} E = {\left\{ \begin{array}{ll} C^{\alpha H}_T C^0_x\cap C^0_T C^\alpha _x \quad &{} \text {for } H>\frac{1}{2}\\ L^\infty _T B^\alpha _{\infty ,\infty } &{} \text {for } H\le \frac{1}{2} \end{array}\right. }. \end{aligned}$$

Let \(B^i\) be of the form \(B^i(t,\mu )=f^i_t + g^i_t *\mu \) for \(f^i,g^i\in E\) with \(\Vert f^i\Vert _E,\Vert g^i\Vert _E\le M\). Then \(B^i\) satisfy the assumptions of Theorem 2.5 in \(\mathcal {P}_p\) for any \(p\in [1,\infty )\) and estimate (2.4) becomes

$$\begin{aligned} \sup _{t\in [0,T]} d_p(\mu ^1_t,\mu ^2_t) \lesssim d_p(\mu ^1_0,\mu ^2_0) + \Vert f^1-f^2 \Vert _{L^\infty _T B^{\alpha -1}_{\infty ,\infty }} + \Vert g^1-g^2\Vert _{L^\infty _T B^{\alpha -1}_{\infty ,\infty }}. \end{aligned}$$
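
A sketch of why the right-hand side takes this form, assuming only that convolution with a probability measure does not increase Besov–Hölder norms: denoting by \(\Delta _j\) the standard Littlewood–Paley blocks (a notation introduced here only for illustration), one has \(\Vert \Delta _j (g*\nu )\Vert _{L^\infty _x} = \Vert (\Delta _j g)*\nu \Vert _{L^\infty _x}\le \Vert \Delta _j g\Vert _{L^\infty _x}\). Hence for any \(\nu \in \mathcal {P}_p\) and \(t\in [0,T]\),

$$\begin{aligned} \Vert B^1_t(\nu )-B^2_t(\nu )\Vert _{B^{\alpha -1}_{\infty ,\infty }}&\le \Vert f^1_t-f^2_t\Vert _{B^{\alpha -1}_{\infty ,\infty }} + \Vert (g^1_t-g^2_t)*\nu \Vert _{B^{\alpha -1}_{\infty ,\infty }}\\&\le \Vert f^1_t-f^2_t\Vert _{B^{\alpha -1}_{\infty ,\infty }} + \Vert g^1_t-g^2_t\Vert _{B^{\alpha -1}_{\infty ,\infty }}, \end{aligned}$$

and taking the supremum over \((t,\nu )\) (understood in the essential sense in time when \(H\le \frac{1}{2}\)) bounds \(\Vert B^1-B^2\Vert _\infty \) by the last two terms of the estimate above.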

5 Refined results in the convolutional case

In this section we focus on the case of DDSDEs with convolutional structure, namely

$$\begin{aligned} X_t = \xi + \int _0^t (b_s*\mu _s)(X_s)\,\text {d}s+W_t,\quad \mu _t=\mathcal {L}(X_t)\quad \forall \, t\in [0,T]. \end{aligned}$$
(5.1)

They correspond to the case \(B_t(\mu )=b_t*\mu \) and can therefore be solved under suitable assumptions on b (e.g. \(b\in E\) as in Example 4.6). Due to their specific structure however, as soon as the associated solution X has a regular law \(\mu \), its regularity immediately transfers to the drift \(b^\mu _t=b_t*\mu _t\), as the next simple lemma shows.

Lemma 5.1

Let \(H\in (0,1)\), \(b\in B^\alpha _{\infty ,\infty }\) for \(\alpha >1-1/(2H)\), \(\mu _0\in \mathcal {P}_1\) and X denote the unique solution to the DDSDE (5.1) with \(\mathcal {L}(\xi ,W)=\mu _0\otimes \mu ^H\). Then X also solves an SDE with drift \(b^\mu \) which belongs to \(L^1_T C^1_x\).

Proof

Let X be the aforementioned solution, then by the proof of Theorem 2.4 we know that it solves an SDE with drift \(b^\mu \) satisfying Assumption 3.12; applying Proposition 3.20 for the choice \(q=1\) in (3.19), we deduce that \(\mu \in L^1_T B^{{\tilde{\alpha }}}_{1,1}\) for any \({\tilde{\alpha }} <1/(2H)\). Therefore by the hypothesis and Young’s inequality it holds \(b^\mu =b_\cdot *\mu _\cdot \in L^1_T B^{\alpha +{\tilde{\alpha }}}_{\infty ,\infty }\) for some \(\alpha >1-1/(2H)\) and all \({\tilde{\alpha }}<1/(2H)\); choosing \({\tilde{\alpha }}\) appropriately gives the conclusion. \(\square \)
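
To make the last step explicit, one admissible choice (a sketch; any value in the interval below works equally well) is the midpoint

$$\begin{aligned} {\tilde{\alpha }} := \frac{1}{2}\Big ( 1-\alpha + \frac{1}{2H}\Big ) \in \Big ( 1-\alpha ,\, \frac{1}{2H}\Big ), \end{aligned}$$

where the interval is nonempty precisely because \(\alpha >1-1/(2H)\). For this choice \(\alpha +{\tilde{\alpha }}>1\), so that \(B^{\alpha +{\tilde{\alpha }}}_{\infty ,\infty }\hookrightarrow C^1_x\) and therefore \(b^\mu \in L^1_T C^1_x\).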

Remark 5.2

Up to technicalities, the proof readapts to the case of time-dependent drifts \(B_t(\mu )=b_t*\mu \) with b satisfying Assumption 3.12, with the same conclusion that \(b^\mu \in L^1_T C^1_x\).

Lemma 5.1 shows that in this setting the effective drift \(b^\mu \) is much more regular than the original b, to the point that the SDE associated to \(b^\mu \) can be solved classically. However in order to give meaning to the DDSDE, it suffices to know that \(b^\mu \) satisfies the weaker Assumption 3.12; for this reason we expect the criteria coming from Theorem 2.4 to be suboptimal for convolutional DDSDEs (5.1), as they don’t take into account the different regularity of b and \(b^\mu \).

A partial improvement of those results is given by Theorems 2.6 and 2.7, whose proofs are presented respectively in Sects. 5.1 and 5.2; they rely heavily on general results presented in Appendix A, especially Lemma A.7 and Corollary A.9. The former provides the basic estimate \(\Vert b*(\mu -\nu )\Vert _{B^\alpha _{\infty ,\infty }} \lesssim \Vert b\Vert _{B^{\alpha }_{\infty ,\infty }}\), which is true for all measures \(\mu ,\nu \in \mathcal {P}(\mathbb {R}^d)\), while the latter shows a similar result under additional assumptions on \(\mu \) and \(\nu \) but requiring less integrability on b (in the sense that b is only required to be in \(B^\alpha _{p,p}\) instead of \(B^\alpha _{\infty ,\infty }\)). For example, a special case of Corollary A.9 provides the estimate \(\Vert b*(\mu -\nu )\Vert _{B^{\alpha -1}_{\infty ,\infty }} \lesssim \Vert b\Vert _{B^\alpha _{p,p}} d_r(\mu ,\nu ) (1+\Vert \mu \Vert _{L^q} \Vert \nu \Vert _{L^q})\), as long as the relation \(1\le r/p+1/q\) holds.

Before moving further, let us rigorously define what we mean by solutions here, although the concept is very similar to that of Definition 4.2.

Definition 5.3

Fix \(H\in (0,1)\); let \((\Omega ,\mathcal {F},\mathbb {P})\) be a probability space, \((X,\xi ,W)\) be a \(C_T\times \mathbb {R}^d\times C_T\)-valued random variable defined on it with \(\mathcal {L}_\mathbb {P}(\xi ,W)=\mu _0\otimes \mu ^H\) and b be a distributional drift. We say that X is a solution to the DDSDE (5.1) associated to \((\mu _0,b)\) if setting \(\mu _t:=\mathcal {L}_\mathbb {P}(X_t)\), \(b^\mu _t:=b_t*\mu _t\), X satisfies the SDE

$$\begin{aligned} X_t= \xi + \int _0^t b^\mu _s(X_s)\,\mathop {}\!\text {d}s + W_t \quad \forall \, t\in [0,T], \end{aligned}$$

where we additionally require that either:

  i. \(b^\mu \) satisfies Assumption 3.12 and the SDE is interpreted in the sense of Definition 3.4, or

  ii. \(b^\mu \in L^1_T C^0_x\) and the SDE is interpreted in the standard integral sense.

All the concepts of weak solution, strong solution, pathwise uniqueness and uniqueness in law are readapted similarly.

A major role in the proofs of Theorems 2.6 and 2.7 is played by the following conditional uniqueness result.

Proposition 5.4

Let \(H\in (0,1)\), \(p\in [1,\infty )\), \(p'\) its conjugate exponent; let b be a distributional drift satisfying one of the following conditions:

  i. If \(H>1/2\), then \(b\in C^{\alpha H}_T L^p_x\) and \(b\in C^0_T B^\alpha _{p,p}\) with \(\alpha >1-\frac{1}{2H}\).

  ii. If \(H\le 1/2\), then \(b \in L^q_T B^\alpha _{p,p}\), with \(\alpha >1-\frac{1}{2H}+\frac{1}{Hq}\).

Assume furthermore that for a given \(\mu _0\in L^{p'}_x\) there exists a weak solution X to the DDSDE (5.1) associated to \((\mu _0,b)\), satisfying

$$\begin{aligned} \sup _{t\in [0,T]} \Vert \mathcal {L}(X_t)\Vert _{L^{p'}_x} <\infty . \end{aligned}$$
(5.2)

Then X is a strong solution; moreover it is the unique one (both pathwise and in law) in the class of solutions satisfying condition (5.2).

Proof

We handle the cases \(H\le 1/2\) and \(H>1/2\) slightly differently.

The case \(H\le 1/2\). First observe that, if X satisfies (5.2), then by Young’s inequality \(b^\mu _\cdot =b_\cdot *\mathcal {L}(X_\cdot )\in L^q_T B^\alpha _{\infty ,\infty }\); in particular \(b^\mu \) satisfies Assumption 3.12, Definition 5.3 is meaningful and X is necessarily a strong solution.

Now let \(X^i\), \(i=1,2\), be two solutions to (5.1) satisfying (5.2); as they are both strong solutions, by the usual arguments, we can assume them to be defined on the same probability space, w.r.t. the same \((\xi ,W)\), and we only need to check that \(X^1=X^2\) \(\mathbb {P}\)-a.s. Moreover thanks to the strict inequality \(\alpha >1-\frac{1}{2H}+\frac{1}{Hq}\) here we can assume w.l.o.g. \(q<\infty \).

For \(i=1,2\), set \(\mu ^i_t = \mathcal {L}(X^i_t)\); as \(b *\mu ^i\) both satisfy Assumption 3.12, we may apply Corollary 3.17 to find

$$\begin{aligned} \sup _{t \in [0,T]} d_{r'}(\mu ^1_t,\mu ^2_t)^{r'}&\le \sup _{t\in [0,T]}\mathbb {E}[| X^1_t-X^2_t|^{r'}] \lesssim \mathbb {E}[\Vert X^1-X^2\Vert _\gamma ^{r'}]\\&\lesssim \Vert b*( \mu ^1-\mu ^2)\Vert _{L^q_T B^{\alpha -1}_{\infty ,\infty }}^{r'}\\&\lesssim \Vert b\Vert _{L^q_T B^\alpha _{p,p}}^{r'} (\Vert \mu ^1\Vert _{L^\infty _T L^{p'}_x}^{r'} + \Vert \mu ^2\Vert _{L^\infty _T L^{p'}_x}^{r'})<\infty . \end{aligned}$$

In particular the quantity \(d_{r'}(\mu ^1_t,\mu ^2_t)\) is finite for any \(r'\in [1,\infty )\) and any \(t\in [0,T]\).

We now wish to apply Corollary A.9 from Appendix A to obtain better control on the difference of the drifts \(b*\mu ^1-b*\mu ^2\). To do so observe that, under our assumptions on the parameters \((\alpha ,q,p)\), we can find new parameters \((s,r)\in (1,\infty )^2\) with s large and r close to 1 such that

$$\begin{aligned} \alpha - \frac{d}{s} > 1-\frac{1}{2H} + \frac{1}{Hq}, \quad 1+\frac{r}{s}\le \frac{r}{p}+\frac{1}{p'}. \end{aligned}$$
(5.3)

For this choice, set \({\tilde{\alpha }} := \alpha - d/s\); by construction the parameters \((p,p',s,r)\) satisfy the assumptions of Corollary A.9 from Appendix A; its application, together with standard Besov embeddings, yields

$$\begin{aligned} \Vert b_t*(\mu ^1_t-\mu ^2_t) \Vert _{B^{{\tilde{\alpha }}-1}_{\infty ,\infty }}&\lesssim \Vert b_t*(\mu ^1_t-\mu ^2_t) \Vert _{B^{\alpha -1}_{s,s}}\\&\lesssim \Vert b_t\Vert _{B^\alpha _{p,p}} (\Vert \mu ^1_t\Vert _{L^{p'}}^{1/r} + \Vert \mu ^2_t\Vert _{L^{p'}}^{1/r}) d_{r'}(\mu ^1_t,\mu ^2_t). \end{aligned}$$

Now since under (5.3) the triple \((\tilde{\alpha },q,H)\) also satisfies (3.12), we can again apply estimate (3.17) from Corollary 3.17 to find

$$\begin{aligned} d_{r'}(\mu ^1_t,\mu ^2_t)^q&\le \Vert X^1_t-X^2_t\Vert _{L^{r'}_\Omega }^q \lesssim \int _0^t \Vert b_u*(\mu ^1_u-\mu ^2_u)\Vert _{B^{{\tilde{\alpha }}-1}_{\infty ,\infty }}^q\, \mathop {}\!\text {d}u \\&\lesssim \int _0^t \Vert b_u\Vert ^q_{B^\alpha _{p,p}} d_{r'}(\mu ^1_u,\mu ^2_u)^q \mathop {}\!\text {d}u. \end{aligned}$$

Applying Grönwall’s lemma we conclude that \(d_{r'}(\mu ^1_t,\mu ^2_t)=0\) and so \(\mu ^1_t=\mu ^2_t\) for all \(t\in [0,T]\). Thus, \(X^i\) are solutions to the same SDE and therefore \(X^1_\cdot =X^2_\cdot \) \(\mathbb {P}\)-a.s.
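
The Grönwall step can be sketched as follows, with \(f_t:=d_{r'}(\mu ^1_t,\mu ^2_t)^q\), \(g_u:=\Vert b_u\Vert ^q_{B^\alpha _{p,p}}\) and K the implicit constant in the last display (symbols introduced here only for illustration). The finiteness of \(\sup _{t\in [0,T]} d_{r'}(\mu ^1_t,\mu ^2_t)\) established above guarantees that f is bounded on [0, T], while \(g\in L^1_T\) since \(b\in L^q_T B^\alpha _{p,p}\); hence

$$\begin{aligned} f_t \le K \int _0^t g_u\, f_u \mathop {}\!\text {d}u \quad \forall \, t\in [0,T] \quad \Longrightarrow \quad \sup _{t\in [0,T]} f_t \le 0\cdot \exp \Big ( K\int _0^T g_u \mathop {}\!\text {d}u\Big ) = 0. \end{aligned}$$

This is the point where the preliminary bound on \(d_{r'}\) is needed: without boundedness of f, the integral inequality alone would not force \(f\equiv 0\).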

The case \(H>1/2\). We argue essentially in the same way, only this time checking that X is a strong solution starting from the available information on b and \(\mathcal {L}(X_\cdot )\) is less straightforward.

First observe that Young’s inequality still provides \(b^\mu \in C^0_T C^\alpha _x\), so that by Definition 5.3 the DDSDE is meaningful in the classical integral sense. In order to check Assumption 3.12 for \(b^\mu \) (which implies X being strong), it remains to show that \(b*\mu \in C^{{\tilde{\alpha }} H}_T C^0_x\) for some \({\tilde{\alpha }}\) such that

$$\begin{aligned} 1-\frac{1}{2H} < {\tilde{\alpha }}\le \alpha . \end{aligned}$$

By addition and subtraction, \(b^\mu _t -b^\mu _s = (b_t-b_s)*\mu _s + b_t*(\mu _t-\mu _s)\); by the hypothesis on \(b,\,\mu ,\) we can estimate the first term by

$$\begin{aligned} \Vert (b_t-b_s)*\mu _s\Vert _{C^0_x} \le \Vert b_t-b_s\Vert _{L^p_x}\Vert \mu \Vert _{L^\infty _T L^{p'}_x} \lesssim |t-s|^{\alpha H}. \end{aligned}$$
(5.4)

Since X is a solution to the SDE \(X_t=\xi + \int _0^t b^\mu _s(X_s)\text {d}s+ W_t\), for any \(r'\in (1,\infty )\) it holds

$$\begin{aligned} d_{r'}(\mu _t,\mu _s) \le \Vert X_t-X_s\Vert _{L^{r'}_\Omega } \le \Vert b*\mu ^i\Vert _{L^\infty _T C^0_x}\, |t-s| +\Vert W_t-W_s\Vert _{L^{r'}_\Omega } \lesssim _{r',T} |t-s|^H. \end{aligned}$$

Now similarly to the case \(H\le 1/2\), choose \((s,r) \in (1,\infty )^2\) such that

$$\begin{aligned} {\tilde{\alpha }}:=\alpha -\frac{d}{s} > 1-\frac{1}{2H}, \quad 1+\frac{r}{s}\le \frac{r}{p}+ \frac{1}{p'}; \end{aligned}$$

Applying Corollary A.9 and the previous estimate for \(d_{r'}(\mu _t,\mu _s)\), we then find

$$\begin{aligned} \Vert b_t*(\mu ^i_t- \mu ^i_s)\Vert _{B^{{\tilde{\alpha }}-1}_{\infty ,\infty }} \lesssim d_{r'}(\mu _t,\mu _s) \lesssim |t-s|^H. \end{aligned}$$

On the other hand, since \(b\in C^0_T B^\alpha _{p,p}\) and \(\mu \in L^\infty _T L^{p'}_x\), by Young’s inequality, we have \(\Vert b_t *(\mu ^i_t - \mu ^i_s)\Vert _{B^{\alpha }_{\infty ,\infty }}\lesssim 1\). We can now interpolate between the two estimates: choose \(\theta ={\tilde{\alpha }}\in (0,1)\), so that \(\theta ({\tilde{\alpha }} -1) + (1-\theta )\alpha = (1-{\tilde{\alpha }})(\alpha -{\tilde{\alpha }})>0\), then by the embedding \(B^\varepsilon _{\infty ,\infty }\hookrightarrow C^0_x\) for \(\varepsilon >0\), we obtain

$$\begin{aligned} \Vert b_t*(\mu ^i_t-\mu ^i_s)\Vert _{C^0_x} \lesssim \Vert b_t*(\mu ^i_t-\mu ^i_s)\Vert _{B^\alpha _{\infty ,\infty }}^{1-{\tilde{\alpha }}} \,\Vert b_t*(\mu ^i_t-\mu ^i_s)\Vert _{B^{{\tilde{\alpha }}-1}_{\infty ,\infty }}^{{\tilde{\alpha }}} \lesssim |t-s|^{{\tilde{\alpha }} H}. \end{aligned}$$

So we conclude that \(b*\mu ^i\in C^0_T C^{{\tilde{\alpha }}}_x\cap C^{\tilde{\alpha } H}_T C^0_x\), where by construction \({\tilde{\alpha }}>1-1/(2H)\).

The second part of the argument, concerning the comparison of two solutions \(X^i\) satisfying (5.2), now proceeds exactly as in the case \(H\le 1/2\). \(\square \)

5.1 Distributional kernels with bounded divergence

Proposition 5.4 reduces the problem of uniqueness of solutions (in a suitable class) to that of establishing their regularity, in the sense of equation (5.2).

One classical way to show that the condition \(\mu _0\in L^{p'}_x\) is propagated at positive times, which has been exploited systematically after [12], is to impose boundedness of \(\text {div}\, b\); in the setting of DDSDEs with general additive noise and regular drift b, an analogous statement can be found in [19, Proposition 4.3].

Proposition 5.5

Let \(H\in (0,1)\), \(q \in (2,\infty ]\), \(p\in [1,\infty )\), \(p'\) its conjugate exponent. Let b be a distributional drift such that \(\text {div}\, b \in L^1_{T} L^\infty _x\) and either:

  i. If \(H>1/2\), then \(b\in C^{\alpha H}_T L^p_x\cap C^0_T B^\alpha _{p,p}\) for some \(\alpha >1-\frac{1}{2H}\).

  ii. If \(H\le 1/2\), then \(b \in L^q_T B^\alpha _{p,p}\) for some \(\alpha >1-\frac{1}{2H}+\frac{1}{Hq}\).

Then for any \(\mu _0\in L^{p'}_x\) there exists a strong solution to the DDSDE (5.1) associated to \((\mu _0,b)\), which moreover satisfies \(\mathcal {L}(X_\cdot )\in L^\infty _T L^{p'}_x\).

Proof

We start by dealing with the case \(H\le 1/2\); at the end of the proof we explain how the reasoning needs to be modified for \(H>1/2\).

The case \(H\le 1/2\). In this case we can assume w.l.o.g. \(q<\infty \); recall that if \(f^n\) is a bounded sequence in \(L^q_T B^\alpha _{\infty ,\infty }\), for \((\alpha ,q)\) satisfying Assumption 3.12 such that \(f^n\rightarrow f\) in \(L^q_T B^{\alpha -1}_{\infty ,\infty }\), then by Corollary 3.17 the associated solutions

$$\begin{aligned} X^n_t = \xi + \int _0^t f^n(X^n_s)\,\text {d}s + W_t \end{aligned}$$

converge to the unique strong solution X of the SDE associated to \((\xi ,W,f)\) (we can assume \(\{X^n\}_{n\ge 1}\) and X to be defined on the same probability space for the same \((\xi ,W)\)).

Given b as in the hypothesis, consider a sequence of smooth, bounded functions \(b^n\) such that \(b^n\rightarrow b\) in \(L^q_T B^\alpha _{p,p}\) with \(\Vert \text {div}\, b^n\Vert _{L^1_T L^\infty _x} \le \Vert \text {div}\, b\Vert _{L^1_T L^\infty _x}\); let \(X^n\) be the solutions to

$$\begin{aligned} X^n_t = \xi + \int _0^t (b^n_s*\mathcal {L}(X^n_s))(X^n_s)\,\text {d}s + W_t, \end{aligned}$$

whose existence is granted by classical results (see e.g. [11, Theorem 7]) and set \(\mu ^n_t = \mathcal {L}(X^n_t)\). By [19, Proposition 4.3], there exists \(C=C(\Vert \text {div}\, b\Vert _{L^1_T L^\infty _x})>0\) such that

$$\begin{aligned} \sup _{n\in \mathbb {N}} \sup _{t\in [0,T]} \Vert \mu ^n_t\Vert _{L^{p'}_x} \le C \Vert \mu _0\Vert _{L^{p'}_x}<\infty . \end{aligned}$$

As a consequence, each \(X^n\) solves an SDE with drift \(f^n=b^n*\mu ^n\) satisfying

$$\begin{aligned} \sup _{n\in \mathbb {N}} \Vert f^n \Vert _{L^q_T B^\alpha _{\infty ,\infty }} \lesssim \sup _{n\in \mathbb {N}}\Vert b^n\Vert _{L^q_T B^\alpha _{p,p}}\, \sup _{n\in \mathbb {N}}\sup _{t\in [0,T]} \Vert \mu ^n_t\Vert _{L^{p'}_x} \lesssim C \Vert b\Vert _{L^q_T B^\alpha _{p,p}} \Vert \mu _0\Vert _{L^{p'}_x}<\infty . \end{aligned}$$

In turn this implies by Remark 3.19 that for any fixed \(\varepsilon >0\) we have the uniform estimate \(\sup _n \mathbb {E}[\llbracket X^n\rrbracket _{C^{H-\varepsilon }}]<\infty \); since moreover \(X^n_0=\xi \) for all \(n\in \mathbb {N}\), we can conclude by Ascoli–Arzelà that the sequence \(\{X^n\}_{n\ge 1}\) is tight in \(C_T\). We can then extract a (not relabelled) subsequence such that \(\mathcal {L}(X^n)\) converges weakly to some \(\mu \in \mathcal {P}(C_T)\); consequently \(\mu ^n_t\rightharpoonup \mu _t\) in \(\mathcal {P}(\mathbb {R}^d)\) for any \(t\in [0,T]\), where \(\mu _t = e_t \sharp \mu \) and \(e_t:C_T\rightarrow \mathbb {R}^d\) is the evaluation map. It follows from the uniform estimates that \(\Vert \mu _t\Vert _{L^{p'}_x} \le C \Vert \mu _0\Vert _{L^{p'}_x}\) as well.

We claim that the drifts \(f^n_t=b^n_t*\mu ^n_t\) converge to \(f_t:=b_t*\mu _t\) in \(L^q_T B^{\alpha -1}_{\infty ,\infty }\). Once this is shown, by the initial observation the solutions \(X^n\) must converge to the unique solution X associated to \((\xi ,W,f)\); then it must hold \(\mathcal {L}(X_t)=\mu _t\), \(f_t=b_t*\mathcal {L}(X_t)\) and so we can conclude that X is a solution to (5.1) with the desired regularity.

It remains to show the claim; to this end, we set

$$\begin{aligned} f^n_t-f_t =b_t^n*(\mu ^n_t-\mu _t)+(b_t^n-b_t)*\mu _t =: g^n_t + h^n_t. \end{aligned}$$

By Corollary A.5 in Appendix A, \(g^n_t\rightarrow 0\) in \(B^{{\tilde{\alpha }}}_{\infty ,\infty }\) for all \({\tilde{\alpha }}<\alpha \) and a.e. \(t\in [0,T]\); the bound \(|g^n_t|\le \Vert b_t\Vert _{B^\alpha _{p,p}} (\Vert \mu ^n_t\Vert _{L^{p'}_x}+\Vert \mu _t\Vert _{L^{p'}_x})\) and dominated convergence imply that \(g^n \rightarrow 0\) in \(L^q_T B^{{\tilde{\alpha }}}_{\infty ,\infty }\) for all \({\tilde{\alpha }}<\alpha \). For \(h^n\) we have the estimate

$$\begin{aligned} \lim _{n\rightarrow \infty } \Vert h^n\Vert _{L^q_T B^\alpha _{\infty ,\infty }} \le \sup _{t\in [0,T]} \Vert \mu _t\Vert _{L^{p'}_x}\,\lim _{n\rightarrow \infty }\Vert b^n-b\Vert _{L^q_T B^\alpha _{p,p}} = 0. \end{aligned}$$

Hence we have shown the claim and thus the conclusion in this case.

The case \(H>1/2\). As in the proof of Proposition 5.4, in this regime \(\mathcal {L}(X_\cdot )\in L^\infty _T L^{p'}_x\) is not enough to deduce straightaway that \(b*\mathcal {L}(X_\cdot )\) satisfies Assumption 3.12; however up to technical details, the proof is almost the same as above.

Specifically, we can consider a sequence \(\{b^n\}_n\) of smooth functions, uniformly bounded in \(C^{\alpha H}_T L^p_x \cap C^0_T B^\alpha _{p,p}\), with \(\text {div}\, b^n\) uniformly bounded in \(L^1_T L^\infty _x\) and such that \(b^n\rightarrow b\) in \(L^q_T B^\alpha _{p,p}\) for any \(q<\infty \). Then exploiting the a priori bound from [19, Proposition 4.3] and the argument from Proposition 5.4, one can derive uniform estimates for the solutions \(X^n\) associated to \(b^n\) and finally pass to the limit with the help of Corollary 3.17.

Alternatively, let us mention that the existence of a weak solution X satisfying \(\mathcal {L}(X_\cdot )\in L^\infty _T L^{p'}_x\) in this setting can be obtained by an application of [19, Proposition 4.4]. \(\square \)

Proof of Theorem 2.6

It is now an immediate consequence of Propositions 5.4 and 5.5. \(\square \)

5.2 Integrable kernels

We now restrict ourselves to the case \(H \le 1/2\) and drifts \(b\in L^q_T L^p_x\); in this setting we can present a second route to establishing existence of a solution with sufficiently regular law, to which we can apply Proposition 5.4.

Before proceeding further, let us explain why it is reasonable to expect this. By the Besov embedding \(L^p_x \hookrightarrow B^{-d/p}_{\infty ,\infty }\), drifts \(b\in L^q_T L^p_x\) satisfy Assumption 3.12 if and only if

$$\begin{aligned} \frac{1}{q}+\frac{Hd}{p}<\frac{1}{2}-H; \end{aligned}$$
(5.5)

however, unlike for the class \(L^q_T B^\alpha _{p,p}\), for \(b\in L^q_T L^p_x\) it is known after the works [32, 38] that the Girsanov transform (and thus weak existence and uniqueness in law for the associated SDEs) is available as soon as

$$\begin{aligned} \frac{1}{q}+\frac{Hd}{p}<\frac{1}{2}. \end{aligned}$$
(5.6)
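
The equivalence between Assumption 3.12 and (5.5) is a one-line computation. As a sketch, assume (as in condition ii. of Proposition 5.5) that for \(H\le 1/2\) the assumption amounts to \(\alpha >1-\frac{1}{2H}+\frac{1}{Hq}\), and take \(\alpha =-d/p\) from the Besov embedding above; multiplying by \(H>0\) and rearranging gives

$$\begin{aligned} -\frac{d}{p}> 1-\frac{1}{2H}+\frac{1}{Hq} \quad \Longleftrightarrow \quad -\frac{Hd}{p} > H-\frac{1}{2}+\frac{1}{q} \quad \Longleftrightarrow \quad \frac{1}{q}+\frac{Hd}{p}<\frac{1}{2}-H. \end{aligned}$$

Compared with (5.5), the right-hand side of (5.6) is larger by exactly H.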

As already seen in Sect. 3.4, the Girsanov transform allows one to deduce information on the regularity of \(\mathcal {L}(X_t)\), which in turn provides higher regularity of the effective drift \(b^\mu \) for the convolutional DDSDE. In particular, we may hope that starting from \(b\in L^q_T L^p_x\) for \((q,p)\) satisfying (5.6), we end up with \(b^\mu \in L^{{\tilde{q}}}_T L^{{\tilde{p}}}_x\) with \(({\tilde{q}},{\tilde{p}})\) satisfying (5.5).

At a technical level, we will proceed similarly to Sect. 5.1, first establishing uniform a priori estimates for regular b and then running an approximation procedure. We start by recalling and improving the available results on the Girsanov transform; as we are only interested in smooth approximations, for simplicity we restrict to regular drifts.

Lemma 5.6

Let \((\Omega ,\mathcal {F},\mathbb {P})\) be a probability space, \((\xi ,W)\) a \(\mathbb {R}^d\times C_T\)-valued r.v. on it with \(\mathcal {L}_\mathbb {P}(\xi ,W)=\mu _0\otimes \mu ^H\) for some \(H \le 1/2\) and let \(f:[0,T]\times \mathbb {R}^d\rightarrow \mathbb {R}^d\) be a globally Lipschitz drift, \(f\in L^q_T L^p_x\) for parameters \((p,q)\in [1,\infty ]^2\) satisfying (5.6); let X be the unique strong solution to

$$\begin{aligned} X_t= \xi + \int _0^t f_s (X_s)\mathop {}\!\text {d}s+ W_t \quad \forall \, t\in [0,T]. \end{aligned}$$

Then there exists a measure \(\mathbb {Q}\) equivalent to \(\mathbb {P}\) such that \(\mathcal {L}_\mathbb {Q}(X)=\mathcal {L}_\mathbb {P}(\xi +W)\) and there exists an increasing function F, depending on \(H,T,p,q\), such that

$$\begin{aligned} \mathbb {E}_\mathbb {Q}\bigg [\Big (\frac{\mathop {}\!\text {d}\mathbb {P}}{\mathop {}\!\text {d}\mathbb {Q}}\Big )^n\bigg ] + \mathbb {E}_\mathbb {Q}\bigg [\Big (\frac{\mathop {}\!\text {d}\mathbb {Q}}{\mathop {}\!\text {d}\mathbb {P}}\Big )^n\bigg ] \le F(n, \Vert f\Vert _{L^q_T L^p_x})<\infty \quad \forall \, n\in \mathbb {N}\end{aligned}$$

where the estimate does not depend on \(\mu _0\) nor the specific function f.

Proof

For deterministic initial data \(\xi =x_0\in \mathbb {R}^d\) (equiv. \(\mu _0=\delta _{x_0}\)), the statement is a direct consequence of [32,   Lemma 6.7], where it is already stressed that the estimates only depend on \(\Vert f\Vert _{L^q_T L^p_x}\) but not on \(x_0\) nor the specific f. The proof for random initial data \(\xi \) independent of W is now identical to that of Corollary 3.18; the estimate not depending on \(\xi \) follows from the property that \(\Vert f(x_0+\cdot )\Vert _{L^q_T L^p_x} = \Vert f\Vert _{L^q_T L^p_x}\) for all \(x_0\in \mathbb {R}^d\). \(\square \)

The next lemma shows that the initial regularity of \(\mu _0\) is propagated at positive times, establishing useful a priori estimates; the proof is similar to that of Proposition 3.21.

Lemma 5.7

Let \(\xi ,\,W,\,X,\,f,\,(p,q)\) be as in Lemma 5.6 and assume \(\mu _0\in L^r_x\) for some \(r\in (1,\infty )\); then

$$\begin{aligned} \sup _{t\in [0,T]} \Vert \mathcal {L}_\mathbb {P}(X_t)\Vert _{L^{{\tilde{r}}}_x} <\infty \quad \forall \, {\tilde{r}}\in (1,r). \end{aligned}$$

Proof

Fix \({\tilde{r}}<r\) and denote by \({\tilde{r}}'\) the conjugate exponent of \({\tilde{r}}\); take \(\varepsilon >0\) such that \(r'(1+\varepsilon )={\tilde{r}}'\). Let \(\mathbb {Q}\) be the measure given by Lemma 5.6 such that \(\mathcal {L}_\mathbb {Q}(X)=\mathcal {L}_\mathbb {P}(\xi +W)\); since \(\text {d}\mathbb {P}/\text {d}\mathbb {Q}\) admits moments of any order, for any \(g\in C^\infty _c(\mathbb {R}^d)\), by Hölder

$$\begin{aligned} |\langle g,\mathcal {L}(X_t)\rangle |&\le \mathbb {E}_\mathbb {P}[|g|(X_t)] = \mathbb {E}_\mathbb {Q}\bigg [|g|(X_t)\, \frac{\text {d}\mathbb {P}}{\text {d}\mathbb {Q}} \bigg ]\\&\le \mathbb {E}_\mathbb {Q}\big [|g|^{1+\varepsilon } (X_t)\big ]^{\frac{1}{1+\varepsilon }} \, \mathbb {E}_\mathbb {Q}\bigg [\Big ( \frac{\text {d}\mathbb {P}}{\text {d}\mathbb {Q}}\Big )^{1+\frac{1}{\varepsilon }}\bigg ]^{\frac{\varepsilon }{1+\varepsilon }}\\&\lesssim _\varepsilon \mathbb {E}_{\mathbb {P}}\big [|g|^{1+\varepsilon } (\xi +W_t)\big ]^{\frac{1}{1+\varepsilon }}\\&= \langle |g|^{1+\varepsilon }, \mu _0*\mathcal {L}_{\mathbb {P}}(W_t)\rangle ^{\frac{1}{1+\varepsilon }} \end{aligned}$$

where in the last passage we used the fact that \(\xi \) and \(W_t\) are independent under \(\mathbb {P}\). Recalling that \(\mathcal {L}(W_t)\) is a probability measure, by Hölder’s and then Young’s inequality we arrive at

$$\begin{aligned} |\langle g,\mathcal {L}(X_t)\rangle | \lesssim _\varepsilon \Vert |g|^{1+\varepsilon }\Vert _{L^{r'}_x}^{\frac{1}{1+\varepsilon }}\, \Vert \mu _0*\mathcal {L}(W_t) \Vert _{L^r_x}^{\frac{1}{1+\varepsilon }} \lesssim \Vert g\Vert _{L^{r'(1+\varepsilon )}_x}\, \Vert \mu _0 \Vert _{L^r_x}^{\frac{1}{1+\varepsilon }} = \Vert g\Vert _{L^{{\tilde{r}}'}_x}\, \Vert \mu _0 \Vert _{L^r_x}^{\frac{1}{1+\varepsilon }}. \end{aligned}$$

As the estimate is uniform over all \(g\in C^\infty _c(\mathbb {R}^d)\) and \(t\in [0,T]\), by duality we deduce that

$$\begin{aligned} \sup _{t\in [0,T]} \Vert \mathcal {L}(X_t)\Vert _{L^{{\tilde{r}}}_x}\lesssim _{\varepsilon } \Vert \mu _0\Vert _{L^r_x}^{\frac{1}{1+\varepsilon }}; \end{aligned}$$

as the reasoning holds for all \({\tilde{r}}<r\), the conclusion follows. \(\square \)
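
The choice of \(\varepsilon \) in the proof can be made explicit; the following short computation is an illustration and is not needed for the argument. Since \(r'=r/(r-1)\) and \({\tilde{r}}'={\tilde{r}}/({\tilde{r}}-1)\), the relation \(r'(1+\varepsilon )={\tilde{r}}'\) gives

$$\begin{aligned} \varepsilon = \frac{{\tilde{r}}'}{r'}-1 = \frac{r-{\tilde{r}}}{r({\tilde{r}}-1)}>0, \quad 1+\frac{1}{\varepsilon } = \frac{{\tilde{r}}(r-1)}{r-{\tilde{r}}}. \end{aligned}$$

For instance, \(r=2\) and \({\tilde{r}}=3/2\) yield \(\varepsilon =1/2\), so that the third moment of \(\text {d}\mathbb {P}/\text {d}\mathbb {Q}\) is used. The required moment \(1+1/\varepsilon \) diverges as \({\tilde{r}}\uparrow r\), which suggests why this argument does not reach the endpoint \({\tilde{r}}=r\).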

We are now ready to prove the existence of solutions to the DDSDE (5.1) for \(b\in L^q_T L^p_x\) and sufficiently integrable \(\mu _0\).

Proposition 5.8

Let \(H \le 1/2\), \((p,r,q) \in [1,\infty )\times (1,\infty ) \times (2,\infty ]\) such that

$$\begin{aligned} \frac{1}{q}+Hd \bigg (\frac{1}{p}+\frac{1}{r}-1\bigg )< \frac{1}{2} -H, \quad \frac{1}{q}+\frac{Hd}{p}<\frac{1}{2}. \end{aligned}$$
(5.7)

Then for any \(b \in L^q_T L^{p}_x\) and any \(\mu _0\in L^r_x\) there exists a strong solution X to the associated DDSDE (5.1), which moreover satisfies \(\mathcal {L}(X_\cdot )\in L^\infty _T L^{{\tilde{r}}}_x\) for any \({\tilde{r}}\in [1,r)\).

Proof

We pursue the same general strategy as in the proof of Proposition 5.5.

As condition (5.7) only contains strict inequalities, w.l.o.g. we can assume \(q<\infty \); consider a sequence \(\{b^n\}_n\) of Lipschitz, compactly supported functions such that \(b^n\rightarrow b\) in \(L^q_T L^p_x\) and \(\Vert b^n\Vert _{L^q_T L^p_x}\le \Vert b\Vert _{L^q_T L^p_x}\). It follows from [11, Theorem 7] that for every n there exists a unique solution \(X^n\) to the approximating DDSDE

$$\begin{aligned} X^n_t = \xi + \int _0^t (b^n_s*\mu ^n_s)(X^n_s)\,\mathop {}\!\text {d}s+ W_t, \quad \mu ^n_t=\mathcal {L}(X^n_t). \end{aligned}$$

In particular, each \(X^n\) is also a solution to an SDE with drift \(f^n_t:=b^n_t*\mathcal {L}(X^n_t)\) and by Young’s inequality

$$\begin{aligned} \sup _n \Vert f^n\Vert _{L^q_T L^p_x} \le \sup _n \Vert b^n\Vert _{L^q_T L^p_x} \le \Vert b\Vert _{L^q_T L^p_x}. \end{aligned}$$

Therefore we may apply Lemmas 5.6 and 5.7 to obtain the uniform bound

$$\begin{aligned} \sup _n \Vert \mu ^n_\cdot \Vert _{L^\infty _T L^{{\tilde{r}}}_x} <\infty \end{aligned}$$

for all \({\tilde{r}}\in [1,r)\). Applying Hölder’s inequality to the integral in time and Young’s inequality to the convolution in space, we find

$$\begin{aligned} \sup _n \Vert b^n *\mu ^n\Vert _{L^q_T L^{{\tilde{p}}}_x} <\infty \end{aligned}$$

for any \({\tilde{p}}< \bar{p}\), where

$$\begin{aligned} 1+\frac{1}{\bar{p}} = \frac{1}{r} + \frac{1}{p}. \end{aligned}$$

Using the fact that \({\tilde{p}}\) can be chosen arbitrarily close to \(\bar{p}\) and that the first inequality in (5.7) is strict, we see that the family \(\{b^n*\mu ^n\}\) is bounded in \(L^q_T L^{{\tilde{p}}}_x\) for parameters \((q,{\tilde{p}})\) satisfying

$$\begin{aligned} \frac{1}{q}+ \frac{Hd}{{\tilde{p}}} <\frac{1}{2}-H; \end{aligned}$$

but this is exactly condition (5.5), i.e. the regularity regime in which we know how to solve the SDE in a strong sense. On the other hand, the uniform bound for \(\Vert b^n\Vert _{L^q_T L^p_x}\) and the use of the Girsanov transform allow one to derive a uniform bound for \(\mathbb {E}[\llbracket X^n\rrbracket _{C^{H-\varepsilon }}]\) for any \(\varepsilon >0\); together with \(X^n_0=\xi \) for all n this implies tightness of \(\{X^n\}_n\), so that we can extract a (not relabelled) subsequence such that \(\mathcal {L}(X^n)\rightharpoonup \mu \) in \(\mathcal {P}(C_T)\) and \(\mu ^n_t\rightharpoonup \mu _t=e_t\sharp \mu \) for all \(t\in [0,T]\).
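
The exponent bookkeeping behind the choice of \({\tilde{p}}\) can be sketched as follows, assuming \(1/p+1/r\ge 1\) so that \(\bar{p}\in [1,\infty ]\) is a well-defined exponent. By the definition of \(\bar{p}\),

$$\begin{aligned} \frac{1}{q}+\frac{Hd}{\bar{p}} = \frac{1}{q}+Hd\bigg (\frac{1}{p}+\frac{1}{r}-1\bigg ) < \frac{1}{2}-H, \end{aligned}$$

which is precisely the first inequality in (5.7). Since the map \({\tilde{p}}\mapsto 1/q + Hd/{\tilde{p}}\) is continuous, the strict inequality persists for all \({\tilde{p}}<\bar{p}\) sufficiently close to \(\bar{p}\).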

From here, the argument is almost identical to that of Proposition 5.5: once we show that \(b^n*\mu ^n\rightarrow b*\mu \) in a sufficiently strong topology, then by Corollary 3.17 the solutions \(X^n\) will converge to the unique strong solution X associated to \((b*\mu ,\xi )\), which must therefore be a solution to the DDSDE associated to \(\mu _0=\mathcal {L}(\xi )\) and b. By the uniform bounds on \(\{\mu ^n\}_n\) and weak convergence \(\mu ^n_t\rightharpoonup \mu _t=\mathcal {L}(X_t)\), we also deduce that \(\mathcal {L}(X_\cdot )\in L^\infty _T L^{{\tilde{r}}}_x\) for all \({\tilde{r}}< r\).

Since by construction \(b^n\rightarrow b\) in \(L^q_T L^p_x\), \(\mu ^n_t\rightharpoonup \mu _t\) and \(\{b^n*\mu ^n\}_n\) is bounded in \(L^q_T L^{{\tilde{p}}}_x\) for some \((q,{\tilde{p}})\) satisfying (5.5), we can apply Corollary A.6 from Appendix A to deduce that (up to further relabelling \({\tilde{p}}-\varepsilon \) into \({\tilde{p}}\)) \(b^n*\mu ^n\rightarrow b*\mu \) in \(L^q_T L^{{\tilde{p}}}_x\). In light of the embedding \(L^q_T L^{{\tilde{p}}}_x \hookrightarrow L^q_T B^{-d/\tilde{p}}_{\infty ,\infty }\) and Corollary 3.17, this implies the conclusion. \(\square \)

We are now ready to prove the main result of this subsection.

Theorem 5.9

Let \(H \le 1/2\), \((p,r,q) \in [1,\infty )\times (1,\infty ) \times (2,\infty ]\) satisfy (5.7). Then for any \(b \in L^q_T L^p_x\) and \(\mu _0 \in L^r_x\) there exists a strong solution X to (5.1), which satisfies \(\mathcal {L}(X_\cdot )\in L^\infty _T L^{{\tilde{r}}}_x\) for any \({\tilde{r}}<r\); pathwise uniqueness and uniqueness in law hold in the class of solutions satisfying this condition.

Proof

The proof is based on a (non-trivial) combination of Propositions 5.4 and 5.8.

Under our assumptions, the existence of a strong solution such that \(\mathcal {L}(X_\cdot )\in L^\infty _T L^{{\tilde{r}}}_x\) for any \({\tilde{r}}<r\) is granted; in particular if \(r>p'\), then we can choose \({\tilde{r}}=p'\) and the assumptions of Proposition 5.4 are satisfied in this case thanks to the embedding \(L^q_T L^p_x \hookrightarrow L^q_T B^{-\varepsilon }_{p,p}\) for any \(\varepsilon >0\), giving the uniqueness part of the statement. Up to technicalities the borderline case \(r=p'\) can be treated similarly, exploiting the embedding \(L^q_T L^p_x \hookrightarrow L^q_T B^{-\varepsilon }_{{\tilde{p}},{\tilde{p}}}\) for some \({\tilde{p}}={\tilde{p}}(\varepsilon )>p\) chosen so that \(1/{\tilde{r}} + 1/{\tilde{p}} =1\).

Thus it remains to study the regime \(r< p'\), equivalently \(r'> p\); in this case we can choose \({\tilde{r}}<r\) such that \({\tilde{r}}'>p\) as well. By Besov embedding it then holds

$$\begin{aligned} L^q_T L^p_x \hookrightarrow L^q_T B^\alpha _{{\tilde{r}}',{\tilde{r}}'} \quad \text {for}\quad \alpha := -d\bigg ( \frac{1}{p}-\frac{1}{{\tilde{r}}'}\bigg ) = -d\bigg ( \frac{1}{p}+\frac{1}{{\tilde{r}}}-1\bigg ); \end{aligned}$$

to verify that b satisfies the assumptions of Proposition 5.4, it then suffices to check that

$$\begin{aligned} \frac{1}{q} + Hd \bigg ( \frac{1}{p}+\frac{1}{{\tilde{r}}}-1\bigg ) < \frac{1}{2}-H; \end{aligned}$$

since \({\tilde{r}}\) can be taken arbitrarily close to r, this follows from the first strict inequality in (5.7). \(\square \)

Remark 5.10

For \(r>d/(d-1)\), condition (5.6) implies condition (5.7). Therefore Theorem 2.7 is a particular subcase of Theorem 5.9.
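
The implication can be checked directly; the following is a sketch of the computation, assuming \(d\ge 2\) so that \(d/(d-1)\) is finite. Subtracting \(Hd(1-1/r)\) from both sides of (5.6) gives

$$\begin{aligned} \frac{1}{q}+Hd\bigg (\frac{1}{p}+\frac{1}{r}-1\bigg )< \frac{1}{2}-Hd\bigg (1-\frac{1}{r}\bigg ), \quad \text {and}\quad Hd\bigg (1-\frac{1}{r}\bigg )\ge H \ \Longleftrightarrow \ r\ge \frac{d}{d-1}. \end{aligned}$$

Hence for such r the right-hand side is at most \(\frac{1}{2}-H\), which is the first inequality in (5.7), while the second inequality in (5.7) is (5.6) itself.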