Abstract
Motivated by recent advances in the theory of stochastic partial differential equations involving nonlinear functions of distributions, like the Kardar–Parisi–Zhang (KPZ) equation, we reconsider the unique solvability of one-dimensional stochastic differential equations whose drift is a distribution, by means of rough paths theory. Existence and uniqueness are established in the weak sense when the drift reads as the derivative of an \(\alpha \)-Hölder continuous function, \(\alpha >1/3\). The regularity of the drift part is investigated carefully and a related stochastic calculus is also proposed, which makes the structure of the solutions more explicit than within the earlier framework of Dirichlet processes.
1 Introduction
Given a family of continuous paths \(({\mathbb {R}}\ni x \mapsto Y_t(x))_{t \ge 0}\) with values in \({\mathbb {R}}\), we are interested in the solvability of the stochastic differential equation
\[
\mathrm{d}X_{t} = \partial _{x} Y_{t}(X_{t})\, \mathrm{d}t + \mathrm{d}B_{t}, \quad t \ge 0, \qquad (1)
\]
with a given initial condition, where \(\partial _{x} Y_t\) is understood as the derivative of \(Y_t\) in the sense of distributions and \((B_{t})_{t \ge 0}\) is a standard one-dimensional Wiener process.
When \(\partial _{x} Y_t\) makes sense as a measurable function, with suitable integrability conditions, pathwise existence and uniqueness are known to hold: See the earlier papers by Zvonkin [32] and Veretennikov [30] when the derivative exists as a bounded function, in which case existence and uniqueness hold globally, together with the more recent result by Krylov and Röckner [23] when \(\partial _{x} Y_t\) is in \(L^p_\mathrm{loc}((0,+\infty ) \times {\mathbb {R}}^d)\) for some \(p>d+2\)—the equation being set over \({\mathbb {R}}^d\) instead of \({\mathbb {R}}\)—in which case existence and uniqueness only hold locally; see also the Saint-Flour Lecture Notes by Flandoli [10] for a complete account. In the case when \(\partial _{x} Y_t\) only exists as a distribution, existence and uniqueness have mostly been discussed within the restricted time-homogeneous framework. When the field \(Y\) is independent of time, \(X\) indeed reads as a diffusion process with \((1/2)\exp (-2Y(x)) \partial _{x} (\exp (2Y(x)) \partial _{x})\) as generator. Then, solutions to (1) can be proved to be the sum of a Brownian motion and of a process of zero quadratic variation and are thus referred to as Dirichlet processes. In this setting, unique solvability can be proved to hold in the weak or strong sense according to the regularity of \(Y\); see for example the papers by Flandoli et al. [12, 13] on the one hand and the paper by Bass and Chen [3] on the other hand. We also refer to the more recent work by Catellier and Gubinelli [6] for the case when \((B_{t})_{t \ge 0}\) is replaced by a general rough signal, like the trajectory of a fractional Brownian motion with an arbitrary Hurst parameter.
In the current paper, we allow \(Y\) to depend upon time, making impossible any factorization of the generator of \(X\) in divergence form and thus requiring a more systematic treatment of the singularity of the drift. In order to limit the technicality of the paper, the analysis is restricted to the case when the diffusion coefficient in (1) is 1, which is already, as explained right below, a case of real practical interest and which is, anyway, somewhat universal because of the time change property of Brownian motion. As suggested in the aforementioned paper by Bass and Chen [3], pathwise existence and uniqueness are then no longer expected to hold whenever the path \(Y_t\) has oscillations of Hölder type with a Hölder exponent strictly less than \(1/2\). For that reason, we will investigate the unique solvability of (1) in the so-called weak sense by tackling a corresponding formulation of the martingale problem. Indeed, we will consider the case when \(Y_t\) is Hölder continuous, with a Hölder exponent, denoted by \(\alpha \), strictly greater than \(1/3\), hence possibly strictly less than \(1/2\), thus yielding solutions to (1) of weak type only, that is solutions that are not adapted to the underlying noise \((B_{t})_{t \ge 0}\). At this stage of the introduction, it must be stressed that the threshold \(1/3\) for the Hölder exponent of the path is exactly of the same nature as the one that occurs in the theory of rough paths. It is also worth mentioning that a variant of our set-up has just been considered by Flandoli et al. [11], who handle the same equation, the dimension of the state space being possibly larger than 1 but the Hölder exponent of \(Y_{t}\) being (strictly) greater than \(1/2\).
Actually, the theory of rough paths will play a major role in our analysis. The strategy for solving (1) is indeed mainly inspired by the papers [23, 30, 32] we mentioned right above and consists in finding harmonic functions associated with the (formal) generator
\[
{{\mathcal {L}}}_{t} = \partial _{x} Y_{t}(x)\, \partial _{x} + \frac{1}{2}\, \partial _{xx}^{2}. \qquad (2)
\]
Solving Partial Differential Equations (PDEs) driven by \(\partial _{t} + {{\mathcal {L}}}_{t}\), say in the standard mild formulation, then requires integrating with respect to \(\partial _{x} Y_t(x)\) (in \(x\)), which is a non-classical operation. This is precisely where the rough paths theory initiated by Lyons (see [24, 25]) comes in: As recently exposed by Hairer in his seminal paper [19] on the KPZ equation and in the precursor paper [18] on rough stochastic PDEs, mild solutions to PDEs driven by \(\partial _{t} + {{\mathcal {L}}}_{t}\) may be expanded as rough integrals involving the standard heat kernel on the one hand and the ‘rough’ increments \(\partial _{x} Y_t\) on the other hand. In our case, we are interested in the solutions of the PDE
\[
\partial _{t} u_{t}(x) + \partial _{x} Y_{t}(x)\, \partial _{x} u_{t}(x) + \frac{1}{2}\, \partial _{xx}^{2} u_{t}(x) = f_{t}(x), \qquad (3)
\]
when set on a cylinder \([0,T] \times {\mathbb {R}}\), with a terminal boundary condition at time \(T>0\), and when driven by a smooth function \(f\). The solutions obtained by letting the source term \(f\) vary generate a large enough ‘core’ to apply the standard martingale problem approach of Stroock and Varadhan [28] and thus to characterize the laws of the solutions to (1).
Unfortunately, although such a strategy seems quite clear, some precaution is in fact needed. When \(\alpha \) is between \(1/3\) and \(1/2\), which is the typical range of application of Lyons’ theory, the expansion of mild solutions as rough integrals involving the heat kernel and the increments of \(\partial _{x} Y_{t}\) is not so straightforward. It is indeed not enough to assume that the path \({\mathbb {R}}\ni x \mapsto Y_{t}(x)\) has a rough path structure for any given time \(t \ge 0\). As explained in detail in Sect. 2, the rough path structures, when taken at different times, also interact, calling for the existence, at any time \(t \ge 0\), of a ‘lifted’ 2-dimensional rough path with \(Y_{t}\) as first coordinate. We refrain from detailing the shape of such a lifting right here as it is discussed at length in the sequel. We just mention that, in Hairer [19], the family \((Y_{t}(x))_{t \ge 0,x \in {\mathbb {R}}}\) has a Gaussian structure, which makes it possible to construct the lifting by means of generic results on rough paths for Gaussian processes, see Friz and Victoir [16]. Existence of the lifting under more general assumptions is thus a challenging question, which is (partially) addressed in Sect. 5: The lifting is proved to exist in other cases, including that when \(\alpha >1/2\) and when \((Y_{t}(x))_{t \ge 0,x \in {\mathbb {R}}}\) is smooth enough in time (and in particular when it is time homogeneous). Another difficulty is that, contrary to Hairer [18, 19], in which problems are set on the torus, the PDE is here set on a non-compact domain. This requires an additional analysis of the growth of the solutions in terms of the behavior of \((Y_{t}(x))_{t \ge 0,x \in {\mathbb {R}}}\) for large values of \(\vert x \vert \), such an analysis being essential to discuss the non-explosion of the solutions to (1).
Besides existence and uniqueness, it is also of great interest to understand the specific dynamics of the solutions to (1). Part of the paper is thus dedicated to a careful analysis of the infinitesimal variation of \(X\), that is of the asymptotic behavior of \(X_{t+h} - X_{t}\) as \(h\) tends to \(0\). In this perspective, we prove that the increments of \(X\) may be split into two pieces: a Brownian increment, as suggested by the initial writing of Eq. (1), and a sort of drift term, the magnitude of which is of order \(h^{(1+\beta )/2}\), for some \(\beta >0\) that is nearly equal to \(\alpha \). Such a decomposition is much stronger than the standard decomposition of a Dirichlet process into the sum of a martingale and of a zero quadratic variation process. In a sense, it generalizes the one obtained by Bass and Chen [3] in the time-homogeneous framework when \(\alpha \ge 1/2\). As a typical example, \((1+\beta )/2\) is nearly equal to \(3/4\) when \(Y_t\) is almost \(1/2\)-Hölder continuous, which fits for instance the framework investigated by Hairer [19]. In particular, except in trivial cases when the distribution is a true function, integration with respect to the drift term in (1) cannot be performed as a classical integration with respect to a function of bounded variation. In fact, since the value of \((1+\beta )/2\) is strictly larger than \(1/2\), it makes sense to understand the integration with respect to the drift term as a kind of Young integral, in the spirit of the earlier paper [31]. We say here ‘a kind of Young integral’ and not ‘a Young integral’ directly since, as we will see in the analysis, it proves useful to develop a stochastic version of Young’s integration, that is a Young-like integration that takes into account the probabilistic notion of adaptedness, as is the case in Itô’s calculus.
In the end, we prove that, under appropriate assumptions on the regularity of the field \((Y_t(x))_{t \ge 0,x\in {\mathbb {R}}}\), Eq. (1) is uniquely solvable in the weak sense (for a given initial condition) and that the solution reads as
\[
X_{t} = X_{0} + \int _{0}^{t} b(s,X_{s},\mathrm{d}s) + B_{t}, \quad t \ge 0, \qquad (4)
\]
where \(b\) maps \([0,+\infty ) \times {\mathbb {R}}\times [0,+\infty )\) to \({\mathbb {R}}\) and the integral with respect to \(b(t,X_{t},\mathrm{d}t)\) makes sense as a stochastic Young integral, the magnitude of \(b(t,X_{t},\mathrm{d}t)\) being of order \(\mathrm{d}t^{(1+\beta )/2}\).
The examples we have in mind are twofold. The first one is the so-called ‘Brownian motion in a time-dependent random environment’ or ‘Brownian motion in a time-dependent random potential’. Indeed, much has been said about the long time behavior of the Brownian motion in a time-independent random potential, such as the Brownian motion in a Brownian potential, see for example [2, 5, 8, 20, 21, 27, 29]. We expect our paper to be a first step toward a more general analysis of one-dimensional diffusions in a time-dependent random potential, even if, in the current paper, nothing is said about the long run behavior of the solutions to (1), this question being left to further investigations. As already announced, the second example we have in mind is the so-called Kardar–Parisi–Zhang (KPZ) equation (see [22]), to which much attention has been paid recently, see among others Bertini and Giacomin [4], Hairer [19] and Friz and Hairer [15, Chap. 15] about the well-posedness and Amir et al. [1] about the long time behavior. In this framework, \(Y\) must be thought of as a realization of the time-reversed solution of the KPZ equation, that is \(Y_t(x)=u(\omega ,T-t,x)\), \(T\) being positive and \(u(\omega ,\cdot ,\cdot )\) denoting the random solution to the KPZ equation, defined either as in Bertini and Giacomin by means of the Cole–Hopf transform or as in Hairer by means of rough paths theory. Then, Eq. (1) reads as the equation describing the dynamics of the canonical path \((w_{t})_{0 \le t \le T}\) on the canonical space \({\mathcal {C}}([0,T],{\mathbb {R}})\) under the polymer measure
where \(\dot{\zeta }\) is a space-time white noise and \({\mathbb {P}}\) is the Wiener measure, the white noise being independent of the realizations of the Wiener process under \({\mathbb {P}}\). In this perspective, our result provides a quenched description of the infinitesimal dynamics of the polymer.
The paper is organized as follows. We remind the reader of rough paths theory in Sect. 2, where the main results about the solvability of (1) are also stated. Section 3 is devoted to the analysis of PDEs driven by the operator (2). In Sect. 4, we propose a stochastic variant of Young’s integral in order to give a rigorous meaning to (4). We discuss in Sect. 5 the construction of the ‘rough’ iterated integral that makes the whole construction work. Finally, in Sect. 6, we explain the connection with the KPZ equation.
2 General strategy and main results
Our basic strategy to define a solution to the SDE (1) relies on a suitable adaptation of Zvonkin’s method for solving SDEs driven by a bounded and measurable drift (see [32]) and of Stroock and Varadhan’s martingale problem approach (see [28]). The main point is to transform the solutions of the original equation into martingales. Of course, such a strategy requires a suitable version of Itô’s formula and hence a proper notion of harmonic functions for the generator of the diffusion process (1). This is precisely where rough paths theory comes in, in the same way as it does in Hairer’s paper for solving the KPZ equation.
This section is thus devoted to a sketchy presentation of rough paths theory and then to an appropriate reformulation of Zvonkin’s method.
2.1 Rough paths on a segment
We start with reminders about rough paths, following Gubinelli’s approach in [17]. Given \(\alpha \in (0,1]\), \(n\in {\mathbb {N}}{\setminus } \{0\}\) and a segment \({\mathbb {I}} \subset {\mathbb {R}}\), we denote by \({\mathcal {C}}^{\alpha }({\mathbb {I}},{\mathbb {R}}^n)\) the set of \(\alpha \)-Hölder continuous functions \(f : {\mathbb {I}} \rightarrow {\mathbb {R}}^n\) and we define the seminorm
\[
[\![f]\!]_{\alpha }^{\mathbb {I}} := \Vert f\Vert _{\infty }^{\mathbb {I}} \vee \Bigl[ \bigl(1\vee \max _{x\in {\mathbb {I}}}|x|\bigr)^{-\alpha /2} \sup _{x,y \in {\mathbb {I}}, x\ne y} \frac{|f(y)-f(x)|}{|y-x|^{\alpha }} \Bigr],
\]
with \(\Vert f\Vert _{\infty }^{\mathbb {I}} := \sup _{x \in {\mathbb {I}}} |f(x)|\) and \(a \vee b= \max (a,b)\). Note that the factor \((1\vee \max _{x\in {\mathbb {I}}}|x|)^{-\alpha /2}\) is somewhat useless and could be replaced by \(1\) at this stage of the paper. Actually it will really matter in the sequel, when considering paths over the whole line. Similarly, we denote by \({\mathcal {C}}_2^{\alpha }({\mathbb {I}},{\mathbb {R}}^{n})\) the set of functions \({\fancyscript{R}}\) from \({\mathbb {I}}^2\) to \({\mathbb {R}}^n\) such that \({\fancyscript{R}}(x,x)=0\) for every \(x\) and with finite norm \(\Vert {\fancyscript{R}}\Vert _{\alpha }^{\mathbb {I}}:=\sup _{x,y \in {\mathbb {I}},x\ne y}\{ |{\fancyscript{R}}(x,y)|/|y-x|^\alpha \}\). (Functionals defined on the product space \({\mathbb {R}}^2\) will be denoted by calligraphic letters).
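To fix ideas, here is a small numerical sketch (not part of the original analysis) of how such a weighted Hölder seminorm can be evaluated on a grid: it combines the sup norm with the Hölder quotient damped by the factor \((1\vee \max _{x\in {\mathbb {I}}}|x|)^{-\alpha /2}\) mentioned above. The function name and the test path \(f(x)=\sqrt{|x|}\) are illustrative choices.

```python
import numpy as np

def weighted_holder_seminorm(x, f, alpha):
    """Discrete proxy for the weighted alpha-Hoelder seminorm on a grid x:
    the sup norm of f combined with the largest Hoelder quotient, the latter
    damped by the factor (1 v max|x|)^(-alpha/2) discussed in the text."""
    sup_norm = np.max(np.abs(f))
    # all pairwise quotients |f(y) - f(x)| / |y - x|^alpha
    dx = np.abs(x[:, None] - x[None, :])
    df = np.abs(f[:, None] - f[None, :])
    mask = dx > 0
    quotient = np.max(df[mask] / dx[mask] ** alpha)
    weight = max(1.0, np.max(np.abs(x))) ** (-alpha / 2)
    return max(sup_norm, weight * quotient)

# example: f(x) = sqrt(|x|) is 1/2-Hoelder on [-1, 1]
x = np.linspace(-1.0, 1.0, 401)
f = np.sqrt(np.abs(x))
print(weighted_holder_seminorm(x, f, 0.5))
```

On a fixed segment the damping factor is a harmless constant, matching the remark that it ‘could be replaced by 1’ at this stage; it only becomes relevant on the whole line.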
For \(\alpha \in (1/3,1]\), we call \(\alpha \)-rough path (on \({\mathbb {I}}\)) a pair \((W,{\fancyscript{W}})\) where \(W\in {\mathcal {C}}^\alpha ({\mathbb {I}},{\mathbb {R}}^n)\) and \({\fancyscript{W}}\in {\mathcal {C}}_2^{2\alpha }({\mathbb {I}},{\mathbb {R}}^{n^2})\) such that, for any indices \(i,j\in \{1,\dots ,n\}\), the following relation holds: for any \( x\leqslant y \leqslant z,\)
\[
{\fancyscript{W}}^{i,j}(x,z) = {\fancyscript{W}}^{i,j}(x,y) + {\fancyscript{W}}^{i,j}(y,z) + \bigl(W^i(y)-W^i(x)\bigr)\bigl(W^j(z)-W^j(y)\bigr). \qquad (5)
\]
We then denote by \({{\mathcal {R}}}^{\alpha }({\mathbb {I}},{\mathbb {R}}^n)\) the set of \(\alpha \)-rough paths; we will often only write \({\varvec{W}}\) for the rough path \((W,{\fancyscript{W}})\). The quantity \({\fancyscript{W}}^{i,j}(x,y)\) must be understood as a value for the iterated integral (or cross integral) “\(\int _x^y(W^i(z)-W^i(x))\mathrm{d}W^j(z)\)” of \(W\) with respect to itself (we will also use the tensor product “\(\int _x^y(W(z)-W(x)) \otimes \mathrm{d}W(z)\)” to denote the product between coordinates). When \(\alpha = 1\), such an integral exists in the standard sense. When \(\alpha > 1/2\), it exists as well, but in the so-called Young sense (see [24, 31] and Lemma 24 below). When \(\alpha \in (1/3,1/2]\), which is the typical range of values in rough paths theory, there is no longer a canonical way to define the cross integral, and it must be given a priori in order to define a proper integration theory with respect to \(\mathrm{d}W\). In that framework, condition (5) imposes some consistency in the behavior of \({\fancyscript{W}}\) when intervals of integration are concatenated. Of course, \({\fancyscript{W}}\) plays a role in the range \(\alpha \in (1/3,1/2]\) only, but in order to avoid any distinction between the cases \(\alpha \in (1/3,1/2]\) and \(\alpha \in (1/2,1]\), we will refer to the pair \((W,{\fancyscript{W}})\) in both cases, even when \(\alpha >1/2\), in which case \({\fancyscript{W}}\) is simply given by the iterated integral of \(W\).
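For a smooth path, the consistency relation (5) between concatenated intervals of integration can be made concrete numerically. The sketch below uses an illustrative two-dimensional path \(W=(\cos ,\sin )\) and plain Riemann sums for the cross integral; at the level of the discrete sums the relation even holds exactly, since the compensating product of increments is precisely what re-centers the left factor at the new base point.

```python
import numpy as np

# fine grid on [0, 1] and a smooth 2-d path W = (cos, sin)
N = 20000
t = np.linspace(0.0, 1.0, N + 1)
W = (np.cos(2 * np.pi * t), np.sin(2 * np.pi * t))

def idx(a):
    # index of the grid point a (a is chosen to be a multiple of 1/N)
    return int(round(a * N))

def cross_integral(i, j, a, b):
    """Riemann-sum approximation of int_a^b (W^i(z) - W^i(a)) dW^j(z)."""
    ia, ib = idx(a), idx(b)
    Wi, Wj = W[i][ia:ib + 1], W[j][ia:ib + 1]
    return np.sum((Wi[:-1] - Wi[0]) * np.diff(Wj))

x, y, z = 0.1, 0.4, 0.9
# relation (5): integrals over [x, y] and [y, z], plus the compensating
# product of increments, recover the integral over [x, z]
chen_gap = (cross_integral(0, 1, x, z)
            - cross_integral(0, 1, x, y)
            - cross_integral(0, 1, y, z)
            - (W[0][idx(y)] - W[0][idx(x)]) * (W[1][idx(z)] - W[1][idx(y)]))
print(abs(chen_gap))  # zero up to floating-point rounding
```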
Given \({\varvec{W}} \in {\mathcal {R}}^{\alpha }({\mathbb {I}},{\mathbb {R}}^n)\) as above, the point is then to define the integral “\(\int _{x}^y v(z) \mathrm{d}W(z)\)” of some function \(v\) (from \({\mathbb {I}}\) into \({\mathbb {R}}\)) with respect to the coordinates of \(\mathrm{d}W\) for some \([x,y] \subset {\mathbb {I}}\). When \(v\) belongs to \({\mathcal {C}}^{\beta }({\mathbb {I}},{\mathbb {R}})\), for \(\beta > 1-\alpha \), Young’s theory applies, without any further reference to the second-order structure \({\fancyscript{W}}\) of \({\varvec{W}}\). When \(\beta \le 1- \alpha \), Young’s theory fails, but, in the typical example when \(v\) is \(W-W(x)\) itself (or one coordinate of \(W-W(x)\)), the integral is well-defined as it is precisely given by \({\fancyscript{W}}\). In order to benefit from the second-order structure \({\fancyscript{W}}\) for integrating a more general \(v\), the increments of \(v\) must actually be structured in a fashion similar to that of \(W\). This motivates the following notion (which holds whatever the sign of \(\alpha +\beta -1\)): For \(\beta \in (1/3,1]\), a path \(v\) is said to be \(\beta \)-controlled by \(W\) if \(v\in {\mathcal {C}}^\beta ({\mathbb {I}},{\mathbb {R}})\) and there is a function \(\partial _Wv\in {\mathcal {C}}^{\beta }({\mathbb {I}},{\mathbb {R}}^n)\) such that the remainder term
is in \({\mathcal {C}}_2^{2 \beta '}({\mathbb {I}},{\mathbb {R}})\), with \(\beta ' := \beta \wedge 1/2\). In the above right-hand side, \(\partial _{W} v(x)\) reads as a row vector—as it is often the case for gradients—and \((W(y)-W(x))\) as a column vector. Although \(\partial _{W} v\) may not be uniquely defined, we will sometimes write \(v\) for \((v,\partial _{W} v)\) when no confusion is possible on the value of \(\partial _{W} v\). For instance, any function \(v \in {\mathcal {C}}^{2\beta '}({\mathbb {I}},{\mathbb {R}})\) is \(\beta \)-controlled by \(W\), a possible (but not necessarily unique) choice for the ‘derivative’ \(\partial _{W} v\) being \(\partial _{W} v \equiv 0\).
We are then able to define the integral of a function \(v\) controlled by \(W\) (see [17–19]):
Theorem 1
Given \(\alpha ,\beta \in (1/3,1]\), let \({\varvec{W}} \in {\mathcal {R}}^{\alpha }({\mathbb {I}},{\mathbb {R}}^n)\) be a rough path and \(v \in {\mathcal {C}}^\beta ({\mathbb {I}},{\mathbb {R}})\) be a path \(\beta \)-controlled by \(W\). For two reals \(x<y\) in \({\mathbb {I}}\), consider the compensated (vectorial) Riemann sum:
\[
S(\Delta ) := \sum _{i=0}^{N-1} \Bigl[ v(x_{i}) \bigl(W(x_{i+1})-W(x_{i})\bigr) + \partial _{W} v(x_{i})\, {\fancyscript{W}}(x_{i},x_{i+1}) \Bigr],
\]
where \(\Delta =(x=x_0<\cdots <x_N=y)\) is a partition of \([x,y]\) (above \(\partial _{W} v(x_{i})\) is a row vector and \({\fancyscript{W}}(x_{i},x_{i+1})\) a matrix). Then, as the step size \(\pi (\Delta )\) of the partition tends to 0, \(S(\Delta )\) converges to a limit, denoted by \(\int _x^y v(z) \mathrm{d}W(z)\), independent of the choice of the approximating partitions. Moreover, there is a constant \(C=C(n,\alpha ,\beta )\) such that,
Observe that, with our prescribed range of values for \(\alpha \) and \(\beta \), the exponents \(2\alpha +\beta \) and \(\alpha +2 \beta '\) are (strictly) greater than 1. This observation is crucial to prove the convergence of \(S(\Delta )\) as the step size tends to \(0\). When \(v\) is an arbitrary function in \({\mathcal {C}}^{2\beta '}({\mathbb {I}},{\mathbb {R}})\) (with \(\partial _{W} v \equiv 0\)), Theorem 1 applies and the integral \(\int _{x}^y v(z) \mathrm{d}W(z)\) coincides with the Young integral. Notice also that, most of the time, we shall work with \(\beta < \alpha \).
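As a sanity check of the mechanism behind Theorem 1, the following sketch compares, on a deliberately coarse partition, the plain Riemann sum with the compensated one. The choices are illustrative and smooth: \(W=\sin (3\,\cdot )\), the controlled path \(v=W^2\) with Gubinelli derivative \(\partial _{W} v = 2W\), and \({\fancyscript{W}}(x,y)=(W(y)-W(x))^2/2\), which is the iterated integral of a smooth one-dimensional path. The second-order correction visibly improves the approximation of \(\int _0^1 v\, \mathrm{d}W\).

```python
import numpy as np

# illustrative smooth model: one-dimensional path W, controlled path v = W^2
W = lambda x: np.sin(3.0 * x)
v = lambda x: W(x) ** 2
dWv = lambda x: 2.0 * W(x)                  # Gubinelli derivative of v
W2 = lambda x, y: 0.5 * (W(y) - W(x)) ** 2  # iterated integral of a smooth 1-d path

def plain_sum(grid):
    x0, x1 = grid[:-1], grid[1:]
    return np.sum(v(x0) * (W(x1) - W(x0)))

def compensated_sum(grid):
    x0, x1 = grid[:-1], grid[1:]
    return np.sum(v(x0) * (W(x1) - W(x0)) + dWv(x0) * W2(x0, x1))

grid = np.linspace(0.0, 1.0, 21)    # deliberately coarse partition
exact = W(1.0) ** 3 / 3.0           # int_0^1 W^2 dW = W(1)^3 / 3, since W(0) = 0
print(abs(plain_sum(grid) - exact), abs(compensated_sum(grid) - exact))
```

Per interval, the plain sum misses a term of size \(\sim |\Delta W|^2\) while the compensated sum only misses \(\sim |\Delta W|^3\); this is the discrete shadow of the exponents \(2\alpha +\beta \) and \(\alpha + 2\beta ' > 1\) in the convergence argument.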
We now address the problem of stability of the integral with respect to \(W\). Replacing \(((v,\partial _{W}v),{\varvec{W}})\) by a sequence of smooth approximations \(((v^n, \partial _{W^n} v^n),{\varvec{W}}^{n})_{n \ge 1}\), the question is whether the (classical) integrals of the \((v^n)_{n \ge 1}\)’s with respect to the approximated paths are indeed close to the rough integral of \(v\) with respect to \(W\). This is indeed the case if
(i) the convergence of \({\varvec{W}}^n\) to \({\varvec{W}}\) holds in the sense of rough paths, that is \([\![W-W^{n}]\!]_{\alpha }^{\mathbb I} + \Vert {\fancyscript{W}}-{\fancyscript{W}}^{n}\Vert _{2\alpha }^{\mathbb I}\) tends to \(0\) as \(n\) tends to infinity (\({\fancyscript{W}}^n\) standing for the true iterated integral of \(W^n\)), in which case we say that the rough path \({\varvec{W}}\) (or \((W,{\fancyscript{W}})\)) is geometric;
(ii) the convergence of \((v^n,\partial _{W^n} v^n)\) to \((v,\partial _{W} v)\) holds in the sense of controlled paths, that is \([\![v-v^n ]\!]_{\beta }^{\mathbb I}+ [\![\partial _{W}v-\partial _{W^n}v^n ]\!]_{\beta }^{\mathbb I} + \Vert {\fancyscript{R}}^v - {\fancyscript{R}}^{v^n} \Vert _{2\beta '}^{\mathbb I}\) tends to \(0\) as \(n\) tends to infinity.
2.2 Time indexed families of rough paths
As one may guess, in order to handle (1), we have in mind to choose \(W(x)=Y_{t}(x)\), \(x \in {\mathbb {R}}\), and to apply rough paths theory at any fixed time \(t \ge 0\) (thus requiring to choose \({\mathbb {I}} = {{\mathbb {R}}}\) and subsequently to extend the notion of rough paths to the whole of \({\mathbb {R}}\), which is done in the next paragraph). However, a difficult aspect of (1) is precisely that \((Y_{t}(x))_{t \ge 0,x \in {\mathbb {R}}}\) is time dependent. If it were time homogeneous, part of the analysis we provide here would be useless: we refer for instance to [3, 12, 13]. From the technical point of view, the reason is that, in the homogeneous framework, the analysis of the generator of the process \(X\) reduces to the analysis of a standard one-dimensional ordinary differential equation. Whenever the coefficients depend on time, the connection with ODEs breaks down, thus calling for non-trivial refinements. From the intuitive point of view, time inhomogeneity makes things much more challenging as the underlying differential structure in space varies at every time: In order to integrate with respect to \(\partial _{x} Y_{t}(x)\) in the rough paths sense, the second-order structure of the rough paths must be defined first, and it is then time dependent as well. This says that the problem consists of a time-indexed family of rough paths, but, a priori (and unfortunately), it is not clear whether defining the rough paths time by time is enough to handle the problem. Actually, as we explain below, it may not be enough, as the rough paths structures interact with one another, thus requiring additional assumptions on \((Y_{t}(x))_{t \ge 0,x \in {\mathbb {R}}}\).
As above, we first limit our exposition of time-dependent rough paths to the case when \(x\) lives in a segment \({\mathbb {I}}\). For some time horizon \(T>0\), and for \(\alpha ,\gamma >0\), we define the following (semi-)norms for continuous functions \(f:[0,T)\times {\mathbb {I}}\rightarrow {\mathbb {R}}^n\) and \({\fancyscript{M}}: [0,T) \times {\mathbb {I}}^{2}\rightarrow {\mathbb {R}}^{n}\):

with the convention that \(\Vert f\Vert _{0,\alpha }^{[0,T) \times {\mathbb {I}}} = \sup _{0 \le t < T} \Vert f \Vert _{\alpha }^{\mathbb {I}}\), together with
We then define the spaces \({\mathcal {C}}^{\gamma ,\alpha }([0,T)\times {\mathbb {I}},{\mathbb {R}}^n)\) and \({\mathcal {C}}_2^{\gamma ,\alpha }([0,T) \times {\mathbb {I}},{\mathbb {R}}^{n})\) accordingly.
For \(\alpha \in (1/3,1]\), we call time dependent \(\alpha \)-rough path a family of rough paths \( ({\varvec{W}}_{t})_{0 \le t <T}= (W_{t},{\fancyscript{W}}_{t})_{0 \le t < T}\) where \(W\in {\mathcal {C}}([0,T)\times {\mathbb {I}},{\mathbb {R}}^n)\) and \({\fancyscript{W}}\in {\mathcal {C}}([0,T)\times {\mathbb {I}}^2,{\mathbb {R}}^{n^2})\) such that, for any \(t\in [0,T)\), the pair \((W_t,{\fancyscript{W}}_t)\) is an \(\alpha \)-rough path and
We denote by \({{\mathcal {R}}}^{\alpha }([0,T) \times {\mathbb {I}},{\mathbb {R}}^n)\) the set of time-dependent \(\alpha \)-rough paths endowed with the seminorm \(\Vert \cdot \Vert ^{[0,T) \times {\mathbb {I}}}_{0,\alpha }.\) For \(\beta \in (1/3,1]\), we then say that \(v\in {\mathcal {C}}([0,T)\times {\mathbb {I}},{\mathbb {R}})\) is \(\beta \)-controlled by the paths \((W_{t})_{0 \le t < T}\) if \(v\in {\mathcal {C}}^{\beta /2,\beta }([0,T) \times {\mathbb {I}},{\mathbb {R}})\) and there exists a function \(\partial _W v\in {\mathcal {C}}^{\beta /2,\beta }([0,T) \times {\mathbb {I}},{\mathbb {R}}^n)\) such that, for any \(t \in [0,T)\), the remainder below is in \({\mathcal {C}}_{2}^{2\beta '}({\mathbb {I}},{\mathbb {R}})\):
2.3 Rough paths on the whole line
So far, we have only defined rough paths (or time dependent rough paths) on segments. As Eq. (1) is set on the whole space, we must extend the definition to \({\mathbb {R}}\), the point being to specify the behavior at infinity of the underlying (rough) paths and of the corresponding controlled functions.
When the family \((Y_{t}(x))_{t \ge 0,x \in {\mathbb {R}}}\) is differentiable in \(x\), a sufficient condition to prevent a blow-up in (1) is to require \((\partial _{x} Y_{t}(x))_{t \ge 0,x \in {\mathbb {R}}}\) to be at most of linear growth in \(x\). In our setting, \((Y_{t}(x))_{t \ge 0, x \in {\mathbb {R}}}\) is singular and it makes no sense to discuss the growth of its derivative. The point is thus to control the growth of the local Hölder norm of \((Y_{t}(x))_{t \ge 0,x \in {\mathbb {R}}}\) together with (as shown later) the growth of the local Hölder norm of the associated iterated integral.
This motivates the following definition. For \(\alpha \in (1/3,1]\) and \(\chi >0\), we call \(\alpha \)-rough path (on \({{\mathbb {R}}}\)) with rate \(\chi \) a pair \({\varvec{W}}=(W,{\fancyscript{W}})\) such that, for any \(a\ge 1\), the restriction of \((W,{\fancyscript{W}})\) to \([-a,a]\) is in \({{\mathcal {R}}}^{\alpha }([-a,a])\), and
We denote by \({{\mathcal {R}}}^{\alpha ,\chi }({\mathbb {R}},{\mathbb {R}}^n)\) the set of all such \((W,{\fancyscript{W}})\).
This definition extends to time-dependent families of rough paths. Given \(T>0\), we say that \((W_{t},{\fancyscript{W}}_{t})_{0 \le t <T}\) belongs to \({{\mathcal {R}}}^{\alpha ,\chi }([0,T) \times {\mathbb {R}},{\mathbb {R}}^n)\) if
In a similar way, we must specify the admissible growth of the functions that are controlled by rough paths on the whole \({\mathbb {R}}\). A comfortable framework is to require exponential bounds. Given \((W,{\fancyscript{W}}) \in {{\mathcal {R}}}^{\alpha ,\chi }({\mathbb {R}},{\mathbb {R}}^n)\) and \(\vartheta \ge 1\), we say that a function \(v: {\mathbb {R}}\rightarrow {\mathbb {R}}\) is in \({\mathcal B}^{\beta ,\vartheta }({\mathbb {R}},W)\) for some \(\beta \in (1/3,1]\) if, for any segment \({\mathbb {I}} \subset {\mathbb {R}}\), the restriction of \(v\) to \({\mathbb {I}}\) is \(\beta \)-controlled by \(W\) and
Abusively, we omit the dependence upon \(\partial _{W} v\) in \(\Theta ^{\vartheta }(v)\). Similarly, for \((W_{t},{\fancyscript{W}}_{t})_{0 \le t <T} \in {{\mathcal {R}}}^{\alpha ,\chi }([0,T) \times {\mathbb {R}},{\mathbb {R}}^n)\), a function \(v : [0,T) \times {\mathbb {R}}\rightarrow {\mathbb {R}}\) is in \({\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\) if, for any \(a \ge 1\), its restriction to \([0,T) \times [-a,a]\) is \(\beta \)-controlled by \((W_{t})_{0 \le t < T}\) and, for some \(\lambda > 0\),

is finite, with \( E_T^{\vartheta ,\lambda }(t,a):=\exp [\lambda (T-t)+\vartheta a(1 + T-t)]\) (it reflects the backward nature of (3)). Note that the set \({\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\) does not depend on \(\lambda \), but that \(\Theta ^{\vartheta ,\lambda }_{T}(v)\) does.
By Theorem 1, we can easily obtain a control of the integral \(\int v_t\mathrm{d}Y_t\) by the norm \(\Theta _T^{\vartheta ,\lambda }(v)\):
Lemma 2
Assume \(\beta \le \alpha \). Then, there exists a constant \(C=C(n,\alpha ,\beta )\), such that, for any \(\vartheta ,\lambda ,a\ge 1\), any \(v\in {\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\) and any \((t,x,y)\in [0,T)\times [-a,a]^2\),
with \({\fancyscript{D}}(t,a,z) := |z|^{2\alpha }a^{2\chi }+|z|^{2\alpha +\beta } a^{2\chi +\frac{\beta }{2}} + \vert z \vert ^{{\alpha + 2\beta '}} a^{\chi } (a^{{\beta '}} +(T-t)^{-{\frac{\beta '}{2}}})\).
2.4 Enlargement of the rough path structure
We now discuss how the time dependent rough path structures of the drift \((Y_{t}(x))_{t \ge 0,x \in {\mathbb {R}}}\) interact with one another as time varies.
Formally the generator associated with (1) reads \({{\mathcal {L}}} = \partial _{t} + \partial _{x}( Y_{t}(x)) \partial _{x} + (1/2) \partial _{xx}^2\). This suggests that, on \([0,T) \times {\mathbb {R}}\), harmonic functions (that is zeros of the generator) read as
where \(p\) denotes the standard heat kernel and \(P\) the standard heat semi-group (so that \(P_{t} f(x) = \int _{{\mathbb {R}}} p_{t}(x-y) f(y) \mathrm{d}y\)). In the case when the terminal condition of the function \(u\) is given by \(u_{T}(x)=x\), a formal expansion of \(\partial _{x} u_{t}(x)\) in the neighborhood of \(T\) gives
In the first-order term of the expansion, the space integral makes sense as the singularity can be transferred from \(Y_{r}\) onto \(\partial _{x} p_{r-t}(x-z)\), provided the integration by parts is licit: using the approximation argument discussed above, it is indeed licit when the rough path is geometric. In order to give a meaning to this first-order term, the point is to check that the resulting singularity in time is integrable, which is addressed in Sect. 3. Unfortunately, the story is much less simple for the second-order term. Any formal integration by parts leads to a term involving a ‘cross’ integral between the space increments of \(Y\), but taken at different times: This is where rough structures, indexed by different times, interact.
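The transfer of the singularity alluded to above is, for a smooth \(Y\), just an integration by parts against the heat kernel: \(\int _{{\mathbb {R}}} p_{s}(x-z)\, Y'(z)\, \mathrm{d}z = \int _{{\mathbb {R}}} \partial _{x} p_{s}(x-z) \bigl(Y(z)-Y(x)\bigr) \mathrm{d}z\), the re-centering by \(Y(x)\) being free since \(\int \partial _{x} p_{s}(x-z)\, \mathrm{d}z = 0\). The sketch below checks this identity numerically with the illustrative choice \(Y=\sin \); the grid and parameters are arbitrary.

```python
import numpy as np

def p(s, u):   # standard heat kernel p_s(u)
    return np.exp(-u * u / (2.0 * s)) / np.sqrt(2.0 * np.pi * s)

def dp(s, u):  # its derivative in the space variable
    return -u / s * p(s, u)

Y, dY = np.sin, np.cos  # smooth stand-in for the field Y_r

s, x = 0.3, 0.7
z = np.linspace(x - 12.0, x + 12.0, 200001)  # wide enough for the Gaussian tails
dz = z[1] - z[0]
lhs = np.sum(p(s, x - z) * dY(z)) * dz                # int p_s(x-z) Y'(z) dz
rhs = np.sum(dp(s, x - z) * (Y(z) - Y(x))) * dz       # derivative moved onto the kernel
print(lhs, rhs)
```

In the singular regime, only the right-hand side makes sense, and the re-centered increment \(Y(z)-Y(x)\) is exactly the quantity that the rough path structure controls.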
We refrain from detailing the computations at this stage of the paper and find it more convenient to defer their presentation to Sect. 3 below. Basically, the point is to give, at any time \(t \in [0,T)\), a meaning to the integral \(\int _{x}^y Z_{t}^T(z) \mathrm{d}Y_{t}(z)\), where, for all \(t \in [0,T)\) and \(x \in {\mathbb {R}}\),
Assuming that \(\sup _{0 \le t <T} \sup _{x,y \in {\mathbb {R}}} [ (1+ \vert x \vert ^{\chi } + \vert y \vert ^{\chi })^{-1} \Vert Y_{t} \Vert _{\alpha }^{[x,y]}]\) is finite (for some \(\chi >0\)), the above integral is well-defined (see Lemma 19 below). In order to make sure that the cross integral of \(Z_{t}^T\) with respect to \(Y_{t}\) exists, the point is to assume that the pair \((Y_{t},Z_{t}^T)\) can be lifted up to a rough path of dimension 2, which is to say that there exists some \({\fancyscript{W}}^T\) with values in \({\mathbb {R}}^4\) such that \(((Y,Z^T),{\fancyscript{W}}^T)\) is an \(\alpha \)-time dependent rough path, for some \(\alpha >1/3\). We will see in Sect. 5 conditions under which such a lifting \({\fancyscript{W}}^T\) indeed exists.
2.5 Generator of the diffusion and related Dirichlet problem
We now provide some solvability results for the Dirichlet problem driven by \(\partial _{t} + {{\mathcal {L}}}_{t}\) in (2):
Definition 3
Given \(Y\in {\mathcal {C}}([0,T)\times {\mathbb {R}},{\mathbb {R}})\), assume that there exists \({\fancyscript{W}}^T\) such that \((W^T=(Y,Z^T),{\fancyscript{W}}^T)\) belongs to \({{\mathcal {R}}}^{\alpha ,\chi }([0,T) \times {\mathbb {R}},{\mathbb {R}}^2)\) with \(\alpha >1/3\) and \(\chi >0\). Given \(f\in {\mathcal {C}}([0,T]\times {\mathbb {R}},{\mathbb {R}})\), with \(\sup _{a \ge 1} \sup _{0 \le t \le T} e^{-\vartheta a} \Vert f_{t} \Vert _{\infty }^{[-a,a]} < \infty \) for some \(\vartheta \ge 0\), a function \(u : [0,T] \times {\mathbb {R}}\rightarrow {\mathbb {R}}\) is a mild solution on \([0,T]\times {\mathbb {R}}\) to the problem \(\mathcal {P}(Y,f,T)\):
if \(u\) is continuously differentiable with respect to \(x\), with \(\partial _{x} u \in {\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W^T)\) for some \(\beta \in (1/3,1]\), and satisfies
Finiteness of the integrals over \({\mathbb {R}}\) will be checked in Lemma 11 below. We also emphasize that a notion of weak solution could be given as well, but we will not use it.
Remark 4
When \((W^T,{\fancyscript{W}}^T)\) is geometric, the last term in the right-hand side coincides (by integration by parts, which is made licit by approximation by smooth paths) with \(\int _t^T\int _{{\mathbb {R}}} p_{r-t}(x-y) \partial _x u_r(y)\mathrm{d}Y_r(y) \mathrm{d}r\), which reads as a more ‘natural formulation’ of a mild solution and which is, incidentally, the formulation used in Sections 3.1 and 3.2 of Hairer [19] for investigating the KPZ equation and in Section 3.1 of Hairer [18] for handling rough SPDEs. The formulation (14) seems a bit more tractable as it splits into two well separated parts the rough integration and the regularization effect of the heat kernel. Once again, both are equivalent in the geometric (and in particular smooth) setting.
Here is a crucial result in our analysis (the proof is postponed to Sect. 3).
Theorem 5
Let \(Y\) be as in Definition 3. Then, for any \(f \in {\mathcal {C}}([0,T] \times {\mathbb {R}},{\mathbb {R}})\) and \(u^T\in \mathcal {C}^1({\mathbb {R}},{\mathbb {R}})\), with
for some \(\vartheta \ge 1\), \(\gamma >0\) and \(\beta \in (1/3,\alpha )\), with \(\beta > 2 \chi \), there is a unique solution, in the space \({\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W^T)\), of the problem \(\mathcal {P}(Y,f,T)\) with \(u_T=u^T\) as terminal condition.
Letting \(m:=\max [1, T,\vartheta ,m_{0},\kappa _{\alpha ,\chi }(W^T,{\fancyscript{W}}^T) ]\), we can find \(C=C(m,\alpha ,\beta ,\chi )\), such that, for any \((t,x) \in [0,T] \times {\mathbb {R}}\),
and for any \((s,t,x,y) \in [0,T]^2 \times {\mathbb {R}}^2\),
We now address the question of stability of mild solutions under mollification of \((W^T,{\fancyscript{W}}^T)\). We call a mollification of \(W^T\) ‘physical’ if it consists in mollifying \(Y\) in \(x\) first—the mollification is then smooth in \(x\), the derivatives being continuous in space and time—and then in replacing \(Y\) by its mollified version in (13). Denoting by \(Y^n\) the mollified path at the \(n\)th step of the mollification, the resulting \(Z^{n,T}\) is smooth in \(x\), the derivatives being also continuous in space and time. This permits us to define the corresponding pair \((W^{n,T},{\fancyscript{W}}^{n,T})\) directly. In that specific geometric setting, we claim the following (once again, the proof is deferred to Sect. 3):
Proposition 6
In the same framework as in Theorem 5, assume that the rough path \((W^T,{\fancyscript{W}}^T)\) is geometric in the sense that there exists a sequence of smooth paths \((Y^n)_{n \ge 1}\) such that the corresponding sequence \((W^{n,T}=(Y^n,Z^{n,T}))_{n \ge 1}\) satisfies
-
(1)
\(\Vert (W^T-W^{n,T},{\fancyscript{W}}^T-{\fancyscript{W}}^{n,T})\Vert _{0,\alpha }^{[0,T) \times {\mathbb {I}}}\) tends to \(0\) as \(n\) tends to \(\infty \) for any segment \({\mathbb {I}} \subset {{\mathbb {R}}}\), where \({\fancyscript{W}}^{n,T}_{t}(x,y) = \int _{x}^y (W_{{t}}^{n,T}(z) - W_{{t}}^{n,T}(x)) \otimes \mathrm{d}W_{{t}}^{n,T}(z)\), for \(t \in [0,T)\) and \(x,y \in {\mathbb {R}}\),
-
(2)
\(\sup _{n \ge 1}\kappa _{\alpha ,\chi }((W^{n,T}_{t}, {\fancyscript{W}}_{t}^{n,T} )_{0 \le t \le T})\) is finite (see (11) for the definition of \(\kappa _{\alpha ,\chi }\)).
Then, the associated solutions \((u^n)_{n \ge 1}\) (in the sense of Definition 3) and their gradients \((v^n = \partial _{x} u^n)_{n \ge 1}\) converge towards \(u\) and \(v= \partial _{x} u\) uniformly on compact subsets of \([0,T] \times {\mathbb {R}}\).
It is worth noting that each \(u^n\) is actually a classical solution of the PDE (3) driven by \(Y^n\) instead of \(Y\). The reason is that, in the characterization (14) of a mild solution (in the rough sense), the rough integral coincides with a standard Riemann integral when \(W^n\) is smooth. We refer to [18, Corollary 3.12] for another use of this (quite standard) observation.
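To illustrate this observation on a toy example (the smooth \({\mathbb {R}}^2\)-valued path below is our own choice, purely for illustration and unrelated to (13)), the iterated integral of item (1) in Proposition 6 can be computed as a plain Riemann sum, and its symmetric part is then determined by the increments, which is the mechanism behind the geometric property:

```python
import numpy as np

def W(z):
    # toy smooth path with values in R^2 (illustrative choice only)
    return np.array([np.sin(z), np.cos(2 * z)])

def dW(z):
    # its derivative
    return np.array([np.cos(z), -2 * np.sin(2 * z)])

def iterated_integral(x, y, n=20000):
    # left-point Riemann sum for  int_x^y (W(z) - W(x)) (tensor) dW(z)
    zs = np.linspace(x, y, n + 1)[:-1]
    h = (y - x) / n
    A = W(zs) - W(x)[:, None]            # increments W(z) - W(x), shape (2, n)
    return np.einsum('in,jn->ij', A, dW(zs)) * h

I = iterated_integral(0.0, 1.0)
incr = W(1.0) - W(0.0)
# for smooth lifts, the symmetric part equals half the outer product of the increments
sym = 0.5 * (I + I.T)
```

The antisymmetric part of `I` (the Lévy area) is the genuinely rough-path-specific datum; the symmetric part carries no extra information in the smooth setting.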
2.6 Martingale problem
We now define the martingale problem associated with (1).
Definition 7
Let \(T_{0}>0\) and \(x_{0}\in {\mathbb {R}}\). Given \(Y\in {\mathcal {C}}([0,T_{0})\times {\mathbb {R}},{\mathbb {R}})\), assume that, for any \(0 \le T \le T_{0}\), there exists \({\fancyscript{W}}^T\) such that \((W^T=(Y,Z^T),{\fancyscript{W}}^T)\) belongs to \({{\mathcal {R}}}^{\alpha ,\chi }([0,T) \times {\mathbb {R}},{\mathbb {R}}^2)\) with \(\alpha >1/3\) and \(\chi < \alpha /2\), the supremum \(\sup _{0 \le T \le T_{0}} \kappa _{\alpha ,\chi }((W_{t}^T,{\fancyscript{W}}_{t}^T)_{0 \le t <T})\) being finite.
A probability measure \({\mathbb {P}}\) on \(\mathcal {C}([0,T_{0}],{\mathbb {R}})\) (endowed with the canonical filtration \(({\mathcal F}_{t})_{0 \le t \le T_{0}}\)) is said to solve the martingale problem related to \({\mathcal {L}}\) starting from \(x_{0}\) if the canonical process \((X_t)_{0 \le t \le T_{0}}\) satisfies the following two conditions:
-
(1)
\({\mathbb {P}}(X_0=x_{0})=1\),
-
(2)
for any \(T \in [0,T_{0}]\), \(f \in {\mathcal {C}}([0,T] \times {\mathbb {R}},{\mathbb {R}})\) and \(u^{T} \in {\mathcal {C}}^1({\mathbb {R}},{\mathbb {R}})\) satisfying (15) with respect to some \(\vartheta \ge 1\), \(\gamma >0\) and \(\beta \in (2\chi ,\alpha )\), the process \(( u_{t}(X_{t})-\int _0^{t}f_r(X_r)\mathrm{d}r)_{0\le t\le T}\) is a square integrable martingale under \({\mathbb {P}}\), where \(u\) is the solution of \(\mathcal {P}(Y,f,T)\) with \(u_{T}=u^T\).
A similar definition holds by letting the canonical process start from \(x_{0}\) at some time \(t_{0} \ne 0\), in which case we say that the initial condition is \((t_{0},x_{0})\) and (1) is replaced by \({\mathbb {P}}(\forall s \in [0,t_{0}], \ X_{s}=x_{0})=1\).
Note that we require more in Definition 7 than in Definition 3 as we let the terminal time \(T\) vary within the interval \([0,T_{0}]\). In particular, in order to consider a solution to the martingale problem, it is not enough to assume that, at time \(T_{0}\), \((W^{T_{0}},{\fancyscript{W}}^{T_{0}})\) belongs to \({{\mathcal {R}}}^{\alpha ,\chi }([0,T_{0}) \times {\mathbb {R}},{\mathbb {R}}^2)\). The rough path structure must exist at any \(0 \le T \le T_{0}\), the regularity of the path \(W^T\) and of its iterated integral \({\fancyscript{W}}^T\) being uniformly controlled in \(T \in [0,T_{0}]\).
Our goal is then to prove existence and uniqueness of a solution:
Theorem 8
In addition to the assumption of Definition 7, assume that, at any time \(0 \le T \le T_{0}\), \((W^T,{\fancyscript{W}}^T)\) is geometric (in the sense of Proposition 6), the paths \((Y^n)_{n \ge 1}\) used for defining the approximating paths \((W^{n,T},{\fancyscript{W}}^{n,T})_{n \ge 1}\) being the same for all values of \(T\) and the supremum \(\sup _{0 \le T \le T_{0}} \sup _{n \ge 1} \kappa _{\alpha ,\chi }((W_{t}^{n,T},{\fancyscript{W}}_{t}^{n,T})_{0 \le t <T})\) being finite. Then, for any initial condition \((t_{0},x_{0}) \in [0,T_{0}] \times {\mathbb {R}}\), there exists a unique solution to the martingale problem (on \([0,T_{0}]\)) with \((t_{0},x_{0})\) as initial condition. It is denoted by \({\mathbb {P}}_{t_{0},x_{0}}\). The mapping \([0,T_{0}] \times {\mathbb {R}}\ni (t,x) \mapsto {\mathbb {P}}_{t,x}(A)\) is measurable for any Borel subset \(A\) of the canonical space \({\mathcal {C}}([0,T_{0}],{\mathbb {R}})\). Moreover, the family \(({\mathbb {P}}_{t,x})_{(t,x) \in [0,T_{0}] \times {\mathbb {R}}}\) satisfies the strong Markov property.
Remark 9
The martingale problem is here set on the finite interval \([0,T_{0}]\). Obviously, existence and uniqueness extend to \([0,\infty )\).
The proof of Theorem 8 is split into two distinct parts: Existence of a solution is discussed in Sect. 2.7 whereas uniqueness is investigated in Sect. 2.8.
2.7 Solvability of the martingale problem
We start with:
Proposition 10
Given \(T_{0}>0\), assume that the assumption of Theorem 8 is in force. For an initial condition \((t_{0},x_{0}) \in [0,T_{0}] \times {\mathbb {R}}\), there exists a solution to the martingale problem (on \([0,T_{0}]\)) with \((t_{0},x_{0})\) as initial condition.
Proof of Proposition 10
First step. Without any loss of generality, we can assume that \(t_{0}=0\). Considering a sequence of paths \((Y^n)_{n \ge 1}\) as in the statement of Proposition 6, we can also assume that \(Y^n\) has bounded derivatives on the whole space, see Lemma 33 in the appendix. We then notice that, for a given \(x_{0} \in {\mathbb {R}}\), the following SDE (set on some filtered probability space endowed with a Brownian motion \((B_{t})_{0 \le t \le T_{0}}\)) admits a unique solution:
Second step. Choosing \(\beta \in (1/3,\alpha )\) with \(\beta > 2 \chi \) and letting \(u^T(x)=\exp ( \vartheta x )\) for a given \(T \in [0,T_{0}]\), we denote by \((u^n_{t}(x))_{0 \le t \le T,x \in {\mathbb {R}}}\) the mild solution to (14) with \(f=0\) and \(Y\) replaced by \(Y^n\). Following the remark after Proposition 6, \(u^n\) is a classical solution of
so that, by Itô’s formula, the process \((u^n_{t}(X^n_{t}))_{0 \le t \le T}\) is a true martingale (since we know, from Theorem 5, that \(u^n\) is at most of exponential growth). Then, (16) yields
where \(C=C(m,\alpha ,\beta ,\chi )\) as in Theorem 5. A crucial point is that \(m\) is uniformly bounded in \(T \in [0,T_{0}]\), so that it can be assumed to be independent of \(T\). Replacing \(u^T(x)\) by \(u^T(-x)\), we get the same result with \(\vartheta \) replaced by \(-\vartheta \) in the above inequality, so that
Therefore, the exponential moments of \(X^n_{T}\) are bounded, uniformly in \(n \ge 1\). As \(C\) is independent of \(T \in [0,T_{0}]\), we deduce that the marginal exponential moments of \((X^n_{t})_{0 \le t \le T_{0}}\) are bounded, uniformly in \(n \ge 1\).
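As a purely illustrative numerical sketch of the two steps above (and not part of the argument), one can simulate the regularized SDE by an Euler–Maruyama scheme and monitor an empirical symmetrized exponential moment of \(X^n_T\); the bounded smooth drift \(b=\cos \) below is a stand-in of our own choosing for \(\partial _{x} Y^n\):

```python
import numpy as np

rng = np.random.default_rng(0)

def drift(x):
    # toy bounded smooth drift, standing in for the derivative of a mollified field Y^n
    return np.cos(x)

def euler(x0, T, n_steps, n_paths, rng):
    # Euler-Maruyama scheme for dX = drift(X) dt + dB
    dt = T / n_steps
    x = np.full(n_paths, x0)
    for _ in range(n_steps):
        x = x + drift(x) * dt + np.sqrt(dt) * rng.standard_normal(n_paths)
    return x

XT = euler(0.0, 1.0, 400, 2000, rng)
# symmetrized exponential moment, mirroring the use of both u^T(x) and u^T(-x)
exp_moment = 0.5 * (np.exp(XT) + np.exp(-XT)).mean()
```

With a drift bounded by \(1\), the moment stays of order \(\cosh (1)e^{1/2}\), uniformly in the mollification step, which is the uniform-in-\(n\) boundedness exploited above.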
Third step. Now we change the domain of definition and the terminal condition of the PDE. We consider the PDE on \([0,t+h] \times {\mathbb {R}}\) with \(u^{t+h}(x)=x\) as terminal condition, where \(0 \le t \le t+h \le T_{0}\). To simplify, we still denote by \((u^n_{s}(x))_{0 \le s \le t+h,x \in {\mathbb {R}}}\) the mild solution to (14) with \(f=0\), \(Y\) replaced by \(Y^n\) and \(u^n_{t+h}=u^{t+h}\) as terminal condition. By Itô’s formula,
Therefore, by (16) and (17), we deduce that, for any \(q \ge 2\), there exists a constant \(C_{q}\), independent of \(n\), such that
By the second step (uniform boundedness of the exponential moments) and by Kolmogorov’s criterion, we deduce that the processes \((X^n_{t})_{0 \le t \le T_{0}}\) are tight.
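For reference, the version of Kolmogorov's tightness criterion invoked here is the standard one: a moment bound on the increments, uniform in \(n\) and with a time exponent strictly larger than one, together with tightness of the initial laws, yields tightness on the space of continuous paths. Schematically (a standard statement, not specific to this paper):

```latex
% uniform moment bound on increments, for some q > 0 and \varepsilon > 0:
\sup_{n \ge 1}\,
{\mathbb E}\bigl[\,\vert X^n_t - X^n_s \vert^q\,\bigr]
\;\le\; C\,\vert t - s\vert^{\,1+\varepsilon},
\qquad 0 \le s \le t \le T_0,
% together with tightness of the initial laws (X^n_0)_{n \ge 1},
% implies tightness of the laws of (X^n_t)_{0 \le t \le T_0}
% on {\mathcal C}([0,T_0],{\mathbb R}).
```

Here the increment bound of the third step provides the exponent \(1+\varepsilon \) once \(q\) is chosen large enough.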
Fourth step. It remains to prove that any weak limit \((X_{t})_{0 \le t \le T_{0}}\) is a solution to the martingale problem. The basic argument is taken from [9, Lemma 5.1]. However, it requires a careful adaptation since the test functions \(u\) in Definition 7 may be of exponential growth (whereas test functions are assumed to be bounded in [9, Lemma 5.1]). We thus give the complete proof. For \(T \in [0,T_{0}]\), we know from Proposition 6 that we can find a sequence \((u^n)_{n \ge 1}\) of classical solutions to the problems \({\mathcal P}(Y^n,f,T)\) such that the sequence \((u^n,\partial _{x} u^n)_{n \ge 1}\) converges towards \((u,\partial _{x} u)\), uniformly on compact subsets of \([0,T] \times {\mathbb {R}}\). Applying Itô’s formula to each \((u^n_{t}(X^n_{t}))_{0 \le t \le T}\), \(n \ge 1\), we deduce that
By (16), we know that the functions \((\partial _{x} u^n)_{n \ge 1}\) are at most of exponential growth, uniformly in \(n \ge 1\). Moreover, we recall that the processes \(((X^n_{t})_{0 \le t \le T})_{n \ge 1}\) have finite marginal exponential moments, uniformly in \(n \ge 1\) as well. Therefore, the martingales \(((u^n_{t}(X^n_{t}) - u^n_{0}(X^n_{0}) - \int _{0}^t f_{s}(X^n_{s}) \mathrm{d}s)_{0 \le t \le T})_{n \ge 1}\) are bounded in \(L^2\), uniformly in \(n \ge 1\). Letting \(n\) tend to infinity, this completes the proof. \(\square \)
2.8 Proof of Theorem 8
We now complete the proof of Theorem 8. Existence has been already proved in Proposition 10. The point is thus to prove uniqueness and measurability of the solution with respect to the initial point.
We first establish uniqueness of the marginal laws. Assume indeed that \({\mathbb {P}}_{1}\) and \({\mathbb {P}}_{2}\) are two solutions of the martingale problem with the same initial condition \((t_{0},x_{0})\). Then, for any \(f \in {\mathcal {C}}([0,T] \times {\mathbb {R}},{\mathbb {R}})\) satisfying (15), it holds
where \({\mathbb {E}}_{1}\) and \({\mathbb {E}}_{2}\) denote the expectations under \({\mathbb {P}}_{1}\) and \({\mathbb {P}}_{2}\) (\((X_{t})_{0 \le t \le T_{0}}\) denotes the canonical process). Indeed, denoting by \(u\) the solution of the PDE \({\mathcal P}(Y,f,T_{0})\) with \(0\) as terminal condition at time \(T_{0}\), we know from the definition of the martingale problem that, both under \({\mathbb {P}}_{1}\) and \({\mathbb {P}}_{2}\), the process \(( u_{s}(X_{s}) - \int _{t_{0}}^{s} f_{r}(X_{r}) \mathrm{d}r )_{t_{0} \le s \le T_{0}}\) is a martingale. Therefore, taking expectations under \({\mathbb {P}}_{1}\) and \({\mathbb {P}}_{2}\) and noticing that \(u_{T_{0}}(X_{T_{0}}) = 0\) almost surely under \({\mathbb {P}}_{1}\) and \({\mathbb {P}}_{2}\), we deduce that both sides in (21) are equal to \(-u_{t_{0}}(x_{0})\), which is enough to complete the proof of (21) and thus to prove that the marginal laws of the canonical process are the same under \({\mathbb {P}}_{1}\) and \({\mathbb {P}}_{2}\).
Following Theorems 4.2 and 4.6 in [9], we deduce that the martingale problem has a unique solution (note that the results in [9] hold for time-homogeneous martingale problems whereas the martingale problem investigated here is time-inhomogeneous; adding an additional variable in the state space, the problem we are considering can be easily turned into a time-homogeneous one). Measurability and the strong Markov property are proved as in [9].
3 Solving the PDE
This section is devoted to the proof of Theorem 5. As the definition of a mild solution in Definition 3 consists in a convolution of a rough integral with the heat kernel, the first step is to investigate the smoothing effect of a Gaussian kernel onto a rough integral. Existence and uniqueness of a mild solution to (14) are then proved by means of a contraction argument.
Parts of the results presented here are variations of the ones obtained in Sections 3.1 and 3.2 of Hairer [19] for solving the KPZ equation, but differ slightly in the very construction of a mild solution, see Remark 4. The reader may also have a look at Section 3 in Hairer [18] for a somewhat simpler framework.
3.1 Mild solutions as Picard’s fixed points
In this subsection, we fix \(\alpha ,\beta , \chi , \vartheta , \lambda \) such that \(1/3<\beta <\alpha \le 1\), \(\chi <\beta /2\) and \(\vartheta , \lambda \ge 1\). Given \(Y\in {\mathcal {C}}([0,T)\times {\mathbb {R}},{\mathbb {R}})\) for some final time \(T\le 1\), we assume that there exists \({\fancyscript{W}}^T\) such that \((W_t^T=(Y_{t},Z_{t}^T),{\fancyscript{W}}_{t}^T)_{0 \le t \le T}\) is in \({{\mathcal {R}}}^{\alpha ,\chi }([0,T) \times {\mathbb {R}},{\mathbb {R}}^2)\), \((Z_{t}^T)_{0 \le t \le T}\) being given by (13). We will simply denote by \(\kappa \) the semi norm \(\kappa _{\alpha ,\chi }((W_{t}^T,{\fancyscript{W}}_{t}^T)_{t\in [0,T)})\) and we will omit the superscript \(T\) in \(Z^T\), \(W^T\) and \({\fancyscript{W}}^T\). We also recall the definition of \(\Theta _T^{\vartheta ,\lambda }(v)\) for \(v\in {\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\):

with \(E_{T}^{\vartheta ,\lambda }(t,a)=\exp [\lambda (T-t)+\vartheta a(1 + T-t)]\). We start with the following technical lemma, which plays a crucial role in the proof of Theorem 5.
Lemma 11
For any \(\gamma _1\le \gamma _2 \le \beta /2\) and \(k\in {\mathbb {N}}^*\), there is a constant \(C =C(\alpha ,\beta ,\gamma _1,\gamma _2,\chi ,k)\) (independent of \(\vartheta \) and \(\lambda \)) such that, for any \(t,\tau \in [0,T)\), with \(\tau \le T-t\), and any \(a \ge 1\), the following bounds hold for any \(v\in {\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\) and any \(x \in [-a,a]\):
with \(\Psi = Ce^{C T \vartheta ^2}\kappa \Theta _T^{\vartheta ,\lambda }(v)E_{T}^{\vartheta ,\lambda }(t,a)\). When \(2 \gamma _{1} \le \beta '\), we also have
Proof
In the whole proof, we just denote \(\Theta _T^{\vartheta ,\lambda }(v)\) and \(E_{T}^{\vartheta ,\lambda }(t,a)\) by \(\Theta \) and \(E(t,a)\). We start with the proof of the first inequality. The point is to apply the second inequality in Lemma 2 with \(y\) replaced by \(x - \sqrt{s} y\) and thus \(a\) replaced by \(a+\vert y \vert \). We get
where \(C=C(\alpha ,\beta )\). Noting that \(E(t+s,a+\vert y \vert ) \le \exp [- ( \lambda + \vartheta (a + \vert y \vert )) s + \vartheta (1+T) \vert y \vert ] E(t,a)\) and that \({\fancyscript{D}}(t+s,a+\vert y\vert ,\sqrt{s} y ) \le C(1+ \vert y \vert ^3) {\fancyscript{D}}(t+s,a+\vert y\vert ,\sqrt{s})\), we deduce that
where
We thus have to bound integrals of the form \(\rho ^{b'-\gamma _2}\int _0^{\tau }e^{-(\lambda +\vartheta \rho )s}s^{a'-\gamma _1-1}\mathrm{d}s\) with \(a'\ge \alpha /2\) (\(\ge \gamma _{2}\)), \(0<b'\le a'\) and \(\rho \ge 1\). Bounding \(s^{\gamma _2-\gamma _1}\) by \(\tau ^{\gamma _2-\gamma _1}\) and noticing that
we get the following upper bound for the integral (performing a change of variable to pass from the first to the second line and recalling that \(\gamma _{2} \le \beta /2\) to derive the last inequality):
Because of the term in \((T-t-s)\) in the definition of \({\fancyscript{D}}'\), we also have to control
In order to bound the integral in the second line, we use the inequality \(x^{a'}e^{-xs} \le (a')^{a'}e^{-a'}/s^{a'}\), which holds for \(s\in (0,1]\) and \(a',x\ge 0\). Using also the bounds \(\tau \le T-t\) and \(\lambda +\vartheta \rho \ge 1\) together with (24), we get (for a possibly new value of the constant \(C\)):
A careful inspection of (23) shows that we can apply (25) and (27) with \(a' \ge \alpha /2\) and \(b'-a' \le \chi - \alpha /2\) in order to bound (22) [in (23), \(a'\) is the part of the exponent of \(s\) other than \(-1\) and \(b'\) is the exponent of \(\rho \)]. We obtain
As \(a^{-\gamma _2}\le (1+|y|)^{\gamma _2}(a+|y|)^{-\gamma _2}\), we get the first bound of the lemma by integrating (28) against \(|\partial _x^kp_{1}(y)|\).
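The elementary bound \(x^{a'}e^{-xs} \le (a')^{a'}e^{-a'}/s^{a'}\) used above comes from maximizing the left-hand side over \(x\), the maximum being attained at \(x=a'/s\). A quick numerical sanity check (purely illustrative):

```python
import numpy as np

def lhs_max(a, s):
    # maximize x^a e^{-x s} over a fine grid of x >= 0
    xs = np.linspace(0.0, 60.0, 60001)
    return np.max(xs**a * np.exp(-xs * s))

def rhs(a, s):
    # claimed bound a^a e^{-a} / s^a, i.e. the value at the critical point x = a/s
    return a**a * np.exp(-a) / s**a

checks = [(a, s) for a in (0.5, 1.0, 2.5) for s in (0.1, 0.5, 1.0)]
ok = all(lhs_max(a, s) <= rhs(a, s) * (1 + 1e-12) for a, s in checks)
```

The bound is sharp: for \(a'=s=1\) the grid maximum sits at \(x=1\) with value \(e^{-1}\).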
We now turn to the proof of the second inequality in the statement. We make use of the first inequality in Lemma 2. Replacing \(v_{t+s}(z)\) by \(v_{t+s}(z) - v_{t+s}(x)\) in (22), we get the same inequality but with a simpler form of \({\fancyscript{D}}'(t,s,a+\vert y \vert )\), namely the first term in the right-hand side in (23) does not appear. This means that we can now apply (25) with \(a' \ge \alpha \wedge (\alpha /2+\beta ') \ge \beta '\) and \(b'-a' \le \chi - \alpha /2\). The value of \(a'\) being larger than \(\beta '\), this permits us to apply (24) with \(\gamma _{2}\) replaced by \(\beta '\). Then, we can replace \(\gamma _{1}\) and \(\gamma _{2}\) by \(2 \gamma _{1}\) and \(\beta '\) in (25) (with \(\gamma _{1} \le \beta '/2\)). With the prescribed values of \(a'\) and \(b'\), the resulting bound in (25) is \(C \tau ^{\beta ' - 2 \gamma _{1}} \lambda ^{(b' \vee \beta ') -a' }\). Following (28), we see that the contribution of (25) in the second inequality of the statement is \( \lambda ^{(\alpha -\beta )/8} \Psi \lambda ^{(\beta -\alpha )/2} \tau ^{\beta '-2 \gamma _{1}} a^{\beta '} \le \Psi \lambda ^{(\beta -\alpha )/8} \tau ^{\beta '-2 \gamma _{1}} a^{\beta '}\), which fits the first part of the inequality. To recover the second part of the inequality, we must discuss the contribution of (26). Going back to (23), we have to analyze [pay attention that, in comparison with (26), \(\gamma _{2}\) is set to \(0\)]:
the first inequality being valid for \(2\gamma _{1} \le \beta '\) only and the last inequality following from (24). Noting that \(\chi < \beta /2\), this gives the second part of the second inequality of the statement. \(\square \)
Here is now the key result to prove Theorem 5.
Theorem 12
Keep the notations and assumptions introduced at the beginning of Sect. 3.1. For \((v,\partial _{W} v) \in {\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\), define the function \({\mathcal {M}}(v,\partial _{W}v) : [0,T) \times {\mathbb {R}}\rightarrow {\mathbb {R}}\) together with its \(W\)-derivative by letting, for any \(t\in [0,T)\) and \(x\in {\mathbb {R}}\),
(With an abuse of notation, we will just write \(({\mathcal {M}}v)_{t}(x)\) for \([{\mathcal {M}}(v,\partial _{W} v)]_{t}(x)\).) Then \({\mathcal {M}}\) defines a bounded operator from \({\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\) into itself. Moreover, there exists a positive constant \(C=C(\alpha ,\beta ,\chi )\) such that for every \(v\in {\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\),
Proof
As in the proof of Lemma 11, we just denote \(\Theta _T^{\vartheta ,\lambda }(v)\) and \(E_{T}^{\vartheta ,\lambda }(t,a)\) by \(\Theta \) and \(E(t,a)\). By an obvious change of variable, we get for any \(a \ge 1\), \(x \in [-a,a]\) and \(t\in [0,T)\),
Then the first inequality of Lemma 11 with \(\gamma _1=\gamma _2=0\), \(\tau =T-t\) and \(k=2\) leads to
where \(C=C(\alpha ,\beta ,\chi )\).
We now study the time variations of \({\mathcal {M}}v\). For \(0 \le t\le s \le T\) and \(x\in {\mathbb {R}}\), we deduce from the identity \(\frac{1}{2}\partial _{x}^2p=\partial _tp\):
By the changes of variable \((r,u) \mapsto (s+r-u,s-u)\) and then \(y \mapsto x- \sqrt{r} y\), we get:
Applying Lemma 11 with \(\tau =T-t\), \(\gamma _1=\gamma _2=\beta /2\) and \(k=4\), we obtain
where \(C=C(\alpha ,\beta ,\chi )\). In order to handle \({\mathcal T}_{2}\), we can directly use Lemma 11 with \(\tau =s-t\), \(\gamma _1=0\), \(\gamma _{2}=\beta /2\) and \(k=2\). We then obtain the same bound as for \(\mathcal {T}_1\), so that
We now investigate the space variations. Fix \(-a \le x < x'\le a\). If \(|x'-x|^2\le T-t\), the space increment between \(x\) and \(x'\) reads:
with (using the fact that the mapping \({\mathbb {R}}\ni z \mapsto \partial ^2_{x} p_{s}(z)\) is centered)
By Lemma 11 with \(\tau =|x'-x|^2\), \(\gamma _1=0\), \(\gamma _2=\beta /2\) and \(k=2\), we get
The term \({\mathcal I}_{2}^{x,x'}\) can be bounded in the following way:
Using now Lemma 11 with \(\tau =T-t\), \(\gamma _1=\gamma _2=\beta /2\) and \(k=3\) we obtain:
We end up with the following bound for the space increment:
Recall that (36) holds true when \(\vert x'-x\vert ^2 \le T-t\). When \(|x'-x|^2> T-t\), the argument is obvious as the space increment is smaller than \({\mathcal I}_{1}^{x,x'}(x) + {\mathcal I}_1^{x,x}(x')\), so that (36) holds as well.
We study the remainder term in a similar way. Recalling the definition (9), we then make use of the definition of \(Z^T\), see (13):
where
with
and
We start with \({\fancyscript{R}}^{\prime }\). The strategy is similar to the one used to prove (36) except that we now apply the second inequality in Lemma 11 and not the first one. In order to handle \({\mathcal I}_{1}^{x,x',\prime }(\xi )\), with \(\xi =x\) or \(x'\), we apply the second inequality in Lemma 11 (with \(k=2\), \(\tau = \vert x-x' \vert ^2 \wedge (T-t)\) and \(\gamma _{1}=0\)) in the spirit of (34). Similarly, we can play the same game as in (35) to tackle \({\mathcal I}_{2}^{x,x',\prime }\), writing \(s^{-3/2} = s^{-1-\beta '} s^{-1/2+\beta '} \le \vert x' - x \vert ^{2\beta '-1} s^{-1-\beta '}\) and applying the second inequality in Lemma 11 (with \(k=3\), \(\tau = T-t\) and \(2 \gamma _{1} = \beta '\)). We get
It thus remains to discuss \({\mathcal J}_{1}^{x,x'}\) and \({\mathcal J}_{2}^{x,x'}\). We start with the following general bound that holds true for any \(\xi \in [x,x']\) and \(s \in [0,T-t]\). Since \(\beta \le 2\beta ' \le 2\beta \), we indeed have
so that (using the rate of growth of \( [\![v ]\!]_{\beta /2,\beta }^{[t,T) \times [-a,a]}\) in \(a\))
We now handle \({\mathcal J}_{1}^{x,x'}\). Following (34) (but noticing that the integrand is here constant in \(z\)), we deduce from (39) with \(s \le \vert x'-x \vert ^2\),
Note that there is no decay in \(\lambda \) because \(\vert v_{t+s}(x')-v_{t}(x) \vert \) is bounded by means of \(E(t,a)\) and not of \(E(t+s,a)\). Similarly, using (39) with \(\xi =u\) and \(\vert u-x\vert \le s^{1/2}\),
from which we deduce that
Together with (38), we get
Finally, as the \(W\)-derivative of \(({\mathcal {M}}v)_{t}\) is defined as \(\partial _{W} ({\mathcal {M}}v)_{t}=(0,v_{t})\), we have
From (31), (32), (36), (40) and (41), this completes the proof. \(\square \)
3.2 Proof of Theorem 5
First step. As in the previous subsection, we omit the superscript \(T\) in \(Z^T\), \(W^T\) and \({\fancyscript{W}}^T\). We also notice that Theorem 12 remains true when \(T \le T_{0}\), for some \(T_{0} \ge 1\), provided that the constant \(C\) in the statement is allowed to depend upon \(T_{0}\).
Now, for \(f\) and \(u^T\) as in (15), we let for \((t,x) \in [0,T) \times {\mathbb {R}}\):
By standard regularization properties of the heat kernel, \(\psi \) is \((\beta /2,\beta )\)-Hölder continuous on any \([0,T] \times [-a,a]\), \(a \ge 1\), the Hölder norm being less than \(C \exp (\vartheta a)\). Moreover,
For \(v \in {\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\), we then let
The point is to check that \(\widehat{{\mathcal {M}}}v\) can be lifted up into an element of \({\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\). By Theorem 12, the last part of the right-hand side is in \({\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\). Its derivative with respect to \(W\) is \(\partial _{W}[{\mathcal {M}}v]\), as defined in the statement of Theorem 12. By (43), for any \(t \in [0,T)\), \(\psi _{t}\) is \(2 \beta '\)-Hölder continuous (in \(x\)) and belongs to \({\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\) with a zero derivative with respect to \(W\). Moreover, from (43), \(\widehat{{\mathcal {M}}}v \in {\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\), with \([\partial _{W} (\widehat{{\mathcal {M}}}v)]_{t}(x) = [\partial _{W} ({\mathcal {M}}v)]_{t}(x)=(0,v_{t}(x))\) for \(t \in [0,T)\).
Second step. We construct a solution to (14) by a contraction argument when \(T \le 1\) (the same argument applies when \(T \ge 1\)). We choose \(\lambda \) large enough such that \(C \kappa \exp (C T \vartheta ^2) \lambda ^{-\epsilon } \le 1/4\) (with the same \(C\) as in Theorem 12) and we note that \(({\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W),\Theta _T^{\vartheta ,\lambda })\) is a Banach space. Since \(\widehat{{\mathcal {M}}}u-\widehat{{\mathcal {M}}}v={\mathcal {M}}(u-v)\) for any \(u,v\in {\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\) (the equality holding true for the lifted versions), we deduce from Theorem 12 and Picard's fixed point theorem that \(\widehat{{\mathcal {M}}}\) admits a unique fixed point \(\bar{v}\) in \({\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\). Letting
with \(\phi \) as in (42), we obtain \(\partial _{x} \bar{u} = \bar{v}\) so that \(\bar{u}\) is a mild solution, as defined in (14). It must be unique as the \(x\)-derivative of any other mild solution (when lifted up) is a fixed point of \(\widehat{{\mathcal {M}}}\). Differentiation under the integral symbol in (45) and in the mild formulation (14) can be justified by Lebesgue’s Theorem, using bounds in the spirit of Lemma 11.
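The contraction step can be illustrated on a finite-dimensional toy model (purely for illustration; the random matrix \(K\) below is a stand-in of our own choosing, not the actual map \({\mathcal {M}}\)): any affine map \(v \mapsto \psi + K v\) with operator norm \(\Vert K \Vert \le 1/4\) has a unique fixed point, and Picard iterates converge geometrically:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
K = rng.standard_normal((n, n))
K *= 0.25 / np.linalg.norm(K, 2)      # enforce spectral norm <= 1/4, as in the contraction estimate
psi = rng.standard_normal(n)

v = np.zeros(n)
gaps = []
for _ in range(30):
    v_next = psi + K @ v              # Picard iterate of the affine map
    gaps.append(np.linalg.norm(v_next - v))
    v = v_next

exact = np.linalg.solve(np.eye(n) - K, psi)   # the unique fixed point
```

The successive gaps contract at least by the factor \(1/4\) per step, which is the rate secured above by the choice of \(\lambda \).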
Third step. We finally prove (16) and (17). We first estimate \(\bar{v}\). With our choice of \(\lambda \) and by Theorem 12, we have \(\Theta ^{\vartheta ,\lambda }_{T}(\bar{v}) \le \Theta ^{\vartheta ,\lambda }_{T}(\widehat{{\mathcal {M}}}0) + (3/4) \Theta ^{\vartheta ,\lambda }_{T}(\bar{v})\), where \(0\) is the null function, so that
As \(\widehat{{\mathcal {M}}}0 = \psi \in {\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\), the right-hand side is bounded by some \(C\) (which would depend on \(T_{0}\) if we only assumed \(T \le T_{0}\) for some \(T_{0} \ge 1\)). Since \(\partial _{x} \bar{u} = \bar{v}\), this gives the exponential bound for \(\bar{v}\) and for the \((\beta /2,\beta )\)-Hölder constant of \(\bar{v}\) in time and space.
In order to get the same estimate for \(\bar{u}\), we go back to (45). The function \(\phi \) can be estimated by standard properties of the heat kernel: it is at most of exponential growth and it is locally \((1+\beta )/2\)-Hölder continuous in time, the Hölder constant growing at most exponentially fast in the space variable. The second term can be handled by repeating the analysis of \({\mathcal {M}}v\) in the proof of Theorem 12: Following (31) and (32), it is at most of exponential growth and it is locally \((1+\beta )/2\)-Hölder continuous in time, the Hölder constant growing at most exponentially fast in the space variable (in comparison with (32), the additional \(1/2\) comes from the fact that there is one derivative fewer in the heat kernel).
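Among the standard regularization properties of the heat kernel used above is the bound \(\Vert \partial _{x} p_{t} * f \Vert _{\infty } \le \sqrt{2/(\pi t)}\, \Vert f \Vert _{\infty }\), which follows from \(\int _{{\mathbb {R}}} \vert \partial _{x} p_{t}(z) \vert \mathrm{d}z = \sqrt{2/(\pi t)}\). A purely illustrative numerical check, with a bounded toy datum \(f\) of our own choosing:

```python
import numpy as np

def f(y):
    # bounded toy datum with |f| <= 1
    return np.sign(np.sin(3 * y))

def dconv(t, x, L=12.0, n=24001):
    # rectangle-rule approximation of (d/dx p_t * f)(x) for the heat kernel p_t
    ys = np.linspace(-L, L, n)
    h = ys[1] - ys[0]
    z = x - ys
    dp = -z / t * np.exp(-z**2 / (2 * t)) / np.sqrt(2 * np.pi * t)
    return np.sum(dp * f(ys)) * h

t = 0.1
vals = np.array([dconv(t, x) for x in np.linspace(-2.0, 2.0, 81)])
bound = np.sqrt(2 / (np.pi * t))   # L^1 norm of the kernel derivative at time t
```

The blow-up of the constant as \(t \downarrow 0\), of order \(t^{-1/2}\), is exactly the integrable-in-time singularity that the estimates of Lemma 11 keep under control.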
3.3 Proof of Proposition 6
As above, we omit the superscript \(T\) in \(Z^{n,T}\), \(W^{n,T}\) and \({\fancyscript{W}}^{n,T}\). Stability of solutions under mollification of the input follows from a classical compactness argument. Given a sequence \((W^n,{\fancyscript{W}}^n)_{n \ge 1}\) as in the statement, we can solve (14) for any \(n \ge 1\): The solution is denoted by \(u^n\) and its gradient by \(v^n :=\partial _{x} u^n\). By (2) in Proposition 6 and by (46), it is readily checked that
where \([\partial _{W^n}( v^n)]_{t}=(0,v^n_{t})\). As a consequence, the sequence \((v^n)_{n \ge 1}\) is uniformly continuous on compact subsets of \([0,T] \times {\mathbb {R}}\). In the same way, the sequence \((u^n)_{n \ge 1}\) is also uniformly continuous on compact subsets. Moreover, \(u^n\) and \(v^n\) are at most of exponential growth (in \(x\)), uniformly in \(n \ge 1\). By the Arzelà–Ascoli theorem, we can extract subsequences (still indexed by \(n\)) that converge uniformly on compact subsets of \([0,T] \times {\mathbb {R}}\). Limits of \((u^n)_{n \ge 1}\) and \((v^n)_{n \ge 1}\) are respectively denoted by \(\hat{u}\) and \(\hat{v}\). In order to complete the proof, we must prove that \((\hat{u},\hat{v})\) is a mild solution of (14).
Writing (9) for each of the \((v^n)_{n \ge 1}\), exploiting (47) to control the remainders \(({\fancyscript{R}}^{v^n_{t}})_{n \ge 1}\) uniformly in \(n \ge 1\) and then letting \(n\) tend to \(\infty \), we deduce that the pair \((\hat{v},(0,\hat{v}))\) belongs to \({\mathcal B}^{\beta ,\vartheta }([0,T),{\mathbb {R}})\), the remainder at any time \(t \in [0,T)\) being denoted by \(\hat{{\fancyscript{R}}}^{t}\). By (9), \(\lim _{n}\Vert \hat{{\fancyscript{R}}}^{t}-{\fancyscript{R}}^{v^n_{t}} \Vert _{\infty }^{[-a,a]}=0\) for any \(a \ge 1\). By (47), the convergence holds as well in \(2\beta ''\)-Hölder norm, for any \(\beta '' \in (1/3,\beta ')\), that is \(\lim _{n} \Vert \hat{{\fancyscript{R}}}^{t} - {\fancyscript{R}}^{v^n_{t}} \Vert _{2\beta ''}^{[-a,a]}=0\).
Replacing \(\beta '\) by \(\beta ''\) in (7), this suffices to pass to the limit in the rough integrals appearing in the mild formulation (14) of the PDE satisfied by each of the \((v^n)_{n \ge 1}\)’s. To pass to the limit in the whole formulation, we can invoke Lebesgue’s Theorem, using bounds in the spirit of Lemma 11. Thus the pair \((\hat{v},(0,\hat{v}))\) satisfies \(\hat{v} = \widehat{{\mathcal {M}}}\hat{v}\) in \({\mathcal B}^{\beta ,\vartheta }([0,T) \times {\mathbb {R}},W)\), which is enough to conclude by uniqueness of the solution.
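For illustration, the ‘physical’ mollification underlying Proposition 6 (smoothing \(Y\) in \(x\)) can be mimicked numerically: convolving a rough toy function with a Gaussian bump of width \(\varepsilon \) produces smooth approximations that converge uniformly on compacts as \(\varepsilon \downarrow 0\). The truncated lacunary series below is our own choice, standing in for an \(\alpha \)-Hölder field \(Y_{t}(\cdot )\):

```python
import numpy as np

def Y(x):
    # truncated Weierstrass-type series: a rough but continuous toy function
    return sum(2.0 ** (-0.5 * k) * np.cos(2.0 ** k * x) for k in range(12))

def mollify(f, xs, eps, m=401):
    # Gaussian mollification of width eps, via a normalized discrete kernel
    ys = np.linspace(-5 * eps, 5 * eps, m)
    w = np.exp(-ys**2 / (2 * eps**2))
    w /= w.sum()
    return np.array([np.dot(w, f(x - ys)) for x in xs])

xs = np.linspace(-1.0, 1.0, 200)
errs = [np.max(np.abs(mollify(Y, xs, eps) - Y(xs))) for eps in (0.05, 0.005, 0.0005)]
```

Each mollified function is smooth with bounded derivatives (as required of the \(Y^n\) in the proof of Proposition 10), while the uniform error on the compact set shrinks with \(\varepsilon \).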
4 Stochastic calculus for the solution
In Theorem 8, we proved existence and uniqueness of a solution to the martingale problem associated with (1), but we said nothing about the dynamics of the solution. In this section, we answer this question and give a meaning to the formulation (4).
4.1 Recovering the Brownian part
Equation (4) suggests that the dynamics of the solution to (1) indeed involves some Brownian part. The point we discuss here is thus twofold: (i) We recover in a quite canonical way the Brownian part in the dynamics of the solution; (ii) we discuss the structure of the remainder.
Theorem 13
Under the assumption of Theorem 8, for any given initial condition \(x_{0}\), we can find a probability measure (still denoted by \({\mathbb {P}}\)) on the enlarged canonical space \({\mathcal {C}}([0,T_{0}],{\mathbb {R}}^2)\) (endowed with the canonical filtration \(({\mathcal F}_{t})_{0 \le t \le T_{0}}\)) such that, under \({\mathbb {P}}\), the canonical process, denoted by \((X_{t},B_{t})_{0 \le t \le T_{0}}\), satisfies the following:
-
(i)
The law of \((X_{t})_{0 \le t \le T_{0}}\) under \({\mathbb {P}}\) is a solution to the martingale problem with \(x_{0}\) as initial condition at time \(0\) and the law of \((B_{t})_{0 \le t \le T_{0}}\) under \({\mathbb {P}}\) is a Brownian motion.
-
(ii)
For any \(q \ge 1\) and any \(\beta <\alpha \), there is a constant \(C = C(\alpha ,\beta ,\chi ,\kappa _{\alpha ,\chi }(W,{\fancyscript{W}}),q,T_{0})\) such that, for any \(0 \le t \le t +h \le T_{0}\),
$$\begin{aligned} {\mathbb {E}} [ \vert X_{t+h} - X_{t} - (B_{t+h} - B_{t}) \vert ^q ]^{\frac{1}{q}} \le C h^{(1+\beta )/2}. \end{aligned}$$(48) -
(iii)
For any \(0 \le t \le t+ h \le T_{0}\),
$$\begin{aligned} {\mathbb {E}} [ X_{t+h} - X_{t} \vert {\mathcal F}_{t}] = {\mathfrak b}(t,X_{t},h) : = u_{t}^{t+h}(X_{t}) - X_{t}, \end{aligned}$$(49)where the mapping \(u^{t+h} : [0,t+h] \times {\mathbb {R}}\ni (s,x) \mapsto u^{t+h}(s,x)\) is the mild solution of \({\mathcal P}(Y,0,t+h)\) with \(u_{t+h}^{t+h}(x)=x\) as terminal condition.
Proof
The point is to come back to the proof of the solvability of the martingale problem in Sect. 2.7. With the same notation, we get for free the tightness of the family \((X^n_{t},B_{t})_{0 \le t \le T_{0}}\), which suffices to extract a converging subsequence. The (weak) limit is the pair \((X_{t},B_{t})_{0 \le t \le T_{0}}\) in (i). (Note that we do not claim that the ‘\(B\)’ at the limit is the same as the ‘\(B\)’ in the regularized problems; for convenience, we use the same letter.) We then repeat the proof of (20), which writes:
Repeating the analysis of the third step in Sect. 2.7, we know that the third term in the right-hand side satisfies the bound (48). The point is thus to prove that the second term also satisfies this bound. Recalling that \(u^n_{t+h}(x)=x\), we notice that \(\partial _{x} u_{s}^n(X^n_{s})-1=\partial _{x} u_{s}^n(X^n_{s}) - \partial _{x} u_{t+h}^n (X^n_{s})\). The bound then follows from the fact that \(\partial _{x} u^n\) is locally \(\beta /2\)-Hölder continuous in time, the Hölder constant being at most of exponential growth, as ensured by Theorem 5. Letting \(n\) tend to \(\infty \), this completes the proof of (ii).
The last assertion (iii) is easily checked with \(X\) replaced by \(X^n\) and \(u^{t+h}\) replaced by \(u^n\) (and, of course, with \({\mathcal F}_{t}\) replaced by the \(\sigma \)-field generated by \((X^n_{s},B_{s})_{0 \le s \le t}\)). It is quite standard to pass to the limit in \(n\). \(\square \)
4.2 Expansion of the drift
The next proposition gives a more explicit insight into the shape of the function \({\mathfrak b}\) in (49).
Proposition 14
Given \(T_{0}>0\), there exist a constant \(C\) and an exponent \(\varepsilon >0\) such that
\(O(\cdot )\) standing for the Landau notation (the underlying constant in the Landau notation being uniform in \(0 \le t \le t+h \le T_{0}\)).
Remark 15
The first term in the definition of \(b(t,x,h)\) reads as a mollification (in \(x\)) of the gradient (in \(x\)) of \((Y_{s}(x))_{t \le s \le t+h, x \in {\mathbb {R}}}\) by means of the transition density of \((B_{t})_{t \ge 0}\) (which is the martingale process driving \(X\)). It is (locally in \(x\)) of order \(h^{1/2+\alpha /2}\). The second term reads as a correction in the mollification of \((Y_{s}(x))_{t \le s \le t+h,x \in {\mathbb {R}}}\). It keeps track of the rough path structure of \((Y_{s}(x))_{t \le s \le t+h, x \in {\mathbb {R}}}\). The proof below shows that it is of order \(h^{1/2+\alpha }\), thus proving that it can be ‘hidden’ in the remainder \(O(h^{1+\epsilon })\) when \(\alpha > 1/2\). This requirement \(\alpha >1/2\) fits the standard threshold in rough paths theory above which Young’s theory applies.
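The order \(h^{1/2+\alpha /2}\) quoted above can be recovered from standard Gaussian scaling. As a sketch (ours, not a quotation of the proof), write \(p_{s}\) for the heat kernel and bound, locally in \(x\) and with constants depending on the local Hölder norm of \(Y\):

```latex
\Bigl\vert \int_{\mathbb{R}} \partial_x p_{s}(x-y)\,\bigl(Y_t(y)-Y_t(x)\bigr)\,\mathrm{d}y \Bigr\vert
\le C \int_{\mathbb{R}} \frac{\vert x-y\vert}{s}\, p_{s}(x-y)\,\vert x-y\vert^{\alpha}\,\mathrm{d}y
\le C'\, s^{(\alpha-1)/2},
```

so that integrating in time over an interval of length \(h\) yields \(\int_{t}^{t+h} O\bigl((s-t)^{(\alpha -1)/2}\bigr)\,\mathrm{d}s = O(h^{(1+\alpha )/2}) = O(h^{1/2+\alpha /2})\), matching the announced order.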
Proof
From (14), we know that \(u_{t}^{t+h}(x)\) expands as
where \(\bar{v}_{s}^{t+h}(y) = \partial _{x} u_{s}^{t+h}(y)\). Here, the function \(\phi \) in (14) is equal to \(\phi _{t}(x)=x\) for any \(t \in [0,t+h]\) and \(x \in {\mathbb {R}}\), and thus \(\partial _{x} \phi \equiv 1\). By Theorem 12, \(\bar{v}^{t+h} \in {\mathcal B}^{\beta ,\vartheta }([0,t+h) \times {\mathbb {R}},W^{t+h})\) and solves the equation \(\bar{v} = 1+ {\mathcal {M}}\bar{v}\). In particular, \(\partial _{Y} \bar{v}_{t}(x) = 0\) and \(\partial _{Z^{t+h}} \bar{v}_{t}(x) = \bar{v}_{t}(x)\). Therefore, we can write
which we can plug into the expression for \(u_{t}^{t+h}(x)\) by means of Theorem 1:
where \({\fancyscript{U}}_{s}^{t+h}(x,y)\) is a remainder term that derives from the approximation of the rough integral of \(\bar{v}_{s}^{t+h}\) with respect to \(Y_{s}\). By Theorem 1, there exist a constant \(C\) and an exponent \(\varepsilon >0\) such that
Above, the exponential factor permits us to handle the polynomial growth of \({\varvec{W}}^{t+h}\), with \(W^{t+h}=(Y,Z^{t+h})\), and the exponential growth of \(\bar{v}^{t+h}\) (see the definition of \(\Theta _{T}^{\vartheta ,\lambda }(v)\) in the statement of Theorem 12), the exponent in the exponential factor being arbitrarily chosen as \(1\) (which leaves ‘some space’ to handle additional polynomial growth and which is possible since the terminal condition \(u_{t+h}^{t+h}\) is of polynomial growth).
We now investigate the second term in the right hand side of (50). We recall that, by assumption, there exists a constant \(C\), independent of \(h\), such that
We also recall from Theorem 5 that \(\bar{v}\) is \((\alpha - \epsilon )/2\)-Hölder continuous in time, locally in space (the rate of growth of the Hölder constant being at most exponential and Theorem 12 allowing us to choose \(1\) as exponent in the exponential), so that \(\vert \bar{v}^{t+h}_{s}(y) - 1 \vert \le C h^{(\alpha -\epsilon )/2} \exp (\vert y \vert )\), for \(s\in [t,t+h]\) and for a possibly new value of the constant \(C\). Therefore,
the last term being less than
the last inequality holding true since \(\alpha \) is strictly larger than \(1/3\) and \(\epsilon \) can be chosen arbitrarily small. Therefore, from (50), (51) and (52), we deduce that
Using (52) once more and following the proof of (53), we also have
It then remains to look at the first term in the right-hand side of (50). The point is to expand \(v_{t}^{t+h}(x)\) on the same model as \(u_{t}^{t+h}(x)\) right above. Basically, the same expansion holds but, because of the derivative in the definition of \(v_{t}^{t+h}(x) = \partial _{x} u_{t}^{t+h}(x)\), we lose \(1/2\) in the power of \(h\) in the Landau notation. Therefore, for \(t \le s \le t+h\), the above expansion turns into
Using once again the fact that \(v^{t+h}\) is \((\alpha -\epsilon )/2\)-Hölder continuous in time (locally in space, the Hölder constant being at most of exponential growth), we obtain
The last term can be bounded by \(O(\exp (2\vert x \vert ) h^{\alpha - \epsilon /2})\). Now, by (54),
It thus remains to bound
By (13), it is plain to see that \(Z_{s}^{t+h}(x) = O( \exp (2 \vert x \vert ) h^{\alpha /2})\). Then, the above term must be at most of order \(O(\exp (2 \vert x \vert ) h^{1/2+\alpha })\), from which the proof of the proposition is easily completed.
In order to complete the proof of Remark 15, it remains to show the announced bound for
We already have a bound when \(Z_{s}^{t+h}(z)\) is replaced by \(Z_{s}^{t+h}(x)\). By (52), we also have a bound when \(Z_{r}^{t+h}(z)\) is replaced by \(Z_{r}^{t+h}(z) - Z_{r}^{t+h}(x)\). \(\square \)
4.3 Purpose
The goal is now to prove that Theorem 13 and Proposition 14 are sufficient to define a differential calculus for which the infinitesimal variation \(dX_{t}\) reads
or, in a macroscopic way, \(X_{t} = X_{0} + B_{t} + \int _{0}^t b(s,X_{s},\mathrm{d}s)\), which gives a sense to (1). In that framework, Proposition 14 and Remark 15 give some insight into the shape of the drift.
As explained below, we are able to define a stochastic calculus in such a way that the process \((\int _{0}^t b(s,X_{s},\mathrm{d}s))_{0 \le t \le T}\) has a Hölder continuous version, with \((1+\alpha )/2-\epsilon \) as Hölder exponent, for \(\epsilon >0\) as small as desired, thus making \((X_{t})_{0 \le t \le T}\) a Dirichlet process. More generally, we manage to give a sense to the integrals \(\int _{0}^T \psi _{t} \mathrm{d}X_{t}\) and \(\int _{0}^T \psi _{t} b(t,X_{t},\mathrm{d}t)\) for a large class of integrands \((\psi _{t})_{0 \le t \le T}\), thus making meaningful the identity
The above integrals will be constructed with respect to processes \((\psi _{t})_{0 \le t \le T}\) that are progressively-measurable and \((1-\alpha )/2+\epsilon \) Hölder continuous in \(L^p\) for some \(p>2\) and some \(\epsilon >0\). The construction of the integral consists of a mixture of Young’s and Itô’s integrals. Precisely, the progressive-measurability of \((\psi _{t})_{0 \le t \le T}\) permits us to ‘get rid of’ the martingale increments in \(X\) that are different from the Brownian ones and thus to focus on the function \(b\) only in order to define the non-Brownian part of the dynamics. Then, the Hölder property of \((\psi _{t})_{0 \le t \le T}\) permits us to integrate with respect to \((b(t,X_{t},\mathrm{d}t))_{0 \le t \le T}\) in a Young sense. For that reason, the resulting integral is called a stochastic Young integral. It is worth mentioning that it permits us to consider within the same framework integrals defined with respect to the martingale part of \(X\) and integrals defined with respect to the zero quadratic variation part of \(X\). Following the terminology used in [6], in which the authors address a related problem (see Remark 18 below for a precise comparison), the Young integral with respect to \((b(t,X_{t},\mathrm{d}t))_{0 \le t \le T}\) may be called ‘nonlinear’.
The construction we provide below is given in a larger set-up. In the whole section, we thus use the following notation: \((\Omega ,({\mathcal F}_{t})_{t \ge 0},{\mathbb {P}})\) denotes a filtered probability space satisfying the usual conditions; moreover, for any \(0 \le s \le t\), \({\mathcal S}(s,t)\) denotes the set \(\{(s',t') : s' \in [0,s],\, t' \in [0,t],\, s' \le t'\}\). The application to (48) is discussed in Sect. 4.6.
4.4 \(L^p\) construction of the integral
4.4.1 Materials
We are given a real \(T>0\) and a continuous progressively-measurable process \((A(s,t))_{0 \le s \le t \le T}\) in the sense that, for any \(0 \le s \le t\), the mapping \(\Omega \times {\mathcal S}(s,t) \ni (\omega ,s',t') \mapsto A(s',t')\) is measurable for the product \(\sigma \)-field \({\mathcal F}_{t} \otimes {\mathcal B}({\mathcal S}(s,t))\) and the mapping \({\mathcal S}(T,T) \ni (s,t) \mapsto A(s,t)\) is continuous. We assume that there exist a constant \(\Gamma \ge 0\), three exponents \(\varepsilon _{0} \in (0,1/2]\), \(\varepsilon _{1},\varepsilon _{1}' >0\) and a real \(q \ge 1\) such that, for any \(0 \le t \le t+h \le t+h' \le T\),
In the framework of (56), we have in mind to choose \(A(t,t+h)=X_{t+h}-X_{t}\) or \(A(t,t+h) = B_{t+h} - B_{t}\), in which cases \(A\) has an additive structure and \(\varepsilon _{1}\) and \(\varepsilon _1'\) can be chosen as large as desired, or \(A(t,t+h)=b(t,X_{t},h)\), in which case \(A\) is not additive. The precise application to (56) is detailed in Sect. 4.6. Generally speaking, we call \(A(t,t+h)\) a pseudo-increment. Considering pseudo-increments instead of increments (which, by contrast, enjoy an additive property) allows more flexibility and permits us, as just said, to give a precise meaning to \(b(t,X_{t},\mathrm{d}t)\) in (56). The strategy is then to split \(A(t,t+h)\) into two pieces:
\(M(t,t+h)\) being seen as a sort of martingale increment and \(R(t,t+h)\) as a sort of drift.
We are also given a continuous progressively-measurable process \((\psi _{t})_{0 \le t \le T}\) and we assume that, for an exponent \(\varepsilon _{2} < \varepsilon _{0}\) and for any \(0 \le t \le t+h \le T\),
for some \(q' \ge 1\). We then let \(p=qq'/(q+q')\) so that \(1/p=1/q+1/q'\).
4.4.2 Objective
The aim of the subsection is to define the stochastic integral \(\int _{0}^T \psi _{t} A(t,t+\mathrm{d}t)\) as an \(L^p(\Omega ,{\mathbb {P}})\) version of the Young integral. In comparison with the standard version of the Young integral, the \(L^p(\Omega ,{\mathbb {P}})\) construction will benefit from the martingale structure of the pseudo-increments \((M(t,t+h))_{0 \le t \le t+h \le T}\), the integral being defined as the \(L^p(\Omega ,{\mathbb {P}})\) limit of Riemann sums as the step size of the underlying subdivision tends to \(0\). Given a subdivision \(\Delta = \{0=t_{0} < t_{1} < \dots < t_{N}=T\}\), we thus define the \(\Delta \)-Riemann sum
We emphasize that this definition is exactly the same as the one used to define Itô’s integral: on the step \([t_{i},t_{i+1}]\), the process \(\psi \) is approximated by the value at the initial point \(t_{i}\). For that reason, we will say that the Riemann sum is adapted. In that framework, we claim:
Theorem 16
There exists a constant \(C=C(q,q',\Gamma ,\varepsilon _{0},\varepsilon _{1},\varepsilon _{2})\), such that, given two subdivisions \(\Delta \subset \Delta '\), with \(\pi (\Delta ) \le 1\),
where \(\pi (\Delta )\) denotes the step size of the subdivision \(\Delta \), that is \(\pi (\Delta ) := \max _{1 \le i \le N} [ t_{i} - t_{i-1}]\), and with \(\eta := \min (\varepsilon _{0} - \varepsilon _{2}, \varepsilon _{1},\varepsilon _{1}'/2)\).
For general partitions \(\Delta \) and \(\Delta '\) (without any inclusion requirement), Theorem 16 applies to the pairs \((\Delta ,\Delta \cup \Delta ')\) and \((\Delta ',\Delta \cup \Delta ')\), so that (61) holds in that case as well provided \(\pi (\Delta )\) in the right-hand side is replaced by \(\max (\pi (\Delta ),\pi (\Delta '))\). We deduce that \(S(\Delta )\) has a limit in \(L^p(\Omega ,{\mathbb {P}})\) as \(\pi (\Delta )\) tends to \(0\). We call it the stochastic Young integral of \(\psi \) with respect to the pseudo-increments of \(A\).
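As an elementary illustration of the construction (a sketch; the function name is ours and the deterministic, additive choice \(A(s,t)=F(t)-F(s)\) is purely illustrative), the adapted Riemann sums converge as the step size shrinks:

```python
def riemann_sum(psi, A, grid):
    """Adapted Delta-Riemann sum: sum_i psi(t_i) * A(t_i, t_{i+1}), with the
    integrand frozen at the left endpoint of each step, as in Ito's scheme."""
    return sum(psi(a) * A(a, b) for a, b in zip(grid, grid[1:]))

# Additive pseudo-increments A(s, t) = F(t) - F(s) with F(t) = t**2; the
# adapted sums of psi(t) = t then approximate the Stieltjes integral
# int_0^1 t d(t^2) = int_0^1 2 t^2 dt = 2/3 as pi(Delta) tends to 0.
N = 4000
grid = [k / N for k in range(N + 1)]
S = riemann_sum(lambda t: t, lambda s, t: t ** 2 - s ** 2, grid)
```

In the stochastic setting of Theorem 16, the same sums are taken with \(A(t,t+h)\) a genuine pseudo-increment, and the limit holds in \(L^p(\Omega ,{\mathbb {P}})\) rather than pathwise.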
4.4.3 Proof of Theorem 16: first step
First, we consider the case where the two subdivisions \(\Delta \) and \(\Delta '\), \(\Delta \) being included in \(\Delta '\), are not so different from each other. Precisely, given \(\Delta = \{0=t_{0} < t_{1} < \dots < t_{N} = T\}\) and \(\Delta ' = \Delta \cup \{t_{1}' <\dots < t_{L}'\}\) (\(L \ge 1\)), the \((t_{i})_{1 \le i \le N}\)’s and the \((t_{j}')_{1 \le j \le L}\)’s being pairwise distinct, we assume that, between two consecutive points in \(\Delta \), there is at most one point in \(\Delta '\). For any \(j \in \{1,\dots ,L\}\), we then denote by \(s_{j}^-\) and \(s_{j}^+\) the largest and smallest points in \(\Delta \) such that \(s_{j}^- < t_{j}' < s_{j}^+\). We have \(t_{j}' < s_{j}^+ \le s_{j+1}^- < t_{j+1}'\) for \(1 \le j \le L-1\). We then claim:
Lemma 17
Under the above assumption, the estimate (61) holds with \(\pi (\Delta )\) replaced by \(\rho (\Delta '{\setminus } \Delta )\), where \(\rho (\Delta '{\setminus }\Delta ) := \sup _{1 \le j \le L} [ s_{j}^+ - s_{j}^- ]\).
Proof of Lemma 17
(i) As a first step, we compute the difference \(S(\Delta ') - S(\Delta )\). We write
with \(\Delta ^j = \Delta \cup \{t_{1}',\dots ,t_{j}'\}\), for \(1 \le j \le L\), and \(\Delta ^0 = \Delta \). Then,
Therefore,
(ii) We first investigate \(\delta _{1} S(\Delta ,\Delta ',M)\). The process \(( \sum _{j=1}^{\ell } ( \psi _{t_{j}'} - \psi _{s_{j}^-}) M(t_{j}',s_{j}^+) )_{0 \le \ell \le L}\) is a discrete stochastic integral and thus a martingale with respect to the filtration \(({\mathcal F}_{s_{\ell }^+})_{0 \le \ell \le L}\), with the convention that \(s_{0}^-=s_{0}^+ = 0\). The sum of the squares of the increments is given by \(\sum _{j=1}^{L} ( \psi _{t_{j}'} - \psi _{s_{j}^-} )^2 ( M(t_{j}',s_{j}^+) )^2\). By the second line in (57) and by (59), we observe from Minkowski’s inequality first and then from Hölder’s inequality (recalling \(1/p=1/q+1/q'\)) that there exists a constant \(C\) such that
with \(\eta _{1} := 1- 2 \varepsilon _{2} \ge 2 (\varepsilon _{0}-\varepsilon _{2})\), where we have used \(s_{j}^- < t_{j}' < s_{j}^+\). By the discrete Burkholder–Davis–Gundy inequality, we deduce that \({\mathbb {E}}[ \vert \delta _{1} S(\Delta ,\Delta ',M) \vert ^p ]^{1/p} \le C T^{1/2} ( \rho (\Delta '{\setminus }\Delta ) )^{ \eta _{1}/2}.\)
(iii) We now turn to \(\delta _{1} S(\Delta ,\Delta ',R)\). In the same way, by the first line in (57) and by (59),
with \(\eta _{2} := \varepsilon _{0} - \varepsilon _{2}\). Therefore, \({\mathbb {E}}[ \vert \delta _{1} S(\Delta ,\Delta ',R) \vert ^p]^{1/p} \le C T \bigl ( \rho (\Delta '{\setminus }\Delta ) \bigr )^{ \eta _{2}}\).
(iv) We finally investigate \(\delta _{2} S(\Delta ,\Delta ')\). We split it into two pieces:
with
By the third line in (57) and by (59), we have, with \(\eta _{3} := \varepsilon _{1}\), \({\mathbb {E}}[ \vert \delta _{2} S(\Delta ,\Delta ',R') \vert ^p ]^{1/p}\! \le \, C T ( \rho (\Delta ' {\setminus } \Delta ))^{\eta _{3}}\).
We finally tackle \(\delta _{2} S(\Delta ,\Delta ',M')\). We notice that it generates a discrete time martingale with respect to the filtration \(({\mathcal F}_{s_{\ell }^+})_{0 \le \ell \le L}\). As in the second step, we compute the \(L^{p/2}(\Omega ,{\mathbb {P}})\) norm of the sum of the squares of the increments. By the last line in (57), it is given by
with \(\eta _{4} := \varepsilon _{1}'\). By the discrete Burkholder–Davis–Gundy inequality, \({\mathbb {E}}[ \vert \delta _{2} S(\Delta ,\Delta ',M') \vert ^p]^{1/p} \le C T^{1/2} ( \rho (\Delta ' {\setminus }\Delta ))^{ \eta _{4}/2}\). Putting (i), (ii), (iii) and (iv) together, this completes the proof. \(\square \)
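The mechanism used in steps (ii) and (iv), namely that a discrete stochastic integral is a martingale whose second moment is the sum of the second moments of its increments, can be checked numerically. The following Monte Carlo sketch (an illustration, not part of the proof) uses Brownian pseudo-increments \(M(t_{j},t_{j}+h)=B_{t_{j}+h}-B_{t_{j}}\) and the adapted integrand \(\psi _{t}=\cos (B_{t})\):

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, L, h = 200_000, 8, 0.1

# Brownian increments over L steps of size h, for n_paths independent paths.
dB = rng.normal(0.0, np.sqrt(h), size=(n_paths, L))
B_left = np.cumsum(dB, axis=1) - dB   # B at the left endpoint of each step
psi = np.cos(B_left)                  # adapted: a function of the past only

S = (psi * dB).sum(axis=1)            # discrete stochastic integral
mean = S.mean()                       # martingale: centered
second = (S ** 2).mean()              # orthogonal increments: no cross terms
isometry = h * (psi ** 2).mean(axis=0).sum()
```

The approximate identity \({\mathbb E}[S^2]=\sum _{j} {\mathbb E}[\psi _{t_{j}}^2]\,h\) holds because \(\psi _{t_{j}}\) is independent of the increment over \([t_{j},t_{j+1}]\); the discrete Burkholder–Davis–Gundy inequality extends this second-moment control to \(L^p\) norms.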
4.4.4 Proof of Theorem 16: second step
We now consider the general case when \(\Delta \subset \Delta '\) (\(\Delta ' \not = \Delta \)) without any further assumption on the difference \(\Delta '{\setminus }\Delta \).
As above, we denote the points in \(\Delta \) by \(t_{1},\dots ,t_{N}\). The points in the difference \(\Delta '{\setminus }\Delta \) are denoted in the following way. For \(i=1,\dots ,N\), we denote by \(t_{1,i}',\dots ,t_{L_{i},i}'\) the points in the intersection \((\Delta '{\setminus } \Delta ) \cap (t_{i-1},t_{i})\), where \(L_{i}\) denotes the number of points in \((\Delta '{\setminus }\Delta ) \cap (t_{i-1},t_{i})\). Each \(L_{i}\) may be written as \(L_{i} = 2 \ell _{i} + \varepsilon _{i}\) where \(\ell _{i} \in {\mathbb N}\) and \(\varepsilon _{i} \in \{0,1\}\). We then define \(\Delta _{1}'\) as the subdivision made of the points that are in \(\Delta \) together with the points
This says that, to construct \(\Delta _{1}'\), we delete, for any \(i=1,\dots , N\), the point \(t_{1,i}'\) if \(L_{i}=1\) and the points that are in \((\Delta '{\setminus } \Delta ) \cap (t_{i-1},t_{i})\) and that have an odd index \(2 \ell -1\) with \(1 \le \ell \le \ell _{i}\) if \(L_{i}>1\) (so that the last point is kept even if labelled by an odd integer when \(\ell _{i}\ge 1\)). By construction, \(\Delta _{1}'\) and \(\Delta '\) satisfy the assumption of Sect. 4.4.3, so that
By construction, \(\Delta _{1}' \supset \Delta \). If \(\Delta _{1}' \not = \Delta \), we then build a new subdivision \(\Delta _{2}'\), associated with \(\Delta _{1}'\) in the same manner as \(\Delta _{1}'\) is associated with \(\Delta '\). We then obtain
We carry on the construction until we reach \(\Delta _{M}' = \Delta \) for some integer \(M \ge 1\). We notice that such an \(M\) does exist: by construction, each \(\Delta _{j}'\) contains \(\Delta \) and \(\sharp [\Delta _{j}'] < \sharp [\Delta _{j-1}']\) (with the convention \(\Delta _{0}' = \Delta '\)).
We now make an additional assumption: We assume that \(\Delta '\) is a dyadic subdivision, that is \(\Delta ' = \{ 2^{-P} k T, 0 \le k \le 2^P\}\) for some \(P \ge 1\). This says that \(\Delta \) is also made of dyadic points of order \(P\). We denote by \(Q\) the unique integer such that
and by \(i_{Q}\) some index such that \(L_{i_{Q}} = 2^Q + r\). At the first step, the first \(2^Q\) points in \((\Delta '{\setminus } \Delta ) \cap (t_{i_{Q}-1},t_{i_{Q}})\) are reduced to \(2^{Q-1}\) points. At the second step, they are reduced to \(2^{Q-2}\) points, and so on. Therefore, it takes \(Q\) steps to reduce the first \(2^Q\) points in \((\Delta '{\setminus }\Delta ) \cap (t_{i_{Q}-1},t_{i_{Q}})\) to a single one. Meanwhile, it takes at most \(Q\) steps to reduce the \(r\) remaining points in \((\Delta '{\setminus }\Delta ) \cap (t_{i_{Q}-1},t_{i_{Q}})\) to a single one (without any interference between the two reductions). We deduce that, after the \(Q\)th step, there are at most two operations left to perform to reduce \(\Delta _{Q}'\) to \(\Delta \). This says that \(M\) is either \(Q+1\) or \(Q+2\) and that, at each step \(j \in \{1,\dots ,Q\}\) of the induction, we are doubling the step size \(\rho (\Delta _{j-1}'{\setminus } \Delta _{j}')\), that is
so that
Therefore, \(\rho (\Delta _{j-1}'{\setminus }\Delta _{j}') \le 2^{j-M+2} \pi (\Delta )\), \(j=1,\dots ,M\). By extending (64) to each of the steps of the induction, we get (up to a new value of \(C\))
When \(\Delta \) and \(\Delta '\) contain non-dyadic points (so that they are different from \(\{0,T\}\)), we can argue as follows. We can find a dyadic subdivision, denoted by \(D_{2}\), such that, in any open interval delimited by two consecutive points in \(D_{2}\), there is at most one element of \(\Delta \). Then, we remove points from \(D_{2}\) to obtain a minimal subdivision \(D_{1}\), made of dyadic points, such that, in any open interval delimited by two consecutive points in \(D_{1}\), there is exactly one element of \(\Delta \). In this way, in any open interval delimited by two consecutive points in \(\Delta \), there is at most one point in \(D_{1}\). Therefore, we can apply Lemma 17 to \((D_{1},D_{1} \cup \Delta )\) and \((\Delta ,D_{1} \cup \Delta )\). We get
since \(\pi (D_{1}) \le 2 \pi (\Delta )\). By the same argument, we can find a dyadic subdivision \(D_{1}'\) for which the above inequality applies with \((D_{1},\Delta )\) replaced by \((D_{1}',\Delta ')\). Then, we can find a dyadic subdivision \(D\) such that both \(D_{1} \subset D\) and \(D_{1}' \subset D\). Applying (65) to \((D_{1},D)\) and to \((D_{1}',D)\), we can bound the difference between \(S(D_{1}')\) and \(S(D_{1})\). The result follows.
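The successive halving of \(\Delta '{\setminus }\Delta \) described in this subsection is easy to implement. A minimal sketch (the function names are ours), which also checks at every step that the produced pair of subdivisions satisfies the assumption of Lemma 17:

```python
def reduce_once(delta, delta_p):
    """One halving step Delta' -> Delta'_1: in each gap of Delta, keep the
    even-indexed points of Delta' \\ Delta, plus the last one when their
    number is odd and greater than one."""
    base, kept = sorted(delta), set(delta)
    extra = sorted(set(delta_p) - set(delta))
    for a, b in zip(base, base[1:]):
        gap = [t for t in extra if a < t < b]
        keep = gap[1::2]                      # p_2, p_4, ...
        if len(gap) > 1 and len(gap) % 2 == 1:
            keep.append(gap[-1])              # keep the last point as well
        kept.update(keep)
    return sorted(kept)

def at_most_one_between(coarse, fine):
    """Assumption of Lemma 17: between two consecutive points of the coarse
    subdivision lies at most one point of the fine one."""
    extra = sorted(set(fine) - set(coarse))
    return all(sum(1 for t in extra if a < t < b) <= 1
               for a, b in zip(coarse, coarse[1:]))

def reduction_steps(delta, delta_p):
    """Number M of halving steps needed to reduce Delta' down to Delta."""
    steps, cur = 0, sorted(delta_p)
    while cur != sorted(delta):
        nxt = reduce_once(delta, cur)
        assert at_most_one_between(nxt, cur)  # Lemma 17 applies at each step
        cur, steps = nxt, steps + 1
    return steps

# Dyadic example over [0, 1]: Delta' of order P = 5 and Delta = {0, 1}, so
# L_1 = 31 = 2^4 + 15 and Q = 4; the text predicts M = Q + 1 or Q + 2.
delta, delta_p = [0.0, 1.0], [k / 32 for k in range(33)]
M = reduction_steps(delta, delta_p)
```

Here the reduction takes \(M = Q + 2 = 6\) steps, consistently with the counting argument above.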
4.5 Further properties of the integral
4.5.1 Extension of the integral
Given the decomposition (58), it is worth noting that both the integrals \(\int _{0}^T \psi _{t} M(t,t+\mathrm{d}t)\) and \(\int _{0}^T \psi _{t} R(t,t+\mathrm{d}t)\) are also defined as \(L^p\) limits of the associated adapted Riemann sums. The main point is to check that Lemma 17 applies to \(S_M\) and \(S_{R}\), where, with the same notation as in (60), \(S_{M}(\Delta ) = \sum _{i=0}^{N-1} \psi _{t_{i}} M(t_{i},t_{i+1})\) and \( S_{R}(\Delta ) = \sum _{i=0}^{N-1} \psi _{t_{i}} R(t_{i},t_{i+1})\). A careful inspection of the proof of Lemma 17 shows that the non-trivial point is to control the quantities \(\delta _{2} S(\Delta ,\Delta ',M)\) and \(\delta _{2} S(\Delta ,\Delta ',R)\), obtained by replacing \(A\) by \(M\) and \(R\) respectively in the definition of \(\delta _{2} S(\Delta ,\Delta ')\) in (62). Actually, since we already have a control of the sum of the two terms (as it coincides with \(\delta _{2} S(\Delta ,\Delta ')\) in the proof of Lemma 17), it is sufficient to control one of them only. Clearly,
We emphasize that the first term above is nothing but \(\delta _{2} S(\Delta ,\Delta ',R')\) in (63), for which we already have a bound. Therefore, the only remaining point is to control the second term above. Again, we notice that it has a martingale structure, which can be estimated by the Burkholder–Davis–Gundy inequality. By the first line in (57) and by (59),
which is enough to conclude that Theorem 16 is also valid when replacing \(A\) by \(R\) or \(M\) in Sect. 4.4.4. Therefore, we are allowed to split the integral of \(\psi \) as \(\int _{0}^T \psi _{t}A(t,t+\mathrm{d}t) = \int _{0}^T \psi _{t} M(t,t+\mathrm{d}t) + \int _{0}^T \psi _{t} R(t,t+\mathrm{d}t)\). The reader should note that neither \(M\) nor \(R\) need satisfy (57), even if \(A\) does. The extension of the integral to the case when it is driven by \(M\) or \(R\) is thus a consequence of the proof of Theorem 16 itself.
4.5.2 Continuity in time
It is plain to see that the integral is additive in the sense that, for any \(0 \le S \le S+S' \le T\),
An important question in practice is the regularity property of the process \([0,T) \ni t \mapsto \int _{0}^t \psi _{s} A(s,s+\mathrm{d}s)\), which is not well-defined for the moment. At this stage of the procedure, each of the integrals is uniquely defined up to an event of zero probability which depends on \(t\). A continuity argument is thus needed in order to give a sense to all the integrals at the same time. By Theorem 16, we know that, for \(h \in (0,1)\),
for \(\eta >0\) as in the statement of Theorem 16, so that, by the first two lines in (57), \( \Vert \int _{t}^{t+h} \psi _{s} A(s,s+\mathrm{d}s) \Vert _{L^p(\Omega ,{\mathbb {P}})} \le C h^{1/2}\), for possibly new values of \(C\). By Kolmogorov’s continuity criterion, this says that there exists a Hölder continuous version of the process \(( \int _{0}^t \psi _{s} A(s,s+\mathrm{d}s))_{0 \le t \le T}\), with \(1/2 - 1/p-\epsilon \) as pathwise Hölder exponent, for any \(\epsilon >0\).
By the same argument, we notice that there exist Hölder continuous versions of the processes \(( \int _{0}^t \psi _{s} M(s,s+\mathrm{d}s) )_{0 \le t \le T}\) and \(( \int _{0}^t \psi _{s} R(s,s+\mathrm{d}s))_{0 \le t \le T}\). The Hölder exponent of the second one is actually better. Indeed, noticing that (66) also holds for \(R\) and taking advantage of the first line in (57), we deduce that \(\Vert \int _{t}^{t+h} \psi _{s} R(s,s+\mathrm{d}s) \Vert _{L^p(\Omega ,{\mathbb {P}})} \le C h^{(1+\eta )/2}\), so that the pathwise Hölder exponent can be chosen as \((1+\eta )/2-1/p-\epsilon \) for any \(\epsilon >0\).
4.5.3 Dirichlet decomposition
It is readily checked that the process \(( \int _{0}^t \psi _{s} M(s,s+\mathrm{d}s))_{0 \le t \le T}\) is a martingale, thus showing that the integral of \(\psi \) with respect to the pseudo-increments of \(A\) can be split into two terms: a martingale and a drift. We expect that, in practical cases, the exponent \(p\) can be chosen as large as desired: In this setting, the martingale part has \((1/2-\epsilon )\)-Hölder continuous paths, for \(\epsilon >0\) as small as desired, and the drift part has \((1/2 + \eta -\epsilon )\)-Hölder continuous paths, also for \(\epsilon >0 \) as small as desired, thus proving that the integral is a Dirichlet process.
4.6 Application to diffusion processes driven by a distributional drift
We now explain how the stochastic Young integral applies to (1). First, we can choose \(A(t,t+h) = X_{t+h} - X_{t}\), for \(0 \le t \le t+h \le T_{0}\). Then the process \(A\) is additive. In particular, the last two lines in (57) are automatically satisfied with \(\varepsilon _{1}\) and \(\varepsilon _{1}'\) as large as needed. By (48), the second line in (57) is also satisfied. Finally, we notice that
so that, by (48) again, the first line in (57) is satisfied with \(\varepsilon _{0} = \beta /2\).
With our construction, this permits us to define \((\int _{0}^t \psi _{s} \mathrm{d}X_{s})_{0 \le t \le T_{0}}\) for any progressively measurable process \((\psi _{t})_{0 \le t \le T_{0}}\) satisfying (59) with \(\varepsilon _{2} < \beta /2\). It also permits us to define the integrals \((\int _{0}^t \psi _{s} M(s,s+\mathrm{d}s))_{0 \le t \le T_{0}}\) and \((\int _{0}^t \psi _{s} R(s,s+\mathrm{d}s))_{0 \le t \le T_{0}}\), where
By (49), we have \(R(t,t+h) = {\mathfrak b}(t,X_{t},h)\), so that \((\int _{0}^t \psi _{s} {\mathfrak b}(s,X_{s},\mathrm{d}s) )_{0 \le t \le T_{0}}\) is well-defined.
Moreover, by Proposition 14 and by the boundedness of the exponential moments of \((X_{t})_{0 \le t \le T_{0}}\) (see the proof of Theorem 8), we know that \(\hat{R}(t,t+h) = (b - {\mathfrak b})(t,X_{t},h)\) also satisfies (57), from which we deduce that \((\int _{0}^t \psi _{s} (b-{\mathfrak b})(s,X_{s},\mathrm{d}s))_{0\le t \le T_{0}}\), and hence \((\int _{0}^t \psi _{s} b(s,X_{s},\mathrm{d}s))_{0\le t \le T_{0}}\), are well-defined. Actually, since the exponent of \(h\) in the difference \((b - {\mathfrak b})(t,X_{t},h)\) is strictly greater than 1, the integral process \((\int _{0}^t \psi _{s} (b-{\mathfrak b})(s,X_{s},\mathrm{d}s))_{0\le t \le T_{0}}\) must be \(0\). We deduce that \(\int _{0}^t \psi _{s} b(s,X_{s},\mathrm{d}s) = \int _{0}^t \psi _{s}{\mathfrak b}(s,X_{s},\mathrm{d}s)\), for \(0 \le t \le T_{0}\).
We finally discuss the integral \((\int _{0}^t \psi _{s} M(s,s+\mathrm{d}s) )_{0 \le t \le T}\). We let
By (48), \({\mathbb {E}}[ \vert \hat{M}(t,t+h) \vert ^q ]^{1/q} \le C_{q}' h^{(1+\beta )/2}\) for some \(C_{q}' \ge 0\), which reads as a super-diffusive bound for the pseudo-increments of \(\hat{M}\). It is then readily checked that \((\hat{M}(t,t+h))_{0 \le t \le t+h \le T_{0}}\) fulfills all the requirements in (57). Therefore, the integral \((\int _{0}^t \psi _{s} \hat{M}(s,s+\mathrm{d}s) )_{0 \le t \le T_{0}}\) makes sense. By Sect. 4.5, it is a martingale, but, by the super-diffusive bound on its pseudo-increments, it must be the null process. Put differently, only the Brownian part really matters in \(M\), and we can justify (56) thanks to the equality
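Why the martingale driven by \(\hat{M}\) vanishes can be seen from a one-line second-moment computation. As a sketch (assuming, for simplicity, a bounded integrand \(\psi \)), the orthogonality of the martingale increments along a subdivision \(\Delta = \{t_{0} < \dots < t_{N}\}\) gives

```latex
{\mathbb E}\Bigl[\Bigl(\sum_{i=0}^{N-1} \psi_{t_i}\,\hat M(t_i,t_{i+1})\Bigr)^{\!2}\Bigr]
= \sum_{i=0}^{N-1} {\mathbb E}\bigl[\psi_{t_i}^2\,\hat M(t_i,t_{i+1})^2\bigr]
\le C \sum_{i=0}^{N-1} (t_{i+1}-t_i)^{1+\beta}
\le C\,T_0\,\pi(\Delta)^{\beta} \xrightarrow[\pi(\Delta) \to 0]{} 0,
```

the constant \(C\) depending on \(\sup \vert \psi \vert \) and on the super-diffusive bound with \(q=2\); letting the step size tend to \(0\) forces the limit integral to be null.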
Remark 18
In [6], the authors already introduced a ‘nonlinear’ version of the Young integral. The motivation was similar to ours, as the underlying objective was to solve singular differential equations driven by a distributional (but time-homogeneous) velocity field and perturbed by a rough signal. The construction suggested therein also consists of an approximation by means of Riemann sums, but the convergence is shown pathwise. The proof relies on a suitable control on the defect of additivity of the nonlinear integrator, on the model of the third line in (57), but expressed in a pathwise (instead of \(L^p\)) form. We refer to [6, Theorem 2.4] for the main statement: Therein, the pseudo-increment reads \(G_{t_{i},t_{i+1}}(f_{t_{i}})\) instead of \(A(t_{i},t_{i+1})\) and the condition \(\gamma + \rho \nu >1\) corresponds to the condition \(1+\varepsilon _{1}>1\) in the third line of (57). In the specific framework of singular differential equations driven by a distributional drift and a Brownian path, the Young integral is used in order to give a meaning to the drift part, exactly as we do here. Anyhow, the construction by Catellier and Gubinelli relies on a path-by-path time averaging principle, which goes back to Davie’s work [7]. Our construction is different, as it relies on a space averaging principle, inspired by Zvonkin’s method [32]. We indeed make use of the statistical behavior of the Brownian motion (and its connection with the heat equation) in order to define explicitly the effective drift \(b(t,x,\mathrm{d}t)\). This explains why our approach is of stochastic nature.
5 Construction of the integral of \(Z\) w.r.t. \(Y\): examples
We here address the existence of a rough path structure \(({\varvec{W}}_{t}^T = (W_{t}^T,{\fancyscript{W}}_{t}^T))_{0 \le t \le T}\) for the pair \(W_{t}^T=(Y_{t},Z_{t}^T)\), for \(T\) running in some interval \([0,T_{0}]\), \(T_{0}>0\), the process \((Z_{t}^T)_{0 \le t \le T}\) being given by (13). The process \({\fancyscript{W}}^T\) is intended to encapsulate the iterated integrals of \(W^T\), namely \(\int _{x}^{y} (W_{t}^{i,T}(z) - W_{t}^{i,T}(x)) \mathrm{d}W^{j,T}(z)\), for \(i,j \in \{1,2\}\) and \(x,y \in {\mathbb {R}}\). Here \(W^{i,T}_{t}\) and \(W^{j,T}_{t}\) denote the coordinates of \(W_{t}^T\), namely \(W^{1,T}_{t}(x) = Y_{t}(x)\) and \(W^{2,T}_{t}(x) = Z_{t}^T(x)\).
As we are seeking a ‘geometric’ rough structure, the iterated integrals are expected to be the limits of iterated integrals computed along smooth approximations of the paths \((Y_{t})_{0 \le t \le T}\) and \((Z_{t}^T)_{0 \le t \le T}\), see (1) and (2) in Proposition 6. In particular, if it exists, \({\fancyscript{W}}^T\) must share some of the properties satisfied by iterated integrals of smooth paths, among which is the integration by parts formula. This means that \({\fancyscript{W}}^{1,1,T}_{t}\) and \({\fancyscript{W}}^{2,2,T}_{t}\) must be given by
and that \({\fancyscript{W}}^{1,2,T}_{t}\) and \({\fancyscript{W}}^{2,1,T}_{t}\) must be connected through
To sum up, the only challenge for constructing \({\fancyscript{W}}^T\) is to define the ‘cross-integral’
5.1 Overview of the results
We are given \((Y_{t}(x))_{0 \le t \le T_{0},x \in {\mathbb {R}}}\) satisfying for some \(\alpha \in (1/3,1)\) and \(\chi ,\kappa >0\):
Below, we often write \(\kappa _{\alpha ,\chi }(Y)\) for \(\kappa _{\alpha ,\chi }((Y_{t})_{0 \le t \le T_{0}})\). As a first remark, we note that, for \(T \in [0,T_{0}]\), the process \((Z_{t}^T)_{0 \le t \le T}\) in (13) has the same regularity as \(Y\), uniformly in \(T\):
Lemma 19
Given \(T \in [0,T_{0}]\), recall the definition of \(Z_{t}^T\) in (13). There exists a constant \(C\) only depending on \(T_0\), \(\alpha \) and \(\chi \) such that \(\kappa _{\alpha ,\chi }((Z_{t}^T)_{0 \le t \le T})\le C\kappa \).
Proof
To prove \(\kappa _{\alpha ,\chi }((Z_{t}^T)_{0 \le t \le T}) \le C \kappa \), we go back to (33), noticing that \(({\mathcal M}v)_{t}\) therein is equal to \(Z_{t}^T\) when \(v \equiv 1\) and recalling that the analysis is split into two parts: \(\vert x' - x\vert ^2 \le T-t\) and \(T-t < \vert x'-x\vert ^2\), only the first case being challenging. It is then plain to check that, for \(x,x',\xi \in [-a,a]\), with \(a \ge 1\), \({\mathcal I}_{1}^{x,x'}(\xi ) \le C \kappa a^{\chi } \int _{0}^{\vert x' -x\vert ^2} s^{-1+\alpha /2} \mathrm{d}s \le C \kappa a^{\chi } \vert x'-x \vert ^{\alpha }\). Moreover, following (35) with \(\beta =1\), we also have \({\mathcal I}_{2}^{x,x'} \le C \kappa a^{\chi } \int _{x}^{x'} \int _{\vert x'-x\vert ^2}^{T} s^{-(3-\alpha )/2} \mathrm{d}s \mathrm{d}u \le C \kappa a^{\chi } \vert x'-x\vert ^{\alpha }\), for \(x,x' \in [-a,a]\), which completes the proof. \(\square \)
In order to construct \({\fancyscript{I}}_{t,T}(x,y)\) in (69) as a geometric integral, we must specify what an approximation of \(Y\) is. We shall say that a sequence \((Y^n)_{n\ge 0}\) is a smooth approximation of \(Y\) on \([0,T_{0}]\) if, for each \(t \in [0,T_{0}]\), the function \(Y^n_{t} : {\mathbb {R}}\ni x \mapsto Y^n_{t}(x)\) is a smooth function such that \(\sup _{n\ge 0}\kappa _{\alpha ,\chi }((Y^n_{t})_{0 \le t \le T_{0}})<\infty \) and, for any \(a\ge 1\), \(\lim _{n\rightarrow \infty } \Vert Y^n -Y \Vert _{0,\alpha '}^{[0,T_{0}] \times [-a,a]}=0\) for any \(\alpha ' \in (0, \alpha )\). Below, we shall often use the following trick, which holds true for any \(a \ge 1\) and any \(\alpha ' \in (0,\alpha )\),
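Although the display (71) is not reproduced here, a plausible form of this trick is the interpolation inequality between the supremum norm and the \(\alpha \)-Hölder semi-norm: for a bounded function \(f\) on \([-a,a]\) with \(\Vert f \Vert _{\alpha }^{[-a,a]} < \infty \) and \(\alpha ' \in (0,\alpha )\),
\[
\Vert f \Vert _{\alpha '}^{[-a,a]} = \sup _{x \not = x'} \frac{\vert f(x')-f(x)\vert }{\vert x'-x\vert ^{\alpha '}} \le \bigl ( \Vert f \Vert _{\alpha }^{[-a,a]} \bigr )^{\alpha '/\alpha } \bigl ( 2 \Vert f \Vert _{\infty }^{[-a,a]} \bigr )^{1-\alpha '/\alpha },
\]
obtained by writing \(\vert f(x')-f(x)\vert /\vert x'-x\vert ^{\alpha '} = (\vert f(x')-f(x)\vert /\vert x'-x\vert ^{\alpha })^{\alpha '/\alpha }\, \vert f(x')-f(x)\vert ^{1-\alpha '/\alpha }\). Applied to \(f = Y^n_{t} - Y_{t}\), it converts uniform convergence together with a uniform \(\alpha \)-Hölder bound into convergence in \(\alpha '\)-Hölder norm, which is exactly how the trick is invoked below.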
In particular, a typical example for \(Y^n\) is to let
where \(\rho \) is a smooth density, \(\rho \) and its derivatives being at most of polynomial decay, in which case the smooth approximation is said to be constructed by spatial convolution.
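With the display (72) not reproduced here, a natural guess for the convolution approximation, consistent with the identity \(Y^{n,T_{0}} = Y^{n\rho (n\cdot ),T_{0}}\) used later in the proof of Lemma 29, is
\[
Y^{n}_{t}(x) = n \int _{{\mathbb {R}}} \rho \bigl ( n(x-y) \bigr )\, Y_{t}(y)\, \mathrm{d}y, \qquad n \ge 1,
\]
the integral converging since \(Y_{t}\) grows at most polynomially (of order \(\chi \)) while \(\rho \) decays fast enough.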
Given a smooth approximation \((Y^n)_{n \ge 1}\) of \(Y\), we may define, for any \(T \in [0,T_{0}]\), the process \(Z^{n,T}\) by replacing \(Y\) by \(Y^n\) in (13), and then, following (69), we may let
which permits us to define the structure \(({\varvec{W}}^{n,T}_{t} = (W_{t}^{n,T},{\fancyscript{W}}^{n,T}_{t}))_{0 \le t \le T}\) accordingly.
The following lemma then provides a general principle for constructing \({\fancyscript{I}}_{t}^{T}(x,x')\):
Lemma 20
Suppose that, for any \(T \in [0,T_0]\), there exists a function \({\fancyscript{I}}^{T} : [0,T] \times {\mathbb {R}}^2 \rightarrow {\mathbb {R}}\) and a smooth approximation \((Y^n)_{n \ge 1}\) of \(Y\) such that, for some \(\alpha '\in (1/3,\alpha )\) and \(\chi ' > \chi \),
Assume without any loss of generality that \(\chi ' > \chi + \alpha - \alpha '\). Then, for any \(T\in [0,T_0]\), there exists \({\fancyscript{W}}^T\in \mathcal {C}([0,T]\times {\mathbb {R}}^2,{\mathbb {R}}^4)\) such that the pair process \(({\varvec{W}}^T_{t}=(W_{t}^T,{\fancyscript{W}}_{t}^T))_{0 \le t \le T}\) is a time dependent geometric rough path with indices \((\alpha ',\chi ')\) in the sense that
-
(1)
\( \sup _{0 \le T \le T_{0}} \kappa _{\alpha ',\chi '} ({\varvec{W}}^T) < \infty \) and \( \sup _{n \ge 1} \sup _{0 \le T \le T_{0}} \kappa _{\alpha ',\chi '} ({\varvec{W}}^{n,T} = (W^{n,T},{\fancyscript{W}}^{n,T})) < \infty \);
-
(2)
for any \(T \in [0,T_{0}]\) and any segment \({\mathbb {I}} \subset {{\mathbb {R}}}\), \(\Vert {\varvec{W}}^T - {\varvec{W}}^{n,T}\Vert _{0,\alpha '}^{[0,T] \times {\mathbb {I}}}= \Vert (W^T-W^{n,T},{\fancyscript{W}}^T-{\fancyscript{W}}^{n,T})\Vert _{0,\alpha '}^{[0,T] \times {\mathbb {I}}}\) tends to \(0\) as \(n\) tends to \(\infty \).
Proof
The cross integral \({\fancyscript{I}}^T\) being given, the definition of \({\fancyscript{W}}^T\) follows from (67) and (68). The point is thus to prove the geometric nature of the rough path \({\varvec{W}}^T\).
By (71), we have, for any \(a \ge 1\), \(\lim _{n \rightarrow \infty } \Vert Y^n - Y \Vert _{0,\alpha '}^{[0,T_{0}] \times [-a,a]} = 0\). Moreover, \(\Vert Y^n_{t} \Vert _{\alpha '}^{[-a,a]} \le (2a)^{\alpha - \alpha '}\Vert Y^n_{t} \Vert _{\alpha }^{[-a,a]} \le C a^{\alpha - \alpha ' + \chi } \kappa _{\alpha ,\chi }(Y^n)\), proving that \(\sup _{n \ge 1} \kappa _{\alpha ',\chi '}(Y^n)< \infty \) if \(\chi ' \ge \alpha - \alpha ' + \chi \).
Applying Lemma 19 to \((Y^n,Z^{n,T})\), we get \(\sup _{n\ge 0} \sup _{0 \le T \le T_{0}} \kappa _{\alpha ',\chi '}((Z^{n,T}_{t})_{0 \le t \le T})<\infty \). Now, it is quite standard to see that, for any \(T \in [0,T_{0}]\) and \(a \ge 1\), \(\sup _{0 \le t \le T} \sup _{x \in [-a,a]} \vert Z^{n,T}_{t}(x) - Z^T_{t}(x) \vert \) tends to \(0\) as \(n \rightarrow \infty \). By Lemma 19 again, for \(a \ge 1\) and \(T \in [0,T_{0}]\), the functions \(([-a,a] \ni x \mapsto Z^{n,T}_{t}(x) \in {\mathbb {R}})_{0 \le t \le T,n \ge 1}\) are uniformly \(\alpha \)-Hölder continuous. By the same trick as in (71), we easily deduce that \(\Vert Z^{n,T} - Z^T \Vert _{0,\alpha '}^{[0,T] \times [-a,a]}\) tends to \(0\).
In order to complete the proof, it suffices to handle the iterated integrals, which follows from (73) and (69) (applied to the pair \((Y^n,Z^{n,T})\) instead of \((Y,Z)\)). \(\square \)
Here is the first main statement of this section:
Theorem 21
Given \(\alpha \in (1/3,1]\) and \(\chi >0\), let \(Y \in {\mathcal {C}}([0,T_{0}] \times {\mathbb {R}},{\mathbb {R}})\) satisfy \(\kappa _{\alpha ,\chi }((Y_{t})_{0 \le t \le T_{0}}) < \infty \) (see (70) for the notation) and

for some \(\kappa \ge 0\) and \(\mu ,\nu \ge 0\) with \(2 \nu + \mu \in ( 1- \alpha ,1]\). Then, \(Y\) satisfies the assumptions of Lemma 20 with respect to any \((\alpha ',\chi ')\) with \(\alpha '<\alpha \) and \(\chi ' > \chi + \alpha - \alpha ' + (1/2 - \alpha )_{+}\). In particular, for any \(T \in [0,T_{0}]\), the pair \(W^T = (Y,Z^T)\), with \(Z^T\) given by (13), may be lifted into a geometric rough path \({\varvec{W}}^T =(W^T,{\fancyscript{W}}^T)\) satisfying the conclusions of Lemma 20.
Moreover, when the smooth approximation used in Lemma 20 is constructed by spatial convolution, \({\varvec{W}}^T\) does not depend upon the kernel \(\rho \) in (72). When \(\alpha >1/2\), \({\varvec{W}}^T\) is always well-defined and remains the same whatever the smooth approximation is (even if not constructed by convolution).
Theorem 21 guarantees that \({\varvec{W}}^T\) exists for any \(T \in [0,T_{0}]\) under some condition on the time-space structure of the environment \((Y_{t})_{0 \le t \le T_{0}}\). When \(Y\) is time homogeneous, (74) is automatically satisfied, and the iterated integral in (69) always exists and is geometric under the simple assumption that \(\kappa _{\alpha ,\chi }(Y) < \infty \). In that case, the cross integral \({\fancyscript{I}}^T_{t}(x,x')\) in (69) can be expressed explicitly, see (78) in Lemma 23 below. Moreover, a careful inspection of the proof shows that the constraint \(\chi ' > \chi + \alpha - \alpha ' + (1/2 - \alpha )_{+}\) can be relaxed into \(\chi ' > \chi + \alpha - \alpha '\). When \(Y\) is time dependent, the additional condition (74) is imposed. It is inspired by the construction of the so-called Young integral between a Hölder continuous function and the increments of another Hölder continuous function, see [31] and Lemma 24 below. For instance, if \(\alpha >1/2\), (74) is always satisfied with \(\mu =\alpha \) and \(\nu =0\) and the constraint on \(\chi '\) reduces to \(\chi ' > \chi + \alpha - \alpha '\). When \(\alpha \le 1/2\), a sufficient condition to imply (74) is that \(Y\) has some \(\beta \)-Hölder regularity in time: \(\vert Y_s(y)-Y_t(y) \vert \le \kappa ' \bigl ( 1+ \vert y \vert ^{\chi } \bigr ) \vert s- t \vert ^{\beta }\) with \(\beta > (1-\alpha )/2\). The bound (74) is then satisfied with \(\mu =0\) and \(\nu =\beta \wedge (1/2)\). A more specific case is when \(Y_{t}(y)\) can be expanded as \(Y_{t}(y) = f_{t}Y(y)\), with \(f\) \(\beta \)-Hölder continuous, for \(\beta > 1/2-\alpha \), and \(Y \in {\mathcal {C}}({\mathbb {R}},{\mathbb {R}})\) with \(\sup _{a \ge 1}[ a^{-\chi } \Vert Y\Vert _{\alpha }^{[-a,a]}] < \infty \), in which case (74) holds with \(\mu =\alpha \) and \(\nu =\beta \wedge (1/2-\alpha /2)\). 
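Although (74) itself is not reproduced above, the exponent arithmetic behind the time-Hölder sufficient condition can be checked directly: with \(\mu =0\) and \(\nu =\beta \wedge (1/2)\), the assumption \(\beta > (1-\alpha )/2\) gives
\[
2\nu + \mu = 2\bigl ( \beta \wedge \tfrac{1}{2} \bigr ) > 2 \cdot \frac{1-\alpha }{2} = 1-\alpha , \qquad 2\nu + \mu \le 1,
\]
the strict lower bound using both \(\beta > (1-\alpha )/2\) and \(1/2 > (1-\alpha )/2\) (which holds since \(\alpha >0\)). The pair \((\mu ,\nu )\) thus fulfills the requirement \(2\nu + \mu \in (1-\alpha ,1]\) of Theorem 21.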
Notice finally that the constraint \(2 \nu + \mu \le 1\) can be easily overcome: When \(2\nu + \mu >1\), the value of \(\nu \) can be decreased for free so that \(2\nu +\mu = 1\).
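This reduction is elementary. Assuming, as suggested by the proof of Lemma 25, that the time increment enters (74) through a factor \(\vert s-t\vert ^{\nu }\), the exponent \(\nu \) can be lowered to any \(\tilde{\nu } \in [0,\nu ]\) at the price of a multiplicative constant, since \(0 \le s-t \le T_{0}\):
\[
\vert s-t \vert ^{\nu } = \vert s-t \vert ^{\tilde{\nu }}\, \vert s-t \vert ^{\nu -\tilde{\nu }} \le T_{0}^{\nu -\tilde{\nu }}\, \vert s-t \vert ^{\tilde{\nu }}.
\]
When \(2\nu +\mu >1\) (and \(\mu \le 1\)), the choice \(\tilde{\nu } = (1-\mu )/2\) gives \(2\tilde{\nu }+\mu =1\), with \(\kappa \) replaced by \(T_{0}^{\nu -\tilde{\nu }}\kappa \).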
As mentioned in Sect. 1, existence of the cross-integral has also been proved within the framework of the KPZ equation by means of general results on rough paths theory applied to Gaussian processes, see [16, 18, Section 3] and [19, Section 7]. Theorem 22 below is a refinement:
Theorem 22
Let \((\Xi ,{\mathcal G},{\mathbf {P}})\) be a probability space with a Brownian sheet \((\zeta (t,x))_{t \ge 0, x \in {\mathbb {R}}}\). Let \(Y^T(t,x) := \int _{t}^{T} \int _{{\mathbb {R}}} p_{s-t}(x-y) \mathrm{d}\zeta (s,y)\), for \(\{0\le t\le T,x\in {\mathbb {R}}\}\). For a smooth density \(\rho \), \(\rho \) and its derivatives being at most of polynomial decay, define in the same way \(Y^{\rho ,T}(t,x) := \int _{t}^{T} \int _{{\mathbb {R}}} p_{s-t}(x-y) \mathrm{d}\zeta ^{\rho }(s,y)\), with \(\zeta ^\rho (t,x) := \int _{0}^t \int _{{\mathbb {R}}} \rho (x-y) \mathrm{d}\zeta (s,y)\).
Then, for any \(T_0>0\), we can find an event \(\Xi ^\star \in {\mathcal G}\), with \({\mathbf {P}}(\Xi ^\star )=1\), such that, for any realization in \(\Xi ^\star \), for any \(Y^{(b)} \in {\mathcal {C}}([0,T_{0}] \times {\mathbb {R}},{\mathbb {R}})\), with \(\kappa _{\alpha _{b},\chi _{b}}(Y^{(b)}) < \infty \) for some \(\alpha _{b} >1/2\) and \(\chi _{b} >0\), for any approximation sequence \((Y^{n,(b)})_{n \ge 1}\) of \(Y^{(b)}\), the function
satisfies the assumption of Lemma 20 with respect to any \(\alpha \in (0,1/2)\) and any \(\chi > \chi _{b} + \alpha _{b}-\alpha \), and with respect to the smooth approximation \((Y^n = Y^{n \rho (n \cdot ),T_0} + Y^{n,(b)})_{n \ge 1}\).
Theorem 22 is specifically designed to handle the KPZ equation and to construct, in the next section, the related polymer measure. In this perspective, an important point is to control the time-dependent rough paths \(({\varvec{W}}^T_{t})_{0 \le t \le T}\), uniformly in \(T \in [0,T_{0}]\), which is one of the reasons why we revisit the argument given in [19, Section 7]. Instead of making use of general results on rough paths theory for Gaussian processes, we benefit from the fact that \(Y^{T_0}\) solves the backward stochastic heat equation to identify the cross-integral \({\fancyscript{I}}_{t}^T(x,x')\) in (69) with a stochastic integral. Such a construction can be extended to non-Gaussian cases when \(Y^{T_0}\) solves a stochastic PDE of a more general form (with possibly random coefficients).
5.2 Proof of Theorem 21
Following the decomposition of \(Y\) introduced in the statement of Theorem 21, it makes sense to split \(Z^T_{t}(x)\) into \(Z_t^T(x) = Z_t^{(1),T}(x)+Z_t^{(2),T}(x)\), with
Accordingly, we can split, at least formally, the iterated integral \({\fancyscript{I}}^T_{t}(x,x')\) in (69) into \({\fancyscript{I}}^T_{t}(x,x') = {\fancyscript{I}}^{(1),T}_{t}(x,x') + {\fancyscript{I}}^{(2),T}_{t}(x,x')\), with
The analysis of \({\fancyscript{I}}^{(1),T}_{t}\) relies on
Lemma 23
Given \(\alpha ,\chi ,\kappa >0\), there is a constant \(C\), such that, for any \(Y \in {\mathcal {C}}([0,T_{0}] \times {\mathbb {R}},{\mathbb {R}})\) with \(\kappa _{\alpha ,\chi }(Y) \le \kappa \), the map \(x \mapsto Y_{t}(x)\) being differentiable for any \(t \in [0,T_{0}]\), it holds that
Moreover,
In the framework of Lemma 20, (78) remains true when \(Y\) is not differentiable in \(x\), by passing to the limit along a smooth approximation. When \(Y\) is time-homogeneous, \({\fancyscript{I}}^{T}_{t}(x,x')\) and \({\fancyscript{I}}^{(1),T}_{t}(x,x')\) coincide, and we have an explicit formula for the cross integral in (69).
Proof
Taking benefit of the heat equation satisfied by \(p_{s-t}\), we have
Recalling that, under the assumption of Lemma 23, \(Y\) is smooth in space, we get from (68):
Plugging the formula for \(Z_{t}^{(1),T}\) into the above relationship, we get (78).
By Lemma 19, \(\kappa _{\alpha ,\chi }(Z^{(1),T}) \le C \kappa \). It is then clear that, for \(x,x' \in [-a,a]\), the first two terms on the right-hand side of (78) satisfy (77). In order to prove that the third one satisfies it as well, we notice that it may be rewritten under the form (up to the factor \(-2\))
Splitting the increment \(Y_{t}(z) - Y_{t}(x)\) into \(Y_{t}(z) - Y_{t}(y)\) plus \(Y_{t}(y) - Y_{t}(x)\), we deduce from the bound \(\vert \partial _{x} p_{T-t} \vert \le c(T-t)^{-1/2}p_{c(T-t)}\) that \(\vert {\fancyscript{J}}_{T-t}(x,x') \vert \le C a^{2\chi } [ (T-t)^{-(1-\alpha )/2} \vert x'-x \vert ^{1+\alpha } + (T-t)^{-1/2} \vert x'-x \vert ^{1+2\alpha } ]\), so that, for \(T-t \ge \vert x'-x\vert ^2\), \(\vert {\fancyscript{J}}_{T-t}(x,x') \vert \le C a^{2\chi } \vert x'-x \vert ^{2\alpha }\).
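The Gaussian bound invoked here can be verified in two lines for the heat kernel \(p_{t}(x) = (2\pi t)^{-1/2} e^{-x^2/(2t)}\) (up to the normalization convention used in the paper):
\[
\partial _{x} p_{t}(x) = -\frac{x}{t}\, p_{t}(x), \qquad \frac{\vert x \vert }{t}\, e^{-x^2/(2t)} \le \frac{c}{\sqrt{t}}\, e^{-x^2/(4t)},
\]
the second inequality following from \(\sup _{u \ge 0} u\, e^{-u^2/4} < \infty \) with \(u = \vert x \vert /\sqrt{t}\). Combining the two yields \(\vert \partial _{x} p_{t}(x) \vert \le c\, t^{-1/2} p_{2t}(x)\), which is the announced bound \(\vert \partial _{x} p_{T-t} \vert \le c (T-t)^{-1/2} p_{c(T-t)}\) with \(c \ge 2\).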
In order to handle the case \(T-t \le \vert x'-x \vert ^2\), we first notice, by antisymmetry, that, for \(x<x'\), \({\fancyscript{J}}_{T-t}(x,x') ={\fancyscript{J}}_{T-t}^{(-\infty ,x]}(x,x') +{\fancyscript{J}}_{T-t}^{[x',+\infty )}(x,x')\), that is \({\fancyscript{J}}_{T-t}^{[x,x']}(x,x')=0\), where
We start with \({\fancyscript{J}}_{T-t}^{(-\infty ,x]}(x,x')\) (the other one may be handled in the same way). We have
Bounding \(\vert y-x\vert \) by \(\vert x'-x \vert \), the result follows from the following bound applied with \(\gamma =0\) or \(\chi \) and \(h=T-t\),
\(\square \)
In order to handle \({\fancyscript{I}}^{(2),T}_{t}\), we will make use of a famous result by Young [31]:
Lemma 24
Given two exponents \(\alpha ,\alpha ' >0\) with \(\alpha +\alpha '>1\), there exists a universal constant \(c>0\) such that, for any \(\alpha '\)-Hölder function \(f\) and any \(\alpha \)-Hölder function \(g\) on the interval \([x,x']\), the Stieltjes integral \(\int _{x}^{x'} f(z) \mathrm{d}g(z)\) is well defined and it holds
where \(\Vert f\Vert _{\alpha '}\) (resp. \(\Vert g\Vert _{\alpha }\)) is the Hölder semi-norm of \(f\) (resp. \(g\)).
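The estimate in Lemma 24 (not displayed above) is presumably the classical Young–Loève bound, which we record in the form
\[
\Bigl \vert \int _{x}^{x'} f(z)\, \mathrm{d}g(z) - f(x) \bigl ( g(x') - g(x) \bigr ) \Bigr \vert \le c\, \Vert f \Vert _{\alpha '} \Vert g \Vert _{\alpha }\, \vert x'-x \vert ^{\alpha +\alpha '},
\]
the condition \(\alpha + \alpha ' > 1\) being precisely what makes the Riemann sums along nested dissections of \([x,x']\) converge.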
Young’s result gives directly the existence of \({\fancyscript{I}}^{(2),T}\):
Lemma 25
Consider \(Y \in {\mathcal {C}}([0,T_{0}] \times {\mathbb {R}},{\mathbb {R}})\) satisfying both \(\kappa _{\alpha ,\chi }(Y) \le \kappa \) and (74). Then, for any \(0\le t\le T\le T_0\), the map \({\mathbb {R}}\ni x \mapsto Z_{t}^{(2),T}\) is locally \(2\nu +\mu \)-Hölder in space and there exists a constant \(C\), independent of \(t\) and \(T\), such that \(\kappa _{2\nu +\mu ,\chi }(Z^{(2),T})\le C\kappa \). As a consequence of Young’s theory, the integral \({\fancyscript{I}}^{(2),T}\) is well defined and, for \(a \ge 1\), \(x,x'\in [-a,a]\),
Proof
Let \(a\ge 1\) and \(x\le x'\in [-a,a]\). As in the proof of Lemma 19, we have to bound \(|Z_{t}^{(2),T}(x')-Z_{t}^{(2),T}(x)|\). We split the analysis into two cases: \(|x'-x|^2\le T-t \) and \(|x'-x|^2>T-t\), only the case \(|x'-x|^2\le T-t\) being challenging. To handle it, we go back to (33), letting \(v \equiv 1\) therein and replacing \(Y_{s}(z)\) by \(Y_{s}(z) - Y_{t}(z)\). By (74), we can repeat the computations of Lemma 19, replacing \(s^{-\alpha /2}\) by \(s^{-\nu -\mu /2}\). We deduce that \(|Z_{t}^{(2),T}(x')-Z_{t}^{(2),T}(x)| \le C \kappa a^\chi |x'-x|^{2\nu +\mu }\). Since the sum of the Hölder exponents of \(Z_{t}^{(2),T}\) and \(Y_t\) is larger than 1, the existence of and the bound for \({\fancyscript{I}}^{(2),T}\) are direct consequences of Lemma 24. \(\square \)
Given Lemmas 23 and 25, we now turn to
Proof of Theorem 21
Consider a smooth approximation \((Y^n)_{n \ge 1}\) of \(Y\) constructed by spatial convolution, as in (72). Following (75) and (76), we may split \({\fancyscript{I}}^{n,T}\) accordingly, into \({\fancyscript{I}}^{n,T} = {\fancyscript{I}}^{n,(1),T} + {\fancyscript{I}}^{n,(2),T}\). We then notice that each \(Y^{n}\) satisfies \(\kappa _{\alpha ,\chi }(Y^n) \le c\kappa \) and satisfies (74) with \(\kappa \) replaced by \(c\kappa \), for \(c\) independent of \(n\). If \(\alpha > 1/2\), we can always choose \(\nu =0\) and \(\mu =\alpha \) in (74), in which case, by Lemmas 23 and 25, the first line in (73) is satisfied with \(\chi ' = \chi \). If \(\alpha \le 1/2\), we must have \(2 \nu +\mu > 1-\alpha \ge \alpha \) so that \(\alpha + 2\nu +\mu \ge 2\alpha \). By Lemmas 23 and 25, the first line in (73) is satisfied with \(2\chi ' = 2\chi + (2\nu +\mu -\alpha )\). Since the value of \(\nu \) can be arbitrarily decreased provided that \(2 \nu + \mu > 1-\alpha \) still holds true, we deduce that the first line in (73) is satisfied for any \(\chi ' > \chi + (1/2 - \alpha )_{+}\).
It thus remains to check the second line in (73). We first notice that we can pass to the limit in the formula (78) for \({\fancyscript{I}}^{n,(1),T}\), replacing therein \(Z^{(1),T}\) by \(Z^{n,(1),T}\) and \(Y\) by \(Y^n\). Obviously, the limit is \({\fancyscript{I}}^{(1),T}\) (whatever the choice of the smooth approximation is). Following the proof of Lemma 20, the convergence is uniform on any \([0,T] \times [-a,a]\), \(a \ge 1\), which means that
Passing to the limit in (77), \({\fancyscript{I}}^{(1),T}\) satisfies (77). Combining with (80), we deduce, as in (71), that the second line in (73) holds with \({\fancyscript{I}}^{T} - {\fancyscript{I}}^{n,T}\) replaced by \({\fancyscript{I}}^{(1),T} - {\fancyscript{I}}^{n,(1),T}\).
In order to complete the proof, we must prove the second line in (73), but with \({\fancyscript{I}}^{T} - {\fancyscript{I}}^{n,T}\) replaced by \({\fancyscript{I}}^{(2),T} - {\fancyscript{I}}^{n,(2),T}\). We have the decomposition
We start with the second term on the right-hand side. For any \(\alpha '<\alpha \), we know that, locally, the \(\alpha '\)-Hölder norm of \(Y - Y^n\) in space tends to \(0\). Repeating the proof of Lemma 25, we deduce that, locally, the \(2\alpha '\)-Hölder semi-norm of this second term tends to \(0\). In order to prove the same result for the first term, it suffices to notice that, for any pair \((\nu ',\mu ')\) with \(\nu ' \le \nu \) and \(\mu ' \le \mu \), one of the two inequalities being strict, the difference \(Y^n - Y\) satisfies (74) with a constant \(\kappa \) that may depend on \(a\) but that tends to \(0\) as \(n\) tends to \(\infty \). Therefore, the \((2\nu '+\mu ')\)-Hölder norm of the integrand in the first term tends to \(0\).
If \((\tilde{Y}^n)_{n \ge 1}\) is another approximation, also constructed by convolution, we can prove in the same way that the difference \({\fancyscript{I}}^{n,(2),T} -\tilde{\fancyscript{I}}^{n,(2),T}\) tends to \(0\), where \(\tilde{\fancyscript{I}}^{n,(2),T}\) is associated with \(\tilde{Y}^n\). Therefore, \(({\fancyscript{I}}^{n,(2),T})_{n \ge 1}\) and \((\tilde{\fancyscript{I}}^{n,(2),T})_{n \ge 1}\) have the same limit. The same holds when \(\alpha >1/2\) (with \(\nu =0\) and \(\mu = \alpha \)) and the construction of \((\tilde{Y}^n)_{n \ge 1}\) is arbitrary, since, in that case, \((\tilde{Y}^n)_{n \ge 1}\) necessarily satisfies (74), uniformly in \(n \ge 1\). \(\square \)
5.3 Proof of Theorem 22
The proof is divided into several steps. The first one is to prove a generalization of Kolmogorov’s well-known Hölder continuity criterion.
Theorem 26
Let \({\mathcal Q}\) be a countable set and \((R_{L} : [-1,1]^2 \times \Xi \ni (x,y,\xi ) \mapsto R_{L}(x,y)(\xi ) \in {\mathbb {R}})_{L \in {\mathcal Q}}\) be a family of random fields on the space \((\Xi ,{\mathcal G},{\mathbf {P}})\), satisfying, for some \(p \ge 1\), some \(C,\beta ,\gamma ,\gamma _{1},\gamma _{2} >0\), some random variable \(\zeta \), all \(a \ge 1\) and all \(x,y,z \in [-a,a]\), \(x<y<z\),
Then, for any \(L \in {\mathcal Q}\) and \(x,y \in {\mathbb {R}}\), we can redefine \(R_{L}(x,y)\) on a \({\mathbf {P}}\) null event, and, for any \(\chi >1/p\) and \(0 < \varsigma < \min (\gamma _{1}+\gamma _{2},\beta /p)\), we can find a constant \(c:=c(\varsigma ,\chi ,\gamma _{1},\gamma _{2},\beta ,p)\) and a non-negative random variable \(\zeta '\), with \({\mathbb {E}}[\vert \zeta '\vert ^p] < c C\), such that, for all \(a \ge 1\),
The result remains true when \({\mathcal Q}\) is a separable metric space and, for any \(x,y \in {\mathbb {R}}\), the mapping \({\mathcal Q} \ni L \mapsto R_{L}(x,y)\) is almost-surely continuous.
Proof
In the case \(a =1\), (82) can be proved by adapting the proof of the standard version of Kolmogorov’s criterion. In order to get the result for any \(a \ge 1\), we can fix \(a \in {\mathbb N}{\setminus }\{0\}\) and then apply the result on \([-1,1]^2\) to the family \((R_{L} : [-1,1]^2 \times \Xi \ni (x,y,\xi ) \mapsto R_{L}(a x,a y)(\xi ) )_{L \in {\mathcal Q}}\). It satisfies (81) with \(C a^{p\gamma }\) replaced by \(C a^{1+\beta +p\gamma }\) in the first line and \(\zeta a^{\gamma }\) replaced by \(\zeta a^{\gamma _{1}+\gamma _{2}+\gamma }\) in the second line. Therefore, for any \(\varsigma \in (0,\min (\gamma _{1}+\gamma _{2},\beta /p))\), we can find a constant \(C'\), independent of \(a\), and a variable \(\zeta _{a}'\) (which may depend on \(a\)), such that (up to a redefinition of each \(R_{L}(a x,a y)\) on a \({\mathbf {P}}\) null event)
with \({\mathbb {E}}[\vert \zeta _{a}' \vert ^p] \le C'\). Choose \(\chi >1/p\) and let \(\Gamma := \sup _{a \in {\mathbb N}{\setminus }\{0\}} [ a^{-\chi } \zeta _a']\). Then, for another constant \(C''>0\), \({\mathbb {E}}[\vert \Gamma \vert ^p] \le C' \sum _{a \ge 1} a^{-p\chi } \le C''\). We have

When \({\mathcal Q}\) is a separable metric space, we consider a countable dense subset \(\hat{\mathcal Q}\). For any realization in an event of probability one, for any \(x,y \in {\mathbb {Q}}\), the map \(\mathcal Q \ni L \mapsto R_{L}(x,y)\) is continuous, and, by the first part, the maps \(({\mathbb {R}}^2 \ni (x,y) \mapsto R_{L}(x,y))_{L \in \hat{\mathcal Q}}\) satisfy (82) and are thus uniformly continuous on compact sets. With probability one, we can extend \(\hat{\mathcal Q} \times {\mathbb {Q}}^2 \ni (L,x,y) \mapsto R_{L}(x,y)\) into a continuous mapping on \({\mathcal Q} \times {\mathbb {R}}^2\), which satisfies (82). \(\square \)
5.3.1 Regularity of \(Y^{T_{0}}\)
We start with:
Lemma 27
There exists \(\Xi ^\star \in {\mathcal G}\), with \({\mathbf {P}}(\Xi ^\star )=1\), such that, on \(\Xi ^\star \), for all \(\alpha <1/2\), \(\chi >0\), the map \([0,T_{0}] \times {\mathbb {R}}\ni (t,x) \mapsto Y_{t}^{T_{0}}(x)\) is continuous and satisfies \(\kappa _{\alpha ,\chi }(Y^{T_{0}})<\infty \). Moreover, \({\mathbb {E}}[(\kappa _{\alpha ,\chi }(Y^{T_{0}}))^p] < \infty \) for all \(\alpha <1/2\), \(\chi >0\) and \(p\ge 1\).
Continuity of \(Y^{T_{0}}\) is a well-known fact, which follows from Kolmogorov’s criterion. Letting \({\mathcal {D}}_{T_{0}} = \{ (t,s) \in [0,T_{0}]^2 : t < s\}\), the almost sure finiteness of \(\kappa _{\alpha ,\chi }(Y^{T_{0}})\) is a consequence of the following result:
Lemma 28
Let \({\mathcal K} : {\mathcal {D}}_{T_{0}} \times {\mathbb {R}}\times \Xi \ni ((t,s),y,\xi ) \mapsto {\mathcal K}_{t}(s,y)(\xi ) \in {\mathbb {R}}\) be a random function, continuous in \((t,s,y)\) for any \(\xi \) and differentiable in \(t\) for any \((s,y,\xi )\), such that \({\mathcal K}_{t}(s,y)\) is measurable with respect to the \(\sigma \)-field \({\mathcal G}_{s}^{T_{0}}: = \sigma (\zeta (u,y) - \zeta (u,x) - \zeta (s,y) + \zeta (s,x), \ s \le u \le T_{0}, \ x,y \in {\mathbb {R}})\). Assume that there exist a constant \(\epsilon >0\) and a non-negative random variable \(\kappa \), with \({\mathbb {E}}[\kappa ^q]<\infty \) for any \(q \ge 1\), such that, \({\mathbb {P}}\)-almost surely, for any \(t \le s \le T_{0}\),
Then, letting \({\mathcal I}_{t}^T\) be the backward stochastic Itô integral \(\int _{t}^T \int _{{\mathbb {R}}} {\mathcal K}_{t}(r,u) \mathrm{d}\zeta (r,u)\), the quantity \(\sup _{0 \le t \le T \le T_{0}} \vert {\mathcal I}_{t}^T \vert \) is a random variable and, for any \(p \ge 1\), we can find a constant \(c_{p}\), only depending upon \(p\) and \(T_{0}\), such that \({\mathbb {E}} [ \sup _{0 \le t \le T \le T_{0}} \vert {\mathcal I}_t^T \vert ^p ]^{1/p} \le c_{p}{\mathbb {E}}[\kappa ^{p/2}]^{1/p}\).
Proof of Lemma 28
For any \(t \le t' \le t'+\delta \le T \le T_{0}\), \({\mathcal I}_t^T - {\mathcal I}_{t'}^{T}\) is equal to
By square integrability of \({\mathcal K}\) in \((s,y)\), \({\mathcal I}_{t}^{T}\) is continuous in \(T\), and we can take the supremum over \(T \in [t'+\delta ,T_{0}]\). Writing \({\mathcal K}_{t'}-{\mathcal K}_{t}= \int _{t}^{t'} \partial _{t} {\mathcal K}_{r} \mathrm{d}r\), we get that, for any \(p \ge 1\),
the first term on the right-hand side being obtained by the Burkholder–Davis–Gundy inequality and the constant \(c_{p}\) only depending upon \(p\). Using the bounds (83), we get
Choosing \(\delta = t'-t\) and modifying \(c_{p}\) if necessary, we can bound the right-hand side by \( c_{p} {\mathbb {E}}[\kappa ^{p/2}]^{1/p}\vert t'-t \vert ^{\epsilon /2}.\) We easily get a similar bound when the supremum is taken over \(T \in [t,(t'+\delta ) \wedge T_{0}]\), with the convention that \({\mathcal I}_{t'}^T = 0\) when \(t'>T\). We deduce that
The result follows from Kolmogorov’s criterion. \(\square \)
Proof of Lemma 27
Lemma 28 applies, in the proof of Lemma 27, with \({\mathcal K}_{t}(s,y) = p_{s-t}(x-y)-p_{s-t}(x'-y)\) and thus \({\mathcal I}_{t}^T = Y_{t}^T(x)- Y_{t}^T(x')\), for \(x,x' \in {\mathbb {R}}\). We have two bounds for \(\int _{{\mathbb {R}}} \vert {\mathcal K}_{t}(s,y) \vert ^2 \mathrm{d}y\). The first one is \(\int _{{\mathbb {R}}} \vert {\mathcal K}_{t}(s,y) \vert ^2 \mathrm{d}y \le C (s-t)^{-1/2}\) and the second one is \(\int _{{\mathbb {R}}} \vert {\mathcal K}_{t}(s,y) \vert ^2 \mathrm{d}y \le C (s-t)^{-3/2} \vert x'-x\vert ^2\). By interpolation, we have, for any \(\alpha \in (0,1)\), \(\int _{{\mathbb {R}}} \vert {\mathcal K}_{t}(s,y) \vert ^2 \mathrm{d}y \le C (s-t)^{-(1/2+\alpha )} \vert x'-x\vert ^{2\alpha }\). A similar argument applies to \(\int _{{\mathbb {R}}} \vert \partial _{t} {\mathcal K}_{t}(s,y) \vert ^2 \mathrm{d}y\). We can bound it by \(C(s-t)^{-5/2}\) and by \(C(s-t)^{-7/2} \vert x'-x\vert ^2\), and thus by \(C (s-t)^{-(5/2+\alpha )} \vert x'-x\vert ^{2\alpha }\). The bound in the statement of Lemma 28 holds true with \(\kappa =C \vert x'-x\vert ^{2\alpha }\) and \(\epsilon = (1/2-\alpha )\). Therefore, for \(p \ge 1\) and \(\alpha \in (0,1/2)\), we can find a constant \(C_{p}\) such that
The conclusion now follows from Theorem 26. We apply it with \(R_{t,T}(x,y) = Y_{t}^T(y) - Y_{t}^T(x)\), \(p\) as large as desired, \(\beta = p \alpha -1\), \(\zeta =0\), \(\gamma =0\), \(\gamma _{1}=\gamma _{2}=1\), \(L=(t,T)\) and \({\mathcal {Q}}={\mathcal {D}}_{T_{0}}\). We get that, for any \(\chi >0\), \(\alpha \in (0,1/2)\) and \(p\ge 1\), \({\mathbb {E}}[(\sup _{T\le T_0}\kappa _{\alpha ,\chi }(Y^{T}))^p]< \infty \) and there is an event \(\Xi ^{\chi ,\alpha ,\star }\), of probability \(1\), on which \(\sup _{T\le T_0}\kappa _{\alpha ,\chi }(Y^{T})<\infty \). The proof is completed by letting \(\Xi ^\star = \cap _{\chi ,\alpha \in {\mathbb {Q}},\chi >0,\alpha \in (0,1/2)} \Xi ^{\chi ,\alpha ,\star }\). (Note that the result is actually stronger than the claim in the statement. The reason why we included a supremum over \(T\) in the statement of Lemma 28 will become clear in the last part of the proof of Theorem 22.) \(\square \)
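The interpolation used twice in the proof of Lemma 27 is the elementary inequality \(\min (A,B) \le A^{1-\alpha } B^{\alpha }\), valid for \(A,B>0\) and \(\alpha \in (0,1)\). With \(A = C(s-t)^{-1/2}\) and \(B = C(s-t)^{-3/2} \vert x'-x\vert ^2\), it gives
\[
\int _{{\mathbb {R}}} \vert {\mathcal K}_{t}(s,y) \vert ^2 \mathrm{d}y \le A^{1-\alpha } B^{\alpha } = C (s-t)^{-(1/2+\alpha )} \vert x'-x \vert ^{2\alpha },
\]
since \((1-\alpha )(-1/2) + \alpha (-3/2) = -(1/2+\alpha )\); similarly, with \((A,B) = (C(s-t)^{-5/2}, C(s-t)^{-7/2} \vert x'-x\vert ^2)\), the exponent for \(\partial _{t} {\mathcal K}\) becomes \(-(5/2+\alpha )\).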
5.3.2 Reducing the proof to the case \(Y^{(b)} \equiv 0\)
The next step is to show:
Lemma 29
In order to prove Theorem 22, we can assume \(Y^{(b)} \equiv 0\).
Proof
First step. If \(Y^{(b)} \not \equiv 0\), we consider \(\Xi ^\star \) and then \(\chi \in (0,\chi _{b}]\) and \(\alpha \in (1-\alpha _{b},1/2)\) as in Lemma 27. For a realization in \(\Xi ^\star \) and for a smooth kernel \(\rho \), \(\rho \) and its derivatives being at most of polynomial decay, we let, for every integer \(n \ge 1\), \(Y^{n,T_0}\) be the \(n\)th approximation of \(Y^{T_0}\) constructed by convolution, see (72). Clearly, \(Y^{n,T_0}_{t}(x) = \int _{t}^{T_{0}} \int _{{\mathbb {R}}} p_{s-t}(x-y) \mathrm{d}\zeta ^n(s,y)\), where \(\zeta ^n(s,y) = n \int _{0}^s \int _{{\mathbb {R}}} \rho (n(y-u)) \mathrm{d}\zeta (r,u)\), proving that \(Y^{n,T_0}=Y^{n \rho (n \cdot ),T_0}\). By Lemma 27, (71) holds true (with \((Y,Y^n)\) replaced by \((Y^{T_{0}},Y^{n,T_{0}})\)).
Given a realization in \(\Xi ^\star \) and the path \(Y^{(b)}\), we consider an arbitrary smooth approximation \((Y^{n,(b)})_{n \ge 1}\) of \(Y^{(b)}\), so that \((Y^n := Y^{n,T_0} + Y^{n,(b)})_{n \ge 1}\) is a smooth approximation of \(Y\).
Second step. Letting
we may split, at least formally, \({\fancyscript{I}}^{T}_{t}(x,x')\) into

where
When \((i,j) \not = (T_0,T_0)\), the cross-integrals \({\fancyscript{I}}^{(i,j),T}_{t}(x,x')\) can be constructed as Young integrals by means of Lemma 24. Indeed, when \(i=(b)\), the path \(Z^{(b),T}_{t}\) has the same regularity as \(Y^{(b)}\), see Lemma 19, so that the sum of the Hölder exponents of the two curves involved in the definition of the integral is always greater than \(\alpha +\alpha _{b}>1\) when at least one of the two indices \(i\) or \(j\) is equal to \((b)\). Lemmas 24 and 25 directly say that \(\vert {\fancyscript{I}}^{(i,j),T}_{t}(x,x') \vert \le C a^{2(\chi _{b}+\alpha _{b}-\alpha )} \vert x'-x\vert ^{2\alpha }\) when \(x,x' \in [-a,a]\) with \(a \ge 1\), the constant \(C\) being random (as it depends upon the realization of \(\kappa _{\alpha ,\chi }(Y)\)). Denoting by \(Z^{n,i,T}\) and \({\fancyscript{I}}^{n,(i,j),T}\) the quantities associated with the smooth approximation \(Y^{n}\), \({\fancyscript{I}}^{n,(i,j),T}_{t}(x,x')\) satisfies a similar bound, with the same \(C\). By bilinearity of Young’s integral in \((f,g)\), see Lemma 24, it is clear that, for all \(T \in [0,T_{0}]\) and \(a \ge 1\), \(\sup _{0\le t \le T}\Vert \fancyscript{I}_t^{(i,j),T}-\fancyscript{I}_t^{n,(i,j),T}\Vert _{2\alpha '}^{[-a,a]} \) tends to \(0\) as \(n\) tends to \(\infty \), when \((i,j) \not = (T_0,T_0)\) and \(\alpha '<\alpha \). \(\square \)
5.3.3 Proof of Theorem 22 when \(Y^{(b)} \equiv 0\)
Lemma 30
Theorem 22 is true when \(Y^{(b)} \equiv 0\).
Proof
First step. The point is to construct \({\fancyscript{I}}^{T}\), which is equal to \({\fancyscript{I}}^{(T_0,T_0),T}\) since \(Y^{(b)} \equiv 0\). With \(\rho \) as in the statement, recall \(Y^{\rho ,T}_{t}(x) = \int _{t}^{T} \int _{{\mathbb {R}}} p_{s-t}(x-y) \mathrm{d}\zeta ^\rho (s,y) = \int _{t}^T P_{s-t} \rho (x-u) \mathrm{d}\zeta (s,u)\). With these notations, the smooth approximation \(Y^{n,T_0}\) considered in the first step of the proof of Lemma 29 is obtained by replacing \(\rho \) by \(n \rho (n \cdot )\) and \(Y^{T_0}\) by replacing \(\rho \) by the Dirac mass \(\delta _{0}\) at \(0\). With \(Y^{\rho ,T_{0}}\), we associate a cross-integral as in (69). We let \(Z^{\rho ,T}_{t}(x) := \int _{t}^T \partial ^2_{x} P_{s-t} Y_{s}^{\rho ,T_{0}}(x) \mathrm{d}s\) and
Using the identity \(Y_t^{\rho ,T_{0}}=P_{s-t}Y_s^{\rho ,T_{0}}+Y_t^{\rho ,s}\), for \(0\le t\le s \le T_0\), this leads to \({\fancyscript{I}}_{t}^{\rho ,T}(x,x')={\fancyscript{I}}_{t}^{\rho ,(1),T}(x,x')+{\fancyscript{I}}_{t}^{\rho ,(2),T}(x,x')\), with
With these notations, \({\fancyscript{I}}^{n,T}_{t}(x,x')\), the cross-integral corresponding to \(Y^{n,T_0}\), is obtained by replacing \(\rho \) by \(n \rho (n \cdot )\), and \({\fancyscript{I}}^{T}_{t}(x,x')\) by replacing (at least at a formal level) \(\rho \) by \(\delta _0\).
Second step. Direct integration yields
Imitating the proof of Lemma 19, it is now easy to see that, for \(x,x' \in [-a,a]\), with \(a \ge 1\), \(\vert {\fancyscript{I}}_{t}^{\rho ,(1),T}(x,x') \vert \le C \kappa _{\rho }^2 a^{2 \chi } \vert x - x'\vert ^{2 \alpha }\), for a deterministic constant \(C\), independent of \(\rho \), and with \(\kappa _{\rho } := \kappa _{(\alpha +1/2)/2,\chi }(Y^{\rho ,0})\) (the reason for using \((\alpha +1/2)/2\) will be explained below). The computations also apply when \(\rho \) is replaced by \(\delta _{0}\). Since \(Y^{\rho ,0}\) is obtained by convolving \(Y^{T_0}\) with \(\rho \), we can bound \(\kappa _{\rho }\) by \(c_{\rho } \kappa \), where \(c_{\rho }\) is a deterministic constant that may depend on the decay of \(\rho \). When \(\rho \) is replaced by \(n \rho (n \cdot )\), the constants \(c_{n\rho (n\cdot )}\) can be bounded uniformly in \(n\), so that, for any \(n \ge 1\), \(\vert {\fancyscript{I}}_{t}^{n \rho (n\cdot ),(1),T}(x,x') \vert \le C_\rho \kappa ^2 a^{2 \chi } \vert x - x'\vert ^{2 \alpha }\).
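As a sketch (suppressing the time variable), the stability of the weighted Hölder constant under spatial convolution can be seen as follows; the weight \((1+\vert v\vert )^{\chi }\) below is our way of making explicit how the decay of \(\rho \) may enter the constant \(c_{\rho }\):

```latex
\vert (Y \ast \rho)(x') - (Y \ast \rho)(x) \vert
\;\le\; \int_{\mathbb{R}} \vert \rho(v)\vert \, \vert Y(x'-v) - Y(x-v)\vert \,\mathrm{d}v
\;\le\; \kappa \, \vert x'-x\vert^{\alpha} \int_{\mathbb{R}} \vert \rho(v)\vert \,(a+\vert v\vert)^{\chi}\,\mathrm{d}v,
```

for \(x,x' \in [-a,a]\). Since \((a+\vert v\vert )^{\chi } \le a^{\chi }(1+\vert v\vert )^{\chi }\) for \(a \ge 1\), one may take \(c_{\rho } = \int \vert \rho (v)\vert (1+\vert v\vert )^{\chi } \mathrm{d}v\); replacing \(\rho \) by \(n \rho (n \cdot )\) only decreases this quantity, whence the uniform bound in \(n\).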
The bilinearity of the cross-integral shows that \(\sup _{0 \le t \le T \le T_{0}} \sup _{x,x' \in [-a,a]} | {\fancyscript{I}}_{t}^{n \rho (n\cdot ),(1),T}(x,x') - {\fancyscript{I}}_{t}^{\delta _0,(1),T}(x,x') |\) tends to \(0\) as \(n\) tends to \(\infty \). By (71), the convergence holds in Hölder norm.
Third step. We now study \({\fancyscript{I}}_{t}^{\rho ,(2),T}(x,x')\) for \(x,x' \in [-a,a]\), \(a \ge 1\). It is equal to

By integration by parts, \({\fancyscript{I}}_{t}^{\rho ,(2),T}(x,x') =\int _{t}^T \int _{\mathbb {R}}[{\mathcal K}_{t}^{1} - {\mathcal K}_{t}^{2}](r,u) \mathrm{d}\zeta (r,u)\) with
We start with \({\mathcal K}^2\). For \(x,x' \in [-a,a]\), \( \left| \int _{r}^T\partial _{x}^3 P_{s-t} Y_{s}^{\rho ,T_0}(y) \mathrm{d}s \right| \le C \kappa _{\rho } a^{\chi } \int _{r}^T (s-t)^{-3/2+ \alpha /2} \mathrm{d}s \le C \kappa _{\rho } a^{\chi }(r-t)^{-1/2+\alpha /2}\), so that
By Gaussian convolution, the integral is equal to \(\int _{\mathbb {R}}\rho (v)\int _{x}^{x'}\int _{x}^{x'}P_{2(r-t)}\rho (y-y'+v)\mathrm{d}y \mathrm{d}y'\mathrm{d}v\). It is bounded either by \(\vert x'-x\vert \) or by \((r-t)^{-1/2} \vert x'-x\vert ^2\). By interpolation, it is less than \( (r-t)^{-\alpha } \vert x'-x\vert ^{1+2\alpha }\). Replacing \(\alpha \) by \((\alpha +1/2)/2\) in (89) and only in (89) (which is always possible since \(\alpha \) can be chosen as close to \(1/2\) as needed), we deduce that
We now reproduce the same analysis, with \(\partial _{t} {\mathcal K}^2_{t}(r,u)\) instead of \({\mathcal K}^2_{t}(r,u)\). Since \(\vert \partial _{t} p_{r-t}(x) \vert \le c (r-t)^{-1} p_{c(r-t)}(x)\), this amounts to replacing \((r-t)^{-1+(\alpha +1/2)/2}\) by \((r-t)^{-3+(\alpha +1/2)/2}\) in the above computation, so that, for \(t \le t'\),
The analysis of \({\mathcal K}^1\) may be handled in the same way. It is actually much easier since \(\int _{r}^T [ \partial _x^2P_{s-t}Y^{\rho ,T_0}_s(x') -\partial _x^2P_{s-t}Y^{\rho ,T_0}_s(x)] \mathrm{d}s\) has the same structure as \(Z^{\rho ,T}_{t}(x')- Z^{\rho ,T}_{t}(x)\) and can be bounded by \(C \kappa _{\rho } a^{\chi } \vert x'-x\vert ^{\alpha }\). We thus deduce that (90) and (91) hold true with \({\mathcal K}^2\) replaced by \({\mathcal K}^1-{\mathcal K}^2\). By Lemma 28 with \(\epsilon =(1/2-\alpha )/2\),
where \(c_{p}\) is a constant independent of \(\rho \). Repeating the analysis, (92) also holds when \(\rho \) is replaced by \(\delta _{0}\).
Fourth step. The goal is to prove an analog of (92), but for the difference \({\fancyscript{I}}_t^{\rho ,(2),T}(x,x') -{\fancyscript{I}}_t^{\delta _{0},(2),T}(x,x')\). Letting \(\kappa _{\rho ,\delta _{0}} := \kappa _{(1/2+\alpha )/2,\chi }(Y^{\rho ,T_0} - Y^{T_0})\) and \(\Vert \rho \Vert _{1} := \int _{{\mathbb {R}}} \vert v \vert \rho (v) \mathrm{d}v\), we claim
The proof is as follows. By bilinearity of the cross-integral, \({\fancyscript{I}}^{\rho ,(2),T} - {\fancyscript{I}}^{\delta _0,(2),T}\) reads as the sum of two terms of the same type as \({\fancyscript{I}}_{t}^{\rho ,(2),T}\), each involving a modification in the definition (88) of \({\mathcal K}_{t}^1(r,u)\) and \({\mathcal K}_{t}^2(r,u)\). The first modification consists in replacing \(Y^{\rho ,T_0}\) by \(Y^{\rho ,T_0}-Y^{T_0}\) and the second one in replacing \(Y^{\rho ,T_0}\) by \(Y^{T_0}\) and then \(P_{r-t}\rho \) by \(P_{r-t}\rho -p_{r-t}\) (or equivalently \(\rho \) by \(\rho - \delta _{0}\)). The first modification contributes \(c_{p} a^{2\chi } {\mathbb {E}}[ \vert \kappa _{\rho ,\delta _{0}} \vert ^p]^{1/p} \vert x'-x\vert ^{1/2+\alpha }\) to (93) (compare with (92)). Concerning the second modification, when \(\rho \) is replaced by \(\rho -\delta _{0}\) in (89), the quintuple integral becomes (after convolution in \(u\))

Since \(\vert \partial _{x} p_{r-t}(y) \vert \le c (r-t)^{-1/2} p_{c(r-t)}(y)\), the integrals on the square \([x,x'] \times [x,x']\) are bounded by \((r-t)^{-\alpha } \vert x' -x \vert ^{1+2\alpha }\) as in the proof of (90) and by \(C(r-t)^{-\alpha -1/2} \vert x' -x \vert ^{1+2\alpha } \vert v\vert \). By interpolation, it is less than \(C(r-t)^{-\alpha -\eta /2} \vert x' -x \vert ^{1+2\alpha } \vert v\vert ^\eta \), for any \(\eta \in (0,1)\). Choosing \(\eta =(1/2-\alpha )/2\), the right-hand side in (90) becomes \(C \kappa ^2 a^{2\chi } \vert x'-x\vert ^{ 1+ 2 \alpha }(r-t)^{-1+(1/2-\alpha )/4} \Vert \rho \Vert _{1}^{(1/2-\alpha )/2}\) (with \(\kappa :=\kappa _{(1/2+\alpha )/2,\chi }(Y^{T_0})\)). Similarly, the right-hand side in (91) becomes \(C \kappa ^2 a^{2\chi } (r-t)^{-3+(1/2-\alpha )/4} \vert x'-x\vert ^{1+2\alpha } \Vert \rho \Vert _{1}^{(1/2-\alpha )/2}\). Playing the same game with \({\mathcal K}^1\), we get (93).
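The interpolation step used here (and earlier, in the derivation of (90)) is the elementary inequality \(\min (A,B) \le A^{1-\theta } B^{\theta }\) for \(A,B \ge 0\) and \(\theta \in [0,1]\). For instance, for a quantity \(I\) bounded both by \(C \vert x'-x\vert \) and by \(C (r-t)^{-1/2} \vert x'-x\vert ^{2}\), the choice \(\theta = 2\alpha \in (0,1)\) (recall \(\alpha < 1/2\)) gives

```latex
I \;\le\; \bigl( C \vert x'-x\vert \bigr)^{1-2\alpha}\,
\bigl( C (r-t)^{-1/2} \vert x'-x\vert^{2} \bigr)^{2\alpha}
\;=\; C\,(r-t)^{-\alpha}\,\vert x'-x\vert^{1+2\alpha}.
```

The bound involving \(\vert v \vert ^{\eta }\) above is obtained in the same way, with \(\theta = \eta \).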
Fifth step. We now replace \(\rho \) by \(n \rho (n \cdot )\). From the second step, we know that we can find a constant \(c_{\rho }'\) such that \(\kappa _{n \rho (n \cdot )} \le c_{\rho }' \kappa \) for any \(n \ge 1\). Similarly, for any \(\alpha '>(1/2+\alpha )/2\), we can find a deterministic constant \(c\) such that
Now, \( \sup _{a \ge 1} (\Vert Y^{\rho ,T_0}-Y^{T_0} \Vert _{\infty }^{[-a,a]} /a^\chi ) \le c_{\rho }' \kappa \Vert \rho \Vert _{1}^{(1/2+\alpha )/2}\), for a possibly new value of the constant \(c_{\rho }'\). It remains true with the same constant \(c_{\rho }'\) when \(\rho \) is replaced by \(n \rho (n \cdot )\), so that \({\mathbb {E}}[\vert \kappa _{n \rho (n \cdot ),\delta _{0}} \vert ^p]^{1/p} \le C_{p} \Vert n \rho (n \cdot ) \Vert _{1}^{\eta } = C_{p} n^{-\eta } \Vert \rho \Vert _{1}^\eta \), for some \(\eta >0\). Modifying if necessary the value of \(\eta \), we deduce from (93) that
Conclusion. We let \(\Gamma = \sup _{n \ge 1} [n^{\eta /2} \sup _{0 \le t \le T \le T_{0}} \vert {\fancyscript{I}}_t^{n \rho (n\cdot ),(2),T}(x,x') -{\fancyscript{I}}_t^{\delta _{0},(2),T}(x,x') \vert ]\). We have \({\mathbb {E}}[ \vert \Gamma \vert ^p] \le \sum _{n \ge 1} n^{p \eta /2} {\mathbb {E}}[ \vert {\fancyscript{I}}_t^{n \rho (n \cdot ),(2),T}(x,x') -{\fancyscript{I}}_t^{\delta _{0},(2),T}(x,x') \vert ^p]\), which is less than
when \(\eta p >2\). We deduce, for \(x,x' \in [-a,a]\),
We aim at applying Theorem 26 with \(L=(n,t,T)\) and \(R_{L}(x,x') = n^{\eta /2} ( {\fancyscript{I}}_t^{n \rho (n\cdot ),(2),T}(x,x') -{\fancyscript{I}}_t^{\delta _{0},(2),T}(x,x') )\), the issue being to control \(R_{L}(x,x')+R_{L}(x',x'') - R_{L}(x,x'')\). From (87),
All the terms converge in \(L^p\) and the same relationship holds with \(n \rho (n \cdot )\) replaced by \(\delta _{0}\). Taking the difference between the relations for \(n \rho (n\cdot )\) and for \(\delta _{0}\), we get an explicit expression for \(R_{L}(x,x')+R_{L}(x',x'') - R_{L}(x,x'')\). All the terms involved are explicitly controlled. By the same method as in the second step, \(\vert R_{L}(x,x')+R_{L}(x',x'') - R_{L}(x,x'') \vert \le \zeta a^{2\chi } \vert x'-x\vert ^{1/2+\alpha }\), for a random variable \(\zeta \). By Theorem 26, for any \(\chi '>\chi \) and \(\alpha ' \in (0,(1/2+\alpha )/2)\), we can find a random variable \(\zeta '\) such that \(\vert {\fancyscript{I}}_t^{n \rho (n\cdot ),(2),T}(x,x') -{\fancyscript{I}}_t^{\delta _{0},(2),T}(x,x') \vert \le \zeta ' n^{-\eta /2} a^{2\chi '} \vert x'-x\vert ^{\alpha '}\), for all \(x,x' \in [-a,a]\), \(a \ge 1\). As \( {\fancyscript{I}}_t^{n,T}= {\fancyscript{I}}_t^{n \rho (n\cdot ),(1),T}+ {\fancyscript{I}}_t^{n \rho (n\cdot ),(2),T}\) and \( {\fancyscript{I}}_t^{T}= {\fancyscript{I}}_t^{\delta _0,(1),T}+ {\fancyscript{I}}_t^{\delta _0,(2),T}\), this last bound combined with the conclusion of the second step proves that the assumptions of Lemma 20 are satisfied. \(\square \)
6 Connection with the KPZ equation
The KPZ equation was introduced by Kardar et al. in [22] in order to model the growth of a random surface subject to three phenomena: a diffusion effect, lateral growth and random deposition. It has the formal shape:
with \(0\) as initial condition, where \(\dot{\zeta }\) is a time-space white noise (that is, the time-space derivative of a Brownian sheet, defined on \((\Xi ,{\mathcal G},{\mathbf {P}})\) as discussed in Theorem 22). Unfortunately, the equation is ill-posed since the gradient \(\partial _{x} h\) does not exist as a true function, but as a distribution only.
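For the reader's convenience, the formal shape of (94) can be written out. The normalization below (with the factors \(1/2\)) is our guess, chosen for consistency with the half-Laplacian used throughout the paper; [22] and [19] use slightly different conventions for the constants:

```latex
\partial_{t} h_{t}(x) \;=\; \tfrac{1}{2}\,\partial^{2}_{x} h_{t}(x)
\;+\; \tfrac{1}{2}\,\bigl(\partial_{x} h_{t}(x)\bigr)^{2}
\;+\; \dot{\zeta}(t,x), \qquad h_{0} \equiv 0,
```

the renormalized version discussed below carrying an additional formal '\(-\infty \)' on the right-hand side.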
Two strategies have been developed so far to give a meaning to (94). The first one goes back to [4] and consists in linearizing the equation by means of the so-called Hopf–Cole exponential transformation. The second approach is due to Hairer [19] in the case when \(x\) is restricted to the torus (in which case \(\zeta \) is defined accordingly). Therein, the key point is to solve second-order PDEs driven by a distributional first-order term by means of rough paths theory, which is precisely the strategy we used in Sect. 3 to solve (14). The two interpretations coincide, but the resulting solution solves a renormalized version of (94), which reads (in a formal sense) as (94) with an additional ‘\(-\infty \)’ in the right-hand side. The renormalization must be understood as follows: When mollifying the noise (say \(\dot{\zeta }\) into \(\dot{\zeta }^n\)), Eq. (94) admits a solution, denoted by \(h^n\), but the sequence \((h^n)_{n \ge 1}\) is not expected to converge. To make it converge to the solution of (94), some ‘counterterm’ must be subtracted from the right-hand side of (94): This counterterm is a constant \(\gamma ^{n}\) depending upon \(n\), which tends to \(\infty \) with \(n\), thus explaining the additional ‘\(-\infty \)’.
6.1 Polymer measure on the torus
Below, we make use of the framework defined in [19]. This imposes two restrictions. The first one is that \(\zeta \) has to be defined on \([0,\infty ) \times {\mathbb {S}}^1\), where \({\mathbb {S}}^1\) is the 1d torus, which means that \(\dot{\zeta }\) is a cylindrical Wiener process on \(L^2({\mathbb {S}}^1)\). The second one is that the Fourier transform \(\hat{\rho }\) of the kernel \(\rho \) used to mollify the noise has to be even, compactly supported, smooth and non-increasing on \([0,\infty )\), in which case \(\rho \) is defined from its Fourier transform. In particular, \(\rho \) has polynomial decay of any order, but may fail to be positive. The mollified version \(\zeta ^n\) of \(\zeta \) is given by \(\zeta ^n(t,x) = \int _{0}^t \int _{{\mathbb {R}}} n\rho (n(x-y)) \mathrm{d}{\zeta }(s,y)\), with the convention that \(\int _{0}^t \int _{{\mathbb {R}}} \varphi (s,y) \mathrm{d}{\zeta }(s,y) = \sum _{k \in {\mathbb Z}} \int _{0}^t \int _{{\mathbb {S}}^1} \varphi (s,y+k) \mathrm{d}\zeta (s,y)\) if \(\int _{0}^t \int _{{\mathbb {S}}^1} \vert \sum _{k \in {\mathbb Z}} \varphi (s,y+k) \vert ^2 \mathrm{d}s \mathrm{d}y < \infty \).
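As a purely illustrative numerical sketch (our own construction, not the kernel used in [19]), one can build an admissible \(\rho \) by prescribing \(\hat{\rho }\) as a smooth, even, compactly supported bump, non-increasing on \([0,\infty )\) and normalized so that \(\hat{\rho }(0) = \int \rho = 1\); the resulting \(\rho \) decays fast but need not be positive:

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule along the last axis."""
    return 0.5 * np.sum((y[..., :-1] + y[..., 1:]) * np.diff(x), axis=-1)

def rho_hat(k):
    """Smooth, even, compactly supported bump: equal to 1 at k = 0, vanishing
    (with all its derivatives) for |k| >= 1, non-increasing on [0, infinity)."""
    k = np.abs(np.asarray(k, dtype=float))
    out = np.zeros_like(k)
    inside = k < 1.0
    out[inside] = np.exp(-k[inside] ** 2 / (1.0 - k[inside] ** 2))
    return out

def rho(x, n_k=4001):
    """Inverse Fourier transform: rho(x) = (1/pi) * int_0^1 rho_hat(k) cos(kx) dk."""
    k = np.linspace(0.0, 1.0, n_k)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return trapezoid(rho_hat(k)[None, :] * np.cos(x[:, None] * k[None, :]), k) / np.pi

x = np.linspace(-60.0, 60.0, 6001)
r = rho(x)
mass = trapezoid(r, x)  # close to rho_hat(0) = 1, up to a tiny tail error
```

Here the total mass is close to \(1\) because \(\hat\rho (0)=1\), and the smoothness of \(\hat\rho \) forces \(\rho \) to decay faster than any polynomial.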
Given \(T_{0} >0\) and \(n \ge 1\), we introduce the (random) polymer measure:
where \((B_{t})_{0 \le t \le T_{0}}\) is a Brownian motion under \({\mathbb {P}}\) (\((\Omega ,{\mathcal {A}},{\mathbb {P}})\) being distinct from \((\Xi ,{\mathcal G},{\mathbf {P}})\)), the symbol \(\sim \) indicating that the right-hand side is normalized in such a way that \({\mathbb {Q}}_{{\zeta }^n}\) is a probability measure. The polymer measure describes the law of a continuous random walk evolving in the periodic random environment \(\zeta ^n\). The factor \(\int _{0}^{T_{0}} \int _{{\mathbb {R}}} n \rho ( n(B_{T_{0}-s} - y )) \mathrm{d}\zeta (s,y)\) is sometimes written \(\int _{0}^{T_{0}} \dot{\zeta }^n(t,B_{T_{0}-t}) \mathrm{d}t\) or \(\int _{0}^{T_{0}} \dot{\zeta }^n(T_{0}-t,B_{t}) \mathrm{d}t\).
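To make the reweighting concrete, here is a toy Monte-Carlo sketch (the discretization, the variable names and the Gaussian stand-in for \(\dot{\zeta }^n\) are ours, not the paper's): discrete Brownian paths on the torus are weighted by the exponential of a discretized noise integral and then normalized into a probability, mimicking the definition of \({\mathbb {Q}}_{\zeta ^n}\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Discretization of [0, T0] x S^1 (unit-length torus represented by a periodic grid).
T0, n_t, n_x = 1.0, 200, 64
dt, dx = T0 / n_t, 1.0 / n_x

# Gaussian stand-in for the (mollified) noise density on the grid cells.
env = rng.normal(0.0, 1.0 / np.sqrt(dt * dx), size=(n_t, n_x))

# Sample M Brownian paths under P, wrapped onto the torus.
M = 5000
paths = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(M, n_t)), axis=1) % 1.0
cells = np.minimum((paths / dx).astype(int), n_x - 1)

# Discretized action  int_0^{T0} env(T0 - t, B_t) dt  along each path.
rev_times = np.arange(n_t)[::-1]
action = (env[rev_times[None, :], cells] * dt).sum(axis=1)

# Normalized weights: the (toy) polymer probabilities over the M sampled paths.
w = np.exp(action - action.max())  # subtract the max for numerical stability
q = w / w.sum()
```

The normalization step is exactly the role of the symbol \(\sim \) above: the exponential weights only define \({\mathbb {Q}}_{\zeta ^n}\) after division by their total mass.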
By applying the Itô–Wentzell formula to \((h^n_{T_{0}-t}(B_{t}))_{0 \le t \le T_{0}}\), we obtain, \({\mathbf {P}} \otimes {\mathbb {P}}\) a.s.,
proving, by Girsanov's theorem, that, \({\mathbf {P}}\) a.s., the dynamics of \((B_{t})_{0 \le t \le T_{0}}\) under \({\mathbb {Q}}_{{\zeta }^n}\) satisfy the SDE (1) with \(Y_{t}(x) = h^n_{T_{0}-t}(x)\) (the terms \(h^n_{T_{0}}(0)\) and \(\gamma ^n\) do not affect the definition of the polymer measure as they are absorbed into the normalization constant of the right-hand side).
The main challenging question is to define the limit of \({\mathbb {Q}}_{\zeta ^n}\) rigorously. The following theorem provides a new result in that direction:
Theorem 31
Consider the solution to the (normalized) KPZ equation (94) with \(0\) as initial condition and let \(Y_{t}(x):=h_{T_{0}-t}(x)\), for \((t,x) \in [0,T_{0}] \times {\mathbb {T}}\). Then, we can find an event \(\Xi ^\star \), with \({\mathbf {P}}(\Xi ^\star )=1\), such that, for any realization in \(\Xi ^\star \) and any \(T \in [0,T_{0}]\), the pair \(W^T = (Y,Z^T)\), with \(Z^T\) given by (13), may be lifted into a geometric rough path \({\varvec{W}}^T =(W^T,{\fancyscript{W}}^T)\) satisfying the conclusions of Lemma 20, with \((Y_{t}^n(x):=h^n_{T_{0}-t}(x))_{n \ge 1}\), for \((t,x) \in [0,T_{0}] \times {\mathbb {T}}\), as approximation sequence.
Moreover, for any realization \(\xi \in \Xi ^\star \), \({\mathbb {Q}}_{{\zeta }^n}\) converges towards the law (on \(\Omega \)) of the solution \((X_{t})_{0 \le t \le T_{0}}\) to (1) when driven by the trajectory \(Y\) associated with \(\xi \). The limit law is independent of the choice of \(\rho \) in the construction of \(h^n\) and reads as a rigorous interpretation of the (a priori ill-defined) polymer measure \({\mathbb {Q}}_{\zeta } \sim \exp (\int _{0}^{T_{0}} \dot{\zeta }(T_{0}-t,B_{t}) \mathrm{d}t) \cdot {\mathbb {P}}\) on the canonical space \({\mathcal {C}}([0,T_{0}],{\mathbb {T}})\).
Proof
It suffices to check the assumption of Lemma 20. To this end, recall from [19, Theorem 1.10] that, \({\mathbf {P}}\) a.s., \(h\) expands as \(Y^{\bullet } + h^b\), where \(Y^\bullet \) solves the stochastic heat equation for some initial condition \(Y^\bullet _{0} \in \cap _{\varepsilon >0}{\mathcal {C}}^{1/2-\varepsilon }({\mathbb {S}}^1)\) and \(h^b\) is a continuous remainder satisfying \(h^b_{t} \in \cap _{\varepsilon >0}{\mathcal {C}}^{1-\varepsilon }({\mathbb {S}}^1)\) for any \(t>0\) (the associated Hölder constant being uniform on any closed interval of \((0,T_{0}]\)). The point is thus to apply Theorem 22 (which easily extends to \({\mathbb {S}}^1\)) with
The fact that \(\rho \) may not be positive is not a problem as we can split it into \(\rho = \rho _{+}- \rho _{-}\) and then check that the results of Sect. 5 still apply with such a decomposition. Clearly, \(Y^{T_0}\) solves the backward stochastic heat equation with zero as terminal condition. Moreover, for any \(T< T_{0}\) and any \(\alpha _{b}<1\), [19, Theorem 1.10] ensures that, for \({\mathbf {P}}\) a.e. realization in \(\Xi ^\star \), \(\kappa _{\alpha _{b},0}((Y^{(b)}_{t})_{0 \le t \le T})\) is finite (here we can choose \(\chi _{b}=0\) as we work on \({\mathbb {S}}^1\)). Then, with the same notation as above, we know from [19] that, almost surely on \(\Xi ^\star \), \(\Vert h^n_{T_{0}-\cdot } - Y^{n \rho (n \cdot ),T_0} - Y^{(b)} \Vert _{0,\alpha _{b}}^{[0,T] \times {\mathbb {S}}^1}\) converges to \(0\) as \(n\) tends to \(\infty \). By Theorem 8 (and its proof) and Theorem 22, we deduce that, a.s. on \(\Xi ^\star \), the solution to the SDE (1) on \([0,T]\), when driven by \(h^n_{T_{0}-\cdot }\), converges to the solution of (1) driven by \(Y\). This completes the proof on any \([0,T] \subset [0,T_{0})\).
To get the convergence on the entire \([0,T_{0}]\), we must revisit [19] and control the Hölder norm (in \(x\)) of \(Y^{(b)}_{t}\) uniformly in \(t \in [0,T_{0}]\). The technical issue is that, in [19], the KPZ equation is solved by means of a fixed point argument that allows for irregular initial conditions. As the initial condition may be irregular, solutions exhibit a strong blow-up at the boundary, see [19, Proposition 4.3]. In [19], \(h\) is split into \(h_{t}(x) =u_{t}(x) + h^\star _{t}(x)\), where \(h^{\star }_{t}(x) = \sum _{\tau \in \bar{{\mathcal T}}} Y^{\tau }_{t}(x)\), \(\bar{\mathcal T}\) denoting a finite collection of trees containing the root tree \(\bullet \). For \(\tau \in \bar{\mathcal T}{\setminus }\{ \bullet \}\), \(Y^\tau \) is continuous and, for any \(\varepsilon >0\), \(\Vert Y^\tau _{t} \Vert _{1-\varepsilon }\) is finite, uniformly in \(t \in [0,T_{0}]\). The remainder \(u\) is investigated through its derivative \(v : [0,T_{0}] \times {\mathbb {S}}^1 \ni (t,x) \mapsto v_{t}(x) = \partial _{x} u_{t}(x)\), defined as the solution of (see [19, Section 4] for the notations):
for some functionals \({\mathcal M}\), \(G\) and \(F\). Our goal here is to expand \(h_{t}\) as \(h_{t} = [u_{t} - P_{t}u_{0}] + [h_{t}^\star + P_{t} u_{0}]\) and to investigate the regularity of \(u_{t} - P_{t} u_{0}\) directly by taking advantage of the fact that \(h_{0}=0\). Letting \(t=0\), we notice that \(u_{0}=-h_{0}^\star \) so that \(h_{t} = [u_{t} - P_{t} u_{0} ] + [h_{t}^\star - P_{t} h_{0}^\star ]\). We also notice that \(h_{t}^\star - P_{t} h_{0}^\star \) may be written \(Y_{t}^\bullet - P_{t} Y_{0}^\bullet + \sum _{\tau \in \bar{\mathcal T} {\setminus } \{\bullet \}} Y_{t}^{\tau } - P_{t} Y_{0}^\tau \). Here, \(Y^{\bullet } - P_{t} Y^{\bullet }_{0}\) is our \(Y^{T_0}_{T_{0}-\cdot }\) and, for any small \(\varepsilon >0\), \(\sum _{\tau \in \bar{\mathcal T} {\setminus } \{\bullet \}} Y^{\tau }_{t} - P_{t} Y_{0}^\tau \) has a finite norm in \({\mathcal {C}}^{1-\varepsilon }({\mathbb {S}}^1)\), uniformly in \(t \in [0,T_{0}]\), so that \(h_{t}^\star - P_{t} h_{0}^\star \) has the right decomposition to apply Theorems 8 and 22. It thus suffices to focus on \(u_{t} - P_{t} u_{0}\) or, equivalently, on \(v_{t} - P_{t} v_{0}= \partial _{x} [u_{t} - P_{t} u_{0}]\) in (95). The main idea is to see \(\bar{v}_{t}:= v_{t} - P_{t} v_{0}\) as the solution of
with \(\bar{v}_{0}=0\). (Note that, in the second term on the right-hand side, the value of \(v\) is fixed.) We then make use of the norm \(\Vert \cdot \Vert _{\star ,T}\) defined in [19, p. 597], but with different parameters \(\kappa \), \(\delta \), \(\alpha \), \(\beta \) and \(\gamma \). We choose \(\kappa =\varepsilon \) small enough, \(\delta = 2 \varepsilon \), \(\alpha =1/2+2\varepsilon \), \(\beta =1/4+ \varepsilon \) and \(\gamma =\alpha \), which satisfy all the prescriptions [19, Eqs. (76a)–(76g)]. Following [19, Eqs. (83a), (83b), (83c), (85)], we get, for some constants \(C,\theta >0\), \(\Vert \bar{v} \Vert _{\star ,T} \le C + C T^{\theta } (\Vert \bar{v} \Vert _{\star ,T} + \Vert P_{\cdot } v_{0} \Vert _{\star ,T})\), where the derivative of \(P_{\cdot } v_{0}\) with respect to the rough path structure is \(0\). Here \(v_{0} = -\partial _{x} h_{0}^\star \) is a distribution in \({\mathcal {C}}^{-1/2-\varepsilon '}({\mathbb {S}}^1)\), for \(\varepsilon '>0\) as small as desired. Following [19, Eq. (82)], \(\Vert P_{\cdot } v_{0}\Vert _{\star ,T} < \infty \). We deduce that, for \(T\) small enough, \(\Vert \bar{v} \Vert _{\star ,T}<\infty \). By [19, Eq. (73)], we get \(\Vert \bar{v}_{t} \Vert _{\infty } \le C t^{-3\varepsilon }\). Working at the level of the primitive, we obtain \(\Vert u_{t} - P_{t} u_{0} \Vert _{1-3 \varepsilon } < \infty \), uniformly in \(t \in [0,T]\). The fact that \(T\) has to be small is not a problem since we are interested in the behavior of \(h\) near the origin. Therefore, \(h_{t} = Y_{T_{0}-t}^{T_{0}} + Y^{(b)}_{t}\), with \(Y^{(b)}_{t} = [u_{t} - P_{t} u_{0} ] + [h_{t}^\star - Y^{\bullet }_{t} - P_{t} (h_{0}^\star - Y_{0}^\bullet )]\), fits the assumptions in Theorems 8 and 22. The convergence to \(0\) of \(\Vert h^n_{T_{0}-\cdot } - Y^{n \rho (n \cdot ),T_0} - Y^{(b)} \Vert _{0,\alpha _{b}}^{[0,T_{0}] \times {\mathbb {S}}^1}\) (on the whole \([0,T_{0}] \times {\mathbb {S}}^1\)) is handled in the same way. \(\square \)
We end up with:
Theorem 32
For \({\mathbf {P}}\) almost every realization of the environment \(\zeta \), under the polymer measure \({\mathbb {Q}}_{\zeta }\) defined in Theorem 31, the canonical path has dynamics of the form
in the sense of (56), where \(b(t,X_{t},\mathrm{d}t)\) is of order \(O(\mathrm{d}t^{3/4-\varepsilon })\), for \(\varepsilon \) as small as desired, the constant in the Landau notation being random but uniform in \(t \in [0,T_{0}]\). Moreover, in the expression of \(b\) in Proposition 14, the second term can be computed by replacing \((Y,Z^{t+h})\) by \((Y^{T_0},Z^{T_0,t+h})\), where \(Y^{T_0}\) is the solution of the stochastic heat equation as in Theorem 22 and \(Z^{T_0,t+h}\) is computed accordingly as in (84).
Proof
The proof is a consequence of Proposition 14. The reason why the second term in the decomposition of \(b\) can be simplified follows from the proof of Theorem 31. Indeed, we know that \(Y\) may be split into \(Y^{T_0}+Y^{(b)}\), with \(\kappa _{\alpha _{b},0}(Y^{(b)}) < \infty \) for \(\alpha _{b}\) close to \(1\). The game is then the same as in (85): for \((i,j) \not = (0,0)\), the cross-integrals \({\fancyscript{I}}^{i,j,t+h}\) in (85) give a contribution of order \(O(h^{3/2-\varepsilon })\) in the computation of \(b\), which can be forgotten at the macroscopic level. \(\square \)
References
Amir, G., Corwin, I., Quastel, J.: Probability distribution of the free energy of the continuum directed random polymer in 1+1 dimensions. Commun. Pure Appl. Math. 64, 466–537 (2011)
Andreoletti, P., Diel, R.: Limit law of the local time for Brox’s diffusion. J. Theor. Probab. 24, 634–656 (2011)
Bass, R., Chen, Z.-Q.: Stochastic differential equations for Dirichlet processes (English summary). Probab. Theory Relat. Fields 121, 422–446 (2001)
Bertini, L., Giacomin, G.: Stochastic Burgers and KPZ equations from particle systems. Commun. Math. Phys. 183, 571–607 (1997)
Brox, T.: A one-dimensional diffusion process in a Wiener medium. Ann. Probab. 14, 1206–1218 (1986)
Catellier, R., Gubinelli, M.: Averaging along irregular curves and regularisation of ODEs. Technical report. http://arxiv.org/abs/1205.1735 (2014)
Davie, A.M.: Uniqueness of solutions of stochastic differential equations. Int. Math. Res. Not. 24, 26 (2007)
Diel, R.: Almost sure asymptotics for the local time of a diffusion in Brownian environment. Stoch. Process. Appl. 121, 2303–2330 (2011)
Ethier, S., Kurtz, T.G.: Characterization and Convergence. Wiley, New York (1986)
Flandoli, F.: Random Perturbation of PDEs and Fluid Dynamic Models. Lectures from the 40th Probability Summer School held in Saint-Flour, 2010. Lecture Notes in Mathematics, vol. 2015. Springer, Heidelberg (2011)
Flandoli, F., Issoglio, E., Russo, F.: Multidimensional stochastic differential equations with distributional drift. Technical report. http://arxiv.org/abs/1401.6010 (2014)
Flandoli, F., Russo, F., Wolf, J.: Some SDEs with distributional drift. I. General calculus. Osaka J. Math. 40, 493–542 (2003)
Flandoli, F., Russo, F., Wolf, J.: Some SDEs with distributional drift. II. Lyons–Zheng structure, Itô's formula and semimartingale characterization. Random Oper. Stoch. Equ. 12(2), 145–184 (2004)
Friedman, A.: Partial Differential Equations of Parabolic Type. Prentice-Hall Inc, Englewood Cliffs (1964)
Friz, P., Hairer, M.: A Course on Rough Paths (with an Introduction to Regularity Structures). Springer, Berlin (2014)
Friz, P., Victoir, N.: Multidimensional Stochastic Processes as Rough Paths. Theory and Applications. Cambridge University Press, London (2010)
Gubinelli, M.: Controlling rough paths. J. Funct. Anal. 216, 86–140 (2004)
Hairer, M.: Rough stochastic PDEs. Commun. Pure Appl. Math. 64, 1547–1585 (2011)
Hairer, M.: Solving the KPZ equation. Ann. Math. 178, 559–664 (2013)
Hu, Y., Shi, Z.: The limits of Sinai’s simple random walk in random environment. Ann. Probab. 26, 1477–1521 (1998)
Hu, Y., Shi, Z.: The local time of simple random walk in random environment. J. Theor. Probab. 11, 765–793 (1998)
Kardar, M., Parisi, G., Zhang, Y.-C.: Dynamical scaling of growing interfaces. Phys. Rev. Lett. 56, 889–892 (1986)
Krylov, N.V., Röckner, M.: Strong solutions of stochastic equations with singular time dependent drift. Probab. Theory Relat. Fields 131, 154–196 (2005)
Lyons, T., Caruana, M., Lévy, T.: Differential Equations Driven by Rough Paths. Lectures from the 34th Summer School on Probability Theory held in Saint-Flour, 2004. Lecture Notes in Mathematics, vol. 1908. Springer, Berlin (2007)
Lyons, T., Qian, Z.: System Control and Rough Paths. Oxford University Press, Oxford (2002)
Quastel, J.: Introduction to KPZ. Notes from the Saint-Flour summer school. http://www.math.toronto.edu/quastel/survey (2012)
Russo, F., Trutnau, G.: Some parabolic PDEs whose drift is an irregular random noise in space. Ann. Probab. 35, 2213–2262 (2007)
Stroock, D.W., Varadhan, S.R.S.: Multidimensional Diffusion Processes. Springer, Berlin (1979)
Tanaka, H.: Localization of a diffusion process in a one-dimensional Brownian environment. Commun. Pure Appl. Math. 47, 755–766 (1994)
Veretennikov, A.Yu.: Strong solutions and explicit formulas for solutions of stochastic integral equations. Math. Sb. 111, 434–452 (1980)
Young, L.C.: An inequality of the Hölder type, connected with Stieltjes integration. Acta Math. 67, 251–282 (1936)
Zvonkin, A.K.: A transformation of the phase space of a diffusion process that will remove the drift. Math. Sb. 93, 129–149 (1974)
Appendix
Lemma 33
Given a sequence of smooth paths \((Y^n)_{n \ge 1}\) such that, for some \(T_{0}>0\) and any \(T \in [0,T_{0}]\), the sequence \((W^{n,T}=(Y^n,Z^{n,T}))_{n \ge 1}\) satisfies the assumption of Proposition 6, with \(\kappa =\sup _{0 \le T \le T_{0}} \sup _{n \ge 1} \kappa _{\alpha ,\chi }((W_{t}^{n,T},{\fancyscript{W}}_{t}^{n,T})_{0 \le t <T}) < \infty \), we can assume that, for any \(n \ge 1\), \(Y^n\) has bounded derivatives on the whole space.
Proof
For \(N \in {\mathbb N} {\setminus } \{0\}\), we consider a smooth function \(\varphi ^N : {\mathbb {R}}\rightarrow [0,1]\), symmetric, equal to \(1\) on \([0,N]\) and to \(0\) on \([2N,+\infty )\), non-increasing on \([N,2N]\), satisfying \(\Vert \mathrm{d}^p \varphi ^N/\mathrm{d}x^p \Vert _{\infty } \le c_{p}/N^p\) for some \(c_{p} \ge 1\), independent of \(N\), for any integer \(p \ge 1\). Then, we let \(Y^{n,N}_{t}(x)=Y^n_{t}(0)+\int _{0}^x \varphi ^N(y) \partial _{x} Y^n_{t}(y) \mathrm{d}y\) and, for a given \(T>0\), we define \(Z^{n,N,T}\), \(W^{n,N,T}\) and \({\fancyscript{W}}^{n,N,T}\) accordingly.
For a given \(n\), \((Y^{n,N})_{N \ge 1}\) (resp. \(\partial _{x}^p Y^{n,N}\) for an integer \(p \ge 1\)) converges towards \(Y^n\) (resp. \(\partial _{x}^p Y^n\)) as \(N\) tends to \(\infty \), uniformly in \(x\) in compact sets and in \(t \in [0,T)\). Using the representations of \(Z_{t}^{n,N,T}\) and \(Z^{n,T}_{t}\), see (13), the same holds true for the sequence \((Z^{n,N,T})_{N \ge 1}\) (resp. \(({W}^{n,N,T})_{N \ge 1}\)) with \(Z^{n,T}\) (resp. \(W^{n,T}\)) as limit path. Hence, \(({{\fancyscript{W}}}_{t}^{n,N,T})_{N \ge 1}\) converges towards \({\fancyscript{W}}_{t}^{n,T}\) in norm \(\Vert \cdot \Vert _{\alpha }\), uniformly in \(t \in [0,T)\). Using the same notation as in Proposition 6, \((\Vert (W^{n,N,T}-W^{n,T},{\fancyscript{W}}^{n,N,T} - {\fancyscript{W}}^{n,T})\Vert _{0,\alpha }^{[0,T) \times {\mathbb {I}}})_{N \ge 1}\) tends to \(0\) as \(N\) tends to \(\infty \). Therefore, we can find a sequence \((N_{n})_{n \ge 1}\) such that \(\Vert (W^{n,N_{n},T}-W^{n,T},{\fancyscript{W}}^{n,N_{n},T} - {\fancyscript{W}}^{n,T})\Vert _{0,\alpha }^{[0,T) \times {\mathbb {I}}}\), and thus \(\Vert (W^{n,N_{n},T}-W^{T},{\fancyscript{W}}^{n,N_{n},T} - {\fancyscript{W}}^{T})\Vert _{0,\alpha }^{[0,T) \times {\mathbb {I}}}\), tend to \(0\) as \(n\) tends to \(\infty \), which fits (1) in Proposition 6.
We now discuss (2) in Proposition 6. We start with the Hölder estimate of \(Y^{n,N}_{t}\). For \(0\le x \le y \le a\), with \(a \ge 1\), the second mean-value theorem yields \({Y}^{n,N}_{t}(y) - {Y}^{n,N}_{t}(x) = \varphi _{N}(x) [Y^{n}_{t}(y')-Y_{t}^{n}(x)]\), for \(y' \in [x,y]\). We deduce that \(\vert Y^{n,N}_{t}(y) - Y^{n,N}_{t}(x) \vert \le \kappa a^{\chi } \vert y - x \vert ^{\alpha }\). The same holds true when \(-a \le y \le x \le 0\). Changing \(\kappa \) into \(2\kappa \), we get the same result for any \(x,y \in [-a,a]\). By Lemma 19, the bound \(\vert {Z}^{n,N}_{t}(x) - {Z}^{n,N}_{t}(y) \vert \le \kappa a^{\chi } \vert x-y \vert ^{\alpha }\) follows.
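The version of the second mean-value theorem used here (sometimes called Bonnet's form) states that, for \(g\) non-negative and non-increasing on \([x,y]\) and \(f\) integrable, there exists \(y' \in [x,y]\) such that

```latex
\int_x^y g(z)\, f(z)\, \mathrm{d}z \;=\; g(x) \int_x^{y'} f(z)\, \mathrm{d}z.
```

Applied with \(g = \varphi ^N\) (non-negative and non-increasing on \([0,\infty )\)) and \(f = \partial _{x} Y^n_{t}\), it gives \(Y^{n,N}_{t}(y) - Y^{n,N}_{t}(x) = \varphi ^{N}(x) [Y^{n}_{t}(y')-Y_{t}^{n}(x)]\), as used above.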
We finally discuss the regularity of the second-order integrals. As discussed in Sect. 5, it suffices to focus on the cross-integral \(\int _{x}^y [ {Z}^{n,N}_{t}(z) - {Z}^{n,N}_{t}(x)] \mathrm{d}{Y}^{n,N}_{t}(z)\).
By (13), \(\partial _{t} Z_{t}^{n,N,T}(x) + (1/2) \partial ^2_{x} Z_{t}^{n,N,T}(x) = - \partial ^2_{x} [Y_{t}^{n,N}](x) = - \partial _{x} [ \varphi ^N \partial _{x} Y^{n}_{t}](x)\). Similarly, \(\partial _{t} Z_{t}^{n,T}(x) + (1/2) \partial ^2_{x} Z_{t}^{n,T}(x) = - \partial ^2_{x} [Y_{t}^{n}](x)\). Therefore,
with \(Z_{T}^{n,N,T} - Z_{T}^{n,T}=0\). Therefore, integrating against \(p_{s-t}\) and then integrating by parts,
The aim is to differentiate both sides of the equality in order to estimate the derivative of the left-hand side. In order to bound the derivative of the right-hand side, we discuss the Hölder constant of the integrands right above. We have \(\vert \varphi _{N}'(y) Y^{n}_{t}(y) - \varphi _{N}'(x) Y^{n}_{t}(x) \vert \le c_{2} \vert Y^n_{t}(x) \vert \vert y - x \vert /N^2 + (c_{1}\kappa /N) a^{\chi } \vert y -x \vert ^{\alpha }\), for \(x,y \in [-a,a]\), \(a \ge 1\). Modifying \(\kappa \) if necessary, we have \(\vert Y^n_{t}(x) \vert \le \kappa a^{1+\chi }\). Therefore, we can find a constant \(C \ge 0\) such that
Since \(\varphi _{N}' =0\) outside \([-2N,2N]\), we can always assume that \(x,y \in [-2N,2N]\) (by projecting \(x\) and \(y\) onto \([-2N,2N])\) and thus that \(a \le 2N\). Then, the left-hand side is less than \(C a^{\chi } \vert y -x \vert ^{\alpha }/N\). Using a similar argument for all the other terms of the same type in the right-hand side of (96), we deduce that the left-hand side in (96) is differentiable and that \(\vert \partial _{x} [ Z_{t}^{n,N,T} - \varphi ^N Z_{t}^{n,T}](x) \vert \le C a^{\chi }/N\), when \(x \in [-a,a]\), \(a \ge 1\). By integration by parts,
Since \(\partial _{x}Y^{n,N}_{t}(z)=0\) when \(\vert z \vert \ge 2N\), we can always assume that \(x,y \in [-2N,2N]\) and \(a \le 2N\). We deduce that the term in the first line is less than \(C a^{2\chi } \vert x-y \vert ^{2\alpha }/N^{\alpha }\). To complete the analysis, it thus suffices to prove that
Since \(\partial _{x} Y^{n,N}_{t}(z) = \varphi ^N(z) \partial _{x} Y^n_{t}(z)\), we can use again the second mean-value theorem to handle \(\int _{x}^y \varphi ^N(z)[ Z_{t}^{n,T}(z) - Z_{t}^{n,T}(x)] \mathrm{d}Y^{n,N}_{t}(z) = \int _{x}^y (\varphi ^N(z))^2[ Z_{t}^{n,T}(z) - Z_{t}^{n,T}(x)] \mathrm{d}Y^{n}_{t}(z)\). Therefore, it suffices to focus on \(Z_{t}^{n,T}(x) \int _{x}^y [ \varphi ^N(z) - \varphi ^N(x)] \mathrm{d}Y^{n,N}_{t}(z)\). By integration by parts,
which is less than \(C a^{2\chi } \vert y -x \vert ^{1+\alpha }/N\) (following Lemma 19, \(Z^{n,T}_{t}\) satisfies \(\vert Z^{n,T}_{t}(x) \vert \le C a^{\chi }\), which is better than the elementary but rough bound \(\vert Z^{n,T}_{t}(x) \vert \le C a^{1+\chi }\)). Limiting the analysis to the case \(a \le 2N\), we conclude as above. \(\square \)
Delarue, F., Diel, R. Rough paths and 1d SDE with a time dependent distributional drift: application to polymers. Probab. Theory Relat. Fields 165, 1–63 (2016). https://doi.org/10.1007/s00440-015-0626-8
Mathematics Subject Classification
- Primary 60H10
- Secondary 60H05
- 82D60