1 Introduction

The setting considered in this article is as follows. Consider a particle in a rapidly evolving random medium, so that it is governed by a stochastic differential equation of the type \(dx_t= A(x_t, t/\varepsilon )\,dt + \sigma (x_t,t/\varepsilon )\,dB\) for a small parameter \(\varepsilon >0\). The situation we are interested in is where, in the “static” case (i.e. when A and \(\sigma \) have no explicit time dependence), the system is either super- or subdiffusive. This is the case if the driving noise B is modelled by fractional Brownian motion (fBM) with Hurst parameter \(H \ne \frac{1}{2}\). Recall that fractional noises (i.e. the time derivative of fBM) can be obtained as scaling limits in statistical mechanics models [11, 38, 65] and that fBM with Hurst parameter H is a Gaussian process with stationary increments and self-similarity exponent H. It is therefore characterised (up to an irrelevant global shift) by the fact that \(\mathbf{E}(B_t-B_s)^2=|t-s|^{2H}\), so that it is superdiffusive for \(H > \frac{1}{2}\) and subdiffusive for \(H < \frac{1}{2}\). The covariance of its increments, \(\mathbf{E}(B_{t+1}-B_{t})(B_{s+1}-B_s)\), decays at rate \(|t-s|^{2H-2}\) for large \(|t-s|\) and therefore exhibits long-range dependence when \(H > \frac{1}{2}\).
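For readers who want to experiment with these scaling properties, here is a minimal simulation sketch (ours, not from the paper; all parameter choices are illustrative) that samples fBM from its covariance and checks the increment variance numerically:

```python
# Illustrative sketch (ours): sample fBM with Hurst parameter H from its
# covariance R(s,t) = (s^{2H} + t^{2H} - |t-s|^{2H})/2 and check
# E(B_t - B_s)^2 = |t-s|^{2H} empirically.
import numpy as np

rng = np.random.default_rng(1)
H, n, N = 0.25, 200, 5000            # subdiffusive example; try H = 0.75 too
t = np.linspace(1e-6, 1.0, n)
R = 0.5 * (t[:, None] ** (2 * H) + t[None, :] ** (2 * H)
           - np.abs(t[:, None] - t[None, :]) ** (2 * H))
L = np.linalg.cholesky(R + 1e-12 * np.eye(n))
B = L @ rng.standard_normal((n, N))  # N independent fBM paths, one per column

i, j = 49, 149                       # two grid times s < t
print("empirical:", np.mean((B[j] - B[i]) ** 2))
print("exact:    ", (t[j] - t[i]) ** (2 * H))
```

Cholesky sampling is quadratic in the grid size but transparent; for long grids one would typically switch to a circulant embedding.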

We furthermore assume that the rapid time evolution of the environment is described by a hidden Markov variable, thus leading to the model

$$\begin{aligned} x_t^\varepsilon =x_0+ \int _0^t F(x_s^\varepsilon ,y_s^\varepsilon )\,dB_s + \int _0^t F_0(x_s^\varepsilon ,y_s^\varepsilon )\,ds, \end{aligned}$$
(1.1)

with B an fBM with Hurst parameter \(H\in (0,1)\) in \(\mathbf{R}^m\) and \(F(x,y)\in {{\mathbb {L}}}(\mathbf{R}^m, \mathbf{R}^d)\). The stochastic integral appearing in the first term is problematic when \(H < \frac{1}{2}\): one should really interpret this equation as \(x_t^\varepsilon = \lim _{\delta \rightarrow 0} x_t^{\varepsilon ,\delta }\) with \(x_t^{\varepsilon ,\delta }\) driven by a smooth approximation \(B^\delta \) to B with relevant timescale \(\delta \ll \varepsilon \ll 1\), see Sect. 1.1 below. Regarding the fast Markov variable, a prototypical situation is that of a system of the type

$$\begin{aligned} dy_t^\varepsilon = \sigma (x_t^\varepsilon ,y_t^\varepsilon )\,{dW_t \over \sqrt{\varepsilon }} + b(x_t^\varepsilon ,y_t^\varepsilon )\,{dt\over \varepsilon }, \; \end{aligned}$$
(1.2)

where W is a Wiener process independent of the fBM B appearing in (1.1). This allows for the case where the variable x feeds back into the evolution of y, but for most of this article we assume that there is no x-dependence in (1.2). We also assume that \(y_t\) admits a unique invariant probability measure \(\mu \). In the case with feedback, we have a family of invariant measures \(\mu _x\) obtained by “freezing” the value of the variable x in (1.2).

It was recently shown by the authors in [31] that in the case \(H > \frac{1}{2}\) the process \(x^\varepsilon \) converges in probability to the solution to

$$\begin{aligned} dx = {\bar{F}}(x)\,dB + {\bar{F}}_0(x)\,dt, \end{aligned}$$
(1.3)

where the average of any function h is given by \({\bar{h}}(x) = \int h(x,y)\,\mu _x(dy)\). The aim of the present article is to investigate the two cases left out by the aforementioned analysis, namely what happens either when \(H < \frac{1}{2}\), or when \(H > \frac{1}{2}\) but \({\bar{F}} = 0\) in (1.3).

1.1 Description of the model

It turns out that the effect of the rapid oscillatory motion described by the fast variable y is to slow down the motion of x in the superdiffusive case and to speed it up in the subdiffusive case. This can be explained by the following heuristics. For times of order \(t \lesssim \varepsilon \), the process Y doesn’t evolve much so that, by the scaling property of the driving fBM, one expects the process x to move by about \(\varepsilon ^H\) in a time of order \(\varepsilon \). On large times \(t \gtrsim \varepsilon \), on the other hand, we will see that the limiting process is actually Markovian, even in the case with long-range dependence. This suggests that over times of order t the process x performs about \(t/\varepsilon \) steps of a random walk with step size \(\varepsilon ^H\) and therefore moves by about \(\varepsilon ^H \sqrt{t/\varepsilon } = \varepsilon ^{H-\frac{1}{2}}\sqrt{t}\). One should therefore multiply F by \(\varepsilon ^{\frac{1}{2}-H}\) in order to obtain a non-trivial limit.

As a consequence, the equations we actually study in this article are of the form:

$$\begin{aligned} dX^\varepsilon = \varepsilon ^{\frac{1}{2} - H} F_i(X^\varepsilon ,Y^\varepsilon )\,dB_i + F_0(X^\varepsilon ,Y^\varepsilon )\,dt, \qquad Y^\varepsilon (t) = Y(t/\varepsilon ), \end{aligned}$$
(1.4)

(summation over i is implied), where B is a fractional Brownian motion with Hurst parameter H ranging from \(\frac{1}{3}\) to 1, and Y is an independent stationary Markov process with values in some Polish space \(\mathcal{Y}\), with invariant measure \(\mu \) and generator \(-\mathcal{L}\). At the moment, we are unfortunately unable to cover the case when X feeds back into the dynamics of Y. When \(H > \frac{1}{2}\), we furthermore assume that \(\int F_i(x,y)\,\mu (dy) = 0\) for every \(i \ne 0\) and every x.

Our main result is that, as \(\varepsilon \rightarrow 0\), solutions to (1.4) converge in law to a limiting Markov process and we provide an expression for its generator. In fact, we have an even stronger form of convergence, namely we show that the flow generated by (1.4) converges to the one generated by a limiting stochastic differential equation of Kunita type (i.e. driven by an infinite-dimensional noise).

Remark 1.1

Of course, (1.4) is not quite the same as (1.1), which was our starting point. One way of relating them more directly is to perform a time change and set \(X^\varepsilon (t) = x^\varepsilon (\varepsilon ^{1-2H}t)\) with \(x^\varepsilon \) solving (1.1). Then \(X^\varepsilon \) solves the equation \(dX^\varepsilon ={\tilde{\varepsilon }}^{\frac{1}{2}-H} F_i(X^\varepsilon ,Y^{{\tilde{\varepsilon }}})\,dB_i+ {\tilde{\varepsilon }}^{\frac{1}{2H}-1}F_0(X^\varepsilon ,Y^{{\tilde{\varepsilon }}})\,dt\), where we have set \({\tilde{\varepsilon }}=\varepsilon ^{2H}\). When \(H < \frac{1}{2}\), this then converges to the same limit as (1.4), but of course with \(F_0\) in (1.4) set to 0. When \(H > \frac{1}{2}\), one would need to take \(F_i\) centred in (1.1) in order to obtain a non-trivial limit, and our results imply that one again converges to the same limit as (1.4), at least in the case \(F_0 = 0\).

The special case when \(F_0 = 0\) and the \(F_i\) are independent of the x-variable yields a functional central limit theorem for stochastic integrals against fractional Brownian motion. This already appears to be new by itself and might be of independent interest.
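To illustrate this special case, here is a Monte Carlo sketch (our construction; the two-state chain and all numerical values are assumptions, not from the paper) of \(Z^\varepsilon _1 = \sqrt{\varepsilon }\int _0^{1/\varepsilon } F(Y_r)\,dB_r\) for a centred \(F(Y) = \pm 1\) of a two-state chain Y flipping at rate q. The functional CLT below (Corollary 1.6, with \(\Sigma \) given by (1.6)) predicts \(\mathrm {Var}(Z^\varepsilon _1) \rightarrow 2\Sigma = \Gamma (2H+1)(2q)^{1-2H}\):

```python
# Monte Carlo sanity check (ours): Z^eps_1 = sqrt(eps) int_0^{1/eps} F(Y_r) dB_r
# for a two-state chain Y (flip rate q) and an independent fBM B.
import numpy as np
from math import gamma

rng = np.random.default_rng(0)
H, q, eps, h = 0.7, 1.0, 0.02, 0.05
n = int(1 / (eps * h))                 # grid points on [0, 1/eps], step h
N = 1000                               # Monte Carlo sample size

# Covariance of fBM increments on the grid and its Cholesky factor.
k = np.arange(n)
rho = 0.5 * ((k + 1.0) ** (2 * H) + np.abs(k - 1.0) ** (2 * H)
             - 2.0 * k ** (2 * H)) * h ** (2 * H)
C = rho[np.abs(k[:, None] - k[None, :])]
dB = np.linalg.cholesky(C + 1e-12 * np.eye(n)) @ rng.standard_normal((n, N))

# Exact skeleton of the two-state chain: flip probability per step of size h.
p = 0.5 * (1 - np.exp(-2 * q * h))
F = 2.0 * (np.cumsum(rng.random((n, N)) < p, axis=0) % 2) - 1.0  # centred

Z = np.sqrt(eps) * np.sum(F * dB, axis=0)
print("empirical Var:", Z.var())
print("predicted 2*Sigma:", gamma(2 * H + 1) * (2 * q) ** (1 - 2 * H))
```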

As already hinted at, the map \(t \mapsto F_i(\cdot , Y^\varepsilon _t)\) is too irregular to fit into the standard theory of differential equations driven by a fractional Brownian motion, especially when \(H < \frac{1}{2}\), so that it is not even completely clear a priori how to interpret (1.4) for fixed \(\varepsilon > 0\). These questions will be addressed in more detail in Sect. 2 below. Let us put these aside for the moment and consider the following ordinary differential equation

$$\begin{aligned} \dot{X}^{\varepsilon ,\delta } _s = \frac{\delta ^{H-1}}{ \sqrt{\varepsilon }} v\Big (\frac{s}{\varepsilon \delta }\Big ) F(X_s^{\varepsilon ,\delta },Y_s^\varepsilon ) + F_0(X_s^{\varepsilon ,\delta },Y_s^{\varepsilon }), \end{aligned}$$
(1.5)

where v is a smooth stationary Gaussian random process with covariance C such that \(C(t) \sim |t|^{2H-2}\) for |t| large. When \(H < \frac{1}{2}\) we furthermore assume that \(\int C(t)\,dt = 0\) and, when \(H = \frac{1}{2}\), we assume that C decays exponentially and satisfies \(\int C(t)\,dt = 1\). One way of obtaining such a process v is to set \(v = \phi * \dot{B}\) for \(\phi \) a Schwartz test function integrating to 1 (and \(*\) denoting convolution in time). This in particular shows that, at least in law, one has \((\varepsilon \delta )^{H-1}v\big ({t \over \varepsilon \delta }\big ) = (\phi _{\varepsilon \delta }* \dot{B})(t)\), where we set \(\phi _\varepsilon (t) = \varepsilon ^{-1}\phi (t/\varepsilon )\). Since this converges in law to \(\dot{B}\) as \(\varepsilon \delta \rightarrow 0\), we can view (1.5) as an approximation to (1.4).

It is then possible to show that the limit \(X^\varepsilon = \lim _{\delta \rightarrow 0} X^{\varepsilon ,\delta }\) exists and our results hold with \(X^\varepsilon \) interpreted in this way. Furthermore, we will see that all our results hold uniformly over \(\delta \in (0,1]\) as \(\varepsilon \rightarrow 0\). This in particular shows that the converse limit obtained by first sending \(\varepsilon \rightarrow 0\) and then \(\delta \rightarrow 0\) is the same, as are all limits obtained by other ways of jointly sending \(\varepsilon ,\delta \rightarrow 0\).

1.2 Description of the main results

We now give a precise formulation of our main results, albeit with a simplified set of assumptions. The reason is that while the simplified assumptions are straightforward to state, they are very stringent regarding the Markov process Y. The more realistic set of assumptions used in the remainder of the article, however, is rather technical to formulate. We first recall the following standard definition of the fractional powers of the generator of the process Y.

Definition 1.2

We write \(\mathcal{H}= L^2(\mu )\) with \(\mu \) the invariant measure of Y and \(\langle \cdot ,\cdot \rangle _\mu \) for its scalar product. For \(\alpha \in (0,1)\), we then say that \(f\in \mathrm {Dom}(\mathcal{L}^{\alpha } )\) if, for every \(g \in \mathcal{H}\), the integral

$$\begin{aligned} {1\over \Gamma (-\alpha )}\int _0^\infty t^{-\alpha -1} \langle P_tf-f, g\rangle _{\mu }\, dt \end{aligned}$$

converges and determines a bounded functional on \(\mathcal{H}\) (which we then call \(\mathcal{L}^\alpha f\)). Recall that the generator of the process Y is \(-\mathcal{L}\), so that \(\mathcal{L}\) is indeed a positive operator in the reversible case and \(\mathcal{L}^\alpha \) does then coincide with the definition using functional calculus.

Similarly, for \(\alpha \in (-1,0)\), we write \(\mathcal{L}^\alpha \) for the operator given by

$$\begin{aligned} \mathcal{L}^\alpha f = {1\over \Gamma (-\alpha )}\int _0^\infty t^{-\alpha -1} P_tf\, dt. \end{aligned}$$

Since \(t \mapsto t^{-\alpha -1}\) is locally integrable, it follows from the first point of Assumption 1.3 below that \(\mathcal{L}^\alpha \) is a bounded operator on the subspace of \({\mathrm {Lip}}(\mathcal{Y})\) consisting of mean zero functions.
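For a concrete finite-dimensional sanity check (our toy example, not from the paper), the integral in Definition 1.2 can be compared with the spectral fractional power of \(\mathcal{L}\) for a reversible three-state chain:

```python
# Toy check (ours): for a 3-state chain with symmetric generator Q = -L,
# the integral of Definition 1.2 with alpha in (0,1) reproduces the spectral
# fractional power L^alpha on centred functions.
import numpy as np
from math import gamma
from scipy.linalg import expm
from scipy.integrate import trapezoid

alpha = 0.4                          # plays the role of 1 - 2H when H < 1/2
Q = np.array([[-1.0,  0.6,  0.4],    # symmetric, rows sum to zero, so the
              [ 0.6, -1.1,  0.5],    # invariant measure mu is uniform and
              [ 0.4,  0.5, -0.9]])   # the chain is reversible
f = np.array([1.0, -2.0, 0.5]); f -= f.mean()        # centre f w.r.t. mu

# Definition 1.2; the tail int_T^infty t^{-alpha-1}(P_t f - f) dt is added
# analytically as -f * T^{-alpha}/alpha since P_T f is negligible here.
T = 40.0
ts = np.geomspace(1e-6, T, 2000)
Ptf = np.array([expm(Q * t) @ f for t in ts])
integral = trapezoid(ts[:, None] ** (-alpha - 1) * (Ptf - f), ts, axis=0)
print((integral - f * T ** (-alpha) / alpha) / gamma(-alpha))

lam, U = np.linalg.eigh(-Q)                          # spectral comparison
lam = np.clip(lam, 0.0, None)
print(U @ (lam ** alpha * (U.T @ f)))    # should agree to quadrature error
```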

Assuming that \(X^\varepsilon \) takes values in \(\mathbf{R}^d\), we then define the \(d\times d\) matrix-valued function

$$\begin{aligned} \Sigma (x,{\bar{x}}) = {1\over 2}\Gamma (2H+1)\sum _{k=1}^m\int F_k(x,y) \otimes \big (\mathcal{L}^{1-2H} F_k\big )({\bar{x}},y)\,\mu (dy), \end{aligned}$$
(1.6)

where \(\mathcal{L}\) acts on the second argument of \(F_k\). As we will see in Remark 1.7, the expression (1.6) is naturally interpreted as the limit \(\delta \rightarrow 0\) of a “local” Green–Kubo formula associated to the fluctuations of (1.5).
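To see (1.6) in action, here is a sketch under simplifying assumptions of ours (finite-state reversible Y, \(m = 1\), F centred and independent of x, \(H > \frac{1}{2}\)). Since \(\frac{1}{2}\Gamma (2H+1) = H(2H-1)\Gamma (2H-1)\), the second part of Definition 1.2 lets one rewrite (1.6) as the Green–Kubo-type integral \(H(2H-1)\int _0^\infty t^{2H-2}\langle F, P_t F\rangle _\mu \,dt\), and the two evaluations should agree:

```python
# Concrete evaluation of (1.6) under toy assumptions (ours): finite-state
# reversible Y, m = 1, centred F independent of x, H > 1/2.
import numpy as np
from math import gamma
from scipy.integrate import trapezoid

H = 0.75
Q = np.array([[-1.0,  0.6,  0.4],          # symmetric generator, mu uniform
              [ 0.6, -1.1,  0.5],
              [ 0.4,  0.5, -0.9]])
mu = np.full(3, 1.0 / 3.0)
F = np.array([1.0, -2.0, 1.0]); F -= mu @ F          # centred w.r.t. mu

lam, U = np.linalg.eigh(-Q)                          # L = -Q, lam[0] = 0
c = U.T @ F                                          # spectral coefficients
safe = np.where(lam > 1e-10, lam, 1.0)
Lpow = np.where(lam > 1e-10, safe ** (1 - 2 * H), 0.0)
sigma_16 = 0.5 * gamma(2 * H + 1) * np.sum(mu * F * (U @ (Lpow * c)))

ts = np.geomspace(1e-8, 200.0, 4000)
corr = np.array([np.sum(mu * F * (U @ (np.exp(-lam * t) * c))) for t in ts])
sigma_gk = H * (2 * H - 1) * trapezoid(ts ** (2 * H - 2) * corr, ts)
print(sigma_16, sigma_gk)                # agree up to quadrature error
```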

Note that the condition \({\bar{F}}_k = 0\) is necessary in the case \(H > \frac{1}{2}\) since the negative power of \(\mathcal{L}\) appearing in this expression does not make sense otherwise, see also Remark 2.5 below. We shall assume mixing conditions and Hölder continuity of the Y variable, see Assumptions 2.1–2.3 below, as well as a regularity condition on \(x \mapsto F(x,\cdot )\) (and also \(F_0\)) as spelled out in Assumption 2.7. A simpler set of conditions is as follows: the first is a strengthening of Assumptions 2.1 and 2.3, the second is a strengthening of Assumption 2.2, and the last is just a restatement of Assumption 2.7 in this context.

Assumption 1.3

(Simplified Assumptions). The functions \(F_i\) appearing in (1.4) as well as the Markov process Y satisfy the following.

  1. The Markov semigroup associated to the process Y is strongly continuous and has a spectral gap in \({\mathrm {Lip}}(\mathcal{Y})\), the space of bounded Lipschitz continuous functions on \(\mathcal{Y}\).

  2. In the case \(H < 1/2\) we assume that, for any \(\alpha < H\), the process \(t \mapsto Y_t\) admits \(\alpha \)-Hölder continuous trajectories and its Hölder seminorm (over intervals of length 1, say) has bounded moments of all orders.

  3. When \(H > \frac{1}{2}\), we also assume that \(\int F_i(x,y)\,\mu (dy) = 0\) for every \(i \ne 0\) and every x.

  4. There exists \(\kappa > 0\) such that, for every \(i \ge 0\), \(x\mapsto F_i(x,\cdot )\) is \(\mathcal{C}^4\) with values in \({\mathrm {Lip}}(\mathcal{Y})\) and its derivatives of order at most 4 are bounded by \(C (1+|x|)^{-\kappa }\) for some \(C>0\).

Remark 1.4

Recall that a Markov semigroup \((P_t)_{t \ge 0}\) admits a spectral gap in any given Banach space \(E \subset L^2(\mu )\) if \(P_t:E \rightarrow E\) is a bounded linear operator for every t and if there exist constants \(c,C > 0\) such that \(\Vert P_t f - \mu (f)\Vert _E\le Ce^{-ct}\Vert f\Vert _E\) for all \(f \in E\). For this definition to make sense, E of course needs to contain all constant functions.

The reason why we are aiming for a more general result at the expense of a much more technical set of assumptions is that having a spectral gap in \({\mathrm {Lip}}(\mathcal{Y})\) is a very restrictive condition which is not even satisfied for the Ornstein–Uhlenbeck process.

Theorem 1.5

Let \(H\in (\frac{1}{3}, 1)\) and let Assumption 1.3 hold. For fixed \(\varepsilon > 0\), \(\alpha < H\), and \(T>0\), the process \(X^{\varepsilon ,\delta }\) converges in law in \(\mathcal{C}^\alpha ([0,T])\) as \(\delta \rightarrow 0\) to a limit \(X^\varepsilon \) which we interpret as the solution to (1.4).

The solution flow of (1.4) converges in law to that of the Kunita-type stochastic differential equation written in Itô form as

$$\begin{aligned} dX_t=W(X_t, dt)+ G(X_t)\,dt + {\bar{F}}_0(X_t)dt, \end{aligned}$$
(1.7)

where \({\bar{F}}_0(x)=\int F_0(x,y)\mu (dy)\), \(G_i(x) = (\partial ^{(2)}_j\Sigma _{ji})(x,x)\), W is a Gaussian random field with correlation

$$\begin{aligned} \mathbf{E}(W_i(x,t)W_j({\bar{x}},{\bar{t}}) )= (t \wedge {\bar{t}}) \bigl (\Sigma _{ij}(x,{\bar{x}}) + \Sigma _{ji}({\bar{x}},x)\bigr ), \end{aligned}$$
(1.8)

and where \(\partial ^{(2)}_j\) denotes differentiation in the jth direction of the second argument.

Proof

As already suggested, this is a special case of our main result, Theorem 2.8 below. The fact that Assumptions 2.1–2.3 and 2.7 are implied by Assumption 1.3 is immediate. (Take \(E_n = {\mathrm {Lip}}(\mathcal{Y})\) for every n.) \(\square \)

As a consequence, we also have the following functional CLT.

Corollary 1.6

Let \(H\in (\frac{1}{3}, 1)\) and let Assumption 1.3 hold (or let Assumptions 2.1–2.3 hold and, when \(H > \frac{1}{2}\), assume that \(\int F_i(y)\,\mu (dy) = 0\) for every \(i \ge 1\)).

Then the stochastic process \(Z^\varepsilon _t = \sqrt{\varepsilon }\int _0^{t/\varepsilon } F(Y_r)\, dB_r\) converges to a Wiener process W, weakly in \(\mathcal{C}^\alpha ([0,T])\) for any \({\alpha <\frac{1}{2}\wedge H}\). Furthermore, defining the random smooth function \(Z^{\varepsilon ,\delta }_t = \sqrt{\varepsilon }\int _0^{t/\varepsilon } F(Y_r)\, dB_r^\delta \) with \(B^\delta = \phi _\delta * B\), its iterated integral satisfies

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}\lim _{\delta \rightarrow 0} \int _s^t \bigl (Z^{\varepsilon ,\delta }_r-Z^{\varepsilon ,\delta }_s\bigr )\otimes dZ^{\varepsilon ,\delta }_r = \int _s^t (W_r-W_s)\otimes dW_r+\Sigma (t-s), \end{aligned}$$

where the matrix \(\Sigma \) is given by (1.6) (which is independent of \(x,{\bar{x}}\) in this case).

Remark 1.7

Theorem 1.5 characterises \(\lim _{\varepsilon \rightarrow 0}\lim _{\delta \rightarrow 0}X^{\varepsilon ,\delta }\) and shows that it is a Markov process with generator \({\mathcal {A}}\) given by

$$\begin{aligned} ({\mathcal {A}}g)(x) = \sum _{i,j=1}^d \partial _j \big (\Sigma _{ji}(x,\cdot )\partial _i g\big )(x) + \sum _{i=1}^d{\bar{F}}_0^i(x)\partial _i g(x) . \end{aligned}$$
(1.9)

Our proof actually carries over with minor modifications to the case when \(\varepsilon \rightarrow 0\) for fixed \(\delta \) (but with convergence bounds that are uniform in \(\delta \)!), in which case the limit is given by the same expression (1.7), but with the matrix \(\Sigma \) given by

$$\begin{aligned} \Sigma _\delta (x,{\bar{x}}) = \sum _{k=1}^m \int _0^\infty R_\delta (t) \int F_k(x,y) \otimes \big (P_t F_k\big )({\bar{x}},y)\,\mu (dy)\,dt, \end{aligned}$$
(1.10)

where \(R_\delta (t) = \delta ^{2H-2} \mathbf{E}v(0)v(t/\delta )\) and \(P_t = e^{-\mathcal{L}t}\) denotes the Markov semigroup for Y. We will derive this formula in Sect. 1.3 where we will also see that, for frozen values of x, it is a special case of the Green–Kubo formula [42, 45, 61]. Note that (1.7)–(1.10) (in particular the convergence of the flow) is also consistent with [22, Theorem 4.3] where a somewhat analogous situation is considered. It follows from Definition 1.2 that \(\Sigma _\delta \rightarrow \Sigma \) as \(\delta \rightarrow 0\), so that the two limits commute (in law).

Remark 1.8

There has recently been a surge in interest in the study of slow/fast systems involving fractional Brownian motion. We already mentioned the averaging result [31] which considers the case \(H > \frac{1}{2}\) but with \({\bar{F}} \ne 0\). The work [60] considers the case \(H \in (\frac{1}{3},1)\) like the present article, but with the very strong assumption that F is independent of the fast variable, in which case only \(F_0\) exhibits rapid fluctuations and one essentially recovers classical averaging results. In [2], the authors consider the case \(H > \frac{1}{2}\), but with F independent of the slow variable x and, as in [31], not necessarily averaging to zero. They obtain a description of the fluctuations for (a generalisation of) such systems in the regime where there is an additional small parameter in front of F.

Formula (1.6) holds for the continuum of parameters \(H\in (\frac{1}{3}, 1)\). There are two special cases that were previously known. The case \(H = \frac{1}{2}\) reduces of course to the classical stochastic averaging results [17, 30, 66, 67] which state that the generator of the limiting diffusion is obtained by averaging the generator for the slow diffusion with the x variable frozen against the invariant measure for the fast process. Note that for this to match (1.9) one needs to interpret the stochastic integral in (1.4) in the Stratonovich sense. This is natural given that this is the interpretation that one obtains when replacing B by a smooth approximation, which is consistent with Remark 1.7. The fact that one also has convergence of flows however (in the case without feedback considered here) appears to be new even in this case.

Another set of closely related classical results deals with “time homogenisation”, also known as the Kramers–Smoluchowski limit or diffusion creation [44, 61]. There, one considers random ODEs of the type

$$\begin{aligned} \frac{dX_t^{\varepsilon }}{dt}=\frac{1}{\sqrt{\varepsilon }} F(X_t^{\varepsilon }, Y_t^{\varepsilon }) + F_0(X_t^{\varepsilon }, Y_t^{\varepsilon }), \end{aligned}$$
(1.11)

with F averaging to zero against the stationary measure \(\mu \) for the fast process Y. In this case, one also obtains a Markov process in the limit \(\varepsilon \rightarrow 0\) and its generator coincides with (1.9) if one sets \(H=1\). This can be understood by noting that, at least formally, fractional Brownian motion with Hurst parameter \(H=1\) is given by \(B(t) = ct\) with c a normal random variable, so that (1.4) reduces to (1.11), except for the random constant c, which then appears quadratically in (1.9) and therefore disappears when averaged out.

The standard proofs of averaging/homogenisation results found in the literature tend to fall roughly into two groups. The first contains functional analytic proofs based on general methods for studying singular limits of the form \(\exp (t \mathcal{L}_\varepsilon )\) for \(\mathcal{L}_\varepsilon = \varepsilon ^{-1} \mathcal{L}_0 + \mathcal{L}_1\). This of course requires the full process (slow plus fast) to be Markovian and completely breaks down in our situation. The second group consists of more probabilistic arguments, which typically rely on corrector techniques to construct sufficiently many martingales to be able to exploit the well-posedness of the martingale problem for the limiting Markov process. The latter are in principle more promising in our situation since the limiting process is still Markovian, but the lack of a Markov property for the prelimit process makes it unclear how to construct martingales from it. (But see Sect. 5.3 for a construction that does go in this direction.)

Instead, our proof relies on rough paths theory [15, 53], which has recently been used to recover homogenisation results (formally corresponding to the case \(H=1\)), for example in [40]. See also [1, 7, 12, 14] for more recent results with a similar flavour. In the case when the fast dynamics is non-Markovian and solves an equation driven by a fractional Brownian motion, a collection of homogenisation results was obtained in [23,24,25,26], while stochastic averaging results with non-Markovian fast motions are obtained in [51, 52] for the case \(H>\frac{1}{2}\). The former results are proved using rough path techniques, but there is of course an extensive literature on functional limit theorems based on either central or non-central limit theorems, see for example [3, 5, 6, 10, 54, 62, 63].

Finally, note that many physical systems can be regarded as slow/fast systems; these include second order Langevin equations and tagged particles in a turbulent random field [4, 8, 16, 41, 42, 61, 68]. They also arise in the context of perturbed completely integrable Hamiltonian systems [20, 49] and geometric stochastic systems [21, 48, 50, 59]. See also [13, 47, 70] for some review articles/monographs.

Remark 1.9

It may be surprising that, when \(H < \frac{1}{2}\), even though \(X^\varepsilon \) is driven by a fractional Brownian motion and F(x, y) isn’t assumed to be centred in the y variable, the limit \({\bar{X}}\) is a regular diffusion. This is unlike the case \(H > \frac{1}{2}\) [31, 52] where a non-centred F leads to an averaging result with a process driven by fBm in the limit. This change in behaviour can be understood heuristically as follows. With \(\eta \) as in (3.5), the covariance of \(t\mapsto f(Y_t^\varepsilon ) \dot{B}_t\) is given by \(\zeta _\varepsilon (t-s) = \eta ''(t-s) g((t-s)/\varepsilon )\) for some function \(g(t)=\langle f, P_{t}f\rangle \), which typically converges quite fast to a non-zero limit. The scaling properties of \(\eta \) (substituting \(t = \varepsilon u\) and using that \(\eta ''\) is homogeneous of degree \(2H-2\)) then show that

$$\begin{aligned} \int _\mathbf{R}\eta ''(t) g(t/\varepsilon )\,dt = \varepsilon ^{2H-1} \int _\mathbf{R}\eta ''(t) g(t)\,dt = C_g \varepsilon ^{2H-1}, \end{aligned}$$

for some constant \(C_g\) which has no reason to vanish in general. As a consequence, \(\varepsilon ^{1-2H} \zeta _\varepsilon \) converges pointwise to 0 while its integral remains constant, suggesting that \(\varepsilon ^{\frac{1}{2}-H} f(Y_t^\varepsilon ) \dot{B}_t\) indeed converges to a white noise. When \(H > \frac{1}{2}\) however, \(\eta ''\) is not absolutely integrable at infinity and one needs to assume that g vanishes there, which leads to a centering condition. A similar transition from diffusive to super-diffusive behaviour at \(H = \frac{1}{2}\) was observed in a different context in [41].

Remark 1.10

As explained, our result implies more, namely that the (random) flow induced by the SDE (1.4) converges in law to that induced by the Kunita-type SDE [46]

$$\begin{aligned} dx_i = W_i(x,dt) + \left( \partial ^{(2)}_j\Sigma _{ji}\right) (x,x)\,dt + {\bar{F}}_0^i(x)\,dt. \end{aligned}$$
(1.12)

In other words, the flows \(\psi _{s,t}^{\varepsilon }\), where \(\psi _{s,t}^\varepsilon (x)\) denotes the solution at time t to the x-component of (1.4) with initial condition x at time s, converge to a limit \(\psi _{s,t}\) which is Markovian in the sense that \(\psi _{s,t}\) and \(\psi _{u,v}\) are independent whenever \([s,t) \cap [u,v) = \emptyset \). This remark appears to be novel even when \(H=\frac{1}{2}\), but it is unclear whether it extends to the case when x feeds back into the dynamics of y as in (1.2).

Remark 1.11

The term \(\partial ^{(2)}_j\Sigma _{ji}\) appearing in (1.12) looks “almost” like an Itô-Stratonovich correction. In fact, when \(\mathcal{L}\) is self-adjoint on \(L^2(\mu )\), one has \(\Sigma _{ij}(x,{\bar{x}}) = \Sigma _{ji}({\bar{x}},x)\) in which case (1.12) is equivalent to \(dx_i = W_i(x,{\circ }\, dt)+ {\bar{F}}_0^i(x)\,dt\).

1.3 Heuristics for general slow/fast random ODEs

We now show how to heuristically derive (1.10). Consider a random ODE of the form

$$\begin{aligned} \frac{dX_t^{\varepsilon }}{dt}=\frac{1}{\sqrt{\varepsilon }} {\hat{F}}(X_t^{\varepsilon }, Z_t^{\varepsilon }), \end{aligned}$$
(1.13)

where \(Z_t^\varepsilon = Z(t/\varepsilon )\) for some stationary (but not necessarily Markovian!) stochastic process Z and \({\hat{F}}(x,\cdot )\) is assumed to be centred with respect to the stationary measure of Z. In the case when \({\hat{F}}(x,z) = {\hat{F}}(z)\) does not depend on x, it follows from the Green–Kubo formula [42, 45, 61] that, at least when Z has sufficiently nice mixing properties, \(X^\varepsilon \) converges as \(\varepsilon \rightarrow 0\) to a Wiener process with covariance \(\Sigma + \Sigma ^\top \), where

$$\begin{aligned}\Sigma = \int _0^\infty \mathbf{E}\big ( {\hat{F}}(Z_0)\otimes {\hat{F}}(Z_t)\big )\,dt. \end{aligned}$$

This suggests that a natural quantity to consider in the general case is

$$\begin{aligned} \Sigma (x,{\bar{x}}) = \int _0^\infty \mathbf{E}\big ({\hat{F}}(x,Z_0)\otimes {\hat{F}}({\bar{x}},Z_t)\big )\,dt, \end{aligned}$$
(1.14)

and that the limit of \(X^\varepsilon \) as \(\varepsilon \rightarrow 0\) is a diffusion with generator of the form

$$\begin{aligned} \big (\mathcal{A}g\big )(x) = \Sigma _{ij}(x,x)\partial ^2_{ij} g(x) + b_i(x)\partial _i g(x), \end{aligned}$$
(1.15)

for some drift term b.
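As a quick numerical illustration (our toy example, not from the paper): for \({\hat{F}}(Z) = \pm 1\) on a two-state chain flipping at rate q one gets \(\mathbf{E}({\hat{F}}(Z_0){\hat{F}}(Z_t)) = e^{-2qt}\), hence \(\Sigma = \frac{1}{2q}\), and \(X^\varepsilon _1\) should be approximately \(\mathcal{N}(0,2\Sigma )\):

```python
# Monte Carlo sanity check of the Green-Kubo limit (toy example, ours):
# Fhat(Z) = +/-1 on a two-state chain with flip rate q, so Sigma = 1/(2q)
# and Var(X^eps_1) -> 2*Sigma = 1/q as eps -> 0.
import numpy as np

rng = np.random.default_rng(2)
q, eps, h, N = 1.0, 1e-3, 5e-5, 2000
n = int(1 / h)                                 # grid on [0, 1] with step h
p = 0.5 * (1 - np.exp(-2 * q * h / eps))       # flip prob. of Z^eps per step
Fhat = 2.0 * (np.cumsum(rng.random((n, N)) < p, axis=0) % 2) - 1.0
X = np.sum(Fhat, axis=0) * h / np.sqrt(eps)    # Riemann sum for (1.13)
print("empirical Var:", X.var(), "  2*Sigma:", 1.0 / q)
```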

To derive the correct expression for the drift b, we note that one expects

$$\begin{aligned} \mathbf{E}\big (X^\varepsilon _{t+\delta t} - X^\varepsilon _t\,|\, \mathcal{F}_t\big ) \approx \delta t\, b(X^\varepsilon _t), \end{aligned}$$

in the regime \(\varepsilon \ll \delta t \ll 1\). The left-hand side of this expression is given by

$$\begin{aligned} \frac{1}{\sqrt{\varepsilon }} \int _t^{t+\delta t} \mathbf{E}\big ({\hat{F}}(X_s^{\varepsilon }, Z_s^{\varepsilon })\,|\,\mathcal{F}_t\big )\,ds. \end{aligned}$$
(1.16)

To lowest order, one can approximate this expression by replacing \(X_s^{\varepsilon }\) by \(X_t^{\varepsilon }\), but the resulting expression vanishes rapidly for \(s \gtrsim t+\varepsilon \) due to the centering condition on \({\hat{F}}\). To the next order, one has

$$\begin{aligned}&\mathbf{E}\big ({\hat{F}}(X_s^{\varepsilon }, Z_s^{\varepsilon })\,|\,\mathcal{F}_t\big ) \approx \mathbf{E}\Big ({\hat{F}}\Big (X_t^{\varepsilon }+ \frac{1}{\sqrt{\varepsilon }}\int _t^s {\hat{F}}(X_t^{\varepsilon }, Z_r^{\varepsilon })\,dr , Z_s^{\varepsilon }\Big )\,\Big |\,\mathcal{F}_t\Big )\nonumber \\&\quad \approx \mathbf{E}\big ({\hat{F}}(X_t^{\varepsilon }, Z_s^\varepsilon )\,|\,\mathcal{F}_t\big ) + \frac{1}{\sqrt{\varepsilon }}\int _t^s \mathbf{E}\big (D{\hat{F}}(X_t^{\varepsilon },Z_s^{\varepsilon }) {\hat{F}}(X_t^{\varepsilon }, Z_r^{\varepsilon })\,|\,\mathcal{F}_t\big )\,dr\nonumber \\&\quad \approx \sqrt{\varepsilon }\int _{0}^\infty \mathbf{E}\big (D{\hat{F}}(X_t^{\varepsilon },Z_u) {\hat{F}}(X_t^{\varepsilon }, Z_0)\,|\,\mathcal{F}_t\big )\,du, \end{aligned}$$
(1.17)

where the last identity follows from the substitution \(u = (s-r)/\varepsilon \) combined with the fact that, provided that Z is sufficiently rapidly mixing, we expect the main contribution from this integral to come from \(|u| \approx 1\), while typical values of s are such that \((s-t)/\varepsilon \approx \delta t/\varepsilon \gg 1\). Combining this with (1.16) eventually yields the expression

$$\begin{aligned} b(x) = \int _0^\infty \mathbf{E}\big (D{\hat{F}}(x,Z_s) {\hat{F}}(x, Z_0)\big )\,ds. \end{aligned}$$

Comparing this with (1.14), we conclude that

$$\begin{aligned} b_i(x) = \big (\partial _j\Sigma _{ji}(x,\cdot )\big )(x), \end{aligned}$$

(summation over repeated indices is implied) so that (1.15) can be written as

$$\begin{aligned}\big (\mathcal{A}g\big )(x) = \partial _j \big (\Sigma _{ji}(x,\cdot )\partial _i g\big )(x), \end{aligned}$$

which does coincide with the expression (1.9) as desired (here \({\bar{F}}_0 = 0\), since (1.13) has no drift term).
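Indeed, expanding the divergence form makes the match explicit:

$$\begin{aligned} \partial _j \big (\Sigma _{ji}(x,\cdot )\partial _i g\big )(x) = \Sigma _{ji}(x,x)\,\partial ^2_{ij} g(x) + \big (\partial ^{(2)}_j\Sigma _{ji}\big )(x,x)\,\partial _i g(x), \end{aligned}$$

so that the second-order coefficient is that of (1.15) and the first-order coefficient is precisely \(b_i\) as identified above.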

In order to link this calculation with the setting of the previous section, we note that (1.5) (with \(F_0 = 0\) for simplicity) can be coerced into the form (1.13) by setting \(Z_t = (\delta ^{H-1}v(t/\delta ), Y_t)\) as well as \({\hat{F}}(x,(v,y)) = F(x,y) v\). In this case, one has

$$\begin{aligned} \mathbf{E}\big ({\hat{F}}(x,Z_0)\otimes {\hat{F}}({\bar{x}},Z_t)\big )&= \delta ^{2H-2} C(t/\delta )\sum _{k=1}^m\mathbf{E}\big (F_k(x,Y_0)\otimes F_k({\bar{x}},Y_t)\big )\\&= R_\delta (t)\sum _{k=1}^m\int F_k(x,y)\otimes \big (P_t F_k\big )({\bar{x}},y)\,\mu (dy), \end{aligned}$$

so that one does indeed recover the expression (1.10) for any fixed \(\delta \).

Remark 1.12

The eagle-eyed reader will have spotted that since the stationary measure of Z is \(\mathcal{N}(0,C) \otimes \mu \) for some multiple C of the identity matrix and since \({\hat{F}}(x,(v,y))\) is linear in v, the centering condition for \({\hat{F}}\) is always satisfied, independently of the choice of F. This explains why our main result does not require any centering condition when \(H \le \frac{1}{2}\). When \(H > \frac{1}{2}\) however, the covariance function R decays too slowly for the heuristic derivation just given to apply. The centering condition for F then guarantees that correlations decay sufficiently fast to justify the second step in (1.17).

The remainder of this article is structured as follows. In Sect. 2 we introduce the assumptions on the nonlinearities \(F_i\) as well as on the fast process Y, we discuss a few examples, and we provide the statement of our main result. In Sect. 3 we then show that solutions to (1.5) converge as \(\delta \rightarrow 0\), which yields in particular a precise interpretation of what we mean by (1.4) when \(H < \frac{1}{2}\). The strategy of proof is as follows. Given a smooth mollification \(B^\delta \) of B, we first show convergence of \(\int _s^t \int _s^r f(u)\, \dot{B}^\delta (u)\, du\, g(r)\, \dot{B}^\delta (r)\,dr\) as \(\delta \rightarrow 0\) for any deterministic H-Hölder continuous functions f, g. While we are able to reduce this to existing criteria for canonical rough path lifts of Gaussian processes [9, 18] in the case where the two fractional Brownian motions appearing in this expression are independent, the case where they are equal requires a bit more care and relies on a simple trick given in Proposition 3.4, which is of independent interest. This then allows us to build an infinite-dimensional rough path \({{\mathbf {Z}}}^\varepsilon \) (taking values in a space of vector fields on \(\mathbf{R}^d\)) associated to (1.4) in a similar way as in [40, Sec. 1.5] (see also the “nonlinear rough paths” of [58] and [26]) and to reformulate (1.4) as an RDE driven by \({{\mathbf {Z}}}^\varepsilon \) with nonlinearity given by point evaluation. Section 3.2 provides details of the construction of \({{\mathbf {Z}}}^\varepsilon \), while Sect. 3.3 then uses it to formulate our main technical result, namely Theorem 3.14, which shows that \({{\mathbf {Z}}}^\varepsilon \) converges to a certain rough path lift of an infinite-dimensional Wiener process with covariance function given by \(\Sigma \). The remainder of the article is devoted to the proof of this convergence statement. Section 4 shows tightness of the family \(\{{{\mathbf {Z}}}^\varepsilon \}_{\varepsilon \le 1}\), while we identify its limit in Sect. 5. In both sections, the cases \(H < \frac{1}{2}\) and \(H > \frac{1}{2}\) are treated in completely different ways.

The fact that we have convergence of the full infinite-dimensional rough path allows us to conclude that we do not just have convergence of solutions for fixed initial conditions, but of the full solution flow. One point of note is that there are two separate sources of randomness, namely the Markov process Y and the fractional Brownian motion B. Our convergence result is “annealed” in the sense that our convergence in law requires both sources, but a number of intermediate results are “quenched” in the sense that they hold for almost every realisation of Y. It is an open question whether our final convergence result also holds in the quenched sense.

2 Precise Formulation and Results

In this section, we collect the precise assumptions on the functions \(F_i\) as well as the Markov process Y.

Convention. We write \(A \lesssim B\) as shorthand for \(A \le KB\) with a constant K that will differ from statement to statement.

2.1 Technical assumptions on the fast variable Y

Throughout the article we fix \(H \in (\frac{1}{3},1)\) as well as a sequence \((E_n)_{n \ge 0}\) of Banach spaces such that \(E_n \subset E_{n+1}\) and \(E_n \subset L^1(\mathcal{Y},\mu )\) for every \(n \ge 0\), and such that pointwise multiplication is a continuous operation from \(E_0 \times E_n\) into \(E_{n+1}\) for every \(n\ge 0\). We also write simply E instead of \(E_0\) and assume E contains constant functions. See Sect. 2.2 below for two classes of examples showing what type of spaces we have in mind here.

First, we impose that Y has “nice” ergodic properties in the following sense, which in particular implies that \(\mu \) is its unique invariant measure on \(\mathcal{Y}\).

Assumption 2.1

Let \(N = \infty \) for \(H > \frac{1}{2}\) and \(N=2\) for \(H \in (\frac{1}{3},\frac{1}{2}]\). For every \(n \in [1,N)\), the semigroup \(P_t\) extends to a strongly continuous semigroup on \(E_n\) and there exist constants C and \(c> 0\) (possibly depending on n) such that, for every \(f \in E_n\) with \(\int _\mathcal{Y}f d\mu =0\), one has

$$\begin{aligned} \Vert P_t f\Vert _{E_n} \le C e^{-ct} \Vert f\Vert _{E_n}. \end{aligned}$$
(2.1)

In the low regularity case, we also assume that the process Y has some sample path continuity when composed with a function in \(E_2\).

Assumption 2.2

For \(H \in (\frac{1}{3},\frac{1}{2})\) there exists \(p_\star > \max \{4d,12/(3H-1)\}\) such that for every \(f \in E_2\)

$$\begin{aligned} \Vert f(Y_t)-f(Y_0)\Vert _{L^{p_\star }} \le c \Vert f\Vert _{E_2}(t^H \wedge 1) \,\qquad \forall t \ge 0, \end{aligned}$$
(2.2)

for some constant \(c > 0\).

We also need some integrability.

Assumption 2.3

For \(H \ge \frac{1}{2}\), one has \(E_n \subset L^2(\mathcal{Y},\mu )\) for every \(n \ge 0\). For \(H < \frac{1}{2}\), one has \(E_2 \subset L^2(\mathcal{Y},\mu )\) and \(E \subset L^{p_\star }(\mathcal{Y},\mu )\).

Remark 2.4

Combined with the continuity of pointwise multiplication from \(E_0 \times E_n\) into \(E_{n+1}\), Assumption 2.3 implies that \(E \subset \bigcap _{p \ge 1} L^p(\mathcal{Y},\mu )\) for \(H\ge \frac{1}{2}\): if \(f \in E\), then \(f^n \in E_{n-1} \subset L^2(\mathcal{Y},\mu )\), so that \(f \in L^{2n}(\mathcal{Y},\mu )\) for every n.

Another consequence of these assumptions is as follows.

Remark 2.5

As a consequence of Assumption 2.2, we conclude that if \(f \in E_2\) and \(H < \frac{1}{2}\), then

$$\begin{aligned} \Vert P_t f - f\Vert _\mu ^2&= \mathbf{E}\bigl |\mathbf{E}\bigl (f(Y_t) - f(Y_0)\,|\,Y_0\bigr )\bigr |^2 \le \mathbf{E}|f(Y_t) - f(Y_0)|^2\\&\lesssim \Vert f\Vert _{E_2}^2 \big (t^{2H} \wedge 1\bigr ) . \end{aligned}$$

Recalling the definition of \(\mathcal{L}^\alpha \) from Definition 1.2, it follows that \(E_2 \subset \mathrm {Dom}(\mathcal{L}^\alpha )\) for every \(\alpha < H\) so that (1.6) is indeed well defined provided that \(F_k(x,\cdot ) \in E_2\) for every x. This will be guaranteed by Assumption 2.7 below.

2.2 Examples of fast variables

One possible concrete framework is as follows. Fix two weights \(V:\mathcal{Y}\rightarrow [1,\infty ]\) and \(W :\mathcal{Y}\rightarrow (0,\infty )\) and a metric d on \(\mathcal{Y}\) generating its topology with the property that there exists \(C>0\) such that, for all \(x,y \in \mathcal{Y}\) with \(d(x,y) \le 1\), one has

$$\begin{aligned} V(x) \le C V(y),\qquad W(x) \le C W(y). \end{aligned}$$
(2.3)

We then let \(\mathcal{B}_{V,W}\) be the Banach space of functions \(f :\mathcal{Y}\rightarrow \mathbf{R}\) such that

$$\begin{aligned} \Vert f\Vert _{V,W} {\mathop {=}\limits ^{\tiny {\hbox {def}}}}\sup _{x \in \mathcal{Y}}{|f(x)|\over V(x)} + \sup _{x,y \in \mathcal{Y}\atop d(x,y) \le 1}{|f(x) - f(y)|\over d(x,y) W(x) V(x)} < \infty . \end{aligned}$$

One choice of scale of function spaces that is suitable for a large class of Markov processes is to take \(E_n = \mathcal{B}_{V,W}\) for every \(n \ge 1\) (with suitably chosen V and W), while \(E_0\) is chosen to be the space of bounded Lipschitz continuous functions, namely \(\mathcal{B}_{1,1}\).

This framework is relatively general since it allows for a wide variety of choices of V, W, and of distance functions on \(\mathcal{Y}\), see [33, 36]. For example, it was shown in [32, Thm. 1.4] that the 2D stochastic Navier–Stokes equations exhibit a spectral gap in such spaces under extremely weak conditions on the driving noise. More precisely, for every \(\eta \) small enough there exist constants C and \(\gamma \) such that

$$\begin{aligned} \Big \Vert P_tf-\int f d\mu \Big \Vert _\eta \le C e^{-\gamma t}\Vert f\Vert _\eta , \end{aligned}$$

for every Fréchet differentiable function f and every \(t\ge 0\), where

$$\begin{aligned} \Vert f\Vert _\eta = \sup _x e^{-\eta |x|^2} \bigl (|f(x)| + |Df(x)|\bigr ). \end{aligned}$$
(2.4)

This at first sight appears to fall outside our framework, but one notices that if one sets

$$\begin{aligned} d(x,y) = \inf _{\gamma : x \rightarrow y} \int _0^1 (1+|\gamma (t)|)|{\dot{\gamma }}(t)|\,dt, \end{aligned}$$
(2.5)

then the norm \(\Vert \cdot \Vert _{V,W}\) with \(V(x) = \exp (\eta |x|^2)\) and \(W(x) = 1/(1+|x|)\) is equivalent to the norm (2.4). The reason for the choice of d as in (2.5), which is then “undone” by our choice of W, is to guarantee that (2.3) holds for V; this would fail had we used the condition \(|x-y| \le 1\) with the Euclidean distance.

To verify Assumption 2.2 one can then for example make use of the following.

Lemma 2.6

Suppose that \(\int \big (V(x)(1+W(x))\big )^{p_\star }\,\mu (dx) < \infty \) and there exists a constant c such that, for some \(\alpha _0>1-2H\),

$$\begin{aligned} \Vert d(Y_t, Y_0)\Vert _{L^{p_\star }} \le c\bigl (t^{\alpha _0} \wedge 1\bigr )\,\qquad \forall t \ge 0. \end{aligned}$$
(2.6)

Let \(f \in \mathcal{B}_{V,W}\) and \(2p \le p_\star \). Then

$$\begin{aligned} \Vert f(Y_t) - f(Y_0)\Vert _{L^p}\lesssim \Vert f\Vert _{V,W}(1\wedge t^{\alpha _0}). \end{aligned}$$

In particular, on any fixed time interval, we have \(\mathbf{E}\Vert f(Y_\cdot )\Vert _{\mathcal{C}^\alpha }^p < \infty \) provided that \(p(\alpha _0-\alpha ) > 1\).

Proof

For \(f \in \mathcal{B}_{V,W}\) with \(\Vert f\Vert _{V,W} \le 1\) and for \(p \le \frac{1}{2} p_\star \), one has

$$\begin{aligned} \Vert f(Y_t) - f(Y_0)\Vert _{L^p}&\le \big \Vert W(Y_0) V(Y_0) d(Y_t,Y_0) +\mathbf {1}_{d(Y_t,Y_0)> 1}\big (V(Y_0)+V(Y_t)\big )\big \Vert _{L^p} \\&\lesssim \Vert (1+W(Y_0)) V(Y_0) \Vert _{L^{2p}} \Bigl ((1\wedge t^{\alpha _0}) + \mathbf{P}(d(Y_t,Y_0) > 1)^{\frac{1}{2p}}\Bigr ) \\&\lesssim 1\wedge t^{\alpha _0}, \end{aligned}$$

where we combined (2.6) with Markov’s inequality in the last step. \(\square \)

When \(H \ge \frac{1}{2}\), Assumption 2.2 is empty, so only integrability conditions are required on the spaces \(E_n\). This allows one, for example, to use Harris’s theorem [29, 34, 55] to verify Assumption 2.1 for spaces of functions with weighted supremum norms. More precisely, one would then take E to be the space of all bounded Borel measurable functions and \(E_n = \mathcal{B}_V\), the Banach space of functions \(f :\mathcal{Y}\rightarrow \mathbf{R}\) such that

$$\begin{aligned} \Vert f\Vert _{V} {\mathop {=}\limits ^{\tiny {\hbox {def}}}}\sup _{x \in \mathcal{Y}}{|f(x)|\over V(x)} < \infty . \end{aligned}$$

In order to verify our assumptions, it then suffices that V is a square integrable Lyapunov function for the Markov process Y and that the sublevel sets of V satisfy a ‘small set’ condition for the transition probabilities of Y [55].

2.3 Main results

One final assumption we need is that the nonlinearities F and \(F_0\) appearing in (1.4) are sufficiently nice E-valued functions of their first argument. More precisely, we assume the following.

Assumption 2.7

The map \(x \mapsto F(x,\cdot )\) is of class \(\mathcal{C}^4\) with values in E and there exists an exponent \(\kappa > \frac{16 d}{p_\star }\) with \(p_\star \) as in Assumption 2.2 (and simply \(\kappa > 0\) when \(H > \frac{1}{2}\)) such that, for every multi-index \(\ell \) of length at most 4,

$$\begin{aligned} \Vert D_x^\ell F(x,\cdot )\Vert _{E} \lesssim (1+|x|)^{-\kappa }. \end{aligned}$$

The same is assumed to hold true for \(F_0\). When \(H > \frac{1}{2}\), we further assume that \(\int F_i(x,y)\,\mu (dy) = 0\) for every \(i \ne 0\) and every x.

The condition \(F \in \mathcal{C}^4\) is of course suboptimal and could probably be lowered to \(F \in \mathcal{C}^\beta \) for \(\beta > \max \{H^{-1},2\}\) and \(F_0 \in \mathcal{C}^\beta \) for \(\beta > 1\), at least if enough integrability is assumed in Assumption 2.3. We also now fix a Schwartz function \(\phi \) integrating to 1 and set \(\phi _\delta (t)=\frac{1}{\delta }\phi (t/ \delta )\). We then write \(B^\delta \) for the convolution of B with this mollifier, namely

$$\begin{aligned} B^\delta (t) = (\phi _\delta * B)(t)= \frac{1}{\delta }\int _\mathbf{R}\phi \Big (\frac{t-s}{\delta }\Big ) B(s)\, ds. \end{aligned}$$
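The following minimal sketch (ours; the Gaussian choice of \(\phi \) and all parameters are assumptions) illustrates this mollification on a sampled fBM path and the convergence \(B^\delta \rightarrow B\) as \(\delta \rightarrow 0\):

```python
# Mollification sketch (ours): B^delta = phi_delta * B for a Gaussian bump
# phi integrating to one; the sup distance to B shrinks roughly like delta^H.
import numpy as np

rng = np.random.default_rng(3)
H, n = 0.4, 2000
t = np.linspace(1e-6, 1.0, n); dt = t[1] - t[0]
R = 0.5 * (t[:, None] ** (2 * H) + t[None, :] ** (2 * H)
           - np.abs(t[:, None] - t[None, :]) ** (2 * H))
B = np.linalg.cholesky(R + 1e-12 * np.eye(n)) @ rng.standard_normal(n)

def mollify(path, delta):
    s = np.arange(-5 * delta, 5 * delta + dt, dt)
    phi = np.exp(-0.5 * (s / delta) ** 2) / (delta * np.sqrt(2 * np.pi))
    return np.convolve(path, phi * dt, mode="same")

for delta in (0.05, 0.02, 0.005):    # compare away from the boundary
    err = np.max(np.abs(mollify(B, delta) - B)[n // 3: -n // 3])
    print(delta, err)
```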

With this notation, the solutions to (1.5) are equal in law to the process given by

$$\begin{aligned} \dot{X}^{\varepsilon ,\delta } _t = \varepsilon ^{\frac{1}{2}-H} F(X_t^{\varepsilon ,\delta },Y_t^\varepsilon )\,\dot{B}^{\varepsilon \delta }(t) + F_0(X_t^{\varepsilon ,\delta },Y_t^{\varepsilon }). \end{aligned}$$
(2.7)

Since \(B^{\varepsilon \delta }\) is smooth, this equation should be interpreted as an ordinary differential equation that just happens to have random coefficients. With all these preliminaries at hand, our main result is the following.

Theorem 2.8

For \(H \in (\frac{1}{3},1]\) and under Assumptions 2.1–2.3 and 2.7, the conclusions of Theorem 1.5 hold. With \(X^{\varepsilon ,\delta }\) defined in (2.7), the convergence \(X^{\varepsilon ,\delta } \rightarrow X^\varepsilon \) furthermore holds in probability.

Proof

The convergence in probability of the flow \(X^{\varepsilon ,\delta } \rightarrow X^{\varepsilon }\) is the content of Proposition 3.2 below. The proof of the conclusion of Theorem 1.5, namely the convergence in law of the flow for (1.4) as \(\varepsilon \rightarrow 0\) is the content of Corollary 3.15. \(\square \)

3 Convergence of Smooth Approximations

We first address the question of the convergence in probability of solutions to (1.5) to those of (1.4) as \(\delta \rightarrow 0\) for \(\varepsilon > 0\) fixed. In fact, we will directly provide an interpretation of (1.4) and show that this interpretation is sufficiently stable to allow for the approximation (1.5).

Our convergence proof relies on the theory of rough paths; we refer to [15] for an introduction. The main insight of this theory is that even though, for \(H \le \frac{1}{2}\), the solution map \(B \mapsto X\) for equations of the type (1.4) isn’t continuous when viewing B as an element of any classical function space large enough to contain typical sample paths of fractional Brownian motion, it does become continuous when enhancing B with its iterated integrals \({{\mathbb {B}}}= \int B\otimes dB\) and endowing the space of pairs \((B,{{\mathbb {B}}})\) with a suitable topology.

For this, consider for any \(x \in \mathbf{R}^d\) the processes

$$\begin{aligned} Z^{\varepsilon ,\delta }_{s,t}(x) = {\varepsilon }^{\frac{1}{2}-H}\int _s^t F(x,Y^\varepsilon _r)\,dB^\delta (r),\quad {\bar{Z}}^{\varepsilon }_{s,t}(x) = \int _s^t F_0(x,Y^\varepsilon _r)\,dr. \end{aligned}$$
(3.1)

Here, the first integral is interpreted as a Wiener integral which makes sense also when \(\delta = 0\) and, when \(\delta > 0\), coincides with the Riemann–Stieltjes integral. Recall that the Wiener integral of a deterministic (or independent) integrand against any Gaussian process B is well-defined provided that the integrand belongs to the reproducing kernel Hilbert space \(\mathcal{H}_B\) of B and provides an isometric embedding \(\mathcal{H}_B \ni f \mapsto \int f\,dB \in L^2(\Omega ,\mathbf{P})\). In the case of fractional Brownian motion, it is known that \(L^2 \subset \mathcal{H}_B\) when \(H \ge \frac{1}{2}\) while for \(H < \frac{1}{2}\) one has \(\mathcal{C}^\alpha \subset \mathcal{H}_B\) for every \(\alpha > \frac{1}{2}-H\). The fact that for fixed x and \(\varepsilon > 0\), \(t \mapsto F(x,Y^\varepsilon _t)\) belongs to \(\mathcal{H}_B\) for all \(H > \frac{1}{3}\) is then a simple consequence of Assumptions 2.2 and 2.3 combined with Kolmogorov’s continuity criterion (when \(H < \frac{1}{2}\)).

Write \(\mathcal{B}= \mathcal{C}_b^3(\mathbf{R}^d,\mathbf{R}^d)\) and \(\mathcal{B}_k = \mathcal{C}_b^3(\mathbf{R}^{d\cdot k},(\mathbf{R}^{d})^{\otimes k})\), so that one has canonical inclusions of the algebraic tensor product \(\mathcal{B}_k \otimes _0 \mathcal{B}_\ell \subset \mathcal{B}_{k+\ell }\) with the usual identification \((f\otimes g)(x,y)= f(x) g(y)\) thanks to the fact that \(\Vert f\otimes g\Vert _{\mathcal{B}_{k+\ell }}\le \Vert f\Vert _{\mathcal{B}_\ell }\Vert g\Vert _{\mathcal{B}_k}\). Given a final time \(T>0\) and \(\alpha \in (\frac{1}{3},\frac{1}{2})\), we define the space \(\mathscr {C}^\alpha ([0,T], \mathcal{B}\oplus \mathcal{B}_2)\) of \(\alpha \)-Hölder rough paths in the usual way [15, Def. 2.1], but with all norms of level-2 objects in \(\mathcal{B}_2\). Recall that an \(\alpha \)-Hölder rough path \((X, {{\mathbb {X}}})\) is a pair of functions where \(X\in \mathscr {C}^\alpha ([0,T], \mathcal{B})\) with \(X_0=0\) and \({{\mathbb {X}}}: \Delta _T \rightarrow \mathcal{B}_2\) where \(\Delta _T :=\{(s,t): 0\le s\le t \le T\}\) is the two-simplex with \(|{{\mathbb {X}}}|_{2 \alpha }:=\sup _{(s,t)\in \Delta _T}\frac{\Vert {{\mathbb {X}}}_{s,t}\Vert _{\mathcal{B}_2}}{|t-s|^{2 \alpha }}<\infty \). In addition, Chen’s relation is imposed, namely \({{\mathbb {X}}}_{s,t}-{{\mathbb {X}}}_{s,u}-{{\mathbb {X}}}_{u,t}=X_{s,u}\otimes X_{u,t}\).
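As a concrete sanity check of Chen’s relation (our toy computation, for a smooth deterministic path): left-point Riemann sums for X and \({{\mathbb {X}}}\) satisfy the relation exactly at the discrete level, so the error below is pure floating-point round-off.

```python
# Toy check (ours): discrete iterated integrals of a smooth path in R^2
# satisfy Chen's relation XX_{s,t} - XX_{s,u} - XX_{u,t} = X_{s,u} ox X_{u,t}.
import numpy as np

ts = np.linspace(0.0, 1.0, 20001)
X = np.stack([np.sin(3 * ts), ts ** 2], axis=1)     # smooth path in R^2
dX = np.diff(X, axis=0)

def XX(i, j):
    """Left-point sum for int_{t_i}^{t_j} (X_r - X_{t_i}) ox dX_r."""
    return (X[i:j] - X[i]).T @ dX[i:j]

s, u, t_ = 2000, 9000, 17000
lhs = XX(s, t_) - XX(s, u) - XX(u, t_)
rhs = np.outer(X[u] - X[s], X[t_] - X[u])
print(np.max(np.abs(lhs - rhs)))                    # ~ 1e-16
```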

We define the second-order processes \({{\mathbb {Z}}}^{\varepsilon ,\delta }\) and \({\bar{{{\mathbb {Z}}}}}^\varepsilon \) by

$$\begin{aligned} {{\mathbb {Z}}}^{\varepsilon ,\delta }_{s,t}(x,{\bar{x}}) = \int _s^t Z^{\varepsilon ,\delta }_{s,r}(x)\, dZ^{\varepsilon ,\delta }_{s,r}({\bar{x}}),\qquad {\bar{{{\mathbb {Z}}}}}^{\varepsilon }_{s,t}(x,{\bar{x}}) = \int _s^t {\bar{Z}}^{\varepsilon }_{s,r}(x)\, d{\bar{Z}}^{\varepsilon }_{s,r}({\bar{x}}), \end{aligned}$$

(the differentials are taken in the r variable) and we define \({{\mathbf {Z}}}^{\varepsilon ,\delta } = (Z^{\varepsilon ,\delta },{{\mathbb {Z}}}^{\varepsilon ,\delta })\), \({\bar{{{\mathbf {Z}}}}}^\varepsilon = ({\bar{Z}}^\varepsilon ,{\bar{{{\mathbb {Z}}}}}^\varepsilon )\). Note here that \(r \mapsto Z_{s,r}^{\varepsilon ,\delta }(x)\) is smooth and \(r \mapsto {\bar{Z}}_{s,r}^{\varepsilon }(x)\) is Hölder continuous for any exponent strictly less than 1, so these integrals should be interpreted as regular Riemann–Stieltjes integrals. In Sect. 3.2 below we will give a proof of the following result.

Proposition 3.1

Let \(H \in (\frac{1}{3},1]\), let \(\alpha \in (\frac{1}{3},H\wedge \frac{1}{2})\) and \(\beta \in (1-\alpha ,1)\), and let Assumptions 2.1–2.3 and 2.7 hold. Then, \({{\mathbf {Z}}}^{{\varepsilon },\delta }\) and \({\bar{{{\mathbf {Z}}}}}^\varepsilon \) admit versions that are random elements in \(\mathscr {C}^\alpha ([0,T], \mathcal{B}\oplus \mathcal{B}_2)\) and \(\mathscr {C}^\beta ([0,T], \mathcal{B}\oplus \mathcal{B}_2)\) respectively. Furthermore, \({{\mathbf {Z}}}^{{\varepsilon },\delta }\) converges in probability in \(\mathscr {C}^\alpha ([0,T], \mathcal{B}\oplus \mathcal{B}_2)\) as \(\delta \rightarrow 0\) to the random rough path \({{\mathbf {Z}}}^{{\varepsilon }}\) characterised in Proposition 3.11 below. (In particular, the first order component \(Z^{{\varepsilon }}\) of \({{\mathbf {Z}}}^{{\varepsilon }}\) is given, for any fixed x, by the Wiener integral (3.1) with \(\delta = 0\).)

For now, we take this result for granted. With it in place, we obtain the following convergence result as \(\delta \rightarrow 0\).

Proposition 3.2

The second claim of Theorem 2.8 holds.

Proof

With the space \(\mathcal{B}\) as above, let \(\delta :\mathbf{R}^d \rightarrow L(\mathcal{B},\mathbf{R}^d)\) be the function given by \(\delta (x)(f) = f(x)\). We then claim that, for any \(\varepsilon , \delta > 0\), (2.7) can be rewritten as the rough differential equation (RDE) driven by the infinite-dimensional rough paths \({{\mathbf {Z}}}^{{\varepsilon },\delta }\) and \({\bar{{{\mathbf {Z}}}}}^{{\varepsilon }}\) defined above and given by

$$\begin{aligned} dX = \delta (X)\,d{{\mathbf {Z}}}^{{\varepsilon },\delta } + \delta (X)\,d{\bar{{{\mathbf {Z}}}}}^{{\varepsilon }}. \end{aligned}$$
(3.2)

Note that since \(\alpha + \beta > 1\), there is no need to specify cross-integrals between \({{\mathbf {Z}}}^{\varepsilon ,\delta }\) and \({\bar{{{\mathbf {Z}}}}}^{\varepsilon }\) since they can be defined in a canonical way using Young integration [72].

To check that this RDE is well-posed for any rough paths \({{\mathbf {Z}}}^{{\varepsilon },\delta }\) and \({\bar{{{\mathbf {Z}}}}}^\varepsilon \) belonging to \(\mathscr {C}^\alpha ([0,T], \mathcal{B}\oplus \mathcal{B}_2)\) and \(\mathscr {C}^\beta ([0,T], \mathcal{B}\oplus \mathcal{B}_2)\) respectively, we note first that one readily verifies that the map \(\delta \) is Fréchet differentiable, and actually even \(\mathcal{C}^3_b\). Its differential \(D\delta \) at \(x\in \mathbf{R}^d\) in the direction of \(y\in \mathbf{R}^d\) is given by \((D\delta )_x(y)(h) = (Dh)_x(y)\) where \(h\in \mathcal{B}\). Moreover \(|(D\delta )_x(y)(h)|\le |y|\,\Vert h\Vert _\mathcal{B}\). In particular we may consider the map \(D\delta \cdot \delta :\mathbf{R}^d \rightarrow L(\mathcal{B}\otimes \mathcal{B},\mathbf{R}^d)\) which for \(h = h_1 \otimes h_2\in \mathcal{B}_2\) is given by

$$\begin{aligned} \bigl (D\delta \cdot \delta \bigr )(x) h&= (D\delta )_x \bigl (\delta (x)(h_1)\bigr )(h_2)=(D\delta )_x \bigl (h_1(x)\bigr )(h_2) \nonumber \\&= (Dh_2)_x(h_1(x)) = (\mathop {\mathrm {tr}}\limits D^{(2)} h)(x,x), \end{aligned}$$
(3.3)

for a suitable partial trace \(\mathop {\mathrm {tr}}\limits \). This shows that \(D\delta \cdot \delta \) extends continuously to a \(\mathcal{C}^2_b\) map from \(\mathbf{R}^d\) into \(L(\mathcal{B}_2, \mathbf{R}^d)\). (Since \(\mathcal{B}_2\) differs from the projective tensor product of \(\mathcal{B}\) with itself, this doesn’t automatically follow from the fact that \(\delta \) itself is \(\mathcal{C}^3_b\).) Retracing the standard existence and uniqueness proof for RDEs, [15, Sec. 8.5] then shows that (3.2) admits unique (global) solutions for every initial condition and every driving path. Furthermore, if the sample path \(t \mapsto Y(t)\) is given by any continuous \(\mathcal{Y}\)-valued function then, under the stated regularity conditions on \(F, F_0\), it is straightforward to verify that the solutions to (3.2) coincide with those of (2.7).

Since the RDE solution is a jointly locally Lipschitz continuous function of both the initial data \(x_0\) and the driving paths \({{\mathbf {Z}}}^{\varepsilon ,\delta }\) and \({\bar{{{\mathbf {Z}}}}}^{\varepsilon }\) into \(\mathcal{C}^\alpha (\mathbf{R}_+,\mathbf{R}^d)\), the claim that \(X^{\varepsilon ,\delta } \rightarrow X^\varepsilon \) in probability then follows immediately from Proposition 3.1. \(\square \)

Remark 3.3

This shows that it is consistent to define solutions to (1.4) in the general case as simply being a shorthand for solutions to (3.2) driven by the pair of rough paths \({{\mathbf {Z}}}^\varepsilon \) and \({\bar{{{\mathbf {Z}}}}}^\varepsilon \). This is the interpretation that we will use from now on. The fact that in the special case \(H =\frac{1}{2}\) this coincides with the Stratonovich interpretation of the equation follows as in [15, Thm 9.1].

3.1 Preliminary results

In this section we present a few general results that will be used in the proof of Proposition 3.1. We start with the following elementary property of the second Wiener chaos.

Proposition 3.4

Let \(\mathcal{H}\subset \mathcal{E}\) be an abstract Wiener space and let \(B, {\tilde{B}}\) be two i.i.d. Gaussian random variables on \(\mathcal{E}\) with Cameron–Martin space \(\mathcal{H}\). Let \(K^\delta :\mathcal{E}\times \mathcal{E}\rightarrow \mathbf{R}\) be continuous bilinear maps such that the limit \({\tilde{K}} = \lim _{\delta \rightarrow 0} K^\delta (B,{\tilde{B}})\) exists in \(L^2\). Then, the limit

$$\begin{aligned} K = \lim _{\delta \rightarrow 0} \bigl (K^\delta (B,B)- \mathbf{E}K^\delta (B,B)\bigr ) \end{aligned}$$
(3.4)

exists in \(L^2\). Furthermore, the limit in (3.4) depends only on the limit \({\tilde{K}}\) and not on the approximating sequence \(K^\delta \) and one has the bound \(\mathbf{E}K^2 \le 2 \mathbf{E}{\tilde{K}}^2\).

Proof

The Gaussian probability space generated by the pair \((B,{\tilde{B}})\) has Cameron–Martin space \(\mathcal{H}\oplus {\tilde{\mathcal{H}}}\) where \({\tilde{\mathcal{H}}}\) is a copy of \(\mathcal{H}\). Since \(K^\delta \) is bilinear and \(K^\delta (B,{\tilde{B}})\) has vanishing expectation, it belongs to the second homogeneous Wiener chaos, so that there exists \({\hat{\mathrm {K}}}^\delta \in (\mathcal{H}\oplus {\tilde{\mathcal{H}}}) \otimes _s (\mathcal{H}\oplus {\tilde{\mathcal{H}}})\) with \(K^\delta (B,{\tilde{B}}) = \mathcal{I}_2({\hat{\mathrm {K}}}^\delta )\), where \(\mathcal{I}_k\) denotes the usual isometry between kth symmetric tensor power and kth homogeneous Wiener chaos, see [57].

Note now that, interpreting \({\hat{\mathrm {K}}}^\delta \) as a Hilbert–Schmidt operator on \(\mathcal{H}\oplus {\tilde{\mathcal{H}}}\), there exists \(\mathrm {K}^\delta \in {\tilde{\mathcal{H}}} \otimes \mathcal{H}\) such that \({\hat{\mathrm {K}}}^\delta = \iota \mathrm {K}^\delta \), where \(\iota :{\tilde{\mathcal{H}}} \otimes \mathcal{H}\rightarrow (\mathcal{H}\oplus {\tilde{\mathcal{H}}}) \otimes _s (\mathcal{H}\oplus {\tilde{\mathcal{H}}})\) is given by

$$\begin{aligned}\iota \mathrm {K}= \frac{1}{2} \begin{pmatrix} 0 &{} \tau \mathrm {K}\\ \mathrm {K}&{} 0 \end{pmatrix}, \end{aligned}$$

with the obvious matrix notation and \(\tau :{\tilde{\mathcal{H}}} \otimes \mathcal{H}\rightarrow \mathcal{H}\otimes {\tilde{\mathcal{H}}}\) the transposition operator.

This is because the first diagonal block is obtained by testing against \(\langle k, (B, {\tilde{B}})\rangle \) where \(k=(f, 0) \otimes _s (g,0)\) with \(f,g \in \mathcal{H}\), yielding

$$\begin{aligned}\mathbf{E}\bigl (K^\delta (B,{\tilde{B}}) \bigl (\langle f,B\rangle \langle g,B\rangle - \langle f,g\rangle \bigr )\bigr ) = 0. \end{aligned}$$

The second diagonal block vanishes for the same reason with the roles of B and \({\tilde{B}}\) exchanged. (Here we denote by \(x\mapsto \langle x,f\rangle \) the unique element of \(L^2(\mathcal{E})\) which is linear on a set of full measure containing \(\mathcal{H}\) and coincides with \(\langle f,\cdot \rangle \) there. In fact, \(\langle x,f\rangle =f^*(x)\) when \(f^*\in \mathcal{E}^*\) and \(\langle B,f\rangle =f^*\circ B\).)

On the other hand, one has

$$\begin{aligned}K^\delta _\diamond (B,B) {\mathop {=}\limits ^{\tiny {\hbox {def}}}}K^\delta (B,B)- \mathbf{E}K^\delta (B,B) = \frac{1}{2} \mathcal{I}_2(\mathrm {K}^\delta + \tau \mathrm {K}^\delta ), \end{aligned}$$

where this time \(\mathcal{I}_2\) refers to the isometry between \(\mathcal{H}\otimes _s\mathcal{H}\) and the second chaos generated by B only. Since \(\mathcal{I}_2\) and \(\tau \) are both isometries, it immediately follows that \(\mathbf{E}(K^\delta _\diamond (B,B))^2 \le \Vert \mathrm {K}^\delta \Vert ^2 = 2 \Vert \iota \mathrm {K}^\delta \Vert ^2 = 2\mathbf{E}(K^\delta (B,{\tilde{B}}))^2\), and similarly for differences \(K^\delta - K^{\delta '}\), so that the claim follows. \(\square \)
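A finite-dimensional illustration of the bound \(\mathbf{E}K^2 \le 2\mathbf{E}{\tilde{K}}^2\) (our toy example): take \(K^\delta (B,{\tilde{B}}) = \langle B, M{\tilde{B}}\rangle \) for a fixed matrix M and \(B,{\tilde{B}}\) i.i.d. standard Gaussian on \(\mathbf{R}^n\). Then \(\mathbf{E}\langle B,M{\tilde{B}}\rangle ^2 = \Vert M\Vert _F^2\), while the renormalised diagonal \(\langle B,MB\rangle - \mathop {\mathrm {tr}}\limits M\) has second moment \(\frac{1}{2}\Vert M+M^\top \Vert _F^2 \le 2\Vert M\Vert _F^2\):

```python
# Finite-dimensional illustration of Proposition 3.4 (toy example, ours).
import numpy as np

rng = np.random.default_rng(4)
n, N = 6, 200000
M = rng.standard_normal((n, n))

exact_mixed = np.sum(M ** 2)               # E <B, M Btilde>^2 = ||M||_F^2
exact_diag = 0.5 * np.sum((M + M.T) ** 2)  # E (<B, M B> - tr M)^2

B = rng.standard_normal((n, N))
Bt = rng.standard_normal((n, N))
print(exact_mixed, np.var(np.sum(B * (M @ Bt), axis=0)))
print(exact_diag, np.var(np.sum(B * (M @ B), axis=0)), "<=", 2 * exact_mixed)
```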

Remark 3.5

It follows immediately from (3.4) that if we replace \(K^\delta \) by the same sequence of bilinear maps, but with their two arguments exchanged, the limit one obtains is the same.

Before we turn to the precise statement, we introduce the following notation which will be used repeatedly in the sequel. We write \(\eta \) for the distribution on \(\mathbf{R}\) given by

$$\begin{aligned} \langle \eta ,\phi \rangle =\int _\mathbf{R}|t|^{2H}\phi (t)\,dt, \end{aligned}$$
(3.5)

and \(\eta ''\) for its second distributional derivative. For \(a< 0 < b\), we will then make the abuse of notation \(\int _a^b \eta ''(t)\,\phi (t)\,dt\) as a shorthand for \(\lim _{\varepsilon \rightarrow 0} \langle \eta '', \phi \mathbf {1}^\varepsilon _{[a,b]}\rangle \), where \(\mathbf {1}^\varepsilon _{[a,b]}\) denotes a mollification of the indicator function \(\mathbf {1}_{[a,b]}\). The following is elementary.

Lemma 3.6

Let \(a<0<b\) and \(H \in (\frac{1}{3},\frac{1}{2})\). Setting \(\alpha _H= H(1-2H)\) and \(\phi _0 = \phi (0)\), the limit above is given by

$$\begin{aligned} \int _a^b \eta ''(t)\,\phi (t)\,dt = -2\alpha _H \int _a^b |t|^{2H-2}\,\big (\phi (t) - \phi _0\big )\,dt + 2H \phi _0 \bigl (|a|^{2H-1} + |b|^{2H-1}\bigr ),\nonumber \\ \end{aligned}$$
(3.6)

independently of the choice of mollifier, thus justifying the notation. For \(a = 0\), we set

$$\begin{aligned}\int _0^b \eta ''(t)\,\phi (t)\,dt = -2\alpha _H \int _0^b |t|^{2H-2}\,\big (\phi (t) - \phi _0\big )\,dt + 2H \phi _0 |b|^{2H-1}, \end{aligned}$$

which can be justified in a similar way, provided that the mollifier one uses is symmetric. (This in turn is the case if we view \(\eta \) as the limit of covariances of smooth approximations to fractional Brownian motion.)

For \(H = \frac{1}{2}\), one similarly has \(\int _a^b \eta ''(t)\,\phi (t)\,dt = 2\phi _0\) and \(\int _0^b \eta ''(t)\,\phi (t)\,dt = \phi _0\) (consistent with \(\eta '' = 2\delta _0\) in that case), while for \(H \in (\frac{1}{2},1)\), \(\eta ''\) is given by the locally integrable function \(t \mapsto -2\alpha _H |t|^{2H-2}\).

\(\square \)
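Identity (3.6) is used repeatedly below, so we record a quick numerical sanity check. The following Python sketch (not part of the argument; the choices of H, \(\phi \), a, b are arbitrary) approximates the left-hand side as \(\langle \eta , (\phi \mathbf {1}^\varepsilon _{[a,b]})''\rangle \) on a grid, using a \(\mathcal{C}^2\) ramp as mollified indicator, and compares it with the right-hand side of (3.6); the two values should agree as \(\varepsilon \rightarrow 0\) and the grid is refined.

```python
import numpy as np

H = 0.40                                 # any H in (1/3, 1/2)
alpha_H = H * (1.0 - 2.0 * H)
a, b = -1.0, 1.5
phi = np.cos                             # any smooth test function

def ramp(u):                             # C^2 "smootherstep" ramp from 0 to 1
    u = np.clip(u, 0.0, 1.0)
    return u**3 * (10.0 + u * (-15.0 + 6.0 * u))

def lhs(eps, dt=2e-5):
    t = np.arange(a - eps, b + eps, dt)
    ind = ramp((t - a) / eps + 0.5) * ramp((b - t) / eps + 0.5)
    psi = phi(t) * ind                   # psi = phi * 1^eps_{[a,b]}
    psi2 = np.gradient(np.gradient(psi, dt), dt)
    # <eta'', psi> = <eta, psi''> since psi is C^2 and compactly supported
    return np.trapz(np.abs(t) ** (2 * H) * psi2, t)

def rhs():
    t = np.concatenate([np.linspace(a, -1e-9, 300_000),
                        np.linspace(1e-9, b, 300_000)])
    integrand = np.abs(t) ** (2 * H - 2) * (phi(t) - phi(0.0))
    boundary = 2 * H * phi(0.0) * (abs(a) ** (2*H - 1) + abs(b) ** (2*H - 1))
    return -2.0 * alpha_H * np.trapz(integrand, t) + boundary

print(lhs(1e-2), rhs())                  # should agree to a few percent
```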

We now show that for any fixed \(\varepsilon > 0\), the processes \(Z^\varepsilon \) satisfy a suitable form of Hölder regularity. To keep notations shorter, we define the collection of processes

$$\begin{aligned} Z^f_{s,t} {\mathop {=}\limits ^{\tiny {\hbox {def}}}}\int _s^t f(r)\,dB(r)=\sum _{i=1}^m\int _s^t\langle f(r), e^i\rangle _{\mathbf{R}^m}\, dB_r^i, \end{aligned}$$
(3.7)

indexed by \(\mathbf{R}^m\)-valued functions f that belong to the reproducing kernel space of the fractional Brownian motion B. Here, \(\{e^i\}\) is an orthonormal basis of \(\mathbf{R}^m\) and \(B_r=(B_r^1, \dots , B_r^m)\). We start our analysis with a preliminary result for the irregular case \(H < \frac{1}{2}\).

Lemma 3.7

Let \(H \in (\frac{1}{3},\frac{1}{2})\) and let \(f\in \mathcal{C}^\beta ([0,1],\mathbf{R}^m)\) for some \(\beta > (1-2H) \vee 0\). The processes \(Z^f\) satisfy the Coutin–Qian conditions [9, 18, Def. 14] in the sense that

$$\begin{aligned} \mathbf{E}\big ((Z^f_{s,t})^2\bigr )&\lesssim \Vert f\Vert _\infty \Vert f\Vert _{\mathcal{C}^\beta } |t-s|^{2H},\\ \big |\mathbf{E}\big (Z^f_{s,s+h}\, Z^f_{t,t+h}\bigr )\big |&\lesssim \Vert f\Vert _\infty ^2 |t-s|^{2H-2} h^2, \end{aligned}$$

for all \(0< s< t < 1\) and all \(h \in (0, t-s]\).

Proof

The mixed second order distributional derivative of \( \mathbf{E}(B^\delta _sB_t^\delta )\) is given by \(\frac{1}{2} \eta _\delta ''(s-t)\), the convolution of \(\frac{1}{2}\eta ''\) with a symmetric mollifier at scale \(\delta \). Mollifying B and taking limits shows that we have the identity

$$\begin{aligned} \mathbf{E}(Z^f_{s,t})^2&= \lim _{\delta \rightarrow 0} \int _s^t\int _s^t \mathbf{E}(\dot{B}_u^\delta \dot{B}_v^\delta ) f(u)f(v)\,du\,dv\;\\&= \frac{1}{2}\int _s^t\int _s^t \eta ''(u-v)f(u)f(v)\,du\,dv, \end{aligned}$$

(with summation over the components of f implied). For \(H < \frac{1}{2}\), this yields the bound

$$\begin{aligned} \mathbf{E}(Z^f_{s,t})^2&=\frac{1}{4} \int _{s-t}^{t-s}\eta ''(v) \int _{2s+|v|}^{2t-|v|} \Bigl (f\Big (\frac{u+v}{2}\Big )f\Big (\frac{u-v}{2}\Big )\Bigr )\,du\,dv \nonumber \\&= -\frac{1}{2}\alpha _H\int _{s-t}^{t-s} |v|^{2H-2} \int _{2s+|v|}^{2t-|v|} \Bigl (f\Big (\frac{u+v}{2}\Big )f\Big (\frac{u-v}{2}\Big ) - f\Big (\frac{u}{2}\Big )^2\Bigr )\,du\,dv \nonumber \\&\qquad +\frac{\alpha _H}{1-2H}\int _0^{t-s}u^{2H-1} \Bigl (f\Big (t-\frac{u}{2}\Big )^2+ f\Big (s+\frac{u}{2}\Big )^2\Bigr )\,du \nonumber \\&\lesssim \Vert f\Vert _{\mathcal{C}^\beta } \Vert f\Vert _\infty |t-s|^{2H+\beta } + \Vert f\Vert _\infty ^2 |t-s|^{2H}, \end{aligned}$$
(3.8)

as required.

Regarding the covariance, we have

$$\begin{aligned} \big |\mathbf{E}\bigl (Z^f_{s,s+h}Z^f_{t,t+h}\bigr )\bigr |&\lesssim \Vert f\Vert _\infty ^2 \int _0^h \int _0^h|t-s+v-u|^{2H-2} \,du\,dv \\&\lesssim \Vert f\Vert _\infty ^2 h^2 |t-s|^{2H-2}, \end{aligned}$$

when \(0<h\le t-s\), so that the intervals \([s,s+h]\) and \([t,t+h]\) overlap in at most one point, as required. \(\square \)
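As a pure illustration of the first bound (in no way needed for the proofs), one can sample fractional Gaussian noise exactly on a grid, since its covariance matrix is explicit, and compare the variance of the Riemann sums approximating \(Z^f_{0,t}\) with \(t^{2H}\). In the sketch below all parameters are arbitrary, and the printed ratios should remain of order one.

```python
import numpy as np

H, n, paths = 0.40, 1024, 400
rng = np.random.default_rng(0)
dt = 1.0 / n

# covariance of fractional Gaussian noise increments on the grid
k = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :]).astype(float)
cov = 0.5 * ((k + 1) ** (2*H) + np.abs(k - 1) ** (2*H) - 2 * k ** (2*H))
cov *= dt ** (2 * H)
L = np.linalg.cholesky(cov + 1e-12 * np.eye(n))
dB = L @ rng.standard_normal((n, paths))      # exact fBM increments

f = np.cos(np.linspace(0.0, 1.0, n))          # a smooth, hence Hoelder, f
for m in (n // 16, n // 4, n):                # t = m * dt
    Z = (f[:m, None] * dB[:m]).sum(axis=0)    # Riemann sum for int_0^t f dB
    print(m * dt, Z.var() / (m * dt) ** (2 * H))
```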

If \({\tilde{B}}\) is an independent copy of B, we can combine this result with those of [18] to conclude that there is a canonical rough path associated with the path \(\big (\int f(r)\,dB(r), \int g(r)\,d{\tilde{B}}(r)\big )\) for any \(f,g\in \mathcal{C}^\beta ([0,1],\mathbf{R}^m)\). We now show that there also exists a canonical lift for \((Z^f, Z^g)\), where the integrals are defined with respect to the same fractional Brownian motion B. With this notation, we then have the following result.

Proposition 3.8

For \(H \in (\frac{1}{3},\frac{1}{2})\), we set \(\alpha = H\) and \(U = \mathcal{C}^\beta ([0,1],\mathbf{R}^m)\) for some \(\beta \in (\frac{1}{3} , H)\). For \(H \in [\frac{1}{2}, 1)\), we fix \(p > 3/(3H-1)\) and set \(\alpha = H-\frac{1}{p}\) and \(U = L^p([0,1],\mathbf{R}^m)\).

Then, for any finite collection \(\{f_i\}_{i \le N} \subset U\), there is a “canonical lift” of

$$\begin{aligned} Z^f = (Z^{f_1},\ldots ,Z^{f_N}) \end{aligned}$$

given by (3.7) to a geometric rough path \({{\mathbf {Z}}}^f = (Z^f, {{\mathbb {Z}}}^f)\) on \(\mathbf{R}^N\) such that

$$\begin{aligned} \sup _{s \ne t} |t-s|^{-q\alpha } \mathbf{E}\Bigl (|Z^f_{s,t}|^q + |{{\mathbb {Z}}}^f_{s,t}|^{\frac{q}{2}}\Bigr ) \lesssim \sum _i\Vert f_i\Vert _{U}^{q}, \end{aligned}$$
(3.9)

for every \(q \ge 1\). This is obtained by taking the limit as \(\delta \rightarrow 0\) of the canonical lift of the smooth paths \(Z^{\delta ,f}\) defined as in (3.7) but with B replaced by \(B^\delta \).

Remark 3.9

As usual, “geometric” here means that \({{\mathbf {Z}}}^f\) is the limit of canonical lifts of smooth functions. Indeed, for \(B^\delta \) the convolution of B with a mollifier at scale \(\delta > 0\), \({{\mathbb {Z}}}^{f}\) is given by

$$\begin{aligned} ({{\mathbb {Z}}}^{f}_{s,t})_{ij} =\lim _{\delta \rightarrow 0} \int _s^t \int _s^r f_i(u)\, \dot{B}^\delta (u) du\, f_j(r)\, \dot{B}^\delta (r)dr, \end{aligned}$$

and this limit is independent of the choice of mollifier (and therefore “canonical”).

Proof

We only need to show (3.9) for \(q=2\) since Z and \({{\mathbb {Z}}}\) belong to a Wiener chaos of fixed order. (Recall that the \(f_i\) are considered deterministic here.)

We start with the case \(H \in (\frac{1}{3},\frac{1}{2})\). Let \({\tilde{B}}\) denote an independent copy of the fractional Brownian motion B and let \({\tilde{Z}}^{f,g} = (Z^f, {\tilde{Z}}^g)\) where \({\tilde{Z}}^g\) is defined like \(Z^{g}\) but with B replaced by \({\tilde{B}}\). By Lemma 3.7, for any \(f,g \in \mathcal{C}^\beta \), we can then apply [18, Thm 35] to construct a second-order process \({\tilde{{{\mathbb {Z}}}}}^{f,g}_{s,t}\) satisfying the Chen identity

$$\begin{aligned}{\tilde{{{\mathbb {Z}}}}}^{f,g}_{s,t} - {\tilde{{{\mathbb {Z}}}}}^{f,g}_{s,u} - {\tilde{{{\mathbb {Z}}}}}^{f,g}_{u,t} = Z^f_{s,u} \, {\tilde{Z}}^g_{u,t}. \end{aligned}$$

It furthermore coincides with the Wiener integral

$$\begin{aligned} {\tilde{{{\mathbb {Z}}}}}^{f,g}_{s,t} = \int _s^t Z^f_{s,r}\,d{\tilde{Z}}^g_{s,r} = \int _s^t \int _s^r f(u)\,dB(u)\, g(r)\,d{\tilde{B}}(r), \end{aligned}$$
(3.10)

which makes sense since the Coutin–Qian condition guarantees that \(r \mapsto Z^f_{s,r}\) belongs to the reproducing kernel space of \(Z^g\). It is furthermore such that smooth approximations to (3.10) (replace B and \({\tilde{B}}\) by \(B^\delta \) and \({\tilde{B}}^\delta \), obtained by convolution with a mollifier at scale \(\delta \rightarrow 0\)) converge to it in \(L^2\). In particular, \({\tilde{{{\mathbb {Z}}}}}^{f,g}\) belongs to the second Wiener chaos generated by \((B,{\tilde{B}})\) and is of the form of the limits considered in Proposition 3.4.

We now want to replace \({\tilde{B}}\) by B. For an approximation \(B^\delta \) as mentioned above, setting

$$\begin{aligned} {{\mathbb {Z}}}^{\delta ,f,g}_{s,t} = \int _s^t \int _s^r f(u)\,dB^\delta (u)\, g(r)\,dB^\delta (r), \end{aligned}$$
(3.11)

we have

$$\begin{aligned} \mathbf{E}{{\mathbb {Z}}}^{\delta ,f,g}_{s,t} ={\frac{1}{2}} \int _s^t \int _s^r f(u) g(r) \, \eta _\delta ''(u-r)\,du\,dr, \end{aligned}$$
(3.12)

where \(\eta _\delta \) is an even \(\delta \)-mollification of \(t \mapsto |t|^{2H}\), so that in particular \(\int _0^\infty \eta _\delta ''(s)\,ds=0\), since \(\eta _\delta '(0)=0\). Similarly to (3.8), we can then rewrite (3.12) as

$$\begin{aligned} \mathbf{E}{{\mathbb {Z}}}^{\delta ,f,g}_{s,t}&={\frac{1}{2}} \int _{s}^{t} g(r) \int _{s}^{r} \bigl (f(u) - f(r)\bigr ) \eta _\delta ''(r-u)\,du\,dr \\&\quad -{\frac{1}{2}} \int _{s}^{t} g(r) \int _{-\infty }^{s} f(r) \eta _\delta ''(r-u)\,du\,dr. \end{aligned}$$

It follows immediately that one has

$$\begin{aligned} \lim _{\delta \rightarrow 0} \mathbf{E}{{\mathbb {Z}}}^{\delta ,f,g}_{s,t}&= H(2H-1)\int _{s}^{t} g(r) \int _{s}^{r} \bigl (f(u) - f(r)\bigr ) |r-u|^{2H-2} \,du\,dr \nonumber \\&\qquad + H\int _{s}^{t} g(r)f(r) |r-s|^{2H-1}\,dr, \end{aligned}$$
(3.13)

which is bounded by some multiple of

$$\begin{aligned}\Vert f\Vert _{\mathcal{C}^\beta } \Vert g\Vert _\infty |t-s|^{2H+\beta } + \Vert f\Vert _\infty \Vert g\Vert _\infty |t-s|^{2H}. \end{aligned}$$

Combining this with Proposition 3.4, we conclude that

$$\begin{aligned} {{\mathbb {Z}}}^{f,g}_{s,t} = \lim _{\delta \rightarrow 0} {{\mathbb {Z}}}^{\delta ,f,g}_{s,t}, \end{aligned}$$
(3.14)

exists in probability, is independent of the choice of mollification, and satisfies the bound

$$\begin{aligned} |\mathbf{E}{{\mathbb {Z}}}^{f,g}_{s,t}| \lesssim \Vert f\Vert _{\mathcal{C}^\beta } \Vert g\Vert _\infty |t-s|^{2H+\beta } + \Vert f\Vert _\infty \Vert g\Vert _\infty |t-s|^{2H}. \end{aligned}$$
(3.15)

It now suffices to set \({{\mathbb {Z}}}^{f,ij}_{s,t} = {{\mathbb {Z}}}^{f_i,f_j}_{s,t}\). Both the fact that Chen’s relation holds and the fact that the resulting rough path is geometric follow at once from the fact that these properties hold for the smooth approximations.

Combining (3.15) with the fact that the rough path obtained from [18, Thm 35] satisfies the bound (3.9) with \(\alpha = H\) as a consequence of Lemma 3.7, the claim follows.

We now turn to the case \(H = \frac{1}{2}\) where it is well known that \({{\mathbb {Z}}}^{\delta ,f,g}_{s,t}\) defined in (3.11) converges to the Stratonovich integral \( \int _s^t \int _s^r f(u)\,dB(u)\, g(r)\, \circ dB(r)\), so that

$$\begin{aligned}{{\mathbb {Z}}}^{f,g}_{s,t} = \int _s^t \int _s^r f(u)\,dB(u)\, g(r)\,dB(r) + \frac{1}{2} \int _s^t \langle f(u),g(u)\rangle \,du. \end{aligned}$$

A simple consequence of Hölder’s inequality then leads to the bounds

$$\begin{aligned} \mathbf{E}|Z_{s,t}^f|^2 \lesssim \Vert f\Vert _{L^p}^2 |t-s|^{1-\frac{2}{p}},\qquad \mathbf{E}|{{\mathbb {Z}}}_{s,t}^{f,g}|^2 \lesssim \Vert f\Vert _{L^p}^2\Vert g\Vert _{L^p}^2|t-s|^{2-\frac{4}{p}}, \end{aligned}$$

where \( \Vert f\Vert _{L^p}= \Vert f\Vert _{L^p([0,1])}\). This shows again that the bound (3.9) holds, this time with \(\alpha = \frac{1}{2}-\frac{1}{p}\) (and q arbitrary), and our condition on p guarantees that this is greater than \(\frac{1}{3}\).
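For the reader's convenience, the first of these bounds is simply the Itô isometry followed by Hölder's inequality with exponents \(p/2\) and \(p/(p-2)\):

$$\begin{aligned} \mathbf{E}|Z_{s,t}^f|^2 = \int _s^t |f(u)|^2\,du \le \Big (\int _s^t |f(u)|^p\,du\Big )^{2/p} |t-s|^{1-\frac{2}{p}} \le \Vert f\Vert _{L^p}^2 |t-s|^{1-\frac{2}{p}}, \end{aligned}$$

while the bound on \({{\mathbb {Z}}}^{f,g}\) follows in the same way, using the equivalence of moments on a fixed Wiener chaos for the iterated integral and the same Hölder estimate for the correction term \(\frac{1}{2}\int _s^t \langle f(u),g(u)\rangle \,du\).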

For \(H > \frac{1}{2}\), the first identity in (3.8) above combined with the positivity of the distribution \(\eta ''\) and Hölder's inequality yields the bound

$$\begin{aligned}\mathbf{E}(Z^f_{s,t})^2 \lesssim |t-s|^{1-\frac{2}{p}}\Vert f\Vert _{L^p}^2 \Big | \int _{s-t}^{t-s}\eta ''(v)\,dv\Big | \lesssim \Vert f\Vert _{L^p}^2 |t-s|^{2H-\frac{2}{p}}. \end{aligned}$$

Similarly, again as a consequence of the positivity of \(\eta ''\), we have the bound

$$\begin{aligned}\mathbf{E}({{\mathbb {Z}}}^{f,g}_{s,t})^2 \le \mathbf{E}\big ((Z^{|f|}_{s,t})^2(Z^{|g|}_{s,t})^2\big ) \le 3\mathbf{E}(Z^{|f|}_{s,t})^2\mathbf{E}(Z^{|g|}_{s,t})^2\lesssim \Vert f\Vert _{L^p}^2\Vert g\Vert _{L^p}^2 |t-s|^{4H-\frac{4}{p}}, \end{aligned}$$

which shows again that (3.9) holds. \(\square \)

3.2 Construction and convergence of the rough driver as \({\delta \rightarrow 0}\)

The aim of this section is to construct the rough path \({{\mathbf {Z}}}^\varepsilon \) (this is the content of Proposition 3.11) and to show that this construction enjoys good stability properties. This is done by stitching together the “canonical” rough path lifts for the collection \(\{Z^\varepsilon (x)\}_{x \in \mathbf{R}^d}\) obtained in Proposition 3.8. For \(H>\frac{1}{2}\), these are just iterated Young integrals. For \(H=\frac{1}{2}\), the iterated integrals are considered in the Stratonovich sense and, thanks to the independence of Y and B, the first order process can be interpreted either as an Itô or as a Stratonovich integral.

In order to make use of Proposition 3.8, we use the following lemma, where \(Y^\varepsilon _t\) denotes the Markov process from Sect. 2.1.

Lemma 3.10

Let U be as in Proposition 3.8 and let Assumption 2.2 hold for some \(p_\star \). When \(H < \frac{1}{2}\), we further assume \(E\subset L^{p_\star }\) and \(\beta < H - \frac{1}{p_\star }\), where \(U = \mathcal{C}^\beta \). Then, given \(f \in E\) and setting \({\hat{f}}(t) = f(Y^\varepsilon _t)\), one has for every \(p < p_\star \) the bound

$$\begin{aligned}\mathbf{E}\Vert {\hat{f}}\Vert _U^p \lesssim \Vert f\Vert _E^p, \end{aligned}$$

uniformly over \(f \in E\). (Here we use the convention \(p_\star = \infty \) when \(H>\frac{1}{2}\).)

Proof

For \(H \in (\frac{1}{3},\frac{1}{2})\), the assertion follows immediately from Kolmogorov's continuity test, using Assumption 2.2 and the embedding \(E\subset L^{p_\star }\). For \(H \ge \frac{1}{2}\), it suffices to note that if \(f \in L^p(\mathcal{Y},\mu )\), then for any fixed \(\varepsilon > 0\) the map \({\hat{f}} :t \mapsto f(Y^\varepsilon _t)\) belongs to \(L^p([0,1])\) almost surely and \(\mathbf{E}\Vert {\hat{f}}\Vert _{L^p}^p \le \Vert f\Vert _{L^p}^p\). \(\square \)

We now show how to collect these objects into one “large” Banach space-valued rough path. The process itself will take values in \(\mathcal{B}= \mathcal{C}_b^3(\mathbf{R}^d,\mathbf{R}^d)\), with the second order process taking values in \(\mathcal{B}_2 = \mathcal{C}_b^3(\mathbf{R}^{d}\times \mathbf{R}^{d},(\mathbf{R}^{d})^{\otimes 2})\). Our aim is then to define a \(\mathcal{B}\oplus \mathcal{B}_2\)-valued rough path \((Z^{\varepsilon },{{\mathbb {Z}}}^{\varepsilon })\) which is the canonical lift (in the sense of Proposition 3.8 for any finite collection of x's) of

$$\begin{aligned} (Z^\varepsilon _{s,t})(x) = {\varepsilon }^{\frac{1}{2}-H}\int _s^t F(x,Y^\varepsilon _r) \,dB(r). \end{aligned}$$
(3.16)

Let \(\{f_{i,x}^\varepsilon \}_{i\le d}\) be the collection of maps from \(\mathbf{R}_+\) to \( \mathbf{R}^m\) determined by

$$\begin{aligned}\langle f_{i,x}^\varepsilon (t), e\rangle = (F(x,Y^\varepsilon _t)e)_i,\qquad \forall e \in \mathbf{R}^m. \end{aligned}$$

With this notation at hand and recalling the construction of \(Z^f\) and \({{\mathbb {Z}}}^{f,g}\) as in the proof of Proposition 3.8, it is then natural to look for a \(\mathcal{B}\oplus \mathcal{B}_2\)-valued rough path \((Z^{\varepsilon },{{\mathbb {Z}}}^{\varepsilon })\) such that, for every \(x, {\bar{x}} \in \mathbf{R}^d\), the identities

$$\begin{aligned} \big (Z^\varepsilon _{s,t}(x)\big )_i = \varepsilon ^{\frac{1}{2} -H} Z^{f_{i,x}^\varepsilon }_{s,t},\quad \big ({{\mathbb {Z}}}^\varepsilon _{s,t}(x,{\bar{x}})\bigr )_{ij} {\mathop {=}\limits ^{\tiny {\hbox {def}}}}\varepsilon ^{1 -2H} {{\mathbb {Z}}}^{f_{i,x}^\varepsilon , f_{j,{\bar{x}}}^\varepsilon }_{s,t}, \end{aligned}$$
(3.17)

hold almost surely. (Provided that \(F(x,\cdot ) \in E\) for every x, the right-hand sides make sense by combining Lemma 3.10 with Proposition 3.8.) We claim that this does indeed define a bona fide infinite-dimensional rough path.

Recall also that a rough path \({{\mathbf {Z}}}= (Z,{{\mathbb {Z}}})\) is weakly geometric if the identity

$$\begin{aligned} Z_{s,t} \otimes Z_{s,t} = {{\mathbb {Z}}}_{s,t} + {{\mathbb {Z}}}^\top _{s,t}, \end{aligned}$$
(3.18)

holds, where the transposition map \((\cdot )^\top :\mathcal{B}\otimes \mathcal{B}\rightarrow \mathcal{B}\otimes \mathcal{B}\) swapping the two factors is continuously extended to \(\mathcal{B}_2\). With these notations, we have the following.

Proposition 3.11

Let Assumptions 2.2, 2.3, and 2.7 hold and let \(\alpha \in (\frac{1}{3} ,H - \frac{1}{p_\star })\) if \(H < \frac{1}{2}\) and \(\alpha \in (\frac{1}{3},\frac{1}{2})\) otherwise. Then, for any \(\varepsilon > 0\), there exists a random rough path \({{\mathbf {Z}}}^\varepsilon = (Z^\varepsilon ,{{\mathbb {Z}}}^\varepsilon )\) in \(\mathscr {C}^\alpha ([0,T], \mathcal{B}\oplus \mathcal{B}_2)\) that is weakly geometric and such that, for every x, (3.17) holds almost surely.

Proof

Since the Chen relations and (3.18) are obviously satisfied for smooth approximations as in (3.11), we only need to show that the analytic constraints hold. In other words, for any fixed \(T >0\) and \(\varepsilon > 0\), we look for an almost surely finite random variable \(C_\varepsilon \) such that

$$\begin{aligned}\Vert Z^\varepsilon _{s,t}\Vert _{\mathcal{B}} \le C_\varepsilon |t-s|^\alpha , \qquad \Vert {{\mathbb {Z}}}^\varepsilon _{s,t}\Vert _{\mathcal{B}_2} \le C_\varepsilon |t-s|^{2\alpha }, \end{aligned}$$

holds uniformly over all \(0\le s < t \le T\). By the Kolmogorov criterion for rough paths [15, Thm 3.1], it suffices to show that, for some \(\gamma > 0\) and \(p \ge 1\) such that \(\gamma - \frac{1}{p} > \alpha \), one has the bounds

$$\begin{aligned} \mathbf{E}\Vert Z^\varepsilon _{s,t}\Vert _{\mathcal{B}}^p \le C_{\varepsilon ,p} |t-s|^{p\gamma }, \qquad \mathbf{E}\Vert {{\mathbb {Z}}}^\varepsilon _{s,t}\Vert _{\mathcal{B}_2}^{p/2} \le C_{\varepsilon ,p} |t-s|^{p\gamma }. \end{aligned}$$
(3.19)

By Lemma A.1 below, it suffices to show that

$$\begin{aligned} \sup _{x\in \mathbf{R}^d} (1+|x|)^{\kappa p} \mathbf{E}|D^\ell Z^\varepsilon _{s,t}(x)|^p&\lesssim |t-s|^{p\gamma }, \end{aligned}$$
(3.20a)
$$\begin{aligned} \sup _{x,{\bar{x}}\in \mathbf{R}^d} (1+|x|+|{\bar{x}}|)^{\kappa p \over 2} \mathbf{E}|D_x^k D_{{\bar{x}}}^\ell {{\mathbb {Z}}}^\varepsilon _{s,t}(x,{\bar{x}})|^{p\over 2}&\lesssim |t-s|^{p\gamma }, \end{aligned}$$
(3.20b)

for \(k + \ell \le 4\) and some p such that \(p > (4d/\kappa ) \vee (\gamma -\alpha )^{-1}\).

Since for \(\ell \le 4\) we have \(D_x^\ell F_i^*(x,\cdot ) \in E\) with \(\Vert D_x^\ell F_i^*(x,\cdot )\Vert _E \lesssim (1+|x|)^{-\kappa }\) by Assumption 2.7, it follows immediately from Proposition 3.8 combined with Lemma 3.10 that the bound (3.20a) holds for \(\gamma = H\) and \(p \le p_\star \) when \(H < \frac{1}{2}\) and for any \(\gamma < H\) and \(p \ge 1\) when \(H \ge \frac{1}{2}\). The bound (3.20b) follows in the same way. (These arguments are somewhat formal, but can readily be justified by taking limits of smooth approximations.) \(\square \)

In order to prove Proposition 3.1, we make use of the following variant “in probability” of the usual tightness criterion for convergence in law.

Proposition 3.12

Let \((\mathcal{Z},d)\) be a complete separable metric space and let \(\{L_k\,:\, k \in \mathbf{N}\}\) be a countable collection of continuous maps \(L_k :\mathcal{Z}\rightarrow \mathbf{R}\) that separate elements of \(\mathcal{Z}\) in the sense that, for every \(x,y \in \mathcal{Z}\) with \(x\ne y\), there exists k such that \(L_k(x) \ne L_k(y)\).

Let \(\{Z_n\}_{n \ge 0}\) and \(Z_\infty \) be \(\mathcal{Z}\)-valued random variables such that the collection of their laws is tight and such that \(L_k(Z_n) \rightarrow L_k(Z_\infty )\) in probability for every k. Then, \(Z_n \rightarrow Z_\infty \) in probability.

Proof

Let \({\hat{d}}:\mathcal{Z}^2 \rightarrow \mathbf{R}_+\) be the continuous distance function given by

$$\begin{aligned} {\hat{d}}(x,y) = \sum _{k \ge 0} 2^{-k} \bigl (1\wedge |L_k(x) - L_k(y)|\bigr ), \end{aligned}$$

and note first that our assumption implies that \({\hat{d}}(Z_n, Z_\infty ) \rightarrow 0\) in probability. Given \(\varepsilon > 0\), tightness implies that there exists \(K_\varepsilon \subset \mathcal{Z}\) compact such that \(\mathbf{P}(Z_n \not \in K_\varepsilon ) \le \varepsilon \) for every \(n \in \mathbf{N}\cup \{\infty \}\). Furthermore, the set \(\{(x,y) \in K_\varepsilon \times K_\varepsilon \,:\, d(x,y) \ge \varepsilon \}\) is compact, so that \({\hat{d}}\) attains its infimum \(\delta \) on it. Since \({\hat{d}}\) only vanishes on the diagonal, one has \(\delta > 0\) and, since \({\hat{d}}(Z_n, Z_\infty ) \rightarrow 0\) in probability, we can find \(N>0\) such that \(\mathbf{P}({\hat{d}}(Z_n, Z_\infty ) \ge \delta ) \le \varepsilon \) for every \(n \ge N\).

It follows that, for every \(n \ge N\) one has

$$\begin{aligned} \mathbf{P}(d(Z_n, Z_\infty ) \ge \varepsilon ) \le \mathbf{P}(Z_n \not \in K_\varepsilon ) + \mathbf{P}(Z_\infty \not \in K_\varepsilon ) + \mathbf{P}({\hat{d}}(Z_n, Z_\infty ) \ge \delta ) \le 3\varepsilon , \end{aligned}$$

which implies the claim. \(\square \)

Regarding tightness itself, the following lemma is a slight variation of well known results.

Lemma 3.13

Let \({\hat{\mathcal{B}}} \subset \mathcal{B}\) and \({\hat{\mathcal{B}}}_2 \subset \mathcal{B}_2\) be compact embeddings of Banach spaces such that \({\hat{\mathcal{B}}} \otimes {\hat{\mathcal{B}}} \subset {\hat{\mathcal{B}}}_2\) with \(\Vert v\otimes w\Vert _{{\hat{\mathcal{B}}}_2} \lesssim \Vert v\Vert _{{\hat{\mathcal{B}}}}\Vert w\Vert _{{\hat{\mathcal{B}}}}\). Let \(\mathcal{A}\) be a collection of random \({\hat{\mathcal{B}}} \oplus {\hat{\mathcal{B}}}_2\)-valued rough paths such that for some \(\alpha _0>\frac{1}{3}\) and every \({{\mathbf {Z}}}= (Z,{{\mathbb {Z}}})\in \mathcal{A}\),

$$\begin{aligned} \mathbf{E}\big (\Vert Z_{s,t}\Vert _{{\hat{\mathcal{B}}}}^p + \Vert {{\mathbb {Z}}}_{s,t}\Vert _{{\hat{\mathcal{B}}}_2}^{p/2}\bigr ) \le |t-s|^{p\alpha _0}, \end{aligned}$$
(3.21)

for some \(p > 3/(3\alpha _0-1)\). Then, the laws of the \({{\mathbf {Z}}}\)’s in \(\mathcal{A}\) are tight in \(\mathscr {C}^\alpha ([0,T], \mathcal{B}\oplus \mathcal{B}_2)\) for any \(T > 0\) and \(\alpha \in (\frac{1}{3}, \alpha _0 - \frac{1}{p})\).

Proof

Write \(\mathcal{G}\) for the metric space given by \(\mathcal{B}\oplus \mathcal{B}_2\) endowed with the metric

$$\begin{aligned}d(a\oplus b, {\bar{a}}\oplus {\bar{b}}) = \Vert {\bar{a}}-a\Vert _{\mathcal{B}} \vee \big \Vert {\bar{b}}- b -\textstyle {\frac{1}{2}} ({\bar{a}} + a)\otimes ({\bar{a}}-a)\big \Vert _{\mathcal{B}_2}^{1/2}. \end{aligned}$$

Recall then that \(\mathscr {C}^\alpha ([0,T], \mathcal{B}\oplus \mathcal{B}_2)\) can be identified with the usual space of \(\alpha \)-Hölder functions with values in \(\mathcal{G}\) by identifying \({{\mathbf {Z}}}= (Z,{{\mathbb {Z}}})\) with the function \(t \mapsto {{\mathbf {Z}}}_t {\mathop {=}\limits ^{\tiny {\hbox {def}}}}Z_{0,t} \oplus {{\mathbb {Z}}}_{0,t}\) and noting that, thanks to Chen’s relations,

$$\begin{aligned}d \bigl ({{\mathbf {Z}}}_s,{{\mathbf {Z}}}_t\bigr ) = \Vert Z_{s,t}\Vert _{\mathcal{B}} \vee \Vert {{\mathbb {Z}}}_{s,t}+ \textstyle {\frac{1}{2}} Z_{s,t}\otimes Z_{s,t}\Vert ^{1/2}_{\mathcal{B}_2}. \end{aligned}$$

(See [19, Sec. 7.5] for more details and motivation.) Since d generates the same topology on \(\mathcal{G}\) as that given by the Banach space structure of \(\mathcal{B}\oplus \mathcal{B}_2\), balls of \({\hat{\mathcal{B}}} \oplus {\hat{\mathcal{B}}}_2\) are compact in \(\mathcal{G}\). The claim then follows at once from Kolmogorov’s continuity test, combined with the fact that, given a compact metric space \((\mathcal{X}, d)\) and a compact subset \(\mathcal{K}\) of a Polish space \((\mathcal{Y},{\bar{d}})\), the set \(\mathcal{C}^\beta (\mathcal{X},\mathcal{K})\) is compact in \(\mathcal{C}^\alpha (\mathcal{X},\mathcal{Y})\) for any \(\beta > \alpha \). \(\square \)

Proof of Proposition 3.1

We apply Proposition 3.12 with the metric space \(\mathcal{Z}\) given by \(\mathscr {C}^\alpha ([0,T], \mathcal{B}\oplus \mathcal{B}_2)\), \(Z_n = {{\mathbf {Z}}}^{\varepsilon ,\delta _n}\) for any given sequence \(\delta _n \rightarrow 0\), and \(Z_\infty = {{\mathbf {Z}}}^\varepsilon \) as constructed in Proposition 3.11. The continuous maps \(L_k\) appearing in the statement are given by the collection of maps \((Z,{{\mathbb {Z}}}) \mapsto Z_t(x)\) and \((Z,{{\mathbb {Z}}}) \mapsto {{\mathbb {Z}}}_{s,t}(x,{\bar{x}})\) for a countable dense set of times s and t and elements \(x, {\bar{x}} \in \mathbf{R}^d\).

It follows from (3.20) and Lemma A.1 that the bound (3.21) holds for \({{\mathbf {Z}}}^{\varepsilon ,\delta }\), uniformly in \(\delta \) (but with \(\varepsilon \) fixed), so that the required tightness condition holds by Lemma 3.13. For any fixed \(\varepsilon > 0\), the convergences in probability \(Z^{\varepsilon ,\delta }_{t}(x)_{i} \rightarrow Z^{\varepsilon }_{t}(x)_{i}\) and \({{\mathbb {Z}}}^{\varepsilon ,\delta }_{s,t}(x,{\bar{x}})_{ij} \rightarrow {{\mathbb {Z}}}^{\varepsilon }_{s,t}(x,{\bar{x}})_{ij}\) were shown in Proposition 3.8. (It suffices to apply it with the choices \(f = f^\varepsilon _{i,x}\) and \(g = f^\varepsilon _{j,{\bar{x}}}\).) \(\square \)

3.3 Formulation of the main technical result

The main technical result of this article can then be formulated in the following way.

Theorem 3.14

Let \(H\in (\frac{1}{3},1)\), let Assumptions 2.1–2.3 and 2.7 hold, and let \(\alpha \) and \({{\mathbf {Z}}}^\varepsilon \) be as in Proposition 3.11. Then, as \(\varepsilon \rightarrow 0\), \({{\mathbf {Z}}}^\varepsilon \) converges weakly in the space of \(\alpha \)-Hölder continuous \((\mathcal{B},\mathcal{B}_2)\)-valued rough paths to a limit \({{\mathbf {Z}}}\). Furthermore, there is a Gaussian random field W as in (1.8) such that

$$\begin{aligned} Z_{s,t}(x) = W(x,t) - W(x,s),\qquad {{\mathbb {Z}}}_{s,t}(x,{\bar{x}}) = \mathbb {W}^{\mathrm{It}\hat{\mathrm{o}}}_{s,t}(x,{\bar{x}}) + \Sigma (x,{\bar{x}}) (t-s), \end{aligned}$$

where \(\mathbb {W}^{\mathrm{It}\hat{\mathrm{o}}}_{s,t}(x,{\bar{x}}) = \int _s^t Z_{s,r}(x)\otimes W({\bar{x}},dr)\), interpreted as an Itô integral, and \(\Sigma \) is as in (1.6).

The proof of this result will be given in Sects. 4 and 5 below, see Proposition 5.1 which is just a slight reformulation of Theorem 3.14. We first show in Sect. 4 that the family \(\{ {{\mathbf {Z}}}^\varepsilon \}_{\varepsilon \le 1}\) is tight in a suitable space of rough paths and then identify its limit in Sect. 5.

Corollary 3.15

Under the assumptions of Theorem 3.14, the solution flow of (1.4) converges weakly to that of the Kunita-type SDE (1.7).

Proof

Define the \(\mathcal{B}\)-valued process

$$\begin{aligned} Z^{0, {\varepsilon }}_{s,t}(x) = \int _s^t F_0(x,Y^\varepsilon _r)\,dr. \end{aligned}$$

It follows from Assumption 2.7 that, for \(k \le 4\) and \(p \le p_\star \),

$$\begin{aligned} \Vert D^k Z^{0, {\varepsilon }}_{s,t}(x)\Vert _{L^p} \le \int _s^t \Vert D^k F_0(x,Y^\varepsilon _r)\Vert _{L^p}\,dr \lesssim |t-s| (1+|x|)^{-\kappa }, \end{aligned}$$

uniformly over \(\varepsilon \). This shows that the family \(\{Z^{0, {\varepsilon }}\}_{\varepsilon \le 1}\) is tight in \(\mathcal{C}^\beta ([0,1],\mathcal{B})\) for every \(\beta < 1\).

Furthermore, by the ergodic theorem which holds under Assumption 2.1, for every x,

$$\begin{aligned}\lim _{\varepsilon \rightarrow 0} Z^{0, {\varepsilon }}_{s,t}(x) = (t-s)\int _{\mathcal{Y}} F_0(x,y)\,\mu (dy), \end{aligned}$$

almost surely. Since we can choose \(\beta \) and \(\alpha \) such that \(\alpha + \beta > 1\) and \(2\beta > 1\), it follows that there is no need to control any cross terms between \(Z^{0,{\varepsilon }}\) and either \(Z^{\varepsilon }\) or \(Z^{0,{\varepsilon }}\) itself in order to be able to solve equations driven by both [27, 53]. Furthermore, since the limit of \(Z^{0,{\varepsilon }}\) is deterministic, one deduces joint convergence from Theorem 3.14.

By the continuity theorem for rough differential equations, the solutions of (1.4) written in the form (3.2) converge weakly to those of the rough differential equation

$$\begin{aligned} dX_t=\delta (X_t)\, d{{\mathbf {Z}}}_t+ \delta (X_t)\, d{{\mathbf {Z}}}_t^0. \end{aligned}$$
(3.22)

It remains to identify solutions to this equation with those of (1.7).

This is straightforward and follows as in [15, Sec 5.1] for example. Since the Gubinelli derivative \(x'\) of the solution \(X = (x,x')\) to (3.22) is given by \(\delta (x)\), the integral \(\int _0^t \delta (X_s) d{{\mathbf {Z}}}_s\) is obtained as limit of the compensated Riemann sum

$$\begin{aligned} \sum _{[u,v]\subset {{\mathcal {P}}}}&\bigl (\delta (x_u) Z_{u,v}+ (D\delta \cdot \delta )(x_u)(\mathbb {W}^{\mathrm{It}\hat{\mathrm{o}}}_{u,v} + (v-u) \Sigma )\bigr )\nonumber \\&= \sum _{[u,v]\subset {{\mathcal {P}}}} \bigl (Z_{u,v}(x_u) + (v-u)G(x_u)\bigr ) + \sum _{[u,v]\subset {{\mathcal {P}}}} (D\delta \cdot \delta )(x_u)\mathbb {W}^{\mathrm{It}\hat{\mathrm{o}}}_{u,v}, \end{aligned}$$
(3.23)

where \({{\mathcal {P}}}\) is a partition of [0, t] and \(D\delta \cdot \delta \) is as in (3.3). Since x is continuous and adapted to the filtration generated by W, the first term converges to

$$\begin{aligned} \int _0^t W(x_u,du) + \int _0^t G(x_u)\,du. \end{aligned}$$

The last term on the other hand converges to 0 in probability since it is a discrete martingale and its summands are centred random variables of variance \(\mathcal{O}(|v-u|^2)\). \(\square \)
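To make the last two convergence statements concrete, here is a minimal numerical sketch of the final step, under simplifying assumptions that are ours and not part of the setting above: we replace \(W(x,\cdot )\) by a single standard Brownian motion and the coefficient \((D\delta \cdot \delta )(x_u)\) by the constant 1. Then \(\mathbb {W}^{\mathrm{It}\hat{\mathrm{o}}}_{u,v} = \int _u^v W_{u,r}\,dW_r = \frac{1}{2}\bigl (W_{u,v}^2 - (v-u)\bigr )\), so the martingale sum can be simulated directly from the increments, and its \(L^2\) norm should decay like the square root of the mesh.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_norm_of_sum(n, samples=5000):
    # Summands 0.5*(W_{u,v}^2 - h) are centred and independent with
    # variance h^2/2, so the L^2 norm of the sum is sqrt(n * h^2 / 2).
    h = 1.0 / n
    dW = rng.normal(0.0, np.sqrt(h), size=(samples, n))
    s = 0.5 * (dW**2 - h)            # = int_u^v W_{u,r} dW_r on [u, v]
    return np.sqrt(np.mean(s.sum(axis=1) ** 2))

for n in (10, 100, 1000):
    print(n, l2_norm_of_sum(n), np.sqrt(0.5 / n))  # simulated vs exact
```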

4 Tightness of the Rough Driver as \({\varepsilon \rightarrow 0}\)

The content of this section is the proof of the following tightness result. Let \(\{{{\mathbf {Z}}}^\varepsilon \}_{\varepsilon \le 1}\) be given as in Proposition 3.11.

Proposition 4.1

Let Assumptions 2.1–2.3 and 2.7 hold. For \(H\in (\frac{1}{3}, \frac{1}{2}]\), there exists \( \alpha \in (\frac{1}{3}, H)\) such that the family \(\{{{\mathbf {Z}}}^\varepsilon \}_{\varepsilon \le 1}\) is tight in \(\mathscr {C}^\alpha ([0,T], \mathcal{B}\oplus \mathcal{B}_2)\).

For \(H \in (\frac{1}{2},1)\), if in addition \(\int F(x,y)\,\mu (dy) = 0\) for every x, then the family of rough paths \({{\mathbf {Z}}}^\varepsilon \) is tight in \(\mathcal{C}^\alpha \) for every \(\alpha \in (\frac{1}{3},\frac{1}{2})\).

It will be convenient to introduce the following notation. Given \(f,g \in E\), we use the shorthand

$$\begin{aligned} J^\varepsilon _{s,t}(f) = \varepsilon ^{\frac{1}{2} -H} Z_{s,t}^{f(Y^\varepsilon _\cdot )},\qquad {{\mathbb {J}}}^\varepsilon _{s,t}(f,g) = \varepsilon ^{1 -2H} {{\mathbb {Z}}}_{s,t}^{f(Y^\varepsilon _\cdot ),g(Y^\varepsilon _\cdot )}. \end{aligned}$$

We then have the following tightness criterion.

Lemma 4.2

Let \(p>d+1\). Assume that for any \(f,g\in E\), \(|s-t|\le 1\), and \(\varepsilon \in (0,1]\),

$$\begin{aligned} \Vert J^\varepsilon _{s,t}(f)\Vert _{L^p}\le C \Vert f\Vert _E |t-s|^{\alpha _0}, \qquad \big \Vert {{\mathbb {J}}}^\varepsilon _{s,t}(f,g)\big \Vert _{L^p} \le C \Vert f\Vert _E\Vert g\Vert _E |t-s|^{2\alpha _0},\ \end{aligned}$$

where \(p > 3/(3\alpha _0-1)\). Let furthermore Assumption 2.7 hold with \(\kappa > \frac{8d}{p}\). Then the family \(\{{{\mathbf {Z}}}^\varepsilon \}_{\varepsilon \le 1}\) is tight in \(\mathscr {C}^\alpha ([0,T], \mathcal{B}\oplus \mathcal{B}_2)\) for any \( \alpha < \alpha _0-1/p\).

Proof

Recall that with the above notations, one has from (3.17)

$$\begin{aligned}\bigl (Z^\varepsilon _{s,t}(x)\bigr )_i = J^\varepsilon _{s,t}\big (F_i^*(x,\cdot )\big ),\qquad \bigl ({{\mathbb {Z}}}^\varepsilon _{s,t}(x,{\bar{x}})\bigr )_{ij} = {{\mathbb {J}}}^\varepsilon _{s,t}\big (F_i^*(x,\cdot ),F_j^*({\bar{x}},\cdot )\big ). \end{aligned}$$

By the assumption, for \(|\ell | \le 3\), \(|s-t|\le 1\), and \(|x-x'| \le 1\), one has

$$\begin{aligned} \Vert D^\ell Z^\varepsilon _{s,t}(x)-D^\ell&Z^\varepsilon _{s,t}(x')\Vert _{L^p} \lesssim \sum _i \big \Vert J^\varepsilon _{s,t}\big (D_x^\ell F_i^*(x,\cdot )-D_x^\ell F_i^*(x',\cdot )\big )\big \Vert _{L^p} \\&\lesssim \Vert D_x^\ell F(x,\cdot )-D_x^\ell F(x',\cdot )\Vert _E|t-s|^{\alpha _0}\\&\lesssim \sup _{y \in [x,x']} \Vert D_x^{\ell } F(y,\cdot )\Vert _E |x-x'| |t-s|^{\alpha _0} \\&\lesssim (1+|x|)^{-\kappa } |x-x'||t-s|^{\alpha _0}, \end{aligned}$$

where we wrote \([x,x']\) for the convex hull of \(\{x,x'\}\). Here, the last bound follows from Assumption 2.7. It then follows from Lemma A.1 that, for \({\hat{\mathcal{B}}}\) as defined in the appendix, \(\mathbf{E}\Vert Z^\varepsilon _{s,t}\Vert _{{\hat{\mathcal{B}}}}^p \lesssim |t-s|^{p \alpha _0}\). We choose \(\zeta \) to be any number in \((0, 1-d/p)\).

It similarly follows that for \(|k+\ell | \le 3\) and \(|x-x'|\le 1\)

$$\begin{aligned} \Vert D_x^k D_{{\bar{x}}}^\ell {{\mathbb {Z}}}^\varepsilon _{s,t}&(x,{\bar{x}})-D_x^k D_{{\bar{x}}}^\ell {{\mathbb {Z}}}^\varepsilon _{s,t}(x',{\bar{x}})\Vert _{L^{p/2}} \\&\lesssim \sum _{i,j} \Vert {{\mathbb {J}}}^\varepsilon _{s,t}\big (D_x^k F_i^*(x,\cdot )-D_x^k F_i^*(x',\cdot ), D_x^\ell F_j^*({\bar{x}},\cdot )\bigr )\Vert _{L^{p/2}}\\&\lesssim \sup _{y \in [x,x']} \Vert D_x^{k} F(y,\cdot )\Vert _E\Vert D_x^{\ell } F({\bar{x}},\cdot )\Vert _E |x-x'| |t-s|^{2\alpha _0}\\&\lesssim (1+|x|)^{-\kappa }(1+|{\bar{x}}|)^{-\kappa } |x-x'||t-s|^{2\alpha _0}, \end{aligned}$$

(and analogously when varying \({\bar{x}}\)). Since \(\kappa p /8 > d\) by assumption, we can again apply Lemma A.1 with p/2, thus yielding \(\mathbf{E}\Vert {{\mathbb {Z}}}^\varepsilon _{s,t}\Vert _{{\hat{\mathcal{B}}}_2}^{p/2} \lesssim |t-s|^{p\alpha _0}\). Since furthermore \(p > 3/(3\alpha _0-1)\) by assumption, the conditions of Lemma 3.13 are satisfied and the claim follows for any \(\alpha < \alpha _0 - 1/p\). \(\square \)

Proof of Proposition 4.1

The arguments are quite different for the different ranges of H, but they will always reduce to verifying the assumptions of Lemma 4.2.

First let \(H\in (\frac{1}{3}, \frac{1}{2})\). The first assumption of Lemma 4.2 follows from Proposition 4.5 below with \(\alpha _0=H\) and from the trivial bound

$$\begin{aligned}\varepsilon ^{\frac{1}{2}-\alpha _0} t^{\alpha _0} \vee \sqrt{t} \lesssim t^{\alpha _0},\qquad \forall \varepsilon \le 1,\quad t \le T, \end{aligned}$$

while the second assumption follows from Proposition 4.7 below. Both hold for any \(p\le p_\star /4\) where \(p_\star >\max \{4d, \frac{6}{(3H-1)}\}\), and the proofs of the propositions are the content of Sect. 4.1.

The ingredients for showing tightness of \({{\mathbf {Z}}}^\varepsilon \) for \(H\in (\frac{1}{2}, 1)\) are given in Sect. 4.2, starting with a bound on J analogous to that of Proposition 4.5. Unlike in the proof of that statement though, we do not show this by bounding the conditional variance \(\mathbf{E}\bigl (|J_{s,t}^\varepsilon (f)|^2\,|\,\mathcal{F}^Y\bigr )\). This is because, as a consequence of the lack of integrability at infinity of \(\eta ''\) when \(H > \frac{1}{2}\), it appears difficult to obtain a sufficiently good bound on it, especially for H close to 1. (In particular, the best bounds one can expect to obtain from a quantitative law of large numbers don't appear to be sufficient when \(H > \frac{3}{4}\).) The required bounds are collected in Corollary 4.17, which yields the assumptions of Lemma 4.2 with \(\alpha _0 = \frac{1}{2}\) and arbitrary p.

Finally, when \(H=\frac{1}{2}\) we take \(\alpha _0=\frac{1}{2}\). By the Burkholder–Davis–Gundy inequality, one has \( \Vert J^\varepsilon _{s,t}(f)\Vert _{L^p}\lesssim \big \Vert \big (\int _s^t |f(Y_r^\varepsilon )|^2\,dr\big )^{1/2}\big \Vert _{L^p} \lesssim \Vert f\Vert _E \sqrt{|t-s|}\), and similarly the second order process satisfies the bound

$$\begin{aligned} \big \Vert {{\mathbb {J}}}^\varepsilon _{s,t}(f,g)\big \Vert _{L^p}&= \Big \Vert \int _s^t \int _s^{u} f(Y_v^\varepsilon )g(Y_u^\varepsilon )\, dB_v\, dB_u+\frac{1}{2} \int _s^t f(Y_r^\varepsilon )g(Y_r^\varepsilon )\,dr\Big \Vert _{L^p} \\&\lesssim \Vert f\Vert _E\Vert g\Vert _E |t-s|, \end{aligned}$$

allowing us to apply Lemma 4.2 again and conclude the proof. \(\square \)

4.1 The low regularity case

This section consists of a number of a priori moment bounds, which we then combine at the end to provide the proof of Proposition 4.7. These moment bounds, which are uniform in \(\varepsilon \), follow from the Hölder continuity of Y in a subspace of \(L^{p_\star }\) for a sufficiently large \(p_\star \); in particular, the ergodicity of Y does not play any role here.

We will make repeated use of the following simple calculation, where we recall (3.5) for the definition of the distribution \(\eta \).

Lemma 4.3

Given \(t > 0\) and \(H < \frac{1}{2}\), let \(\Psi :[0,t]^2 \rightarrow \mathbf{R}\) be a continuous function such that for some numbers \(\varepsilon > 0\), \(\beta > -2H\), \(\gamma , \zeta > 1-2H\), and \(C, {\hat{C}}, {\bar{C}} \ge 0\) it holds that

$$\begin{aligned} |\Psi (r,r)| \le C|r|^\beta ,\quad |\Psi (s,r)-\Psi (r,r)| \le {\hat{C}}|r|^\beta \bigl (1 \wedge \textstyle {|s-r|^\gamma \over \varepsilon ^\gamma }\bigr ) + {\bar{C}} \varepsilon ^{\beta -\zeta } |s-r|^\zeta ,\nonumber \\ \end{aligned}$$
(4.1)

for all \(s,r \in [0,t]\). Then, one has the bound

$$\begin{aligned} \Big |\int _0^t \int _0^t \Psi (s,r) \eta ''(r-s)\,ds\,dr\Big | \le K\bigl ( C t^{2H + \beta } + {\hat{C}} t^{\beta +1} \varepsilon ^{2H-1} + {\bar{C}} t^{\zeta +2H} \varepsilon ^{\beta -\zeta }\big ),\nonumber \\ \end{aligned}$$
(4.2)

with the proportionality constant K depending only on \(\beta \), \(\gamma \) and \(\zeta \). The same bound holds if the upper limit of the inner integral in (4.2) is given by r instead of t.

Proof

Let I be the double integral appearing in (4.2). As a consequence of Lemma 3.6, we can rewrite it as

$$\begin{aligned} I&= -2\alpha _H \int _0^t\int _0^t \bigl (\Psi (s,r) - \Psi (r,r)\bigr )|r-s|^{2H-2}\,ds\,dr \\&\qquad + 2H \int _0^t \Psi (r,r) \bigl (|t-r|^{2H-1} + |r|^{2H-1}\bigr )\,dr {\mathop {=}\limits ^{\tiny {\hbox {def}}}}I_1 + I_2. \end{aligned}$$

We then have

$$\begin{aligned} |I_1|&\lesssim {\hat{C}}\int _0^t r^\beta \int _0^t \bigl (1 \wedge (|s-r|/\varepsilon )^\gamma \bigr ) |r-s|^{2H-2}\,ds\,dr \\&\quad + {\bar{C}} \varepsilon ^{\beta -\zeta } \int _0^t \int _0^t |r-s|^{2H+\zeta -2}\,ds\,dr \\&\lesssim {\hat{C}} \varepsilon ^{2H-1} \int _0^t r^\beta \int _{\mathbf{R}} (1\wedge |u|^\gamma ) |u|^{2H-2}\,du \,dr + {\bar{C}} \varepsilon ^{\beta -\zeta } t^{2H+\zeta }\\&\lesssim {\hat{C}} \varepsilon ^{2H-1} t^{\beta +1}+ {\bar{C}} \varepsilon ^{\beta -\zeta } t^{2H+\zeta }. \end{aligned}$$

Here we used the conditions imposed on \(\beta \), \(\gamma \), and \(\zeta \). Regarding \(I_2\), we have the bound

$$\begin{aligned} |I_2|&\lesssim C\int _0^t |r|^\beta \bigl (|t-r|^{2H-1} + |r|^{2H-1}\bigr )\,dr \\&= C t^{2H+\beta } \int _0^1 |r|^\beta \bigl (|1-r|^{2H-1} + |r|^{2H-1}\bigr )\,dr, \end{aligned}$$

and the claim follows. \(\square \)

Note that replacing \(\Psi (r,s)\) by \(\Psi _\tau (r,s) = \tau ^{2H}\Psi (\tau r,\tau s)\) and t by \(t/\tau \), the left-hand side of (4.2) is left unchanged. Regarding the bounds (4.1), such a change leads to the substitutions \(C \mapsto C \tau ^{2H+\beta }\), \(\varepsilon \mapsto \varepsilon /\tau \), \({\hat{C}} \mapsto {\hat{C}} \tau ^{2H+\beta }\), and \({\bar{C}} \mapsto {\bar{C}} \tau ^{2H+\beta }\). All three terms appearing in the right-hand side of (4.2) are invariant under these substitutions.
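To spell out one instance of this invariance, consider the second term on the right-hand side of (4.2): under the above substitutions it becomes

$$\begin{aligned} \bigl ({\hat{C}} \tau ^{2H+\beta }\bigr ) \Bigl (\frac{t}{\tau }\Bigr )^{\beta +1} \Bigl (\frac{\varepsilon }{\tau }\Bigr )^{2H-1} = {\hat{C}}\, t^{\beta +1} \varepsilon ^{2H-1}\, \tau ^{(2H+\beta )-(\beta +1)-(2H-1)} = {\hat{C}}\, t^{\beta +1} \varepsilon ^{2H-1}, \end{aligned}$$

since the exponent of \(\tau \) vanishes; the other two terms are checked in the same way.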

Remark 4.4

The proof of Lemma 4.3 works mutatis mutandis for \(\Psi \) taking values in a Banach space, for example \(L^p\). We also see that if \(\Psi \) is upper bounded by a finite sum of terms of the type (4.1) with different exponents \(\beta \) and \(\gamma \), then the bound (4.2) still holds with the corresponding sum in the right-hand side.

We perform a number of preliminary calculations. For this, it will be notationally convenient to introduce the shortcuts

$$\begin{aligned}I^\varepsilon _{s,t}(f) = \int _s^t f(Y^\varepsilon _r)\,dB_r,\qquad J^\varepsilon _{s,t}(f) = Z^{f(Y^\varepsilon _\cdot )}_{s,t} = \varepsilon ^{\frac{1}{2}-H} I^\varepsilon _{s,t}(f), \end{aligned}$$

for \(f \in E\) (with values in \(\mathbf{R}^m\)).

Proposition 4.5

Let \(H\in (\frac{1}{3}, \frac{1}{2})\) and let Assumptions 2.2 and 2.3 hold for some \(p_\star \ge 2\). Then there exists a constant C such that, uniformly over \(s\ge 0\), \(t\ge 0\), and \(f\in E\),

$$\begin{aligned} \Vert J^\varepsilon _{s,t}(f)\Vert _{L^{p_\star }}\le C \Vert f\Vert _E \bigl (\varepsilon ^{{1\over 2}-H} |t-s|^{H} \vee \sqrt{|t-s|}\bigr ). \end{aligned}$$
(4.3)

Proof

Let \(\mathcal{F}^Y\) denote the \(\sigma \)-algebra generated by all point evaluations of the process Y, and \(\mathcal{F}^Y_t\) the corresponding filtration. Write p for \(p_\star \) for brevity. Since B is independent of \(\mathcal{F}^Y\) and the \(L^p\) norm of an element of a Wiener chaos of fixed degree is controlled by its \(L^2\) norm, we have

$$\begin{aligned} \Vert I^\varepsilon _{s,t}(f)\Vert _{L^p}^2 = \big |\mathbf{E}\big (\mathbf{E}\bigl (I^\varepsilon _{s,t}(f)^p\,|\, \mathcal{F}^Y\bigr )\big )\big |^{2/p} \le c \Vert \mathbf{E}(I^\varepsilon _{s,t}(f)^2\,|\, \mathcal{F}^Y)\Vert _{L^{p/2}}, \end{aligned}$$
(4.4)

for some constant c depending only on p and on the degree of the Wiener chaos, so that

$$\begin{aligned} \Vert I^\varepsilon _{s,t}(f)\Vert _{L^p}^2 \lesssim \Big \Vert \int _s^t\int _s^t f(Y_r^{\varepsilon })f(Y_{r'}^\varepsilon ) \eta ''(r-r')\, dr\, dr' \Big \Vert _{L^{p/2}}. \end{aligned}$$
(4.5)

Since \(f\in E\) is in \(L^{p}\) by Assumption 2.3, it follows from Assumption 2.2 that

$$\begin{aligned}\bigl \Vert f(Y^\varepsilon _{u})f(Y^\varepsilon _{u+v}) - f(Y^\varepsilon _{u})^2\bigr \Vert _{L^{p/2}} \lesssim \Vert f\Vert _E^2 \bigl ({|v/\varepsilon |^H} \wedge 1\bigr ). \end{aligned}$$

We can therefore apply Lemma 4.3 with \(\gamma = H\) and \(\beta = 0\) so that, for \(\Vert f\Vert _E \le 1\), one has

$$\begin{aligned} \Vert I^\varepsilon _{s,t}(f)\Vert _{L^p}^2 \lesssim \varepsilon ^{2H-1}|t-s| + |t-s|^{2H}, \end{aligned}$$
(4.6)

whence the desired bound follows. (The condition \(\gamma > 1-2H\) is satisfied since \(H > \frac{1}{3}\) by assumption.) \(\square \)

We now consider the second-order process \({{\mathbb {J}}}\) given by

$$\begin{aligned} {{\mathbb {J}}}^\varepsilon _{s,t}(f,g) = \varepsilon ^{1-2H} {{\mathbb {Z}}}^{f(Y^\varepsilon _\cdot ), g(Y^\varepsilon _\cdot )}_{s,t} = \varepsilon ^{1-2H}\int _s^t\int _s^v f(Y^\varepsilon _u)\, dB(u)\, g(Y^\varepsilon _v)\,dB(v), \end{aligned}$$
(4.7)

and bound it in a similar way. Recalling that \( \langle f,g\rangle _\mu =\int _\mathcal{Y}\langle f, g\rangle d\mu \), we first obtain a bound on its expectation.

Proposition 4.6

Let \(H\in (\frac{1}{3}, \frac{1}{2})\), let Assumptions 2.2 and 2.3 hold for some \(p_\star \ge 2\), and let \(f ,g\in E\). One has

$$\begin{aligned}\Vert \mathbf{E}\big ({{\mathbb {J}}}^{\varepsilon }_{s,t}(f,g)\,|\, \mathcal{F}^Y\bigr )\Vert _{L^p} \lesssim \Vert f\Vert _E\Vert g\Vert _E \bigl ( {\varepsilon }^{1-2H} |t-s|^{2H} \vee |t-s| \bigr ), \end{aligned}$$

provided that \(2p \le p_\star \).

Proof

It follows from (4.7) that we have the identity

$$\begin{aligned}\mathbf{E}\big ({{\mathbb {J}}}^{\varepsilon }_{s,t}(f,g)\,|\, \mathcal{F}^Y\bigr ) = {\varepsilon ^{1-2H} \over 2} \int _s^t\int _s^v f(Y^\varepsilon _u) g(Y^\varepsilon _v)\,\eta ''(u-v)\, du\, dv, \end{aligned}$$

and we conclude from Lemma 4.3 and the bound \(\bigl \Vert g(Y^\varepsilon _{u})(f(Y^\varepsilon _{u+w}) - f(Y^\varepsilon _{u}))\bigr \Vert _{L^{p}} \lesssim \Vert f\Vert _E\Vert g\Vert _E \bigl ({|w/\varepsilon |^H} \wedge 1\bigr )\) exactly as above. \(\square \)

Proposition 4.7

Let \(H\in (\frac{1}{3}, \frac{1}{2})\), let Assumptions 2.2 and 2.3 hold for some \(p_\star \ge 2\), let \(f,g \in E\), and let \(2p\le p_\star \). Then there exists a constant C such that

$$\begin{aligned}\big \Vert {{\mathbb {J}}}^\varepsilon _{s,t}(f,g)\big \Vert _{L^p} \le C \Vert f\Vert _E\Vert g\Vert _E \bigl ( {\varepsilon }^{1-2H} |t-s|^{2H} \vee |t-s| \bigr ). \end{aligned}$$

If g is a constant, one obtains a stronger upper bound of the form

$$\begin{aligned} \Vert f\Vert _E|g| \big (\varepsilon ^{{1\over 2}-H} |t-s|^{\frac{1}{2} +H} \vee \varepsilon ^{1-2H}|t-s|^{2H}\big ). \end{aligned}$$

Proof

By Proposition 4.6, it suffices to obtain a bound on

$$\begin{aligned}\big \Vert {{\mathbb {J}}}^\varepsilon _{s,t}(f,g) - \mathbf{E}({{\mathbb {J}}}^\varepsilon _{s,t}(f,g)\,|\,\mathcal{F}^Y)\big \Vert _{L^p}. \end{aligned}$$

As a consequence of Proposition 3.4 and (4.4), we have the bound

$$\begin{aligned}\big \Vert {{\mathbb {J}}}^\varepsilon _{s,t}(f,g) - \mathbf{E}({{\mathbb {J}}}^\varepsilon _{s,t}(f,g)\,|\,\mathcal{F}^Y)\big \Vert _{L^p} \le \sqrt{2} \big \Vert {\tilde{{{\mathbb {J}}}}}^\varepsilon _{s,t}(f,g)\Vert _{L^p}, \end{aligned}$$

where we set

$$\begin{aligned}{\tilde{{{\mathbb {J}}}}}^\varepsilon _{s,t}(f,g) = \varepsilon ^{1-2H}\int _s^t\int _s^v f(Y^\varepsilon _u)\, dB(u)\, g(Y^\varepsilon _v)\,d{\tilde{B}}(v), \end{aligned}$$

for a fractional Brownian motion \({\tilde{B}}\) independent of B (and Y). We furthermore restrict ourselves to the case \(s=0\) and \(m=1\) without loss of generality.

At this point we note that for every \(H > \frac{1}{3}\), one has the identity

$$\begin{aligned} \mathbf{E}\Big (\big ({\tilde{{{\mathbb {J}}}}}^\varepsilon _{0,t}(f,g)\big )^2\,\Big |\, \mathcal{F}^Y\Big ) = \frac{1}{2} \int _0^t \!\int _0^t\phi _\varepsilon (s,s') \eta ''(s-s')\,ds\,ds', \end{aligned}$$
(4.8)

where we have set

$$\begin{aligned}\phi _\varepsilon (s,s') = {\varepsilon ^{2-4H}\over 2} g(Y_s^\varepsilon )g(Y_{s'}^\varepsilon ) \int _0^s \int _0^{s'} f(Y_r^\varepsilon ) f(Y_{r'}^\varepsilon )\,\eta ''(r-r')\,dr\,dr'. \end{aligned}$$

As a consequence of (4.4), we deduce from (4.8) the bound

$$\begin{aligned} \Vert {\tilde{{{\mathbb {J}}}}}^\varepsilon _{0,t}(f,g)\Vert _{L^p}^2 \lesssim \Big \Vert \int _0^t \!\int _0^t\phi _\varepsilon (s,s') \eta ''(s-s')\,ds\,ds'\Big \Vert _{L^{p/2}}. \end{aligned}$$
(4.9)

We now bound \(\phi _\varepsilon \) in such a way that Lemma 4.3 (combined with Remark 4.4) applies with \(\Psi = \phi _\varepsilon \). In order to apply this result, we first verify that the first bound in (4.1) holds. Applying Hölder’s inequality we obtain for \(2p \le p_\star \),

$$\begin{aligned} \big \Vert \phi _\varepsilon (s,s) \big \Vert _{L^{p/2}} \le C{\varepsilon }^{2-4H} \Vert g(Y^\varepsilon _s)\Vert _{L^{2p}}^2 \Big \Vert \int _0^s \int _0^{s} f(Y_r^\varepsilon ) f(Y_{r'}^\varepsilon )\,\eta ''(r-r')\,dr\,dr'\Big \Vert _{L^{p}}.\nonumber \\ \end{aligned}$$
(4.10)

Since the last factor is the same expression as the right-hand side of (4.5), it is bounded as in (4.6), thus yielding

$$\begin{aligned}\big \Vert \phi _\varepsilon (s,s) \big \Vert _{L^{p/2}} \lesssim {\varepsilon }^{2-4H} \Vert f\Vert ^2_E \Vert g\Vert ^2_E \bigl (s^{2H}\vee \varepsilon ^{2H-1} s \bigr ). \end{aligned}$$

Regarding the second bound in (4.1), we note that, for \(s'\ge s\) and \(H >\frac{1}{3}\), one has

$$\begin{aligned} \phi _\varepsilon (s,s')&- \phi _\varepsilon (s,s) = \varepsilon ^{2-4H} g(Y_s^\varepsilon )g(Y_{s'}^\varepsilon ) \int _0^s \int _s^{s'} f(Y_r^\varepsilon ) f(Y_{r'}^\varepsilon )\,|r-r'|^{2H-2}\,dr\,dr' \\&+ \varepsilon ^{2-4H} g(Y_s^\varepsilon )\bigl (g(Y_{s'}^\varepsilon )-g(Y_{s}^\varepsilon )\bigr ) \int _0^s \int _0^{s} f(Y_r^\varepsilon ) f(Y_{r'}^\varepsilon )\,\eta ''(r-r')\,dr\,dr'. \end{aligned}$$

Since \(2p\le p_\star \) and \(\int _0^s \int _s^{s'} |r-r'|^{2H-2}\,dr\,dr' \lesssim |s'-s|^{2H}\), the \(L^{p/2}\) norm of the first term is of order \(\varepsilon ^{2-4H} \Vert f\Vert ^2_E\Vert g\Vert ^2_E|s'-s|^{2H}\). By Hölder’s inequality, the second term is bounded similarly to before by

$$\begin{aligned}{\varepsilon }^{2-4H} \Vert g(Y^\varepsilon _s)\Vert _{L^{2p}}\Vert g(Y^\varepsilon _s)-g(Y^\varepsilon _{s'})\Vert _{L^{2p}} \Big \Vert \int _0^s \int _0^{s} f(Y_r^\varepsilon ) f(Y_{r'}^\varepsilon )\,\eta ''(r-r')\,dr\,dr'\Big \Vert _{L^{p}}. \end{aligned}$$

By Assumptions 2.2 and 2.3, the factors involving g are bounded by

$$\begin{aligned} \Vert g\Vert _E^2 \big ({|s'-s|^H\varepsilon ^{-H}}\wedge 1\big ), \end{aligned}$$

while the remaining factor is the same as in (4.10), thus yielding a bound of the order

$$\begin{aligned} \Vert \phi _\varepsilon (s,s') - \phi _\varepsilon (s,s)\Vert _{L^{p/2}}\lesssim {\varepsilon }^{2-4H} \Vert f\Vert ^2_E \Vert g\Vert _E^2\bigl (s^{2H}\vee \varepsilon ^{2H-1} s \bigr )\big ({|s'-s|^H\varepsilon ^{-H}}\wedge 1\big ). \end{aligned}$$

Applying Lemma 4.3 (and Remark 4.4) and inserting the resulting bound into (4.9) eventually yields the bound

$$\begin{aligned} \Vert {\tilde{{{\mathbb {J}}}}}^\varepsilon _{0,t}(f,g)\Vert _{L^p}^2 \lesssim \Vert f\Vert ^2_E\Vert g\Vert ^2_E \bigl ( {\varepsilon }^{2-4H} |t|^{4H} + \varepsilon ^{1-2H} |t|^{1 +2H} + |t|^2\bigr ), \end{aligned}$$
(4.11)

as desired. (Note that the second term is bounded by the maximum of the first and the last one, which is why it was omitted in the statement.)

In case g is a constant, the second term in the expression for \(\phi _\varepsilon (s,s') - \phi _\varepsilon (s,s)\) vanishes identically. Since this is the term responsible for the summand proportional to \(|t|^2\) in (4.11), the claim follows. \(\square \)

4.2 The regular case \({H\in (\frac{1}{2}, 1)}\)

For the case where the slow variables are driven by a fractional Brownian motion of higher regularity, \(H>\frac{1}{2}\), we exploit the ergodicity of the fast motion even when proving tightness of the first order processes.

To prove the tightness of the processes \({{\mathbf {Z}}}_t^{\varepsilon }\), we adopt a different strategy and estimate higher order moments of \(Z^\varepsilon _{s,t}\) and \({{\mathbb {Z}}}^\varepsilon _{s,t}\). This requires us to estimate the expectation of multiple integrals of the form

$$\begin{aligned} \int _0^t \dots \int _0^t \prod _{i=1}^{2p} f_i(Y^\varepsilon _{t_i} )\,dB_{t_1}\cdots dB_{t_{2p}}. \end{aligned}$$
(4.12)

For the second order processes, half of the upper limits of the integrals are given by one of the \(t_i\)'s but, since we will not need to exploit any cancellations, these integrals are controlled by the bounds on the hypercube. For \(p=1\), it is easy to see that this integral is of order \(\varepsilon ^{2H-1}t\), but the case \(p=2\) is already more complicated:

$$\begin{aligned}&\mathbf{E}\int _0^t\dots \int _0^t \prod _{i=1}^4 f_i(Y^\varepsilon _{t_i} )\,dB_{t_1}\cdots dB_{t_{4}} \nonumber \\&\quad =C \int _0^t\dots \int _0^t \mathbf{E}\Big (\prod _{i=1}^4 f_i(Y^\varepsilon _{t_i})\Big ) |t_{2}-t_{1}|^{2H-2}|t_{4}-t_{3}|^{2H-2}\, dt_1\cdots dt_4. \end{aligned}$$
(4.13)

If we look at the regime \(t_1<t_2<t_3<t_4\) say and write \(P^\varepsilon _t = P_{t/\varepsilon }\), the first factor is given by

$$\begin{aligned} \mathbf{E}\Big (\prod _{i=1}^4 f_i(Y^\varepsilon _{t_i})\Big )= \int f_1 P^\varepsilon _{t_2-t_1}\big ( f_2 P^\varepsilon _{t_3-t_2} (f_3 P^\varepsilon _{t_4-t_3} f_4) \big )\,d\mu . \end{aligned}$$

Since \(f_{3,4} = f_3P^\varepsilon _{t_4-t_3} f_4\) is no longer centred, we unfortunately do not have very good bounds on this expression. One can however do better than \(\exp (-c|t_4-t_3|/\varepsilon )\): subtracting and adding the mean of \(f_{3,4}\), we can write the expression as

$$\begin{aligned} \int f_1 P^\varepsilon _{t_2-t_1}\big ( f_2 P^\varepsilon _{t_3-t_2} (f_{3,4} - {\bar{f}}_{3,4}) \big )\,d\mu + {\bar{f}}_{3,4} \,{\bar{f}}_{1,2}, \end{aligned}$$
(4.14)

where now the first term is bounded by \(\exp (-c|t_4 - t_2|/\varepsilon )\) and the second term is bounded by \(\exp (-c|t_4 - t_3|/\varepsilon -c|t_2 - t_1|/\varepsilon )\). This is still not optimal: we note this time that we can recenter \(f_{2,3,4} = f_2 P^\varepsilon _{t_3-t_2} (f_{3,4} - {\bar{f}}_{3,4})\) “for free” since \(f_1\) has mean zero, so the first term is actually of order \(\exp (-c|t_4 - t_1|/\varepsilon )\). It is then not too difficult to see that, the contribution of the second term of (4.14) to the integral (4.13) is of order \(\varepsilon ^{4H-2} t^2\), while the contribution of the first term is \(\varepsilon ^{4H-1} t\), which is of lower order for \(t \ge \varepsilon \). Our aim is to generalise such considerations to arbitrarily high moments.

In particular, the “correct” way of rewriting the factor \(\mathbf{E}\big (\prod _{i=1}^{2p} f_i(Y^\varepsilon _{t_i})\big )\) so that it yields usable bounds is in terms of its cumulants. Given a collection \(\{X_i\}_{i \in I}\) of random variables and a subset \(A \subset I\), we write \(X_A\) as a shorthand for the collection \(\{X_i\}_{i \in A}\) and \(X^A\) as a shorthand for \(\prod _{i \in A} X_i\). Given a finite set A, we write \(\mathcal{P}(A)\) for the set of partitions of A. We also write \(\mathbf{E}_c X_A\) for the joint cumulant, so that one has the identities

$$\begin{aligned} \mathbf{E}X^I = \sum _{\Delta \in \mathcal{P}(I)} \prod _{A \in \Delta } \mathbf{E}_c X_A,\qquad \mathbf{E}_c X_I = \sum _{\Delta \in \mathcal{P}(I)} C_\Delta \prod _{A \in \Delta } \mathbf{E}X^A, \end{aligned}$$
(4.15)

where \(C_\Delta = (|\Delta |-1)! (-1)^{|\Delta |-1}\).
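The identities (4.15) are also easy to implement, which is convenient for checking the combinatorics on small examples. The following Python sketch (ours, purely illustrative) computes a joint cumulant from exact moments via the second identity in (4.15), summing over set partitions; it illustrates both the formula itself and the vanishing of joint cumulants for collections that split into independent parts, the key property used in the proof below.

```python
from itertools import product
from math import prod, factorial

def set_partitions(items):
    """Enumerate all partitions of a list of indices."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [part[i] + [first]] + part[i + 1:]
        yield part + [[first]]

def cumulant(moment, idx):
    """Joint cumulant E_c X_idx via the second identity in (4.15)."""
    return sum((-1) ** (len(p) - 1) * factorial(len(p) - 1)
               * prod(moment(block) for block in p)
               for p in set_partitions(list(idx)))

# Exact moments of (X1, X2, X3) = (A, A, B), with A ~ Ber(0.3) and
# B ~ Ber(0.5) independent, obtained by enumerating the four outcomes.
p, q = 0.3, 0.5
outcomes = [((a, a, b), (p if a else 1 - p) * (q if b else 1 - q))
            for a, b in product((0, 1), repeat=2)]
moment = lambda blk: sum(w * prod(x[i] for i in blk) for x, w in outcomes)

print(cumulant(moment, [0, 1, 2]))  # ~0: {X1, X2} is independent of {X3}
print(cumulant(moment, [0, 0, 0]))  # p(1-p)(1-2p) = 0.084 for A ~ Ber(p)
```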

Proposition 4.8

Let Assumptions 2.1 and 2.3 hold and let \(H>\frac{1}{2}\). For any \(k \ge 2\), there exist constants \(c, C > 0\), depending only on k, such that the following holds. Let \(f_1,\ldots , f_k \in E \) with \(\int _\mathcal{Y}f_i \,d\mu = 0\), let \(0 \le s_1< \cdots < s_k\), and set \(X_i = f_i(Y_{s_i})\). Then, one has the bound

$$\begin{aligned} \big |\mathbf{E}_c X_{[k]}\big | \le C \exp \Big (-c \sum _{i,j\le k} |s_i-s_j|\Big )\prod _{i} \Vert f_i\Vert _E, \end{aligned}$$

where [k] denotes the set \(\{1, \dots , k\}\).

Proof

Note first that since c is allowed to depend on k, it actually suffices to show that \(\mathbf{E}_c X_{[k]} \le C \exp \big (-c \sup _{i < k} |s_{i+1}-s_i|\big )\). From now on we fix \(i_\star \in \{1,\ldots ,k\}\) to be the index which realises that supremum. Let \({\tilde{Y}}\) be an independent copy of Y and set

$$\begin{aligned} {\tilde{X}}_j = \left\{ \begin{array}{cl} f_j(Y_{s_j}) &{} \text {if } j \le i_\star , \\ f_j({\tilde{Y}}_{s_j}) &{} \text {otherwise.} \end{array}\right. \end{aligned}$$

The most important property of the joint cumulant of a collection of random variables is that if it can be broken into two independent sub-collections, then the joint cumulant vanishes. As a consequence, we have

$$\begin{aligned}\mathbf{E}_c X_{[k]} = \mathbf{E}_c X_{[k]} - \mathbf{E}_c {\tilde{X}}_{[k]} = \sum _{\Delta \in \mathcal{P}(I)} C_\Delta \Big (\prod _{A \in \Delta } \mathbf{E}X^A - \prod _{A \in \Delta } \mathbf{E}{\tilde{X}}^A\Big ). \end{aligned}$$

We now put a total order on the elements of a partition \(\Delta \) by postulating that \(A_1 \le A_2\) whenever \(\inf \{a \in A_1\} \le \inf \{a \in A_2\}\) (this is just for definiteness, the actual choice of order is unimportant). We can then write the above as a telescoping sum, yielding

$$\begin{aligned} \mathbf{E}_c X_{[k]} = \sum _{\Delta \in \mathcal{P}(I)} C_\Delta \sum _{A \in \Delta } \Bigl (\mathbf{E}X^A - \mathbf{E}{\tilde{X}}^A\Bigr ) \Big (\prod _{B < A,\,B\in \Delta } \mathbf{E}X^B\Big )\Big (\prod _{B > A,\, B\in \Delta } \mathbf{E}{\tilde{X}}^B\Big ).\nonumber \\ \end{aligned}$$
(4.16)

We fix \(A \subset [k]\) such that \(\mathbf{E}X^A \ne \mathbf{E}{\tilde{X}}^A\) and write \(A = \{a_1,\ldots ,a_\ell \}\) with \(\ell = |A|\) and \(i \mapsto a_i\) increasing. We also write \(j_\star < \ell \) for the index such that \(a_{j_\star } \le i_\star \) and \(a_{j_\star +1} > i_\star \). (This necessarily exists since otherwise \(\mathbf{E}X^A = \mathbf{E}{\tilde{X}}^A\).) For \(i < \ell \) and \(n \ge 1\), we also write \(T_i :E_n \rightarrow E_{n+1}\) for the operator given by

$$\begin{aligned} T_i g = f_{a_i} P_{t_i} g,\qquad t_i {\mathop {=}\limits ^{\tiny {\hbox {def}}}}s_{a_{i+1}}-s_{a_i}, \end{aligned}$$

whose norm, as an operator from \(E_n\) to \(E_{n+1}\), is bounded by a (possibly n-dependent) multiple of \(\Vert f_{a_i}\Vert _E\); when restricted to functions of vanishing mean, it is even of order \(\Vert f_{a_i}\Vert _E \,e^{-c t_i}\) by Assumption 2.1, combined with the continuity of the multiplication map \(E \times E_n \rightarrow E_{n+1}\). It then follows from the Markov property that

$$\begin{aligned} \mathbf{E}X^A = \int _{\mathcal{Y}} T_1\ldots T_{\ell -1} f_{a_\ell } \,d\mu , \end{aligned}$$
(4.17)

(this is easily shown by induction over \(\ell \)) while we similarly have by the definition of \({\tilde{X}}\)

$$\begin{aligned} \mathbf{E}{\tilde{X}}^A = \int _\mathcal{Y}T_1\ldots T_{j_\star -1} f_{a_{j_\star }} \,d\mu \, \int _\mathcal{Y}T_{j_\star +1}\ldots T_{\ell -1} f_{a_\ell } \,d\mu . \end{aligned}$$
(4.18)

This is because, setting \(A_1 = \{a_1,\ldots , a_{j_\star }\}\) and \(A_2 = A {\setminus } A_1\), one has \(\mathbf{E}{\tilde{X}}^A = \mathbf{E}X^{A_1} \, \mathbf{E}X^{A_2}\) by the definition of \({\tilde{X}}\), so that (4.18) follows from (4.17). Writing \(g = T_{j_\star +1}\ldots T_{\ell -1} f_{a_\ell }\in E_{\ell -j_\star }\), it follows that

$$\begin{aligned} \mathbf{E}X^A - \mathbf{E}{\tilde{X}}^A = \int T_1\ldots T_{j_\star } \Bigl (g - \int g \,d\mu \Bigr )\,d\mu . \end{aligned}$$

The spectral gap assumption (2.1) and the definition of \(i_\star \) then imply that

$$\begin{aligned}\big |\mathbf{E}X^A - \mathbf{E}{\tilde{X}}^A\big | \le C \exp (- c |s_{i_\star +1}-s_{i_\star }|) \prod _{a \in A} \Vert f_a\Vert _E. \end{aligned}$$

Combining this with (4.16) immediately leads to the claimed bound on the corresponding cumulant. \(\square \)

The first identity of (4.15), combined with Wick’s formula for the moments of Gaussians, now suggests that we should rewrite the expectation of (4.12) as a sum over terms indexed by pairs \((\Delta ,\pi )\), where \(\Delta \) is a partition of [2p] arising from (4.15) and representing a product of cumulants of the \(f(Y_{t_i})\), and \(\pi \) is a pairing of [2p] arising from Wick’s formula.

Figure 1 for example represents the pairing \(\pi \) and partition \(\Delta \) given by

$$\begin{aligned} \pi&= \{(1,2), (3,4), (5, 6), (7,8), (9, 10)\},\nonumber \\ \Delta&= \{\{1,3\},\{2,4, 5\},\{6, 7, 8\},\{9,10\}\} \end{aligned}$$
(4.19)

Each pair (i, j) of the pairing \(\pi \) yields a factor \(|s_i-s_j|^{2H-2}\), while each element B of the partition yields an exponential factor of the form \(\prod _{i,j \in B}\exp \big (-c |s_i-s_j|\big )\) thanks to Proposition 4.8. Since we consider the case \(H > \frac{1}{2}\), this yields a locally integrable function in the expression for the expectation of (4.12), so our analysis mainly focuses on the large-scale behaviour. We will show that the terms with \(\Delta = \pi \) yield a contribution of order \(\varepsilon ^{(2H-1)p} t^p\), which dominates our bound, while all other terms are of higher order in \(\varepsilon /t\). We now proceed to formalising this.
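To see these orders of magnitude in the simplest case \(p = 1\), note that the only partition of \(\{1,2\}\) without singletons is \(\Delta = \pi = \{\{1,2\}\}\), and that after the substitution \(s_i = \varepsilon u_i\) the corresponding term reads, up to multiplicative constants,

$$\begin{aligned} \int _{[0,t]^2} \mathbf{E}_c\big (f(Y^\varepsilon _{s_1}) f(Y^\varepsilon _{s_2})\big )\,|s_1-s_2|^{2H-2}\,ds_1\,ds_2 = \varepsilon ^{2H}\int _{[0,t/\varepsilon ]^2} \mathbf{E}_c\big (f(Y_{u_1}) f(Y_{u_2})\big )\,|u_1-u_2|^{2H-2}\,du_1\,du_2, \end{aligned}$$

which is of order \(\varepsilon ^{2H}\,(t/\varepsilon ) = \varepsilon ^{2H-1} t\), since by Proposition 4.8 the integrand is integrable in \(u_1-u_2\), uniformly in the other variable.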

Fig. 1: Example of pairing and partition of 10 elements

Let \(\mathcal{G}= (\mathcal{V},\mathcal{E})\) be a graph with vertex set \(\mathcal{V}\) and edge multiset \(\mathcal{E}\) (multiple edges are allowed). Edges \(e \in \mathcal{E}\) are oriented from \(e_-\) to \(e_+\) and we only consider graphs with \(e_+ \ne e_-\). We also label the edges by two exponents \(\alpha _\pm :\mathcal{E}\rightarrow \mathbf{R}_-\). Finally, we assume that we have a “kernel assignment”, i.e. a collection of functions \(K_e:\mathbf{R}\rightarrow \mathbf{R}\) (with \(e \in \mathcal{E}\)) such that

$$\begin{aligned} |K_e(t)| \le C \bigl (|t|^{\alpha _-(e)}\mathbf {1}_{|t| \le 1} + |t|^{\alpha _+(e)}\mathbf {1}_{|t| \ge 1}\bigr ). \end{aligned}$$
(4.20)

We denote by \(\Vert K_e\Vert _e\) the smallest possible constant C appearing in the above expression. Such a graph will be used to encode expressions of the type

$$\begin{aligned} I_\mathcal{G}:= \int _{\mathbf{R}^{\mathcal{V}}} \prod _{e \in \mathcal{E}} K_e (s_{e_+} - s_{e_-}) \, \varphi (s)\, ds. \end{aligned}$$
(4.21)

Each of its nodes u represents an integration variable \(s_u\), each edge e represents a factor \(K_e\), and the resulting expression is integrated against some bounded function \(\varphi \). The exponents \(\alpha _-(e)\) and \(\alpha _+(e)\) indicate the singularity of \(K_e\) at 0 and at infinity respectively.
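In the simplest instance where \(\mathcal{V}= \{1,2\}\) with a single edge e oriented from 1 to 2, (4.21) reduces to \(I_\mathcal{G}= \int _{\mathbf{R}^2} K_e(s_2 - s_1)\,\varphi (s_1,s_2)\,ds_1\,ds_2\), and the two exponents quantify the blow-up of \(K_e\) at the origin and its decay at infinity respectively.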

Definition 4.9

We say that such a labelled graph is “regular” if, for every subset \(\mathcal{V}_0 \subset \mathcal{V}\) with \(|\mathcal{V}_0| \ge 2\), we have

$$\begin{aligned} \sum _{e \in \mathcal{E}\,:\, e_\pm \in \mathcal{V}_0} \alpha _-(e) + |\mathcal{V}_0| > 1. \end{aligned}$$
(4.22)

The significance of this condition (also called Weinberg’s condition) is that it guarantees that the function \(K^\mathcal{G}\) on \(\mathbf{R}^\mathcal{V}\) given by

$$\begin{aligned} K^\mathcal{G}(s) = \prod _{e \in \mathcal{E}} K_{e}(s_{e_+} - s_{e_-}) \end{aligned}$$
(4.23)

is locally integrable [71], where \(s_{e_{\pm }}\) denotes the \(e_{\pm }\) component of \(s\in \mathbf{R}^\mathcal{V}\). (See also [28, Prop. 2.3] for a formulation closer to the one given here.)

We will be mainly interested in the large-scale behaviour here. To describe this, consider a partition \(\mathcal{P}\) of \(\mathcal{V}\). We say that such a partition is tight if there exists \(A \in \mathcal{P}\) such that \(A\cap \mathcal{V}_i \ne \emptyset \) for every connected component \(\mathcal{V}_i\) of \(\mathcal{G}\). Given \(\mathcal{P}\), we then also write \(u \sim v\) if there exists \(A \in \mathcal{P}\) with \(\{u,v\} \subset A\).

Definition 4.10

We then say that a labelled graph as above is “integrable” if

$$\begin{aligned} \sum _{e \in \mathcal{E}\,:\, e_- \not \sim e_+} \alpha _+(e) + |\mathcal{P}| < 1, \end{aligned}$$
(4.24)

for every tight partition \(\mathcal{P}\) consisting of at least two blocks. (Note the similarity with Weinberg’s condition.)
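Both conditions are finite combinatorial checks, so they can be verified mechanically for any given small labelled graph. The following sketch (a brute-force illustration of ours, not used anywhere in the argument; the encoding of graphs as edge tuples is an arbitrary choice) enumerates the subsets appearing in (4.22) and the tight partitions appearing in (4.24):

```python
# A minimal brute-force sketch (illustration only) of the power-counting
# conditions: regularity (4.22) over subsets of at least two vertices, and
# integrability (4.24) over tight partitions with at least two blocks.
# Edges are encoded as tuples (u, v, alpha_minus, alpha_plus) and
# `components` lists the connected components of the graph.
from itertools import combinations

def set_partitions(elems):
    """Enumerate all set partitions of a list of vertices."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):          # put `first` into an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part              # or into a block of its own

def is_regular(vertices, edges):
    # (4.22): alpha_- summed over edges inside V0, plus |V0|, must exceed 1
    for k in range(2, len(vertices) + 1):
        for v0 in combinations(vertices, k):
            s = sum(am for u, v, am, _ in edges if u in v0 and v in v0)
            if s + k <= 1:
                return False
    return True

def is_integrable(vertices, edges, components):
    # (4.24): for every tight partition (one block meets every component),
    # alpha_+ summed over edges joining distinct blocks, plus the number
    # of blocks, must stay below 1
    for part in set_partitions(list(vertices)):
        if len(part) < 2:
            continue
        if not any(all(set(b) & set(c) for c in components) for b in part):
            continue                         # partition is not tight
        block = {v: i for i, b in enumerate(part) for v in b}
        s = sum(ap for u, v, _, ap in edges if block[u] != block[v])
        if s + len(part) >= 1:
            return False
    return True

H = 0.75
power = [(0, 1, 2 * H - 2, 2 * H - 2)]   # single edge, kernel |t|^{2H-2}
expo = [(0, 1, 0.0, -2.0)]               # single edge, exponential kernel
print(is_regular([0, 1], power), is_integrable([0, 1], power, [[0, 1]]))  # True False
print(is_regular([0, 1], expo), is_integrable([0, 1], expo, [[0, 1]]))    # True True
```

For a single edge carrying the kernel \(|t|^{2H-2}\) with \(H > \frac{1}{2}\), the sketch confirms regularity but not integrability, consistent with the fact that the integral of this kernel over \([-L,L]^2\) grows like \(L^{2H}\) rather than L, while an exponentially decaying kernel passes both checks, in line with Lemma 4.15 below.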

The following is then an immediate consequence of [28, Thm 4.3].

Proposition 4.11

Let \(\mathcal{G}\) be a regular and integrable graph with m connected components. Then, one has the bound

$$\begin{aligned}\int _{[-L,L]^{\mathcal{V}}} |K^\mathcal{G}(s)|\,ds \lesssim L^m \prod _{e \in \mathcal{E}} \Vert K_{e}\Vert _{e}, \end{aligned}$$

uniformly over \(L \ge 1\), with a proportionality constant depending only on the labelled graph \(\mathcal{G}\), where \( \Vert K_{e}\Vert _{e}\) is the norm defined after (4.20).

Remark 4.12

Our bound on the large-scale behaviour of the kernels is weaker than the one imposed in [28, Eq. 4.1], since we assume no control on derivatives. The reason why the result still holds is that we assume local integrability, which avoids all renormalisation issues and therefore dispenses with any regularity requirement.

An immediate, but very useful, corollary is the following.

Corollary 4.13

Let \(\mathcal{G}\) be a regular graph with m connected components and let \(L \ge 1\). Let \(\beta :\mathcal{E}\rightarrow \mathbf{R}_+\) be such that

$$\begin{aligned} \sum _{e \in \mathcal{E}\,:\, e_- \not \sim e_+} \bigl (\alpha _+(e) - \beta (e)\bigr ) + |\mathcal{P}| < 1, \end{aligned}$$
(4.25)

for every tight partition \(\mathcal{P}\) consisting of at least two blocks. Then, there exists C depending on \(\mathcal{G}\) and \(\beta \) such that

$$\begin{aligned}\int _{[-L,L]^{\mathcal{V}}} |K^\mathcal{G}(t)|\,dt \le C L^m \prod _{e \in \mathcal{E}} L^{\beta (e)}\Vert K_{e}\Vert _{e}. \end{aligned}$$

Proof

It suffices to note that we can assume that the kernels \(K_e\) vanish outside of \([-2L,2L]\), since this does not affect the value of the integral. If we then consider the graph identical to \(\mathcal{G}\), but with its labels replaced by \(( \alpha _-, \alpha _+ - \beta )\), then (4.25) implies integrability for the new graph by (4.24). The local integrability condition (regularity) still holds, so Proposition 4.11 applies. It remains to note that, since \(\mathbf {1}_{1\le |t| \le L} \le (L/|t|)^{\beta }\,\mathbf {1}_{1\le |t| \le L}\) for every \(\beta \ge 0\), decreasing \(\alpha _+(e)\) by \(\beta (e)\) in (4.20) increases the norm \(\Vert \cdot \Vert _e\) by at most a factor \((2L)^{\beta (e)}\), provided that we only consider functions supported in \([-2L,2L]\).

\(\square \)

We will make use of the following property.

Lemma 4.14

Let \({\tilde{\mathcal{G}}}\) be a graph obtained by deleting some of the edges of \(\mathcal{G}\) but without changing its connected components. If \({\tilde{\mathcal{G}}}\) is integrable, then so is \(\mathcal{G}\) itself.

Proof

This is immediate from Definition 4.10, combined with the fact that the \(\alpha _+(e)\) are negative by assumption. \(\square \)

The following simple result will also be useful.

Lemma 4.15

If \(\alpha _+(e) < -1\) for every edge e of \(\mathcal{G}\), then it is integrable.

Proof

Let \(\mathcal{P}\) be a tight partition of the vertex set \(\mathcal{V}\) of \(\mathcal{G}\) and let \(\mathcal{G}_\mathcal{P}= (\mathcal{V}_\mathcal{P},\mathcal{E}_\mathcal{P})\) denote the graph obtained by removing self-loops from \(\mathcal{G}/{\sim }\), with \(\sim \) obtained from \(\mathcal{P}\) as in (4.24). Then \(\mathcal{G}_\mathcal{P}\) is connected by the definition of tightness so that \(|\mathcal{E}_\mathcal{P}| \ge |\mathcal{V}_\mathcal{P}|-1\), which translates into \(|\{e\in \mathcal{E}\,:\, e_- \not \sim e_+\}| \ge |\mathcal{P}|-1\). Since \(\alpha _+(e) < -1\) for every edge, the bound (4.24), and therefore the desired claim, then follow at once. \(\square \)

We now use these preliminary results both to bound J and \({{\mathbb {J}}}\) and to determine their limits in the case \(H > \frac{1}{2}\).

Our main technical result is the following bound.

Proposition 4.16

Let Assumptions 2.1 and 2.3 hold for \(H>\frac{1}{2}\) and let \(\kappa \in (0,2-2H)\). For \(f,g \in E\) with \(\int f d\mu =\int g d\mu =0\), set

$$\begin{aligned} C(f,g) = \Gamma (2H-1) \bigl (\langle f,\mathcal{L}^{1-2H} g\rangle _\mu + \langle g,\mathcal{L}^{1-2H} f\rangle _\mu \bigr ). \end{aligned}$$

Then, for every \(p \ge 1\) and \(f \in E^{2p}\) with \(\int f_i \,d\mu = 0\) for every i, setting

$$\begin{aligned} I_{2p} (f) =\int _{L_1}^{M_1}\cdots \int _{L_{2p}}^{M_{2p}} \mathbf{E}\Big (\prod _{j=1}^{2p} f_j(Y_{t_j}) \Big )\Big ( \prod _{k=1}^p |t_{2k}-t_{2k-1}|^{2H-2}\Big ) \,dt_1\cdots dt_{2p}, \end{aligned}$$

there exists a constant \(K>0\) such that

$$\begin{aligned} \Big |I_{2p} (f) - \prod _{k=1}^p |[L_{2k-1},M_{2k-1}]\cap [L_{2k},M_{2k}]| C(f_{2k},f_{2k-1}) \Big | \le K L^{p-\kappa }, \end{aligned}$$
(4.26)

where \(L = \sup _{i} |M_i - L_i| \vee 1\).

Proof

We fix p and write I as a shorthand for \(I_{2p} (f)\). The properties of cumulants show that, setting \(X_i = f_i(Y_{t_i})\) as previously, I is given by

$$\begin{aligned} I&= \sum _{\Delta \in \mathcal{P}([2p])} I_\Delta ,\\ I_\Delta&{\mathop {=}\limits ^{\tiny {\hbox {def}}}}\int _{L_1}^{M_1}\cdots \int _{L_{2p}}^{M_{2p}} \Big (\prod _{A \in \Delta } \mathbf{E}_c X_A \Big )\Big ( \prod _{k=1}^p |t_{2k}-t_{2k-1}|^{2H-2}\Big ) \,dt_1\cdots dt_{2p}. \end{aligned}$$

Note first that since the \(f_i\) are centred, we have \(I_\Delta = 0\) unless \(|A| \ge 2\) for every \(A \in \Delta \). There is furthermore one special partition, namely \(\Delta _\star = \big \{\{2k-1,2k\}\,:\, k\in [p]\big \}\). For the summand generated by this ‘base’ partition we have \(I_{\Delta _\star } = \prod _{k=1}^p I(2k-1,2k)\), where we set

$$\begin{aligned}I(k,\ell ) = \int _{L_{k}}^{M_{k}} \int _{L_{\ell }}^{M_{\ell }} \mathbf{E}\big (f_{k}(Y_s)f_{\ell }(Y_t)\big )\,|t-s|^{2H-2}\,dt\,ds. \end{aligned}$$

We then note that, for \(a < b\) and \(f,g \in E\) centred, it follows from the spectral gap assumption and the fact that \(E, E_1 \subset L^2(\mu )\) (by Assumption 2.3) that

$$\begin{aligned} \Big |\int _a^b \mathbf{E}\big (f(Y_0)g(Y_t)\big )\,|t|^{2H-2}\,dt - C(f,g) \mathbf {1}_{0 \in [a,b]} \Big | \lesssim \Vert f\Vert _E\Vert g\Vert _E e^{-c(|a| \wedge |b|)}, \end{aligned}$$
(4.27)

for some fixed constant c. It follows from (4.27) that

$$\begin{aligned}\big |I(k,\ell ) - |[L_{k},M_{k}]\cap [L_{\ell },M_{\ell }]| C(f_{k},f_{\ell }) \big | \lesssim \int _{L_{k}}^{M_{k}} e^{-c(|L_\ell - s| \wedge |M_\ell - s|)} \,ds \lesssim 1, \end{aligned}$$

and that \(I_{\Delta _\star }\) differs from the desired expression in the statement by an error of at most \(\mathcal{O}(L^{p-1})\), since each of the p factors \(I(2k-1,2k)\) is of order L and differs from its limiting value by \(\mathcal{O}(1)\).

Since \(I = \sum _{\Delta \in \mathcal{P}([2p])} I_\Delta \) and we already obtained (4.26) for I replaced by \(I_{\Delta _\star }\), it remains to show that \(|I_\Delta | \lesssim L^{p-\kappa }\) for every partition \(\Delta \ne \Delta _\star \) with \(\kappa \) as in the statement. Fix such a partition \(\Delta \) from now on and write again \(\sim \) for the equivalence relation induced by \(\Delta \) on [2p]. We then define a graph \(\mathcal{G}_\Delta \) with vertex set \(\mathcal{V}= [2p]\) and edge set \(\mathcal{E}= \mathcal{E}_B \cup \mathcal{E}_\Delta \), where

$$\begin{aligned}\mathcal{E}_B = \{(2k-1,2k)\,:\, k\in [p]\},\qquad \mathcal{E}_\Delta = \{(u,v)\,:\, u\sim v\}. \end{aligned}$$

We furthermore assign kernels to the edges of \(\mathcal{G}_\Delta \) by

$$\begin{aligned} K_e(t) = \left\{ \begin{array}{cl} |t|^{2H-2} &{} \text {for } e \in \mathcal{E}_B, \\ e^{-c |t|} &{} \text {otherwise,} \end{array}\right. \end{aligned}$$
(4.28)

so that Proposition 4.8 yields the bound

$$\begin{aligned}|I_\Delta | \lesssim \int _{[0,L]^{2p}} \big |K^{\mathcal{G}_\Delta }(t)\big |\,dt. \end{aligned}$$

The kernel assignment (4.28) is consistent with the exponents given by

$$\begin{aligned}\alpha _-(e) = \left\{ \begin{array}{cl} 2H-2 &{} \text {for } e \in \mathcal{E}_B, \\ 0 &{} \text {otherwise,} \end{array}\right. \qquad \alpha _+(e) = \left\{ \begin{array}{cl} 2H-2 &{} \text {for } e \in \mathcal{E}_B, \\ -2 &{} \text {otherwise,} \end{array}\right. \end{aligned}$$

whence it immediately follows that \(\mathcal{G}_\Delta \) is regular.

It now remains to find a function \(\beta :\mathcal{E}\rightarrow \mathbf{R}_+\) allowing us to apply Corollary 4.13 to \(\mathcal{G}_\Delta \). For this, we construct a set \(\mathcal{T}\) of edges as follows. Consider the graph \({\hat{\mathcal{G}}}_\Delta \) which has \({\hat{\mathcal{V}}}:= \Delta \) as its vertex set and such that its edge set \({\hat{\mathcal{E}}}\) is given by

$$\begin{aligned}{\hat{\mathcal{E}}} = \{(\pi _\Delta (2k-1),\pi _\Delta (2k))\,:\, \pi _\Delta (2k-1) \ne \pi _\Delta (2k)\}, \end{aligned}$$

where \(\pi _\Delta :[2p] \rightarrow \Delta \) maps an element to the unique element of the partition \(\Delta \) that contains it. In other words, \({\hat{\mathcal{G}}}_\Delta \) is obtained by quotienting \(\mathcal{G}_\Delta \) by the partition \(\Delta \) and then removing self-loops. Let now \(\mathcal{T}\subset \mathcal{E}_B\) be such that \({\hat{\mathcal{T}}} = \pi _\Delta \mathcal{T}\) is a maximal spanning forest for \({\hat{\mathcal{G}}}_\Delta \). In the case of (4.19) for example, one could take \(\mathcal{T}= \{(1,2),(5,6)\}\). With \(\kappa \) as in the statement of the proposition, we now set \(\beta (e) = 1-\kappa \) for \(e \in \mathcal{T}\) and 0 otherwise. The reason why this choice of \(\beta \) satisfies (4.25) for the graph \(\mathcal{G}_\Delta \) is that, by construction, the labelling \(\gamma = \alpha _+ - \beta \) is such that \(\mathcal{G}_\Delta \) contains a spanning forest \({\tilde{\mathcal{T}}}\) consisting of edges e with \(\gamma (e) \le 2H-3+\kappa < -1\). (To build \({\tilde{\mathcal{T}}}\) from \(\mathcal{E}= \mathcal{E}_B \cup \mathcal{E}_\Delta \), we start with \(\mathcal{T}\) and then connect its components using edges in \(\mathcal{E}_\Delta \), which carry \(\gamma (e) = -2\).) It then remains to apply Lemma 4.15 to the subgraph with edge set \({\tilde{\mathcal{T}}}\) and then Lemma 4.14 to deduce that \(\mathcal{G}_\Delta \), with labels \((\alpha _-,\gamma )\), is integrable.

Denote now by m the number of connected components of \({\hat{\mathcal{G}}}_\Delta \) and note that, since every element of \(\Delta \) has size at least 2, \({\hat{\mathcal{G}}}_\Delta \) has at most p vertices. It follows that \(\mathcal{T}\) contains at most \(p-m\) elements, so that Corollary 4.13 yields the bound \(I_\Delta \lesssim L^m L^{(1-\kappa )(p-m)} = L^{p -\kappa (p-m)}\), which is bounded by \(L^{p-\kappa }\) unless \(m=p\). Since the only partition \(\Delta \) yielding \(m=p\) is the complete pairing \(\Delta _\star \), the claim follows at once. \(\square \)
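To see the above construction at work on the example (4.19), note that the quotient graph \({\hat{\mathcal{G}}}_\Delta \) has the four blocks of \(\Delta \) as its vertices; the pairs (1, 2), (3, 4) and (5, 6) of \(\pi \) project to a double edge between \(\{1,3\}\) and \(\{2,4,5\}\) and a single edge between \(\{2,4,5\}\) and \(\{6,7,8\}\), while (7, 8) and (9, 10) become self-loops and are removed. Hence \(m = 2\) (the block \(\{9,10\}\) is isolated), the choice \(\mathcal{T}= \{(1,2),(5,6)\}\) has \(2 \le p - m = 3\) elements, and the argument yields \(I_\Delta \lesssim L^{2 + 2(1-\kappa )} = L^{4-2\kappa }\), which is indeed of order at most \(L^{p-\kappa } = L^{5-\kappa }\).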

Corollary 4.17

Let the assumptions of Proposition 4.16 hold. For \(H \in (\frac{1}{2},1)\) and \(f, g\in E\) with \(\int f \,d\mu = \int g\,d\mu = 0\), one has for every \(p \ge 1\) the bounds

$$\begin{aligned} \Vert J_{s,t}^\varepsilon (f)\Vert _{L^{2p}} \lesssim t^{\frac{1}{2}}, \quad \Vert {{\mathbb {J}}}_{s,t}^\varepsilon (f,g)\Vert _{L^{p}} \lesssim t, \end{aligned}$$

uniformly over \(t \le T\) (for any fixed \(T \ge 1\)) and \(\varepsilon \le 1\).

Proof

For integer \(p \ge 1\), we note that, as a consequence of Wick’s theorem and the fact that Y is independent of B, one has the identity

$$\begin{aligned} \mathbf{E}\bigl (J_{0,t}^\varepsilon (f)\bigr )^{2p} = C_p \varepsilon ^p \int _{[0,t/\varepsilon ]^{2p}} \mathbf{E}\bigl (f(Y_{s_1})\cdots f(Y_{s_{2p}})\bigr )\prod _{k=1}^p \eta ''(s_{2k}-s_{2k-1})\,ds, \end{aligned}$$
(4.29)

where \(C_p = 2^{-p}(2p-1)!!\). For \(t\ge \varepsilon \) we apply Proposition 4.16 with \(L = t/\varepsilon \), so that

$$\begin{aligned} \Big |\mathbf{E}\bigl (J_{0,t}^\varepsilon (f)\bigr )^{2p}\Big |&\lesssim {\varepsilon }^p (t/\varepsilon )^{p-\kappa }+{\varepsilon }^p (t/\varepsilon )^{p}\lesssim \varepsilon ^\kappa t^{p-\kappa } + t^p \lesssim t^p. \end{aligned}$$

For \(t<{\varepsilon }\), (4.29) is bounded by \(C\varepsilon ^p\Vert f\Vert _{L^{2p}}^{2p}|t/\varepsilon |^{2Hp}\lesssim t^p\). One similarly obtains the bound \(\mathbf{E}\bigl ({{\mathbb {J}}}_{0,t}^\varepsilon (f,g)\bigr )^{p} \lesssim t^p\), completing the proof. \(\square \)

5 Identification of the Limit

In this section we complete the proof of Theorem 3.14 by identifying the limit (in law) of \({{\mathbf {Z}}}^\varepsilon \) as \(\varepsilon \rightarrow 0\). The proof proceeds in two steps. First, in Proposition 5.3, we show that the first-order process Z itself converges in law to a limit W with covariance given as in (1.8). In a second step, we then exploit martingale techniques, and in particular [39], to obtain convergence of the second-order process \({{\mathbb {Z}}}\) to the limit described in Theorem 3.14. Recall that, by (3.17),

$$\begin{aligned}\big (Z^\varepsilon _{s,t}(x)\big )_i = \varepsilon ^{\frac{1}{2} -H} Z^{f_{i,x}^\varepsilon }_{s,t},\quad \big ({{\mathbb {Z}}}^\varepsilon _{s,t}(x,{\bar{x}})\bigr )_{ij} {\mathop {=}\limits ^{\tiny {\hbox {def}}}}\varepsilon ^{1 -2H} {{\mathbb {Z}}}^{f_{i,x}^\varepsilon , f_{j,{\bar{x}}}^\varepsilon }_{s,t}. \end{aligned}$$

Proposition 5.1

In the setting of Proposition 4.1, the family of random rough paths \({{\mathbf {Z}}}^\varepsilon \) converges in law, as \(\varepsilon \rightarrow 0\), to the unique (in law) random rough path \({{\mathbf {Z}}}\) such that the following hold. The process Z is a \(\mathcal{B}\)-valued Wiener process with covariance given by

$$\begin{aligned} \mathbf{E}\big (Z_{s,t}(x)\otimes Z_{u,v}({\bar{x}})\big ) = |[s,t] \cap [u,v]| \, \bigl (\Sigma (x,{\bar{x}}) + \Sigma ({\bar{x}},x)^\top \bigr ), \end{aligned}$$
(5.1)

with \(\Sigma \) as defined in (1.6). The “second-order” process \({{\mathbb {Z}}}\) is the \(\mathcal{B}_2\)-valued process such that for any \(x,{\bar{x}}\) in \(\mathbf{R}^d\)

$$\begin{aligned}{{\mathbb {Z}}}_{s,t}(x, {\bar{x}}) = \int _s^t Z_{s,r}(x) \otimes dZ_{s,r}({\bar{x}}) + (t-s) \Sigma (x,{\bar{x}}), \end{aligned}$$

where the integral is interpreted in the Itô sense.

Proof

The convergence in distribution of any finite collection of these processes follows from Proposition 5.11 below. By Proposition 4.1, \((Z^\varepsilon , {{\mathbb {Z}}}^\varepsilon )\) is tight in \(\mathscr {C}^\alpha ([0,T], \mathcal{B}\oplus \mathcal{B}_2)\) for suitable \(\alpha \in (\frac{1}{3}, H)\), so that the weak convergence also holds with respect to the rough path norm on \(\mathscr {C}^\alpha ([0,T], \mathcal{B}\oplus \mathcal{B}_2)\). \(\square \)

5.1 Law of large numbers

We will need the following quantitative version of the law of large numbers. Let \(E \subset E_1\subset E_2\) be Banach spaces of functions \(\mathcal{Y}\rightarrow \mathbf{R}\) containing constants and such that pointwise multiplication from \(E \times E_1\) into \(E_2\) is continuous.

Lemma 5.2

Let \(E\subset L^4\) and \(E_2 \subset L^2\), let the spectral gap condition (2.1) hold for \(n=1,2\), and let \(f, g \in E\). Then, the bound

$$\begin{aligned} \Big \Vert \int _0^T f(Y_s) g(Y_{s+t})\,ds - T \langle f, P_t g\rangle _\mu \Big \Vert _{L^2} \lesssim \sqrt{(1+t)T}\Vert f\Vert _E \Vert g\Vert _E, \end{aligned}$$
(5.2)

holds uniformly over \(t, T \in \mathbf{R}_+\) with a proportionality constant depending only on the constants appearing in the two assumptions.

Proof

Writing \(f_s\) as a shorthand for \(f(Y_s)\) and similarly for g, the square of the left-hand side of (5.2) is given by

$$\begin{aligned}2\int _0^T\int _0^r\big ( \mathbf{E}\bigl (f_sg_{s+t}f_rg_{r+t}\bigr ) - \mathbf{E}\bigl (f_sg_{s+t}\bigr )\mathbf{E}\bigl (f_rg_{r+t}\bigr )\big )\,ds\,dr. \end{aligned}$$

Since \(E\subset L^4\), Hölder’s inequality shows that the integrand is bounded by some multiple of \(\Vert f\Vert _E^2 \Vert g\Vert _E^2\). By the triangle inequality, the required bound therefore holds for \(t \ge T\), so we assume \(t \le T\) from now on. Using the same bound on the integrand, we can further restrict the inner integral to impose \(s + t \le r\), at an additive cost of order at most \(tT\Vert f\Vert _E^2 \Vert g\Vert _E^2\). On that smaller domain, we can then rewrite the integrand as

$$\begin{aligned}\mathbf{E}\big (f_s g_{s+t} \big (P_{r-s-t} (f P_t g) - \langle f,P_t g\rangle _\mu \big )(Y_{s+t})\big ). \end{aligned}$$

By the spectral gap assumption applied to \(fP_tg\in E_2\), we have the bound

$$\begin{aligned}\Vert P_{r-s-t} (f P_t g) - \langle f,P_t g\rangle _\mu \Vert _{E_2} \lesssim e^{-c(r-s-t)} \Vert f P_t g\Vert _{E_2} \lesssim e^{-c(r-s-t)} \Vert f\Vert _E \Vert g\Vert _{E_1}. \end{aligned}$$

Combining this again with Hölder’s inequality, \(E\subset L^4\), and \(E_2 \subset L^2\), we conclude that the integrand is of order \(e^{-c(r-s-t)} \Vert f\Vert _E^2 \Vert g\Vert _E^2\), thus yielding a contribution to the integral of order \(T\Vert f\Vert _E^2 \Vert g\Vert _E^2\) as desired. \(\square \)
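The following purely illustrative numerical sketch (ours, not used anywhere in the argument) probes the \(\sqrt{(1+t)T}\) error scale of (5.2) in the simplest setting of an Ornstein–Uhlenbeck process, for which \(\langle f, P_t g\rangle _\mu \) is explicit; the discretisation and parameter values are arbitrary choices:

```python
# Monte Carlo check (illustration only) of the sqrt(T) error scale in
# Lemma 5.2 for the OU process dY = -Y ds + sqrt(2) dW, whose invariant
# measure is mu = N(0,1). With f(y) = g(y) = y one has <f, P_t g>_mu = e^{-t}.
import numpy as np

rng = np.random.default_rng(1)
dt, t_lag = 0.01, 1.0
lag = int(t_lag / dt)

def ou_path(n_steps):
    """Euler-Maruyama discretisation, started from the invariant measure."""
    y = np.empty(n_steps)
    y[0] = rng.standard_normal()
    noise = np.sqrt(2 * dt) * rng.standard_normal(n_steps - 1)
    for i in range(1, n_steps):
        y[i] = y[i - 1] * (1 - dt) + noise[i - 1]
    return y

for T in (100.0, 400.0, 1600.0):
    n = int(T / dt)
    errors = []
    for _ in range(20):
        y = ou_path(n + lag)
        integral = dt * np.sum(y[:n] * y[lag:lag + n])  # int_0^T Y_s Y_{s+t} ds
        errors.append(integral - T * np.exp(-t_lag))
    # the standard deviation of the error should remain O(sqrt(T)):
    print(f"T = {T:6.0f}   std/sqrt(T) = {np.std(errors) / np.sqrt(T):.3f}")
```

The printed ratios stay of order one as T grows, as predicted by (5.2).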

5.2 Identification of the first-order process

We treat separately the cases \(H < \frac{1}{2}\) and \(H>\frac{1}{2}\), while the case \(H = \frac{1}{2}\) is straightforward and will be considered when we put both cases together in Proposition 5.11 below.

5.2.1 The low regularity case

Let \(H\in (\frac{1}{3}, \frac{1}{2})\). Conditional on Y, the process \(J_{s,t}^{\varepsilon }(f)=\varepsilon ^{\frac{1}{2} -H}\int _s^t f(Y_r^{\varepsilon })\,dB_r\) is centred Gaussian. In order to identify its limiting distribution, it thus suffices to show that its conditional covariances converge to a limit that is independent of Y. This is the content of the following result.

Proposition 5.3

Let Assumptions 2.1 and 2.3 hold for \(H\in (\frac{1}{3}, \frac{1}{2})\), and let Assumption 2.2 hold for some \(p_\star \ge 4\). Let \(f, g \in E\) and let \(u<v\) and \(s < t\). Then, we have

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}\mathbf{E}\bigl ( J^\varepsilon _{s,t}(f)J^\varepsilon _{u,v}(g)\,|\,\mathcal{F}^Y \bigr ) = |[s,t] \cap [u,v]| \,C(f,g), \end{aligned}$$

in \(L^2\), where \(C(f,g) =\frac{1}{2} \Gamma (2H+1)\bigl (\langle f, \mathcal{L}^{1-2H} g\rangle _\mu + \langle \mathcal{L}^{1-2H} f, g\rangle _\mu \bigr )\).

Proof

We work in components; it is therefore again sufficient to assume that \(m=1\). We first consider the case \([u,v] = [s,t]\). A straightforward calculation similar to the one given in (3.13) shows that, setting \(\alpha _H = H(1-2H)\), one has the identity

$$\begin{aligned} \mathbf{E}&\bigl (J^\varepsilon _{s,t}(f) J^\varepsilon _{s,t}(g) \,|\, \mathcal{F}^{Y} \bigr )\\&=\frac{-\alpha _H}{2} \varepsilon ^{1-2H} \int _{s-t}^{t-s} |v|^{2H-2} \int _{2s+|v|}^{2t-|v|} \bigl (f(Y^\varepsilon _{u+v\over 2})g(Y^\varepsilon _{u-v \over 2}) - f(Y^\varepsilon _{\frac{u}{2}})g(Y^\varepsilon _{\frac{u}{2}})\bigr )\,du\,dv \\&\qquad +H\varepsilon ^{1-2H} \int _0^{t-s}u^{2H-1} \bigl ((fg)(Y^\varepsilon _{(2t-u)/2}) + (fg)(Y^\varepsilon _{(2s+u)/2})\bigr )\,du. \end{aligned}$$
(5.3)

Provided that \(2p \le p_\star \), the \(L^p\)-norm of the last term is at most of order \({\Vert f\Vert _E\Vert g\Vert _E|t-s|^{2H}\varepsilon ^{1-2H}}\), so that it does not contribute to the limit considered in the statement. (Note however that in the special case \(H = {1\over 2}\), which we do not consider here, this term is the only surviving one; this is consistent with the fact that the conclusion still holds in that case.)

Regarding the first term, we note that changing the sign of v is the same as swapping f and g. Taking \(s=0\) (by stationarity) and performing a change of variables, it remains to show that, for any fixed \(t > 0\),

$$\begin{aligned} {\varepsilon }\int _{0}^{\frac{t}{{\varepsilon }}}&\int _{v}^{{2t\over \varepsilon }-v} v^{2H-2} \bigl (f(Y_{(u+v)/2})g(Y_{(u-v)/2}) - f(Y_{\frac{u}{2}})g(Y_{\frac{u}{2}})\bigr )\,du \,dv \nonumber \\&\qquad \rightarrow -t \frac{ \Gamma (2H+1)}{ H(1-2H)} \langle f, \mathcal{L}^{1-2H} g\rangle _\mu \end{aligned}$$
(5.4)

in \(L^2\) as \(\varepsilon \rightarrow 0\). We set

$$\begin{aligned} {\hat{I}}_\varepsilon = -\frac{1}{2} {\alpha _H} {\varepsilon }\int _{0}^{\frac{t}{{\varepsilon }}} v^{2H-2} G_v\,dv, \end{aligned}$$

with

$$\begin{aligned} G_{v} = \int _{v}^{{2t\over \varepsilon }-v} \bigl (f(Y_{(u+v)/2})g(Y_{(u-v)/2}) - f(Y_{\frac{u}{2}})g(Y_{\frac{u}{2}})\bigr )\,du. \end{aligned}$$

To show that \( \lim _{\varepsilon \rightarrow 0} {\hat{I}}_\varepsilon = \frac{t}{2} \Gamma (2H+1)\langle f, \mathcal{L}^{1-2H} g\rangle _\mu \) in \(L^2\), we treat the values of v close to the singularity separately from the others, fixing an exponent \(\kappa > 0\) which will eventually be taken sufficiently small. For “small” values of v, we then have the bound

$$\begin{aligned} {\varepsilon }\Big \Vert \int _{0}^{{\varepsilon }^\kappa } v^{2H-2} G_v\,dv\Big \Vert _{L^2}&\lesssim t \Vert f\Vert _E\Vert g\Vert _E \int _{0}^{{\varepsilon }^\kappa }v^{2H-2} \bigl (1 \wedge v^H\bigr )\,dv \\&\lesssim \varepsilon ^{(3H-1)\kappa } t \Vert f\Vert _E\Vert g\Vert _E, \end{aligned}$$

which converges to 0 as desired for any fixed \(\kappa > 0\) since \(H > \frac{1}{3}\).

For the remaining values of v, we apply Lemma 5.2, which yields the bound

$$\begin{aligned} \Big \Vert G_{v} - 2\Big ({t\over \varepsilon }-|v|\Big )\bigl (\langle P_v f,g\rangle _\mu -\langle f,g\rangle _\mu \bigr )\Big \Vert _{L^2} \lesssim \sqrt{\varepsilon ^{-1}(1+|v|)(t-\varepsilon |v|)} \Vert f\Vert _E\Vert g\Vert _E. \end{aligned}$$

For \(\kappa <\frac{1}{ 2(1-2H)}\), we furthermore have the bound

$$\begin{aligned} \varepsilon \int _{\varepsilon ^\kappa }^{\frac{t}{\varepsilon }}&v^{2H-2}\sqrt{\varepsilon ^{-1}(1+|v|)(t-\varepsilon |v|)}\,dv \lesssim \sqrt{\varepsilon t} \int _{\varepsilon ^\kappa }^{\frac{t}{\varepsilon }} \bigl (v^{2H-2} + v^{2H-\frac{3}{2}}\bigr )\,dv \\&\lesssim \sqrt{\varepsilon t} \Bigl (\int _{\varepsilon ^\kappa }^{\infty } v^{2H-2}\,dv + \int _{0}^{\frac{t}{\varepsilon }} v^{2H-\frac{3}{2}}\,dv\Bigr ) \lesssim \varepsilon ^{\frac{1}{2} -\kappa (1-2H)}\sqrt{t} + \varepsilon ^{1-2H} t^{2H}, \end{aligned}$$

which converges to 0 as \(\varepsilon \rightarrow 0\) for every fixed t. We conclude that

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}{\hat{I}}_\varepsilon&= -\alpha _H \lim _{\varepsilon \rightarrow 0} \int _{\varepsilon ^\kappa }^{\frac{t}{\varepsilon }} v^{2H-2} \big ( t- \varepsilon |v|\big )\bigl (\langle P_v f,g\rangle _\mu -\langle f,g\rangle _\mu \bigr )\,dv \\&= -\alpha _H \lim _{\varepsilon \rightarrow 0} \int _{0}^{\infty } v^{2H-2} t \bigl (\langle P_v f,g\rangle _\mu -\langle f,g\rangle _\mu \bigr )\,dv \\&= {t\over 2}\Gamma (2H+1)\langle f, \mathcal{L}^{1-2H} g\rangle _\mu , \end{aligned}$$

holds in \(L^2\), as claimed.

We now consider the case when \([s,t) \cap [u,v) = \emptyset \) and assume without loss of generality that \(t \le u\). Writing \(I^\varepsilon _{s,t}(f) = \int _s^t f(Y^\varepsilon _r)\,dB_r\) for the unrescaled integral, so that \(J^\varepsilon _{s,t}(f) = \varepsilon ^{\frac{1}{2}-H} I^\varepsilon _{s,t}(f)\), we then have

$$\begin{aligned} \big |\mathbf{E}\bigl (I^\varepsilon _{s,t}(f) I^\varepsilon _{u,v}(g) \,|\, \mathcal{F}^{Y} \bigr ) \big |&= \alpha _H\Big |\int _s^t \int _{u}^{v} |r-{\bar{r}}|^{2H-2} f(Y^\varepsilon _r)g(Y^\varepsilon _{{\bar{r}}})\,d{\bar{r}}\,dr\Big |\\&\lesssim \int _s^t \int _{u}^{v} |r-{\bar{r}}|^{2H-2}\,d{\bar{r}}\,dr < \infty . \end{aligned}$$

Since this is multiplied by \(\varepsilon ^{1-2H}\), it follows that \(\mathbf{E}\bigl (J^\varepsilon _{s,t}(f) J^\varepsilon _{u,v}(g) \,|\,\mathcal{F}^Y \bigr ) \rightarrow 0\) in that case. The general case then follows immediately since we have \(J^\varepsilon _{s,t}(f) = J^\varepsilon _{s,u}(f) + J^\varepsilon _{u,t}(f)\) for any \(s\le u\le t\), so that it can be reduced to the two cases we just treated. \(\square \)
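To get a feeling for the limiting covariance, assume for illustration that \(f = g\) is a normalised eigenfunction of the generator, \(\mathcal{L}f = \lambda f\) with \(\lambda > 0\) and \(\Vert f\Vert _{L^2(\mu )} = 1\). Then \(\mathcal{L}^{1-2H} f = \lambda ^{1-2H} f\), so that

$$\begin{aligned} C(f,f) = \frac{1}{2} \Gamma (2H+1)\bigl (\langle f, \lambda ^{1-2H} f\rangle _\mu + \langle \lambda ^{1-2H} f, f\rangle _\mu \bigr ) = \Gamma (2H+1)\,\lambda ^{1-2H} > 0, \end{aligned}$$

and since \(1-2H > 0\) here, faster mixing of the environment (larger \(\lambda \)) increases the limiting variance.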

5.2.2 The high regularity case

Let \(H>\frac{1}{2}\). We first show that the process Z is Gaussian with covariance given by (5.1). For this we recall relations between cumulants and expectations. Fix a finite set A as well as elements \(f_a \in E\) and intervals \([s_a, t_a] \subset \mathbf{R}\) for every \(a \in A\). Given a subset \(B \subset A\), we write \(\mathcal{G}(B)\) for the set of pairs \((\Delta ,p)\) where \(\Delta \in \mathcal{P}(B)\) is a partition of B without singletons and p is a pairing of B (i.e. \(p\in \mathcal{P}(B)\) contains only sets of size two). We also write \([s,t]_B \subset \mathbf{R}^B\) for the domain \(\prod _{a \in B} [s_a,t_a]\). Given \(G = (\Delta , p) \in \mathcal{G}(B)\), we then set

$$\begin{aligned} J_G^\varepsilon = (-\varepsilon ^{\frac{1}{2}-H}\alpha _H)^{|B|}\int _{[s,t]_B} \prod _{B' \in \Delta } \mathbf{E}_c \big (f_a(Y_{r_a}^\varepsilon )\,:\, a\in B'\big ) \prod _{\{a,b\} \in p} |r_a - r_b|^{2H-2}\,dr. \end{aligned}$$

In order to extract a formula for the joint cumulants of the \(J^\varepsilon _{s,t}\)’s, we note that if \(B_1 \cap B_2 = \emptyset \) and \(G_i \in \mathcal{G}(B_i)\), one has

$$\begin{aligned} J_{G_1 \sqcup G_2}^\varepsilon = J_{G_1}^\varepsilon \cdot J_{G_2}^\varepsilon , \end{aligned}$$
(5.5)

where \(G_1 \sqcup G_2 \in \mathcal{G}(B_1 \sqcup B_2)\) denotes the natural concatenation of \(G_1\) and \(G_2\). We furthermore write \(\mathcal{G}_c(B) \subset \mathcal{G}(B)\) for the set of “connected” elements, namely those pairs \(G = (\Delta ,p)\) such that \(G^\vee {\mathop {=}\limits ^{\tiny {\hbox {def}}}}\Delta \vee p = \{B\}\), where we use the usual lattice structure of the set of partitions of B.

Lemma 5.4

Let Assumption 2.3 hold for \(H>\frac{1}{2}\) and let \(f_a\in E\) for every \(a\in A\). Then, for every \(B\subset A\),

$$\begin{aligned}\mathbf{E}_c \big (J^\varepsilon _{s_a,t_a}(f_a)\,:\, a \in B\big ) = \sum _{G \in \mathcal{G}_c(B)} J_G^\varepsilon . \end{aligned}$$

Proof

As a consequence of (5.5), we can write

$$\begin{aligned} \mathbf{E}\prod _{a \in A} J^\varepsilon _{s_a,t_a}(f_a)&= \sum _{G \in \mathcal{G}(A)} J_G^\varepsilon = \sum _{\Delta ' \in \mathcal{P}(A)} \sum _{G \in \mathcal{G}(A)\,:\, G^\vee = \Delta '} J_G^\varepsilon \\&= \sum _{\Delta ' \in \mathcal{P}(A)} \prod _{B \in \Delta '} \sum _{G \in \mathcal{G}_c(B)} J_G^\varepsilon . \end{aligned}$$

Comparing this to the first identity in (4.15), we conclude the proof. \(\square \)

This allows us to conclude:

Lemma 5.5

Let Assumptions 2.1 and 2.3 hold for \(H > \frac{1}{2}\). Let \(f_a\in E\) with \(\int f_a d\mu =0\). Then, for any index set B with more than two elements,

$$\begin{aligned} \mathbf{E}_c \big (\lim _{\varepsilon \rightarrow 0} J^\varepsilon _{s_a,t_a}(f_a)\,:\, a \in B\big ) =0. \end{aligned}$$
(5.6)

Consequently, the limits \(\{J_{s_a,t_a}(f_a)\}_{a\in B}\) are jointly Gaussian with covariance \(|[s_a,t_a] \cap [s_b,t_b]| \,C(f_a,f_b)\).

Proof

Let \(G = (\Delta ,p) \in \mathcal{G}_c(B)\) with |B| even and note that \(J_G^\varepsilon \) can be written as

$$\begin{aligned} J_G^\varepsilon = (-\alpha _H)^{|B|}\varepsilon ^{|B|/2}\int _{[\frac{s}{{\varepsilon }},\frac{t}{{\varepsilon }}]_B} \prod _{{\bar{A}} \in \Delta } \mathbf{E}_c \big (f_a(Y_{r_a})\,:\, a\in {\bar{A}}\big ) \prod _{\{a,b\} \in p} |r_a - r_b|^{2H-2}\,dr. \end{aligned}$$

We then note that if \(|B| > 2\), elements \((\Delta ,p) \in \mathcal{G}_c(B)\) are always such that \(\Delta \ne p\). We can therefore apply Proposition 4.16 with \(2p=|B|\), which shows that

$$\begin{aligned}|J_G^\varepsilon | \lesssim \varepsilon ^{|B|/2}\,\varepsilon ^{\kappa -\frac{|B|}{2}} = \varepsilon ^{\kappa }, \end{aligned}$$

which converges to 0, thus yielding (5.6) as claimed. By Proposition 4.16, for any \(f_i \in E\), \(u<v\) and \(s < t\) we have

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}\Big |\mathbf{E}\bigl ( J^\varepsilon _{s,t}(f_i)J^\varepsilon _{u,v}(f_j)\bigr ) - |[s,t] \cap [u,v]| \,C(f_i,f_j)\Big | = 0. \end{aligned}$$

Since Gaussian processes are characterised by the fact that their joint cumulants of order three or higher all vanish, the last claim follows. \(\square \)

5.3 Convergence of the second-order process

In this section we assume that \(H\in (\frac{1}{3},1)\). If \(H=\frac{1}{2}\), then \(J_t^{\varepsilon }(f)= \int _0^t f(Y_r^{\varepsilon })\, dB_r\) is already a local martingale; otherwise we make a decomposition as follows. Write \({\bar{B}}^t_r = \mathbf{E}\bigl (B_r - B_t \,|\, \mathcal{F}_t^B\bigr )\) for \(r > t\) and write

$$\begin{aligned} J_t^\varepsilon (f) =\varepsilon ^{\frac{1}{2}-H} \int _0^t f(Y_r^{\varepsilon }) dB_r=M_t^f + R_t^f, \end{aligned}$$

where, setting \({\bar{f}} = \int _\mathcal{Y}f(x)\,\mu (dx)\) and \({\tilde{f}} = f - {\bar{f}}\),

$$\begin{aligned} M_t^f&= \varepsilon ^{\frac{1}{2}-H}\int _0^t {\tilde{f}}(Y_r^\varepsilon )\,dB_r + \varepsilon ^{\frac{1}{2}-H}\int _t^\infty \bigl (P_{r-t \over \varepsilon }{\tilde{f}}\bigr )(Y_t^\varepsilon )\,d{\bar{B}}^t_r, \end{aligned}$$
(5.7)
$$\begin{aligned} R_t^f&= \varepsilon ^{\frac{1}{2}-H}{\bar{f}} B_t - \varepsilon ^{\frac{1}{2}-H}\int _t^\infty \bigl (P_{r-t \over \varepsilon }{\tilde{f}}\bigr )(Y_t^\varepsilon )\,d{\bar{B}}^t_r. \end{aligned}$$
(5.8)

The convergence of the integral is guaranteed by the fact that \(|\dot{{\bar{B}}}^t_r|\lesssim {(r-t)^{(H-1)-}}\) and that \(P_t {\tilde{f}} \rightarrow 0\) exponentially fast, thanks to the centering of \({\tilde{f}}\). This also illustrates why we have no need to assume that the coefficient \(F(x,\cdot )\) itself is centred when \(H < \frac{1}{2}\): the first term in \(R_t^f\) obviously converges to 0 in that case. For \(H=\frac{1}{2}\), we set \(M_t^f=J_t^{\varepsilon }(f)\) and \(R_t^f=0\).
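Note that (5.7) and (5.8) do indeed decompose \(J_t^\varepsilon (f)\): the two tail integrals cancel and \({\tilde{f}} + {\bar{f}} = f\), so that

$$\begin{aligned} M_t^f + R_t^f = \varepsilon ^{\frac{1}{2}-H}\int _0^t {\tilde{f}}(Y_r^\varepsilon )\,dB_r + \varepsilon ^{\frac{1}{2}-H}{\bar{f}}\, B_t = \varepsilon ^{\frac{1}{2}-H}\int _0^t f(Y_r^\varepsilon )\,dB_r = J_t^\varepsilon (f). \end{aligned}$$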

Lemma 5.6

Let \(H\in (\frac{1}{3}, 1)\), let \(p>1\) and \(E\subset L^{2p}\), and let (2.1) hold for \(n=1\). For every \(f \in E\) and every \(t \ge 0\), one has \(\lim _{\varepsilon \rightarrow 0} R_t^f = 0\) in \(L^q\) for any \(q\in [1,p)\). Furthermore, \(\bigl \Vert \varepsilon ^{-H}\int _t^\infty \bigl (P_{r-t \over \varepsilon }{\tilde{f}}\bigr )(Y_t^\varepsilon )\,d{\bar{B}}^t_r\bigr \Vert _{L^q} \lesssim \Vert {\tilde{f}}\Vert _E\).

Proof

We only need to prove this for \(H\not = \frac{1}{2}\). The first term in the definition of \(R_t^f\) obviously converges to 0. The scale and shift invariance of fractional Brownian motion shows that, in law, the second term \(\varepsilon ^{\frac{1}{2}-H}\int _t^\infty \bigl (P_{r-t \over \varepsilon }{\tilde{f}}\bigr )(Y_t^\varepsilon )\,d{\bar{B}}^t_r\) equals

$$\begin{aligned}\varepsilon ^{\frac{1}{2}}\int _0^\infty \bigl (P_{r}{\tilde{f}}\bigr )(Y) d{\bar{B}}^0_r, \end{aligned}$$

with Y a random variable with law \(\mu \), independent of \({\bar{B}}^0\). Note now that \(\dot{{\bar{B}}}^0\) is Gaussian and

$$\begin{aligned}\mathbf{E}|\dot{{\bar{B}}}^0_r|^2 \propto |r|^{2H-2}. \end{aligned}$$

Furthermore, since \({\tilde{f}}\in E\), one has \(\Vert P_t{\tilde{f}}\Vert _{L^p}\lesssim \Vert {\tilde{f}}\Vert _E\) and \(\Vert P_t{\tilde{f}}\Vert _{L^1}\rightarrow 0\), so that by interpolation \(\Vert P_t{\tilde{f}}\Vert _{L^q}\rightarrow 0\) for any \(q\in [1,p)\), so that, by Cauchy–Schwarz,

$$\begin{aligned} \Bigl \Vert \int _0^\infty \bigl (P_{r}{\tilde{f}}\bigr )(Y) d{\bar{B}}^0_r\Bigr \Vert _{L^p}&\lesssim \int _0^\infty \Vert \bigl (P_{r}{\tilde{f}}\bigr )(Y)\Vert _{L^{2p}}|r|^{H-1}\,dr \\&\lesssim \Vert {\tilde{f}}\Vert _E \int _0^\infty e^{-cr} |r|^{H-1}\,dr < \infty , \end{aligned}$$

and the claim follows. \(\square \)

Lemma 5.7

Let \(H\in (\frac{1}{3}, 1) {\setminus } \{\frac{1}{2}\}\). Let Assumption 2.1 hold for \(n=1\), let Assumption 2.2 hold for some \(p_\star > 2\), and let Assumption 2.3 hold.

Let \(f\in E\). Then the process \(M_t^f\) defined above is an \(L^p\)-bounded \(\mathcal{F}_t^Y\vee \mathcal{F}_t^B\)-martingale for every \(p<p_\star \), with the convention that \(p_\star =\infty \) for \(H>\frac{1}{2}\).

Proof

For \(T> 0\), we define the \(\mathcal{F}_t^Y\vee \mathcal{F}_t^B\)-martingale \(M_t^{f,T}\) by

$$\begin{aligned}M_t^{f,T} = \varepsilon ^{\frac{1}{2}-H}\mathbf{E}\Bigl (\int _0^T {\tilde{f}}(Y_r^\varepsilon )\,dB_r\,\Big |\, \mathcal{F}_t^Y\vee \mathcal{F}_t^B\Bigr ), \end{aligned}$$

and we note that for \(T > t\) one has

$$\begin{aligned} M_t^{f,T} = \varepsilon ^{\frac{1}{2}-H}\int _0^{t} {\tilde{f}}(Y_r^\varepsilon )\,dB_r + \varepsilon ^{\frac{1}{2}-H}\int _{t}^T \bigl (P_{r-t \over \varepsilon }{\tilde{f}}\bigr )(Y_t^\varepsilon ) d{\bar{B}}^t_r. \end{aligned}$$

Since \(\Vert P_t {\tilde{f}}\Vert _{E_1} \rightarrow 0\) exponentially fast, it follows from Lemma 5.6 that \(M_t^{f} = \lim _{T \rightarrow \infty } M_t^{f,T}\) in \(L^p\), so that \(M_t^{f}\) is a local martingale. Since the first term of (5.7) is bounded in \(L^p\) (by Proposition 4.5 for \(H\in (\frac{1}{3}, \frac{1}{2})\), and for \(H>\frac{1}{2}\) from Corollary 4.17 which applies since \({\tilde{f}}\) is centred), and the second term converges in \(L^p\) by Lemma 5.6, the claim follows. \(\square \)

Remark 5.8

For \(H<\frac{1}{2}\), we only used the integrability condition \(E\subset L^{p_\star }\) in Lemma 5.7; the condition \(E_2\subset L^2\) from Assumption 2.3 is not needed (nor is it needed in Proposition 4.5).

For \(H\not =\frac{1}{2}\), we can then rewrite \({{\mathbb {J}}}\) as

$$\begin{aligned} {{\mathbb {J}}}_{0,t}^\varepsilon (f,g) = \int _0^t J^\varepsilon _s(f)\, dM_s^g + J_t^\varepsilon (f) R_t^g - \int _0^t R_s^g\,dJ_s^\varepsilon (f). \end{aligned}$$
(5.9)

Here, the first integral is an Itô integral, while the integral \(\int _0^t J_s^{\varepsilon }(f)\, dR_s^g\) implicit in the last two terms should be interpreted as the limit, as \(\delta \rightarrow 0\), of the corresponding expression with B replaced by a mollified version \(B^\delta \), to which integration by parts is applied.

Lemma 5.9

Let \(H\in (\frac{1}{3}, \frac{1}{2})\cup (\frac{1}{2}, 1)\). Let Assumptions 2.1–2.3 hold and let \(f, g \in E\). Then, one has

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}\mathbf{E}\Big ( \int _0^t R_s^g\,dJ_s^\varepsilon (f)\;\Big |\; \mathcal{F}^Y \Big ) = -\frac{1}{2} t \Gamma (2H+1)\langle \mathcal{L}^{1-2H}g, f\rangle _\mu , \end{aligned}$$

in probability.

Proof

Let us first write \(R_t^g = J_t^\varepsilon ({\bar{g}}) - {\tilde{R}}_t^g\) where

$$\begin{aligned} {\tilde{R}}_t^g=\varepsilon ^{\frac{1}{2}-H}\int _t^\infty \bigl (P_{r-t \over \varepsilon }{\tilde{g}}\bigr )(Y_t^\varepsilon )\,d{\bar{B}}^t_r, \end{aligned}$$

and note that

$$\begin{aligned} \int _0^t J_s^\varepsilon ({\bar{g}})\,dJ_s^\varepsilon (f) = J_t^\varepsilon ({\bar{g}})\,J_t^\varepsilon (f) - {{\mathbb {J}}}_{0,t}^\varepsilon (f,{\bar{g}}), \end{aligned}$$
(5.10)

for \(H<\frac{1}{2}\) (both sides vanish for \(H>\frac{1}{2}\) since then \({\bar{g}} = 0\)). Since \(J_t^\varepsilon ({\bar{g}}) = \varepsilon ^{\frac{1}{2}-H} {\bar{g}}\, B_t\), we conclude from Proposition 4.5 that the first term on the right-hand side converges to 0 in probability. Since \({\bar{g}}\) is constant, the second part of Proposition 4.7 implies that the second term also converges to 0 in probability for \(H\in (\frac{1}{3}, \frac{1}{2})\), so that it remains to obtain the limit of \(\int _0^t {\tilde{R}}_s^g\,dJ_s^\varepsilon (f)\). For \(H \ne \frac{1}{2}\) we have the identity

$$\begin{aligned} \mathbf{E}\Big ( \int _0^t {\tilde{R}}_s^g\,dJ_s^\varepsilon (f)\, \Big |\, \mathcal{F}^Y \Big )&= \varepsilon ^{1-2H}\mathbf{E}\Bigl ( \int _0^t \int _s^\infty \bigl (P_{r-s\over \varepsilon } {\tilde{g}}\bigr )(Y_s^\varepsilon )\,d{\bar{B}}_r^s\, f(Y_s^\varepsilon )\,dB_s\;\Big |\; \mathcal{F}^Y\Bigr )\\&= {\varepsilon ^{1-2H} \over 2} \int _0^t \int _s^\infty \bigl (P_{r-s\over \varepsilon } {\tilde{g}}\bigr )(Y_s^\varepsilon ) f(Y_s^\varepsilon )\,\eta ''(s-r)\,dr\,ds\\&=H(2H-1)\int _0^t \int _0^\infty \bigl (P_{r} {\tilde{g}} - \mathbf {1}_{H < \frac{1}{2}}{\tilde{g}}\bigr )(Y_s^\varepsilon )\,r^{2H-2}\,dr\, f(Y_s^\varepsilon )\,ds \\&=\frac{1}{2} \Gamma (2H+1) \int _0^t \bigl (f\,\mathcal{L}^{1-2H} {\tilde{g}}\bigr )(Y_s^\varepsilon )\,ds . \end{aligned}$$

We have used the fact that the difference between \(B_r-B_s\) and \({\bar{B}}_r^s\) is independent of \(B_s\). The claim now follows from Birkhoff’s ergodic theorem (or the quantitative version given in Lemma 5.2). \(\square \)
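The last equality in the chain of identities above can be checked on eigenfunctions: writing \(P_r = e^{-r\mathcal{L}}\) and taking \(\mathcal{L}h = \lambda h\) with \(\lambda > 0\) (the general case follows by spectral calculus), one has for \(H > \frac{1}{2}\)

$$\begin{aligned} H(2H-1)\int _0^\infty e^{-\lambda r}\, r^{2H-2}\,dr = H(2H-1)\,\Gamma (2H-1)\,\lambda ^{1-2H} = \frac{1}{2} \Gamma (2H+1)\,\lambda ^{1-2H}, \end{aligned}$$

using \(\Gamma (2H+1) = 2H(2H-1)\Gamma (2H-1)\), while for \(H < \frac{1}{2}\) the compensated integral \(\int _0^\infty (e^{-\lambda r}-1)\, r^{2H-2}\,dr = \Gamma (2H-1)\lambda ^{1-2H}\) yields the same conclusion by analytic continuation.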

Proposition 5.10

Let \(H\in (\frac{1}{3}, \frac{1}{2})\cup (\frac{1}{2}, 1)\), let Assumptions 2.1–2.3 hold, and let \(f, g\in E\). One has

$$\begin{aligned}\lim _{\varepsilon \rightarrow 0} \int _0^t R_s^g\,dJ_s^\varepsilon (f) = - {t\over 2}\Gamma (2H+1) \langle \mathcal{L}^{1-2H}g, f\rangle _\mu , \end{aligned}$$

in probability.

Proof

Since the expectation converges by Lemma 5.9, it remains to show that the variance vanishes as \(\varepsilon \rightarrow 0\). As in the proof of Proposition 4.7, we can fix a realisation of Y and use Proposition 3.4 to reduce ourselves to the case where, conditional on Y, \(R_t^g\) and \(J_t^\varepsilon (f)\) are independent. As a consequence of the proof of Lemma 5.9, \( \int _0^t J_s^\varepsilon ({\bar{g}})\,dJ_s^\varepsilon (f)\rightarrow 0\) in probability, so it suffices to bound the conditional variance of the term with \(R_t^g\) replaced by \({\tilde{R}}_t^g\).

Writing \(A_t = \int _0^t {\tilde{R}}_s^g\,dJ_s^\varepsilon (f)\) (and assuming that \({\tilde{R}}^g\) and \(J^\varepsilon (f)\) are driven by independent fractional Brownian motions), we then have as in (4.8) the identity

$$\begin{aligned}\mathbf{E}\big (A_t^2\,|\, \mathcal{F}^Y\big ) = {1\over 2} \int _0^t \int _0^t \phi _\varepsilon (s,s') \eta ''(s-s')\,ds'\,ds \end{aligned}$$

but this time we have

$$\begin{aligned} \phi _\varepsilon (s,s') = \varepsilon ^{2-4H} f_{s'}^\varepsilon f_s^\varepsilon \int _s^\infty \!\!\!\int _{s'}^\infty \bigl (P_{r'-s'\over \varepsilon } {\tilde{g}}\bigr )_{s'}^\varepsilon \bigl (P_{r-s\over \varepsilon } {\tilde{g}}\bigr )_s^\varepsilon C_{s\wedge s'}(r,r')\, dr'\,dr, \end{aligned}$$
(5.11)

where we use the shorthand \(f_s^\varepsilon = f(Y_s^\varepsilon )\) and where

$$\begin{aligned}C_{s\wedge s'}(r,r') = \mathbf{E}\dot{{\bar{B}}}^s_r\dot{{\bar{B}}}^{s'}_{r'} = \int _{-\infty }^{s\wedge s'} (r-u)^{H-\frac{3}{2}}(r'-u)^{H-\frac{3}{2}}\,du. \end{aligned}$$

Note that this holds for any \(H \in (\frac{1}{3}, \frac{1}{2})\cup (\frac{1}{2}, 1)\) and that, for \(r, r'\ge s\), we have the bound \(|C_s(r,r')| \lesssim |r-s|^{H-1}|r'-s|^{H-1}\), which will be used repeatedly below. As a consequence of this, the \(s \leftrightarrow s'\) symmetry of the integrand in (5.11), and Assumption 2.1, we obtain for \(p_\star \ge 4\) the upper bound

$$\begin{aligned} \Vert \phi _\varepsilon (s,s)\Vert _{L^1}&\lesssim \varepsilon ^{2-4H} \int _s^\infty e^{-c(r-s)/\varepsilon } \int _{s}^r C_{s}(r,r')\, dr'\,dr\nonumber \\&\lesssim \varepsilon ^{2-4H} \int _s^\infty e^{-c(r-s)/\varepsilon } \int _{s}^r |r'-s|^{H-1} |r-s|^{H-1}\, dr'\,dr\nonumber \\&\lesssim \varepsilon ^{2-4H} \int _s^\infty e^{-c(r-s)/\varepsilon } |r-s|^{2H-1} \,dr\nonumber \\&= \varepsilon ^{2-2H} \int _0^\infty e^{-cr} r^{2H-1} \,dr \lesssim \varepsilon ^{2-2H}, \end{aligned}$$
(5.12)

with constant proportional to \(\Vert f\Vert _E^2 \Vert g\Vert _E^2\). When \(H > \frac{1}{2}\), using the local integrability of \(\eta ''\), this is sufficient to conclude that

$$\begin{aligned}\Vert \mathbf{E}\big (A_t^2\,|\, \mathcal{F}^Y\big )\Vert _{L^1} \lesssim \varepsilon ^{2-2H} \Big |\int _0^t \int _0^t \eta ''(s-s')\,ds'\,ds\Big | \lesssim \varepsilon ^{2-2H} t^{2H}, \end{aligned}$$

which does indeed converge to 0 as \(\varepsilon \rightarrow 0\) as desired.

It remains to consider the case \(H < \frac{1}{2}\), so we restrict ourselves to this case from now on. Regarding \(\delta \phi _\varepsilon (s,s') = \phi _\varepsilon (s,s')- \phi _\varepsilon (s,s)\) for \(s' > s\) (the case \(s' < s\) is analogous), we write \(\delta \phi _\varepsilon = \sum _{i=1}^{7} \delta \phi _\varepsilon ^{(i)}\) with

$$\begin{aligned} \delta \phi _\varepsilon ^{(1)}(s,s')&= \varepsilon ^{2-4H} f_{s'}^\varepsilon f_s^\varepsilon \int _s^{s'} \int _{s'}^{\infty } \bigl (P_{r'-s'\over \varepsilon } {\tilde{g}}\bigr )_{s'}^\varepsilon \bigl (P_{r-s\over \varepsilon } {\tilde{g}}\bigr )_s^\varepsilon C_{s}(r,r')\, dr'\,dr, \\ \delta \phi _\varepsilon ^{(2)}(s,s')&= -\varepsilon ^{2-4H} f_{s}^\varepsilon f_s^\varepsilon \int _s^{s'} \int _{s}^{s'} \bigl (P_{r'-s\over \varepsilon } {\tilde{g}}\bigr )_{s}^\varepsilon \bigl (P_{r-s\over \varepsilon } {\tilde{g}}\bigr )_s^\varepsilon C_{s}(r,r')\, dr'\,dr, \\ \delta \phi _\varepsilon ^{(3)}(s,s')&= -\varepsilon ^{2-4H} f_{s}^\varepsilon f_s^\varepsilon \int _s^{s'} \int _{s'}^{\infty } \bigl (P_{r'-s\over \varepsilon } {\tilde{g}}\bigr )_{s}^\varepsilon \bigl (P_{r-s\over \varepsilon } {\tilde{g}}\bigr )_s^\varepsilon C_{s}(r,r')\, dr'\,dr, \\ \delta \phi _\varepsilon ^{(4)}(s,s')&= -\varepsilon ^{2-4H} f_{s}^\varepsilon f_s^\varepsilon \int _{s'}^\infty \int _{s}^{s'} \bigl (P_{r'-s\over \varepsilon } {\tilde{g}}\bigr )_s^\varepsilon \bigl (P_{r-s\over \varepsilon } {\tilde{g}}\bigr )_s^\varepsilon C_{s}(r,r')\, dr'\,dr, \\ \delta \phi _\varepsilon ^{(5)}(s,s')&= \varepsilon ^{2-4H} \bigl (f_{s'}^\varepsilon -f_{s}^\varepsilon \bigr ) f_s^\varepsilon \int _{s'}^\infty \int _{s'}^{\infty } \bigl (P_{r'-s'\over \varepsilon } {\tilde{g}}\bigr )_{s'}^\varepsilon \bigl (P_{r-s\over \varepsilon } {\tilde{g}}\bigr )_s^\varepsilon C_{s}(r,r')\, dr'\,dr, \\ \delta \phi _\varepsilon ^{(6)}(s,s')&= \varepsilon ^{2-4H} f_s^\varepsilon f_s^\varepsilon \int _{s'}^\infty \int _{s'}^{\infty } \bigl (P_{r'-s'\over \varepsilon } {\tilde{g}}-P_{r'-s\over \varepsilon } {\tilde{g}}\bigr )_{s'}^\varepsilon \bigl (P_{r-s\over \varepsilon } {\tilde{g}}\bigr )_s^\varepsilon C_{s}(r,r')\, dr'\,dr, \\ \delta \phi _\varepsilon ^{(7)}(s,s')&= \varepsilon ^{2-4H} f_s^\varepsilon f_s^\varepsilon \int _{s'}^\infty \int _{s'}^{\infty } \big (\bigl (P_{r'-s\over \varepsilon } {\tilde{g}}\bigr )_{s'}^\varepsilon -\bigl (P_{r'-s\over \varepsilon } {\tilde{g}}\bigr )_{s}^\varepsilon \big ) \bigl (P_{r-s\over \varepsilon } {\tilde{g}}\bigr )_s^\varepsilon C_{s}(r,r')\, dr'\,dr. \end{aligned}$$

We obtain the bound

$$\begin{aligned} \Vert \delta \phi _\varepsilon ^{(1)}(s,s')\Vert _{L^1}&\lesssim \varepsilon ^{2-4H}\int _{s'}^{\infty } e^{-c(r'-s')/\varepsilon }|r'-s|^{H-1} \int _s^{s'} |r-s|^{H-1}\, dr\,dr' \\&\lesssim \varepsilon ^{2-4H}|s-s'|^H \int _{s'}^{\infty } e^{-c(r'-s')/\varepsilon } |r'-s'|^{H-1}\, dr' \\&\lesssim \varepsilon ^{2-3H}|s-s'|^H, \end{aligned}$$

and similarly for \(\delta \phi _\varepsilon ^{(3)}\) and \(\delta \phi _\varepsilon ^{(4)}\). Regarding \(\delta \phi _\varepsilon ^{(2)}\), we obtain

$$\begin{aligned} \Vert \delta \phi _\varepsilon ^{(2)}(s,s')\Vert _{L^1} \lesssim \varepsilon ^{2-4H} \Big (\int _s^{s'} e^{-c(r-s)/\varepsilon } |r-s|^{H-1} \,dr\Big )^2 \lesssim \varepsilon ^{2-4H} |s-s'|^{2H}. \end{aligned}$$

In view of (5.12) and Assumption 2.2, we obtain for \(\delta \phi _\varepsilon ^{(5)}\) the bound

$$\begin{aligned} \Vert \delta \phi _\varepsilon ^{(5)}(s,s')\Vert _{L^1} \lesssim \varepsilon ^{2-3H}|s'-s|^H, \end{aligned}$$

using \(\Vert f^{\varepsilon }_{s'}-f^{\varepsilon }_s\Vert _{L^p}\lesssim (|s-s'|/\varepsilon )^H\) to obtain the increment in time.

In order to bound \(\delta \phi _\varepsilon ^{(6)}\), we note that one has the bound

$$\begin{aligned} \Vert (P_t {\tilde{g}}-{\tilde{g}})(Y_s^\varepsilon )\Vert _{L^p}&= \bigl \Vert \mathbf{E}\bigl ({\tilde{g}}(Y_{t+s}^\varepsilon ) - {\tilde{g}}(Y_s^\varepsilon )\,|\,\mathcal{G}_s\bigr )\bigr \Vert _{L^p}\\&\le \bigl \Vert {\tilde{g}}(Y_{t+s}^\varepsilon ) - {\tilde{g}}(Y_s^\varepsilon )\bigr \Vert _{L^p} \lesssim \Vert {\tilde{g}}\Vert _E \bigl ( 1\wedge t^H \varepsilon ^{-H} \bigr ). \end{aligned}$$

As a consequence of Assumption 2.1, we thus obtain the bound

$$\begin{aligned}\Vert \delta \phi _\varepsilon ^{(6)}(s,s')\Vert _{L^1} \lesssim \varepsilon ^{2-3H}|s'-s|^H , \end{aligned}$$

and similarly for \(\delta \phi _\varepsilon ^{(7)}\). Collecting all of these bounds, we conclude that

$$\begin{aligned}\Vert \delta \phi _\varepsilon (s,s')\Vert _{L^1} \lesssim \varepsilon ^{2-3H}|s'-s|^H + \varepsilon ^{2-4H} |s-s'|^{2H}. \end{aligned}$$

It suffices then to apply Lemma 4.3 to \(\phi _{\varepsilon }\) with \(\beta = 0\), \({\hat{C}} = 0\), and \(\zeta \in \{H,2H\}\) to conclude that \(\Vert \mathbf{E}(A_t^2\,|\, \mathcal{F}^Y)\Vert _{L^1} \lesssim \varepsilon ^{2-2H} t^{2H} + \varepsilon ^{2-4H}t^{4H} \), which converges to 0 as \(\varepsilon \rightarrow 0\), thus concluding the proof. \(\square \)

Collecting all of these results, we conclude that the following holds.

Proposition 5.11

Let Assumptions 2.1–2.3 hold and let \(f_1,\ldots ,f_N \in E\) for some \(N \ge 1\). The processes \(\big (J_t^{\varepsilon }(f_i), {{\mathbb {J}}}_{s,t}^{\varepsilon }(f_i,f_j)\big )_{i,j\le N}\) converge jointly in distribution to

$$\begin{aligned} \Big (W_t^{(i)} , \int _s^t W_r^{(i)}\,dW_r^{(j)} + \frac{1}{2}(t-s)\Gamma (2H+1) \langle f_i, \mathcal{L}^{1-2H} f_j\rangle _\mu \Big )_{i,j\le N}, \end{aligned}$$

where the \(W^{(i)}\) are Wiener processes with covariance

$$\begin{aligned} \mathbf{E}W^{(i)}_s W^{(j)}_t =\frac{1}{2} (s\wedge t) \Gamma (2H+1) \bigl (\langle \mathcal{L}^{1-2H}f_i, f_j\rangle _\mu + \langle \mathcal{L}^{1-2H}f_j, f_i\rangle _\mu \bigr ). \end{aligned}$$
(5.13)

Proof

If \(H<\frac{1}{2}\), then by Proposition 5.3 and Lemma 5.6 the triples \((J_t^{\varepsilon }(f_i), M_t^{f_i}, R_t^{f_i})_{i\le N}\) converge jointly to \((W_t^{(i)}, W_t^{(i)},0)\). The same holds for \(H>\frac{1}{2}\), using Lemma 5.5. Since by Proposition 5.10 the term \(\int _0^t R_s^{f_i}\,dJ_s^\varepsilon (f_j) \) converges to a deterministic limit, and \(\int _0^t J_s^{\varepsilon }(f_i)\,dM_s^{f_j}\rightarrow \int _0^tW_r^{(i)}\,dW_r^{(j)}\), the desired convergence in distribution follows by combining (5.9) with the standard convergence theorem for stochastic integrals, cf. [39, Theorem 6.22] and [43, Theorem 2.7].

The case \(H=\frac{1}{2}\) is straightforward. Firstly, we see that, conditional on \(\mathcal{F}^Y\), the \(J^{\varepsilon }_{s,t}(f_i) =\int _s^t f_i(Y^\varepsilon _r) \,dB_r\) are \(L^2\)-bounded martingales with respect to the filtration generated by B. By Lemma 5.2, their conditional covariances \( \int _s^t (f_i f_j)(Y^\varepsilon _{r})\, dr\) converge to \((t-s)\langle f_i,f_j\rangle _\mu \) in \(L^2\). Then,

$$\begin{aligned}{{\mathbb {J}}}_{s,t}^{\varepsilon }(f_i,f_j) = \int _s^t J^\varepsilon _{s,r}(f_i)\,\circ dJ_r^{\varepsilon }(f_j) =\int _s^t \int _s^u f_i(Y_r^{\varepsilon }) f_j(Y_u^{\varepsilon })\, dB_r\, dB_u +\frac{1}{2} \int _s^t \big (f_i f_j\big ) (Y_r^{\varepsilon })\, dr, \end{aligned}$$

which converges in \(L^2\). Since the \(J^{\varepsilon }_{s,t}(f_i)\) converge, they converge jointly with their iterated integrals, which concludes the proof of the convergence of \((J_t^{\varepsilon }(f_i), {{\mathbb {J}}}_{s,t}^{\varepsilon }(f_i,f_j))_{i,j\le N}\) for \(H=\frac{1}{2}\). \(\square \)
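As a consistency check, note that for \(H=\frac{1}{2}\) the covariance (5.13) reduces to \(\mathbf{E}W^{(i)}_s W^{(j)}_t = (s\wedge t)\langle f_i,f_j\rangle _\mu \), since \(\Gamma (2)=1\) and \(\mathcal{L}^{0}\) is the identity, while the correction term in the statement reduces to \(\frac{1}{2}(t-s)\langle f_i, f_j\rangle _\mu \); both match the limits computed directly in the last part of the proof.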