Functional central limit theorems for rough volatility

Horvath, Blanka; Jacquier, Antoine; Muguruza, Aitor; Søjmark, Andreas

doi:10.1007/s00780-024-00533-5

Functional central limit theorems for rough volatility

Open access
Published: 16 April 2024

Volume 28, pages 615–661, (2024)
Cite this article

Download PDF

You have full access to this open access article

Finance and Stochastics Aims and scope Submit manuscript

Functional central limit theorems for rough volatility

Download PDF

Blanka Horvath¹,
Antoine Jacquier²,
Aitor Muguruza³ &
…
Andreas Søjmark⁴

1323 Accesses
Explore all metrics

Abstract

The non-Markovian nature of rough volatility makes Monte Carlo methods challenging, and it is in fact a major challenge to develop fast and accurate simulation algorithms. We provide an efficient one for stochastic Volterra processes, based on an extension of Donsker’s approximation of Brownian motion to the fractional Brownian case with arbitrary Hurst exponent $H \in (0,1)$. Some of the most relevant consequences of this ‘rough Donsker (rDonsker) theorem’ are functional weak convergence results in Skorokhod space for discrete approximations of a large class of rough stochastic volatility models. This justifies the validity of simple and easy-to-implement Monte Carlo methods, for which we provide detailed numerical recipes. We test these against the current benchmark hybrid scheme and find remarkable agreement (for a large range of values of $H$). Our rDonsker theorem further provides a weak convergence proof for the hybrid scheme itself and allows constructing binomial trees for rough volatility models, the first available scheme (in the rough volatility context) for early exercise options such as American or Bermudan options.

Option pricing under fast-varying and rough stochastic volatility

Article 08 June 2018

Asymptotics for volatility derivatives in multi-factor rough volatility models

Article Open access 09 January 2021

The characteristic function of Gaussian stochastic volatility models: an analytic expression

Article 16 September 2022

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Fractional Brownian motion has a long and famous history in probability, stochastic analysis and their applications to diverse fields; see Hurst [39], Kolmogorov [44] and Mandelbrot and Van Ness [52]. Recently, it has experienced a new renaissance in the form of fractional volatility models in mathematical finance. These were first introduced by Comte and Renault [18] and later studied theoretically by Djehiche and Eddahbi [20], Alòs et al. [2] and Fukasawa [29], and given financial motivation and data consistency by Gatheral et al. [32] and Bayer et al. [6]. Since then, a vast literature has pushed the analysis in many directions (see Bayer et al. [5], Bayer et al. [7], Bennedsen et al. [11], El Euch and Rosenbaum [24], Forde and Zhang [26], Fukasawa et al. [30], Guennoun et al. [33], Gulisashvili [34], Horvath et al. [38], Jacquier et al. [41] and Neuman and Rosenbaum [58]), leading to theoretical and practical challenges to understand and implement these models. One of the main issues, at least from a practical point of view, is on the numerical side: the absence of Markovianity rules out PDE-based schemes, and simulation is the only possibility. However, classical simulation methods for fractional Brownian motion (based on Cholesky decomposition or circulant matrices) are notoriously slow, and faster techniques are needed. The state-of-the-art so far is the recent hybrid scheme developed by Bennedsen et al. [10] and its turbocharged version in McCrickerd and Pakkanen [54]. We rise here to this challenge and propose an alternative tree-based approach, mathematically rooted in an extension of Donsker’s theorem to rough volatility.

Donsker [21] (and later Lamperti [48]) proved a functional central limit for Brownian motion, thereby providing a theoretical justification of its random walk approximation. Many extensions have been studied in the literature, and we refer the interested reader to Dudley [23, Chap. 10] for an overview. In the fractional case, Sottinen [66] and Nieminen [59] constructed – following Donsker’s ideas of using i.i.d. sequences of random variables – an approximating sequence converging to fractional Brownian motion with Hurst parameter $H>1/2$. In order to deal with the non-Markovian behaviour of fractional Brownian motion, Taqqu [67] considered sequences of non-i.i.d. random variables, again with the restriction $H>1/2$. Unfortunately, these methodologies do not seem to carry over to the ‘rough’ case $H<1/2$, mainly because of the topologies involved. The recent development of rough paths theory (see Friz and Victoir [28, Sect. 13.4] and Lyons [51]) provided an appropriate framework to extend Donsker’s results to processes with sample paths of Hölder regularity strictly smaller than $1/2$. For $H \in (1/3,1/2)$, Bardina et al. [3] used rough paths to show that functional central limit theorems (in the spirit of Donsker) apply. This in particular suggests that the natural topology to use for fractional Brownian motion with $H<1/2$ is the topology induced by the Hölder norm of the sample paths. Indeed, switching the topology from the Skorokhod one used by Donsker to the (stronger) Hölder topology is the right setting for rough central limit theorems, as we outline in this paper. Recent results by Bender and Parczewski [9] and Parczewski [60, 61] provide convergence for (geometric) fractional Brownian motions with general $H \in (0,1)$ using Wick calculus, assuming that the approximating sequences are Bernoulli random variables. We extend this (Theorem 2.11) to a universal functional central limit theorem, involving general (discrete or continuous) random variables as approximating sequences, only requiring finiteness of moments.

We consider a general class of continuous processes with any Hölder regularity, including fractional Brownian motion with $H\in (0,1)$, truncated Brownian semi-stationary processes, Gaussian Volterra processes, as well as rough volatility models recently proposed in the financial literature. The fundamental novelty here is an approximating sequence capable of simultaneously keeping track of the approximated rough volatility process (fractional Brownian motion, Brownian semistationary process, or any continuous path functional thereof) and of the underlying Brownian motion. This is crucial in order to take into account the correlation of the two processes, the so-called leverage effect in financial modelling. While approximations of two-dimensional (correlated) semimartingales are well understood in the standard case, the rough case is so far an open problem. Our analysis easily generalises beyond Brownian drivers to more general semimartingales, emphasising that the subtle, yet essential, difficulties lie in the passage from the semimartingale setup to the rough case. This is the first Monte Carlo method available in the literature, specifically tailored to two-dimensional rough systems, based on an approximating sequence for which we prove a Donsker–Lamperti-type functional central limit theorem (FCLT). This further provides a pathwise justification of the hybrid scheme by Bennedsen et al. [10], and allows the use of tree-based schemes, opening the doors to pricing early-exercise options such as American options. In Sect. 2, we present the class of models we are considering and state our main results. The proof of the main theorem is developed in Sect. 3 in several steps. We reserve Sect. 4 to applications of the main result, namely weak convergence of the hybrid scheme, binomial trees as well as numerical examples. We present simple numerical recipes, providing a pedestrian alternative to the advanced hybrid schemes in [10, 54], and develop a simple Monte Carlo scheme with low implementation complexity, for which we provide comparison charts against [10] in terms of accuracy and against [54] in terms of speed. Reminders on Riemann–Liouville operators and additional technical proofs are postponed to the Appendix.

Notations: For the interval $\mathbb{I}:=[0,1]$, we denote by $\mathcal {C}(\mathbb{I})$ and $\mathcal {C}^{\alpha}(\mathbb{I})$ the spaces of continuous and $\alpha $-Hölder-continuous functions on $\mathbb{I}$ with Hölder regularity $\alpha \in (0,1)$, respectively. We also need $\mathcal {C}^{1}(\mathbb{I}):=\{f:\mathbb{I}\to \mathbb{R}: f'\text{ exists and is continuous on } \mathbb{I}\} $. For $0 \le \alpha \leq 1$, we define $\mathcal {C}^{\alpha}_{+}(\mathbb{I}):=\{f \in \mathcal {C}^{\alpha }(\mathbb{I}) : f \geq 0\}$. Both definitions imply bounded first order derivatives on $\mathbb{I}$. We use $C$, $\widetilde{C}$, $\widehat{C}$, $C_{1}$, $C_{2}$, $\overline{C}$, $\underline{C}$ as strictly positive real constants which may change from line to line and whose exact values do not matter.

2 Weak convergence of rough volatility models

Donsker’s invariance principle, also termed ‘functional central limit theorem’, ensures the weak convergence in the Skorokhod space of an approximating sequence to a Brownian motion. As opposed to the central limit theorem, Donsker’s theorem is a pathwise statement which ensures that convergence takes place for all times. This result is particularly important for Monte Carlo methods which aim to approximate pathwise functionals of a given process (an essential requirement to price path-dependent financial securities for example). We prove here a version of Donsker’s result not only in the Skorokhod topology, but also in the stronger Hölder topology, for a general class of continuous stochastic processes.

2.1 Hölder spaces and fractional operators

For $\beta \in (0,1)$, the $\beta $-Hölder space $\mathcal {C}^{\beta}(\mathbb{I})$ with the norm

$$ \| f\|_{\beta} := |f|_{\beta} + \| f\|_{\infty }= \sup _{ \substack{t,s\in \mathbb{I}\\ t\neq s}}\frac{|f(t)-f(s)|}{|t-s|^{\beta}} + \max _{t\in \mathbb{I}} |f(t)|, $$

is a non-separable Banach space; see Krylov [45, Chap. 3]. In the spirit of Riemann–Liouville fractional operators recalled in Appendix A, we introduce generalised fractional operators. For $\lambda \in (0,1)$ and any $\alpha \in (-\lambda , 1-\lambda )$, we define the space $\mathcal {L}^{\alpha }:= \{ g\in \mathcal {C}^{2}((0,1]) : |\frac{g(u)}{u^{\alpha}} |, | \frac{g^{\prime}(u)}{u^{\alpha -1}} | \text{ and }|\frac{g^{\prime \prime}(u)}{u^{\alpha -2}} |\text{ are bounded} \}$. When $\alpha >0$, we define the function $g$ at the origin to be $g(0)=0$ by continuous extension.

Definition 2.1

For any $\lambda \in (0,1)$ and $\alpha \in (-\lambda , 1-\lambda )$, the generalised fractional operator (GFO) associated to $g\in \mathcal {L}^{\alpha }$ is defined on $\mathcal {C}^{\lambda}(\mathbb{I})$ as

$$ (\mathcal {G}^{\alpha }f)(t) := \textstyle\begin{cases} \int _{0}^{t}(f(s)-f(0))\frac{\mathrm {d}}{\mathrm {d}t}g(t-s)\mathrm {d}s, & \text{if } \alpha \in (0,1-\lambda ), \\ \frac{\mathrm {d}}{\mathrm {d}t}(\int _{0}^{t}(f(s)-f(0))g(t-s)\mathrm {d}s), & \text{if } \alpha \in (-\lambda , 0). \end{cases} $$

(2.1)

We further use the notation $G(t):=\int _{0}^{t}g(u)\mathrm {d}u$ for any $t\in \mathbb{I}$. Of particular interest in mathematical finance are the kernels

$$ \textstyle\begin{array}{l@{\quad}l@{\quad}l} \text{Riemann--Liouville: } & g(u) = u^{\alpha} &\quad \text{for } \alpha \in (-1, 1); \\ \text{Gamma fractional: } & g(u) = u^{\alpha} \mathrm {e}^{\beta u} &\quad \text{for }\alpha \in (-1, 1), \beta < 0; \\ \text{power-law: } & g(u) = u^{\alpha}(1+u)^{\beta -\alpha} &\quad \text{for } \alpha \in (-1, 1), \beta < -1. \end{array} $$

(2.2)

The result below generalises the classical mapping properties of Riemann–Liouville fractional operators first proved by Hardy and Littlewood [36] and is of fundamental importance in the rest of our analysis.

Proposition 2.2

For any $\lambda \in (0,1)$ and $\alpha \in (-\lambda , 1-\lambda )$, the operator $\mathcal {G}^{\alpha}$ is continuous from $\mathcal {C}^{\lambda}(\mathbb{I})$ to $\mathcal {C}^{\lambda +\alpha}(\mathbb{I})$.

The proof can be found in Appendix C. This result is analogous to the classical Schauder estimates, phrased in terms of convolution with a suitable regularising kernel, as e.g. treated in Friz and Hairer [27, Theorem 14.17] and Broux et al. [15, Theorem 2.13 and Lemma 2.9], but in settings that are slightly different from ours.

We develop here an approximation scheme for the following system which generalises the concept of rough volatility introduced in Alòs et al. [2], Fukasawa [29] and Gatheral et al. [32] in the context of mathematical finance. The process $X$ represents the dynamics of the logarithm of a stock price process, under a given reference pricing measure, with

$$ \textstyle\begin{cases} \mathrm {d}X_{t} \hspace{-2ex} & = \displaystyle -\frac{1}{2}V_{t} \mathrm {d}t + \sqrt{V_{t}} \, \mathrm {d}B_{t}, \qquad X_{0} = 0, \\ V_{t} & = \displaystyle \Phi (\mathcal {G}^{\alpha }Y )(t), \end{cases} $$

(2.3)

with $\alpha \in (-\frac{1}{2},\frac{1}{2})$ and where $Y$ is the (strong) solution to the stochastic differential equation

$$ \mathrm {d}Y_{t} = b(Y_{t})\mathrm {d}t + a(Y_{t})\mathrm {d}W_{t}, \qquad Y_{0}\in \mathcal {D}_{Y}, $$

(2.4)

where $\mathcal {D}_{Y}$ denotes the state space of $Y$, usually ℝ or $\mathbb{R}_{+}$. The two Brownian motions $B$ and $W$ are defined on a common filtered probability space $(\Omega , \mathcal {F}, (\mathcal {F}_{t})_{t\in \mathbb{I}}, \mathbb{P})$ and are correlated by the parameter $\rho \in [-1,1]$, and we introduce $\overline {\rho }:=\sqrt{1-\rho ^{2}}$. We further let $\Phi $ be any operator such that for all $\gamma \in (0,1)$, we have $\Phi :\mathcal {C}^{\gamma}( \mathbb{I})\to \mathcal {C}^{\gamma}_{+}(\mathbb{I})$ with $\Phi $ continuous from $(\mathcal {C}^{\gamma}(\mathbb{I}),\|\cdot \|_{\gamma })$ to itself. This in particular ensures that whenever $Y \in \mathcal {C}^{\lambda}(\mathbb{I})$, then $V \in \mathcal {C}_{+}^{\alpha +\lambda}(\mathbb{I})$, i.e., $V$ is nonnegative and belongs to $\mathcal {C}^{\alpha +\lambda}(\mathbb{I})$. As an example, one can consider a so-called Nemyckij operator $\Phi (f):=\phi \circ f$ given by composition with some $\phi :\mathbb{R} \rightarrow \mathbb{R}_{+}$, in which case Drábek [22] has shown that the operator $\Phi $ is continuous from $(\mathcal {C}^{\gamma}(\mathbb{I}),\|\cdot \|_{\gamma })$ to $(\mathcal {C}^{\gamma}(\mathbb{I}),\|\cdot \|_{\gamma })$ for all $\gamma \in (0,1)$ if and only if $\phi \in \mathcal {C}^{1}(\mathbb{R})$. It remains to formulate a precise definition for $\mathcal {G}^{\alpha }W$ (Proposition 2.4) and for $\mathcal {G}^{\alpha }Y$ (Corollary 2.5) to fully specify the system (2.3) and clarify the existence of solutions.

Assumption 2.3

There exist $C_{b}, C_{a}>0$ such that for all $y \in \mathcal {D}_{Y}$,

$$ |b(y)| \leq C_{b} (1+|y|), \qquad |a(y)| \leq C_{a} (1+|y|), $$

where $a$ and $b$ are continuous functions such that there is a unique strong solution to (2.4).

Existence of solutions to (2.4) along with Assumption 2.3 need to be checked case by case beyond standard Lipschitz and linear growth conditions (we provide a showcase of models in Examples 2.6–2.9 below satisfying existence and pathwise uniqueness conditions). Not only is the solution to (2.4) continuous, but $( \frac{1}{2}-\varepsilon )$-Hölder- continuous for any $\varepsilon \in (0, \frac{1}{2})$ as a consequence of the Kolmogorov–Čentsov theorem in Čentsov [16]. Existence and precise meaning of $\mathcal {G}^{ \alpha }Y$ are delicate and treated below.

2.2 Examples

Before constructing our approximation scheme, let us discuss a few examples of processes within our framework. As a first useful application, these generalised fractional operators provide a (continuous) mapping between a standard Brownian motion and its fractional counterpart.

Proposition 2.4

For any $\alpha \in (-\frac{1}{2},\frac{1}{2} )$, the equality $(\mathcal {G}^{\alpha }W)(t)=\int _{0}^{t} g(t-s)\mathrm {d}W_{s}$ holds almost surely for all $t\in \mathbb{I}$.

Proof

Since the paths of Brownian motion are $(\frac{1}{2}-\varepsilon )$-Hölder-continuous for any $\varepsilon \in (0,\frac{1}{2})$, existence (and continuity) of $\mathcal {G}^{\alpha }W$ is guaranteed for all $\alpha \in (-\frac{1}{2},\frac{1}{2} )$. When $\alpha \in (0,\frac{1}{2} )$, the kernel is smooth and square-integrable, so that Itô’s product rule yields by Proposition 2.2 (since $g(0)=0$ and $\alpha >0$) that

$$\begin{aligned} (\mathcal {G}^{\alpha }W)(t) &= \int _{0}^{t} (W_{s} -W_{0}) \frac{\mathrm {d}}{\mathrm {d}t}g(t-s) \mathrm {d}s \\ &= g(t) (W_{0} -W_{0} ) -g(0) (W_{t} -W_{0} ) + \int _{0}^{t} g(t-s) \mathrm {d}W_{s} \\ &=\int _{0}^{t} g(t-s)\mathrm {d}W_{s}, \end{aligned}$$

and the claim holds. For $\alpha \in (-\frac{1}{2}, 0)$ and any $\varepsilon >0$, introduce the operator

$$ (\mathcal {G}^{1+\alpha}_{\varepsilon }f )(t) := \int _{0}^{t-\varepsilon }g(t-s)\big(f(s)-f(0) \big)\mathrm {d}s \qquad \text{for all }t\in \mathbb{I}, $$

which satisfies $\frac{\mathrm {d}}{\mathrm {d}t}\lim _{\varepsilon \downarrow 0} (\mathcal {G}^{1+\alpha}_{\varepsilon }f )(t) = \left (\mathcal {G}^{\alpha}f\right )(t)$ pointwise. Now for any $t\in \mathbb{I}$, almost surely,

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} (\mathcal {G}^{1+\alpha}_{\varepsilon }W )(t) &= g(\varepsilon ) (W_{t-\varepsilon }-W_{0}) - g(t) (W_{0}-W_{0} ) + \int _{0}^{t-\varepsilon } W_{s} \frac{\mathrm {d}}{\mathrm {d}t}g(t-s) \mathrm {d}s \\ &=g(\varepsilon )W(0)+\int _{0}^{t-\varepsilon }g(t-s)\mathrm {d}W_{s}. \end{aligned}$$

(2.5)

Then as $\varepsilon $ tends to zero, the right-hand side of (2.5) tends to $\int _{0}^{t}g(t-s)\mathrm {d}W_{s}$ and furthermore, the convergence is uniform. On the other hand, the equalities

$$\begin{aligned} (\mathcal {G}_{0}^{1+\alpha}W)(t)-(\mathcal {G}_{0}^{1+\alpha}W)(0) & = \lim _{\varepsilon \downarrow 0} \big((\mathcal {G}^{1+\alpha}_{\varepsilon }W)(t)-(\mathcal {G}^{1+\alpha}_{ \varepsilon }W)(0) \big) \\ & = \lim _{\varepsilon \downarrow 0}\int _{0}^{t}\bigg(\frac{\mathrm {d}}{\mathrm {d}s}\mathcal {G}^{1+ \alpha}_{\varepsilon }W\bigg)(s)\mathrm {d}s \\ & = \int _{0}^{t}\lim _{\varepsilon \downarrow 0}\bigg(\frac{\mathrm {d}}{\mathrm {d}s}\mathcal {G}^{1+ \alpha}_{\varepsilon }W\bigg)(s)\mathrm {d}s \\ & = \int _{0}^{t}\left (\int _{0}^{s}g(s-u)\mathrm {d}W_{u}\right )\mathrm {d}s \end{aligned}$$

hold since convergence is uniform on compacts, and the fundamental theorem of calculus concludes the proof. □

Minimal changes to the proof of Proposition 2.4 also yield the following result.

Corollary 2.5

If $Y$ solves (2.4), then $(\mathcal {G}^{\alpha }Y)(t)=\int _{0}^{t} g(t-s)\mathrm {d}Y_{s}$ almost surely for all $t\in \mathbb{I}$ and $\alpha \in (-\frac{1}{2}, \frac{1}{2})$.

Up to a constant multiplicative factor $C_{\alpha}$, the (left) fractional Riemann–Liouville operator (see Appendix A) is identical to the GFO in (2.2) so that the Riemann–Liouville (or type-II) fractional Brownian motion can be written as $C_{\alpha }\mathcal {G}^{\alpha }W$. Proposition 2.2 then implies that the Riemann–Liouville operator is continuous from $\mathcal {C}^{1/2}( \mathbb{I})$ to $\mathcal {C}^{1/2+\alpha}(\mathbb{I})$ for $\alpha \in (-\frac{1}{2},\frac{1}{2} )$. Each kernel in (2.2) gives rise to processes proposed by Barndorff-Nielsen and Schmiegel [4] for turbulence and financial modelling.

Example 2.6

The rough Bergomi model by Bayer et al. [6] reads

$$ V_{t} = \xi _{0}(t)\mathcal {E}\bigg(2\nu C_{H}\int _{0}^{t}(t-s)^{\alpha} \mathrm {d}W_{s} \bigg), $$

with $V_{0}, \nu ,\xi _{0}(\,\cdot \,)>0$, $\alpha \in (-\frac{1}{2},\frac{1}{2} )$ and $\mathcal {E}(\,\cdot \,)$ is the stochastic exponential. (We remark that this coincides here with the Wick stochastic exponential.) This corresponds exactly to (2.3) with $g(u) = u^{\alpha}$, $Y = W$ and

$$ \Phi (\varphi )(t) := \xi _{0}(t) \exp \big(2\nu C_{H}\varphi (t) \big)\exp \bigg(-2\nu ^{2}C_{H}^{2}\int _{0}^{t}(t-s)^{2\alpha} \mathrm {d}s \bigg). $$

Example 2.7

A truncated Brownian semistationary (TBSS) process is defined as $\int _{0}^{t} g(t-s)\sigma (s)\mathrm {d}W_{s}$ for $t\in \mathbb{I}$, where $\sigma $ is $(\mathcal {F}_{t})_{t\in \mathbb{I}}$-predictable with locally bounded trajectories and finite second moments, and $g: \mathbb{I}\setminus \{0\}\to \mathbb{I}$ is Borel-measurable and square-integrable. If $\sigma \in \mathcal {C}^{1}(\mathbb{I})$, this class falls within the GFO framework.

Example 2.8

Bennedsen et al. [11] added a Gamma kernel to the volatility process, which yields the truncated Brownian semistationary (Bergomi-type) model

$$ V_{t} = \xi _{0}(t)\mathcal {E}\bigg(2\nu C_{H}\int _{0}^{t}(t-s)^{\alpha} \mathrm {e}^{- \beta (t-s)}\mathrm {d}W_{s}\bigg), $$

with $\beta >0$, $\alpha \in (-\frac{1}{2},\frac{1}{2} )$. This corresponds to (2.3) with $Y=W$, Gamma fractional kernel $g(u) = u^{\alpha} \mathrm {e}^{-\beta u}$ in (2.2) and

$$ \Phi (\varphi )(t) := \xi _{0}(t) \exp \big(2\nu C_{H}\varphi (t) \big)\exp \bigg(-2\nu ^{2}C_{H}^{2}\int _{0}^{t}(t-s)^{2\alpha} \mathrm {e}^{-2 \beta (t-s)}\mathrm {d}s\bigg). $$

Example 2.9

The version of the rough Heston model introduced by Guennoun et al. [33] reads

$$ \begin{aligned} Y_{t} & \displaystyle = Y_{0} + \int _{0}^{t}\kappa (\theta -Y_{s}) \mathrm {d}s+\int _{0}^{t}\xi \sqrt{Y_{s}} \, \mathrm {d}W_{s}, \\ V_{t} & \displaystyle =\eta +\int _{0}^{t}(t-s)^{\alpha }\mathrm {d}Y_{s}, \end{aligned} $$

with $Y_{0}$, $\kappa $, $\xi $, $\theta $ all $>0$, $2\kappa \theta >\xi ^{2}$ and $\eta >0$, $\alpha \in (-\frac{1}{2},\frac{1}{2} )$. This corresponds exactly to (2.3) with $g(u) = u^{\alpha}$, $\Phi (\varphi )(t) := \eta + \varphi (t)$, and the coefficients of (2.4) read $b(y) = \kappa (\theta -y)$ and $a(y) = \xi \sqrt{y}$. This model is markedly different from the rough Heston model introduced by El Euch and Rosenbaum [24] (for which the characteristic function is known in semiclosed form). Unfortunately, this version of the rough Heston model is outside of the scope of our invariance principle.

2.3 The approximation scheme

We now move on to the core of the project, namely an approximation scheme for the system (2.3), (2.4). The basic ingredient to construct approximating sequences will be suitable families of i.i.d. random variables which satisfy the following assumption.

Assumption 2.10

The family $(\xi _{i})_{i\geq 1}$ forms an i.i.d. sequence of centered random variables with finite moments of all orders and $\mathbb{E}[\xi _{1}^{2}] = \sigma ^{2}>0$.

Given $(\zeta _{i})_{i\geq 1}$ as in Assumption 2.10, Lamperti’s [49] generalisation of Donsker’s invariance principle in Donsker [21] tells us that a Brownian motion $W$ can be approximated weakly in Hölder space (Theorem 3.1) by processes of the form

$$ W_{n}(t) := \frac{1}{\sigma \sqrt{n}}\sum _{k=1}^{\lfloor nt \rfloor }\zeta _{k} + \frac{nt-\lfloor nt \rfloor }{\sigma \sqrt{n}}\zeta _{\lfloor nt \rfloor +1}, $$

(2.6)

defined pathwise for any $\omega \in \Omega $, $n\geq 1$ and $t\in \mathbb{I}$. As we show in Sect. 3.2, a similar construction holds to weakly approximate the process $Y$ from (2.4) in Hölder space via

$$\begin{aligned} Y_{n}(t) &:= Y_{n}(0) + \frac{1}{n}\sum _{k=1}^{\lfloor nt \rfloor }b (Y_{n}^{k-1} ) + \frac{nt-\lfloor nt \rfloor }{n}b (Y_{n}^{\lfloor nt \rfloor } ) \\ & \hphantom{=::} + \frac{1}{\sigma \sqrt{n}}\sum _{k=1}^{\lfloor nt \rfloor }a(Y_{n}^{k-1} )\zeta _{k} + \frac{nt-\lfloor nt \rfloor }{\sigma \sqrt{n}} a(Y_{n}^{\lfloor nt \rfloor } )\zeta _{\lfloor nt \rfloor +1}, \end{aligned}$$

(2.7)

where $Y_{n}^{k}:=Y_{n}(t_{k})$ and $\mathcal {T}_{n}:=\{t_{k} = \frac{k}{n} : k=0,1,\dots ,n\}$. Here the $\zeta _{k}$ correspond to the innovations of the Brownian motion $W$ in (2.4). Similarly, we use $\xi _{k}$ when referring to the innovations of the Brownian $B$ from (2.3) which enter into the approximations of the log-stock price in (2.8) below. Throughout the paper, we assume that the innovations $(\xi _{i})_{i=1, \dots ,\lfloor nt \rfloor }$ and $(\zeta _{i})_{i=1, \dots ,\lfloor nt \rfloor }$ come from two sequences $(\xi _{i})_{i\geq 1}$ and $(\zeta _{i})_{i\geq 1}$ satisfying Assumption 2.10 and such that $((\xi _{i},\zeta _{i}))_{i\geq 1}$ is i.i.d. with $\mathrm{corr}(\xi _{i},\zeta _{i})=\rho $ for all $i\geq 1$. Naturally, the approximations in (2.7) and in (2.8) below should be understood pathwise, but we omit the $\omega $-dependence in the notations for clarity.

Regarding the approximation scheme for the process $X$ given by (2.3), we follow a typical route in weak convergence analysis (see Billingsley [13, Sect. 13] and Ethier and Kurtz [25, Chap. 3]) and establish convergence in the Skorokhod space $\left (\mathcal {D}(\mathbb{I}),d_{\mathcal {D}}\right )$. Here $\mathcal {D}(\mathbb{I})=\mathcal {D}(\mathbb{I};\mathbb{R})$ denotes the space of ℝ-valued càdlàg functions on $\mathbb{I}$ and $d_{\mathcal {D}}$ denotes a metric inducing Skorokhod’s $J_{1}$-topology. To approximate $X$ in this space, we then consider the process

$$ \textstyle\begin{array}{l@{\quad}l} X_{n}(t) &:=\; \displaystyle -\frac{1}{2n}\sum _{k=1}^{\lfloor nt \rfloor }\Phi (\mathcal {G}^{ \alpha }Y_{n} )(t_{k-1}) + \frac{1}{\sigma \sqrt{n}}\sum _{k=1}^{\lfloor nt \rfloor } \sqrt{\Phi (\mathcal {G}^{\alpha }Y_{n} )(t_{k-1})} \, \xi _{k}. \end{array} $$

(2.8)

Analogously to (2.7), one could rewrite these as continuous processes via linear interpolation; but we note that the interpolating term would decay to zero by Chebyshev’s inequality. The following result, proved in Sect. 3.4, confirms the functional convergence of the approximating sequence $(X_{n})_{n\geq 1}$.

Theorem 2.11

The sequence $(X_{n})_{n\geq 1}$ converges in $(\mathcal {D}(\mathbb{I}),d_{\mathcal {D}})$ weakly to $X$.

The construction of the proof allows extending the convergence to the case where $Y$ is a $d$-dimensional diffusion without additional work. The proof of the theorem requires a certain number of steps: we start with the convergence of the approximations $(Y_{n})$ in some Hölder space, which we then translate into convergence of the sequence $(\Phi (\mathcal {G}^{\alpha }Y_{n}))$ by suitable continuity properties of the operations $\mathcal {G}^{\alpha}$ and $\Phi $, before finally deducing also the convergence of the corresponding stochastic integrals for the approximations of (2.3). These steps are carried out in Sects. 3.2–3.4 below.

3 Functional CLT for a family of Hölder-continuous processes

3.1 Weak convergence of Brownian motion in Hölder spaces

Donsker’s classical convergence result was proved under the Skorokhod topology. We concentrate here on convergence in the Hölder topology, due to Lamperti [49]. The standard convergence result for Brownian motion can be stated as follows.

Theorem 3.1

For $\lambda <\frac{1}{2}$, the sequence $(W_{n})$ in (2.6) converges in $(\mathcal {C}^{\lambda}(\mathbb{I}),\|\cdot \|_{\lambda })$ weakly to a Brownian motion.

The proof relies on finite-dimensional convergence and tightness of the approximating sequence. Not surprisingly, the tightness criterion in [13, Sect. 13] for the Skorokhod space $\mathcal {D}(\mathbb{I})$ and for a Hölder space are different. In fact, the tightness criterion in a Hölder space is strictly related to the Kolmogorov–Čentsov continuity result [16]. Note in passing that the approximating sequence (2.6) is piecewise differentiable in time for each $n\geq 1$ even though its limit is obviously not. The proof of Theorem 3.1 follows from Theorem 3.3 below under Assumption 2.10.

Theorem 3.2

Let $Z\in \mathcal {C}^{\lambda}(\mathbb{I})$ and $(Z_{n})_{n\geq 1}$ be an approximating sequence in the sense that for any sequence $(\tau _{k})_{k \geq 1}$ in $\mathbb{I}$, the sequence $((Z_{n}(\tau _{k}))_{k \geq 1})_{n \geq 1}$ converges in distribution to $(Z(\tau _{k}))_{k \geq 1}$ as $n \to \infty $. Assume further that

$$ \mathbb{E}[|Z_{n}(t) - Z_{n}(s)|^{\gamma }] \leq C|t-s|^{1+\beta} $$

(3.1)

holds for all $n\geq 1$, $t,s\in \mathbb{I}$ and some $C$, $\gamma $, $\beta $ all $>0$ with $\frac{\beta}{\gamma}\leq \lambda $. Then $(Z_{n})_{n\geq 1}$ converges in $\mathcal {C}^{\mu}(\mathbb{I})$ weakly to $Z$ for $\mu <\frac{\beta}{\gamma}\leq \lambda $.

The proof of this theorem relies on results of Račkauskas and Suquet [63], who prove the convergence in the Hölder space $\mathcal {C}_{0}^{\lambda}(\mathbb{I})$ with the norm $\|f\|^{0}_{\lambda}:=|f|_{ \lambda}+|f(0)|$ for all functions that satisfy

$$ \lim _{\delta \downarrow 0} \sup _{ \substack{0< t-s< \delta \\ t,s\in \mathbb{I}}}\frac{|f(t)-f(s)|}{(t-s)^{\gamma}}=0. $$

From this point, the proof of Theorem 3.2 is a straightforward consequence because $(\mathcal {C}_{0}^{\lambda}( \mathbb{I}),\|\cdot \|^{0}_{\lambda })$ is a separable closed subspace of $(\mathcal {C}^{\lambda}(\mathbb{I}),\|\cdot \|_{\lambda })$ (see Hamadouche [35] and [63] for details), and one can then use (3.1) to establish tightness as in [49]. Moreover, as the identity map from $C_{0}^{\lambda}(\mathbb{I})$ into $\mathcal {C}^{\lambda}(\mathbb{I})$ is continuous, weak convergence in the former implies weak convergence in the latter. To conclude our review of weak convergence in Hölder spaces, the following theorem due to Račkauskas and Suquet [63] provides necessary and sufficient conditions ensuring convergence in Hölder space.

Theorem 3.3

Račkauskas–Suquet [63, Theorem 1]

For $\lambda \in (0, \frac{1}{2})$, the sequence $(W_{n})_{n\geq 1}$ in (2.6) converges in $\mathcal {C}^{\lambda}(\mathbb{I})$ weakly to a Brownian motion if and only if $\mathbb{E}[\xi _{1}] = 0$ and $\lim \limits _{t\uparrow \infty}t^{\frac{1}{1-2\lambda}}\mathbb{P}[| \xi _{1}|\geq t] = 0$.

Assumption 2.10 ensures the conditions in Theorem 3.3. The following statement allows us to apply Theorem 3.2 on $\mathbb{I}$ and extend the Hölder convergence result via linear interpolation to a sequence of continuous processes.

Theorem 3.4

Let $Z\in \mathcal {C}^{\lambda}(\mathbb{I})$ and $(Z_{n})_{n\geq 1}$ be an approximation sequence such that finite-dimensional convergence holds as $n \to \infty $. Moreover, if

$$ \mathbb{E}[ |Z_{n}(t_{i})-Z_{n}(t_{j}) |^{\gamma} ]\leq C |t_{i}-t_{j} |^{1+ \beta} $$

(3.2)

for any $t_{i}, t_{j} \in \mathcal {T}_{n}$ and some $\beta $, $\gamma $, $C$ all $>0$ with $\frac{\beta}{\gamma}\leq \lambda $ and $\gamma \geq 1+\beta $, then the linearly interpolating sequence

$$ \overline{Z}_{n}(t):=Z_{n}\bigg(\frac{\lfloor nt \rfloor }{n}\bigg) +(nt-\lfloor nt \rfloor )\bigg(Z_{n} \Big(\frac{\lfloor nt \rfloor +1}{n}\Big) -Z_{n}\Big(\frac{\lfloor nt \rfloor }{n}\Big) \bigg) $$

satisfies (3.1). In particular, $(\overline{Z}_{n})_{n \geq 1}$ converges in $\mathcal {C}^{\mu}(\mathbb{I})$ weakly to $Z$ for $\mu <\frac{\beta}{\gamma}\leq \lambda $.

Proof

It suffices to confirm (3.1) for $\bar{Z}_{n}$. The final claim then follows from Theorem 3.2. Take $t, s \in \mathbb{I}$ and suppose first that $t-s\geq \frac{1}{n}$. Letting $Z_{n}^{k} := Z_{n}(t_{k})$ and $\overline{Z}_{n}^{k} := \overline{Z}_{n}(t_{k})$, we can then write

$$\begin{aligned} \mathbb{E}[|\overline{Z}_{n}(t)-\overline{Z}_{n}(s)|^{\gamma} ] &= \mathbb{E}[ |Z_{n}^{ \lfloor nt \rfloor } + (nt - \lfloor nt \rfloor ) (Z_{n}^{\lfloor nt \rfloor +1} - Z_{n}^{\lfloor nt \rfloor } ) \\ & \hphantom{=\mathbb{E}[} - Z_{n}^{\lfloor ns \rfloor } - (ns - \lfloor ns \rfloor ) (Z_{n}^{\lfloor ns \rfloor +1} - Z_{n}^{\lfloor ns \rfloor } ) |^{\gamma} ] \\ & \leq 3^{\gamma -1} \mathbb{E}[ |Z_{n}^{\lfloor nt \rfloor } - Z_{n}^{\lfloor ns \rfloor } |^{\gamma} + [nt- \lfloor nt \rfloor ]^{\gamma} |Z_{n}^{\lfloor nt \rfloor +1} - Z_{n}^{\lfloor nt \rfloor } |^{\gamma} \\ & \hphantom{=3^{\gamma -1} \mathbb{E}[} + [ns-\lfloor ns \rfloor ]^{\gamma} |Z_{n}^{\lfloor ns \rfloor +1} - Z_{n}^{\lfloor ns \rfloor } |^{\gamma }] \\ & \leq C \bigg(\Big(\frac{\lfloor nt \rfloor -\lfloor ns \rfloor }{n}\Big)^{1+\beta} + \frac{(nt-\lfloor nt \rfloor )^{\gamma}}{n^{1+\beta}} + \frac{(ns-\lfloor ns \rfloor )^{\gamma}}{n^{1+\beta}}\bigg) \\ &\leq C (t-s)^{1+\beta}, \end{aligned}$$

where we used (3.2) and the fact that $\frac{\lfloor nt \rfloor -\lfloor ns \rfloor }{n}\leq 2(t-s)$, $nt -\lfloor nt \rfloor \leq 1$ for $t\geq 0$ since $t-s \geq \frac{1}{n}$.

It remains to consider the case $t-s < \frac{1}{n} $. There are two possible scenarios. If $\lfloor nt \rfloor =\lfloor ns \rfloor $, then using $\gamma \geq 1+\beta $ gives

$$ \mathbb{E}[|\overline{Z}_{n}(t)-\overline{Z}_{n}(s)|^{\gamma}] = \mathbb{E}[ |( nt-ns) (Z_{n}^{\lfloor nt \rfloor +1} - Z_{n}^{\lfloor nt \rfloor } ) |^{\gamma} ] \leq \frac{C|t-s|^{\gamma}}{ n^{1+\beta -\gamma}} \leq C (t-s)^{1+\beta}. $$

If $\lfloor nt \rfloor \neq \lfloor ns \rfloor $, then either $\lfloor nt \rfloor +1= \lfloor ns \rfloor $ or $\lfloor nt \rfloor = \lfloor ns \rfloor +1$. Without loss of generality, consider the second case. Then

$$\begin{aligned} \mathbb{E}[|\overline{Z}_{n}(t)-\overline{Z}_{n}(s)|^{\gamma} ] & = \mathbb{E}[ | \overline{Z}_{n}(t)-Z_{n}^{\lfloor nt \rfloor }+Z_{n}^{\lfloor nt \rfloor } - \overline{Z}_{n}(s) |^{ \gamma} ] \\ &\leq 2^{\gamma -1} \mathbb{E}[ |\overline{Z}_{n}(t) - Z_{n}^{\lfloor nt \rfloor } |^{ \gamma}+ |Z_{n}^{\lfloor nt \rfloor }-\overline{Z}_{n}(s) |^{\gamma} ] \\ &\leq C \big((t-s)^{1+\beta}+\mathbb{E}[ |(\lfloor nt \rfloor -ns) (Z_{n}^{\lfloor nt \rfloor } -Z_{n}^{ \lfloor nt \rfloor -1} ) |^{\gamma} ]\big), \end{aligned}$$

and the result follows as before since $t-\frac{\lfloor nt \rfloor }{n}<|t-s|$ and $|s-\frac{\lfloor nt \rfloor }{n}|\leq |t-s|$. □

3.2 Weak convergence of Itô diffusions in Hölder spaces

The first important step in our analysis is to extend the Donsker–Lamperti weak convergence from Brownian motion to the Itô diffusion $Y$ in (2.4).

Theorem 3.5

The sequence $(Y_{n})_{n\geq 1}$ in (2.7) converges in $(\mathcal {C}^{\lambda}(\mathbb{I}),\|\cdot \|_{\lambda })$ weakly to $Y$ in (2.4) for all $\lambda <\frac{1}{2}$,

Proof

Finite-dimensional convergence is a classical result by Kushner [47]; so only tightness needs to be checked. In particular, using Theorem 3.4, we need only consider the partition $\mathcal {T}_{n}$. Thus we get

$$\begin{aligned} \begin{aligned} & \mathbb{E}[|Y_{n}^{j}-Y_{n}^{i}|^{2p} ] \\ &= \mathbb{E}\bigg[\bigg| \sum _{k=i+1}^{j} \frac{1}{n}b(Y_{n}^{k-1})+ \frac{1}{\sigma \sqrt{n}}a(Y_{n}^{k-1})\zeta _{k}\bigg|^{2p}\bigg] \\ & \leq 2^{2p-1}\bigg( \mathbb{E}\bigg[\bigg| \sum _{k=i+1}^{j} \frac{1}{n}b(Y_{n}^{k-1}) \bigg|^{2p}\bigg]+\mathbb{E}\bigg[\bigg|\sum _{k=i+1}^{j} \frac{a (Y_{n}^{k-1} )\zeta _{k}}{\sigma \sqrt{n}}\bigg|^{2p}\bigg] \bigg) \\ & \leq 2^{2p-1} \bigg( \mathbb{E}\bigg[\bigg| \sum _{k=i+1}^{j} \frac{1}{n}b(Y_{n}^{k-1}) \bigg|^{2p}\bigg]+C(p)\mathbb{E}\bigg[\bigg|\sum _{k=i+1}^{j} \frac{a(Y_{n}^{k-1})^{2} \zeta _{k}^{2}}{\sigma ^{2}n}\bigg|^{p} \bigg]\bigg) \\ & \leq 2^{2p-1} \bigg( \frac{(j-i)^{2p-1}}{n^{2p}} \sum _{k=i+1}^{j} \mathbb{E}[b(Y_{n}^{k-1})^{2p} ] \\ & \hphantom{=:2^{2p-1} \bigg(} +\frac{(j-i)^{p-1}}{n^{p}}C(p) \frac{\mathbb{E}[\zeta _{1}^{2p}]}{\sigma ^{2p}}\sum _{k=i+1}^{j}\mathbb{E}[a(Y_{n}^{k-1})^{2p} ]\bigg) \\ &\leq 2^{2p-1} \frac{(j-i)^{p-1}}{n^{p}}\sum _{k=i+1}^{j} \bigg(C_{b}^{2p} \mathbb{E}[(1+|Y_{n}^{k-1}|)^{2p} ] \\ & \hphantom{=:2^{2p-1} \frac{(j-i)^{p-1}}{n^{p}}\sum _{k=i+1}^{j} \bigg(} +C(p)\frac{C_{a}^{2p}\mathbb{E}[\zeta _{1}^{2p}]}{\sigma ^{2p}}\mathbb{E}[ (1+|Y_{n}^{k-1}| )^{2p} ]\bigg) \\ & \leq \max \bigg(C_{b}^{2p},C(p) \frac{C_{a}^{2p}\mathbb{E}[\zeta _{1}^{2p}]}{\sigma ^{2p}}\bigg) 2^{2p} \frac{(j-i)^{p-1}}{n^{p}} \bigg( (j-i)+\sum _{k=i+1}^{j} \mathbb{E}[|Y_{n}^{k-1}|^{2p} ] \bigg) \\ & \leq \max \bigg(C_{b}^{2p},C(p) \frac{C_{a}^{2p}\mathbb{E}[\zeta _{1}^{2p}]}{\sigma ^{2p}}\bigg) 2^{2p} \exp \bigg(\sum _{k=i+1}^{j}2^{2p} \frac{(j-i)^{p-1}}{n^{p}}\bigg) (t_{j}-t_{i})^{p} \\ & \leq \max \bigg(C_{b}^{2p},C(p) \frac{C_{a}^{2p}\mathbb{E}[\zeta _{1}^{2p}]}{\sigma ^{2p}}\bigg) 2^{2p} \exp ( 2^{2p} ) (t_{j}-t_{i})^{p} \\ &=:\mathfrak {C}(p) (t_{j}-t_{i})^{p}, \end{aligned} \end{aligned}$$

where we have used the discrete version of the BDG inequality (see Beiglböck and Siorpaes [8, Theorem 6.3]) in the martingale term $\sum _{k=i+1}^{j}\frac{1}{\sigma \sqrt{n}}a (Y_{n}^{k-1} )\zeta _{k}$ with the constant $C(p):=6^{p}(p-1)^{p-1}$. Indeed, if we consider the discrete-time martingale process $(x^{i,j}_{n})_{u}:=\sum _{k=1}^{u} \frac{1}{\sigma \sqrt{n}}a (Y_{n}^{k+i-1} )\zeta _{i+k}$ for $u\in \{1,\dots ,j-i\}$, we have $|(x^{i,j}_{n})_{j-i}|\leq \max _{u\in \{1,\dots , j-i\}} |(x^{i,j}_{n})_{u}|$, and the BDG inequality clearly also applies to $|x^{i,j}_{j-i}|$. We also used independence of $\zeta _{k}$ and $Y_{k-1}$ and the linear growth of $b(\,\cdot \,)$ and $a( \,\cdot \,)$ from Assumption 2.3, Hölder’s inequality and the discrete version of Gronwall’s lemma (see Clark [17]) in the last step. Since $\mathbb{E}[\zeta _{k}^{2p}]$ is bounded by Assumption 2.10 and the constant $\mathfrak {C}(p)$ only depends on $p$, but not on $n$, the criterion (3.2) of Theorem 3.4 holds for $p>1$ with $\gamma =2p$ and $\beta =p-1$. □

Corollary 3.6

Let $(Y_{n})_{n\geq 1}$ be as in Theorem 3.5with innovations $(\zeta _{i})_{i\geq 1}$, and suppose $(B^{n})_{n\geq 1}$ is defined by the Donsker approximations (2.6) for some innovations $(\xi _{i})_{i\geq 1}$ satisfying Assumption 2.10such that $((\zeta _{i},\xi _{i}))_{i\geq 1}$ is i.i.d. with $\mathrm{corr}(\zeta _{i},\xi _{i})=\rho $ for all $i\geq 1$. Then there is joint weak convergence in $(\mathcal {C}^{\lambda}(\mathbb{I},\mathbb{R}^{2}),\|\cdot \|_{\lambda })$ of $(B_{n},Y_{n})$ to $(B,Y)$, for all $\lambda <\frac{1}{2}$, for a standard Brownian motion $B$ such that $[B,W]_{t}=\rho t$ for $t\in \mathbb{I}$, where $W$ is the standard Brownian motion driving the dynamics of the weak limit $Y$ in (2.4).

Proof

Let $(\zeta ^{\perp}_{i})_{i\geq 1}$ satisfy Assumption 2.10 and be independent of the innovations $(\zeta _{i})_{i\geq 1}$ defining $(Y^{n})_{n\geq 1}$. Then set $\xi _{i}:=\rho \zeta _{i} + \overline {\rho }\zeta ^{\perp}_{i}$ for $i\geq 1$ and let $B_{n}$ be defined in terms of $(\xi _{i})_{i\geq 1}$. This yields the same finite-dimensional distributions of $(B_{n},Y_{n})$ as for the general $(\xi _{i})_{i\geq 1}$ in the statement of Corollary 3.6. Consider now the drift vector $\textbf{b}(y)=(0,b(y))$ and the $2\times 2$ diffusion matrix $\textbf{a}(y)$ with rows $(\rho , \overline {\rho })$ and $(0,a(y))$. Then the result of Kushner [47] applies directly to give finite-dimensional convergence with the desired limit. Finally, tightness of $((B_{n},Y_{n}))_{n \geq 1}$ follows analogously to the proof of Theorem 3.5, and the claim follows. □

3.3 Invariance principle for rough processes

We have set the ground to extend our results to processes that are not necessarily $(\frac{1}{2}-\varepsilon )$-Hölder-continuous, Markovian or semimartingales. More precisely, we are interested in $\alpha $-Hölder-continuous paths with $\alpha \in (0,1)$, such as Riemann–Liouville fractional Brownian motion or some TBSS processes. A key tool is the continuous mapping theorem, first proved by Mann and Wald [53], which establishes the preservation of weak convergence under continuous operators.

Theorem 3.7

Let $(\mathcal {X},\|\cdot \|_{\mathcal {X}})$ and $(\mathcal {Y},\|\cdot \|_{\mathcal {Y}})$ be two normed spaces and assume that $g:\mathcal {X}\to \mathcal {Y}$ is a continuous operator. If the sequence $(Z_{n})_{n\geq 1}$ of random variables converges in $(\mathcal {X},\|\cdot \|_{\mathcal {X}})$ weakly to $Z$, then $(g(Z_{n}))_{n\geq 1}$ converges in $(\mathcal {Y},\|\cdot \|_{\mathcal {Y}})$ weakly to $g(Z)$.

Many authors have exploited the combination of Theorems 3.1 and 3.7 to prove weak convergence; see e.g. Pollard [62, Chap. IV]. This path avoids the lengthy computations of tightness and finite-dimensional convergence in classical proofs. In fact, Hamadouche [35] already realised that Riemann–Liouville fractional operators are continuous, so that Theorem 3.7 holds under mapping by Hölder-continuous functions. In contrast, the novelty here is to consider the family of GFOs applied to Brownian motion together with the extension of Brownian motion to Itô diffusions.

The analogue of Theorem 3.5 for $\mathcal {G}^{ \alpha }Y$ reads as follows.

Theorem 3.8

For $(Y_{n})_{n \geq 1}$ in (2.7) and its weak limit $Y$ in $(\mathcal {C}^{\lambda}(\mathbb{I}),\|\cdot \|_{\lambda} )$ for $\lambda <\frac{1}{2}$, the representation

$$\begin{aligned} (\mathcal {G}^{\alpha }Y_{n} )(t) = & \sum _{i=1}^{\lfloor nt \rfloor }n \big(G (t-t_{i-1} )-G (t-t_{i} ) \big) (Y_{n}^{i}-Y_{n}^{i-1} ) \\ & + n G (t-t_{\lfloor nt \rfloor } ) \big(Y_{n}(t)-Y_{n}^{\lfloor nt \rfloor } \big) \end{aligned}$$

(3.3)

holds for $t\in \mathbb{I}$. Furthermore, this sequence $\left (\mathcal {G}^{\alpha }Y_{n}\right )_{n\geq 1}$ converges in $(\mathcal {C}^{\alpha +\lambda}(\mathbb{I}),\|{\cdot} \|_{\alpha +\lambda} )$ weakly to $\mathcal {G}^{\alpha }Y$ for any $\alpha \in (-\lambda , 1-\lambda )$.

Proof

Recall that the sequence (2.7) is piecewise differentiable in time. First, for $\alpha \in (0, 1-\lambda )$, note that $g(0)=0$ since $\alpha >0$ and therefore by integration by parts (where $Y_{n}$ is piecewise differentiable), for $n\geq 1$ and $t\in \mathbb{I}$,

$$\begin{aligned} & (\mathcal {G}^{\alpha }Y_{n})(t) \\ & = \int _{0}^{t} g'(t-s)\big(Y_{n}(s)-Y_{n}(0)\big)\mathrm {d}s \\ &= \int _{0}^{t} g(t-s)\frac{\mathrm {d}(Y_{n}(s) - Y_{n}(0))}{\mathrm {d}s}\mathrm {d}s \\ & = \frac{1}{\sigma \sqrt{n}}\bigg(\sum _{i=1}^{\lfloor nt \rfloor }n\int _{t_{i-1}}^{t_{i}} g(t-s) a (Y_{n}^{i-1} )\zeta _{i}\mathrm {d}s + n\int _{t_{\lfloor nt \rfloor }}^{t} g(t-s)a (Y_{n}^{ \lfloor nt \rfloor } ) \zeta _{\lfloor nt \rfloor +1}\mathrm {d}s\bigg) \\ & \phantom{=:} +\frac{1}{n}\bigg(n\sum _{i=1}^{\lfloor nt \rfloor }\int _{t_{i-1}}^{t_{i}} g(t-s) b (Y_{n}^{i-1} )\mathrm {d}s + n\int _{t_{\lfloor nt \rfloor }}^{t} g(t-s)b (Y_{n}^{\lfloor nt \rfloor } ) \mathrm {d}s \bigg) \\ &=\sum _{i=1}^{\lfloor nt \rfloor }n \big(G(t-t_{i-1}) - G(t-t_{i}) \big) (Y_{n}^{i} - Y_{n}^{i-1} ) \\ & \phantom{=:} + n \big(G(t-t_{\lfloor nt \rfloor }) - G(0) \big) \big(Y_{n}(t)-Y_{n}^{\lfloor nt \rfloor } \big), \end{aligned}$$

and (3.3) follows since $G(0)=0$ in the last line. When $\alpha \in (-\lambda ,0)$, using $G(0)=0$, we similarly get

$$\begin{aligned} & \int _{0}^{t} g(t-s) \big(Y_{n}(s) - Y_{n}(0)\big)\mathrm {d}s \\ & = \int _{0}^{t} G(t-s)\frac{\mathrm {d}(Y_{n}(s) - Y_{n}(0))}{\mathrm {d}s}\mathrm {d}s \\ & = \frac{1}{\sigma \sqrt{n}} \bigg(\sum _{i=1}^{\lfloor nt \rfloor }n\int _{t_{i-1}}^{t_{i}} G(t-s) a (Y_{n}^{i-1} )\zeta _{i}\mathrm {d}s + n\int _{t_{\lfloor nt \rfloor }}^{t} G(t-s)a (Y_{n}^{ \lfloor nt \rfloor } ) \zeta _{\lfloor nt \rfloor +1}\mathrm {d}s\bigg) \\ &\phantom{=:} +\frac{1}{n} \bigg(n\sum _{i=1}^{\lfloor nt \rfloor }\int _{t_{i-1}}^{t_{i}} G(t-s) b (Y_{n}^{i-1} )\mathrm {d}s + n\int _{t_{\lfloor nt \rfloor }}^{t} G(t-s)b (Y_{n}^{\lfloor nt \rfloor } ) \mathrm {d}s \bigg) \\ &= n\bigg( \sum _{i=1}^{\lfloor nt \rfloor } \Big(\frac{b (Y_{n}^{i-1} )}{n} + \frac{a (Y_{n}^{i-1} )}{\sigma \sqrt{n}}\zeta _{i}\Big) \int _{t_{i-1}}^{t_{i}}G(t-s) \mathrm {d}s \\ & \hphantom{=n\bigg\{ } + \Big(\frac{b (Y_{n}^{\lfloor nt \rfloor } )}{n} + \frac{a (Y_{n}^{\lfloor nt \rfloor } )}{\sigma \sqrt{n}}\zeta _{\lfloor nt \rfloor +1}\Big) \int _{t_{ \lfloor nt \rfloor }}^{t} G(t-s) \mathrm {d}s \bigg) \\ &= n\bigg( \sum _{i=1}^{\lfloor nt \rfloor } (Y_{n}^{i} - Y_{n}^{i-1} ) \int _{t_{i-1}}^{t_{i}}G(t-s) \mathrm {d}s + \big(Y_{n}(t)-Y_{n}^{\lfloor nt \rfloor } \big) \int _{t_{\lfloor nt \rfloor }}^{t} G(t-s) \mathrm {d}s \bigg), \end{aligned}$$

and from there it readily follows that

$$\begin{aligned} (\mathcal {G}^{\alpha }Y_{n})(t)&= \frac{\mathrm {d}}{\mathrm {d}t} \int _{0}^{t} g(t-s) \big(Y_{n}(s) - Y_{n}(0)\big)\mathrm {d}s \\ & =\sum _{i=1}^{\lfloor nt \rfloor }n \big(G(t-t_{i-1})-G(t-t_{i}) \big) (Y_{n}^{i} - Y_{n}^{i-1} ) \\ & \phantom{=:} + n G(t-t_{\lfloor nt \rfloor })\bigl(Y_{n}(t)-Y_{n}^{\lfloor nt \rfloor }\bigr) \end{aligned}$$

as desired; note that when $t=\frac{k}{n}$, the difference quotients pick up an extra term, but this vanishes in the limit. Finally, the claimed convergence follows analogously to that in Theorem 3.5 by continuous mapping, along with the fact that $\mathcal {G}^{\alpha}$ is a continuous operator from $(\mathcal {C}^{\lambda}(\mathbb{I}),\|\cdot \|_{\lambda} )$ to $(\mathcal {C}^{\lambda +\alpha}(\mathbb{I}),\|\cdot \|_{\lambda +\alpha} )$ for all $\lambda \in (0,1)$ and $\alpha \in (-\lambda , 1-\lambda )$. □

Notice here that the mean value theorem implies

$$ (\mathcal {G}^{\alpha }Y_{n} )(t) = \sum _{i=1}^{\lfloor nt \rfloor }g (t_{i}^{*} ) (Y_{n}^{i}-Y_{n}^{i-1} ) + g (t^{*}_{\lfloor nt \rfloor +1} )\bigl(Y_{n}(t)-Y_{n}^{\lfloor nt \rfloor }\bigr), $$

(3.4)

where $t^{*}_{i}\in [t-t_{i},t-t_{i-1}]$ and $t^{*}_{\lfloor nt \rfloor +1}\in [0,t-t_{\lfloor nt \rfloor }]$ and we use that $G(0)=0$. This expression is closer to the usual left-point forward Euler approximation. For numerical purposes, (3.4) is much more efficient since the integral $G$ required in (3.3) is not necessarily available in closed form. Nevertheless, not any arbitrary choice of $t^{*}_{i}$ gives the desired convergence from the above argument. We present a suitable candidate for optimal $t_{i}^{*}$ in Sect. 4.3.1, which guarantees weak convergence in the Hölder sense.

As could be expected, the Hurst parameter influences the speed of convergence of the scheme. We leave a formal proof to further study, but the following argument provides some intuition about the correct normalising factor. Given $g \in \mathcal {L}^{\alpha}$, we can write $g(u)=u^{\alpha}L(u)$, where $L$ is a bounded function on $\mathbb{I}$. At time $t=t_{i}$, take $t^{*}_{k}=t_{i}-t_{k}+\frac{\varepsilon}{n}$ for $\varepsilon \in [0,1]$. For $\alpha \in (-\lambda ,0)$, since $g \in \mathcal {L}^{\alpha}$, we can rewrite the approximation (3.4) as

$$ (\mathcal {G}^{\alpha }Y_{n} )(t_{i}) = \frac{1}{n^{1/2+\alpha}}\sum _{k=1}^{i} (i - k + \varepsilon )^{\alpha }L(t_{k}^{*}) (Y^{k}_{n}-Y_{n}^{k-1} ) \sqrt{n} \qquad \text{for }i=0,\ldots ,n. $$

Here, $(i-k+\varepsilon )^{\alpha}\leq \varepsilon ^{\alpha}$ is bounded in $n\geq 1$ as long as $\varepsilon \in (0,1]$; so the normalisation factor is of the order $n^{-\alpha -1/2}$. When $\alpha \in (0, 1-\lambda )$, the approximation (3.4) instead reads

$$ (\mathcal {G}^{\alpha }Y_{n} )(t_{i}) = \frac{1}{\sqrt{n}}\sum _{k=1}^{i} \bigg(t_{i}-t_{k}+\frac{\varepsilon}{n}\bigg)^{\alpha }L(t^{*}_{k}) (Y^{k}_{n}-Y_{n}^{k-1} )\sqrt{n} \qquad \text{for }i=0,\ldots ,n, $$

in which case $(t_{i}-t_{k}+\frac{\varepsilon}{n})^{\alpha}\leq t_{i}^{\alpha}$ is bounded in $n\geq 1$, and hence the normalisation factor is of the order $n^{-1/2}$. This intuition is consistent with the result by Neuenkirch and Shalaiko [57] who found the strong rate of convergence of the Euler scheme to be of the order $\mathcal{O}(n^{-H})$ for $H<\frac{1}{2}$ for fractional Ornstein–Uhlenbeck. So far, our results hold for $\alpha $-Hölder-continuous functions; however, for practical purposes, it is often necessary to constrain the volatility process $(V_{t})_{t\in \mathbb{I}}$ to remain strictly positive at all times. The stochastic integral $\mathcal {G}^{\alpha }Y$ need not be such in general. However, a simple transformation (e.g. exponential) can easily overcome this fact. The remaining question is whether the $\alpha $-Hölder-continuity is preserved after this composition.

Proposition 3.9

Let $(Y_{n})_{n\geq 1}$ be the approximating sequence (2.7) in $\mathcal {C}^{\lambda}(\mathbb{I})$ for fixed $\lambda <1/2$. Then $(\Phi (\mathcal {G}^{\alpha }Y_{n} ) )_{n \geq 1}$ converges in $(\mathcal {C}^{\alpha +\lambda}(\mathbb{I}),\|\cdot \|_{\alpha +\lambda} )$ weakly to $\Phi (\mathcal {G}^{\alpha }Y )$ for all $\alpha \in (-\lambda , 1-\lambda )$.

Proof

By Theorem 3.8, $(\mathcal {G}^{\alpha }Y_{n})_{n \geq 1}$ converges in $(\mathcal {C}^{\lambda +\alpha}(\mathbb{I}),\|\cdot \|_{\lambda +\alpha} )$ weakly to $\mathcal {G}^{\alpha }Y$. Furthermore, with our assumptions, $\Phi $ is continuous from $(\mathcal {C}^{\lambda +\alpha}(\mathbb{I}),\|\cdot \|_{\lambda +\alpha} )$ to $(\mathcal {C}^{\lambda +\alpha}(\mathbb{I}),\|\cdot \|_{\lambda +\alpha} )$. The proposition thus follows from the continuous mapping theorem. The diagram below summarises the steps with $\lambda <1/2$. The double arrows show weak convergence, and we indicate next to them the topology in which it takes place.

□

3.4 Extending the weak convergence to the Skorokhod space and proof of Theorem 2.11

The Skorokhod space $(\mathcal {D}(\mathbb{I}),d_{\mathcal {D}} )$ of càdlàg functions equipped with the Skorokhod topology has been widely used to prove weak convergence. It markedly simplifies when we only consider continuous functions (as is the case for our framework with Hölder-continuous processes). Billingsley [13, Sect. 12] proved that the identity $\left (\mathcal {D}(\mathbb{I})\cap \mathcal {C}(\mathbb{I}),d_{\mathcal {D}}\right )=\left (\mathcal {C}(\mathbb{I}), \|\cdot \|_{\infty}\right )$ always holds. This seemingly simple statement allows us to reduce proofs of weak convergence of continuous processes in the Skorokhod topology to that in the supremum norm, which is usually much simpler. We start with the following straightforward observation.

Lemma 3.10

For $\lambda \in (0,1)$, the identity map is continuous from $(\mathcal {C}^{\lambda}(\mathbb{I}),\|\cdot \|_{\lambda })$ to $(\mathcal {D}(\mathbb{I}),d_{\mathcal {D}} )$.

Proof

Since the identity map is linear, it suffices to check that it is bounded. For this observe that $\| f\|_{\lambda}=|f|_{\lambda}+\sup _{t\in \mathbb{I}} |f(t)|=|f|_{\lambda}+ \| f\|_{\infty}>\| f\|_{\infty}$, which concludes the proof since the Skorokhod norm in the space of continuous functions is equivalent to the supremum norm. □

Applying the continuous mapping theorem twice, first with the generalised fractional operator (Theorem 3.8) and then with the identity map, directly yields the following result.

Theorem 3.11

For any $\alpha \in (-\frac{1}{2},\frac{1}{2} )$, the sequence $(\Phi (\mathcal {G}^{\alpha }Y_{n}) )_{n \geq 1}$ converges in $(\mathcal {D}(\mathbb{I}),d_{\mathcal {D}} )$ weakly to $\Phi (\mathcal {G}^{\alpha }Y )$. Moreover, the sequence is tight in $(\mathcal {C}(\mathbb{I}),\|\cdot \|_{\infty} )$.

The final step in the proof of our main theorem is to extend the functional weak convergence to the log-stock price $X$. For this, we rely on the weak convergence theory for stochastic integrals due to Jakubowski et al. [42] and further developed by Kurtz and Protter [46]. Throughout, we write $H \bullet N:=\int H \mathrm {d}N $ and use the notation $H_{-}$ for the process $H_{-}(t):=H(t-)$ obtained by taking left limits. The next result is a restatement of [46, Theorem 2.2] in the special case $\delta =\infty $ (in their notation) and restricted to real-valued processes on $\mathbb{I}$.

Theorem 3.12

For each $n\geq 1$, let $N_{n}=M_{n}+A_{n}$ be an $(\mathcal{F}^{n}_{t})$-semimartingale and $H_{n}$ an $(\mathcal{F}^{n}_{t})$-adapted càdlàg process on $\mathbb{I}$. Suppose that for all $\gamma >0$, there are $(\mathcal{F}^{n}_{t})$-stopping times $(\tau _{n}^{\gamma})_{n\geq 1}$ with the property that $\sup _{n \geq 1}\mathbb{P}[\tau _{n}^{\gamma }\leq \gamma ] \leq 1/\gamma $ and $\sup _{n\geq 1}\mathbb{E}[ [M_{n}]_{\tau _{n}^{\gamma}\land 1}+T_{ \tau _{n}^{\gamma}\land 1}(A_{n})]<\infty $, where $T_{t}$ denotes the total variation on $[0,t]$. If $(H_{n},N_{n} )_{n \geq 1}$ converges in $(\mathcal {D}(\mathbb{I}, \mathbb{R}^{2}),d_{\mathcal {D}})$ weakly to $(H,N)$, then $N$ is a semimartingale in the filtration generated by $(H,N)$ and $(H_{n}, N_{n}, (H_{n})_{-}\bullet N_{n} )_{n \geq 1}$ converges in $(\mathcal {D}(\mathbb{I}, \mathbb{R}^{3}),d_{\mathcal {D}})$ weakly to $(H, N, H_{-} \bullet N)$.

With this, we can now give the proof of Theorem 2.11 which asserts the functional weak convergence of the approximations $(X_{n})_{n \geq 1}$ from (2.8) to the desired log-price $X$ from (2.3).

Proof

of Theorem 2.11 We begin by considering, for all $n\geq 1$, the particular approximations

$$ M_{n}(t):= \frac{1}{\sigma \sqrt{n}}\sum _{i=1}^{\lfloor nt \rfloor }\xi _{i}, $$

for $t\in \mathbb{I}$, of the driving Brownian motion $B$ in the dynamics of $X$. Here the $\xi _{i}$ satisfy Assumption 2.10, and so do the $\zeta _{i}$ in the construction of $Y_{n}$ from (2.7). While each pair $\xi _{i}$ and $\zeta _{i}$ are correlated, they form an i.i.d. sequence $((\zeta _{i},\xi _{i}))_{i\geq 1}$ across the pairs. In particular, it is straightforward to see that each $M_{n}$ is a martingale on $\mathbb{I}$ for the filtration $(\mathcal{F}^{n}_{t})$ defined by $\mathcal{F}^{n}_{t}:=\sigma (\zeta _{i},\xi _{i}:i=1,\ldots ,\lfloor nt \rfloor )$. Moreover, we have

$$ \mathbb{E}\bigl[[M_{n}]_{t}\big] = \lfloor nt \rfloor \frac{1}{\sigma ^{2}n } \mathbb{E}[\xi _{1}^{2}] =\frac{\lfloor nt \rfloor }{n} \leq 1 $$

for $t\in \mathbb{I}$ and for all $n\geq 1$. Consequently, we can simply take $\tau _{n}^{\gamma }\equiv +\infty $ for all $\gamma >0$ and $n\geq 1$ to satisfy the required control on the integrators $N_{n}:=M_{n}$ in Theorem 3.12. By Ethier and Kurtz [25, Theorem 7.1.4], the $M_{n}$ converge in $(\mathcal {D}(\mathbb{I}),d_{\mathcal {D}})$ weakly to a Brownian motion $B$. Now fix $\alpha \in (-\frac{1}{2}, \frac{1}{2})$ and define a sequence $(H_{n})_{n \geq 1}$ of càdlàg processes on $\mathbb{I}$ by setting $H_{n}(1):=\Phi (\mathcal {G}^{\alpha }Y_{n})(1)$ and $H_{n}(t):=\Phi (\mathcal {G}^{\alpha }Y_{n})(t_{k-1})$ for $t\in [t_{k-1},t_{k})$ and $k=1,\ldots ,n$. In view of Theorem 3.11, the Arzelà–Ascoli characterisation of tightness (see [13, Theorem 8.2]) for the space $(\mathcal {C}(\mathbb{I}),\|\cdot \|_{\infty} )$ allows us to conclude that the $H_{n}$ converge in $(\mathcal {D}(\mathbb{I}),d_{\mathcal {D}})$ weakly to $H:=\Phi (\mathcal {G}^{\alpha }Y)$. Furthermore, recalling the definition of $Y_{n}$ in (2.7), each $H_{n}$ is adapted to the filtration $(\mathcal{F}^{n}_{t})$ introduced above. By Corollary 3.6, we readily deduce that there is joint weak convergence in $(\mathcal {D}(\mathbb{I}),d_{\mathcal{D}})\times (\mathcal {D}(\mathbb{I}),d_{\mathcal{D}}) \times (\mathcal {D}(\mathbb{I}),d_{\mathcal{D}}) $ of $(Y_{n},H_{n},M_{n})$ to $(Y,H,B)$, where $Y$ satisfies (2.4) for a Brownian motion $W$ with $[ W, B ]_{t} = \rho t$ for all $t\in \mathbb{I}$. As noted in [46], the Skorokhod topology on $\mathcal {D}(\mathbb{I},\mathbb{R}^{2})$ is stronger than the product topology on $\mathcal {D}(\mathbb{I})\times \mathcal {D}(\mathbb{I})$, but here it automatically follows that we have weak convergence in $(\mathcal {D}(\mathbb{I}, \mathbb{R}^{2}),d_{\mathcal {D}} )$ of the pairs $(H_{n},M_{n})$ to $(H,B)$ by standard properties of the Skorokhod topology (see e.g. [25, Theorem 3.10.2]), since the limiting pair $(H,B)$ is continuous. Consequently, we are in a position to apply Theorem 3.12. To this end, observe that

$$ \big((H_{n})_{-} \bullet M_{n}\big)(t) = \sum _{k=1}^{\lfloor nt \rfloor } H_{n}(t_{k}-) \big(M_{n}(t_{k})-M_{n}(t_{k}-)\big) =\frac{1}{\sigma \sqrt{n}}\sum _{k=1}^{ \lfloor nt \rfloor } \Phi (\mathcal {G}^{\alpha }Y_{n})(t_{k-1})\xi _{k}, $$

which is precisely the second term on the right-hand side of (2.8). Therefore, Theorem 3.12 gives that the stochastic integral $H \bullet M = \sqrt{\Phi (\mathcal {G}^{\alpha }Y)}\bullet B$ is in $(\mathcal {D}(\mathbb{I}),d_{\mathcal {D}})$ the weak limit of the second term on the right-hand side of (2.8). For the first term on the right-hand side of (2.8), we have $-\frac{1}{2}\int _{0}^{\cdot }H_{n}(s) \mathrm {d}s$ converging weakly to $-\frac{1}{2}\int _{0}^{\cdot }H(s)\mathrm {d}s$ by the continuous mapping theorem, as the integral is a continuous operator from $(\mathcal {D}(\mathbb{I}),d_{\mathcal {D}})$ to itself. Since there is weak convergence in $(\mathcal {D}(\mathbb{I},\mathbb{R}^{2}),d_{\mathcal {D}})$ of the pairs $(H_{n},(H_{n})_{-} \bullet M_{n})$ to $(H, H \bullet B)$, the sum of the two terms on the right-hand side of (2.8) are then also weakly convergent in $(\mathcal {D}(\mathbb{I}),d_{\mathcal {D}})$. Recalling that the limit $Y$ satisfies (2.4) for a Brownian motion $W$ such that $W$ and $B$ are correlated with parameter $\rho $, we hence conclude that $X$ converges in $(\mathcal {D}(\mathbb{I}),d_{\mathcal {D}})$ weakly to the desired limit. □

4 Applications

4.1 Weak convergence of the hybrid scheme

The hybrid scheme (and its turbocharged version in McCrickerd and Pakkanen [54]) introduced by Bennedsen et al. [10] is the current state-of-the-art to simulate TBSS processes. However, only convergence in mean-square-error was proved, but not (functional) weak convergence which would justify the use of the scheme for path-dependent options. Unless otherwise stated, we denote by $\mathcal {T}_{n}:=\{t_{k} = \frac{k}{n}:k=0,1,\dots ,n\}$ the uniform grid on $\mathbb{I}$. We show that the Hölder convergence also holds for the case $g(x)=x^{\alpha}$.

Proposition 4.1

The hybrid scheme sequence $(\widetilde{\mathcal {G}}^{\alpha }W_{n})_{n \geq 1}$ is defined as

$$ \widetilde{\mathcal {G}}^{\alpha }W_{n}(t):=\widetilde{\mathcal {G}}^{\alpha }W_{n} \bigg(\frac{\lfloor nt \rfloor }{n}\bigg) +(nt-\lfloor nt \rfloor )\bigg(\widetilde{\mathcal {G}}^{\alpha }W_{n} \Big(\frac{\lfloor nt \rfloor +1}{n}\Big) -\widetilde{\mathcal {G}}^{\alpha }W_{n}\Big( \frac{\lfloor nt \rfloor }{n}\Big) \bigg) $$

for $t\in \mathbb{I}$, where for $i = 0,\ldots ,n$ and $\kappa \geq 1$,

$$ \widetilde{\mathcal {G}}^{\alpha }W_{n}(t_{i}) := \sum _{k=1}^{(i-\kappa ) \vee 0} \xi _{k} \int _{t_{k-1}}^{t_{k}}\sqrt{n} (t_{i}-s)^{\alpha} \mathrm {d}s + \int _{0\vee t_{i-\kappa}}^{t_{i}}(t_{i}-s)^{\alpha} \mathrm {d}W_{s} $$

with $\xi _{k} := \int _{t_{k-1}}^{t_{k}}\mathrm {d}W_{s}\sim \mathcal {N}(0,\frac{1}{n})$. It converges in $(\mathcal {C}^{\alpha +1/2},\|\cdot \|_{\alpha +1/2} )$ weakly to $\mathcal {G}^{\alpha }W$ for $\alpha \in (-\frac{1}{2},\frac{1}{2} )$ and $\kappa =1$.

Proof

Finite-dimensional convergence follows trivially as the target process is centered Gaussian; thus convergence of the covariance matrices ensures finite-dimensional convergence. To prove weak convergence, we only need to show that the approximating sequence is tight, by verifying the criteria from Theorem 3.4 as follows. In Theorem 3.4, we need to show that

$$ \mathbb{E}[ |\widetilde{\mathcal {G}}^{\alpha }W_{n}(t_{i})-\widetilde{\mathcal {G}}^{\alpha }W_{n}(t_{j}) |^{2p} ]\leq C |t_{i}-t_{j}|^{2p\alpha +p} $$

for all $t_{i}, t_{j} \in \mathcal {T}_{n}$, for $p\geq 1$ and some constant $C\geq 0$. Without loss of generality, assume $t_{j}< t_{i}$ and take $\kappa =1$. Define

$$ \widetilde{\sigma}^{2}:=\mathbb{E}[ |\widetilde{\mathcal {G}}^{\alpha }W_{n}(t_{i})- \widetilde{\mathcal {G}}^{\alpha }W_{n}(t_{j}) |^{2} ]. $$

The statement is trivial if $j=0$ or $i=1$, so we assume otherwise. By the definition of $\widetilde{\mathcal{G}}^{\alpha}W_{n}(t_{i})$ and using the notation ≲ to mean less than or equal with a constant factor, we can write, since $\kappa =1$,

$$\begin{aligned} & |\widetilde{\mathcal{G}}^{\alpha}W_{n}(t_{i}) - \widetilde{\mathcal{G}}^{\alpha}W_{n}(t_{j}) |^{2} \\ & = \bigg|\sqrt{n}\sum _{k=1}^{i-1} \xi _{k} \int _{t_{k-1}}^{t_{k}}(t_{i}-s)^{ \alpha} \mathrm {d}s - \sqrt{n}\sum _{k=1}^{j-1}\xi _{k} \int _{t_{k-1}}^{t_{k}}(t_{j}-s)^{ \alpha} \mathrm {d}s \\ & \hphantom{=:\bigg|} + \int _{t_{i-1}}^{t_{i}}(t_{i}-s)^{\alpha }\mathrm {d}W_{s} - \int _{t_{j-1}}^{t_{j}}(t_{j}-s)^{ \alpha }\mathrm {d}W_{s}\bigg|^{2} \\ & = \bigg|\sqrt{n}\sum _{k=1}^{j-1}\xi _{k} \int _{t_{k-1}}^{t_{k}} \big((t_{i}-s)^{\alpha} - (t_{j}-s)^{\alpha} \big)\mathrm {d}s - \sqrt{n} \sum _{k=j}^{i-1} \xi _{k}\int _{t_{k-1}}^{t_{k}}(t_{i}-s)^{\alpha} \mathrm {d}s \\ & \hphantom{=:\bigg|} + \int _{t_{i-1}}^{t_{i}}(t_{i}-s)^{\alpha }\mathrm {d}W_{s} - \int _{t_{j-1}}^{t_{j}}(t_{j}-s)^{ \alpha }\mathrm {d}W_{s}\bigg|^{2} \\ & \lesssim \bigg|\sqrt{n}\sum _{k=1}^{j-1} \xi _{k} \int _{t_{k-1}}^{t_{k}} \big((t_{i}-s)^{\alpha} - (t_{j}-s)^{\alpha} \big)\mathrm {d}s \bigg|^{2} + \bigg|\sqrt{n}\sum _{k=j}^{i-1} \xi _{k} \int _{t_{k-1}}^{t_{k}}(t_{i}-s)^{ \alpha} \mathrm {d}s \bigg|^{2} \\ & \hphantom{=:} + \bigg|\int _{t_{i-1}}^{t_{i}}(t_{i}-s)^{\alpha }\mathrm {d}W_{s} - \int _{t_{j-1}}^{t_{j}}(t_{j}-s)^{ \alpha }\mathrm {d}W_{s}\bigg|^{2}. \end{aligned}$$

For the first term, since the sequence $(\xi _{k})$ is i.i.d. $\sim \mathcal{N}(0,\frac{1}{n})$, we can write

$$\begin{aligned} &\mathbb{E}\bigg[\bigg|\sqrt{n}\sum _{k=1}^{j-1} \xi _{k} \int _{t_{k-1}}^{t_{k}} \big((t_{i}-s)^{\alpha} - (t_{j}-s)^{\alpha} \big)\mathrm {d}s \bigg|^{2} \bigg] \\ & = \sum _{k=1}^{j-1} \bigg|\int _{t_{k-1}}^{t_{k}} \big((t_{i}-s)^{ \alpha} - (t_{j}-s)^{\alpha} \big)\mathrm {d}s\bigg|^{2} \\ & \leq \sum _{k=1}^{j-1} \bigg|\int _{t_{k-1}}^{t_{k}}(t_{i}-t_{j})^{ \alpha} \mathrm {d}s\bigg|^{2} \leq \frac{j-1}{n^{2}}(t_{i}-t_{j})^{2\alpha} \lesssim (t_{i}-t_{j})^{2\alpha}. \end{aligned}$$

For the second term, we have by Jensen’s inequality that

$$\begin{aligned} \mathbb{E}\bigg[\bigg|\sqrt{n}\sum _{k=j}^{i-1}\xi _{k} \int _{t_{k-1}}^{t_{k}}(t_{i}-s)^{ \alpha} \mathrm {d}s \bigg|^{2}\bigg] & = \sum _{k=j}^{i-1}\bigg|\int _{t_{k-1}}^{t_{k}}(t_{i}-s)^{ \alpha} \mathrm {d}s\bigg|^{2} \\ & \leq \sum _{k=j}^{i-1}\int _{t_{k-1}}^{t_{k}}(t_{i}-s)^{2\alpha} \mathrm {d}s \\ & = \int _{t_{j-1}}^{t_{i-1}}(t_{i}-s)^{2\alpha} \mathrm {d}s \\ &= \frac{(t_{i} - t_{j-1})^{2\alpha +1} - (t_{i} - t_{i-1})^{2\alpha +1}}{2\alpha +1} \\ & = \frac{1}{2\alpha +1} \bigg(\Big(t_{i} - t_{j}+\frac{1}{n}\Big)^{2 \alpha +1} - \frac{1}{n^{2\alpha +1}}\bigg) \\ & \lesssim \frac{(t_{i} - t_{j})^{2\alpha +1}}{2\alpha +1}. \end{aligned}$$

Finally, for the last term,

$$\begin{aligned} & \mathbb{E}\bigg[\bigg|\int _{t_{i-1}}^{t_{i}}(t_{i}-s)^{\alpha }\mathrm {d}W_{s} - \int _{t_{j-1}}^{t_{j}}(t_{j}-s)^{\alpha }\mathrm {d}W_{s}\bigg|^{2}\bigg] \\ & = \mathbb{E}\bigg[\bigg|\int _{t_{j-1}}^{t_{j}} \big((t_{i}-s)^{\alpha }- (t_{j}-s)^{ \alpha }\big) \mathrm {d}W_{s} \\ & \hphantom{=:\mathbb{E}\bigg[} + \int _{t_{j}}^{t_{i}}(t_{i}-s)^{\alpha }\mathrm {d}W_{s} - \int _{t_{j-1}}^{t_{i-1}}(t_{i}-s)^{ \alpha }\mathrm {d}W_{s}\bigg|^{2}\bigg] \\ & \lesssim \mathbb{E}\bigg[\bigg|\int _{t_{j-1}}^{t_{j}} \big((t_{i}-s)^{ \alpha }- (t_{j}-s)^{\alpha }\big) \mathrm {d}W_{s}\bigg|^{2}\bigg] \\ & \hphantom{=:} + \mathbb{E}\bigg[\bigg|\int _{t_{j}}^{t_{i}}(t_{i}-s)^{\alpha }\mathrm {d}W_{s} \bigg|^{2}\bigg] + \mathbb{E}\bigg[\bigg|\int _{t_{j-1}}^{t_{i-1}}(t_{i}-s)^{ \alpha }\mathrm {d}W_{s}\bigg|^{2}\bigg] \\ & = C \bigg( \int _{t_{j-1}}^{t_{j}} \big((t_{i}-s)^{\alpha }- (t_{j}-s)^{ \alpha }\big)^{2} \mathrm {d}s + \int _{t_{j}}^{t_{i}}(t_{i}-s)^{2\alpha} \mathrm {d}s + \int _{t_{j-1}}^{t_{i-1}}(t_{i}-s)^{2\alpha} \mathrm {d}s\bigg) \\ & \leq C \bigg(\int _{t_{j-1}}^{t_{j}} \big((t_{i}-s)^{\alpha }- (t_{j}-s)^{ \alpha }\big)^{2} \mathrm {d}s + \frac{(t_{i}-t_{j})^{2\alpha +1}}{2\alpha +1} + \frac{(t_{i-1}-t_{j-1})^{2\alpha +1}}{2\alpha +1}\bigg) \\ & \leq C \bigg( \int _{t_{j-1}}^{t_{j}}(t_{i}-t_{j})^{2\alpha} \mathrm {d}s + 2 \frac{(t_{i}-t_{j})^{2\alpha +1}}{2\alpha +1}\bigg) \\ & \lesssim (t_{i}-t_{j})^{2\alpha +1}. \end{aligned}$$

Thus by standard moment properties of Gaussian random variables (see Boucheron et al. [14, Theorem 2.1]), we obtain

$$ \mathbb{E} [ |\widetilde{\mathcal{G}}^{\alpha} W_{n}(t_{i}) - \widetilde{\mathcal{G}}^{\alpha} W_{n}(t_{j}) |^{2p} ] \lesssim \widetilde{\sigma}^{2p} \lesssim (t_{i}-t_{j})^{2p\alpha + p}, $$

which gives the desired result. □

We further note that Proposition 4.1 and Theorem 2.11 ensure the weak convergence of the log-stock price for the hybrid scheme as well.

Remark 4.2

Proposition 4.1 may easily be extended to a $d$-dimensional Brownian motion $W$ (for example for multifactor volatility models), also providing a weak convergence result for the $d$-dimensional version of the hybrid scheme recently developed by Heinrich et al. [37].

4.2 Application to fractional binomial trees

We consider a binomial setting for the Riemann–Liouville fractional Brownian motion $\mathcal {G}^{H-1/2} W$ with $g(u):= u^{H-1/2}$, $H\in (0,1)$, for which Theorem 3.8 provides a weakly converging sequence. On the partition $\mathcal {T}_{n}$ and with a Bernoulli sequence $(\zeta _{i})_{i=1, \dots ,n}$ satisfying $\mathbb{P}[\zeta _{i}=1]=\mathbb{P}[\zeta _{i}=-1]=\frac{1}{2}$ for all $i$ (satisfying the assumption of Theorem 2.11), the approximating sequence reads

$$ (\mathcal {G}^{H-1/2} W_{n})(t_{i}) := \frac{1}{\sqrt{n}}\sum _{k=1}^{i} \zeta _{k} (t_{i} - t_{k-1} )^{H-1/2} \qquad \text{for } i=0,\ldots ,n. $$

Figure 1 shows a fractional binomial tree structure for $H=0.75$ and $H=0.1$. Despite being symmetric, such trees cannot be recombining due to the (non-Markovian) path-dependent nature of the process. It might be possible in principle to modify the tree at each step to make it recombining, following the procedure developed in Akyıldırım et al. [1] for Markovian stochastic volatility models. This is not so straightforward, though, and requires a thorough analysis which we leave for future research.

4.3 Monte Carlo

Theorem 2.11 introduces the theoretical foundations for Monte Carlo methods (in particular for path-dependent options) for rough volatility models. In this section, we give a general and easy-to-understand recipe to implement the class of rough volatility models (2.3). For the numerical recipe, we consider the general time partition $\mathcal {T}:=\{t_{i}=\frac{iT}{n} : i=0,1,\ldots ,n\}$ on $[0,T]$ with $T>0$.

Algorithm 4.3

1)
Simulate for $j =1,\dots ,M $ and $i = 1, \dots ,n$ random variables $\xi _{j,i}$, i.i.d., and $\zeta _{j,i}$, i.i.d., all $\sim \mathcal {N}(0,1)$ and with $\mathrm{corr}(\xi _{j,i},\zeta _{j,i})~=~\rho $.
2)
Simulate M paths of $Y_{n}$ via
$$ Y_{n}^{j}(t_{i}) = \frac{T}{n}\sum _{k=1}^{i}b\big(Y_{n}^{j}(t_{k-1})\big) + \frac{T}{\sqrt{n}}\sum _{k=1}^{i}a\big(Y_{n}^{j}(t_{k-1})\big)\zeta _{j,k} $$
for $i=1,\ldots ,n$ and $j=1,\ldots ,M$. Here $Y_{n}^{j}(t_{i})$ denotes the $j$th path $Y_{n}$ evaluated at the time point $t_{i}$, which is different from the notation $Y_{n}^{j}$ in the theoretical framework above, but should not create any confusion. We also compute, for $i=1,\ldots ,n$ and $j=1,\ldots ,M$,
$$ \Delta Y^{j}_{n}(t_{i}) := Y^{j}_{n}(t_{i}) - Y^{j}_{n}(t_{i-1}). $$
3)
Simulate $M$ paths of the fractional driving process $((\mathcal {G}^{\alpha }Y_{n})(t))_{t \in \mathcal {T}}$ using
$$ (\mathcal {G}^{\alpha }Y_{n})^{j}\left (t_{i}\right ) := \sum _{k=1}^{i}g(t_{i-k+1})\Delta Y^{j}_{n}(t_{k}) = \sum _{k=1}^{i}g(t_{k})\Delta Y^{j}_{n}(t_{i-k+1}) $$
for $i=1,\ldots ,n$ and $j=1,\ldots ,M$. The complexity of this step is in general of the order $\mathcal {O}(n^{2})$ (see Appendix Bfor details). However, it is easily implemented by using discrete convolutions with complexity $\mathcal {O}(n\log n)$ (see Algorithm B.4in Appendix Bfor details in the implementation). With $\mathfrak {g} := (g(t_{i}))_{i=1,\ldots ,n}$ and $\Delta Y^{j}_{n} := (\Delta Y_{n}^{j}(t_{i}))_{i=1,\ldots ,n}$ for $j=1,\ldots ,M$, we can write $(\mathcal {G}^{\alpha }Y_{n})^{j}(\mathcal {T})=\sqrt{\frac{T}{n}}(\mathfrak {g}\ast \Delta Y^{j}_{n})$ for $j=1,\ldots ,M$, where ∗ represents the discrete convolution operator.
4)
Use the forward Euler scheme to simulate, for $i=1,\ldots ,n$ and $j=1,\ldots ,M$, the log-stock process as
$$ X^{j}(t_{i}) = X^{j}(t_{i-1}) - \frac{1}{2}\frac{T}{n}\Phi (\mathcal {G}^{\alpha }Y_{n} )^{j}(t_{i-1}) + \sqrt{\frac{T}{n}}\sqrt{\Phi (\mathcal {G}^{\alpha }Y_{n} )^{j}(t_{i-1})}\,\xi _{j,i}. $$

Remark 4.4

1) When $Y=W$, we may skip Step 2 in Algorithm 4.3 and replace $\Delta Y^{j}_{n}(t_{i})$ by $\sqrt{\frac{T}{n}}\zeta _{i,j}$ in Step 3.

2) Step 3 may be replaced by the hybrid scheme from [10] only when $Y=W$.

Antithetic variates in Algorithm 4.3 are easy to implement as it suffices to consider for $j=1,\ldots ,M$ the uncorrelated random vectors $\zeta _{j} := (\zeta _{j,1},\zeta _{j,2},\ldots ,\zeta _{j,n})$ and $\xi _{j} := (\xi _{j,1},\xi _{j,2},\ldots ,\xi _{j,n})$. Then $(\rho \xi _{j} + \overline {\rho }\zeta _{j},\xi _{j})$, $(\rho \xi _{j} - \overline {\rho }\zeta _{j},\xi _{j})$, $(-\rho \xi _{j} - \overline {\rho }\zeta _{j},-\xi _{j})$ and $(-\rho \xi _{j} + \overline {\rho }\zeta _{j},-\xi _{j})$ for $j=1,\ldots ,M$ constitute the antithetic variates, which significantly improves the performance of Algorithm 4.3 by reducing memory requirements, reducing variance and accelerating execution by exploiting symmetry of the antithetic random variables.

4.3.1 Enhancing the performance

A standard practice in Monte Carlo simulation is to match moments of the approximating sequence with the target process. In particular, when the process is Gaussian, matching first and second moments suffices. We only illustrate this approximation for Brownian motion: the left-point approximation (3.4) (with $Y=W$) may be modified to match moments as

$$ (\mathcal {G}^{\alpha }W)(t_{i}) \approx \frac{1}{\sigma \sqrt{n}}\sum _{k=1}^{i}g (t^{*}_{k} )\zeta _{k} \qquad \text{for }i=0,\ldots ,n, $$

(4.1)

where $t^{*}_{k}$ is chosen optimally. The first two moments of $\mathcal {G}^{ \alpha }W$ read

$$ \mathbb{E}[ (\mathcal {G}^{\alpha }W )(t) ] = 0, \qquad \operatorname {Var}[ (\mathcal {G}^{\alpha }W )(t) ] = \int _{0}^{t}g(t-s)^{2}\mathrm {d}s. $$

The first moment of the approximating sequence (4.1) is always zero, and the second moment reads

$$ \operatorname {Var}\bigg[\frac{1}{\sigma \sqrt{n}}\sum _{k=1}^{j-1}g (t^{*}_{k} ) \zeta _{k}\bigg] =\frac{1}{n}\sum _{k=1}^{j-1}g (t^{*}_{k} )^{2}. $$

Equating theoretical and approximating variances gives $\frac{1}{n}g(t^{*}_{k})^{2}=\int _{t_{k-1}}^{t_{k}} g(t-s)^{2}\mathrm {d}s$ for $k=1,\ldots ,n$, so that the optimal evaluation point can be computed via

$$ g(t_{k}^{*})=\sqrt{n\int _{t_{k-1}}^{t_{k}} g(t-s)^{2}\mathrm {d}s} \qquad \text{for $k=1,\ldots ,n$.} $$

(4.2)

With the optimal evaluation point, the scheme is still a convolution so that Algorithm B.4 in Appendix B can still be used for faster computations. In the Riemann–Liouville fractional Brownian motion case, $g(u) = u^{H-1/2}$ and the optimal point can be computed in closed form as

$$ t_{k}^{*} = \bigg(\frac{n}{2H} \big( (t-t_{k-1} )^{2H}- (t-t_{k} )^{2H} \big)\bigg)^{1/(2H-1)} \qquad \text{for }k=1,\ldots ,n. $$

Proposition 4.5

The moment matching sequence $(\widehat{\mathcal {G}}^{\alpha }W_{n})_{n \geq 1}$ is defined as

$$\begin{aligned} \widehat{\mathcal {G}}^{\alpha }W_{n}(t)&:=\widehat{\mathcal {G}}^{\alpha }W_{n}\bigg( \frac{\lfloor nt \rfloor }{n}\bigg) \\ & \hphantom{::=} +(nt-\lfloor nt \rfloor )\bigg(\widehat{\mathcal {G}}^{\alpha }W_{n}\Big(\frac{\lfloor nt \rfloor +1}{n} \Big) -\widehat{\mathcal {G}}^{\alpha }W_{n}\Big(\frac{\lfloor nt \rfloor }{n}\Big) \bigg) \end{aligned}$$

(4.3)

for $t\in \mathbb{I}$, where

$$ \widehat{\mathcal {G}}^{\alpha }W_{n}(t_{i}) := \sum _{k=1}^{i} \xi _{k} \sqrt{n \int _{t_{k-1}}^{t_{k}} g(t_{i}-s)^{2}\mathrm {d}s} $$

(4.4)

with $(\xi _{k})_{k \geq 1}$ an i.i.d. family of centered sub-Gaussian random variables satisfying $\mathbb{E}[\xi _{k}^{2}]=\frac{1}{n}$ (namely, $\mathbb{P}[|\xi _{k}|>x]\leq C \mathrm {e}^{-vx^{2}}$ for all $x>0$ and some $C,v>0$). Then weak convergence to $\mathcal {G}^{\alpha }W$ holds in $(\mathcal {C}^{\alpha +1/2},\|\cdot \|_{\alpha +1/2} )$ for $\alpha \in (-\frac{1}{2},\frac{1}{2} )$.

Proof

Finite-dimensional convergence follows from the central limit theorem as the target process is centered Gaussian; thus convergence of the covariance matrices is enough. It then suffices to prove that the approximating sequence is tight in the desired space, which in view of Theorem 3.4 can be deduced by establishing the control

$$ \mathbb{E}[ |\widehat{\mathcal {G}}^{\alpha }W_{n}(t_{i})-\widetilde{\mathcal {G}}^{\alpha }W_{n}(t_{j}) |^{2p} ]\leq C |t_{i}-t_{j}|^{2p\alpha +p} $$

for all $t_{i},t_{j}\in \mathcal{T}_{n}$, for $p\geq 1$ and some constant $C\geq 0$. We have

$$\begin{aligned} \widetilde{\sigma}^{2} & := \mathbb{E}\big[\big(\widehat{\mathcal {G}}^{\alpha }W_{n}(t_{i})- \widehat{\mathcal {G}}^{\alpha }W_{n}(t_{j})\big)^{2}\big] \\ & \hphantom{:} = \mathbb{E}\big[\big(\widehat{\mathcal {G}}^{\alpha }W_{n}(t_{i})\big)^{2}\big]+ \mathbb{E}\big[\big(\widehat{\mathcal {G}}^{\alpha }W_{n}(t_{j})\big)^{2}\big]-2\mathbb{E}[ \widehat{\mathcal {G}}^{\alpha }W_{n}(t_{i})\widehat{\mathcal {G}}^{\alpha }W_{n}(t_{j}) ]. \end{aligned}$$

We note that

$$ \mathbb{E}\big[\big(\widehat{\mathcal {G}}^{\alpha }W_{n}(t_{i})\big)^{2}\big] = \sum _{k=1}^{i}\bigg|\sqrt{\int _{t_{k-1}}^{t_{k}} g(t_{i}-s)^{2}\mathrm {d}s} \bigg|^{2} = \int _{0}^{t_{i}} \! g(t_{i}-s)^{2}\mathrm {d}s=\mathbb{E}\big[\big( \mathcal {G}^{\alpha }W(t_{i})\big)^{2}\big], $$

and by Cauchy–Schwarz, we also have

$$\begin{aligned} \mathbb{E}[\widehat{\mathcal {G}}^{\alpha }W_{n}(t_{i})\widehat{\mathcal {G}}^{\alpha }W_{n}(t_{j}) ] & = \sum _{k=1}^{t_{i}\wedge t_{j}}\sqrt{\int _{t_{k-1}}^{t_{k}} g(t_{i}-s)^{2} \mathrm {d}s \int _{t_{k-1}}^{t_{k}} g(t_{j}-s)^{2} \mathrm {d}s } \\ & \geq \int _{0}^{t_{i}\wedge t_{j}} g(t_{i}-s)g(t_{j}-s)\mathrm {d}s \\ &=\mathbb{E}[\mathcal {G}^{\alpha }W(t_{i})\mathcal {G}^{\alpha }W(t_{j}) ]. \end{aligned}$$

We then obtain $\widetilde{\sigma}^{2}\leq \mathbb{E}[ |\mathcal {G}^{\alpha }W(t_{i})-\mathcal {G}^{ \alpha }W(t_{j}) |^{2} ]$. Finally, $\widetilde{\mathcal {G}}^{\alpha }W_{n}(t_{i})-\widetilde{\mathcal {G}}^{\alpha }W(t_{j})$ is sub-Gaussian as a linear combination of sub-Gaussian random variables, and the Gaussian moment inequality (see Boucheron et al. [14, Theorem 2.1]) with the variance estimate $\widetilde{\sigma}^{2}$ yields

$$ \mathbb{E}[ |\widehat{\mathcal {G}}^{\alpha }W_{n}(t_{i})-\widehat{\mathcal {G}}^{\alpha }W(t_{j}) |^{2p} ]\leq \mathbb{E}[ |\mathcal {G}^{\alpha }W(t_{i})-\mathcal {G}^{\alpha }W(t_{j}) |^{2p} ]\leq \widetilde{C} |t_{i}-t_{j}|^{2p\alpha +p}. $$

□

4.3.2 Reducing variance

As Bayer et al. [6] and Bennedsen et al. [10] pointed out, a major drawback in simulating rough volatility models is the very high variance of the estimators, so that a large number of simulations are needed to produce a decent price estimate. Nevertheless, the rDonsker scheme admits a very simple conditional expectation technique which reduces both memory requirements and variance while also admitting antithetic variates. This approach is best suited for calibrating European type options. We consider $\mathcal {F}^{B}_{t}=\sigma (B_{s}:s\leq t)$ and $\mathcal {F}^{W}_{t}=\sigma (W_{s}:s\leq t)$, the natural filtrations generated by the Brownian motions $B$ and $W$. In particular, the conditional variance process $(V_{t}|\mathcal {F}^{W}_{t})$ is deterministic. As discussed by Romano and Touzi [64] and recently adapted to the rBergomi case by McCrickerd and Pakkanen [54], we can decompose the stock price process as

$$ \mathrm {e}^{X_{t}} = \mathcal {E}\bigg(\rho \int _{0}^{t}\sqrt{\Phi (\mathcal {G}^{\alpha }Y )(s)} \, \mathrm {d}W_{s}\bigg) \mathcal {E}\bigg(\overline {\rho }\int _{0}^{t}\sqrt{\Phi (\mathcal {G}^{ \alpha }Y )(s)} \, \mathrm {d}W^{\bot}_{s}\bigg) =: \mathrm {e}^{X^{||}_{t}}\mathrm {e}^{X^{ \bot}_{t}} $$

and notice that

$$ X_{t}\vert (\mathcal {F}^{W}_{t} \vee \mathcal {F}^{B}_{0}) \sim \mathcal {N}\bigg(X^{||}_{t} - \overline {\rho }^{2}\int _{0}^{t}\Phi (\mathcal {G}^{\alpha }Y )(s)\mathrm {d}s, \overline {\rho }^{2} \int _{0}^{t}\Phi (\mathcal {G}^{\alpha }Y )(s)\mathrm {d}s\bigg). $$

Thus $\exp (X_{t})$ becomes lognormal and the Black–Scholes closed-form formulae are valid here (European, barrier options, maximum, etc.). The advantage of this approach is that the orthogonal Brownian motion $W^{ \bot}$ is completely unnecessary for the simulation; hence the generation of random numbers is reduced to a half, yielding proportional memory saving. Not only do we get this, but this simple trick also reduces the variance of the Monte Carlo estimate; hence fewer simulations are needed to obtain the same precision. The following algorithm implements this idea, assuming that $Y = W$.

Algorithm 4.6

On the equidistant grid $\mathcal {T}$:

1)
Simulate i.i.d. random variables $\zeta _{j,i} \sim \mathcal {N}(0,1)$ and create antithetic variates $-\zeta _{j,i}$ for $j=1,\ldots ,M/2$, $i=1,\ldots ,n$.
2)
Simulate $M$ paths of the fractional driving process $\mathcal {G}^{\alpha }W$ using discrete convolution (see Algorithm B.4in Appendix Bfor details) to get
$$ (\mathcal {G}^{\alpha }W)^{j}(\mathcal {T})=\sqrt{\frac{T}{n}}(\mathfrak {g}\ast \zeta _{j}),\qquad j=1,\ldots ,M, $$
and store in memory $\overline {\rho }^{2}\int _{0}^{T}(\mathcal {G}^{\alpha }W)^{j}(s)\mathrm {d}s \approx \overline {\rho }^{2}\frac{T}{n}\sum _{k=0}^{n-1} (\mathcal {G}^{\alpha }W)^{j}(t_{k}) =: \Sigma ^{j}$ for each $j=1,\ldots ,M$.
3)
Use the forward Euler scheme to simulate for $i=1,\ldots ,n$, $j=1,\ldots ,M$ the log-stock process as
$$ X^{j}(t_{i}) = X^{j}(t_{i-1}) - \frac{\rho ^{2}}{2}\frac{T}{n}\Phi (\mathcal {G}^{\alpha }W )^{j}(t_{i-1}) + \rho \sqrt{\frac{T}{n}}\sqrt{\Phi (\mathcal {G}^{\alpha }W )^{j}(t_{i-1})}\,\zeta _{j,i}. $$
4)
Finally, we may compute any option price using the Black–Scholes formula. For instance, the price of a call option with strike $K$ and maturity $T\in \mathbb{I}$ would be given by $C^{j}(K)=\exp (X^{j}(T))\mathcal {N}(d^{j}_{1})-K\mathcal {N}(d^{j}_{2})$ for $j=1,\ldots ,M$, with $d^{j}_{1,2}$ given by $d^{j}_{1}:=\frac{1}{\sqrt{\Sigma ^{j}}}(X^{j}(T)-\log K +\frac{1}{2}\Sigma ^{j})$ and $d^{j}_{2}=d^{j}_{1}-\sqrt{\Sigma ^{j}}$. Thus the output of the model would be $C(K)=\frac{1}{M}\sum _{k=1}^{M}C^{j}(K)$.

The algorithm is easily adapted to general diffusions $Y$ as drivers of the volatility (Algorithm 4.3, Step 2). Algorithm 4.3 is obviously faster than Algorithm 4.6, especially when using control variates. Nevertheless, with the same number of paths, Algorithm 4.6 remarkably reduces the Monte Carlo variance, meaning in turn that fewer simulations are needed, making it very competitive for calibration.

4.4 Numerical example: rough Bergomi model

Figures 2–5 perform a numerical analysis of the Monte Carlo convergence as a function of $n$. We observe that the lower $H$, the larger $n$ needs to be to achieve convergence. However, we also observe that for the Cholesky, rDonsker (naive and moment match) and hybrid schemes and $H\geq 0.1$, with $n=252$, we already achieve a precision of the order $10^{-4}$, which is equivalent to a basis point in financial terms. For $H<0.1$, we might require $n$ larger than 252 if precision is required beyond $10^{-4}$. We also observe in Fig. 5 that the naive rDonsker approximation converges extremely slowly for small $H$. Additionally, Figs. 6–11 measure the price estimations compared to the Cholesky method which is taken as benchmark. The hybrid scheme tends to be closer to this benchmark especially for $H< 0.1$. When $H\geq 0.1$, for both the hybrid scheme and rDonsker moment match, we observe an error less than $10^{-4}$ for $n\geq 252$. It is noteworthy to mention that the naive rDonsker scheme has substantially worse convergence (at least an order of magnitude) than the other methods. We note that the black lines in all figures represent the $99\%$ Monte Carlo standard deviations; hence errors below that threshold should be interpreted as noise.

4.5 Speed benchmark against Markovian stochastic volatility models

In this section, we benchmark the speed of the rDonsker scheme against the hybrid scheme and a classical Markovian stochastic volatility model using $10^{5}$ simulations and averaging the speeds over 10 trials. For the former, we simulate the rBergomi model from Bayer et al. [6], whereas for the latter, we use the classical Bergomi [12] model using a forward Euler scheme in both volatility and stock price. All three schemes are implemented in Cython to make the comparison fair and to obtain speeds comparable to C++. Figure 12 shows that rDonsker is about 2 times slower than the Markovian case, whereas the hybrid scheme is approximately 2.5 times slower, which is expected from the complexities of the two schemes. However, it is remarkable that the $\mathcal {O}(n\log n)$ complexity of the FFT stays almost constant with the grid size $n$, and the computational time grows almost linearly as in the Markovian case. We presume that this is the case since $n\ll $ 10’000 is relatively small. Figure 12 also shows that rough volatility models can be implemented very efficiently and are not much slower than classical stochastic volatility models.

4.6 Implementation guidelines and conclusion

The numerical analysis above suggests some guidelines to implement rough volatility models driven by TBSS processes of the form $\mathcal {G}^{H-1/2} Y$ for some Itô diffusion $Y$.

Regarding empirical estimates, Gatheral et al. [32] suggest that $H\approx 0.15$. Bennedsen et al. [11] give an exhaustive analysis of more than 2000 equities for which $H\in [0.05,0.2]$. On the pricing side, Bayer et al. [6] and Jacquier et al. [40] found that calibration yields $H\in [0.05,0.10]$. Finally, Livieri et al. [50] found evidence in options data that $H \approx 0.3$. Despite the diverse ranges found so far, there is a common agreement that $H<\frac{1}{2}$. Table 1 provides recommendations about when to use which numerical scheme depending on the value of the Hurst parameter.

Table 1 Recommendations for the use of different numerical schemes depending on the Hurst parameter

Full size table

Remark 4.7

The rough Heston model by Guennoun et al. [33] is out of the scope of the hybrid scheme. Moreover, any process of the form $\mathcal {G}^{\alpha }Y$ for some Itô diffusion $Y$ under Assumption 2.3 is in general out of the scope of the hybrid scheme. This only leaves the choice of using the rDonsker scheme, for which reasonable accuracy is obtained at least for Hölder regularities greater than 0.05.

4.7 Bushy trees and binomial markets

Binomial trees have attracted a lot of attention from both academics and practitioners, as their apparent simplicity provides easy intuition about the dynamics of a given asset. Not only this, but they are by construction arbitrage-free and allow finding prices of path-dependent options, together with their hedging strategy. In particular, early exercise options, in particular Bermudan or American options, are usually priced using trees as opposed to Monte Carlo methods. The convergence stated in Theorem 2.11 lays the theoretical foundations to construct fractional binomial trees (note that Bernoulli random variables satisfy the conditions of the theorem). Figure 1 already showed binomial trees for fractional Brownian motion, but we ultimately need trees describing the dynamics of the stock price.

4.7.1 A binary market

We invoke Theorem 2.11 with the independent sequences $(\zeta _{i})_{i=1,\dots ,n}$, $(\zeta _{i}^{\bot})_{i=1,\dots ,n}$ such that $\mathbb{P}[\zeta _{i}=1]=\mathbb{P}[\zeta ^{\bot}_{i}=1]=\mathbb{P}[\zeta _{i}=1]=\mathbb{P}[ \zeta ^{\bot}_{i}=-1]=\frac{1}{2}$ for all $i$. We further define on $\mathcal {T}$ for any $i=1, \ldots , n$ the quantities

$$\begin{aligned} B_{n}(t_{i}) & = \sqrt{\frac{T}{n}}\sum _{k=1}^{i} (\rho \zeta _{k}+ \overline {\rho }\zeta ^{\bot}_{k} ), \\ Y_{n}(t_{i}) & = \frac{T}{n}\sum _{k=1}^{i}b\big(Y_{n}(t_{k-1})\big) + \sqrt{\frac{T}{n}}\sum _{k=1}^{i}\sigma \big(Y_{n}(t_{k-1})\big) \zeta _{k}, \end{aligned}$$

the approximating sequences to $B$ and $Y$ in (2.3). The approximation for $X$ is then given by

$$ X_{n}(t_{i}) = X_{n}(t_{i-1}) - \frac{1}{2}\frac{T}{n}\sum _{k=1}^{i} \Phi (\mathcal {G}^{\alpha }Y_{n} )(t_{k}) + \sqrt{\frac{T}{n}}\sum _{k=1}^{i} \sqrt{\Phi (\mathcal {G}^{\alpha }Y_{n} )(t_{k})} \, (\rho \zeta _{k}+\overline {\rho }\zeta ^{\bot}_{k} ). $$

In order to construct the tree, we have to consider all possible permutations of the random vectors $(\zeta _{i})_{i = 1,\dots ,n}$ and $(\zeta ^{ \bot}_{i})_{i = 1,\dots ,n}$. Since each random variable only takes two values, this adds up to $4^{n}$ possible combinations; hence the ‘bushy tree’ terminology. When $\rho \in \{-1,1\}$, the magnitude is reduced to $2^{n}$.

4.8 American options in rough volatility models

There is so far no available scheme for American options (or any early-exercise options for that matter) under rough volatility models, but the fractional trees constructed above provide a framework to do so. In the Black–Scholes model, American options can be priced using binomial trees by backward induction. A key ingredient is the Snell envelope and the following representation.

Definition 4.8

Let $(X(t))_{t\in \mathbb{I}}$ be an $(\mathcal{F}_{t})_{t\in \mathbb{I}}$-adapted process. The Snell envelope $\mathcal{J}$ of $X$ with respect to $(\mathcal{F}_{t})_{t\in \mathbb{I}}$ is defined as $\mathcal{J}(X)(t) := \text{esssup}_{\tau \in \tilde{\mathbb{I}}_{t}} \mathbb{E}[X(\tau )| \mathcal{F}_{t}]$ for all $t\in \mathbb{I}$, where $\tilde{\mathbb{I}}_{t}$ denotes the set of stopping times with values in $[t,1] \subseteq \mathbb{I}$.

Definition 4.9

For a log-strike $k$ and price process $S$, write $C(t):=\mathrm {e}^{-rt}(S(t)- \mathrm {e}^{k})^{+}$ and $P(t):=\mathrm {e}^{-rt}(\mathrm {e}^{k}-S(t))^{+}$. Then the discounted time-$t$ prices of an American call and put with log-strike $k$ and maturity $T$ are given by, respectively,

$$ \mathrm {e}^{-rt}C_{t}^{a}(k,T)=\mathcal{J}(C)(t), \qquad \mathrm {e}^{-rt}P_{t}^{a}(k,T)= \mathcal{J}(P)(t). $$

Preservation of weak convergence under the Snell envelope map is due to Mulinacci and Pratelli [55], who proved that convergence takes place in the Skorokhod topology provided that the Snell envelope is continuous. In our setting, the scheme for American options is justified by the following result. Recall that a filtration is said to be immersed in a larger filtration if all local martingales in the smaller filtration are also local martingales in the larger filtration.

Theorem 4.10

Let $V$ and $X$ in (2.3) be such that $\mathbb{E}[\sup _{t\in \mathbb{I}}\mathrm {e}^{X_{t}}] <\infty $ and the natural filtration of $X$ is immersed in the natural filtration of $(B,W)$ from (2.3) and (2.4). With the sequence $(X_{n})_{n\geq 1}$ converging in $(\mathcal{D}(\mathbb{I}),d_{\mathcal{D}})$ weakly to $X$, as per Theorem 2.11, assume additionally that $(X_{n})_{n\geq 1}$ satisfies the conditions of Mulinacci and Pratelli [55, Theorem 3.5]. Let $S=\mathrm {e}^{X}$ and $S_{n}=\mathrm {e}^{X_{n}}$. If for each $n\geq 1$, $C_{n}$ and $P_{n}$ are defined as in Definition 4.9for the price process $S_{n}$, then $(S_{n},\mathcal{J}(C_{n}),\mathcal{J}(P_{n}))$ converges in $(\mathcal{D}(\mathbb{I};\mathbb{R}^{3}),d_{\mathcal{D}})$ weakly to $(S,\mathcal{J}(C),\mathcal{J}(P))$, where the Snell envelopes are defined with respect to the natural filtrations of the corresponding price processes.

Proof

By the continuous mapping theorem, the weak convergence of $(X_{n})$ implies that $(S_{n},C_{n},P_{n})$ converges in $(\mathcal{D}(\mathbb{I};\mathbb{R}^{3}),d_{\mathcal{D}})$ weakly to $(S,C,P)$. Because $C$ and $P$ are continuous and $\mathbb{E}[\sup _{t\in \mathbb{I}}\mathrm {e}^{X_{t}}] <\infty $, it follows from Karatzas and Shreve [43, Theorem D.13] that for both $\mathcal{J}(C)$ and $\mathcal{J}(P)$, the predictable finite-variation part of the Doob–Meyer decomposition is continuous. Moreover, the assumption that the natural filtration of $X$ is immersed in that of $(W,B)$ implies that also the martingale part is continuous. Therefore $\mathcal{J}(C)$ and $\mathcal{J}(P)$ are continuous processes. In view of the additional assumptions, the claim now follows from [55, Theorem 5.5]. □

Mulinacci and Pratelli [55] also gave explicit conditions for weak convergence to be preserved in the Markovian case. It is trivial to see that the pricing of American options in the rough tree scheme coincides with the classical backward induction procedure. We consider a continuously compounded interest rate $r$ and dividend yield $d$.

Algorithm 4.11

On the equidistant grid $\mathcal {T}$:

1)
Construct the binomial tree using the explicit construction in Sect. 4.7.1and obtain $(S^{j}_{t})_{t\in \mathcal {T},j=1,\dots ,4^{n}}$.
2)
The backward recursion for the American option price with payoff function $h(\,\cdot \,)$ is given by $\widetilde{h}_{t_{N}} := h(S_{t_{N}})$ and
$$ \widetilde{h}_{t_{i}} := \mathrm {e}^{(d-r)/n}\mathbb{E}[\widetilde{h}_{t_{i+1}}|\mathcal {F}_{t_{i}} ] \vee h(S_{t_{i}}) \qquad \textit{for } i=N-1,\ldots ,0, $$
where $\mathbb{E}[\,\cdot \,|\mathcal {F}_{t_{i}}] = \frac{1}{4} (\widetilde{h}^{\mathrm{++}}_{t_{i+1}}+\widetilde{h}^{\mathrm{+-}}_{t_{i+1}}+\widetilde{h}^{\mathrm{-+}}_{t_{i+1}}+\widetilde{h}^{\mathrm{--}}_{t_{i+1}} )$ and $\widetilde{h}_{t_{i}}^{\mathrm{\pm \pm}}$ represents the outcome $(\zeta _{i},\zeta ^{\bot}_{i})=(\pm 1,\pm 1)$ for the driving binomials, following the construction in Sect. 4.7.1.
3)
Finally, $\widetilde{h}_{0}$ is the price of the American option at the inception of the contract.

The main computational cost of the scheme is the construction of the tree in Step 1). Once the tree is constructed, computing American prices for different options is a fast routine.

4.8.1 Numerical example: rough Bergomi model

We note that, as proved by Gassiat [31, Theorem 1.1(1)], the price process in the rough Bergomi model is a true martingale when $\rho \leq 0$. We construct a rough volatility tree for the rough Bergomi model (introduced in [6]) and check the accuracy of the scheme. Figures 13 and 14 show the fractional trees for different values of $H$ and for $\rho \in \{-1,1\}$. Both pictures show a markedly different behaviour, but as a common property, we observe that as $H$ tends to $1/2$, the tree structure somehow becomes simpler.

4.8.2 European options

Figure 15 displays volatility smiles obtained using the tree scheme. Even though the time steps are not sufficient for small $H$, the fit remarkably improves when $H\geq 0.15$, and always remains inside the $95\%$ confidence interval with respect to the hybrid scheme. Moreover, the moment-matching approach from Sect. 4.3.1 shows a better accuracy than the standard rDonsker scheme when $H \leq 0.1$. In Fig. 16, a detailed error analysis corroborates these observations: the relative error is smaller than $3\%$ for $H\geq 0.15$.

4.8.3 American options

In the context of American options, there is no benchmark to compare our result. However, the accurate results found in the previous section (at least for $H\geq 0.15$) justify the use of trees to price American options. Figure 17 shows the output of American and European put prices with interest rates equal to $r=5\%$. Interestingly, the rougher the process (the smaller $H$), the larger the difference between in-the-money European and American options.

References

Akyıldırım, E., Dolinsky, Y., Soner, H.M.: Approximating stochastic volatility by recombinant trees. Ann. Appl. Probab. 24, 2176–2205 (2014)
MathSciNet Google Scholar
Alòs, E., León, J., Vives, J.: On the short-time behavior of the implied volatility for jump-diffusion models with stochastic volatility. Finance Stoch. 11, 571–589 (2007)
MathSciNet Google Scholar
Bardina, X., Nourdin, I., Rovira, C., Tindel, S.: Weak approximation of a fractional SDE. Stoch. Process. Appl. 120, 39–65 (2010)
MathSciNet Google Scholar
Barndorff-Nielsen, O.E., Schmiegel, J.: Ambit processes: with applications to turbulence and tumour growth. In: Benth, F.E., et al. (eds.) Stochastic Analysis and Applications. Abel Symp., vol. 2, pp. 93–124. Springer, Berlin (2007)
Google Scholar
Bayer, C., Friz, P., Gassiat, P., Martin, J., Stemper, B.: A regularity structure for rough volatility. Math. Finance 30, 782–832 (2020)
MathSciNet Google Scholar
Bayer, C., Friz, P., Gatheral, J.: Pricing under rough volatility. Quant. Finance 16, 1–18 (2016)
MathSciNet Google Scholar
Bayer, C., Friz, P., Gulisashvili, A., Horvath, B., Stemper, B.: Short-time near the money skew in rough fractional stochastic volatility models. Quant. Finance 19, 779–798 (2019)
MathSciNet Google Scholar
Beiglböck, M., Siorpaes, P.: Pathwise versions of the Burkholder–Davis–Gundy inequality. Bernoulli 21, 360–373 (2015)
MathSciNet Google Scholar
Bender, C., Parczewski, P.: On the connection between discrete and continuous Wick calculus with an application to the fractional Black–Scholes model. In: Cohen, S.N., et al. (eds.) Stochastic Processes, Finance and Control: A Festschrift in Honor of Robert J. Elliott, pp. 3–40. World Scientific, Singapore (2012)
Google Scholar
Bennedsen, M., Lunde, A., Pakkanen, M.S.: Hybrid scheme for Brownian semistationary processes. Finance Stoch. 21, 931–965 (2017)
MathSciNet Google Scholar
Bennedsen, M., Lunde, A., Pakkanen, M.S.: Decoupling the short- and long-term behavior of stochastic volatility. J. Financ. Econom. 20, 961–1006 (2022)
Google Scholar
Bergomi, L.: Smile dynamics II. Risk, October, 67–73 (2005)
Billingsley, P.: Convergence of Probability Measures, 2nd edn. Wiley, New York (1999)
Google Scholar
Boucheron, S., Lugosi, G., Massart, P.: Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, London (2013)
Google Scholar
Broux, L., Caravenna, F., Zambotti, L.: Hairer’s multilevel Schauder estimates without regularity structures. Preprint (2023). Available online at https://arxiv.org/abs/2301.07517
Čentsov, N.N.: Weak convergence of stochastic processes whose trajectories have no discontinuities of the second kind and the ‘heuristic’ approach to the Kolmogorov–Smirnov tests. Theory Probab. Appl. 1, 140–144 (1956)
MathSciNet Google Scholar
Clark, D.S.: Short proof of a discrete Gronwall inequality. Discrete Appl. Math. 16, 279–281 (1987)
MathSciNet Google Scholar
Comte, F., Renault, É.: Long memory continuous time models. J. Econom. 73, 101–149 (1996)
MathSciNet Google Scholar
Cooley, J.W., Tukey, J.W.: An algorithm for the machine calculation of complex Fourier series. Math. Comput. 19, 297–301 (1965)
MathSciNet Google Scholar
Djehiche, M., Eddahbi, M.: Hedging options in market models modulated by the fractional Brownian motion. Stoch. Anal. Appl. 19, 753–770 (2001)
MathSciNet Google Scholar
Donsker, M.D.: An invariance principle for certain probability limit theorems. In: Memoirs of the AMS, vol. 6. Am. Math. Soc., Providence (1951)
Google Scholar
Drábek, P.: Continuity of Nemyckij’s operator in Hölder spaces. Comment. Math. Univ. Carol. 16, 37–57 (1975)
Google Scholar
Dudley, R.M.: Uniform Central Limit Theorems. Cambridge University Press, Cambridge (1999)
Google Scholar
El Euch, O., Rosenbaum, M.: The characteristic function of rough Heston models. Math. Finance 29, 3–38 (2019)
MathSciNet Google Scholar
Ethier, S.N., Kurtz, T.G.: Markov Processes: Characterization and Convergence. Wiley, New York (1986)
Google Scholar
Forde, M., Zhang, H.: Asymptotics for rough stochastic volatility models. SIAM J. Financ. Math. 8, 114–145 (2017)
MathSciNet Google Scholar
Friz, P.K., Hairer, M.: A Course on Rough Paths. Springer, Berlin (2021)
Google Scholar
Friz, P.K., Victoir, N.: Multidimensional Stochastic Processes as Rough Paths: Theory and Applications. Cambridge University Press, Cambridge (2010)
Google Scholar
Fukasawa, M.: Asymptotic analysis for stochastic volatility: martingale expansion. Finance Stoch. 15, 635–654 (2011)
MathSciNet Google Scholar
Fukasawa, M., Takabatake, T., Westphal, R.: Consistent estimation for fractional stochastic volatility model under high-frequency asymptotics (Is volatility rough?). Math. Finance 32, 1086–1132 (2022)
MathSciNet Google Scholar
Gassiat, P.: On the martingale property in the rough Bergomi model. Electron. Commun. Probab. 24, 1–9 (2019)
MathSciNet Google Scholar
Gatheral, J., Jaisson, T., Rosenbaum, M.: Volatility is rough. Quant. Finance 18, 933–949 (2018)
MathSciNet Google Scholar
Guennoun, H., Jacquier, A., Roome, P., Shi, F.: Asymptotic behaviour of the fractional Heston model. SIAM J. Financ. Math. 9, 1017–1045 (2018)
Google Scholar
Gulisashvili, A.: Large deviation principle for Volterra type fractional stochastic volatility models. SIAM J. Financ. Math. 9, 1102–1136 (2018)
MathSciNet Google Scholar
Hamadouche, D.: Invariance principles in Hölder spaces. Port. Math. 57, 127–151 (2000)
MathSciNet Google Scholar
Hardy, G.H., Littlewood, J.E.: Some properties of fractional integrals I. Math. Z. 27, 565–606 (1928)
MathSciNet Google Scholar
Heinrich, C., Pakkanen, M., Veraart, A.E.D.: Hybrid simulation scheme for volatility modulated moving average fields. Math. Comput. Simul. 166, 224–244 (2019)
MathSciNet Google Scholar
Horvath, B., Jacquier, A., Lacombe, C.: Asymptotic behaviour of randomised fractional volatility models. J. Appl. Probab. 56, 496–523 (2019)
MathSciNet Google Scholar
Hurst, H.E.: Long-term storage capacity of reservoirs. Trans. Am. Soc. Civ. Eng. 116, 770–799 (1956)
Google Scholar
Jacquier, A., Martini, C., Muguruza, A.: On VIX futures in the rough Bergomi model. Quant. Finance 18, 45–61 (2018)
MathSciNet Google Scholar
Jacquier, A., Pakkanen, M., Stone, H.: Pathwise large deviations for the rough Bergomi model. J. Appl. Probab. 55, 1078–1092 (2018)
MathSciNet Google Scholar
Jakubowski, A., Mémin, J., Pagès, G.: Convergence en loi des suites d’intégrales stochastiques sur l’espace $D1$ de Skorokhod. PTRF 81, 111–137 (1989)
Google Scholar
Karatzas, I., Shreve, S.E.: Methods of Mathematical Finance. Springer, New York (1998)
Google Scholar
Kolmogorov, A.: Wienersche Spiralen und einige andere interessante Kurven im Hilbertschen Raum. C.R. (Dokl.) Acad. URSS (N.S.) 26, 115–118 (1940)
MathSciNet Google Scholar
Krylov, N.V.: Lectures on Elliptic and Parabolic Equations in Hölder Spaces. Graduate Studies in Mathematics, vol. 12. Am. Math. Soc., Providence (1996)
Google Scholar
Kurtz, T.G., Protter, P.E.: Weak convergence of stochastic integrals and stochastic differential equations. In: Talay, D., Tubaro, L. (eds.) CIME School in Probability. Lecture Notes in Mathematics, vol. 1627, pp. 1–41. Springer, Berlin (1991)
Google Scholar
Kushner, J.: On the weak convergence of interpolated Markov chains to a diffusion. Ann. Probab. 2, 40–50 (1974)
MathSciNet Google Scholar
Lamperti, J.: Semi-stable stochastic processes. Trans. Am. Math. Soc. 104, 62–78 (1962)
MathSciNet Google Scholar
Lamperti, J.: On convergence of stochastic processes. Trans. Am. Math. Soc. 104, 430–435 (1962)
MathSciNet Google Scholar
Livieri, G., Mouti, S., Pallavicini, A., Rosenbaum, M.: Rough volatility: evidence from option prices. IISE Trans. 50, 767–776 (2018)
Google Scholar
Lyons, T.J.: Differential equations driven by rough signals. Rev. Mat. Iberoam. 14, 215–310 (1998)
MathSciNet Google Scholar
Mandelbrot, B., Van Ness, J.: Fractional Brownian motions, fractional noises and applications. SIAM Rev. 10, 422–437 (1968)
MathSciNet Google Scholar
Mann, H.B., Wald, A.: On stochastic limit and order relationships. Ann. Math. Stat. 14, 217–226 (1943)
MathSciNet Google Scholar
McCrickerd, R., Pakkanen, M.S.: Turbocharging Monte Carlo pricing for the rough Bergomi model. Quant. Finance 18, 1877–1886 (2018)
MathSciNet Google Scholar
Mulinacci, S., Pratelli, M.: Functional convergence of Snell envelopes: applications to American options approximations. Finance Stoch. 2, 311–327 (1998)
MathSciNet Google Scholar
Muskhelishvili, N.I.: Singular Integral Equations. Boundary Problems of Functions Theory and Their Applications to Mathematical Physics. Revised translation from the Russian, edited by J.R.M. Radok. Reprinted. Wolters-Noordhoff Publishing, Groningen (1972).
Google Scholar
Neuenkirch, A., Shalaiko, T.: the order barrier for strong approximation of rough volatility models. Preprint (2016). Available online at https://arxiv.org/abs/1606.03854
Neuman, E., Rosenbaum, M.: Fractional Brownian motion with zero Hurst parameter: a rough volatility viewpoint. Electron. Commun. Probab. 23, Paper no. 61, 1–12 (2018)
MathSciNet Google Scholar
Nieminen, A.: Fractional Brownian motion and martingale-differences. Stat. Probab. Lett. 70, 1–10 (2004)
MathSciNet Google Scholar
Parczewski, P.: Donsker-type theorems for correlated geometric fractional Brownian motions and related processes. Electron. Commun. Probab. 22, Paper no. 55, 1–13 (2017)
MathSciNet Google Scholar
Parczewski, P.: A fractional Donsker theorem. Stoch. Anal. Appl. 32, 328–347 (2014)
MathSciNet Google Scholar
Pollard, D.: Convergence of Stochastic Processes. Springer, Berlin (1984)
Google Scholar
Račkauskas, A., Suquet, C.: Necessary and sufficient condition for the functional central limit theorem in Hölder spaces. J. Theor. Probab. 17, 221–243 (2004)
Google Scholar
Romano, M., Touzi, N.: Contingent claims and market completeness in a stochastic volatility model. Math. Finance 7, 399–412 (1997)
MathSciNet Google Scholar
Samko, S.G., Kilbas, A.A., Marichev, O.I.: Fractional Integrals and Derivatives: Theory and Applications. Gordon & Breach, Yverdon (1993)
Google Scholar
Sottinen, T.: Fractional Brownian motion, random walks and binary market models. Finance Stoch. 5, 343–355 (2001)
MathSciNet Google Scholar
Taqqu, M.: Weak convergence to fractional Brownian motion and to the Rosenblatt process. Z. Wahrscheinlichkeitstheor. Verw. Geb. 31, 287–302 (1975)
MathSciNet Google Scholar

Download references

Acknowledgements

The authors would like to thank Christian Bayer, Peter Friz, Masaaki Fukasawa, Paul Gassiat, Jim Gatheral, Mikko Pakkanen and Mathieu Rosenbaum for useful discussions. The numerical implementations have been carried out on the collaborative Zanadu platform, and the code is fully available at GitHub:RoughFCLT.

Author information

Authors and Affiliations

Department of Mathematics, University of Oxford, Oxford, UK
Blanka Horvath
Department of Mathematics, Imperial College London, and the Alan Turing Institute, London, UK
Antoine Jacquier
Kaiju Capital Management and Department of Mathematics, Imperial College London, London, UK
Aitor Muguruza
Department of Statistics, London School of Economics, London, UK
Andreas Søjmark

Authors

Blanka Horvath
View author publications
You can also search for this author in PubMed Google Scholar
Antoine Jacquier
View author publications
You can also search for this author in PubMed Google Scholar
Aitor Muguruza
View author publications
You can also search for this author in PubMed Google Scholar
Andreas Søjmark
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Antoine Jacquier.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

BH acknowledges financial support from the SNSF Early Postdoc.Mobility grant 165248; AM is grateful to the Imperial CDT in Financial Computing & Analytics for financial support. Part of this work was carried out while AJ was Visiting Professor in Baruch College, CUNY. AJ further acknowledges financial support from the EPSRC/T032146 grant.

Appendices

Appendix A: Riemann–Liouville operators

We review here fractional operators and their mapping properties. We follow closely the excellent monograph by Samko et al. [65, Chap. 1], as well as some classical results by Hardy and Littlewood [36]. However, we introduce a modification in their definition, so that the condition $f(0)=0$ is not necessary as opposed to the original definition in [36].

1.1 A.1 Riemann–Liouville fractional operators

Definition A.1

For $\lambda \in (0,1)$, $\alpha \in (-\lambda , 1- \lambda )$, the left Riemann–Liouville fractional operator is defined on $\mathcal {C}^{\lambda}( \mathbb{I})$ as

$$\begin{aligned} (I^{\alpha }f)(t) := \textstyle\begin{cases} \frac{1}{\Gamma (\alpha )}\int _{0}^{t} \frac{f(s)-f(0)}{(t-s)^{1-\alpha}} \mathrm {d}s & \text{for }\alpha \in (0,1- \lambda ), \\ (\frac{\mathrm {d}}{\mathrm {d}t}I^{1+\alpha}f )(t) \\ \quad = \frac{1}{\Gamma (1+\alpha )}\frac{\mathrm {d}}{\mathrm {d}t}(\int _{0}^{t}(t-s)^{ \alpha }(f(s)-f(0))\mathrm {d}s) & \text{for }\alpha \in (-\lambda , 0). \end{cases}\displaystyle \end{aligned}$$

(A.1)

Theorem A.2

For any $f\in \mathcal {C}^{\lambda}(\mathbb{I})$ with $\lambda \in (0,1)$ and $\alpha \in (-\lambda , 1-\lambda )$, we have $I^{\alpha}f \in \mathcal {C}^{\lambda +\alpha}(\mathbb{I})$. In particular, there exists $C>0$ such that $|(I^{\alpha}f)(t)| \leq C t^{\alpha +\lambda}$ for any $t \in \mathbb{I}$.

Proof

We first consider $\alpha >0$. Then using the definition and $f\in \mathcal {C}^{\lambda}(\mathbb{I})$, we obtain

$$\begin{aligned} | (I^{\alpha}f)(t) | & = \frac{1}{\Gamma (\alpha )} \bigg| \int _{0}^{t} \frac{f(u)-f(0)}{(t-u)^{1-\alpha}} \mathrm {d}u | \\ &\leq \frac{|f|_{\lambda}}{\Gamma (\alpha )}\int _{0}^{t} \frac{u^{\lambda }\mathrm {d}u}{(t-u)^{1-\alpha}} \\ & \leq \frac{\Gamma (2+\lambda )|f|_{\lambda}}{(1+\lambda )\Gamma (\alpha +\lambda +1)}t^{ \alpha +\lambda}, \end{aligned}$$

which proves the estimate for $|I^{\alpha}f|$. Next we prove that $I^{\alpha}f\in \mathcal {C}^{\lambda +\alpha}(\mathbb{I})$. For this, introduce $\phi (t):=f(t)-f(0)$ and consider $t,t+h\in \mathbb{I}$ with $h>0$. Then

$$\begin{aligned} &(I^{\alpha}f)(t+h) - (I^{\alpha}f)(t) \\ & = \frac{1}{\Gamma (\alpha )}\bigg(\int _{-h}^{t} \frac{\phi (t-u)}{(u+h)^{1-\alpha}} \mathrm {d}u - \int _{0}^{t} \frac{\phi (t-u)}{u^{1-\alpha}} \mathrm {d}u\bigg) \\ & = \frac{\phi (t)}{\Gamma (1+\alpha )} \big((t+h)^{\alpha}-t^{ \alpha }\big)+\frac{1}{\Gamma (\alpha )}\int _{-h}^{0} \frac{\phi (t-u)-\phi (t)}{(u+h)^{1-\alpha}} \mathrm {d}u \\ & \phantom{=:} + \frac{1}{\Gamma (\alpha )}\int _{0}^{t} \big((u+h)^{\alpha -1} - u^{ \alpha -1} \big) \big(\phi (t-u)-\phi (t) \big)\mathrm {d}u \\ & =: J_{1}+J_{2}+J_{3}. \end{aligned}$$

(A.2)

We first consider $J_{1}$. If $h>t$, then

$$ |J_{1}|\leq \frac{|f|_{\lambda}}{\Gamma (1+\alpha )}t^{\lambda }\big((t+h)^{ \alpha}-t^{\alpha }\big) \leq C h^{\lambda +\alpha}. $$

On the other hand, when $0< h< t$, since $(1+u)^{\alpha}-1\leq \alpha u$ for $u>0$, we get

$$ |J_{1}| \leq \frac{|f|_{\lambda}}{\Gamma (1+\alpha )}t^{\lambda + \alpha}\bigg|\bigg(1+\frac{h}{t}\bigg)^{\alpha}-1\bigg| \leq C h t^{ \lambda +\alpha -1}\leq C h^{\lambda +\alpha}. $$

For $J_{2}$, since $f\in \mathcal {C}^{\lambda}(\mathbb{I})$, we can write

$$ |J_{2}|\leq \frac{|f|_{\lambda}}{\Gamma (\alpha )}\int _{-h}^{0} \frac{|u|^{\lambda}}{(u+h)^{1-\alpha}} \leq C h^{\lambda +\alpha}. $$

Finally,

$$\begin{aligned} |J_{3}|&\leq \frac{|f|_{\lambda}}{\Gamma (\alpha )}\int _{0}^{t}u^{ \lambda }\big(u^{\alpha -1}-(u+h)^{\alpha -1}\big)\mathrm {d}u \\ & = \frac{|f|_{\lambda}}{\Gamma (\alpha )}h^{\lambda +\alpha}\int _{0}^{t/h}u^{ \lambda}\big(u^{\alpha -1}-(u+1)^{\alpha -1}\big)\mathrm {d}u. \end{aligned}$$

Hence if $t\leq h$, then $|J_{3}|\leq C h^{\lambda +\alpha}$. Likewise, if $t>h$ and $\lambda +\alpha <1$, then $|J_{3}|\leq C h^{\lambda +\alpha}$ since

$$ |u^{\alpha -1}-(u+1)^{\alpha -1} | = u^{\alpha -1}\bigg(1-\Big(1+ \frac{1}{u}\Big)^{\alpha -1}\bigg)\leq C u^{\alpha -2}. $$

Thus we have shown that $I^{\alpha}f$ satisfies the $(\lambda +\alpha )$-Hölder condition and belongs to $\mathcal {C}^{\lambda +\alpha}(\mathbb{I})$ in the case where $\alpha >0$. The conclusion for $\alpha <0$ follows by taking $g(u):=u^{\alpha}$ in the proof of Proposition 2.2 in Appendix C. □

Corollary A.3

For any $\lambda \in (0,1)$ and $\alpha \in (-\lambda , 1-\lambda )$, $I^{\alpha }$ is a continuous operator from $\mathcal {C}^{\lambda}(\mathbb{I})$ to $\mathcal {C}^{\lambda +\alpha}(\mathbb{I})$.

Proof

Clearly, $I^{\alpha }$ is a linear operator. From Theorem A.2, $\| I^{\alpha }f\|_{\alpha +\lambda} \leq C\| f\|_{\lambda}$ since $|f|_{\lambda}\leq \| f\|_{\lambda}$. Therefore $I^{\alpha }$ is also bounded and hence continuous. □

Appendix B: Discrete convolution

Definition B.1

For $\mathrm{a},\mathrm{b}\in \mathbb{R}^{n}$, the discrete convolution operator $\ast :\mathbb{R}^{n}\times \mathbb{R}^{n}\to \mathbb{R}^{n}$ is defined as

$$ (\mathrm{a} * \mathrm{b})_{i} := \sum _{m=0}^{i} \mathrm{a}_{m} \mathrm{b}_{i-m},\qquad i=0,\ldots ,n-1. $$

When simulating $\mathcal {G}^{\alpha }W$ on the uniform partition $\mathcal {T}$, the scheme (Algorithm 4.3, Step 3) reads

$$ (\mathcal {G}^{\alpha }W)^{j}(t_{i}) = \sum _{k=1}^{i}g(t_{k})\zeta _{j,k-i+1} \qquad \text{for }i=1,\ldots ,n, $$

which has the form of the discrete convolution in Definition B.1. Rewritten in matrix form,

$$ \begin{pmatrix} g(t_{1}) & 0 & \ldots & 0 \\ g(t_{2}) & g(t_{1}) & \ldots & 0 \\ \vdots &\vdots &\ddots & 0 \\ g(t_{n}) & g(t_{n-1}) & \ldots & g(t_{1}) \end{pmatrix} \begin{pmatrix} \zeta _{1} \\ \vdots \\ \zeta _{n} \end{pmatrix} , $$

it is clear that this operation yields a complexity of the order $\mathcal {O}(n^{2})$, which can be improved drastically.

Definition B.2

For a sequence $\mathrm {c}:= (c_{0},c_{1},\dots ,c_{n-1} )\in \mathbb{C}^{n}$, the discrete Fourier transform (DFT) is given by

$$ \widehat{f}(\mathrm {c})[j] := \sum _{k=0}^{n-1}c_{k} \exp \bigg(- \frac{2\mathtt {i}\pi j k}{n}\bigg) \qquad \text{for }j=0,\ldots ,n-1, $$

and the inverse DFT of $\mathrm {c}$ is given by

$$ f(\mathrm {c})[k] := \frac{1}{n}\sum _{j=0}^{n-1}c_{j} \exp \bigg( \frac{2\mathtt {i}\pi j k}{n}\bigg) \qquad \text{for }k=0,\ldots ,n-1. $$

In general, both transforms require a computational effort of the order $\mathcal {O}(n^{2})$, but the fast Fourier transform (FFT) algorithm by Cooley and Tukey [19] exploits the symmetry and periodicity of complex exponentials of the DFT and reduces the complexity of both transforms to $\mathcal {O}(n\log n)$.

Theorem B.3

For $\mathrm {a}, \mathrm {b}\in \mathbb{R}^{n}$, the identity $(\mathrm {a}\ast \mathrm {b})=f (\widehat{f}(\mathrm {a})\bullet \widehat{f}(\mathrm {b}) )$ holds, where • is the pointwise multiplication.

This implies that the complexity of the discrete convolution reduces to $\mathcal {O}(n\log n)$ by FFT.

Algorithm B.4

On the equidistant grid $\mathcal {T}= \{ t_{i} =\frac{iT}{n} : i = 0,1,\dots ,n \}$:

1)
Draw i.i.d. random variables $\zeta _{j,i} \sim \mathcal {N}(0,1)$ for $j=1,\ldots ,M$, $i=1,\ldots ,n$.
2)
Define $\mathfrak {g}:=(g(t_{i}))_{i=1,\ldots ,n}$ and $\zeta _{j}:=(\zeta _{j,i})_{i=1,\ldots ,n}$ for $j=1,\ldots ,M$.
3)
Using the FFT, compute $\varphi _{j}:=\widehat{f}(\mathfrak {g})\bullet \widehat{f}(\zeta _{j})$ for $j=1,\ldots ,M$.
4)
Simulate $M$ paths of $\mathcal {G}^{\alpha }W$, using the FFT, as $(\mathcal {G}^{\alpha }W)^{j}(\mathcal {T})=\sqrt{\frac{T}{n}}f(\varphi _{j})$ for $j=1,\ldots ,M$.

In Step 2 in Algorithm B.4, we may replace the evaluation points $\mathfrak {g}$ by any optimal evaluation point $(g(t_{i}^{*}))_{i=1}^{n}$ as in (4.2). Many packages offer a direct implementation of the discrete convolution such as numpy.convolve in Python. The user then only needs to pass the arguments $\mathfrak {g}$ and $\xi _{j}$, and Steps 3 and 4 in Algorithm B.4 are computed automatically (using efficient FFT techniques) by the function. Although the FFT step is the heaviest computation in the simulation of rough volatility models, the actual time grid $\mathcal {T}$ is not especially large ($n\ll 1000$). Hence the fastest FFT for very large $n$ is not essential as the implementation is run on smaller time grids. In this aspect, we find that numpy.convolve is a very competitive implementation.

Appendix C: Proof of Proposition 2.2

In this section, we present a proof of Proposition 2.2. We first consider the case $\alpha \in (-\lambda , 0)$ with $0<\lambda \leq 1$. Fix $f\in \mathcal {C}^{\lambda}(\mathbb{I})$ and $g \in \mathcal{L}^{\alpha}$. As our first step, we derive a useful representation akin to Samko et al. [65, Eq. (13.1)], but for the operator $\mathcal{G}^{\alpha}$ associated to $g$, which amounts to $\mathcal{G}^{\alpha }f(0)=0$ and

$$ \mathcal{G}^{\alpha }f(t) = \big(f(t)-f(0)\big)g(t) - \int _{0}^{t} \big(f(t)-f(s)\big)\frac{\mathrm{d}}{\mathrm{d}t}g(t-s)\mathrm{d}s $$

(C.1)

for $t\in (0,1]$ (note that $g(0)$ need not be defined, but the assumptions on $f$ and $g$ yield $\mathcal{G}^{\alpha }f(0)=0$ by its definition in (2.1)). To show that (C.1) holds, we look at the difference quotients for the definition of $\mathcal{G}^{\alpha }f$ in (2.1). For any $t\in [0,1)$ and any small enough $h>0$, a bit of rewriting leads to the equality

$$\begin{aligned} &\int _{0}^{t+h}\big(f(s)-f(0)\big)g(t+h-s)\mathrm {d}s - \int _{0}^{t}\big(f(s)-f(0) \big)g(t-s)\mathrm {d}s \\ & = \int _{0}^{t} \big(f(s)-f(t)\big)\bigl( g(t+h-s)-g(t-s)\bigr)\mathrm {d}s \\ & \phantom{=:} + \big(f(t)-f(0)\big)\biggl( \int _{0}^{t+h} g(t+h-s)\mathrm{d}s - \int _{0}^{t} g(t-s) \mathrm{d}s \biggr) \\ & \phantom{=:} + \int _{t}^{t+h} \big(f(s)-f(t)\big)g(t+h-s)\mathrm{d}s. \end{aligned}$$

(C.2)

In the second term on the right-hand side of (C.2), a change of variables gives

$$ \big(f(t)-f(0)\big)\biggl(\int _{0}^{t+h} g(t+h-s)\mathrm{d}s - \int _{0}^{t} g(t-s) \mathrm{d}s \biggr) = \big(f(t)-f(0)\big)\int _{-h}^{0} g(t-r) \mathrm{d}r. $$

Looking at the third term on the right-hand side of (C.2), using the assumptions $f\in \mathcal {C}^{\lambda}$ and $g\in \mathcal{L}^{\alpha}$ for the given $\lambda $ and $\alpha $, we get

$$ \bigg\vert \int _{t}^{t+h} \big(f(s)-f(t)\big)g(t+h-s)\mathrm {d}s \bigg\vert \leq \int _{t}^{t+h} C_{1} h^{\lambda }C_{2} h^{\alpha } \mathrm {d}s \leq C_{0} h^{1+\lambda + \alpha} = o(h) $$

as $h$ tends to zero, since $\lambda +\alpha \in (0,1)$. As for the first term, we have

$$\begin{aligned} \bigg\vert \big(f(t)-f(s)\big)\frac{g(t+h-s)-g(t-s)}{h} \bigg\vert & \leq \frac{ C_{1} |t-s|^{\lambda}}{h}\int _{t-s}^{t-s+h} g^{\prime}(r) \mathrm {d}r \\ & \leq C_{1} (t-s)^{\lambda }C_{2} |t-s|^{\alpha -1} \\ & = C_{0} (t-s)^{\lambda +\alpha -1} \end{aligned}$$

for all $s\in (0,t)$, $h>0$, where $\lambda + \alpha -1 \in (-1,0)$; so the right-hand side is in $L^{1}([0,t])$ and we can apply the dominated convergence theorem. Specifically, dividing by $h$ in (C.2) and sending $h$ to zero, we obtain (C.1) in the limit by using dominated convergence on the first term, Lebesgue’s differentiation theorem on the second, and noting that the third term vanishes.

Having established (C.1), we use it to obtain the desired Hölder estimates. We begin with the first term on the right-hand side of (C.1). Let $\phi (t):=(f(t)-f(0))g(t)$ for $t\in \mathbb{I}$, where we note that $|(f(t)-f(0))g(t)|\leq C t^{\lambda + \alpha}$ with $\lambda + \alpha \in (0,1)$ so that $\phi (0)=0$ is well defined. Rewriting and using the assumptions $f\in \mathcal {C}^{\lambda}$ and $g\in \mathcal{L}^{\alpha}$ for the given $\lambda $ and $\alpha $, we get for every $t\in \mathbb{I}$ and $h\in (0,1-t]$ that

$$\begin{aligned} \vert \phi (t+h)-\phi (t) \vert &\leq |f(t)-f(0)| \int _{t}^{t+h}|g^{ \prime}(r)|\mathrm{d}r + |g(t+h)||f(t+h)-f(t)| \\ & \leq C \vert f \vert _{\lambda }t^{\lambda + \alpha} (t+h)^{\alpha } \bigl( (t+h)^{-\alpha} - t^{-\alpha} \bigr) + C \vert f \vert _{ \lambda }h^{\lambda + \alpha} \\ & \leq C^{\prime }\vert f \vert _{\lambda }h^{\lambda +\alpha} \end{aligned}$$

(C.3)

where the last inequality follows by elementary considerations as in the arguments in Muskhelishvili [56, Chap. 1, Sect. 6, item 5]. The case $h\in [-t,0]$ is analogous.

For the second term on the right-hand side of (C.1), we can follow a procedure similar to the proof of Samko et al. [65, Lemma 13.1]. Defining

$$ \varphi (t):=\int _{0}^{t} \big(f(t)-f(s)\big) \frac{\mathrm{d}}{\mathrm{d}t}g(t-s)\mathrm{d}s = \int _{0}^{t} \big(f(t)-f(t-r) \big)\frac{\mathrm{d}}{\mathrm{d}r}g(r)\mathrm{d}r $$

and rewriting things, we arrive for any $t\in \mathbb{I}$ and $h\in [-t,1-t]$ at

$$\begin{aligned} \varphi (t+h)-\varphi (t) &= \int _{0}^{t} \big(f(t) - f(t-r)\big) \bigl(g^{\prime}(r+h) - g^{\prime}(r)\bigr)\mathrm{d}r \\ & \phantom{=:} - \int _{0}^{t} \big(f(t) - f(t-r)\big)g^{\prime}(r+h)\mathrm{d}r \\ & \phantom{=:} + \int _{-h}^{t} \big(f(t+h)-f(t-u)\big) g^{\prime}(u+h)\mathrm{d}u \\ &= \int _{0}^{t} \big(f(t) - f(t-r)\big)\bigl(g^{\prime}(r+h) - g^{ \prime}(r)\bigr)\mathrm{d}r \\ & \phantom{=:} - \int _{0}^{t} \big(f(t+h) - f(t)\big)g^{\prime}(r+h)\mathrm{d}r \\ & \phantom{=:} + \int _{-h}^{0} \big(f(t+h)-f(t-u)\big) g^{\prime}(u+h)\mathrm{d}u \\ &=: I_{1} + I_{2} + I_{3}. \end{aligned}$$

(C.4)

Without loss of generality, we assume $h>0$. For the first integral, a change of variables gives

$$\begin{aligned} |I_{1}| &\leq \vert f \vert _{\lambda }\int _{0}^{t} r^{\lambda } \int _{r}^{r+h} |g^{\prime \prime} (u)|\mathrm{d}u \mathrm{d}r \\ &\leq C \vert f \vert _{\lambda }\int _{0}^{t} r^{\lambda }\bigl( r^{ \alpha -1} - (r+h)^{\alpha -1} \bigr) \mathrm{d}r \\ & = C \vert f \vert _{\lambda }h^{\lambda +\alpha -1} \int _{0}^{t} \bigg(\frac{r}{h}\bigg)^{\lambda }\bigg( \Big(\frac{r}{h}\Big)^{ \alpha -1} - \Big(\frac{r}{h}+1\Big)^{\alpha -1} \bigg) \mathrm{d}r \\ & = C \vert f \vert _{\lambda }h^{\lambda +\alpha } \int _{0}^{t/h} u^{ \alpha +\lambda -1} \bigg(1 -\Big( 1+\frac{1}{u}\Big)^{\alpha -1} \bigg) \mathrm{d}u \\ & \leq C \vert f \vert _{\lambda }h^{\lambda +\alpha } \bigg( \int _{0}^{1} u^{\lambda +\alpha -1} \mathrm{d}u + (1-\alpha )\int _{1}^{\infty} u^{ \lambda +\alpha -2} \mathrm{d}u \bigg), \end{aligned}$$

where $\lambda +\alpha \in (0,1)$; so the final two terms on the right-hand side are finite. In the final line, we have used that the mapping $y\mapsto - (1+y)^{\alpha -1}+(\alpha -1)y$ is concave with a maximum value of −1 at $y=0$.

As regards the two remaining integrals $I_{2}$ and $I_{3}$, we see immediately that

$$\begin{aligned} &|I_{2}| \leq C \vert f \vert _{\lambda }h^{\lambda} \int _{0}^{ \infty }(r+h)^{\alpha -1} \mathrm{d}r= \frac{C}{|\alpha |} \vert f \vert _{\lambda }h^{\lambda +\alpha}, \\ &|I_{3}| \leq C \vert f \vert _{\lambda }\int _{-h}^{0} (u+h)^{ \lambda +\alpha - 1} \mathrm{d}u = \frac{C}{|\lambda +\alpha |} \vert f \vert _{\lambda }h^{\lambda + \alpha}. \end{aligned}$$

By linearity, the desired continuity of the operator $\mathcal{G}^{\alpha}: \mathcal {C}^{\lambda}(\mathbb{I})\rightarrow \mathcal {C}^{ \lambda +\alpha}(\mathbb{I})$ for $\alpha \in (-\lambda , 0)$ now follows from (C.1), (C.3) and the three above estimates for (C.4).

It remains to consider $\alpha \in (0,1-\lambda ) $. As before, recall $0<\lambda \leq 1$ and fix $f\in \mathcal {C}^{\lambda}(\mathbb{I})$ along with $g\in \mathcal{L}^{\alpha}$. Unlike above, $s\mapsto \frac{\mathrm{d}}{\mathrm{d}t}g(t-s)$ is now integrable on $\mathbb{I}$ which makes things go through more easily; in particular, we can work directly with the definition of $\mathcal{G}^{\alpha }f$ in (2.1), applying arguments analogous to (C.4). The case $g(u)=u^{\alpha}$ is already covered by the proof of Theorem A.2. For a general $g \in \mathcal{G}^{\alpha}$, we can retrace those same steps except that in (A.2) and the subsequent estimates for $J_{1}$, $J_{2}$ and $J_{3}$, we must now invoke our control on $g$ and its derivatives (similarly to how we did it above for (C.4) and the subsequent estimates of $I_{1}$, $I_{2}$ and $I_{3}$). This completes the proof of Proposition 2.2. □

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Horvath, B., Jacquier, A., Muguruza, A. et al. Functional central limit theorems for rough volatility. Finance Stoch 28, 615–661 (2024). https://doi.org/10.1007/s00780-024-00533-5

Download citation

Received: 07 May 2019
Accepted: 09 September 2023
Published: 16 April 2024
Issue Date: July 2024
DOI: https://doi.org/10.1007/s00780-024-00533-5

Keywords

Mathematics Subject Classification

JEL Classification

G11

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Functional central limit theorems for rough volatility

Abstract

Similar content being viewed by others

Option pricing under fast-varying and rough stochastic volatility

Asymptotics for volatility derivatives in multi-factor rough volatility models

The characteristic function of Gaussian stochastic volatility models: an analytic expression

1 Introduction

2 Weak convergence of rough volatility models

2.1 Hölder spaces and fractional operators

Definition 2.1

Proposition 2.2

Assumption 2.3

2.2 Examples

Proposition 2.4

Proof

Corollary 2.5

Example 2.6

Example 2.7

Example 2.8

Example 2.9

2.3 The approximation scheme

Assumption 2.10

Theorem 2.11

3 Functional CLT for a family of Hölder-continuous processes

3.1 Weak convergence of Brownian motion in Hölder spaces

Theorem 3.1

Theorem 3.2

Theorem 3.3

Theorem 3.4

Proof

3.2 Weak convergence of Itô diffusions in Hölder spaces

Theorem 3.5

Proof

Corollary 3.6

Proof

3.3 Invariance principle for rough processes

Theorem 3.7

Theorem 3.8

Proof

Proposition 3.9

Proof

3.4 Extending the weak convergence to the Skorokhod space and proof of Theorem 2.11

Lemma 3.10

Proof

Theorem 3.11

Theorem 3.12

Proof

4 Applications

4.1 Weak convergence of the hybrid scheme

Proposition 4.1

Proof

Remark 4.2

4.2 Application to fractional binomial trees

4.3 Monte Carlo

Algorithm 4.3

Remark 4.4

4.3.1 Enhancing the performance

Proposition 4.5

Proof

4.3.2 Reducing variance

Algorithm 4.6

4.4 Numerical example: rough Bergomi model

4.5 Speed benchmark against Markovian stochastic volatility models

4.6 Implementation guidelines and conclusion

Remark 4.7

4.7 Bushy trees and binomial markets

4.7.1 A binary market

4.8 American options in rough volatility models

Definition 4.8

Definition 4.9

Theorem 4.10

Proof

Algorithm 4.11

4.8.1 Numerical example: rough Bergomi model

4.8.2 European options

4.8.3 American options

References

Acknowledgements

Author information

Authors and Affiliations