Brownian semistationary process
Let \((\varOmega,\mathcal{F},(\mathcal{F}_{t})_{t \in \mathbb {R}}, \mathbb{P})\) be a filtered probability space, satisfying the usual conditions, supporting a (two-sided) standard Brownian motion \(W=(W_{t})_{t \in \mathbb {R}}\). We consider a Brownian semistationary process
$$\begin{aligned} X_{t} = \int_{-\infty}^{t} g(t-s)\sigma_{s} \mathrm {d}W_{s}, \quad t \in \mathbb {R}, \end{aligned}$$
(2.1)
where \(\sigma= (\sigma_{t})_{t\in \mathbb {R}}\) is an \((\mathcal{F}_{t})_{t \in \mathbb {R}}\)-predictable process with locally bounded trajectories, which captures the stochastic volatility (intermittency) of \(X\), and where \(g: (0,\infty) \to[0,\infty)\) is a Borel-measurable kernel function.
To ensure that the integral (2.1) is well defined, we assume that the kernel function \(g\) is square-integrable, that is, \(\int _{0}^{\infty}g(x)^{2} \mathrm {d}x < \infty\). In fact, we shortly introduce some more specific assumptions on \(g\) that imply its square-integrability. Throughout the paper, we also assume that the process \(\sigma\) has finite second moments, \(\mathbb {E}[\sigma_{t}^{2}] < \infty\) for all \(t \in \mathbb {R}\), and that the process is covariance-stationary, namely,
$$ \mathbb{E}[\sigma_{s}] = \mathbb{E}[\sigma_{t}],\quad \mathrm{Cov}(\sigma_{s},\sigma_{t}) = \mathrm{Cov}(\sigma_{0},\sigma_{|s-t|}), \quad s,t \in \mathbb {R}. $$
These assumptions imply that also \(X\) is covariance-stationary, that is,
$$ \mathbb{E}[X_{t}] = 0, \quad\mathrm{Cov}(X_{s},X_{t}) = \mathbb {E}[\sigma_{0}^{2}]\int _{0}^{\infty}g(x)g(x+|s-t|)\mathrm {d}x, \quad s,t \in \mathbb {R}. $$
However, the process \(X\) need not be strictly stationary as the dependence between the volatility process \(\sigma\) and the driving Brownian motion \(W\) may be time-varying.
Kernel function
As mentioned above, we consider a kernel function that satisfies \(g(x) \propto x^{\alpha}\) for some \(\alpha\in(-\frac{1}{2},\frac{1}{2})\setminus\{0 \}\) when \(x>0\) is near zero. To make this idea rigorous and to allow additional flexibility, we formulate our assumptions on \(g\) using the theory of regular variation [15] and, more specifically, slowly varying functions.
To this end, recall that a measurable function \(L : (0,1] \rightarrow [0,\infty)\) is slowly varying at 0 if for any \(t>0\),
$$\begin{aligned} \lim_{x \rightarrow0} \frac{L(tx)}{L(x)} = 1. \end{aligned}$$
Moreover, a function \(f(x) = x^{\beta}L(x)\), \(x \in(0,1]\), where \(\beta \in \mathbb {R}\) and \(L\) is slowly varying at 0, is said to be regularly varying at 0, with \(\beta\) being the index of regular variation.
Remark 2.1
Conventionally, slow and regular variation are defined at \(\infty\) [15, Sects. 1.2.1 and 1.4.1]. However, \(L\) is slowly varying (resp. regularly varying) at 0 if and only if \(x \mapsto L(1/x)\) is slowly varying (resp. regularly varying) at \(\infty\).
A key feature of slowly varying functions, which will be very important in the sequel, is that they can be sandwiched between polynomial functions as follows. If \(\delta>0\) and \(L\) is slowly varying at 0 and bounded away from 0 and \(\infty\) on every interval \((u,1]\), \(u \in(0,1)\), then there exist constants \(\overline{C}_{\delta} \geq \underline{C}_{\delta} >0\) such that
$$\begin{aligned} \underline{C}_{\delta} x^{\delta} \leq L(x) \leq\overline{C}_{\delta} x^{-\delta}, \quad x \in(0,1]. \end{aligned}$$
(2.2)
The inequalities above are an immediate consequence of the so-called Potter bounds for slowly varying functions; see [15, Theorem 1.5.6(ii)] and (4.1) below. Making \(\delta\) very small, we see that slowly varying functions are asymptotically negligible in comparison with polynomially growing/decaying functions. Thus, by multiplying power functions and slowly varying functions, regular variation provides a flexible framework to construct functions that behave asymptotically like power functions.
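To see the sandwich (2.2) in action, the following minimal Python sketch (our own illustration; the grid, the choice \(\delta=0.1\) and the function \(L(x) = 1 - \log x\), which reappears under (A1) below, are arbitrary) computes admissible constants numerically.

    import numpy as np

    L = lambda x: 1.0 - np.log(x)       # slowly varying at 0; cf. the example under (A1)

    delta = 0.1
    x = np.logspace(-12, 0, 100_000)    # grid on (0, 1]

    # Extremal constants for which the sandwich (2.2) holds on this grid:
    C_lower = np.min(L(x) / x**delta)   # attained at x = 1, where L(1) = 1
    C_upper = np.max(L(x) * x**delta)   # attained near x = exp(1 - 1/delta)
    print(C_lower, C_upper)             # approx. 1.00 and 4.07

Even for this unbounded \(L\), moderate constants suffice, illustrating how weak the polynomial bounds in (2.2) are.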
Our assumptions concerning the kernel function \(g\) are as follows:
-
(A1)
For some \(\alpha\in(-\frac{1}{2},\frac{1}{2}) \setminus \{0\}\),
$$\begin{aligned} g(x) = x^{\alpha}L_{g}(x), \quad x \in(0,1], \end{aligned}$$
where \(L_{g} : (0,1] \to[0,\infty)\) is continuously differentiable, slowly varying at 0 and bounded away from 0. Moreover, there exists a constant \(C>0\) such that the derivative \(L'_{g}\) of \(L_{g}\) satisfies
$$\begin{aligned} |L_{g}'(x)| \leq C(1+x^{-1}), \quad x\in(0,1]. \end{aligned}$$
-
(A2)
The function \(g\) is continuously differentiable on \((0,\infty)\), and its derivative \(g'\) is ultimately monotonic and satisfies \(\int_{1}^{\infty}g'(x)^{2} \mathrm {d}x < \infty\).
-
(A3)
For some \(\beta\in(-\infty,-\frac{1}{2})\),
$$ g(x) = \mathcal{O}(x^{\beta}), \quad x \rightarrow\infty. $$
(Here, and in the sequel, we use the notation \(f(x) = \mathcal {O}(h(x))\), \(x \rightarrow a\), to indicate that \(\limsup_{x \rightarrow a} |\frac{f(x)}{h(x)}| < \infty\). Additionally, analogous notation is later used for sequences and computational complexity.) In view of the bound (2.2), these assumptions ensure that \(g\) is square-integrable. It is worth pointing out that (A1) accommodates functions \(L_{g}\) with \(\lim_{x \rightarrow 0} L_{g}(x) = \infty\), e.g. \(L_{g}(x) = 1 - \log x\).
The assumption (A1) influences the short-term behavior and roughness of the process \(X\). A simple way to assess the roughness of \(X\) is to study the behavior of its variogram (also called second-order structure function in the turbulence literature)
$$ V_{X}(h) := \mathbb{E}[|X_{h} - X_{0}|^{2}], \quad h \geq0, $$
as \(h \rightarrow0\). Note that by covariance-stationarity,
$$ V_{X}(|s-t|) = \mathbb{E}[|X_{s} - X_{t}|^{2}], \quad s,t \in \mathbb {R}. $$
Under our assumptions, we have the following characterization of the behavior of \(V_{X}\) near zero, which generalizes a result of Barndorff-Nielsen [3, p. 9] and implies that \(X\) has a locally Hölder-continuous modification. Therein, and in what follows, we write \(a(x) \sim b(x)\), \(x \rightarrow y\), to indicate that \(\lim_{x \rightarrow y} \frac{a(x)}{b(x)} = 1\). The proof of this result is carried out in Sect. 4.1.
Proposition 2.2
Suppose that (A1)–(A3) hold.
-
(i)
The variogram of \(X\) satisfies
$$ V_{X}(h) \sim \mathbb {E}[\sigma_{0}^{2}] \bigg(\frac{1}{2\alpha+ 1} + \int_{0}^{\infty}\big( (y+1)^{\alpha}- y^{\alpha}\big)^{2} \mathrm {d}y\bigg) h^{2\alpha+1} L_{g}(h)^{2}, \quad h \rightarrow0, $$
which implies that \(V_{X}\) is regularly varying at zero with index \(2\alpha+1\).
-
(ii)
The process \(X\) has a modification with locally \(\phi\)-Hölder-continuous trajectories for any \(\phi\in(0,\alpha+ \frac{1}{2})\).
Motivated by Proposition 2.2, we call \(\alpha\) the roughness index of the process \(X\). Ignoring the slowly varying factor \(L_{g}(h)^{2}\) in Proposition 2.2(i), we see that the variogram \(V_{X}(h)\) behaves like \(h^{2\alpha+1}\) for small values of \(h\), which is reminiscent of the scaling property of the increments of a fractional Brownian motion (fBm) with Hurst index \(H = \alpha+ \frac{1}{2}\). Thus, the process \(X\) behaves locally like such an fBm, at least when it comes to second-order structure and roughness. (Moreover, the factor \(\frac{1}{2\alpha + 1} + \int_{0}^{\infty}( (y+ 1)^{\alpha}- y^{\alpha})^{2} \mathrm {d}y\) coincides with the normalization coefficient that appears in the Mandelbrot–Van Ness representation [23, Theorem 1.3.1] of an fBm with \(H = \alpha+ \frac{1}{2}\).)
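For later reference, this normalization coefficient can be evaluated by numerical quadrature; below is a minimal sketch (the function name c_alpha and the sample values of \(\alpha\) are our own choices), with the integral split at 1 to handle the integrable singularity at 0 when \(\alpha< 0\).

    import numpy as np
    from scipy.integrate import quad

    def c_alpha(alpha: float) -> float:
        """1/(2*alpha + 1) + int_0^infty ((y+1)^alpha - y^alpha)^2 dy, cf. Proposition 2.2(i)."""
        f = lambda y: ((y + 1.0)**alpha - y**alpha)**2
        head, _ = quad(f, 0.0, 1.0)     # integrand ~ y^(2*alpha) near 0, integrable
        tail, _ = quad(f, 1.0, np.inf)  # integrand ~ alpha^2 * y^(2*alpha - 2) at infinity
        return 1.0 / (2.0 * alpha + 1.0) + head + tail

    for a in (-0.43, -0.25, 0.25):       # H = a + 1/2; a = -0.43 is a typical "rough" value
        print(f"alpha = {a:+.2f}: c(alpha) = {c_alpha(a):.4f}")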
Let us now look at two examples of a kernel function \(g\) that satisfies our assumptions.
Example 2.3
The so-called gamma kernel
$$\begin{aligned} g(x) = x^{\alpha} e^{-\lambda x}, \quad x \in(0,\infty), \end{aligned}$$
with parameters \(\alpha\in(-\frac{1}{2},\frac{1}{2}) \setminus\{ 0 \} \) and \(\lambda> 0\), has been used extensively in the literature on \(\mathcal{BSS}\) processes. It is particularly important in connection with statistical modeling of turbulence, see Corcuera et al. [16], but it also provides a way to construct generalizations of Ornstein–Uhlenbeck (OU) processes with a roughness that differs from the usual semimartingale case \(\alpha= 0\), while mimicking the long-term behavior of an OU process. Moreover, \(\mathcal {BSS}\) and \(\mathcal{LSS}\) processes defined using the gamma kernel have interesting probabilistic properties; see [24]. An in-depth study of the gamma kernel can be found in [3]. Setting \(L_{g} (x) := e^{-\lambda x}\), which is slowly varying at 0 since \(\lim_{x \rightarrow0} L_{g}(x) = 1\), it is evident that (A1) holds. Since \(g(x)\) decays exponentially fast to 0 as \(x \rightarrow\infty\), it is clear that also (A3) holds. To verify (A2), note that \(g\) satisfies
$$\begin{aligned} g'(x) = \bigg( \frac{\alpha}{x}-\lambda\bigg) g(x), \quad g''(x) = \bigg(\Big(\frac{\alpha}{x}-\lambda\Big)^{2}- \frac{\alpha}{x^{2}}\bigg) g(x), \quad x \in(0,\infty), \end{aligned}$$
where \(\lim_{x \rightarrow\infty}((\frac{\alpha}{x}-\lambda)^{2}- \frac {\alpha}{x^{2}})= \lambda^{2}>0\), so \(g'\) is ultimately increasing with
$$\begin{aligned} g'(x)^{2} \leq(|\alpha| + \lambda)^{2} g(x)^{2}, \quad x \in[1,\infty). \end{aligned}$$
Thus, \(\int_{1}^{\infty}g'(x)^{2} \mathrm {d}x < \infty\) since \(g\) is square-integrable.
Example 2.4
Consider the power-law kernel function
$$\begin{aligned} g(x) = x^{\alpha}(1+x)^{\beta-\alpha}, \quad x \in(0,\infty), \end{aligned}$$
with parameters \(\alpha\in(-\frac{1}{2},\frac{1}{2})\setminus\{0\}\) and \(\beta\in(-\infty,-\frac{1}{2})\). The behavior of this kernel function near zero is similar to that of the gamma kernel, but \(g(x)\) decays to zero polynomially as \(x \rightarrow\infty\), so it can be used to model long memory. In fact, it can be shown that if \(\beta\in (-1,-\frac{1}{2})\), then the autocorrelation function of \(X\) is not integrable. Clearly, (A1) holds with \(L_{g}(x) := (1+x)^{\beta -\alpha}\), which is slowly varying at 0 since \(\lim_{x \rightarrow0} L_{g}(x) = 1\). Moreover, note that we can write
$$ g(x) = x^{\beta}K_{g}(x), \quad x \in(0,\infty), $$
where \(K_{g}(x):=(1+x^{-1})^{\beta-\alpha}\) satisfies \(\lim_{x \rightarrow\infty}K_{g}(x)=1\). Thus, also (A3) holds. We can check (A2) by computing
$$ \begin{aligned} g'(x) & = \frac{\alpha+\beta x}{x(1+x)} g(x), \\ g''(x) & = \bigg(\Big(\frac{\alpha+\beta x}{x(1+x)}\Big)^{2} + \frac {-\alpha-2\alpha x- \beta x^{2}}{x^{2}(1+x)^{2}}\bigg) g(x), \quad x \in (0,\infty), \end{aligned} $$
where \(-\alpha-2\alpha x- \beta x^{2} \rightarrow\infty\) when \(x \rightarrow\infty\) (as \(\beta< -\frac{1}{2}\)), so \(g'\) is ultimately increasing. Additionally, we note that
$$ g'(x)^{2} \leq(|\alpha|+|\beta|)^{2} g(x)^{2}, \quad x \in[1,\infty), $$
implying \(\int_{1}^{\infty}g'(x)^{2} \mathrm {d}x < \infty\) since \(g\) is square-integrable.
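As a numerical sanity check on Examples 2.3 and 2.4, the following sketch (with arbitrary admissible parameter values of our own choosing) verifies the square-integrability of both kernels by quadrature.

    import numpy as np
    from scipy.integrate import quad

    alpha, lam, beta = -0.25, 1.0, -1.0                         # arbitrary admissible parameters

    g_gamma = lambda x: x**alpha * np.exp(-lam * x)             # Example 2.3
    g_power = lambda x: x**alpha * (1.0 + x)**(beta - alpha)    # Example 2.4

    for name, g in (("gamma", g_gamma), ("power-law", g_power)):
        # Split at 1: g(x)^2 ~ x^(2*alpha) has an integrable singularity at 0.
        head, _ = quad(lambda x: g(x)**2, 0.0, 1.0)
        tail, _ = quad(lambda x: g(x)**2, 1.0, np.inf)
        print(f"{name:9s} kernel: int_0^infty g^2 = {head + tail:.6f}")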
Hybrid scheme
Let \(t \in \mathbb {R}\) and consider discretizing \(X_{t}\) based on its integral representation (2.1) on the grid \(\mathcal{G}^{n}_{t} := \{t,t-\frac {1}{n}, t-\frac{2}{n},\ldots\}\) for \(n \in \mathbb {N}\). To derive our discretization scheme, let us first note that if the volatility process \(\sigma\) does not vary too much, then it is reasonable to use the approximation
$$ X_{t} = \sum_{k = 1}^{\infty}\int_{t-\frac{k}{n}}^{t-\frac{k}{n}+\frac {1}{n}} g(t-s) \sigma_{s} \mathrm {d}W_{s} \approx\sum_{k = 1}^{\infty}\sigma _{t-\frac{k}{n}} \int_{t-\frac{k}{n}}^{t-\frac{k}{n}+\frac{1}{n}} g(t-s) \mathrm {d}W_{s}, $$
(2.3)
that is, we keep \(\sigma\) constant in each discretization cell. (Here, and in the sequel, “≈” stands for an informal approximation used for purely heuristic purposes.) If \(k\) is “small”, then due to (A1), we may approximate
$$ g(t-s) \approx(t-s)^{\alpha}L_{g}\bigg( \frac{k}{n} \bigg),\quad t-s \in \bigg[\frac{k-1}{n},\frac{k}{n}\bigg]\setminus\{ 0 \}, $$
(2.4)
as the slowly varying function \(L_{g}\) varies “less” than the power function \(y \mapsto y^{\alpha}\) near zero; cf. (2.2). If \(k\) is “large”, or at least \(k \geq2\), then choosing \(b_{k} \in [k-1,k]\) provides an adequate approximation
$$ g(t-s) \approx g\bigg(\frac{b_{k}}{n}\bigg),\quad t-s \in\bigg[\frac{k-1}{n},\frac{k}{n}\bigg], $$
(2.5)
by (A2). Applying (2.4) to the first \(\kappa\) terms, where \(\kappa= 1,2,\ldots{}\), and (2.5) to the remaining terms in the approximating series in (2.3) yields
$$\begin{aligned} \sum_{k = 1}^{\infty}\sigma_{t-\frac{k}{n}} \int_{t-\frac{k}{n}}^{t-\frac {k}{n}+\frac{1}{n}} g(t-s) \mathrm {d}W_{s} & \approx \sum_{k = 1}^{\kappa}L_{g} \bigg( \frac{k}{n}\bigg) \sigma_{t-\frac{k}{n}} \int_{t-\frac {k}{n}}^{t-\frac{k}{n}+\frac{1}{n}} (t-s)^{\alpha} \mathrm {d}W_{s} \\ &\quad{} + \sum_{k = \kappa+ 1}^{\infty}g\bigg(\frac{b_{k}}{n}\bigg) \sigma_{t-\frac{k}{n}} \int_{t-\frac{k}{n}}^{t-\frac{k}{n}+\frac{1}{n}} \mathrm {d}W_{s}. \end{aligned}$$
(2.6)
For completeness, we also allow \(\kappa= 0\), in which case we require that \(b_{1} \in(0,1]\) and interpret the first sum on the right-hand side of (2.6) as zero. To make numerical implementation feasible, we truncate the second sum on the right-hand side of (2.6) so that both sums have \(N_{n} \geq\kappa+1\) terms in total. Thus, we arrive at a discretization scheme for \(X_{t}\), which we call a hybrid scheme, given by
$$ X^{n}_{t} := \check{X}^{n}_{t} + \hat{X}^{n}_{t}, $$
where
$$\begin{aligned} \check{X}^{n}_{t} & := \sum_{k=1}^{\kappa}L_{g}\bigg( \frac{k}{n} \bigg) \sigma _{t-\frac{k}{n}} \int_{t-\frac{k}{n}}^{t-\frac{k}{n}+\frac{1}{n}} (t-s)^{\alpha} \mathrm {d}W_{s}, \end{aligned}$$
(2.7)
$$\begin{aligned} \hat{X}^{n}_{t} & := \sum_{k=\kappa+1}^{N_{n}} g\bigg(\frac{b_{k}}{n}\bigg)\sigma_{t- \frac{k}{n}}\big( W_{t-\frac{k}{n}+\frac{1}{n}} - W_{t-\frac {k}{n}}\big), \end{aligned}$$
(2.8)
and \(\mathbf{b}:=(b_{k})_{k=\kappa+1}^{\infty}\) is a sequence of real numbers (evaluation points) that must satisfy \(b_{k} \in[k-1,k]\setminus\{0\}\) for each \(k\geq\kappa+1\) but can otherwise be chosen freely.
As it stands, the discretization grid \(\mathcal{G}^{n}_{t}\) depends on the time \(t\), which may seem cumbersome with regard to sampling \(X^{n}_{t}\) simultaneously for different times \(t\). However, note that whenever times \(t\) and \(t'\) are separated by a multiple of \(\frac{1}{n}\), the corresponding grids \(\mathcal{G}^{n}_{t}\) and \(\mathcal{G}^{n}_{t'}\) will intersect. In fact, the hybrid scheme defined by (2.7) and (2.8) can be implemented efficiently, as we shall see in Sect. 3.1, below. Since
$$ g\bigg(\frac{b_{k}}{n}\bigg) = g\bigg(t-\Big(t-\frac{b_{k}}{n}\Big)\bigg), $$
the degenerate case \(\kappa=0\) with \(b_{k} = k\) for all \(k \geq1\) corresponds to the usual Riemann-sum discretization scheme of \(X_{t}\) with (Itô-type) forward sums, as in (2.8). Henceforth, we denote the associated sequence \((k)_{k=\kappa+1}^{\infty}\) by \(\mathbf{b}_{\mathrm{FWD}}\), where the subscript “\(\mathrm{FWD}\)” alludes to forward sums. However, including the terms (2.7) involving Wiener integrals of a power function, that is, having \(\kappa\geq1\), improves the accuracy of the discretization considerably, as we shall see. Having the leeway to select \(b_{k}\) within the interval \([k-1,k]\setminus\{ 0 \}\), so that the function \(g(t-\cdot)\) is evaluated at a point that does not necessarily belong to \(\mathcal {G}^{n}_{t}\), additionally leads to a moderate improvement.
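To make the scheme concrete, note that each Wiener integral in (2.7) is jointly Gaussian with the Brownian increment over the same cell, with moments given by the Itô isometry. The sketch below (our own Python illustration, anticipating Sect. 3.1; the function name cell_covariance and all parameter values are arbitrary) assembles the resulting \(2\times2\) covariance matrix and verifies it against simulated draws.

    import numpy as np

    def cell_covariance(k: int, alpha: float, n: int) -> np.ndarray:
        """Covariance matrix of (W increment, Wiener integral of (t-s)^alpha) over the
        k-th cell [t - k/n, t - k/n + 1/n], computed via the Ito isometry (u = t - s)."""
        var_dw = 1.0 / n
        cov = ((k / n)**(alpha + 1) - ((k - 1) / n)**(alpha + 1)) / (alpha + 1)
        var_int = ((k / n)**(2*alpha + 1) - ((k - 1) / n)**(2*alpha + 1)) / (2*alpha + 1)
        return np.array([[var_dw, cov], [cov, var_int]])

    # Sanity check for k = 1: empirical vs. exact covariance.
    rng = np.random.default_rng(0)
    Sigma = cell_covariance(1, -0.43, 100)
    Z = rng.multivariate_normal(np.zeros(2), Sigma, size=200_000)
    print(np.cov(Z.T))
    print(Sigma)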
The truncation in the sum (2.8) entails that the stochastic integral (2.1) defining \(X_{t}\) is truncated at \(t - \frac{N_{n}}{n}\). In practice, the value of the parameter \(N_{n}\) should be large enough to mitigate the effect of truncation. To ensure that the truncation point \(t - \frac{N_{n}}{n}\) tends to \(-\infty\) as \(n \rightarrow\infty\) in our asymptotic results, we introduce the following assumption:
-
(A4)
For some \(\gamma>0\),
$$\begin{aligned} N_{n} \sim n^{\gamma+ 1},\quad n \rightarrow\infty. \end{aligned}$$
Asymptotic behavior of mean square error
We are now ready to state our main theoretical result, which gives a sharp description of the asymptotic behavior of the mean square error (MSE) of the hybrid scheme as \(n \rightarrow\infty\). We defer the proof of this result to Sect. 4.2.
Theorem 2.5
Suppose that (A1)–(A4) hold with
$$\begin{aligned} \gamma> -\frac{2\alpha+1}{2\beta+1}, \end{aligned}$$
(2.9)
and that for some \(\delta>0\),
$$\begin{aligned} \mathbb{E}[|\sigma_{s} - \sigma_{0}|^{2}] = \mathcal{O}(s^{2\alpha+1+\delta}), \quad s\downarrow0. \end{aligned}$$
(2.10)
Then for all \(t\in\mathbb{R}\),
$$\begin{aligned} \mathbb{E}[|X_{t} - X^{n}_{t}|^{2}] \sim J(\alpha,\kappa,\mathbf{b}) \mathbb {E}[\sigma_{0}^{2}] n^{-(2\alpha+1)} L_{g}(1/n)^{2}, \quad n\rightarrow\infty, \end{aligned}$$
(2.11)
where
$$\begin{aligned} J(\alpha,\kappa,\mathbf{b}) := \sum_{k=\kappa+1}^{\infty} \int_{k-1}^{k} (y^{\alpha} - b_{k}^{\alpha})^{2} \mathrm {d}y < \infty. \end{aligned}$$
(2.12)
Remark 2.6
Note that if \(\alpha\in(-\frac{1}{2},0)\), then having
$$\begin{aligned} \mathbb{E}[|\sigma_{s} - \sigma_{0}|^{2}] = \mathcal{O}(s^{\theta}), \quad s\downarrow0, \end{aligned}$$
for all \(\theta\in(0,1)\), ensures that (2.10) holds. (Take, say, \(\delta:= \frac{1}{2}(1 - (2\alpha+1))>0\) and \(\theta:= 2\alpha+1 + \delta= \alpha+ 1 \in(0,1)\).)
When the hybrid scheme is used to simulate the \(\mathcal{BSS}\) process \(X\) on an equidistant grid \(\{0,\frac{1}{n},\frac{2}{n},\ldots,\frac {\lfloor nT \rfloor}{n} \}\) for some \(T>0\) (see Sect. 3.1 on the details of the implementation), the following consequence of Theorem 2.5 ensures that the covariance structure of the simulated process approximates that of the actual process \(X\).
Corollary 2.7
Suppose that the assumptions of Theorem 2.5 hold. Then for any \(s,t\in \mathbb {R}\) and \(\varepsilon>0\),
$$ |\mathbb {E}[X^{n}_{t} X^{n}_{s}]-\mathbb {E}[X_{t} X_{s}]| = \mathcal{O}(n^{-(\alpha+\frac{1}{2})+\varepsilon}), \quad n \rightarrow\infty. $$
Proof
Let \(s\), \(t\in \mathbb {R}\). Applying the Cauchy–Schwarz inequality, we get
$$ \begin{aligned} |\mathbb {E}[X^{n}_{t} X^{n}_{s}]-\mathbb {E}[X_{t} X_{s}]| & \leq \mathbb {E}[(X^{n}_{t})^{2}]^{1/2} \mathbb {E}[|X_{s}-X^{n}_{s}|^{2}]^{1/2} \\ & \quad{} + \mathbb {E}[X_{s}^{2}]^{1/2} \mathbb {E}[|X_{t}-X^{n}_{t}|^{2}]^{1/2}. \end{aligned} $$
We have \(\sup_{n \in \mathbb {N}} \mathbb {E}[(X^{n}_{t})^{2}]^{1/2} <\infty\) since \(\mathbb {E}[(X^{n}_{t})^{2}] \rightarrow \mathbb {E}[X_{t}^{2}]<\infty\) as \(n \rightarrow\infty\), by Theorem 2.5. Moreover, Theorem 2.5 and the bound (2.2) imply that we also have \(\mathbb {E}[|X_{s}-X^{n}_{s}|^{2}]^{1/2}=\mathcal{O}(n^{-(\alpha+\frac{1}{2})+\varepsilon})\) and \(\mathbb {E}[|X_{t}-X^{n}_{t}|^{2}]^{1/2}=\mathcal{O}(n^{-(\alpha+\frac {1}{2})+\varepsilon})\) for any \(\varepsilon>0\). □
In Theorem 2.5, the asymptotics of the MSE (2.11) are determined by the behavior of the kernel function \(g\) near zero, as specified in (A1). The condition (2.9) ensures that the error from approximating \(g\) near zero is asymptotically larger than the error induced by the truncation of the stochastic integral (2.1) at \(t - \frac{N_{n}}{n}\). In fact, a different kind of asymptotics of the MSE, where truncation error becomes dominant, could be derived when (2.9) does not hold, under some additional assumptions, but we do not pursue this direction in the present paper.
While the rate of convergence in (2.11) is fully determined by the roughness index \(\alpha\), which may seem discouraging at first, it turns out that the quantity \(J(\alpha,\kappa,\mathbf{b})\), which we call the asymptotic MSE, can vary a lot, depending on how we choose \(\kappa\) and \(\mathbf{b}\), and can have a substantial impact on the precision of the approximation of \(X\). It is immediate from (2.12) that increasing \(\kappa\) will decrease \(J(\alpha,\kappa ,\mathbf{b})\). Moreover, for given \(\alpha\) and \(\kappa\), it is straightforward to choose \(\mathbf{b}\) so that \(J(\alpha,\kappa,\mathbf {b})\) is minimized, as shown in the following result.
Proposition 2.8
Let \(\alpha\in(-\frac{1}{2},\frac{1}{2})\setminus\{ 0\}\) and \(\kappa\geq0\). Among all sequences \(\mathbf{b}=(b_{k})_{k=\kappa+1}^{\infty}\) with \(b_{k} \in[k-1,k]\setminus\{0 \}\) for \(k \geq\kappa+1\), the function \(J(\alpha,\kappa,\mathbf{b})\), and consequently the asymptotic MSE induced by the discretization, is minimized by the sequence \(\mathbf{b}^{*}\) given by
$$\begin{aligned} b_{k}^{*} = \bigg( \frac{k^{\alpha+1} - (k-1)^{\alpha+1}}{\alpha+1}\bigg)^{1/\alpha}, \quad k\geq\kappa+1. \end{aligned}$$
Proof
Clearly, a sequence \(\mathbf{b}=(b_{k})_{k=\kappa+1}^{\infty}\) minimizes the function \(J(\alpha,\kappa,\mathbf{b})\) if and only if \(b_{k}\) minimizes \(\int_{k-1}^{k} (y^{\alpha}-b_{k}^{\alpha})^{2} \mathrm {d}y\) for any \(k \geq \kappa+1\). By standard \(L^{2}\)-space theory, \(c \in \mathbb {R}\) minimizes the integral \(\int_{k-1}^{k} (y^{\alpha}-c)^{2} \mathrm {d}y\) if and only if the function \(y \mapsto y^{\alpha}- c\) is orthogonal in \(L^{2}\) to all constant functions. This is tantamount to
$$ \int_{k-1}^{k} (y^{\alpha}- c) \mathrm {d}y = 0, $$
and computing the integral and solving for \(c\) yields
$$ c = \frac{k^{\alpha+1}- (k-1)^{\alpha+1}}{\alpha+ 1}. $$
Setting \(b^{*}_{k} := c^{1/\alpha} \in(k-1,k)\) completes the proof. □
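The optimal points have a simple closed form and are cheap to precompute; the short sketch below (the function name b_star is our own) evaluates them and illustrates the approximation \(b^{*}_{k} \approx k - \frac{1}{2}\) invoked in Remark 2.9 below.

    import numpy as np

    def b_star(alpha: float, k: np.ndarray) -> np.ndarray:
        """Optimal evaluation points of Proposition 2.8."""
        return ((k**(alpha + 1) - (k - 1)**(alpha + 1)) / (alpha + 1))**(1.0 / alpha)

    k = np.arange(2, 8, dtype=float)
    print(b_star(-0.43, k))     # approaches k - 1/2 from below as k grows
    print(k - 0.5)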
To understand how much increasing \(\kappa\) and using the optimal sequence \(\mathbf{b}^{*}\) from Proposition 2.8 improves the approximation, we study numerically the asymptotic root mean square error (RMSE) \(\sqrt{J(\alpha,\kappa,\mathbf{b})}\). In particular, we assess how much the asymptotic RMSE decreases relative to the RMSE of the forward Riemann-sum scheme (\(\kappa=0\) and \(\mathbf {b} = \mathbf{b}_{\mathrm{FWD}}\)) by using the quantity
$$\begin{aligned} \textrm{reduction in asymptotic RMSE} = - \frac{\sqrt{J(\alpha,\kappa ,\mathbf{b})}-\sqrt{J(\alpha,0,\mathbf{b}_{\mathrm{FWD}})}}{\sqrt {J(\alpha,0,\mathbf{b}_{\mathrm{FWD}})}} \cdot100\%. \end{aligned}$$
(2.13)
The results are presented in Fig. 1. We find that employing the hybrid scheme with \(\kappa\geq1\) leads to a substantial reduction in the asymptotic RMSE relative to the forward Riemann-sum scheme when \(\alpha\in(-\frac{1}{2},0)\). Indeed, when \(\kappa\geq 1\), the asymptotic RMSE, as a function of \(\alpha\), does not blow up as \(\alpha\rightarrow-\frac{1}{2}\), while with \(\kappa= 0\) it does. This explains why the reduction in the asymptotic RMSE approaches \(100\% \) as \(\alpha\rightarrow-\frac{1}{2}\). When \(\alpha\in(0,\frac {1}{2})\), the improvement achieved using the hybrid scheme is more modest, but still considerable. Figure 1 also highlights the importance of using the optimal sequence \(\mathbf{b}^{*}\), instead of \(\mathbf{b}_{\mathrm{FWD}}\), as evaluation points in the scheme, in particular when \(\alpha\in(0,\frac{1}{2})\). Finally, we observe that increasing \(\kappa\) beyond 2 does not appear to lead to a significant further reduction. Indeed, in our numerical experiments, reported in Sects. 3.2 and 3.3 below, we observe that using \(\kappa= 1,2\) already leads to good results.
Remark 2.9
It is non-trivial to evaluate the quantity \(J(\alpha,\kappa,\mathbf{b})\) numerically. Computing the integral in (2.12) explicitly, we can approximate \(J(\alpha,\kappa,\mathbf{b})\) by
$$ J_{N}(\alpha,\kappa,\mathbf{b}) := \sum_{k=\kappa+1}^{N} \bigg( \frac {k^{2\alpha+1}-(k-1)^{2\alpha+1}}{2\alpha+1}-\frac{2 b^{\alpha}_{k}(k^{\alpha+1}-(k-1)^{\alpha+1})}{\alpha+1} + b^{2\alpha}_{k} \bigg) $$
with some large \(N\in \mathbb {N}\). This approximation is adequate when \(\alpha \in(-\frac{1}{2},0)\), but its accuracy deteriorates when \(\alpha \rightarrow\frac{1}{2}\). In particular, the singularity of the function \(\alpha\mapsto J(\alpha,\kappa,\mathbf{b})\) at \(\frac{1}{2}\) is difficult to capture using \(J_{N}(\alpha,\kappa,\mathbf{b})\) with numerically feasible values of \(N\). To overcome this numerical problem, we introduce a correction term in the case \(\alpha\in(0,\frac {1}{2})\). The correction term can be derived informally as follows. By the mean value theorem, and since \(b^{*}_{k} \approx k-\frac{1}{2}\) for large \(k\), we have
$$ (y^{\alpha}- b^{\alpha}_{k})^{2} = \alpha^{2} \xi^{2\alpha-2 }(y - b_{k})^{2} \approx \textstyle\begin{cases} \alpha^{2} k^{2\alpha-2 }(y - k)^{2}, & \quad \mathbf{b} = \mathbf{b}_{\mathrm{FWD}},\\ \alpha^{2} k^{2\alpha-2 }(y - k + \frac{1}{2})^{2}, & \quad \mathbf{b} = \mathbf{b}^{*}, \end{cases} $$
where \(\xi= \xi(y,b_{k}) \in[k-1,k]\), for large \(k\). Thus, for large \(N\), we obtain
$$\begin{aligned} \begin{aligned} &J(\alpha,\kappa,\mathbf{b}) - J_{N}(\alpha,\kappa,\mathbf{b}) \\ &\quad= \sum_{k=N+1}^{\infty}\int_{k-1}^{k} (y^{\alpha} - b_{k}^{\alpha})^{2}\mathrm {d}y \\ &\quad \approx \textstyle\begin{cases} \alpha^{2} \sum_{k=N+1}^{\infty}k^{2\alpha-2 } \int_{k-1}^{k} (y - k)^{2} \mathrm {d}y, \quad & \mathbf{b} = \mathbf{b}_{\mathrm{FWD}},\\ \alpha^{2} \sum_{k=N+1}^{\infty}k^{2\alpha-2 }\int_{k-1}^{k} (y - k + \frac {1}{2})^{2} \mathrm {d}y, \quad & \mathbf{b} = \mathbf{b}^{*}, \end{cases}\displaystyle \\ &\quad = \textstyle\begin{cases} \frac{\alpha^{2}}{3} \zeta(2-2\alpha,N+1), \quad & \mathbf{b} = \mathbf{b}_{\mathrm{FWD}},\\ \frac{\alpha^{2}}{12} \zeta(2-2\alpha,N+1), \quad & \mathbf{b} = \mathbf{b}^{*}, \end{cases}\displaystyle \end{aligned} \end{aligned}$$
where \(\zeta(x,s) := \sum_{k=0}^{\infty}\frac{1}{(k+s)^{x}}\), \(x>1\), \(s > 0\), is the Hurwitz zeta function, which can be evaluated using accurate numerical algorithms.
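A possible numerical implementation of this recipe (our own sketch; the function name J_approx, the truncation point \(N\) and the parameter values are illustrative, and scipy.special.zeta evaluates the Hurwitz zeta function) is the following, which also reports the reduction (2.13) for \(\kappa=1\) and \(\mathbf{b}=\mathbf{b}^{*}\).

    import numpy as np
    from scipy.special import zeta          # zeta(x, q) is the Hurwitz zeta function

    def J_approx(alpha: float, kappa: int, b: np.ndarray, N: int, tail: float) -> float:
        """J_N from Remark 2.9 plus the zeta tail correction for alpha in (0, 1/2);
        b[j] holds b_{kappa+1+j}; tail = 1/3 for b_FWD and 1/12 for b^*."""
        k = np.arange(kappa + 1, N + 1, dtype=float)
        J_N = np.sum((k**(2*alpha + 1) - (k - 1)**(2*alpha + 1)) / (2*alpha + 1)
                     - 2 * b**alpha * (k**(alpha + 1) - (k - 1)**(alpha + 1)) / (alpha + 1)
                     + b**(2*alpha))
        correction = alpha**2 * tail * zeta(2 - 2*alpha, N + 1) if alpha > 0 else 0.0
        return J_N + correction

    alpha, N = 0.25, 10_000
    k = np.arange(2, N + 1, dtype=float)                     # kappa = 1
    b_opt = ((k**(alpha + 1) - (k - 1)**(alpha + 1)) / (alpha + 1))**(1.0 / alpha)
    J_fwd = J_approx(alpha, 0, np.arange(1.0, N + 1), N, tail=1/3)
    J_opt = J_approx(alpha, 1, b_opt, N, tail=1/12)
    print(f"reduction (2.13): {100 * (1 - np.sqrt(J_opt / J_fwd)):.1f}%")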
Remark 2.10
Unlike the Fourier-based method of Benth et al. [13], the hybrid scheme does not require truncating the singularity of the kernel function \(g\) when \(\alpha\in(-\frac {1}{2},0)\), which is beneficial to maintaining the accuracy of the scheme when \(\alpha\) is near \(-\frac{1}{2}\). Let us briefly analyze the effect on the approximation error of truncating the singularity of \(g\); cf. [13, pp. 75–76]. Consider for any \(\varepsilon>0\) the modified \(\mathcal{BSS}\) process
$$ \tilde{X}^{\varepsilon}_{t} := \int_{-\infty}^{t} g_{\varepsilon}(t-s) \sigma_{s} \mathrm {d}W_{s}, \quad t \in \mathbb {R}, $$
defined using the truncated kernel function
$$ g_{\varepsilon}(x) := \textstyle\begin{cases} g(\varepsilon), \quad & x \in(0,\varepsilon],\\ g(x), \quad & x \in(\varepsilon,\infty). \end{cases} $$
Adapting the proof of Theorem 2.5 in a straightforward manner, it is possible to show that under (A1) and (A3),
$$ \begin{aligned} \mathbb{E}[|X_{t}-\tilde{X}^{\varepsilon}_{t}|^{2}] & = \mathbb {E}[\sigma_{0}^{2}] \int _{0}^{\varepsilon}\big(g(s) - g(\varepsilon)\big)^{2} \mathrm {d}s \\ & \sim \underbrace{\bigg( \frac{1}{2\alpha+1}-\frac{2}{\alpha +1}+1\bigg)}_{=:\tilde{J}(\alpha)} \mathbb {E}[\sigma_{0}^{2}] \varepsilon^{2\alpha +1} L_{g}(\varepsilon)^{2}, \quad\varepsilon\downarrow0, \end{aligned} $$
for any \(t \in \mathbb {R}\). While the rate of convergence, as \(\varepsilon \downarrow0\), of the MSE that arises from replacing \(g\) with \(g_{\varepsilon}\) is analogous to the rate of convergence of the hybrid scheme, it is important to note that the factor \(\tilde{J}(\alpha)\) blows up as \(\alpha\downarrow-\frac{1}{2}\). In fact, \(\tilde{J}(\alpha )\) is equal to the first term in the series that defines \(J(\alpha,0,\mathbf{b}_{\mathrm{FWD}})\) and
$$ \tilde{J}(\alpha) \sim J(\alpha,0,\mathbf{b}_{\mathrm{FWD}}), \quad \alpha\downarrow-\frac{1}{2}, $$
which indicates that the effect of truncating the singularity, in terms of MSE, is similar to the effect of using the forward Riemann-sum scheme to discretize the process when \(\alpha\) is near \(-\frac{1}{2}\). In particular, the truncation threshold \(\varepsilon\) would then have to be very small in order to keep the truncation error in check.
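A quick numerical check (reusing the closed-form summands of \(J_{N}\) from Remark 2.9; the truncation point \(N = 200{,}000\) is an arbitrary choice, adequate here since \(\alpha< 0\)) illustrates how the ratio \(\tilde{J}(\alpha)/J(\alpha,0,\mathbf{b}_{\mathrm{FWD}})\) tends to 1 as \(\alpha\downarrow-\frac{1}{2}\).

    import numpy as np

    for alpha in (-0.30, -0.40, -0.45, -0.49):
        J_tilde = 1/(2*alpha + 1) - 2/(alpha + 1) + 1       # truncation-error constant
        k = np.arange(1, 200_001, dtype=float)              # b_k = k (forward scheme)
        J_fwd = np.sum((k**(2*alpha + 1) - (k - 1)**(2*alpha + 1)) / (2*alpha + 1)
                       - 2 * k**alpha * (k**(alpha + 1) - (k - 1)**(alpha + 1)) / (alpha + 1)
                       + k**(2*alpha))
        print(f"alpha = {alpha:+.2f}: J_tilde/J = {J_tilde / J_fwd:.4f}")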
Extension to truncated Brownian semistationary processes
It is useful to extend the hybrid scheme to a class of non-stationary processes that are closely related to \(\mathcal{BSS}\) processes. This extension is important in connection with an application to the so-called rough Bergomi model which we discuss in Sect. 3.3 below. More precisely, we consider processes of the form
$$ Y_{t} = \int_{0}^{t} g(t-s) \sigma_{s} \mathrm {d}W_{s}, \quad t \geq0, $$
(2.14)
where the kernel function \(g\), volatility process \(\sigma\) and driving Brownian motion \(W\) are as before. We call \(Y\) a truncated Brownian semistationary (\(\mathcal{TBSS}\)) process, as \(Y\) is obtained from the \(\mathcal{BSS}\) process \(X\) by truncating the stochastic integral in (2.1) at 0. Of the preceding assumptions, only (A1) and (A2) are needed to ensure that the stochastic integral in (2.14) exists—in fact, of (A2), only the requirement that \(g\) is differentiable on \((0,\infty)\) comes into play.
The \(\mathcal{TBSS}\) process \(Y\) does not have covariance-stationary increments, so we define its (time-dependent) variogram as
$$ V_{Y}(h,t) := \mathbb {E}[|Y_{t+h}-Y_{t}|^{2}], \quad h,t \geq0. $$
Extending Proposition 2.2, we can describe the behavior of \(h \mapsto V_{Y}(h,t)\) near zero as follows. The existence of a locally Hölder-continuous modification is then a straightforward consequence. We omit the proof of this result, as it would be a straightforward adaptation of the proof of Proposition 2.2.
Proposition 2.11
Suppose that (A1) and (A2) hold.
-
(i)
The variogram of \(Y\) satisfies, for any \(t \geq0\),
$$ V_{Y}(h,t) \sim \mathbb {E}[\sigma_{0}^{2}] \bigg(\frac{1}{2\alpha+ 1} + \mathbf{1}_{(0,\infty)}(t)\int_{0}^{\infty}\big( (y+1)^{\alpha}- y^{\alpha}\big)^{2} \mathrm {d}y\bigg) h^{2\alpha+1} L_{g}(h)^{2}, \quad h \rightarrow0, $$
which implies that \(h \mapsto V_{Y}(h,t)\) is regularly varying at zero with index \(2\alpha+1\).
-
(ii)
The process \(Y\) has a modification with locally \(\phi\)-Hölder-continuous trajectories for any \(\phi\in(0,\alpha+ \frac{1}{2})\).
Note that while the increments of \(Y\) are not covariance-stationary, the asymptotic behavior of \(V_{Y}(h,t)\) is the same as that of \(V_{X}(h)\) as \(h \rightarrow0\) (cf. Proposition 2.2) for any \(t>0\). Thus, the increments of \(Y\) (apart from increments starting at time 0) are locally like the increments of \(X\).
We define the hybrid scheme to discretize \(Y_{t}\), for any \(t \geq0\), as
$$ Y^{n}_{t} := \check{Y}^{n}_{t} + \hat{Y}^{n}_{t}, $$
(2.15)
where
$$\begin{aligned} \check{Y}^{n}_{t} & := \sum_{k=1}^{\min\{\lfloor nt \rfloor, \kappa\}} L_{g}\bigg( \frac{k}{n} \bigg) \sigma_{t-\frac{k}{n}} \int_{t-\frac {k}{n}}^{t-\frac{k}{n}+\frac{1}{n}} (t-s)^{\alpha} \mathrm {d}W_{s}, \\ \hat{Y}^{n}_{t} & := \sum_{k=\kappa+1}^{\lfloor nt \rfloor} g\bigg(\frac {b_{k}}{n}\bigg)\sigma_{t- \frac{k}{n}} \big( W_{t-\frac{k}{n}+\frac {1}{n}} - W_{t-\frac{k}{n}}\big). \end{aligned}$$
In effect, we simply drop the summands in (2.7) and (2.8) that correspond to integrals and increments on the negative real line. We make remarks on the implementation of this scheme in Sect. 3.1 below.
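To make the construction concrete, here is a minimal Python sketch of the scheme (2.15), written for the special case \(\kappa= 1\), \(\sigma\equiv1\) and the gamma kernel of Example 2.3, with the optimal evaluation points of Proposition 2.8. The function hybrid_tbss and all parameter values are our own illustrative choices; the efficient implementation, including stochastic volatility, is the subject of Sect. 3.1.

    import numpy as np

    def hybrid_tbss(alpha: float, lam: float, n: int, T: float, seed: int = 0) -> np.ndarray:
        """Hybrid scheme (2.15) with kappa = 1, sigma = 1 and the gamma kernel
        g(x) = x^alpha * exp(-lam * x); a plain sketch, not an optimized implementation."""
        M = int(np.floor(n * T))                     # number of grid steps
        rng = np.random.default_rng(seed)

        # Joint Gaussian law of (Delta W, int (t-s)^alpha dW) over one cell (k = 1),
        # with moments from the Ito isometry (cf. the covariance sketch in Sect. 2).
        c1 = (1.0 / n)**(alpha + 1) / (alpha + 1)
        Sigma = np.array([[1.0 / n, c1],
                          [c1, (1.0 / n)**(2*alpha + 1) / (2*alpha + 1)]])
        Z = rng.multivariate_normal(np.zeros(2), Sigma, size=M)
        dW, dWI = Z[:, 0], Z[:, 1]                   # dWI[j]: Wiener integral over [j/n, (j+1)/n]

        # Riemann part: weights g(b_k^*/n) for lags k = 2, ..., M.
        k = np.arange(2, M + 1, dtype=float)
        b = ((k**(alpha + 1) - (k - 1)**(alpha + 1)) / (alpha + 1))**(1.0 / alpha)
        w = np.zeros(M + 1)
        w[2:] = (b / n)**alpha * np.exp(-lam * b / n)

        hat = np.convolve(w, dW)                     # hat[i] = sum_{k=2}^{i} w[k] * dW[i-k]
        Y = np.zeros(M + 1)
        Y[1:] = np.exp(-lam / n) * dWI + hat[1:M + 1]   # L_g(1/n) = exp(-lam/n) here
        return Y                                     # Y[i] approximates Y at time i/n

    Y = hybrid_tbss(alpha=-0.43, lam=1.0, n=500, T=1.0)

Incorporating a nonconstant volatility path would amount to multiplying the increments dW (and the Wiener integrals dWI) elementwise by the corresponding values of \(\sigma\) before the convolution, in line with (2.15).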
The MSE of the hybrid scheme for the \(\mathcal{TBSS}\) process \(Y\) has the following asymptotic behavior as \(n \rightarrow\infty\), which is in fact identical to the asymptotic behavior of the MSE of the hybrid scheme for \(\mathcal{BSS}\) processes. We omit the proof of this result, which would be a simple modification of the proof of Theorem 2.5.
Theorem 2.12
Suppose that (A1) and (A2) hold, and that for some \(\delta>0\),
$$\begin{aligned} \mathbb{E}[|\sigma_{s} - \sigma_{0}|^{2}] = \mathcal{O}(s^{2\alpha+1+\delta}), \quad s\downarrow0. \end{aligned}$$
Then for all \(t>0\),
$$\begin{aligned} \mathbb{E}[|Y_{t} - Y^{n}_{t}|^{2}] \sim J(\alpha,\kappa,\mathbf{b}) \mathbb {E}[\sigma_{0}^{2}] n^{-(2\alpha+1)} L_{g}(1/n)^{2}, \quad n\rightarrow\infty, \end{aligned}$$
where \(J(\alpha,\kappa,\mathbf{b})\) is as in Theorem 2.5.
Remark 2.13
Under the assumptions of Theorem 2.12, the conclusion of Corollary 2.7 holds mutatis mutandis. In particular, the covariance structure of the discretized \(\mathcal{TBSS}\) process approaches that of \(Y\) when \(n \rightarrow\infty\).