Abstract
A new approach for modeling lead–lag relationships in high-frequency financial markets is proposed. The model accommodates non-synchronous trading and market microstructure noise as well as intraday variations of lead–lag relationships, which are essential for empirical applications. A simple statistical methodology for analyzing the proposed model is also presented. The methodology is illustrated by an empirical study that detects lead–lag relationships between the S&P 500 index and its two derivative products.
1 Introduction
A major challenge in high-frequency financial econometrics is measuring lead–lag relationships, wherein one asset is correlated with another asset with a delay. Two assets typically exhibit a lead–lag relationship when they reflect new information at different speeds. A prominent example is the lead–lag relationship between the cash and futures markets, wherein the latter leads the former (see, e.g., Kawaller et al. 1987; de Jong and Nijman 1997; Huth and Abergel 2014). Lead–lag relationships are also known as a source of the so-called Epps effect (see Renò 2003).
There have been several attempts to model and analyze lead–lag relationships in high-frequency data. One approach is to utilize classical discrete time-series analysis such as lead–lag regression (Kawaller et al. 1987), cointegration (Hasbrouck 1995), cross-correlation analysis (de Jong and Nijman 1997), and so on. Meanwhile, Hoffmann et al. (2013) have introduced a continuous-time model to describe lead–lag relationships. A related model has also been studied in Robert and Rosenbaum (2010) using random matrix theory and in Ito and Sakemoto (2020) using multinomial dynamic time warping. Other approaches to investigating lead–lag relationships in a continuous-time framework include Hawkes process-based models (Bacry et al. 2013; Da Fonseca and Zaatour 2015), a wavelet-based method (Hayashi and Koike 2018), and a multi-asset lagged adjustment model (Buccheri et al. 2020). Several empirical approaches have been proposed as well; see Pomponio and Abergel (2013) and Dobrev and Schaumburg (2016), for example.
In this study, we use the model of Hoffmann et al. (2013), which we call the HRY model, as a baseline. In the HRY model, the lead–lag relationship is modeled by a pair of non-synchronously observed semimartingales where one is observed with a delay relative to the other. Empirical applications of the HRY model are found in Alsayed and McGroarty (2014); Huth and Abergel (2014); Ceron et al. (2016); Bollen et al. (2017). We extend this model in three directions. First, in high-frequency financial econometrics, it is well recognized that observed prices are subject to market frictions which cause many problems in statistical inference as the observation frequency increases (see, e.g., Hansen and Lunde 2006). Therefore, for ultra-high-frequency financial data, the observed prices are typically modeled as semimartingales contaminated by noise (called the microstructure noise) rather than pure semimartingales. We thus introduce microstructure noise into the HRY model. We remark that there are a number of studies on volatility/covariation estimation for noisy semimartingales; see Chapter 7 of Aït-Sahalia and Jacod (2014), Shephard and Xiu (2017) and references therein. Second, it is well documented that intraday seasonal effects are an important factor of high-frequency financial data (see, e.g., Andersen and Bollerslev 1997; Bibinger et al. 2019; Ozturk et al. 2017). This motivates us to suppose that the time-lag is heterogeneous rather than constant over the course of the day. In fact, Huth (2012) has reported the existence of intraday variations of lead–lag relationships in financial markets. For these reasons, we introduce a heterogeneous time-lag into the HRY model. Third, because of the low-latency responses of high-frequency traders in recent financial markets (cf. Hasbrouck and Saar 2013), we may expect that the time-lag is quite small, so that it is comparable with the sampling frequency. To take account of this fact explicitly in our model, we consider local asymptotics in which the time-lag shrinks as the sampling frequency increases. In econometrics, local asymptotics is a standard technique to make asymptotic theories more realistic in finite samples. Prime examples are studies on models with nearly unit roots (e.g., Phillips 1987; Phillips and Magdalinos 2007) and weak identification problems (e.g., Andrews and Cheng 2012). Examples in high-frequency financial econometrics include volatility estimation in the presence of round-off errors (Li and Mykland 2015; Rosenbaum 2009; Robert and Rosenbaum 2011; Li et al. 2018), small jump analysis (Li 2013) and inference under small microstructure noise (Kurisu 2018; Rosenbaum 2011).
Under the proposed model, we develop a statistical methodology to investigate the lead–lag relationships. The methodology is based on an idea completely different from that of Hoffmann et al. (2013) and enables us to flexibly analyze time-varying lead–lag relationships. In particular, we establish the asymptotic distribution theory for the proposed methodology, which allows us to discuss the statistical significance of the results obtained with our methodology. We note that the asymptotic distribution theory for the estimator proposed in Hoffmann et al. (2013) is not straightforward; see Sect. 3.3 of Hoffmann et al. (2013) for details.
This paper is organized as follows. In Sect. 2, we introduce the model used in this study. In Sects. 3–5, we develop statistical methodologies to analyze the global and local behaviors of the lead–lag relationships over the day. We assess the finite sample performances of the proposed methodologies by a Monte Carlo experiment in Sect. 6, while we provide an empirical illustration of our approach in Sect. 7. All the proofs are collected in the Appendix.
2 Model
We assume that \(X=(X^1,X^2)\) is a bivariate continuous Itô semimartingale defined on a stochastic basis \(\mathcal {B}=(\Omega ,\mathcal {F},(\mathcal {F}_t)_{t\ge 0},P)\), which is of the form:
\[X_t=X_0+\int _0^tb_s\mathrm {d}s+\int _0^t\Sigma _s^{1/2}\mathrm {d}W_s,\qquad t\ge 0,\]
where \(W_s\) is a bivariate standard Wiener process on \(\mathcal {B}\), \(b_s\) is a bivariate càdlàg \((\mathcal {F}_t)\)-adapted process, and \(\Sigma _s\) is a \(2\times 2\) positive semidefinite symmetric matrix-valued càdlàg \((\mathcal {F}_t)\)-adapted process. We assume \(\int _0^1\Sigma ^{12}_s\mathrm {d}s\ne 0\) a.s. We observe X on the interval [0, 1], and denote by \((t^p_i)_{i\ge 0}\) the observation times for \(X^p\) for \(p=1,2\). We assume that \(t^p_i\)’s are \((\mathcal {F}_t)\)-stopping times satisfying \(t^p_i\uparrow \infty\) as \(i\rightarrow \infty\). We also assume that they implicitly depend on a parameter \(n\in \mathbb {N}\) representing the observation frequency and satisfy:
\[r_n(t):=\max _{p=1,2}\max _{i\ge 0}\left( t^p_i\wedge t-t^p_{i-1}\wedge t\right) \rightarrow ^p0\]
as \(n\rightarrow \infty\) for any \(t>0\) with \(t^p_{-1}:=0\) for \(p=1,2\). Here, the notation \(\rightarrow ^p\) denotes convergence in probability.
For each \(p=1,2\), the observation \(Y^p_i\) of \(X^p\) at the observation time \(t^p_i\) is given by:
\[Y^p_i=X^p_{(t^p_i-\vartheta ^p(t^p_i))_+}+\epsilon ^p_i\]
for \(i=0,1,\dots\). Here, \(\epsilon ^p_i\)s are measurement errors which are referred to as the microstructure noise in financial econometrics, while \(\vartheta ^p(t)\) denotes a latency of incorporating new information on the efficient log price \(X^p\) into the observed price at the time t. We will assume that \((\vartheta ^p(t))_{t\ge 0}\) is a stochastic process adapted to \((\mathcal {F}_t)\). For high-frequency financial data, we may expect that the latency is small, so that it is comparable with the sampling frequency. For this reason, we consider the local asymptotics, such that \(\vartheta ^p\equiv \vartheta ^p_n=n^{-\alpha }c^p_\vartheta\) for some \(\alpha >\frac{1}{2}\) and some nonnegative-valued process \((c^p_\vartheta (t))_{t\ge 0}\).
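The observation scheme described above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's simulation design: it assumes zero drift, constant unit volatilities, equidistant observation times, a constant latency for the second asset only, i.i.d. Gaussian noise, and a delayed sampling time truncated at zero. The function name `simulate_lagged_pair` and all parameters are hypothetical.

```python
import math
import random


def simulate_lagged_pair(n_obs, lag_steps, rho, noise_sd, fine=10, seed=0):
    """Simulate noisy observations Y^1, Y^2 on [0, 1] with asset 2 delayed.

    Illustrative assumptions: zero drift, unit volatilities, correlation rho,
    observation times t_i = i / n_obs, zero latency for asset 1, and a latency
    of lag_steps fine-grid steps for asset 2.
    """
    rng = random.Random(seed)
    n_fine = n_obs * fine                      # fine grid for the latent path
    dt = 1.0 / n_fine
    x1, x2 = [0.0], [0.0]
    for _ in range(n_fine):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        x1.append(x1[-1] + math.sqrt(dt) * z1)
        x2.append(x2[-1] + math.sqrt(dt) * z2)

    y1, y2 = [], []
    for i in range(n_obs + 1):
        idx = i * fine                         # index of t_i on the fine grid
        delayed = max(idx - lag_steps, 0)      # (t_i - latency)_+ for asset 2
        y1.append(x1[idx] + rng.gauss(0.0, noise_sd))
        y2.append(x2[delayed] + rng.gauss(0.0, noise_sd))
    return y1, y2
```

With `lag_steps > 0` the second series reflects the latent price of the second asset with a delay, so the first asset leads in the sense of the text.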
We are interested in the process \(\vartheta _n(t):=\vartheta _n^2(t)-\vartheta _n^1(t)\), \(t\in [0,1]\), which we refer to as the spot lead–lag time. If \(\vartheta _n(t)>0\) for some \(t\in [0,1]\), the second asset’s latency is larger than the first asset’s, so the first asset leads the second asset at time t and the size of the time-lag is equal to \(|\vartheta _n(t)|\). The converse holds true if \(\vartheta _n(t)<0\). Before considering the direct estimation of \(\vartheta _n(t)\), which is discussed in Sect. 5, in the next section we construct estimators for the following processes:
\[L^{n,1}_t=\int _0^t\Sigma ^{12}_s\sin \left( \frac{\pi }{h_n}\vartheta _n(s)\right) \mathrm {d}s,\qquad L^{n,2}_t=\int _0^t\Sigma ^{12}_s\cos \left( \frac{\pi }{h_n}\vartheta _n(s)\right) \mathrm {d}s,\]
where \(h_n\) is a tuning parameter introduced in the next section. Note that we have \(L^{n,2}_1\rightarrow \int _0^1\Sigma _s^{12}\mathrm {d}s\) a.s. as \(n\rightarrow \infty\); hence, \(L^{n,2}_1\) is asymptotically non-zero, because we assume \(\int _0^1\Sigma ^{12}_s\mathrm {d}s\ne 0\) a.s. Now, if \(\vartheta _n(t)\) does not depend on t, we have \(\vartheta _n\equiv (h_n/\pi )\arctan (L^{n,1}_1/L^{n,2}_1)=:\mathbf {SLL}_n\), so we can construct an estimator for \(\vartheta _n\) by plugging the estimators for \(L^{n,1}_1\) and \(L^{n,2}_1\) into \(\mathbf {SLL}_n\). Even if \(\vartheta _n(t)\) depends on t, the quantity \(\mathbf {SLL}_n\) remains meaningful as an index to capture an averaged behavior of the process \(\vartheta _n(t)\) on the interval [0, 1]. We call the variable \(\mathbf {SLL}_n\) the spectral lead–lag index and will use it as a descriptive statistic for assessing lead–lag relationships in our empirical study.
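As a numerical sanity check on the index, the sketch below evaluates \((h_n/\pi )\arctan (L^{n,1}_1/L^{n,2}_1)\) by a midpoint Riemann sum. It assumes that \(L^{n,1}_1\) and \(L^{n,2}_1\) are the integrals over [0, 1] of \(\Sigma ^{12}_s\sin (\pi \vartheta _n(s)/h_n)\) and \(\Sigma ^{12}_s\cos (\pi \vartheta _n(s)/h_n)\), which is consistent with the arctan representation used in Sect. 5. The function name and the example inputs are hypothetical.

```python
import math


def spectral_lead_lag_index(sigma12, theta, h, grid=1000):
    """Midpoint-rule evaluation of (h / pi) * arctan(L1 / L2) on [0, 1].

    sigma12 and theta are functions of time s in [0, 1]; h is the block
    length. The integrands are an assumption stated in the lead-in.
    """
    l1 = l2 = 0.0
    for j in range(grid):
        s = (j + 0.5) / grid
        phase = math.pi * theta(s) / h
        l1 += sigma12(s) * math.sin(phase) / grid
        l2 += sigma12(s) * math.cos(phase) / grid
    return (h / math.pi) * math.atan(l1 / l2)


h = 0.01
# Constant lag with |theta| < h / 2: the index recovers the lag itself.
const = spectral_lead_lag_index(lambda s: 0.5 + s, lambda s: 0.002, h)
# Time-varying lag: the index returns an averaged value of the lag.
varying = spectral_lead_lag_index(lambda s: 1.0, lambda s: 0.001 * s, h)
```

In the second case the lag increases linearly from 0 to 0.001 while the covariance is constant, and the index sits at the midpoint of that range.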
Since the variables \(\mathbf {SLL}_n\) tend to zero as \(n\rightarrow \infty\), the estimation of \(\mathbf {SLL}_n\) is only meaningful under a requirement stronger than the usual consistency property. More precisely, since the variables \(\mathbf {SLL}_n\) tend to zero as fast as \(n^{-\alpha }\) in the sense that \(n^\alpha \mathbf {SLL}_n\rightarrow ^p\int _0^1(c^2_\vartheta (s)-c^1_\vartheta (s))\Sigma ^{12}_s\mathrm {d}s/\int _0^1\Sigma ^{12}_s\mathrm {d}s\) as \(n\rightarrow \infty\), a sequence \(\widehat{\mathbf {SLL}}_n\) of estimators for \(\mathbf {SLL}_n\) provides a meaningful estimation result if and only if:
\[n^\alpha \left( \widehat{\mathbf {SLL}}_n-\mathbf {SLL}_n\right) \rightarrow ^p0 \tag{2.2}\]
as \(n\rightarrow \infty\). In the next section, we construct estimators for \(\mathbf {SLL}_n\) having the above property.
3 Estimation of the spectral lead–lag index
3.1 Construction of the estimators
First, following Bibinger et al. (2014), we define the spectral statistics as follows:
where \({\bar{t}}^p_i=(t^p_{i-1}+t^p_i)/2\), \(\Phi _{k}(t)=\sin \left( \pi h_n^{-1}(t-kh_n)\right)\), \(J^n_k=(kh_n,(k+1)h_n]\) and \(h_n\) is a positive number, such that \(h_n^{-1}\in \mathbb {N}\). We assume that the sequence \(h_n\) satisfies \(\sqrt{n}h_n\rightarrow c\) as \(n\rightarrow \infty\) for some \(c>0\). Namely, \(S^p_k\) is the Fourier sine coefficient of the observed returns of the pth asset on the interval \(J^n_k\). We cannot directly use the cosine version of \(S^p_k\)s due to end effects, because \(\cos (0)=-\cos (\pi )=1\ne 0\). To deal with this issue, we rely on the same trick as in Bibinger and Winkelmann (2015). Namely, we consider the spectral statistics on the shifted blocks \(((k-\frac{1}{2})h_n,(k+\frac{1}{2})h_n]\), as well, i.e., \(S_{k-\frac{1}{2}}\) (\(k=1,\dots ,h_n^{-1}-1\)). Bibinger and Winkelmann (2015) use these statistics to handle jumps in their spectral covariance estimators. The following formula plays a key role:
so \(\Phi _{k-1}\) and \(-\Phi _k\) behave as the cosine function on the interval \(J^n_{k-1/2}\).
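The phase relation behind this remark follows from the definition of \(\Phi _k\) alone: shifting the argument of a sine by \(\pm \pi /2\) yields a cosine, so \(\Phi _{k-1}(t)=-\Phi _k(t)=\cos \left( \pi h_n^{-1}(t-(k-\frac{1}{2})h_n)\right)\). The short check below verifies this numerically on a shifted block; the helper names and the values of `h` and `k` are hypothetical.

```python
import math


def phi(k, t, h):
    """Sine weight Phi_k(t) = sin(pi * (t - k * h) / h)."""
    return math.sin(math.pi * (t - k * h) / h)


def shifted_cosine(k, t, h):
    """Cosine with phase measured from the left end of the shifted block."""
    return math.cos(math.pi * (t - (k - 0.5) * h) / h)


h, k = 0.1, 3
# Points inside the shifted block ((k - 1/2) h, (k + 1/2) h].
points = [(k - 0.5) * h + h * (j + 1) / 10 for j in range(10)]
max_gap = max(
    max(abs(phi(k - 1, t, h) - shifted_cosine(k, t, h)),
        abs(-phi(k, t, h) - shifted_cosine(k, t, h)))
    for t in points
)
```

Here `max_gap` is zero up to floating-point rounding, which is why sine statistics on the original blocks can stand in for cosine statistics on the shifted blocks.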
Now, we explain the idea behind the construction of our estimators. For exposition, we assume that \(b_s\equiv 0\), \(\Sigma _s\equiv \Sigma\), \(t^p_i=i/n\), \(\vartheta _n^1\equiv 0\), \(\vartheta _n^2\equiv \vartheta \in \{k/n:k\in \mathbb {Z}_+\}\) and \(E[\epsilon ^1_i\epsilon ^2_j]=0\) for all i, j. We set:
Then, noting that \(|\vartheta |\le h_n/2\) for sufficiently large n and:
we have:
Now, by applying the identity \(\sin (y-x)=\cos (x)\sin (y)-\sin (x)\cos (y)\), we obtain:
The quantity on the right side of the above equation is equal to the integrand of \(L^{n,1}_t\) multiplied by \(h_n\), so we naturally consider the following estimator for \(L^{n,1}_t\):
An analogous argument suggests the following estimator for \(L^{n,2}_t\):
where:
with \(S_{-\frac{1}{2}}:=0\) (see Footnote 1).
Remark 3.1
(Cross-correlated noise) There is some empirical evidence that microstructure noise is cross-correlated across multiple assets; see Voev and Lunde (2007) and Ubukata and Oya (2009). One may expect that the estimator \(\widehat{L}^{n,1}_t\) constructed above would be robust against such cross-correlations as long as the serial dependence in the noise process is sufficiently weak in tick time (see Footnote 2). To see this, note that summation by parts yields the following approximation for the noise part of \(S^p_k\) (cf. the last equation on page 363 of Bibinger and Winkelmann (2015)):
This suggests that, even if \(E[\epsilon ^1_i\epsilon ^2_i]\ne 0\), the expectation of the noise part of \(\ell ^{n,1}_k\) is negligible by a similar argument to the above. This statement continues to hold true even if \(E[\epsilon ^1_i\epsilon ^2_j]\ne 0\) for \(i\ne j\) as long as \(|E[\epsilon ^1_i\epsilon ^2_j]|\) decays sufficiently fast as \(|i-j|\) increases. This is because, in such a situation, we can replace the summand on the right side of (3.1) by a martingale difference due to Gordin’s martingale approximation method (cf. Sect. 19 of Billingsley (1999)). In fact, this is exactly what we do in the proof to handle the serial dependence in the noise process; see (A.1) and (A.15). In the meantime, the estimator \(\widehat{L}^{n,2}_t\) is biased in the presence of such cross-correlations. In the equidistant sampling case, it is not difficult to see that the bias is proportional to the long-run cross-covariance of the noise process, which can be estimated with a faster convergence rate than \(\widehat{L}^{n,2}_t\) and thus easily corrected; see Bibinger and Reiß (2014) for a serially uncorrelated case. However, the bias correction is not straightforward in the non-synchronous observation case. This is mainly because it has not yet been established how to model cross-sectional dependence in microstructure noise in a manner that is both mathematically and empirically satisfactory (cf. the discussion after Assumption 3 in Bibinger et al. (2019)). For this reason, this paper focuses on the situation where \(\epsilon ^1_i\) and \(\epsilon ^2_j\) are uncorrelated for all i, j.
3.2 Asymptotic theory
In this section, we present an asymptotic theory for the process \({\widehat{L}}^n_t:=({\widehat{L}}^{n,1}_t,{\widehat{L}}^{n,2}_t)^\top\), \(t\in [0,1]\). First, we enumerate the assumptions which we impose. Let \(\lambda\) be a positive constant.
[A1]
(i) There is a constant \(\eta \in (0,\frac{1}{2})\), such that \(t^p_i\) is an \((\mathcal {F}_{(t-n^{-\eta })_+})_{t\ge 0}\)-stopping time for any n, i and every \(p=1,2\).
(ii) \(r_n(t)=o_p(n^{-\xi })\) as \(n\rightarrow \infty\) for any \(t>0\) and any \(\xi \in (0,1)\).
(iii) For any \(n\in \mathbb {N}\) and \(p=1,2\), there is a filtration \((\mathcal {H}^{n,p}_t)_{t\ge 0}\) of \(\mathcal {F}\), such that \(t^p_i\) is an \((\mathcal {H}^{n,p}_t)\)-stopping time for every i.
(iv) For each n there is a random subset \(\mathcal {N}^n\) of \(\mathbb {Z}_+\), such that \(\{(\omega ,p)\in \Omega \times \mathbb {Z}_+:p\in \mathcal {N}^n(\omega )\}\) is a measurable set of \(\Omega \times \mathbb {Z}_+\). Moreover, there is a constant \(\kappa \in (0,\frac{1}{2})\), such that \(\#(\mathcal {N}^n\cap \{i:t^p_i\le t\})=O_p(n^\kappa )\) as \(n\rightarrow \infty\) for every \(t>0\) and every \(p=1,2\).
(v) For any \(n\in \mathbb {N}\), \(p=1,2\) and \(r=1,2\), there is a càdlàg \((\mathcal {H}^{n,p}_t)\)-adapted positive-valued process \(G(r)^{n,p}\), such that \(E[|n(t^p_{i+1}-t^p_i)|^r\big |\mathcal {H}^{n,p}_{t^p_i}]=G(r)^{n,p}_{t^p_i}\) for every \(i\in \mathbb {Z}_+\setminus \mathcal {N}^n\). Moreover, there is a càdlàg \((\mathcal {F}_t)\)-adapted positive-valued process \(G(r)^{p}\), such that \(G(r)^{n,p}\rightarrow ^pG(r)^p\) as \(n\rightarrow \infty\) for the Skorokhod topology.
(vi) \(G(1)^p_{t-}>0\) for every \(t>0\) and every \(p=1,2\).
[A2]
For \(p=1,2\), \(c^p_\vartheta\) is \((\mathcal {F}_t)\)-adapted and its paths are almost surely Lipschitz continuous.
[\(\hbox {N}_\lambda\)]
The measurement errors are of the form \(\epsilon ^p_i=\sqrt{v^p_{t^p_i}}u^p_i\) for \(p=1,2\) and \(i=0,1,\dots\), where \(u^p_i\)s are random variables and \(v^p\) is a nonnegative \((\mathcal {F}_t)\)-adapted process. Moreover, they satisfy the following conditions.
(i) \(u^p\) is strictly stationary and independent of \(\mathcal {F}_\infty :=\bigvee _{t>0}\mathcal {F}_t\) for every \(p=1,2\).
(ii) For every \(p=1,2\), \(E[|u^p|^r]<\infty\) for any \(r>0\) and \(E[u^p]=0\).
(iii) The \(\alpha\)-mixing coefficients \(\alpha _p(j)\) of \(u^p\) satisfy \(\sum _{j=1}^\infty \alpha _p(j)^\lambda <\infty\) for every \(p=1,2\).
(iv) \(u^1\) and \(u^2\) are mutually independent.
(v) For every \(p=1,2\), the paths of \(v^p\) are almost surely \(\varpi\)-Hölder continuous for some \(\varpi >0\).
Remark 3.2
(Assumptions on observation times) (a) [A1](i) type assumptions are sometimes called the strong predictability condition in the literature and can be found in, e.g., Hayashi and Yoshida (2011) and Koike (2014, 2016). In our situation, this type of condition is necessary to ensure that the “delayed” observation times \((t^p_i-\vartheta ^p_n(t^p_i))_+\) are nearly \((\mathcal {F}_t)\)-stopping times (see Lemma A.2). Here, we emphasize that in our setting, this type of assumption is required due to the possible existence of lead–lag relationships: the process \(X^p\) is essentially sampled at the times \(t^p_i-\vartheta ^p_n(t^p_i)\) rather than the times \(t^p_i\). Even if \(t^p_i\) themselves are stopping times, \(t^p_i-\vartheta ^p_n(t^p_i)\) are not necessarily stopping times unless we impose a kind of predictability condition. Therefore, if we developed an asymptotic theory without such a condition in our setting, we would presumably need to depart from the framework of Itô calculus and rely on anticipative calculus (e.g., Malliavin calculus), which would be mathematically challenging. In fact, Hoffmann et al. (2013) also impose an analogous condition for the same reason (see Assumption B2 of Hoffmann et al. (2013)). Indeed, their assumption is stronger than ours, because they consider lead–lag times which do not shrink as n tends to infinity. We also remark that our assumption allows, e.g., random sampling times independent of the \(\sigma\)-field \(\mathcal {F}_\infty\), because such ones can be assumed to be \(\mathcal {F}_0\)-measurable without loss of generality, so we do not necessarily know transactions completely in advance (i.e., there may still be exogenous randomness).
(b) [A1](ii)–(iv) type assumptions are more or less standard in the literature and are found in Barndorff-Nielsen et al. (2011) and Koike (2014, 2016), for example. Here, we remark that the introduction of the random set \(\mathcal {N}^n\) is mainly necessary to ensure the stability of Assumption [A1] under the localization procedure used in the proof; see the proof of Lemma 6.3 in Koike (2017b) for details. Of course, one can take the set \(\mathcal {N}^n\) as the empty set, which amounts to a standard situation in the literature. Another reason why we introduce the set \(\mathcal {N}^n\) is that it excludes some trivial exceptions appearing when we set \(\mathcal {N}^n=\emptyset\). For example, if \(t^p_0=\log n/n\) and \(t^p_i=t^p_{i-1}+1/n\), \(i=1,2,\dots\), [A1](v) is not satisfied if we set \(\mathcal {N}^n=\emptyset\).
Remark 3.3
(Assumptions on microstructure noise) An [\(\hbox {N}_\lambda\)] type assumption is used in Jacod et al. (2017, 2019). It allows the noise process to have time-varying variance and serial autocorrelations. Both properties are stylized facts of ultra-high-frequency financial data (see, e.g., Hansen and Lunde 2006). Assumption [\(\hbox {N}_\lambda\)](iv) excludes cross-correlations between \((\epsilon ^1_i)_{i=0}^\infty\) and \((\epsilon ^2_i)_{i=0}^\infty\). We impose such an assumption for the reason explained in Remark 3.1. We also remark that the existence of all moments of the noise required by [\(\hbox {N}_\lambda\)] is standard in the literature (see, e.g., Assumption 16.1.1 of Jacod and Protter (2012) and Assumption (N-v) of Jacod et al. (2019)) and not a serious practical restriction as noted in Remark 16.1.2 of Jacod and Protter (2012). It would be possible to restate the assumption so that it requires finite moments only up to a suitable order which depends on other parameters such as \(\xi ,\lambda\) and so on.
Let us recall the notion of stable convergence. Given a sequence \((Z_n)\) of random variables taking values in a Polish space S and a sub-\(\sigma\)-field \(\mathcal {G}\) of \(\mathcal {F}\), we say that the variables \(Z_n\) converge \(\mathcal {G}\)-stably in law to an S-valued variable Z, which is defined on an extension of \((\Omega ,\mathcal {F},P)\), if \(E[Uf(Z_n)]\rightarrow E[Uf(Z)]\) as \(n\rightarrow \infty\) for any \(\mathcal {G}\)-measurable bounded variable U and any bounded continuous function f on S. Then, we write \(Z_n\rightarrow ^{\mathcal {G}-d_s}Z\). In this case, for any variables \(U_n\) converging in probability to a \(\mathcal {G}\)-measurable variable U, we have \((U_n,Z_n)\rightarrow ^{\mathcal {G}-d_s}(U,Z)\) as \(n\rightarrow \infty\) for the product topology on the space \(\mathbb {R}\times S\).
Now, we are ready to state our asymptotic result.
Theorem 3.1
Suppose that [A1]–[A2] and [\(\hbox {N}_\lambda\)] are satisfied for some \(\lambda \in (0,\frac{1}{2})\). Then, the bivariate processes \(h_n^{-\frac{1}{2}}\left( {\widehat{L}}^{n,1}-L^{n,1},{\widehat{L}}^{n,2}-L^{n,2}\right)\) converge \(\mathcal {F}_\infty\)-stably in law to \(({\widetilde{W}}^1_{\int _0^\cdot {\mathfrak {v}}^1_s\mathrm {d}s},{\widetilde{W}}^2_{\int _0^\cdot {\mathfrak {v}}^2_s\mathrm {d}s})\) as \(n\rightarrow \infty\) for the Skorokhod topology, where \({\widetilde{W}}^1\) and \({\widetilde{W}}^2\) are mutually independent standard Brownian motions independent of \(\mathcal {F}_\infty\), and:
with \({\mathfrak {S}}^{p,\pm }_s=\Sigma ^{pp}_s\pm \pi ^2c^{-2}v^p_s\Psi ^p_s\),
and \(\gamma _p(j)=E\left[ u^{p}_0u^p_j\right]\) \((j\in \mathbb {Z}_+)\) for \(p=1,2\) and \(s\ge 0\).
Remark 3.4
(Mixing condition on the noise process) It might be worth remarking that the mixing condition imposed in Theorem 3.1 is stronger than the usual one \(\sum _{j=1}^\infty \alpha _p(j) < \infty\) in classical time-series analysis, but this is standard in the literature of volatility/covariance estimation from high-frequency data. For example, Jacod et al. (2019) requires \(\alpha _p(j)=O(j^{-v})\) for some \(v >3\), which implies that \(\sum _{j=1}^\infty \alpha _p(j)^\lambda < \infty\) for some \(\lambda < 1/3\); Ikeda (2016) requires that (at least) \(\alpha _p(j)=O(j^{-\varpi /(\varpi -4)-\delta })\) for some \(\varpi >4\) and \(\delta >0\) as well as \(\sum _{j=1}^\infty j^\varpi \gamma _p(j)<\infty\), where \(\gamma _p(j)\) is the auto-covariance function of \(u^p\), which is much stronger than \(\sum _{j=1}^\infty \alpha _p(j) < \infty\); Varneskov (2016) requires \(\sum _{j=1}^\infty j\alpha _p(j)<\infty\), which implies that \(\sum _{j=1}^\infty \alpha _p(j)^\lambda < \infty\) for any \(\lambda > 1/2\).
Theorem 3.1 has some important consequences. First, since \(L^{n,1}_1\rightarrow 0\) a.s., \(L^{n,2}_1\rightarrow \int _0^1\Sigma _s^{12}\mathrm {d}s\) a.s., and:
the property of stable convergence and the continuous mapping theorem imply that:
as \(n\rightarrow \infty\), where \(\zeta\) is a standard normal variable independent of \(\mathcal {F}_\infty\) and \(\mathcal {V}:=\int _0^1{\mathfrak {v}}^1_s\mathrm {d}s/(\int _0^1\Sigma _s^{12}\mathrm {d}s)^2\) (recall \(\int _0^1\Sigma _s^{12}ds\ne 0\) a.s.). Next, noting \(L^{n,1}_1/L^{n,2}_1\rightarrow 0\) a.s., we obtain:
as \(n\rightarrow \infty\) by a similar argument to the proof of the delta method, where:
In particular, it holds that:
as \(n\rightarrow \infty\). Therefore, the estimators \(\widehat{\mathbf {SLL}}_n\) satisfy property (2.2) as long as \(\alpha <3/4\).
Remark 3.5
(Necessity of the condition \(\alpha <3/4\)) It is worth mentioning that there is no sequence of estimators having property (2.2) if \(\alpha \ge 3/4\) in the following sense. Let us suppose that the observation data \((Y^1_i,Y^2_i)_{i=1}^n\) are generated by the following simpler model:
where \((B^1_t,B^2_t)\) \((t\in \mathbb {R})\) is a bivariate two-sided Brownian motion, such that \(B_0=0\), \(E[(B^1_1)^2]=E[(B^2_1)^2]=1\) and \(E[B^1_1B^2_1]=\rho\) for some \(\rho \in (-1,0)\cup (0,1)\), \((\epsilon ^1_i,\epsilon ^2_i)\), \(i=1,\dots ,n\), are i.i.d. Gaussian variables with mean 0 and variance \(v>0\), and \(\vartheta \in \mathbb {R}\) is the lead–lag parameter. Note that this model corresponds to a simplified version of our model (2.1), such that \(b_s\equiv 0\),
\[\Sigma _s\equiv \begin{pmatrix}1&\rho \\ \rho &1\end{pmatrix},\]
\(t^p_i=i/n\), \(\vartheta _n(t)\equiv \vartheta\) and \(\epsilon ^p_i\overset{i.i.d.}{\sim }{\mathbf {N}}(0,v)\): when the time lag process \(\vartheta _n(t)\) does not depend on t, we may set \(\vartheta ^1_n\equiv 0\) or \(\vartheta ^2_n\equiv 0\) in accordance with the sign of \(\vartheta _n(t)\), so we have adopted such a specification in the above model.
Now, we denote by \(P_{n,\vartheta }\) the law of \((Y^1_1,\dots ,Y^1_n,Y^2_1,\dots ,Y^2_n)\). Then, from Corollary 2.1 of Koike (2017a), \((P_{n,\vartheta })_{\vartheta \in \mathbb {R}}\) has the LAN property at \(\vartheta =0\) with rate \(n^{-3/4}\) and asymptotic Fisher information \(\rho ^2v^{3/2}/\{2(\sqrt{1+\rho }+\sqrt{1-\rho })\}\). In particular, by the local asymptotic minimax theorem (see, e.g., Theorem 1 from Ch. 6 of Le Cam and Yang (2000)), we obtain:
for any \(\eta >0\) and any sequence \({\widehat{\vartheta }}_n\) of estimators.
Another application of Theorem 3.1 is the construction of confidence intervals for \(\mathbf {SLL}_n\). For this purpose, we need estimators for the asymptotic variances of \({\widehat{L}}^{n,1}_1\) and \({\widehat{L}}^{n,2}_1\), as well, and we address this issue in the next subsection.
3.3 Asymptotic variance estimation
Since the analytic expressions of the asymptotic variances of \({\widehat{L}}^{n,1}_t\) and \({\widehat{L}}^{n,2}_t\) are complex, we estimate them by subsampling (see Footnote 3). Since we consider infill asymptotics, traditional subsampling methods (e.g., Politis and Romano (1994)) should be modified due to incorrect centering. Several subsampling methods for high-frequency data have recently been proposed, cf. Kalnina (2011), Sect. 4 of Christensen et al. (2013), Mykland and Zhang (2017), Christensen et al. (2017), Sect. 3.1 of Ikeda (2016). In this paper, we adopt an “overlapping version” of the method proposed in Sect. 4 of Christensen et al. (2013).
Set \(\ell ^n_k=(\ell ^{n,1}_k,\ell ^{n,2}_k)^\top\) for \(k=1,\dots ,h_n^{-1}-1\), and take a positive integer \(K_n\) as the number of subsamples. Then, we define:
We may expect that adjacent subsampled estimators \({\widehat{L}}^{n}(\beta +K_n)\) and \({\widehat{L}}^{n}(\beta )\) would have conditional expectations close to each other. Motivated by this, we introduce the following estimator for the asymptotic covariance matrix of \({\widehat{L}}^n_t\), \(t\in [0,1]\):
The validity of this subsampling method is ensured by the following theorem (see Footnote 4):
Theorem 3.2
Suppose that [A1]–[A2] and [\(\hbox {N}_\lambda\)] are satisfied for some \(\lambda \in (0,\frac{1}{4})\). Suppose also that the paths of \(\Sigma ^{12}\) are almost surely \(\gamma\)-Hölder continuous for some \(\gamma \in (0,1]\). Then, we have: \(\sup _{0\le t\le 1}|{\widehat{V}}^{n,fg}_t-1_{\{f=g\}}\int _0^t{\mathfrak {v}}^f_s\mathrm {d}s|\rightarrow ^p0\) as \(n\rightarrow \infty\) for every \(f,g=1,2\), provided that \(K_n\rightarrow \infty\) and \(K_n=O(n^z)\) for some \(z<\gamma /\{2(\gamma +1)\}\).
Now, under the assumptions of Theorems 3.1–3.2, we can construct confidence intervals of \(\mathbf {SLL}_n\) as follows. Since we have:
as \(n\rightarrow \infty\), where (see Footnote 5):
a \(100(1-\alpha )\)% confidence interval of \(\mathbf {SLL}_n\) for \(\alpha \in (0,1)\) is given by:
with \(z_{\alpha /2}\) being the upper \(\alpha /2\)-quantile of the standard normal distribution.
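The interval construction can be sketched in code as follows. This is a first-order delta-method approximation suggested by the limit results after Theorem 3.1, namely that the error of \(\widehat{\mathbf {SLL}}_n\) is approximately \((h_n^{3/2}/\pi )\sqrt{\mathcal {V}}\zeta\) with \(\mathcal {V}\) estimated by \({\widehat{V}}^{n,11}_1/({\widehat{L}}^{n,2}_1)^2\); the exact studentization used in the paper may differ. The function name and its arguments are hypothetical, and the inputs are assumed to have been computed beforehand.

```python
import math
from statistics import NormalDist


def sll_confidence_interval(l1_hat, l2_hat, v11_hat, h, level=0.95):
    """Approximate two-sided confidence interval for the spectral lead-lag index.

    l1_hat, l2_hat: estimates of L^{n,1}_1 and L^{n,2}_1 (l2_hat non-zero).
    v11_hat: estimate of the asymptotic variance of the first estimator.
    h: block length h_n. The standard error is a delta-method assumption.
    """
    sll_hat = (h / math.pi) * math.atan(l1_hat / l2_hat)
    std_err = (h ** 1.5 / math.pi) * math.sqrt(v11_hat) / abs(l2_hat)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # upper (1 - level) / 2 quantile
    return sll_hat - z * std_err, sll_hat + z * std_err
```

The interval width scales as \(h_n^{3/2}\), which matches the \(n^{-3/4}\) rate discussed around property (2.2).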
Remark 3.6
(Relation to Mykland and Zhang (2017)) The diagonal entries of \({\widehat{V}}^{n}_t\) are related to rolling quadratic variations of integrated processes introduced in (Mykland and Zhang 2017, Definition 2). In fact, for \(f=1,2\), \(2h_n{\widehat{V}}^{n,ff}_1\) is almost identical to the quantity \(QV_{B,K}({\hat{\Theta }})\) in Mykland and Zhang (2017), taking \(\ell ^{n,f}\), \(K_n\), \(h_n^{-1}\) as \({\hat{\Theta }}_{((k-\frac{1}{2})h_n,(k+\frac{1}{2})h_n]}\), K, B, respectively. Mykland and Zhang (2017) have shown that, even after appropriate rescaling, \(QV_{B,K}({\hat{\Theta }})\) is generally a biased estimator for the asymptotic variance of \({\hat{\Theta }}_{(0,1]}\); see Sect. 3.2 ibidem. However, Mykland and Zhang (2017) have also shown that this is not the case when edge effects therein are “tiny”; see Sect. 3.3 ibidem for details. The proofs of Propositions A.1 and A.3 in the Appendix suggest that our estimator would satisfy this tiny edge effects condition. In fact, \(n^{-\alpha }\) and \(K_n\Delta T_n\) in Mykland and Zhang (2017) correspond to \(\sqrt{h_n}\) and \(K_nh_n\), respectively, while the average of the squared edge effects in \({\widehat{L}}^{n,f}_1\) given by Eq. (21) of Mykland and Zhang (2017) would be of order \(O_p(h_n^{1+a})\) for any \(a\in (0,1)\). Under the assumptions of Theorem 3.2, \(K_nh_n\) has the same order as \(h_n^b\) for some \(b\in (\frac{1}{2},1)\). Altogether, Remark 9 of Mykland and Zhang (2017) would imply that \({\widehat{V}}^{n,ff}_1\) is a consistent estimator for the asymptotic variance of \({\widehat{L}}^{n,f}_1\).
4 Testing the absence of the lead–lag relationship
In this section, we present another application of Theorem 3.1 to decide whether the spot lead–lag time \(\vartheta _n(t)\) is identical to zero on the interval [0, 1] or not. Noting that \(\vartheta _n(t)\) is stochastic, our purpose is formulated as follows (cf. Aït-Sahalia and Jacod (2009)): we decompose the sample space \(\Omega\) into two disjoint subsets:
and we decide whether the realized outcome \(\omega \in \Omega\) belongs to \(\Omega ^0\) or to \(\Omega ^1\). In the following, we set “\(\omega \in \Omega ^0\)” as the null hypothesis and “\(\omega \in \Omega ^1\)” as the alternative.
To construct the test statistic, we borrow the idea from Dette and Podolskij (2008) and Vetter and Dette (2012). We note that \(\omega \in \Omega ^0\) is equivalent to \(L^{n,1}(\omega )_t=0\) for all \(t\in [0,1]\), provided that \(\Sigma ^{12}(\omega )_t\ne 0\) for all \(t\in [0,1]\). This suggests considering the following Kolmogorov–Smirnov type test statistic:
We can also consider the Kuiper type test statistic as follows:
Then, an application of Theorem 3.1 and the continuous mapping theorem yields the following convergence:
as \(n\rightarrow \infty\), where B is a standard Brownian motion independent of \(\mathcal {F}_\infty\). Now, let \({\bar{F}}_{\text {KS}}\) and \({\bar{F}}_{\text {Kuiper}}\) be the survival functions of the variables \(\sup _{t\in [0,1]}|B_t|\) and \(\sup _{t\in [0,1]}B_t-\inf _{t\in [0,1]}B_t\), respectively. Note that they can be analytically evaluated by using formulae 3.1.1.4 and 1.1.15.4(1) from Borodin and Salminen (2002). Then, the p values of the test statistics \(T^\text {KS}_n\) and \(T^\text {Kuiper}_n\) are given by \({\bar{F}}_{\text {KS}}(T^\text {KS}_n)\) and \({\bar{F}}_{\text {Kuiper}}(T^\text {Kuiper}_n)\), respectively. More formally, given a significance level \(\alpha \in (0,1)\), we have:
as \(n\rightarrow \infty\), provided that \(P(\Omega ^0)>0\). Moreover, by the construction of the test statistics, we have:
as \(n\rightarrow \infty\), provided that \(P(\Omega ^1)>0\) and \(\Sigma ^{12}_t\ne 0\) for all \(t\in [0,1]\). Consequently, the hypothesis testing based on \(T^\text {KS}_n\) (resp. \(T^\text {Kuiper}_n\)) is implemented by rejecting the null if and only if \({\bar{F}}_{\text {KS}}(T^\text {KS}_n)<\alpha\) (resp. \({\bar{F}}_{\text {Kuiper}}(T^\text {Kuiper}_n)<\alpha\)). This provides a level \(\alpha\) test with asymptotic power 1.
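The p value computation described above can be sketched numerically. The snippet below is only an illustration under stated assumptions, not the implementation used in this paper: instead of the Borodin and Salminen (2002) formulae, it assumes the classical series representations that follow from the reflection principle (for \(\sup |B_t|\)) and from Feller's distribution of the range of Brownian motion (for \(\sup B_t-\inf B_t\)). The function names and the truncation level `n_terms` are hypothetical.

```python
import math


def normal_tail(z):
    """Upper tail 1 - Phi(z) of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def survival_ks(x, n_terms=200):
    """P(sup_{0<=t<=1} |B_t| > x) via an alternating normal-tail series.

    Assumed representation: 4 * sum_{k>=0} (-1)^k * (1 - Phi((2k+1) x)).
    The truncated series is reliable for moderate and large x only.
    """
    if x <= 0.0:
        return 1.0
    s = sum((-1) ** k * normal_tail((2 * k + 1) * x) for k in range(n_terms))
    return min(1.0, max(0.0, 4.0 * s))


def survival_kuiper(x, n_terms=200):
    """P(sup B - inf B > x) on [0, 1], integrating Feller's range density.

    Assumed representation: 8 * sum_{k>=1} (-1)^(k-1) * k * (1 - Phi(k x)).
    The truncated series is reliable for moderate and large x only.
    """
    if x <= 0.0:
        return 1.0
    s = sum((-1) ** (k - 1) * k * normal_tail(k * x)
            for k in range(1, n_terms + 1))
    return min(1.0, max(0.0, 8.0 * s))


def reject_null(statistic, survival, level=0.05):
    """Reject the null exactly when the asymptotic p value is below the level."""
    return survival(statistic) < level
```

Under these assumed representations, `survival_ks(2.24)` and `survival_kuiper(2.50)` both evaluate to roughly 0.05, so the sketch can be used to sanity-check critical values at the 5% level.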
5 Estimation of the spot lead–lag time
Now, we focus on the direct estimation of the spot lead–lag time \(\vartheta _n(t)\), \(t\in [0,1]\). We remark that \(\vartheta _n(t)\) can be rewritten as \(\vartheta _n(t)=(h_n/\pi )\arctan \left[ \sin \left( \frac{\pi }{h_n}\vartheta _n(t)\right) /\cos \left( \frac{\pi }{h_n}\vartheta _n(t)\right) \right]\). In Sect. 3, we have constructed the estimators for the integrated quantities of \(\sin \left( \frac{\pi }{h_n}\vartheta _n(t)\right)\) and \(\cos \left( \frac{\pi }{h_n}\vartheta _n(t)\right)\), so estimators for the latter ones can be naturally constructed by a kernel approach as in the literature on spot volatility estimation (cf. Kristensen (2010), Kanaya and Kristensen (2016)). Specifically, for a function \(\mathcal {K}:\mathbb {R}\rightarrow \mathbb {R}\) and \(t\in (0,1)\), we set:
where \(\mathcal {K}_{H_n}(s)=\mathcal {K}(s/H_n)/H_n\) for every \(s\ge 0\) and \(H_n>0\) is a bandwidth parameter.
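As a small numerical illustration of the rescaling \(\mathcal {K}_{H_n}(s)=\mathcal {K}(s/H_n)/H_n\), the following sketch uses the Epanechnikov kernel and the tuning \(h_n=20/23{,}400\), \(H_n=2h_n^{1/3}\) from the simulation study of Sect. 6. The helper names (`scaled_kernel`, `integrate`) are hypothetical, and the midpoint rule is just one convenient way to check that the rescaled kernel still integrates to one and that \(\int \mathcal {K}(s)^2\mathrm {d}s=3/5\) for this kernel.

```python
def epanechnikov(s):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - s * s) if abs(s) <= 1.0 else 0.0


def scaled_kernel(kernel, bandwidth):
    """Return the function s -> K(s / H) / H for a bandwidth H > 0."""
    if bandwidth <= 0.0:
        raise ValueError("bandwidth must be positive")
    return lambda s: kernel(s / bandwidth) / bandwidth


def integrate(f, a, b, n=20_000):
    """Midpoint rule on [a, b]; adequate for this smooth, compactly supported kernel."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))


# Tuning taken from the simulation study: one day = [0, 1], one second = 1 / 23400.
h_n = 20.0 / 23_400
H_n = 2.0 * h_n ** (1.0 / 3.0)   # roughly 0.19 of the trading day

K_H = scaled_kernel(epanechnikov, H_n)
mass = integrate(K_H, -H_n, H_n)                                   # close to 1
l2_norm_sq = integrate(lambda s: epanechnikov(s) ** 2, -1.0, 1.0)  # close to 3/5
```

The quantity `l2_norm_sq` is the factor \(\int \mathcal {K}(s)^2\mathrm {d}s\) that scales the limiting variances of kernel estimators built from \(\mathcal {K}\).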
Theorem 5.1
Let \(\mathcal {K}:\mathbb {R}\rightarrow \mathbb {R}\) be a piecewise Lipschitz continuous function supported by the interval \([-1,1]\), such that \(\int _{-\infty }^\infty \mathcal {K}(s)\mathrm {d}s=1\). Let \(H_n\) be a sequence of positive numbers, such that \(H_n\rightarrow 0\), \(H_n^{-1}h_n^{\nu }\rightarrow 0\) for some \(\nu \in (0,1)\) and \((h_n^{-1}H_n)^{3/2}n^{-\alpha }\rightarrow 0\) as \(n\rightarrow \infty\). Suppose that \(t\in (0,1)\) satisfies \(\Sigma _t=\Sigma _{t-}\) a.s. Then, under the assumptions of Theorem 3.1, the variables
converge \(\mathcal {F}_\infty\)-stably in law to the variable \((\sqrt{{\mathfrak {v}}^1_t\int _{-\infty }^\infty \mathcal {K}(s)^2\mathrm {d}s}\cdot w_1,\sqrt{{\mathfrak {v}}^2_t\int _{-\infty }^\infty \mathcal {K}(s)^2\mathrm {d}s}\cdot w_2)^\top\) as \(n\rightarrow \infty\), where \(w_1\) and \(w_2\) are mutually independent standard normal variables independent of \(\mathcal {F}_\infty\).
Now, the estimator for the spot lead–lag time is constructed as:
Under the assumptions of Theorem 5.1, we have:
as \(n\rightarrow \infty\), provided that \(\Sigma ^{12}_t\ne 0\). In particular, \({\widehat{\vartheta }}_n(t)\) is a meaningful estimator for \(\vartheta _n(t)\) as long as \(n^\alpha h_n^{3/2}H_n^{-1/2}\rightarrow 0\) as \(n\rightarrow \infty\). This is possible if \(\alpha <\frac{3}{4}\) while taking \(H_n\propto h_n^\varpi\) for some \(\varpi \in (1-\frac{4}{3}\alpha ,3-4\alpha )\).
Remark 5.1
(Vanishing spot covariation case) It would be useful to investigate what happens if \(\Sigma ^{12}_t=0\). For this purpose, we additionally suppose that paths of \(\Sigma ^{12}\) are almost surely \(\gamma\)-Hölder continuous for some \(\gamma \in (0,1]\). Then, we have:
as \(n\rightarrow \infty\), provided that \(H_n=o(h_n^{1/(2\gamma +1)})\). Therefore, Theorem 5.1, the continuous mapping theorem, and Eq. (5) in Marsaglia (1965) imply that the variables:
converge \(\mathcal {F}_\infty\)-stably in law to the variable \(\sqrt{{\mathfrak {v}}^1_t/{\mathfrak {v}}^2_t}\cdot {\mathfrak {c}}\) as \(n\rightarrow \infty\), where \({\mathfrak {c}}\) is a standard Cauchy variable independent of \(\mathcal {F}_\infty\). In particular, we have \(1_{\{\Sigma ^{12}_t=0\}}{\widehat{\vartheta }}_n(t)=O_p(h_n^{3/2}H_n^{-1/2})\) as \(n\rightarrow \infty\). From the above discussion, we should choose \(h_n\), such that \(h_n^{3/2}H_n^{-1/2}=o(n^{-\alpha })\) to make \({\widehat{\vartheta }}_n(t)\) a meaningful estimator for \(\vartheta _n(t)\), and in this case, we have \(1_{\{\Sigma ^{12}_t=0\}}{\widehat{\vartheta }}_n(t)=o_p(n^{-\alpha })\) as \(n\rightarrow \infty\). Consequently, if \(\Sigma ^{12}_t=0\), \({\widehat{\vartheta }}_n(t)\) will behave as if there is no significant time-lag between two assets.
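The Cauchy limit in this remark rests on the fact that the ratio of two independent centred normal variables with standard deviations \(\sigma _1\) and \(\sigma _2\) is Cauchy distributed with scale \(\sigma _1/\sigma _2\). The Monte Carlo sketch below illustrates this: for a centred Cauchy variable with scale s, the event \(\{|X|\le s\}\) has probability 1/2. The function name, the seed, and the number of draws are hypothetical choices for illustration.

```python
import random


def cauchy_scale_check(sd_num, sd_den, n_draws=50_000, seed=0):
    """Fraction of draws with |Z1 / Z2| <= sd_num / sd_den.

    Z1 and Z2 are independent centred normals. If the ratio is Cauchy with
    scale sd_num / sd_den, the returned fraction should be close to 1/2.
    """
    rng = random.Random(seed)
    scale = sd_num / sd_den
    hits = 0
    for _ in range(n_draws):
        z1 = rng.gauss(0.0, sd_num)
        z2 = rng.gauss(0.0, sd_den)
        # Same event as |z1 / z2| <= scale, written without a division.
        if abs(z1) <= scale * abs(z2):
            hits += 1
    return hits / n_draws
```

For example, `cauchy_scale_check(2.0, 0.5)` should return a value near 0.5, whatever the two standard deviations are.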
To estimate the asymptotic variances of the estimators, we use a kernel counterpart of the subsampling estimator developed in Sect. 3.3. Namely, we define:
Theorem 5.2
Let \(\mathcal {K}:\mathbb {R}\rightarrow \mathbb {R}\) be a piecewise Lipschitz continuous function supported by the interval \([-1,1]\), such that \(\int _{-\infty }^\infty \mathcal {K}(s)\mathrm {d}s=1\). Let \(H_n\) be a sequence of positive numbers, such that \(H_n\rightarrow 0\) and \(H_n^{-1}K_nh_n^{\nu }\rightarrow 0\) for some \(\nu \in (0,1)\) as \(n\rightarrow \infty\). Then, under the assumptions of Theorem 3.2, we have \({\widehat{V}}(\mathcal {K})^{n,fg}_t\rightarrow ^p1_{\{f=g\}}{\mathfrak {v}}^f_t\) as \(n\rightarrow \infty\) for any \(f,g=1,2\).
As in Sect. 3.3, we can construct confidence intervals of \(\vartheta _n(t)\) under the assumptions of Theorems 5.1–5.2 as follows. Since we have:
as \(n\rightarrow \infty\), where
a \(100(1-\alpha )\)% confidence interval of \(\vartheta _n(t)\) for \(\alpha \in (0,1)\) is given by:
6 Simulation study
In this section, we conduct a Monte Carlo study to assess the finite sample performance of the proposed methodology. We simulate the efficient log-price process \(X^p\) (\(p=1,2\)) from the Rough Fractional Stochastic Volatility (RFSV) model of Gatheral et al. (2018):
where \((B^1,B^2)\) is a two-dimensional standard Brownian motion, such that \([B^1,B^2]_t=Rt\) and \(B^{H,p}\) is a fractional Brownian motion with Hurst parameter H. We assume that \((B^1,B^2)\), \(B^{H,1}\), and \(B^{H,2}\) are mutually independent. As in Section 3.4 of Gatheral et al. (2018), we set \(H=0.14\), \(\nu =0.3\), \(m=-5\), and \(\alpha =5\times 10^{-4}\). \(Z^p_0\) is taken from the stationary distribution of \(Z^p\). We vary the correlation parameter R as \(R=0.5,0.7,0.9\).
We generate the observation data as follows. First, we simulate the equidistant sampling \((X^p_{i/n-\vartheta ^p_n(i/n)})_{i=0}^n\) of the time-lagged efficient log-price process by the Euler–Maruyama scheme, where we set \(n=23,400\). We regard the interval [0, 1] as 6.5 h, so that \(\Delta _n:=1/n\) corresponds to 1 s. Then, for each i, we keep the value \(X^p_{i/n}\) as an observation for the p-th asset with probability \(\pi _p\), where we set \(\pi _1=2/3\) and \(\pi _2=1/2\). After that, we add microstructure noise to the observations. We generate the noise process \((\epsilon ^p_i)\) from the following AR(1) process:
where \(\delta >0\) is chosen, so that \({{\,\mathrm{Var}\,}}[\epsilon ^p_i]=\gamma {{\,\mathrm{Var}\,}}[X^p_{t^p_i}-X^p_{t^p_{i-1}}]\), where \(\gamma\) corresponds to the noise ratio of Oomen (2006). We set \(\phi =0.77\) and \(\gamma =0.5\) as in Sect. 2.3 of Christensen et al. (2014). For the latency processes \(\vartheta ^1_n(t)\) and \(\vartheta ^2_n(t)\), we consider the following two scenarios:
- Scenario 1: \(\vartheta ^1_n(t)\equiv 0\), \(\vartheta ^2_n(t)\equiv \ell \Delta _n\).
- Scenario 2: \(\vartheta _n^1(t)=\ell \Delta _n\cos ^2(\pi t)\), \(\vartheta _n^2(t)=\ell \Delta _n\sin ^2(\pi t)\).
Here, we vary the parameter \(\ell\) as \(\ell =0,1,2,3,4\). 10,000 paths are generated for each scenario. We use \(h_n=20\Delta _n\) to calculate the estimator \({\widehat{L}}^{n}\) and \(K_n=20\) to compute the subsampling-based asymptotic variance estimators. In addition, we set \(H_n=2h_n^{1/3}\) and \(\mathcal {K}(t)=\frac{3}{4}(1-t^2)1_{[-1,1]}(t)\) (the Epanechnikov kernel) when computing the spot lead–lag time estimator \({\widehat{\vartheta }}_n(t)\).
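Parts of the data-generating design can be sketched in code. The snippet below is a minimal illustration, not the simulation code behind the tables: it covers only the latency scenarios, the random thinning of the one-second grid, and the AR(1) noise, and it leaves out the RFSV price simulation. It assumes that the noise recursion has the form \(\epsilon _i=\phi \epsilon _{i-1}+\delta \eta _i\) with i.i.d. standard normal \(\eta _i\), in which case the stationary variance is \(\delta ^2/(1-\phi ^2)\). All function names are hypothetical.

```python
import math
import random

N = 23_400          # one-second grid over a 6.5-hour session
DELTA = 1.0 / N


def latency(scenario, ell, t):
    """Return (theta^1_n(t), theta^2_n(t)) for Scenario 1 or 2."""
    if scenario == 1:
        return 0.0, ell * DELTA
    return (ell * DELTA * math.cos(math.pi * t) ** 2,
            ell * DELTA * math.sin(math.pi * t) ** 2)


def thin(n, keep_prob, rng):
    """Grid indices i in {0, ..., n} kept independently with probability keep_prob."""
    return [i for i in range(n + 1) if rng.random() < keep_prob]


def ar1_noise(m, phi, gamma, incr_var, rng):
    """AR(1) path of length m with stationary variance gamma * incr_var.

    Assumption: eps_i = phi * eps_{i-1} + delta * eta_i with eta_i ~ N(0, 1),
    so delta^2 = gamma * incr_var * (1 - phi^2).
    """
    target_var = gamma * incr_var
    delta = math.sqrt(target_var * (1.0 - phi ** 2))
    eps = [rng.gauss(0.0, math.sqrt(target_var))]   # stationary start
    for _ in range(m - 1):
        eps.append(phi * eps[-1] + delta * rng.gauss(0.0, 1.0))
    return eps
```

A typical use would be `idx = thin(N, 2/3, random.Random(0))` for the first asset, followed by `ar1_noise(len(idx), 0.77, 0.5, incr_var, rng)`, where `incr_var` stands for the variance of the efficient returns between consecutive observations and has to be supplied by the user.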
Table 1 presents the results of estimating the spectral lead–lag index \(\mathbf {SLL}_n\). We report the bias and the root-mean-square error (RMSE) of the estimator \(\widehat{\mathbf {SLL}}_n\) in seconds, i.e., the bias and the RMSE of \(n\cdot \widehat{\mathbf {SLL}}_n\). We see from the table that the biases of the estimates are less than 1% of the true values and the RMSEs are mild, indicating a good finite sample performance of our estimator. The table also reveals that both the biases and the RMSEs decrease as the correlation parameter increases. This is in line with our asymptotic theory and intuitively natural as well, because the lead–lag relationships would be more pronounced with higher correlations.
Table 2 shows the results for the studentization of \({\widehat{L}}^{n,1}_1/{\widehat{L}}^{n,2}_1\) to check the accuracy of the standard normal approximation. We report the sample means, sample standard deviations (SD), and the 95% and 99% coverages of the studentized statistic, i.e., the variable on the left-hand side of (3.3). The results of the table show that the standard normal approximation works very well.
Table 3 reports the rejection rates of the hypothesis testing presented in Sect. 4 at the 5% level. We have two options \(T^\text {KS}_n\) and \(T^\text {Kuiper}_n\) for the test statistics to implement the test. As the table reveals, the sizes of the test are well controlled for both of the test statistics. The powers of the test are very high for both of the test statistics in Scenario 1, while they are much better for \(T^\text {Kuiper}_n\) than for \(T^\text {KS}_n\) in Scenario 2, especially when \(\ell\) or R is small. This might be explained as follows. Under the alternative hypothesis, we expect that the values of \(T_n^{\text {KS}}\) and \(T_n^{\text {Kuiper}}\) would be dominated by \(\xi \sup _{0\le t\le 1}|\int _0^t\vartheta _n(s)\mathrm {d}s|\) and \(\xi (\sup _{0\le t\le 1}\int _0^t\vartheta _n(s)\mathrm {d}s-\inf _{0\le t\le 1}\int _0^t\vartheta _n(s)\mathrm {d}s)\), respectively, where \(\xi\) is a positive random variable. In Scenario 2, we have:
so the latter is two times larger than the former. Meanwhile, we numerically obtain \({\bar{F}}_{\text {KS}}^{-1}(0.05)\approx 2.24\) and \({\bar{F}}_{\text {Kuiper}}^{-1}(0.05)\approx 2.50\). These computations could suggest the superior performance of \(T_n^{\text {Kuiper}}\). The performance of the test based on \(T^\text {Kuiper}_n\) is satisfactory for most situations, so it is recommended to use \(T^\text {Kuiper}_n\) as the test statistic.
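The factor of two in Scenario 2 can be checked numerically. The sketch below normalises \(\ell \Delta _n\) to one and assumes the sign convention \(\vartheta _n=\vartheta ^1_n-\vartheta ^2_n\); the opposite sign would give the same two functionals. The function names are hypothetical.

```python
import math


def running_integral_extremes(theta, n_grid=10_000):
    """Return (sup |I_t|, sup I_t - inf I_t) for I_t = int_0^t theta(s) ds, t in [0, 1].

    The running integral is approximated by a midpoint rule on a uniform grid.
    """
    step = 1.0 / n_grid
    integral, hi, lo = 0.0, 0.0, 0.0   # I_0 = 0 enters both sup and inf
    for i in range(n_grid):
        integral += theta((i + 0.5) * step) * step
        hi = max(hi, integral)
        lo = min(lo, integral)
    return max(hi, -lo), hi - lo


def theta_scenario2(t):
    """Scenario 2 lead-lag time with ell * Delta_n normalised to 1 (assumed sign)."""
    return math.cos(math.pi * t) ** 2 - math.sin(math.pi * t) ** 2


ks_drift, kuiper_drift = running_integral_extremes(theta_scenario2)
```

Here `ks_drift` is close to \(1/(2\pi )\) and `kuiper_drift` is close to \(1/\pi\), because the running integral equals \(\sin (2\pi t)/(2\pi )\). The Kuiper-type drift is therefore twice the Kolmogorov–Smirnov-type drift, while the corresponding 5% critical values differ only by a factor of about 1.12.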
Table 4 provides the results of estimating the spot lead–lag time \(\vartheta _n(t)\). We report the root mean integrated square error (RMISE) of the estimator \({\widehat{\vartheta }}_n(t)\) in seconds, that is:
The results show relatively large errors compared with the estimation of the spectral lead–lag index, which is due to the slower rate of convergence given by Theorem 5.1. However, in high-correlation situations, they still seem acceptable.
In summary, our new methodology exhibits a good finite sample performance to analyze small and time-varying lead–lag relationships, and it gives a promising way to investigate lead–lag relationships in high-frequency financial data.
7 Empirical illustration
To illustrate how our new methodology works on real data, we analyze lead–lag relationships between the S&P 500 index and its two derivative products: the E-mini S&P 500 futures and the S&P 500 Standard and Poor’s Depository Receipt (SPDR) Exchange-Traded Fund (ETF). The sample period is the whole of January 2017, containing 20 trading days. We use intraday transaction data taken from Bloomberg, with timestamps accurate to one second. We set \(h_n=\text {20 seconds}\), \(K_n=20\), \(H_n=2h_n^{1/3}\), and \(\mathcal {K}(t)=\frac{3}{4}(1-t^2)1_{[-1,1]}(t)\) as in the simulation study.
Figure 1 shows the estimates \(\widehat{\mathbf {SLL}}_n\) of the spectral lead–lag indices and the associated 95% confidence intervals. The red ones are significantly non-zero at the 5% level. The left panel of the figure shows strong evidence that the futures lead the index, which is consistent with the previous studies (see Kawaller et al. (1987) and de Jong and Nijman (1997) for instance). The middle panel also reveals strong evidence that the ETF leads the index. This means that the ETF dominates the index in terms of the price discovery process and it is again consistent with the previous studies such as Tse et al. (2006). The right panel indicates that the futures tend to lead the ETF, but it is less pronounced than the relationships between the index and the futures/ETF. A similar finding in terms of the price discovery process is reported by Tse et al. (2006).
Figure 2 shows the spot lead–lag time estimates \({\widehat{\vartheta }}_n(t)\) averaged over the sample period and the corresponding 95% confidence bands. We find U-shape patterns in the intraday variations for the pairs of the index and the futures/ETF. Namely, the spot lead–lag times are shorter at the beginning and the end of the day, and longer in the middle of the day. The peaks of the lead–lag times are located around 14:00. This would be because the market is active at the beginning and the end of the day, yielding fast responses of traders and shorter lead–lag times. On the other hand, the spot lead–lag time process between the futures and the ETF exhibits an increasing trend throughout the day, although the confidence bands for the estimates are comparatively wide and, thus, the significance of this trend is unclear.
Notes
In the numerical experiments below, we use \(\frac{h_n^{-1}}{h_n^{-1}-1}\ell ^{n,f}_k\) instead of \(\ell ^{n,f}_k\) (\(f=1,2\)) for the finite sample adjustments. This modification does not affect the asymptotic results presented in this paper.
Jacod et al. (2017) found empirical evidence that it is reasonable to assume autocorrelations of microstructure noise as a function of tick time.
Another approach would be bootstrapping. Bootstrap methods for estimating the distributions of estimators based on high-frequency data have been developed in, e.g., Hounyo (2017).
In the numerical experiments below, we compute \(\frac{h_n^{-1}-1}{h_n^{-1}-2K_n}{\widehat{V}}^{n}_1\) instead of \({\widehat{V}}^{n}_1\) for the finite sample adjustment. This modification does not affect the asymptotic result given by Theorem 3.2.
Although Theorem 3.1 ensures the asymptotic independence between \({\widehat{L}}^{n,1}_1\) and \({\widehat{L}}^{n,2}_1\), we take it into account in the construction of the asymptotic variance estimator \(\mathcal {V}_n\) to improve the finite sample accuracy.
References
Aït-Sahalia, Y., & Jacod, J. (2009). Testing for jumps in a discretely observed process. Annals of Statistics, 37, 184–222.
Aït-Sahalia, Y., & Jacod, J. (2014). High-frequency financial econometrics. Princeton: Princeton University Press.
Alsayed, H., & McGroarty, F. (2014). Ultra-high-frequency algorithmic arbitrage across international index futures. Journal of Forecasting, 33, 391–408.
Andersen, T. G., & Bollerslev, T. (1997). Intraday periodicity and volatility persistence in financial markets. Journal of Empirical Finance, 4, 115–158.
Andrews, D. W. K., & Cheng, X. (2012). Estimation and inference with weak, semi-strong, and strong identification. Econometrica, 80, 2153–2211.
Bacry, E., Delattre, S., Hoffmann, M., & Muzy, J. (2013). Some limit theorems for Hawkes processes and application to financial statistics. Stochastic Processes and Their Applications, 123, 2475–2499.
Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A., & Shephard, N. (2011). Multivariate realised kernels: Consistent positive semi-definite estimators of the covariation of equity prices with noise and non-synchronous trading. Journal of Econometrics, 162, 149–169.
Bibinger, M., Hautsch, N., Malec, P., & Reiß, M. (2014). Estimating the quadratic covariation matrix from noisy observations: local method of moments and efficiency. Annals of Statistics, 42, 80–114.
Bibinger, M., Hautsch, N., Malec, P., & Reiß, M. (2019). Estimating the spot covariation of asset prices—statistical theory and empirical evidence. Journal of Business and Economic Statistics, 37, 419–435.
Bibinger, M., & Reiß, M. (2014). Spectral estimation of covolatility from noisy observations using local weights. Scandinavian Journal of Statistics, 41, 23–50.
Bibinger, M., & Winkelmann, L. (2015). Econometrics of co-jumps in high-frequency data with noise. Journal of Econometrics, 184, 361–378.
Billingsley, P. (1999). Convergence of probability measures (2nd ed.). Hoboken: Wiley.
Bollen, N. P., O’Neill, M. J., & Whaley, R. E. (2017). Tail wags dog: Intraday price discovery in VIX markets. Journal of Futures Markets, 37, 431–451.
Borodin, A. N., & Salminen, P. (2002). Handbook of Brownian motion—facts and formulae (2nd ed.). Basel: Springer.
Buccheri, G., Corsi, F., & Peluso, S. (2020). High-frequency lead-lag effects and cross-asset linkages: a multi-asset lagged adjustment model. Journal of Business and Economic Statistics (forthcoming).
Ceron, A., Curini, L., & Iacus, S. M. (2016). First- and second-level agenda setting in the Twittersphere: an application to the Italian political debate. Journal of Information Technology & Politics, 13, 159–174.
Christensen, K., Oomen, R., & Podolskij, M. (2014). Fact or friction: jumps at ultra high frequency. Journal of Financial Economics, 114, 576–599.
Christensen, K., Podolskij, M., Thamrongrat, N., & Veliyev, B. (2017). Inference from high-frequency data: a subsampling approach. Journal of Econometrics, 197, 245–272.
Christensen, K., Podolskij, M., & Vetter, M. (2013). On covariation estimation for multivariate continuous Itô semimartingales with noise in non-synchronous observation schemes. Journal of Multivariate Analysis, 120, 59–84.
Da Fonseca, J., & Zaatour, R. (2015). Correlation and lead-lag relationships in a Hawkes microstructure model. Journal of Futures Markets, 37, 260–285.
de Jong, F., & Nijman, T. (1997). High frequency analysis of lead-lag relationships between financial markets. Journal of Empirical Finance, 4, 259–277.
Dette, H., & Podolskij, M. (2008). Testing the parametric form of the volatility in continuous time diffusion models—a stochastic process approach. Journal of Econometrics, 143, 56–73.
Dobrev, D., & Schaumburg, E. (2016). High-frequency cross-market trading: Model free measurement and applications. Working paper.
Freedman, D. A. (1975). On tail probabilities for martingales. Annals of Probability, 3, 100–118.
Gatheral, J., Jaisson, T., & Rosenbaum, M. (2018). Volatility is rough. Quantitative Finance, 18, 933–949.
Hall, P. (1977). Martingale invariance principles. Annals of Probability, 5, 875–887.
Hansen, P. R., & Lunde, A. (2006). Realized variance and market microstructure noise. Journal of Business and Economic Statistics, 24, 127–161.
Hasbrouck, J. (1995). One security, many markets: Determining the contributions to price discovery. Journal of Finance, 50, 1175–1199.
Hasbrouck, J., & Saar, G. (2013). Low-latency trading. Journal of Financial Markets, 16, 646–679.
Hayashi, T., & Koike, Y. (2018). Wavelet-based methods for high-frequency lead-lag analysis. SIAM Journal on Financial Mathematics, 9, 1208–1248.
Hayashi, T., & Yoshida, N. (2011). Nonsynchronous covariation process and limit theorems. Stochastic Processes and Their Applications, 121, 2416–2454.
Hoffmann, M., Rosenbaum, M., & Yoshida, N. (2013). Estimation of the lead-lag parameter from non-synchronous data. Bernoulli, 19, 426–461.
Hounyo, U. (2017). Bootstrapping integrated covariance matrix estimators in noisy jump-diffusion models with non-synchronous trading. Journal of Econometrics, 197, 130–152.
Huth, N. (2012). Some properties of the correlation between the high-frequency financial assets. Ph.D. thesis, Ecole Centrale Paris.
Huth, N., & Abergel, F. (2014). High frequency lead/lag relationships – empirical facts. Journal of Empirical Finance, 26, 41–58.
Ikeda, S. S. (2016). A bias-corrected estimator of the covariation matrix of multiple security prices when both microstructure effects and sampling durations are persistent and endogenous. Journal of Econometrics, 193, 203–214.
Ito, K., & Sakemoto, R. (2020). Direct estimation of lead-lag relationships using multinomial dynamic time warping. Asia-Pacific Financial Markets, 27, 325–342.
Jacod, J., Li, Y., & Zheng, X. (2017). Statistical properties of microstructure noise. Econometrica, 85, 1133–1174.
Jacod, J., Li, Y., & Zheng, X. (2019). Estimating the integrated volatility with tick observations. Journal of Econometrics, 208, 80–100.
Jacod, J., & Protter, P. (2012). Discretization of processes. Berlin: Springer.
Jacod, J., & Shiryaev, A. N. (2003). Limit theorems for stochastic processes (2nd ed.). Berlin: Springer.
Kalnina, I. (2011). Subsampling high frequency data. Journal of Econometrics, 161, 262–283.
Kanaya, S., & Kristensen, D. (2016). Estimation of stochastic volatility models by nonparametric filtering. Econometric Theory, 32, 861–916.
Kawaller, I. G., Koch, P. D., & Koch, T. W. (1987). The temporal price relationship between S&P 500 futures and the S&P 500 index. Journal of Finance, 42, 1309–1329.
Koike, Y. (2014). Limit theorems for the pre-averaged Hayashi-Yoshida estimator with random sampling. Stochastic Processes and Their Applications, 124, 2699–2753.
Koike, Y. (2016). Estimation of integrated covariances in the simultaneous presence of nonsynchronicity, microstructure noise and jumps. Econometric Theory, 32, 533–611.
Koike, Y. (2017a). On the asymptotic structure of Brownian motions with a small lead-lag effect. Journal of the Japan Statistical Society, 47, 1–31.
Koike, Y. (2017b). Time endogeneity and an optimal weight function in pre-averaging covariance estimation. Statistical Inference for Stochastic Processes, 20, 15–56.
Kristensen, D. (2010). Nonparametric filtering of the realized spot volatility: a kernel-based approach. Econometric Theory, 26, 60–93.
Kurisu, D. (2018). Power variations and testing for co-jumps: the small noise approach. Scandinavian Journal of Statistics, 45, 482–512.
Le Cam, L., & Yang, G. L. (2000). Asymptotics in statistics: Some basic concepts (2nd ed.). Berlin: Springer.
Li, J. (2013). Robust estimation and inference for jumps in noisy high frequency data: a local-to-continuity theory for the pre-averaging method. Econometrica, 81, 1673–1693.
Li, Y., & Mykland, P. A. (2015). Rounding errors and volatility estimation. Journal of Financial Econometrics, 13, 478–504.
Li, Y., Zhang, Z., & Li, Y. (2018). A unified approach to volatility estimation in the presence of both rounding and random market microstructure noise. Journal of Econometrics, 203, 187–222.
Marsaglia, G. (1965). Ratios of normal variables and ratios of sums of uniform variables. Journal of the American Statistical Association, 60, 193–204.
McLeish, D. L. (1975). A maximal inequality and dependent strong laws. Annals of Probability, 3, 829–839.
Meng, Y., & Lin, Z. (2009). Maximal inequalities and laws of large numbers for \({L}_q\)-mixingale arrays. Statistics & Probability Letters, 79, 1539–1547.
Mykland, P. A., & Zhang, L. (2017). Assessment of uncertainty in high frequency data: the observed asymptotic variance. Econometrica, 85, 197–231.
Oomen, R. C. A. (2006). Comment on Hansen and Lunde (2006). Journal of Business and Economic Statistics, 24, 195–202.
Ozturk, S. R., van der Wel, M., & van Dijk, D. (2017). Intraday price discovery in fragmented markets. Journal of Financial Markets, 32, 28–48.
Phillips, P. C. B. (1987). Towards a unified asymptotic theory for autoregression. Biometrika, 74, 535–547.
Phillips, P. C. B., & Magdalinos, T. (2007). Limit theory for moderate deviations from a unit root. Journal of Econometrics, 136, 115–130.
Politis, D. N., & Romano, J. P. (1994). Large sample confidence regions based on subsamples under minimal assumptions. Annals of Statistics, 22, 2031–2050.
Pomponio, F., & Abergel, F. (2013). Multiple-limit trades: empirical facts and application to lead-lag measures. Quantitative Finance, 13, 783–793.
Renò, R. (2003). A closer look at the Epps effect. International Journal of Theoretical and Applied Finance, 6, 87–102.
Robert, C. Y., & Rosenbaum, M. (2010). On the limiting spectral distribution of the covariance matrices of time-lagged processes. Journal of Multivariate Analysis, 101, 2434–2451.
Robert, C. Y., & Rosenbaum, M. (2011). A new approach for the dynamics of ultra-high-frequency data: the model with uncertainty zones. Journal of Financial Econometrics, 9, 344–366.
Rosenbaum, M. (2009). Integrated volatility and round-off error. Bernoulli, 15, 687–720.
Rosenbaum, M. (2011). A new microstructure noise index. Quantitative Finance, 11, 883–899.
Shephard, N., & Xiu, D. (2017). Econometric analysis of multivariate realised QML: estimation of the covariation of equity prices under asynchronous trading. Journal of Econometrics, 201, 19–42.
Tse, Y., Bandyopadhyay, P., & Shen, Y.-P. (2006). Intraday price discovery in the DJIA index markets. Journal of Business Finance & Accounting, 33, 1572–1585.
Ubukata, M., & Oya, K. (2009). Estimation and testing for dependence in market microstructure noise. Journal of Financial Econometrics, 7, 106–151.
Varneskov, R. T. (2016). Flat-top realized kernel estimation of quadratic covariation with nonsynchronous and noisy asset prices. Journal of Business and Economic Statistics, 34, 1–22.
Vetter, M., & Dette, H. (2012). Model checks for the volatility under microstructure noise. Bernoulli, 18, 1421–1447.
Voev, V., & Lunde, A. (2007). Integrated covariance estimation using high-frequency data in the presence of noise. Journal of Financial Econometrics, 5, 68–104.
Acknowledgements
The author is grateful to Frédéric Abergel, Simon Clinet, Masaaki Fukasawa, Takaki Hayashi, Xiaofei Lu, Hiroki Masuda, Yoann Potiron, Bezirgen Veliyev, and Nakahiro Yoshida for their valuable comments. The author also thanks two anonymous referees for their helpful comments. This work was partly supported by JST CREST Grant Number JPMJCR14D7 and JSPS KAKENHI Grant Numbers JP16K17105, JP17H01100, JP18H00836.
Ethics declarations
Conflict of interest
The author declares that there is no conflict of interest.
Appendix Proofs
1.1 Auxiliary results on the observation times
We begin by proving some auxiliary results on the observation times. For each n and each \(p=1,2\), we set \(N^{p}_n(t)=\sum _i1_{\{t^p_i\le t\}}\) for \(t\ge 0\). Then, we have the following result:
Lemma A.1
Suppose that [A1] holds true. Then, we have:
as \(n\rightarrow \infty\) for any \(T>0\) and \(p=1,2\).
Proof
First, by an argument analogous to the proof of Lemma 6.3 from Koike (2017b), it is enough to prove the lemma under the additional assumption that \(\sup _{i\ge 0}(t^p_i-t^p_{i-1})\le n^{-\xi }\) holds true for every n with some \(\xi \in (\kappa +1/2,1)\).
Set \(\alpha ^n_i=|I^p_i|/G(1)^{n,p}_{t^p_{i-1}}\). By assumption, we have \(N^{p}_n(t)/n=\sum _{i=1}^{N^{p}_{n}(t)+1}E[\alpha ^n_i|\mathcal {H}^n_{t^p_{i-1}}]+O_p(n^{\kappa -\xi })\) uniformly in \(t\in [0,T]\). On the other hand, noting that we can prove \(N^{p}_n(T)=O_p(n)\) in a similar manner to the proof of Lemma 6.1 from Koike (2017b), we obtain:
so we have \(\sum _{i=1}^{N^{p}_{n}(T)+1}\left( E[\alpha ^n_i|\mathcal {H}^n_{t^p_{i-1}}]-\alpha ^n_i\right) =O_p(n^{-1/2})\) by the Lenglart inequality. This completes the proof of the lemma. \(\square\)
The following result is an immediate consequence of Lemma A.1:
Corollary A.1
Under [A1], \((nh_n)^{-1}\sup _{0\le t\le T}(N^{p}_n(t)-N^{p}_n((t-h_n)_+))\) is tight as \(n\rightarrow \infty\) for any \(T>0\).
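For concreteness, the counting process \(N^p_n(t)\) and the windowed count appearing in Corollary A.1 can be computed from a sorted list of observation times as in the following sketch. The function names are hypothetical, and the observation times are assumed to be non-negative and sorted in increasing order.

```python
from bisect import bisect_right


def count_up_to(times, t):
    """N(t): number of observation times t_i <= t (times must be sorted)."""
    return bisect_right(times, t)


def max_window_count(times, h):
    """sup_t { N(t) - N((t - h)_+) } over t >= 0.

    This is the largest number of observations in a window ((t - h)_+, t].
    The count can only increase when t hits an observation time, so it
    suffices to scan windows that end at an observation time.
    """
    best, left = 0, 0
    for right, t in enumerate(times):
        lower = max(t - h, 0.0)
        # Drop observations that are not strictly above the lower end point.
        while left <= right and times[left] <= lower:
            left += 1
        best = max(best, right - left + 1)
    return best
```

Dividing `max_window_count(times, h_n)` by \(nh_n\) gives the normalised quantity whose tightness is asserted in the corollary.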
Next, take a number \(\eta _1\), such that \(\eta \vee (1-\alpha )<\eta _1<\frac{1}{2}\) and set \({\widetilde{t}}^p_i=(t^p_i-\vartheta ^p_n((t^p_i-n^{-\eta _1})_+)\wedge n^{-\alpha }\log n)_+\) for any n, i and \(p=1,2\).
Lemma A.2
Suppose that [A1] holds true. For sufficiently large n, \({\widetilde{t}}^p_i\) is an \((\mathcal {F}_{(t-n^{-\eta _1}/2)_+})\)-stopping time for all i.
Proof
For any \(t\ge 0\), we have:
Define the filtration \((\mathcal {I}_t)\) by \(\mathcal {I}_t=\mathcal {F}_{(t-n^{-\eta _1})_+}\). Since \(t^p_i\) is an \((\mathcal {I}_t)\)-stopping time and the process \(\vartheta ^p_n((t-n^{-\eta })_+)\) is \((\mathcal {I}_{t})\)-adapted by assumption, we have \(\{t^p_i\le t+\vartheta ^p_n((t^p_i-n^{-\eta _1})_+)\wedge n^{-\alpha }\log n\}\in \mathcal {I}_{t^p_i}\). Therefore, we obtain \(\{{\widetilde{t}}^p_i\le t\}\in \mathcal {I}_{t+n^{-\alpha }\log n}\). Since \(n^{\eta _1-\alpha }\log n\rightarrow 0\) as \(n\rightarrow \infty\), for sufficiently large n, we have \(\mathcal {I}_{t+n^{-\alpha }\log n}\subset \mathcal {F}_{(t-n^{-\eta _1}/2)_+}\). This completes the proof. \(\square\)
1.2 Notation
In this subsection, we introduce some notation used throughout the proof. For (random) sequences \((x_n)\) and \((y_n)\), \(x_n\lesssim y_n\) means that there exists a (non-random) constant \(K\in [0,\infty )\), such that \(x_n\le Ky_n\) for large n. For a real-valued function f defined on [0, 1] and \(\delta >0\), we denote its modulus of continuity by \(w(f;\delta )\), that is:
$$\begin{aligned} w(f;\delta )=\sup \{|f(s)-f(t)|:s,t\in [0,1],\ |s-t|\le \delta \}. \end{aligned}$$
For every \(p=1,2\) we set \(\mathcal {G}^p_j=\sigma (\epsilon ^p_i;i\le j)\), \(j=0,1,\dots\). We define the filtration \((\mathcal {F}^n_t)_{t\ge 0}\) by \(\mathcal {F}^n_t=\mathcal {F}_t\vee \mathcal {G}^1_{N^1_n(t)}\vee \mathcal {G}^2_{N^2_n(t)}\). We set \(T^n_k=kh_n\) for all n, k. For each \(p=1,2\) and every \(k=0,1,\dots ,h_n^{-1}-1\), we define the variables \(S^p_k(X)\) and \(S^p_k(\epsilon )\) by:
We set \(L^n_t=(L^{n,1}_t,L^{n,2}_t)^\top\) and \({\widehat{L}}^n_t=({\widehat{L}}^{n,1}_t,{\widehat{L}}^{n,2}_t)^\top\) for \(t\ge 0\).
For any process V and any (random) interval \(I=[S,T)\), we define the variable V(I) by \(V(I)=V_T-V_S\). We also set \(|I|=T-S\) and \(I(s)=[S\wedge s,T\wedge s)\) for \(s\ge 0\).
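The interval notation above can be mirrored by a small data structure, which may help when reading the later sums over \(I(s)\) and \(V(I)\). The class below is only an illustrative sketch. Its name and method names are hypothetical, and paths are represented as plain functions of time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Half-open interval I = [start, end), with start = S and end = T."""
    start: float
    end: float

    def length(self):
        """|I| = T - S."""
        return self.end - self.start

    def stopped(self, s):
        """I(s) = [min(S, s), min(T, s)), the interval stopped at time s."""
        return Interval(min(self.start, s), min(self.end, s))

    def increment(self, path):
        """V(I) = V_T - V_S for a path V given as a function of time."""
        return path(self.end) - path(self.start)
```

For instance, `Interval(0.2, 0.5).stopped(0.3)` is the interval `[0.2, 0.3)`, and stopping at a time before `start` yields an interval of length zero.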
We set \({\widetilde{I}}^p_i=[{\widetilde{t}}^p_{i-1},{\widetilde{t}}^p_i)\) for each \(p=1,2\) and every i. For a bivariate process V, we define the variable \({\widetilde{S}}^p_k(V)\) by:
for each \(p=1,2\) and every k, where \({\overline{\Phi }}_k(i,p)=\Phi _k({\bar{t}}^p_i)1_{J^n_k}(t^p_i)\). We also set:
and
where \(\Delta {\overline{\Phi }}_k(i,p)={\overline{\Phi }}_k(i+1,p)-{\overline{\Phi }}_k(i,p)\). Note that we have:
due to [A1](i). Then, we set:
\({\widetilde{S}}^p_k(\epsilon )\) is Gordin’s martingale approximation of \(S^p_k(\epsilon )\) (cf. Sect. 19 of Billingsley (1999)).
We define the processes B and M by \(B_t=\int _0^tb_s\mathrm {d}s\) and \(M_t=\int _0^t\Sigma _s^{1/2}\mathrm {d}W_s\) for \(t\ge 0\). For any \(Z,Z'\in \{X,B,M,\epsilon \}\), we set:
and
Then, for every k, we set:
and \(\xi ^n_k=(\xi ^{n,1}_k,\xi ^{n,2}_k)^\top\). For every k, we also define the bivariate variable \({\bar{\xi }}^n_k=({\bar{\xi }}^{n,1}_k,{\bar{\xi }}^{n,2}_k)^\top\) by:
and
Then, we set \({\hat{\xi }}^n_k=h_n^{-\frac{1}{2}}\left( \xi ^n_k-{\bar{\xi }}^n_k\right)\). Note that \({\hat{\xi }}^n_k\) is \(\mathcal {F}^n_{T^n_{k+1}}\)-measurable and satisfies \(E[{\hat{\xi }}^n_k|\mathcal {F}^n_{T^n_{k-1}}]=0\) (here, we use the independence between \(\epsilon ^1\) and \(\epsilon ^2\)).
1.3 Proof of Theorem 3.1
First of all, for the proof, we may additionally assume the following condition:
- [B] The processes b, \(\Sigma\), \(v^1\), and \(v^2\) are bounded.
This is due to a standard localization procedure described in detail in Lemma 4.4.9 of Jacod and Protter (2012) for instance. Moreover, noting Corollary A.1, by an analogous localization argument to Lemma 6.3 from Koike (2017b), we may also assume the following strengthened version of [A1]:
- [SA1] We have [A1], and for every n, it holds that:
$$\begin{aligned} \max _{p=1,2}\sup _{i\ge 0}(t^p_i-t^p_{i-1})\le {\bar{r}}_n \end{aligned}\tag{A.2}$$
and
$$\begin{aligned} \max _{p=1,2}\sup _{t\ge 0}\{N^p_n(t)-N^p_n((t-h_n)_+)\}\le nh_n\log n, \end{aligned}\tag{A.3}$$
where \({\bar{r}}_n=n^{-\xi }\) and \(\xi\) is a constant satisfying:
$$\begin{aligned}&\frac{15}{16}\vee \left( \frac{7}{8}+\frac{\lambda }{4}\right) \vee \left( 1-\frac{\varpi }{4}\right) \vee \left( \frac{\kappa }{2}+\frac{3}{4}\right) \vee \left( \frac{z}{4}+\frac{15}{16}\right) \\&\quad \vee \{1-\gamma +2z(1+\gamma )\}\vee \frac{\nu +7}{8}<\xi <1. \end{aligned}\tag{A.4}$$
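Condition (A.4) is only meaningful when the maximum on its left-hand side is strictly below 1. As a rough sanity check, the sketch below evaluates that maximum for one set of parameter values and then verifies three inequalities that the later proofs derive from (A.4). The function name `xi_lower_bound` and all numerical parameter values are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
def xi_lower_bound(lam, varpi, kappa, z, gamma, nu):
    """Left-hand side of (A.4): the maximum of the seven lower bounds on xi."""
    return max(
        15 / 16,
        7 / 8 + lam / 4,
        1 - varpi / 4,
        kappa / 2 + 3 / 4,
        z / 4 + 15 / 16,
        1 - gamma + 2 * z * (1 + gamma),
        (nu + 7) / 8,
    )


# Hypothetical parameter values (illustration only, not from the paper).
params = dict(lam=0.25, varpi=0.5, kappa=0.2, z=0.05, gamma=0.5, nu=0.5)

lower = xi_lower_bound(**params)
feasible = lower < 1          # (A.4) admits some xi only if this holds
xi = (lower + 1) / 2          # an arbitrary admissible xi in (lower, 1)

# Inequalities invoked later in the proofs as consequences of (A.4).
consequences = {
    "8*xi - 7 > 2*lambda": 8 * xi - 7 > 2 * params["lam"],
    "z < 4*xi - 15/4": params["z"] < 4 * xi - 15 / 4,
    "2*xi - 3/2 > 1/5": 2 * xi - 3 / 2 > 1 / 5,
}
print(lower, feasible, consequences)
```

The three checked inequalities are the ones used in the proofs of Lemma A.13, Theorem 3.2, and Lemma A.15, respectively; each follows from a single term of the maximum in (A.4).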
Now, we turn to the main body of the proof. The proof of the theorem is completed once we show that the following three propositions hold true:
Proposition A.1
Suppose that [B], [SA1], [A2] and [\(\hbox {N}_\lambda\)] are satisfied for some \(\lambda \in (0,\frac{1}{2})\). Then, we have:
as \(n\rightarrow \infty\).
Proposition A.2
Under the assumptions of Proposition A.1, we have:
as \(n\rightarrow \infty\) for the Skorokhod topology, where \({\widetilde{W}}^1\) and \({\widetilde{W}}^2\) are the same as in Theorem 3.1.
Proposition A.3
Under the assumptions of Proposition A.1, we have:
as \(n\rightarrow \infty\).
1.3.1 Proof of Proposition A.1
First, using elementary properties of the trigonometric functions, we obtain:
for any \(n\in \mathbb {N}\), \(k\in \mathbb {R}\), and \(t,s\ge 0\) as well as:
for any \(n\in \mathbb {N}\), \(k\in \mathbb {R}\), \(i\in \mathbb {Z}_+\), and \(p=1,2\).
Next, we prove some auxiliary results.
Lemma A.3
Under the assumptions of Proposition A.1, we have:
Proof
Summation by parts yields:
Since it holds that \(\sum _{i=-1}^\infty \Delta {\overline{\Phi }}_k(i,p)=0\), we have:
so we obtain \(\sup _{0\le k\le h_n^{-1}}|S_k^p(X)|=O_p(\sqrt{h_n\log n})\) due to the fact that \(w(X^p;\delta )=O_p(\sqrt{\delta |\log \delta |})\) as \(\delta \rightarrow 0\). We can prove \(\sup _{0\le k\le h_n^{-1}}|{\widetilde{S}}_k^p(X)|=O_p(\sqrt{h_n\log n})\) in a similar manner. Also, an analogous argument to the above yields:
on the set \(\{\sup _{0\le t\le 1}c_\vartheta ^p(t)\le \log n\}\). Since \(P(\sup _{0\le t\le 1}c_\vartheta ^p(t)>\log n)\rightarrow 0\) as \(n\rightarrow \infty\), we obtain \(\sup _{0\le k\le h_n^{-1}}|S_k^p(X)-{\widetilde{S}}^p_k(X)|=O_p(\sqrt{n^{-\alpha -\eta _1}\log n})\) by [A2]. \(\square\)
Lemma A.4
Under the assumptions of Proposition A.1, we have:
as \(n\rightarrow \infty\).
Proof
Summation by parts yields \(S_k^p(\epsilon )=-\sum _{i=0}^\infty \epsilon ^p_{i}\Delta {\overline{\Phi }}_k(i,p).\) Therefore, we have:
by the boundedness of \(v^p\), (A.6) and [\(\hbox {N}_\lambda\)], so we obtain the desired result. \(\square\)
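The summation-by-parts step used in the proofs of Lemmas A.3 and A.4 rests on the elementary Abel identity \(\sum _{i=0}^{N}(a_i-a_{i-1})w_i=-\sum _{i=0}^{N}a_i(w_{i+1}-w_i)\), valid when \(a_{-1}=0\) and \(w_{N+1}=0\). The following minimal sketch checks this identity numerically. The sequences `a` and `w` are generic random stand-ins, not the paper's \(\epsilon ^p_i\) and \({\overline{\Phi }}_k(i,p)\), and the boundary conventions are assumptions of the sketch.

```python
import random


def lhs(a, w):
    """Sum of increments of a weighted by w, with the convention a_{-1} = 0."""
    prev = [0.0] + a[:-1]
    return sum((ai - pi) * wi for ai, pi, wi in zip(a, prev, w))


def rhs(a, w):
    """Minus the sum of a weighted by forward differences of w, with w_{N+1} = 0."""
    nxt = w[1:] + [0.0]
    return -sum(ai * (ni - wi) for ai, ni, wi in zip(a, nxt, w))


random.seed(0)
a = [random.gauss(0, 1) for _ in range(50)]      # stand-in for the noise terms
w = [random.uniform(-1, 1) for _ in range(50)]   # stand-in for the weights

gap = abs(lhs(a, w) - rhs(a, w))
print(gap)  # should be at the level of floating-point rounding error
```

Moving the difference operator from the noise onto the smooth weights is what lets the proof exploit the small increments of the weight function.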
Lemma A.5
Under the assumptions of Proposition A.1, for any \(r\in [2,\lambda ^{-1})\), there is a constant \(K_r>0\), such that:
for any \(n\in \mathbb {N}\), \(k\in \mathbb {R}\), \(i\in \mathbb {Z}_+\), and \(p=1,2\).
Proof
By the independence between \((u^p_i)\) and \(\mathcal {F}_\infty\), we have:
so the Minkowski inequality, (A.2), (A.6), and the boundedness of \(v^p\) yield:
Now, for any \(q\ge r\), Lemma 3.102 from Ch. VIII of Jacod and Shiryaev (2003) and [\(\hbox {N}_\lambda\)] yield:
Since we can take \(q\ge r\), such that \(1/r-1/q>\lambda\) by assumption, we obtain the desired result by [\(\hbox {N}_\lambda\)]. \(\square\)
Lemma A.6
Under the assumptions of Proposition A.1, for any \(r\in [2,\lambda ^{-1})\), there is a constant \(K_r>0\), such that:
for any \(p=1,2\), \(n\ge 1\), and \(k=0,1,\dots ,h_n^{-1}\).
Proof
Noting that \(({\widetilde{\epsilon }}_k(i,p))_{i\ge 1}\) is a martingale difference with respect to the filtration \((\mathcal {F}^n_{t^p_i})_{i\ge 0}\) and that \(\{t^p_i\in J^p_k\}\in \mathcal {F}_{T^n_{(k-1)_+}}\) due to [A1], by the Burkholder–Davis–Gundy (henceforth BDG) and Minkowski inequalities, we obtain:
Therefore, (A.6), [\(\hbox {N}_\lambda\)], Lemma A.5, and (A.3) imply that:
This completes the proof. \(\square\)
Lemma A.7
Under the assumptions of Proposition A.1, we have:
as \(n\rightarrow \infty\) for every \(Z\in \{B,M,\epsilon \}\).
Proof
By symmetry, it suffices to prove the first convergence in (A.8).
First, the case of \(Z=B\) is obvious, because \(|{\widetilde{S}}^p_k(B)|\lesssim h_n+{\bar{r}}_n+\eta _n(h_n+{\bar{r}}_n)\lesssim h_n\) uniformly in k.
Next, we consider the case of \(Z=\epsilon\). Since \(\xi ^f_k(B,\epsilon )\) is \(\mathcal {F}^n_{T^n_{(k+1)}}\)-measurable and \(E[\xi ^f_k(B,\epsilon )|\mathcal {F}^n_{T^n_{(k-l)}}]=0\) if \(l\ge 2\), the McLeish inequality (Theorem 1.6 of McLeish (1975)) yields:
Since \(|{\widetilde{S}}^p_k(B)|\lesssim h_n\), the desired result follows from Lemma A.6.
Finally, we consider the case of \(Z=M\). We decompose \(\sum _{k=1}^{\lfloor th_n^{-1}\rfloor -1}\xi ^f_k(B,M)\) as:
First, we prove:
We adopt the same strategy as in Sect. 6.1.3 of Christensen et al. (2013). For any càdlàg process V, we denote by \(N_\delta ^V(t)\) the number of jumps of V bigger than \(\delta >0\) before time t. Furthermore, we define:
for \(\rho ,\delta >0\). We evidently have \(\lim _{\delta \rightarrow 0}\limsup _{\rho \rightarrow 0}m_{\rho ,\delta }(V)=0\) a.s. Now, we have:
for any \(\delta >0\) and \(i\in \mathbb {Z}_+\), such that \(t^p_i\in (T^n_{k-3/2},T^n_{k+1}]\), so we obtain:
On the other hand, the Schwarz inequality yields:
and by Lemma A.2, we obtain:
Consequently, by the Schwarz inequality, we obtain:
By the bounded convergence theorem, we have:
so we conclude that \(h_n^{-\frac{1}{2}}\sup _{0\le t\le 1}\left| \mathbb {I}^n_t\right| \rightarrow ^p0\).
Next, we consider \(\mathbb {II}^n_t\). Since \(\sum _{k=1}^{\lfloor th_n^{-1}\rfloor -1}\sum _{i,j}b_{T^n_{k-2}}|{\widetilde{I}}^p_i|M^q({\widetilde{I}}^q_j)\varvec{\Phi }^f_k(i,j)\) is \(\mathcal {F}_{T^n_k}\)-measurable and:
if \(l\ge 2\) by Lemma A.2, the McLeish inequality implies that:
so we obtain \(h_n^{-\frac{1}{2}}\sup _{0\le t\le 1}\left| \mathbb {II}^n_t\right| =o_p(1)\). This completes the proof. \(\square\)
Proof of Proposition A.1
In the light of Lemma A.7, it is enough to show that:
as \(n\rightarrow \infty\) for any \(Z\in \{X,\epsilon \}\), any \(l,m\in \{0,\frac{1}{2},1,\frac{3}{2}\}\) and any \(p,q=1,2\), such that \(p\ne q\). (A.10)–(A.11) follow from Lemmas A.3–A.4 and A.6.
Now, we prove (A.12). It holds that:
By definition we have \(\Theta _{k-m}(N^q_n(T^n_{k-m+1}),q)=0\) and
Therefore, we obtain:
Thus, for any \(r\ge m\), we have:
where \(\mathcal {G}^p_\infty =\bigvee _{i=1}^\infty \mathcal {G}^p_i\). Now, Lemma 3.102 from Ch. VIII of Jacod and Shiryaev (2003) yields:
for any \(c\ge 2\), so we obtain:
Consequently, Lemmas A.3–A.4 and the McLeish inequality imply that:
so we obtain (A.12). We can prove (A.13) analogously, so we complete the proof. \(\square\)
1.3.2 Proof of Proposition A.2
The following lemma is an easy consequence of the BDG inequality.
Lemma A.8
Under the assumptions of Proposition A.2, for any \(r>0\), there is a constant \(C_r>0\), such that:
for any \(p=1,2\) and n, k.
To prove Proposition A.2, we again utilize Gordin’s martingale approximation method. For every \(k=0,1,\dots ,h_n^{-1}-1\), we set:
Since \({\hat{\xi }}^n_k\) is \(\mathcal {F}^n_{T^n_{k+1}}\)-measurable and satisfies \(E[{\hat{\xi }}^n_k|\mathcal {F}^n_{T^n_{k-1}}]=0\), \((\zeta ^n_k)_{k=0}^{h_n^{-1}-1}\) is a sequence of martingale differences with respect to the filtration \((\mathcal {F}^n_{T^n_{k+1}})_{k=-1}^{h_n^{-1}-1}\). Moreover, we have the following result.
Lemma A.9
Under the assumptions of Proposition A.2, we have:
as \(n\rightarrow \infty\).
Proof
Since we have:
it suffices to prove:
We have:
for any \(r\in (1,\lambda ^{-1})\) by Lemmas A.6 and A.8. Since we can take \(r>\frac{1/2}{2\xi -7/4}\) by (A.4), we obtain the desired result. \(\square\)
By Lemma A.9, it is enough to prove:
as \(n\rightarrow \infty\) for the Skorokhod topology. In the light of Theorem 2.2.15 of Jacod and Protter (2012), it suffices to verify the following conditions:
for any \(t>0\), \(f,g\in \{1,2\}\), \(j\in \{1,2\}\), and any bounded \((\mathcal {F}_t)\)-martingale N orthogonal to W as well as for some \(r>2\).
Proof of Eq. (A.18) By Lemmas A.6 and A.8, we have \(E\left[ \left| \zeta ^{n,f}_k\right| ^r\right] \lesssim h_n^{(4\xi -7/2)r}(\log n)^{r}\) for any \(r\in [2,\lambda ^{-1})\). Since we have \(\frac{2}{8\xi -7}<\lambda ^{-1}\), we can take \(r\in (2,\lambda ^{-1})\), such that (A.18) holds true. \(\square\)
Proof of Eqs. (A.19)–(A.20) For \(L\in \{W^j,N\}\), we have:
Since we have:
\(E\left[ \mathsf {M} _{i,j}\left( L_{T^n_{k+1}}-L_{T^n_k}\right) |\mathcal {F}^n_{T^n_k}\right] =0\) if \(L=N\). On the other hand, if \(L=W^j\), from the above expression, we can prove:
in an analogous manner to the proof of Lemma A.7. Consequently, we complete the proof. \(\square\)
For the proof of Eq. (A.17), we need some auxiliary results.
Lemma A.10
Under the assumptions of Proposition A.2, we have:
as \(n\rightarrow \infty\) for any \(t\in [0,1]\).
Proof
Since we have:
it suffices to prove \(\sum _{k=1}^{\lfloor th_n^{-1}\rfloor -1}E[R^n_k|\mathcal {F}^n_{T^n_k}]\rightarrow ^p0\), where:
For any \(r\in (1,(2\lambda )^{-1}\wedge 2)\), we have:
by the BDG inequality as well as Lemmas A.6 and A.8. Since we can take \(r>\frac{2}{8\xi -7}\), we obtain:
Now, we have:
so we obtain \(\sum _{k=1}^{\lfloor th_n^{-1}\rfloor -1}R^n_k\rightarrow ^p0\) by (A.16). This completes the proof. \(\square\)
Lemma A.11
Under the assumptions of Proposition A.2, we have:
as \(n\rightarrow \infty\) for any \(t\in [0,1]\), \(p,q,p',q'\in \{1,2\}\) and \(l,m,l',m'\in \{0,\frac{1}{2},1,\frac{3}{2}\}\), where:
Proof
The Schwarz and BDG inequalities yield:
so the McLeish inequality implies that:
Since we have:
by Lemma A.2, an analogous argument to the proof of (A.9) yields:
This completes the proof. \(\square\)
Lemma A.12
Under the assumptions of Proposition A.2, we have:
as \(n\rightarrow \infty\) for any \(t\in [0,1]\), \(f,g\in \{1,2\}\), and \(l,m\in \{0,1\}\).
Proof
We note that integration by parts yields:
for any \(p,q,p',q'\in \{1,2\}\) and any \(i,j,i',j'\in \mathbb {Z}_+\). Therefore, Lemma A.11 implies that:
Now, (A.5) yields the desired result. \(\square\)
Lemma A.13
Under the assumptions of Proposition A.2, we have:
as \(n\rightarrow \infty\) for any \(t\in [0,1]\), \(l,m,l',m'\in \{0,\frac{1}{2},1,\frac{3}{2}\}\) and any \(p,q\in \{1,2\}\), such that \(p\ne q\).
Proof
Since \(8\xi -7>2\lambda\) by (A.4), we can take a number r, such that \((8\xi -7)^{-1}<r<(2\lambda )^{-1}\). Then, we have:
by Lemma A.6 and the Schwarz inequality. This estimate and Theorem 1 from Meng and Lin (2009) yield:
Next, by (A.14), the triangle inequality, and (A.6), we have:
so, Lemma 3.102 from Ch. VIII of Jacod and Shiryaev (2003) and [\(\hbox {N}_\lambda\)] imply that:
Combining this with the Schwarz inequality and Lemmas A.4 and A.6, we deduce that:
Now, Lemma 3.102 from Ch. VIII of Jacod and Shiryaev (2003), the Hölder inequality, (A.6), and [\(\hbox {N}_\lambda\)] yield:
so we obtain:
by Lemma A.4 and the Schwarz inequality. Moreover, by the \(\varpi\)-Hölder continuity of \(v^q\), [A1](i) and Lemma A.4, we have:
In the meantime, a similar argument to the above yields:
This completes the proof. \(\square\)
Lemma A.14
Under the assumptions of Proposition A.2, we have:
as \(n\rightarrow \infty\) for any \(l,m,l',m'\in \{0,\frac{1}{2},1,\frac{3}{2}\}\) and any \(p,q\in \{1,2\}\), such that \(p\ne q\).
Proof
By an analogous argument to the proof of Lemma A.13, we obtain:
Then, an analogous argument to the proof of (A.9) yields:
This completes the proof. \(\square\)
Lemma A.15
Suppose that [SA1] holds true. Then, we have:
as \(n\rightarrow \infty\) for every \(p=1,2\).
Proof
Set \(\beta ^n_i=|I^p_i|^2-E[|I^p_i|^2|\mathcal {H}^{n,p}_{t^p_{i-1}}]\). From [A1](iv)–(v), it suffices to prove:
Noting that:
by [A1] and Corollary A.1, we have:
for any \(\delta >0\) because \(2\xi -3/2>1/5\). Therefore, the Freedman inequality (Theorem 1.6 of Freedman (1975)) yields:
which completes the proof. \(\square\)
Lemma A.16
Suppose that [SA1] holds true. Let \(s\in [0,1]\) and set \(k=\lceil sh_n^{-1}\rceil\). Then, we have:
as \(n\rightarrow \infty\) for any \(l,m\in \{-\frac{1}{2},0,\frac{1}{2}\}\) and \(j\in \mathbb {Z}_+\).
Proof
The desired result can be shown by combining the identity \(\Delta {\overline{\Phi }}_{k'}(i,p)=\int _{{\overline{t}}^p_i}^{{\overline{t}}^p_{i+1}}\Phi _{k'}'(s)\mathrm {d}s\) holding if \(t^p_i,t^p_{i+1}\in J^n_{k'}\) for any \(k'\in \mathbb {R}\), \(i\in \mathbb {Z}\) and \(p=1,2\) with (A.2) and Lemma A.15. \(\square\)
Lemma A.17
Suppose that [SA1] holds true. Let \(s\in [0,1]\) and set \(k=\lceil sh_n^{-1}\rceil\). Then, we have:
as \(n\rightarrow \infty\) for any \(l,m\in \{-\frac{1}{2},0,\frac{1}{2}\}\), \(r\in \{0,1\}\) and \(j\in \mathbb {N}\).
Proof
We prove (A.22). First, we have:
Next, by [A1], we have:
In the meantime, since \((G(2)^{n,p})_{n\ge 1}\) is tight for the Skorokhod topology, by Theorem 1.14b) from Ch. VI of Jacod and Shiryaev (2003), we have:
Now, by [A1], we have:
Moreover, since we have:
the Lenglart inequality yields:
Finally, by Theorem 1.14b) from Ch. VI of Jacod and Shiryaev (2003), we obtain:
This completes the proof of (A.22).
(A.23) can be shown in an analogous manner. \(\square\)
Lemma A.18
Suppose that [SA1] holds true. Then, we have:
as \(n\rightarrow \infty\) for every \(p=1,2\).
Proof
The results are consequences of standard Riemann sum approximations. \(\square\)
Proof of Eq. (A.17) The desired result can be derived from a combination of Lemmas A.10–A.18, the dominated convergence theorem, and tedious computations. \(\square\)
1.3.3 Proof of Proposition A.3
For every \(k\in \mathbb {R}\), we define the function \(\Psi _k\) on \(\mathbb {R}\) by \(\Psi _{k}(t)=\cos \left( \pi h_n^{-1}(t-kh_n)\right)\). Then, noting that \(\Phi _{k-1}(t)=-\Phi _k(t)=\Psi _{k-1/2}(t)\), we have:
When \(t^1_i\in J^n_{k-1/2}\), \(t^2_j\in J^n_{k-1}\cup J^n_k\) and \({\widetilde{I}}^p_i\cap {\widetilde{I}}^q_j\ne \emptyset\), we have:
so we obtain:
by (A.5). Now, applying the formula \(2\cos (x)\sin (y)=\sin (x+y)-\sin (x-y)\), we obtain:
Since we have:
we obtain:
An analogous argument to the above also yields:
Thus, we complete the proof. \(\square\)
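The two trigonometric facts used in this proof, the shift relation \(\Phi _{k-1}(t)=-\Phi _k(t)=\Psi _{k-1/2}(t)\) and the product-to-sum formula \(2\cos (x)\sin (y)=\sin (x+y)-\sin (x-y)\), can be checked numerically. In the sketch below, \(\Psi _k\) follows the definition given above, whereas the sine form of \(\Phi _k\) is an assumption (its definition is not reproduced in this part of the text), chosen because it is consistent with the stated shift relation. The bandwidth value and the test points are hypothetical.

```python
import math

h = 0.01  # hypothetical value of the bandwidth h_n


def Psi(k, t):
    """Psi_k(t) = cos(pi * h^{-1} * (t - k*h)), as defined in the text."""
    return math.cos(math.pi * (t - k * h) / h)


def Phi(k, t):
    """ASSUMPTION: Phi_k(t) = sin(pi * h^{-1} * (t - k*h))."""
    return math.sin(math.pi * (t - k * h) / h)


# Shift relation Phi_{k-1} = -Phi_k = Psi_{k-1/2} at a few arbitrary points.
points = [(k, t) for k in (1, 2.5, 7) for t in (0.0, 0.0123, 0.05, 0.0771)]
shift_err = max(
    max(abs(Phi(k - 1, t) + Phi(k, t)), abs(Phi(k - 1, t) - Psi(k - 0.5, t)))
    for k, t in points
)

# Product-to-sum formula 2 cos(x) sin(y) = sin(x + y) - sin(x - y).
pairs = [(0.3, 1.1), (-2.0, 0.7), (4.5, -3.2)]
prod_err = max(
    abs(2 * math.cos(x) * math.sin(y) - (math.sin(x + y) - math.sin(x - y)))
    for x, y in pairs
)
print(shift_err, prod_err)  # both should be at rounding-error level
```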
1.4 Proof of Theorem 3.2
As in the previous section, for the proof, we may additionally assume that [B] and [SA1] are satisfied.
Set:
By Lemmas A.3, A.4, A.6, A.8, and (A.21), we have:
so we obtain:
because \(z<4\xi -\frac{15}{4}\).
Next, we set:
From the arguments in the proof of Proposition A.3, we have:
so we obtain:
Analogously, we can deduce \({\bar{\Xi }}^{n}(\beta +K_n)^2-{\bar{\Xi }}^{n}(\beta )^2 =O_p((K_nh_n)^{\gamma +1}+K_n{\bar{r}}_n+n^{-\alpha }).\) Therefore, we have:
Let \(m\in \{0,K_n\}\). Since \({\hat{\Xi }}^{n}(\beta )\) is \(\mathcal {F}^n_{\beta +K_n}\)-measurable and satisfies \(E[{\hat{\Xi }}^{n}(\beta )|\mathcal {F}^n_{\beta -1}]=0\), we have:
and
Moreover, since \(\xi ^n_k\) is \(\mathcal {F}^n_{T^n_{k+1}}\)-measurable and satisfies \(E[\xi ^n_k|\mathcal {F}^n_{T^n_{k-1}}]=0\), we obtain:
where \(\eta ^n_k={\hat{\xi }}^n_k({\hat{\xi }}^n_k)^\top +{\hat{\xi }}^n_k({\hat{\xi }}^n_{k+1})^\top +{\hat{\xi }}^n_{k+1}({\hat{\xi }}^n_k)^\top\). Since we have:
and
we conclude that:
Now, the desired result follows from Lemma A.10 and (A.17). \(\square\)
1.5 Proof of Theorem 5.1
As in Sect. C, we may additionally assume that [B] and [SA1] are satisfied.
The result follows once we prove the following statements:
as \(n\rightarrow \infty\). First, (A.24) can be shown in an analogous manner to the proof of Proposition A.1. Next, from the arguments in the proof of Proposition A.3, we have:
Moreover, by [A2], we obtain:
so we have:
Analogously, we can prove:
so we obtain (A.27). In the meantime, summation by parts yields:
so Lemmas A.6 and A.8 as well as the piecewise Lipschitz continuity of \(\mathcal {K}\) imply that:
by (A.4). This completes the proof of (A.25).
Now we prove (A.26). By virtue of Proposition 5.33 from Ch. VIII of Jacod and Shiryaev (2003), it is enough to prove:
as \(n\rightarrow \infty\) for any \(\mathcal {F}_\infty\)-measurable variable U and any \(x,y,z\in \mathbb {R}\), where:
The basic strategy of the proof is the same as the one used in Hall (1977). Analogous arguments to the proofs of (A.17)–(A.18) yield:
for some \(r>2\). The second statement implies that:
Let us set \(\eta ^n_k={\tilde{\zeta }}^n_k1_{\{\sum _{l=1}^{k-1}\left( {\tilde{\zeta }}^n_l\right) ^2\le {\mathfrak {V}}_{(t-H_n)_+}+1\}}\). Then, noting that \({\mathfrak {V}}_{(t-H_n)_+}\rightarrow ^p{\mathfrak {V}}_t\) by assumption, the above equations imply that:
and
In particular, (A.28) and the bounded convergence theorem yield:
In the next step, we define the function R(x) by \(e^{\sqrt{-1}x}=(1+\sqrt{-1}x)\exp (-x^2/2+R(x))\) and note that \(|R(x)|\le |x|^3\) for \(|x|<1\). Then, we have:
where \(T_n=\prod _{k=1}^{h_n^{-1}-1}(1+\sqrt{-1}\eta ^n_k).\) In particular, (A.29) yields:
Moreover, setting \(J_n=\max \{k\le h_n^{-1}-1:\sum _{l=1}^{k-1}\left( {\tilde{\zeta }}^n_l\right) ^2\le {\mathfrak {V}}_{(t-H_n)_+}+1\}\), we have:
so \(\sup _nE\left[ \left| T_n\right| ^2\right] <\infty\) because of the boundedness of \({\mathfrak {V}}^n_{(t-H_n)_+}\) and (A.29). Therefore, \((T_n)\) is uniformly integrable. This implies the uniform integrability of the variables on the left-hand side of (A.31), so the Vitali convergence theorem yields:
In the third step, let \(\mathcal {U}_t\) be the càdlàg version of the bounded martingale \(E[\exp (\sqrt{-1}zU)|\mathcal {F}_t]\). Noting that \(\mathcal {U}_t=E[\exp (\sqrt{-1}zU)|\mathcal {F}^n_t]\) a.s., because U is \(\mathcal {F}_\infty\)-measurable and that \(T_n\) is \(\mathcal {F}^n_{t+H_n+h_n}\)-measurable, because \(\mathcal {K}\) is supported by \([-1,1]\), we have:
Therefore, the Vitali convergence theorem yields:
Now, since \((\eta ^n_k)_{k=1}^{h_n^{-1}-1}\) is a martingale difference with respect to the filtration \((\mathcal {F}^n_{T^n_{k+1}})\) by construction, we obtain:
where \(T'_n=\prod _{k=1}^{\lceil th_n^{-1}\rceil -2}(1+\sqrt{-1}\eta ^n_k).\) Since \(T'_n\) is \(\mathcal {F}^n_{t-}\)-measurable, we obtain:
\({\mathfrak {V}}_t={\mathfrak {V}}_{t-}\) by assumption and we can prove the uniform integrability of \((T_n')\) similarly to the above, so the Vitali convergence theorem yields:
Since \(\eta ^n_k\equiv 0\) if \(kh_n<t-H_n\), because \(\mathcal {K}\) is supported by \([-1,1]\), we obtain:
so the bounded convergence theorem yields:
From (A.30) and (A.32)–(A.34), we obtain the desired result. \(\square\)
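The second step of the proof relies on the remainder \(R(x)\) defined through \(e^{\sqrt{-1}x}=(1+\sqrt{-1}x)\exp (-x^2/2+R(x))\) and on the bound \(|R(x)|\le |x|^3\) for \(|x|<1\). Solving for \(R\) with the principal branch of the logarithm gives \(R(x)=\sqrt{-1}x+x^2/2-\log (1+\sqrt{-1}x)\). The sketch below checks both the defining identity and the bound on a grid; the grid and the tolerance are arbitrary choices, and the check is a numerical illustration rather than a proof.

```python
import cmath


def R(x):
    """Remainder solving exp(ix) = (1 + ix) * exp(-x^2/2 + R(x)), principal log."""
    return 1j * x + x * x / 2 - cmath.log(1 + 1j * x)


# Grid on (-1, 1) excluding 0, where the ratio |R(x)| / |x|^3 is undefined.
xs = [i / 1000 for i in range(-999, 1000) if i != 0]

# Largest observed value of |R(x)| / |x|^3; the stated bound says it is <= 1.
max_ratio = max(abs(R(x)) / abs(x) ** 3 for x in xs)

# Reconstruction error of the defining identity.
recon_err = max(
    abs(cmath.exp(1j * x) - (1 + 1j * x) * cmath.exp(-x * x / 2 + R(x)))
    for x in xs
)
print(max_ratio, recon_err)
```

The expansion \(\log (1+\sqrt{-1}x)=\sqrt{-1}x+x^2/2-\sqrt{-1}x^3/3-\cdots\) shows that the leading term of \(R(x)\) is \(\sqrt{-1}x^3/3\), which is why the cubic bound holds with room to spare.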
1.6 Proof of Theorem 5.2
The theorem can be shown in a parallel manner to the proof of Theorem 3.2. \(\square\)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Koike, Y. Inference for time-varying lead–lag relationships from ultra-high-frequency data. Jpn J Stat Data Sci 4, 643–696 (2021). https://doi.org/10.1007/s42081-021-00106-2