Introduction

For the majority of financial markets, the pervasive intraday periodicity (IP), which often takes a U- or mirrored J-form during the daily trading time, is a well documented empirical feature of intraday absolute returns (Wood et al. 1985; Harris 1986). As it appears to be highly correlated with intraday variation of trading volume, Admati and Pfleiderer (1988) propose to explain daily IP-shape by strategic interaction of traders around market openings and closures, whereas the periodicity at weekly or monthly horizons could be attributed to the impact of slowly varying macroeconomic fundamentals (Andersen and Bollerslev 1998b; Andersen et al. 2001, 2003).

The availability of high-frequency data allows for the construction of precise estimators of daily integrated volatility (IV) for risky asset returns. The realized volatility (RV), defined as the sum of squared intraday returns, is known to be a consistent estimator of daily IV in the absence of jumps. Other realized measures such as the bipower variation (BV) should be used for IV estimation in the presence of finite-activity jumps during the day (cf. Aït-Sahalia and Jacod 2014). Barndorff-Nielsen and Shephard (2004) derive the asymptotic properties of these quantities for the number of intraday returns \(M\rightarrow \infty \) under mild assumptions on the corresponding pricing process. In practice, however, the number of intraday returns available for computing realized estimators often remains limited due to insufficient liquidity and/or irregular trading activity. Even for highly liquid stocks, a practitioner may prefer to rely on simple realized estimators based, e.g., on 5-min returns, primarily in order to avoid the adverse effects of market microstructure noise (MMN), which are particularly pronounced at (ultra) high sampling frequencies (cf. Aït-Sahalia and Jacod 2014).

In this paper we analyze and quantify the impact of IP on the finite M properties of RV and BV estimators by providing corresponding formal statements given the IP functional form. To the best of our knowledge, this research agenda has not been explored yet, although there is a vast amount of literature devoted to modeling and estimating IP (cf. Engle et al. 1990; Andersen et al. 2019; Christensen et al. 2018). Thus, our investigation provides useful insights for exploring differences between the asymptotic theory and the practical finite sample performance of realized measures based on intraday data.

To model intraday returns, we presume a discrete time stochastic specification as in Andersen and Bollerslev (1997), where the variance of intraday returns is written as a product of the deterministic periodic and stochastic volatility (SV) components. The IP is assumed to be constant for all days, whereas the SV part is slowly changing over time. Our framework is motivated by the empirical evidence that the IP captures a vast part of intraday volatility variation whereas the SV impact is of a smaller order (cf. Christensen et al. 2018). We show that, for the commonly available finite number of intraday returns M, neglecting the impact of IP would lead to invalid statistical inference concerning daily IV. For a given IP, we compute the first and the second moments of RV and BV; moreover, we establish the asymptotic bivariate distribution of these measures as \(M\rightarrow \infty \). We also quantify the impact of IP on realized tri-power (TP) and quad-power (QP) estimators of daily integrated quarticity (IQ) required for statistical inference about IV.

Our major finding is that for the commonly available finite number of intraday returns the impact of IP should be explicitly addressed when making statistical inference concerning IV. While the RV estimator of IV remains unbiased under IP, BV has a finite sample bias which is negligible only for large sample sizes that are not always available in practice. Moreover, when estimating IQ one should account, at least for small M values, for scaling factors which depend on the functional form of the IP. We derive the explicit expressions for these IP-correction factors and provide the asymptotic distribution for their estimators.

Our theoretical results are illustrated in a Monte Carlo study, where we investigate the impact of IP on various realized measures. We find that the IP-correction procedures proposed in this paper mitigate the adverse impact of IP on realized measures in finite samples. Moreover, our IP-corrected estimates are advantageous compared to directly removing the estimated IP, owing to their robustness with respect to IP misspecification on days with an unusual pattern of intraday volatility (cf. Gabrys et al. 2013; Kokoszka and Reimherr 2013). In such situations our approach is preferable in terms of relative bias and mean squared error (MSE) to the standard procedure of directly scaling intraperiod returns by the estimated IP profile as in Boudt et al. (2011), for example. In the empirical application we estimate the IP for the daily volatility of the Dow Jones Industrial Average Index with and without IP bias corrections.

The remaining part of the paper is organized as follows. In Sect. 2, we introduce the model for intraday returns and discuss the realized estimators of daily IV and IQ. The theoretical results are derived in Sect. 3 where we establish both finite sample and asymptotic stochastic properties of commonly applied realized estimators for a given IP form. Moreover, we provide expressions for IP correction factors and derive the asymptotic distributions of their estimators. Our approach is illustrated in Sect. 4 by means of a simulation study and in Sect. 5 by an empirical application. Sect. 6 concludes, whereas the proofs are placed in the Appendix.

Measuring daily volatility based on intraday information

Before we introduce our model in Sect. 2.2, we provide definitions of the objects which are of importance for our analysis. For this purpose we start from a general jump-diffusion model for log price increments in order to define the objects of our interest, namely the daily IV and daily IQ. Next, we present the corresponding realized estimators which are based on M intraday returns. These realized measures are consistent estimators of IV for \(M\rightarrow \infty \); in practice, however, we often face \(M\le 10^2\), primarily because ultra high frequency returns are contaminated by MMN. Thus, it is important to explore the finite sample stochastic properties of realized estimators. For this purpose we then consider a discrete time model for intraday returns with an explicit functional specification of intraday periodicity (IP) and study the impact of IP on the realized measures of IV and IQ for finite M.

Model for intraday returns and realized measures

Assume that log-prices of risky assets \(p(t)=\ln P_t\) follow a continuous time process with (possible) additive jump components. We consider a day t as the period of interest with the daily return \(r_t=p(t)-p(t\!-\!1)\) and focus on the integrated volatility (IV), which is defined for day t as

$$\begin{aligned} IV_t=\sigma ^2_t=\int _{t-1}^t \sigma ^2(u)du, \end{aligned}$$

where \(\sigma (u)\) is a spot volatility. In order to make statistical inference about IV measures one also needs statements concerning the daily integrated quarticity (IQ) defined by

$$\begin{aligned} IQ_t=\int _{t-1}^t \sigma ^4(u)du. \end{aligned}$$

The availability of intraday returns allows one to construct precise realized estimators (Andersen and Bollerslev 1998a) for the daily IV which are of immense practical importance for estimation and inference concerning daily volatility. Assume that M equally spaced intraday returns are available for day t. We denote them by \(r_{t,m}=p(t\!-\!1+m/M)-p(t\!-\!1+(m\!-\!1)/M)\) for \(m=1,\ldots ,M\). Then the daily return \(r_t\) is the sum of intraday returns with \(r_t=\sum _{m=1}^M r_{t,m}\). The most popular IV estimator is the realized volatility (RV) measure given as

$$\begin{aligned} RV_t=\sum _{m=1}^M r^2_{t,m}. \end{aligned}$$

Barndorff-Nielsen and Shephard (2004) show that \(RV_t\) is a consistent estimator of \(IV_t\) if there are no jumps at day t, i.e. \(RV_t\overset{p}{\longrightarrow }IV_t\) as \(M\rightarrow \infty \). Although the estimator \(RV_t\) possesses a set of appealing properties, it is not appropriate in the presence of a non-zero jump component.

The bipower variation (BV) proposed by Barndorff-Nielsen and Shephard (2004)

$$\begin{aligned} BV_t = \frac{M}{M-1} \frac{\pi }{2} \sum \limits _{m=2}^{M} |r_{t,m}||r_{t,m-1}| \end{aligned}$$
(1)

is a jump-robust estimator of IV. It is consistent even in the presence of jumps, i.e. \(BV_t\overset{p}{\longrightarrow }IV_t\) as \(M\rightarrow \infty \). However, RV has a smaller variance than BV if there are no jumps, so the common practice is first to test for a jump during day t. Then, in case of a significantly large positive distance between RV and BV indicating jumps, one should apply BV; otherwise RV is to be used.

Intraday returns are also suitable for the purpose of estimating the unknown daily IQ required for computing variances of RV and BV measures. The realized quarticity (RQ)

$$\begin{aligned} RQ_t=\frac{M}{3}\sum _{m=1}^{M} r^4_{t,m}, \end{aligned}$$

is a consistent estimator of IQ in the case of no jumps with \(RQ_t \overset{p}{\longrightarrow }IQ_t\) as \(M\rightarrow \infty \). However, as \(RQ_t\) measure is not robust (cf. Andersen et al. 2014), Barndorff-Nielsen and Shephard (2004) suggest to use the realized tri-power (TP) and quad-power (QP) measures defined by

$$\begin{aligned} QP_t= & {} \frac{M^2}{M-3} \cdot \frac{\pi ^2}{4} \cdot \sum _{m=4}^M |r_{t,m-3}||r_{t,m-2}||r_{t,m-1}||r_{t,m}|, \end{aligned}$$
(2)
$$\begin{aligned} TP_t= & {} \frac{M^2}{M-2} \cdot \mu _{4/3}^{-3} \sum \limits _{m=3}^M |r_{t,m-2}|^{\frac{4}{3}}|r_{t,m-1}|^{\frac{4}{3}}|r_{t,m}|^{\frac{4}{3}}, \quad \text{ with } \quad \mu _{4/3}=0.8309.\nonumber \\ \end{aligned}$$
(3)
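For illustration, the realized measures defined above can be computed directly from one day of intraday returns. The following is a minimal sketch in plain Python, not part of the original study; the function name `realized_measures` and the dictionary layout are hypothetical choices made here. The constant \(\mu _{4/3}\) is evaluated as \(E|Z|^{4/3}\) for a standard normal Z via the gamma function, which reproduces the value 0.8309 quoted in (3).

```python
import math

# mu_{4/3} = E|Z|^{4/3} for standard normal Z (approx. 0.8309)
MU_43 = 2 ** (2 / 3) * math.gamma(7 / 6) / math.gamma(1 / 2)


def realized_measures(r):
    """Return RV, BV, RQ, QP and TP for one day of M intraday returns.

    `r` is a list of M equally spaced intraday returns (M >= 4).
    The name and return layout are illustrative assumptions.
    """
    M = len(r)
    a = [abs(x) for x in r]
    rv = sum(x * x for x in r)
    # bipower variation, Eq. (1)
    bv = M / (M - 1) * math.pi / 2 * sum(a[m] * a[m - 1] for m in range(1, M))
    # realized quarticity
    rq = M / 3 * sum(x ** 4 for x in r)
    # quad-power quarticity, Eq. (2)
    qp = M ** 2 / (M - 3) * math.pi ** 2 / 4 * sum(
        a[m] * a[m - 1] * a[m - 2] * a[m - 3] for m in range(3, M)
    )
    # tri-power quarticity, Eq. (3)
    tp = M ** 2 / (M - 2) * MU_43 ** -3 * sum(
        (a[m] * a[m - 1] * a[m - 2]) ** (4 / 3) for m in range(2, M)
    )
    return {"RV": rv, "BV": bv, "RQ": rq, "QP": qp, "TP": tp}
```

Calling `realized_measures(r)` on a list of, say, 78 five-minute returns yields all five measures in a single pass over the data.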

Although both RV and BV have appealing stochastic properties as \(M\rightarrow \infty \), their practical implementation is often based on (say) 5-min intraday returns, which gives \(M=78\) intraday observations for a usual 6.5-hour trading day. The reason is MMN, which hinders the use of ultra high frequency data for the construction of realized estimators (McAleer and Medeiros 2008). To overcome these problems, recent research has focused on making realized estimators more robust to these features. However, in many situations the common practice still remains to sample returns at a lower frequency, i.e. to consider 5-, 10-, or even 15-min intraday returns (Andersen et al. 2011). We follow this strand of literature and concentrate on a thorough understanding of the stochastic properties of realized estimators for comparatively small values of M. Although we focus in our analysis primarily on the classical RV, BV, QP, and TP measures, we also provide a discussion of recently proposed further realized measures in Sect. 5.2 and in Dette et al. (2022).

Discrete time model for intraday returns

The IP in absolute intraday returns is one of the most important stylized facts characterizing high frequency data. In order to investigate the IP impact on realized measures for a fixed number of intraday returns M, we next introduce a discrete time model in (4) which is central to our study. There is a substantial body of recent literature concerning discrete-time modeling of intraday returns where the IP is assumed to be a multiplicative scaling component (Boudt et al. 2011; Engle and Sokalska 2012; Bekierman and Gribisch 2021). Following Andersen and Bollerslev (1997), we focus on a discrete stochastic model for intraday returns without jumps, which is given as

$$\begin{aligned} r_{t,m}= & {} \sigma _{t,m} \cdot u_{t,m}, \qquad \text{ with } \qquad u_{t,m}\, \sim \; \text{ iid }~{\mathcal {N}}(0,1), \nonumber \\ \sigma ^2_{t,m}= & {} 1/M \cdot s^2_{t,m} \cdot \gamma ^2_{t,m}, \end{aligned}$$
(4)

where \(s_{t,m}>0\) is the deterministic IP component and \(\gamma ^2_{t,m}>0\) is the stochastic part. Note that there is no leverage effect in (4), as it is mostly of importance for daily returns but much less pronounced for high-frequency intraday returns, see e.g. Bollerslev et al. (2006).

For our theoretical derivations we presume that the stochastic part remains constant within day t (Andersen and Bollerslev 1998b; Hecq et al. 2012), i.e. \(\gamma _{t,m}=\sigma _t\) for all \(m=1,\ldots ,M\), but may change from one day to another. This assumption is justified by the empirical evidence that IP commonly accounts for a vast part of intraday volatility variation (cf. Christensen et al. 2018). In Sect. 4 we relax this assumption in the Monte Carlo simulation study by considering the intraday SV which follows a diffusion process as e.g. in Goncalves and Meddahi (2009). Based on the results of our Monte Carlo simulations, we conclude that our major findings also hold in the SV setting.

In line with the current literature (Hecq et al. 2012), we set the IP as constant at different days so that we further skip the time index with \(s_{t,m}=s_m\). Moreover, the periodic component is standardized such that it sums up to M over the day with \(\sum _{m=1}^M s^2_{m}=M\). Of course, a special case \(s_m=1\) for all \(m=1,\ldots ,M\) corresponds to no IP. Putting all together, we separate the intraday periodic component \(s_m\) which is solely responsible for intraday heteroskedasticity, and interday stochastic component \(\sigma _t\) which could change from one day to another by writing

$$\begin{aligned} \sigma ^2_{t,m}=1/M \cdot s^2_m \cdot \sigma ^2_t \qquad \text{ with } \qquad \sum _{m=1}^M s^2_{m}=M. \end{aligned}$$
(5)

Although the model in (4) and (5) is fairly simple, it allows a detailed analysis of the IP impact on popular realized measures of the objects of our interest, which are the IV for day t

$$\begin{aligned} IV_t = Var(r_t) = \sum \limits _{m=1}^M Var(r_{t,m})= \frac{1}{M}\sum \limits _{m=1}^M s_m^2 \sigma ^2_{t} = \sigma ^2_t, \end{aligned}$$
(6)

as well as the IQ, written as (Andersen et al. 2014)

$$\begin{aligned} IQ_t = M/3 \cdot \sum _{m=1}^M E(r^4_{t,m})=\sigma ^4_t/M \, \cdot \, \sum _{m=1}^M s^4_{m}. \end{aligned}$$
(7)

Note that as in general it holds that \(\sum _{m=1}^M s^4_{m} \ge M\), the IQ is directly influenced by IP. We aim to investigate the impact of IP on realized measures of IV and IQ.
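The inequality \(\sum _{m=1}^M s^4_{m} \ge M\) used above can be verified in one line. As a short sketch, applying the Cauchy-Schwarz inequality to the normalization in (5) gives

$$\begin{aligned} M^2 = \left( \sum _{m=1}^M s_m^2 \cdot 1 \right) ^2 \le \left( \sum _{m=1}^M s_m^4 \right) \left( \sum _{m=1}^M 1 \right) = M \cdot \sum _{m=1}^M s_m^4, \end{aligned}$$

so that \(\sum _{m=1}^M s^4_{m} \ge M\), with equality if and only if \(s_m=1\) for all \(m=1,\ldots ,M\). By (7) this implies \(IQ_t \ge \sigma ^4_t\), with equality only in the case of no IP.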

The impact of intraday periodicity on RV and BV

To gain results on the IP impact on realized measures, re-write the normalized IP \(\{s_m\}_{m=1}^M\) as

$$\begin{aligned} s_m^2=g\left( \frac{m}{M}\right) \, \big /\, g_M, \qquad \text{ with } \qquad g_M=\frac{1}{M}\sum _{m=1}^M g\left( \frac{m}{M}\right) , \end{aligned}$$
(8)

where \(g:[0,1]\mapsto {\mathbb {R}}^+\) with \({\mathbb {R}}^+ :=(0,+\infty )\) is a given non-normalized function. This normalization is very common in IP-literature (cf. Andersen and Bollerslev 1997, p. 153). The functional form of \(g(\cdot )\) could be quite flexible and is subject to very general regularity conditions specified in the following propositions.

Bias and variance of realized estimators

For the discrete model of intraday returns (4)–(5) and the IP from \(\{g(m/M)\}_{m=1}^M\), we derive expectation, bias and variance of RV and BV estimators of daily IV in the next proposition.

Proposition 1

Assume that the IP component is given by (8) for some function \(g:[0,1]\mapsto {\mathbb {R}}^+\).

\(\text{(A) }\) The estimator \(RV_t\) for daily IV is unbiased so that \(E[RV_t]=IV_t\). The estimator \(BV_t\) is biased, that is \(E[BV_t]= M/(M-1)\sigma ^2_t(1-R_M)=M/(M-1) IV_t(1-R_M)\), where the factor \(R_M\in (0,1)\) is given by

$$\begin{aligned} R_M = \left( g\left( \frac{1}{M}\right) +\sum \limits _{m=2}^M g\left( \frac{m}{M}\right) ^{1/2}\left[ g\left( \frac{m}{M}\right) ^{1/2}- g\left( \frac{m-1}{M}\right) ^{1/2}\right] \right) /\sum \limits _{m=1}^M g\left( \frac{m}{M}\right) . \end{aligned}$$

If \(g(\cdot )\) is continuously differentiable on interval [0, 1], it holds as \(M\rightarrow \infty \)

$$\begin{aligned} M \cdot R_M = \left[ \frac{1}{2} \int _{0}^1 g'(x)dx\Big /\int _{0}^1 g(x)dx + g(0 )\Big /\int _{0}^1 g(x)dx \right] \cdot \left( 1+ o(1) \right) , \end{aligned}$$

so that \(\lim _{M\rightarrow \infty } R_M=0\), i.e. \(BV_t\) is an asymptotically unbiased estimator of IV.

\(\text{(B) }\) The (co)variances of \(RV_t\) and \(BV_t\) are given as

$$\begin{aligned} Var(RV_t)= & {} \frac{2\sigma ^4_t}{M^2 g^2_M}\sum _{m=1}^M g^2\left( \frac{m}{M}\right) ,\\ Var(BV_t)= & {} \frac{\sigma ^4_t}{(M-1)^2g_M^2} \left\{ \left( \frac{\pi ^2}{4}-1\right) \sum \limits _{m=2}^{M} g\bigg (\frac{m}{M}\bigg )g\left( \frac{m-1}{M}\right) \right. \\&\quad \left. + (\pi -2)\sum \limits _{m=3}^{M}g\left( \frac{m}{M}\right) ^{1/2}g\left( \frac{m-1}{M}\right) g\left( \frac{m-2}{M}\right) ^{1/2}\right\} ,\\ Cov(RV_t,BV_t)= & {} \frac{\sigma _t^4}{M(M-1) g^2_M}\left[ \sum \limits _{m=2}^M g\left( \frac{m-1}{M}\right) ^{1/2}g\left( \frac{m}{M}\right) ^{3/2}\right. \\&\quad \left. +\sum \limits _{m=1}^{M-1} g\left( \frac{m+1}{M}\right) ^{1/2}g\left( \frac{m}{M}\right) ^{3/2}\right] . \end{aligned}$$

For \(g(\cdot )\) continuously differentiable on interval [0, 1] and for \(M\rightarrow \infty \) it holds that \(Var(RV_t)= (1/M) \cdot 2 \sigma ^4_t \cdot \xi \ (1+o(1))\),

$$\begin{aligned} Var(BV_t) = (1/M) \cdot \sigma ^4_t \cdot \xi \left( \frac{\pi ^2}{4} + \pi - 3 \right) \ (1+o(1)), \end{aligned}$$
(9)

and \(Cov(RV_t,BV_t) = (1/M) \cdot 2 \sigma ^4_t\cdot \xi (1+ o(1))\). The asymptotic scaling factor \(\xi \) is defined by

$$\begin{aligned} \xi = \int ^1_0 g^2 (x) dx \big /\left( \int ^1_0 g(x) dx\right) ^2. \end{aligned}$$
(10)

It holds that \(\xi \ge 1\) with \(\xi =1\) if and only if \(g(\cdot )\) is almost everywhere constant, i.e. there is no IP.
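To make the quantities in Proposition 1 concrete, the factor \(R_M\) and the finite-M analogue of \(\xi \) can be evaluated numerically for a given function \(g(\cdot )\). The sketch below is only an illustration under stated assumptions: the quadratic U-shape `g` is a hypothetical choice (not the IP form used in the paper), and the helper names are made up here. For this particular `g`, direct integration gives the limits \(M\cdot R_M \rightarrow 3/2\) and \(\xi = 1.05\).

```python
import math


def g(x):
    # Hypothetical U-shaped, non-normalized IP function on [0, 1]
    # (an assumption for illustration only).
    return 1.0 + 4.0 * (x - 0.5) ** 2


def r_factor(g, M):
    """R_M of Proposition 1, using 1 - R_M = sum_{m>=2} sqrt(g_m g_{m-1}) / sum_m g_m."""
    gv = [g(m / M) for m in range(1, M + 1)]
    cross = sum(math.sqrt(gv[m] * gv[m - 1]) for m in range(1, M))
    return 1.0 - cross / sum(gv)


def xi_factor(g, M):
    """Finite-M version of xi in (10): (1/M) sum_m g_m^2 / g_M^2."""
    gv = [g(m / M) for m in range(1, M + 1)]
    return M * sum(v * v for v in gv) / sum(gv) ** 2


# Relative bias of BV implied by Proposition 1(A): E[BV]/IV - 1
for M in (26, 78, 390):
    rel_bias = M / (M - 1) * (1.0 - r_factor(g, M)) - 1.0
    print(M, r_factor(g, M), xi_factor(g, M), rel_bias)
```

With a constant `g` the code returns \(R_M=1/M\), so that \(E[BV_t]=IV_t\) exactly, which matches the no-IP case.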

The property \(R_M \in {(0,1)}\) follows from the proof of Proposition 1 in the Appendix. More precisely, by the Cauchy-Schwarz inequality, we have

$$\begin{aligned} 0< 1 - R_M&= \frac{\sum _{m=2}^M {\left[ g\left( \frac{m}{M}\right) g \left( \frac{m-1}{M}\right) \right] }^{1/2}}{\sum _{m=1}^M g \left( \frac{m}{M}\right) } \\&\le \frac{\left( \sum _{m=2}^M g \left( \frac{m}{M}\right) \right) ^{1/2} \left( \sum _{m=2}^M g \left( \frac{m-1}{M}\right) \right) ^{1/2}}{\sum _{m=1}^M g \left( \frac{m}{M}\right) } < 1. \end{aligned}$$

Thus, in the case of IP, RV is an unbiased estimator for IV but BV has a finite M bias which should be corrected in applications. Since the expectation of BV is given by

$$\begin{aligned} E[BV_t]=\frac{\sigma _t^2}{(M-1)\cdot g_M} \cdot \sum \limits _{m=2}^M g\left( \frac{m}{M}\right) ^{1/2}g\left( \frac{m-1}{M}\right) ^{1/2}= \frac{\sigma _t^2}{M-1} \cdot \sum \limits _{m=2}^M s_m s_{m-1}, \end{aligned}$$

we suggest the following bias-corrected measure

$$\begin{aligned} {\widetilde{BV}}_{t}=\frac{\pi }{2} \cdot M \cdot \left( \sum \limits _{m=2}^M s_m s_{m-1}\right) ^{-1} \cdot \sum _{m=2}^M |r_{t,m}||r_{t,m-1}|. \end{aligned}$$
(11)

Hence, it holds that \({\widetilde{BV}}_{t}=(M-1)\left( \sum \limits _{m=2}^M s_m s_{m-1}\right) ^{-1} BV_t\), leading to the correction factor \(\zeta _M\) for BV:

$$\begin{aligned} \zeta _M=\zeta _{M,BV}=BV_t/{\widetilde{BV}}_{t} = \frac{1}{M-1} \cdot \sum \limits _{m=2}^M s_m s_{m-1}, \end{aligned}$$
(12)

which should be replaced by its empirical counterpart \({\hat{\zeta }}_M\) based on IP estimates \({\hat{s}}_m\) in practice. By the same principle, we define the IP factor \(\xi _M\) in RQ for finite M values as

$$\begin{aligned} \xi _M=\xi _{M,RQ}=\frac{1}{M} \cdot \sum \limits _{m=1}^M s_m^4, \end{aligned}$$
(13)

with \(\lim _{M\rightarrow \infty } \xi _M=\xi \) as in (10) of Proposition 1. Note that in case of no IP it holds that \(\xi _M=\xi _{M,RQ}=1\). Of course, in applications we replace \(\xi _{M,RQ}\) by its estimator \({\hat{\xi }}_M={\hat{\xi }}_{M,RQ}\) which is discussed below in Sect. 3.3. Hence, we show analytically that although for BV it holds that \(\lim _{M\rightarrow \infty } \zeta _M=1\), its counterpart \(\xi \) for the realized measures of IQ is still present and could be (depending on the data) rather substantial. In the context of the IP-bias corrections for further realized measures, we investigate by means of numerical analysis several popular MMN-robust realized estimators such as min RV or med RV in the follow-up paper of Dette et al. (2022).
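The bias correction in (11) and (12) is straightforward to apply once an IP profile \(s_1,\ldots ,s_M\) is available. The following sketch assumes a known profile built from a hypothetical quadratic U-shape `g`; the function names `ip_profile`, `zeta` and `bv_corrected` are illustrative and do not come from the paper. In practice the profile would be replaced by estimates \({\hat{s}}_m\).

```python
import math


def g(x):
    # Hypothetical U-shaped IP function (illustrative assumption).
    return 1.0 + 4.0 * (x - 0.5) ** 2


def ip_profile(g, M):
    """Normalized IP s_m with sum_m s_m^2 = M, following (8)."""
    gv = [g(m / M) for m in range(1, M + 1)]
    g_bar = sum(gv) / M
    return [math.sqrt(v / g_bar) for v in gv]


def zeta(s):
    """Correction factor zeta_M of (12)."""
    M = len(s)
    return sum(s[m] * s[m - 1] for m in range(1, M)) / (M - 1)


def bv_corrected(r, s):
    """Bias-corrected bipower variation of (11)."""
    M = len(r)
    cross = sum(abs(r[m]) * abs(r[m - 1]) for m in range(1, M))
    return math.pi / 2 * M * cross / sum(s[m] * s[m - 1] for m in range(1, M))
```

By construction, `bv_corrected(r, s) * zeta(s)` reproduces the uncorrected \(BV_t\) of (1), and with `s` identically equal to one both estimators coincide.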

In the next proposition we provide the expectations of realized estimators RQ, TP and QP serving as the measures for IQ under the model defined in (4) and (5) with the function \(g(\cdot )\).

Proposition 2

Assume that the IP is given by (8), then the expectations of \(RQ_t\), \(QP_t\) and \(TP_t\) are given as \(E[RQ_t]= \frac{\sigma ^4_t}{M \cdot g^2_M} \sum _{m=1}^M g(\frac{m}{M})^2\), i.e. \(RQ_t\) is unbiased, and

$$\begin{aligned} E[QP_t]= & {} \frac{\sigma ^4_t}{\left( M-3\right) \cdot g^2_M} \sum _{m=4}^M \left[ g\left( \frac{m-3}{M}\right) g\left( \frac{m-2}{M}\right) g\left( \frac{m-1}{M}\right) g\left( \frac{m}{M}\right) \right] ^{\frac{1}{2}},\\ E[TP_t]= & {} \frac{\sigma ^4_t}{\left( M-2\right) \cdot g^2_M} \sum _{m=3}^M \left[ g\left( \frac{m-2}{M}\right) g\left( \frac{m-1}{M}\right) g\left( \frac{m}{M}\right) \right] ^{\frac{2}{3}}. \end{aligned}$$

Moreover, \(\lim _{M\rightarrow \infty } E[RQ_t]=\lim _{M\rightarrow \infty } E[TP_t]= \lim _{M\rightarrow \infty } E[QP_t]=\sigma ^4_t \cdot \xi =IQ_t\) if \(g(\cdot )\) is square integrable, where the asymptotic scaling factor \(\xi \), which comprises the impact of IP, is given by (10).

Proposition 2 suggests the following factorization of QP and TP measures of IQ in finite samples:

$$\begin{aligned} {\widetilde{QP}}_{t}= & {} QP_t/\xi _{M,QP}, \qquad \xi _{M,QP} = \frac{1}{M-3}\cdot \sum \limits _{m=4}^M s_m s_{m-1} s_{m-2} s_{m-3},\\ {\widetilde{TP}}_{t}= & {} TP_t/\xi _{M,TP}, \qquad \xi _{M,TP} = \frac{1}{M-2} \cdot \sum \limits _{m=3}^M (s_m s_{m-1} s_{m-2})^{4/3}, \end{aligned}$$

whereby \(\xi _{M,QP}\) and \(\xi _{M,TP}\) are the IP scaling factors for QP and TP, respectively. This factorization appears to be useful both in the simulations in Sect. 4 and in the empirical study in Sect. 5.
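The three finite-M scaling factors \(\xi _{M,RQ}\), \(\xi _{M,QP}\) and \(\xi _{M,TP}\) can be computed from the same IP profile. A minimal sketch follows, with a hypothetical function name and assuming the profile `s` is already normalized so that \(\sum _m s_m^2=M\):

```python
def ip_scaling_factors(s):
    """Return (xi_RQ, xi_QP, xi_TP) for a normalized IP profile s.

    xi_RQ follows (13); xi_QP and xi_TP follow the factorization
    suggested by Proposition 2. Requires len(s) >= 4.
    """
    M = len(s)
    xi_rq = sum(v ** 4 for v in s) / M
    xi_qp = sum(
        s[m] * s[m - 1] * s[m - 2] * s[m - 3] for m in range(3, M)
    ) / (M - 3)
    xi_tp = sum(
        (s[m] * s[m - 1] * s[m - 2]) ** (4 / 3) for m in range(2, M)
    ) / (M - 2)
    return xi_rq, xi_qp, xi_tp
```

The scaled estimators are then obtained by division, e.g. \({\widetilde{QP}}_{t}=QP_t/\xi _{M,QP}\); with no IP all three factors equal one.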

Asymptotic distribution in case of intraday periodicity

Next, we provide the corresponding bivariate limit distribution for RV and BV as \(M\rightarrow \infty \) for our discrete time model of intraday returns with IP.

Theorem 1

Consider model (4) and (5) without jumps and assume that the IP component is given by (8) with a continuously differentiable function \(g:[0,1]\mapsto {\mathbb {R}}^+\). Then, as \(M\rightarrow \infty \),

$$\begin{aligned} M^{1/2}\cdot IQ^{-1/2}_t \cdot \begin{pmatrix} RV_t-\sigma ^{2}_t \\ BV_t-\sigma ^{2}_t \\ \end{pmatrix} \overset{L}{\longrightarrow } {\mathcal {N}}\left( \begin{bmatrix} 0 \\ 0 \\ \end{bmatrix}, \begin{bmatrix} 2 &{} 2\\ 2 &{} \frac{\pi ^2}{4}+\pi -3 \\ \end{bmatrix}\right) . \end{aligned}$$

The integrated quarticity \(IQ_t=\xi \cdot \sigma ^{4}_t\) can be consistently estimated by \(RQ_t\), \(QP_t\), or \(TP_t\).

Thus, a pronounced IP with \(\xi > 1\) causes more variability of IV estimators compared to the case of no IP where \(\xi =1\). The asymptotic \((1-\alpha )\)-confidence interval for daily IV based on the RV and RQ measures is given according to Theorem 1 as

$$\begin{aligned} CI_t(1-\alpha )= & {} \left[ RV_t+z_{\alpha /2}\left[ \frac{2}{M}\right] ^{1/2} \cdot RQ_t^{1/2}, \quad RV_t- z_{\alpha /2}\left[ \frac{2}{M}\right] ^{1/2} \cdot RQ_t^{1/2}\right] , \end{aligned}$$

where \(z_{\alpha /2}\) is the \((\alpha /2)\)-quantile of the standard normal distribution, which is negative for \(\alpha <1\), so that the first endpoint is the lower bound. The asymptotic confidence intervals based on BV, TP or QP measures are constructed similarly.
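As a worked illustration of the RV-based interval, the sketch below evaluates the two endpoints for given values of \(RV_t\), \(RQ_t\) and M. It relies on `statistics.NormalDist` from the Python standard library for the normal quantile; the function name `iv_confidence_interval` is a hypothetical choice made here.

```python
import math
from statistics import NormalDist


def iv_confidence_interval(rv, rq, M, alpha=0.05):
    """Asymptotic (1 - alpha) confidence interval for IV based on RV and RQ.

    Uses the (alpha/2)-quantile z of N(0,1), which is negative,
    so the lower endpoint is rv + z * sqrt(2/M) * sqrt(rq).
    """
    z = NormalDist().inv_cdf(alpha / 2)
    half_width = -z * math.sqrt(2.0 / M) * math.sqrt(rq)
    return rv - half_width, rv + half_width
```

For instance, `iv_confidence_interval(1.0, 1.0, 78)` gives an interval centered at one whose width shrinks at the rate \(M^{-1/2}\) as more intraday returns become available.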

The results in Proposition 2 and Theorem 1 are useful for statistical inference on IV. As RQ is not robust to jumps, one could estimate IQ in the presence of jumps by either QP or TP directly as in (2) or (3), i.e. without estimating \(\xi \) separately. However, as we show in the Monte Carlo simulation, the approximation \(TP_t\approx IQ_t\) is only precise for fairly large M. For this reason, for finite M we recommend using the IP-scaled estimators \({\widetilde{QP}}_{t}\) or \({\widetilde{TP}}_{t}\) as well as the estimated scaling factor \({\hat{\xi }}_{M,RQ}\) from Eq. (15) below for the construction of confidence intervals as

$$\begin{aligned} {\widetilde{CI}}_t(1\!-\!\alpha )= & {} \left[ {\widetilde{BV}}_t\!+\! z_{\alpha /2}\left[ \frac{\pi ^2/4+\pi -3}{M}\right] ^{1/2} \cdot {\hat{\xi }}^{1/2}_{M,RQ} \cdot {\widetilde{QP}}_t^{1/2}, \right. \\&\left. \quad {\widetilde{BV}}_t- z_{\alpha /2}\left[ \frac{\pi ^2/4+\pi -3}{M}\right] ^{1/2} \cdot {\hat{\xi }}^{1/2}_{M,RQ} \cdot {\widetilde{QP}}_t^{1/2}\right] , \end{aligned}$$

because of \(E[ {\hat{\xi }}_{M,RQ} \cdot {\widetilde{QP}}_t]=E[ {\hat{\xi }}_{M,RQ} \cdot {\widetilde{TP}}_t]=E[TP_t]=E[QP_t]=E[RQ_t]\).

Estimation of IP correction factors

For a given number of intraday returns M, one needs consistent estimators of the IP functions \(\{{\hat{s}}^2_m\}_{m=1}^M\) in order to obtain the estimators of the quantities \(\xi _M\) and \(\zeta _M\) defined in (13) and (12), respectively. For this purpose we exploit the SD estimator (cf. Boudt et al. 2011) given as

$$\begin{aligned} {\widehat{SD}}^{2}_{m,T}={1\over T} \sum _{t=1}^T r^2_{t,m}~,~~ \text{ and } ~~~ {\hat{s}}_m=\frac{{\widehat{SD}}_{m,T}}{\left( (1/M)\sum _{m=1}^M {\widehat{SD}}^{2}_{m,T}\right) ^{1/2}}, \end{aligned}$$
(14)

and use the statistics

$$\begin{aligned} {\hat{\xi }}_M&= {\hat{\xi }}_{M,RQ} = (1/M) \cdot \sum \limits _{m=1}^M {\hat{s}}_m^4 ~, ~~~~ \quad {\hat{\zeta }}_M&={\hat{\zeta }}_{M,BV}= (1/M) \cdot \sum \limits _{m=2}^M {\hat{s}}_m {\hat{s}}_{m-1} \end{aligned}$$
(15)

as the estimators of \(\xi _M\) and \(\zeta _M\), respectively. If the variance

$$\begin{aligned} \sigma ^{2} := \lim _{T\rightarrow \infty } \sigma _{T}^{2} := \lim _{T\rightarrow \infty } {1 \over T} \sum _{t=1}^{T} \sigma _{t}^{2} >0 \end{aligned}$$
(16)

exists, it follows from (5) that \(\lim _{T\rightarrow \infty } E [ {\widehat{SD}}^{2}_{m,T} ] =\sigma ^{2}s_m^{2}/M\) which motivates the definition (14). As a consequence, we expect that the estimators in (14) and (15) are consistent for \({s}_m\), \(m=1,\ldots ,M\), as well as for \(\xi _M\) and \(\zeta _M\), respectively. Note that as long as the ergodicity condition (16) is met, the assumption of constant intraday volatility as in Sect. 2 could be relaxed for the following analysis in Sect. 3.3, e.g. by allowing intraday stochastic volatility.
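The SD estimator in (14) and the plug-in statistics in (15) amount to a few sums over a \(T\times M\) panel of intraday returns. The following is a minimal sketch under the assumption that the returns are stored as a list of T days, each a list of M returns; the function names are hypothetical. The normalization of \({\hat{\zeta }}_M\) by \(1/M\) follows (15) as printed.

```python
import math


def sd_profile(returns):
    """SD estimator of the IP profile, Eq. (14).

    `returns` is a list of T days, each a list of M intraday returns.
    """
    T = len(returns)
    M = len(returns[0])
    sd2 = [sum(day[m] ** 2 for day in returns) / T for m in range(M)]
    scale = math.sqrt(sum(sd2) / M)
    return [math.sqrt(v) / scale for v in sd2]


def correction_factors(s_hat):
    """Plug-in estimators (xi_hat, zeta_hat) as written in Eq. (15)."""
    M = len(s_hat)
    xi_hat = sum(v ** 4 for v in s_hat) / M
    zeta_hat = sum(s_hat[m] * s_hat[m - 1] for m in range(1, M)) / M
    return xi_hat, zeta_hat
```

By construction the estimated profile satisfies \(\sum _m {\hat{s}}_m^2=M\), mirroring the normalization in (5).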

In order to make the intuitive arguments above more precise, we investigate in the following the asymptotic distribution of the statistics \({\hat{\zeta }}_M\) and \({\hat{\xi }}_M\) for finite M and \(T\rightarrow \infty \). Note that under common assumptions, such as mixing conditions (see, for example Dehling et al. 1986, among many others) or physical dependence conditions (cf. Wu 2005), the vector \(\widehat{\mathbf {SD}}_{M,T} = ({\widehat{SD}}^{2}_{1,T}, \ldots , {\widehat{SD}}^{2}_{M,T})^\top \) is asymptotically normally distributed as \(T \rightarrow \infty \), that is

$$\begin{aligned} \sqrt{T} \left( \widehat{\mathbf {SD}}_{M,T} - {\mathbf {SD}}_{M,T} \right) \ {{\mathop {\longrightarrow }\limits ^{L}}}\ {\mathcal {N}} (\mathbf {0}, \sigma ^{4}\Sigma ) \quad \text{ with } \quad {\mathbf {SD}}_{M,T} = ({SD}^{2}_{1,T}, \ldots , {SD}^{2}_{M,T})^\top ~,\nonumber \\ \end{aligned}$$
(17)

where \({SD}^{2}_{m,T} = E ({\widehat{SD}}^{2}_{m,T} ) = \sigma _{T}^{2} s_{m}^{2}/M\) for \(m=1, \ldots , M\), \( \sigma ^{4} \Sigma \in {\mathbb {R}}^{M \times M}\) denotes a covariance matrix reflecting the underlying dependence structure and \( \sigma ^{2}\), \( \sigma ^{2}_{T}\) are defined in (16).

Now we provide the theoretical result concerning the estimators \({\hat{\xi }}_M \) and \({\hat{\zeta }}_M\) for finite number of intraday observations M and the estimation period \(T\rightarrow \infty \), which are based on the SD-estimator of the IP components \(s_m\), \(m=1,\ldots ,M\), although similar type of result could be also obtained for other IP-estimators. We denote by \(\langle x,y \rangle = x^{\top }y\) the common inner product on \({\mathbb {R}}^{M}\) with the corresponding norm \(\Vert x\Vert = \langle x,x \rangle ^{1/2}\) and by \({\mathbf {1}}_{M}\) an M-dimensional vector of unit entries.

Proposition 3

Assume that (16) and (17) hold. Then, as \(T \rightarrow \infty \), it holds that

$$\begin{aligned}&\sqrt{T} \; ({\hat{\xi }}_M - \xi _M) \ {{\mathop {\longrightarrow }\limits ^{L}}} \ {\mathcal {N}} (0, \tau _{1}^{2}) \end{aligned}$$
(18)
$$\begin{aligned}&\sqrt{T} \; ({\hat{\zeta }}_M - \zeta _M) \ {{\mathop {\longrightarrow }\limits ^{L}}}\ {\mathcal {N}} (0, \tau _{2}^{2})~, \end{aligned}$$
(19)

where the asymptotic variances are given by

$$\begin{aligned} \tau _{1}^{2}= & {} 4 \left( {\mathbf {S}}_{M}^{\top } \Sigma {\mathbf {S}}_{M} - {2 \over M} \Vert {\mathbf {S}}_{M} \Vert ^{2 }{\mathbf {1}}_{M}^{\top } \Sigma {\mathbf {S}}_{M} + {{\varvec{1}}_{M}^{\top } \Sigma {\varvec{1}}_{M} \over M^{2}} \Vert {\mathbf {S}}_{M} \Vert ^{4 } \right) , \end{aligned}$$
(20)
$$\begin{aligned} \tau _{2}^{2}= & {} { 1 \over 4} \left( {\mathbf {R}}_{M} ^{\top } \Sigma {\mathbf {R}}_{M} - {2 \over M} \langle {\mathbf {R}}_{M} , {\mathbf {S}}_{M} \rangle {\varvec{1}}_{M}^{\top } \Sigma {\mathbf {R}}_{M} + {{\varvec{1}}_{M}^{\top } \Sigma {\varvec{1}}_{M} \over M^{2}} \langle {\mathbf {R}}_{M} , {\mathbf {S}}_{M} \rangle ^{2 } \right) ~, \end{aligned}$$
(21)

and the vectors \({\mathbf {S}}_{M} \) and \( {\mathbf {R}}_{M} \) are defined by

$$\begin{aligned} {\mathbf {S}}_{M} := \left( s_{1}^{2} , \ldots , s_{M}^{2} \right) ^{\top }~,~~\quad {\mathbf {R}}_{M} := \left( { s_{2} \over s_{1}} ~,~ {s_{1} + s_{3} \over s_{2}} , \ldots , {s_{M-2} + s_{M} \over s_{M-1}} , {s_{M-1} \over s_{M}} \right) ^\top , \end{aligned}$$

respectively. In particular, if \(\Sigma = \text{ diag } (\vartheta ^2_1, \ldots , \vartheta ^2_M)\) is a diagonal matrix, we have

$$\begin{aligned} \tau _{1}^{2}= & {} 4 \left\{ \sum _{m=1}^{M} \vartheta ^{2}_{m}s_{m}^{4} -{2\over M} \sum _{m=1}^{M} \vartheta ^{2}_{m} s_{m}^{2} \sum _{m=1}^{M} s_{m}^{4} + {1 \over M^{2}} \left( \sum _{m=1}^{M} s_{m}^{4} \right) ^{2} \sum _{m=1}^{M} \vartheta ^{2}_{m} \right\} , \end{aligned}$$
(22)
$$\begin{aligned} \tau _{2}^{2}= & {} { 1 \over 4} \left\{ \sum _{m=1}^{M} {\vartheta ^{2}_{m} \over s_{m}^{2}} (s_{m-1} + s_{m+1})^{2} -{2\over M} \sum _{m=1}^{M} s_{m} (s_{m-1} + s_{m+1}) \sum _{m=1}^{M}\vartheta ^{2}_{m} {s_{m-1} + s_{m+1 } \over s_{m} } \right. \nonumber \\&\quad \left. + {1 \over M^{2}} \sum _{m=1}^{M}\vartheta ^{2}_{m} \left( \sum _{m=1}^{M} s_{m}(s_{m-1} + s_{m+1}) \right) ^{2} \right\} ~, \end{aligned}$$
(23)

where we put \(s_0=s_{M+1}=0\).
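For a diagonal matrix \(\Sigma \), the asymptotic variances (22) and (23) reduce to simple sums and can be evaluated directly. The sketch below is an illustrative transcription of these two formulas, with the convention \(s_0=s_{M+1}=0\); the function name `tau_squared_diag` and the argument `theta2` (holding \(\vartheta ^2_1,\ldots ,\vartheta ^2_M\)) are assumptions made here.

```python
def tau_squared_diag(s, theta2):
    """Return (tau_1^2, tau_2^2) from (22) and (23) for diagonal Sigma.

    `s` is the normalized IP profile, `theta2` the diagonal of Sigma.
    """
    M = len(s)
    s4 = sum(v ** 4 for v in s)
    tau1 = 4.0 * (
        sum(t * v ** 4 for t, v in zip(theta2, s))
        - 2.0 / M * sum(t * v ** 2 for t, v in zip(theta2, s)) * s4
        + s4 ** 2 * sum(theta2) / M ** 2
    )
    # neighbour sums s_{m-1} + s_{m+1} with s_0 = s_{M+1} = 0
    pad = [0.0] + list(s) + [0.0]
    nb = [pad[m] + pad[m + 2] for m in range(M)]
    cross = sum(v * n for v, n in zip(s, nb))
    tau2 = 0.25 * (
        sum(t * n ** 2 / v ** 2 for t, n, v in zip(theta2, nb, s))
        - 2.0 / M * cross * sum(t * n / v for t, n, v in zip(theta2, nb, s))
        + cross ** 2 * sum(theta2) / M ** 2
    )
    return tau1, tau2
```

In the special case of no IP and equal diagonal entries, the first variance evaluates to zero, since \(\xi _M\) attains its minimum at \(s_m=1\) and the first-order term of the expansion vanishes.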

When estimating the IP elements \(s_m\), we require for consistency that the number of days \(T\rightarrow \infty \), whereas the number of intraday observations M is held fixed. Hence, in Proposition 3 we derive the distributions of the correction factors \({\hat{\zeta }}_M\) and \({\hat{\xi }}_M\) for fixed M and \(T\rightarrow \infty \). In contrast, inference about the asymptotic correction factor \(\xi \) in Theorem 1 would require that \(M \rightarrow \infty \) because then \({\hat{\xi }}_M\) is a consistent estimator of \(\xi \). These results allow one to make statistical inferences and conduct tests for the IP correction factors \(\xi _M\) and \(\zeta _M\) with \(T\rightarrow \infty \). An extension of our theoretical findings to IP estimated from a finite sample of T days is also of interest. However, this is a quite challenging task, for which Christensen et al. (2018) present some theoretical considerations in their Proposition 3.1. We investigate this issue in the Monte Carlo simulations in Sect. 4.2.

In Proposition 3 we provide the explicit asymptotic results for IP scaling factors given the SD estimator. In general, our results for the SD estimator could be extended for any consistent estimator of IP. In this paper both in our simulations and empirical studies we use a more robust WSD estimator of Boudt et al. (2011) which is described in the Appendix. However, it is much more difficult to get analytical results such as in Proposition 3 for this more complicated WSD estimator which is based on order statistics. For this reason we recommend to make statistical inferences for the WSD approach by using bootstrap procedures, as e.g. in Goncalves and Meddahi (2009) or in Dette et al. (2022).

Simulation study

We illustrate our theoretical findings by means of an extensive Monte Carlo simulation study which is structured as follows. First, we introduce the U-shaped IP functional form and discuss the parameter choice for the model in (4)–(5). We generate intraday returns for the estimation of the IP shape and the construction of various realized measures. In Sect. 4.1 we study the impact of IP on various realized estimators. In Sect. 4.2 we investigate our IP-corrections in terms of MSE, whereby we compare our approach with that of Boudt et al. (2011).

We consider \(M=26\), \(M=78\), or \(M=390\) intraday returns, which roughly correspond to sampling at 15 min, 5 min, or 1 min for 6.5-hour trading days, respectively. For constant intraday volatility, we fix \(IV=\sigma ^2=1\) and \(IQ=\xi \cdot \sigma ^4 = \xi \). We generate M intraday returns for each of \(T=10^4\) days with \(r_{t,m} \sim {\mathcal {N}}(0,\gamma ^2_{t,m} s_m^2)\), where \(\gamma ^2_{t,m}\) and \(s_m^2\) are the respective SV and IP components. As a baseline, we set \(\gamma ^2_{t,m} = 1/M\), which corresponds to \(IV=1\) and \(IQ=\xi \).
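The baseline design can be sketched as follows. This is an illustration under stated assumptions, not the exact simulation code: the two-level profile `s2` is a hypothetical stand-in for the IP components (normalized so that \(\sum _m s_m^2=M\)), the function name `simulate_baseline_returns` is ours, and the realized quarticity is computed with the standard definition \(RQ=(M/3)\sum _m r_{t,m}^4\). Under the baseline \(\gamma ^2_{t,m}=1/M\), the average RV should be close to \(IV=1\) irrespective of the profile, whereas the average RQ should be close to the mean of \(s_m^4\), which we read as the finite-M counterpart of \(\xi \).

```python
import numpy as np

def simulate_baseline_returns(s2, T, rng):
    """Draw T days of M returns with r_{t,m} ~ N(0, s_m^2 / M) (baseline gamma^2 = 1/M)."""
    s2 = np.asarray(s2, dtype=float)
    M = s2.size
    return rng.standard_normal((T, M)) * np.sqrt(s2 / M)

M, T = 78, 10_000

# Hypothetical two-level profile: variance multiplier 2 in the first and last
# 13 intervals and 0.5 in between, so that sum(s2) = M.
s2 = np.full(M, 0.5)
s2[:13] = 2.0
s2[-13:] = 2.0

rng = np.random.default_rng(1)
r = simulate_baseline_returns(s2, T, rng)

rv = np.sum(r**2, axis=1)               # realized volatility per day
rq = (M / 3.0) * np.sum(r**4, axis=1)   # realized quarticity per day

# Average RV versus IV = 1, and average RQ versus mean(s_m^4).
print(rv.mean(), rq.mean(), np.mean(s2**2))
```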

Later we also consider a SV model where we assume that \(\gamma ^2_{t,m}\) is governed by the process

$$\begin{aligned} \Delta \!\gamma _{t,m}^2= & {} 0.035(0.636 - \gamma _{t,m}^2) \Delta t + 0.144\gamma _{t,m}^2 (\Delta t)^{1/2} u_{t,m}, \end{aligned}$$
(24)

where \(\Delta t=1/M\) and \(u_{t,m} \sim \text{ iid } \, {\mathcal {N}}(0,1)\), with \(\gamma _{{1,1}}^2 = 1\). This model is a discretized version of the GARCH(1,1) diffusion used by Andersen and Bollerslev (1998a) and Goncalves and Meddahi (2009), with parameters implying a high autoregressive persistence; in the time series context it is related to state-space models for realized volatilities (cf. Golosnoy et al. 2021). In the presence of intraday SV there is no exact analytical expression for the impact of IP on realized measures. However, since the SV in (24) follows a highly persistent process, the empirical contribution of \(\gamma ^2_{t,m}\) to intraday heteroskedasticity is of a smaller order compared to the impact of IP (see e.g., Christensen et al. 2018; Bekierman and Gribisch 2021). Moreover, from the empirical perspective, one could precisely estimate the SV components \(\gamma _{t,m}\) ex post, see Bekierman and Gribisch (2016). Then one could calculate the product components \({\hat{s}}_{t,m}={\hat{s}}_m {\hat{\gamma }}_{t,m}\) and compute the IP correction factors for day t based on the obtained estimates \({\hat{s}}_{t,m}\).
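The recursion in (24) can be sketched as a simple loop. The function name `simulate_sv_path` and its keyword arguments are hypothetical, and we assume that the path is carried over from the last interval of one day to the first interval of the next, with only the very first value fixed at one. Each step has the autoregressive coefficient \(1-0.035\,\Delta t\), which is very close to one and reflects the high persistence.

```python
import numpy as np

def simulate_sv_path(M, n_days, rng, kappa=0.035, theta=0.636, vol=0.144, gamma2_init=1.0):
    """Iterate (24) with dt = 1/M over n_days * M intervals; returns an (n_days, M) array."""
    dt = 1.0 / M
    n = M * n_days
    gamma2 = np.empty(n)
    gamma2[0] = gamma2_init                      # gamma^2_{1,1} = 1
    u = rng.standard_normal(n - 1)
    for i in range(n - 1):
        g = gamma2[i]
        # drift toward theta plus noise proportional to the current level
        gamma2[i + 1] = g + kappa * (theta - g) * dt + vol * g * np.sqrt(dt) * u[i]
    # Assumption: the path continues across days rather than restarting each morning.
    return gamma2.reshape(n_days, M)

rng = np.random.default_rng(3)
gamma2 = simulate_sv_path(M=78, n_days=5, rng=rng)
print(gamma2.shape, gamma2.min(), gamma2.max())
```

Because the noise term is proportional to the current level, the simulated values stay positive unless an extreme shock occurs, so the sketch does not truncate at zero.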

The IP components follow a quadratic convex U-shape given by

$$\begin{aligned} g(m/M) = \left[ c_1 + c_2(m-M/2)^2\right] ^2, \qquad \qquad c_1,c_2>0, \end{aligned}$$
(25)

so that the standardized IP values are \(s^2_m = g(m/M)/g_M\). The choice of the functional form in (25) is motivated by our empirical findings, see Fig. 5. Note that our theoretical results are applicable to any functional form of IP, including asymmetric mirrored J-shape specifications such as the more flexible asymmetric U-shaped IP specification in Hasbrouck (1999) and Andersen et al. (2012). Because of \(\sum _{m=1}^M s^2_m=M\), it holds for (25) that \(c_2= 12 \, (1-c_1)\,/\,(M^2+2)\). We select \(c_1 \in \) \(\{0.01,0.11,\dots ,0.91,1\}\), with the IP curvature most pronounced for \(c_1\rightarrow 0^+\), whereas \(c_1=1\) is the no-IP case. Next we focus on the value \(c_1=0.71\), which corresponds to the evidence from the U.S. stock market. Note that the IP form could be very distinct on ‘special’ days characterized by specific announcements and/or unexpected events, where the IP curvature could be much more (or less) pronounced.
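A minimal sketch of the profile in (25) is given below. The function name `ip_profile` is hypothetical, and we assume that \(g_M\) is the average of \(g(m/M)\) over \(m=1,\ldots ,M\), which enforces \(\sum _m s_m^2=M\). Since \(\sum _{m=1}^M (m-M/2)^2 = M(M^2+2)/12\), the stated value of \(c_2\) makes the bracketed term in (25) average exactly to one, which the code checks numerically.

```python
import numpy as np

def ip_profile(M, c1):
    """Return the bracket c1 + c2 (m - M/2)^2 from (25) and the standardized s_m^2."""
    m = np.arange(1, M + 1)
    c2 = 12.0 * (1.0 - c1) / (M**2 + 2)
    root = c1 + c2 * (m - M / 2) ** 2
    g = root**2
    s2 = g / g.mean()        # assumption: g_M is the mean of g, so that sum(s2) = M
    return root, s2

for M in (26, 78, 390):
    root, s2 = ip_profile(M, c1=0.71)
    s = np.sqrt(s2)
    # mean of the bracket, sum of s_m^2 relative to M, and peak-to-trough ratio of s_m
    print(M, root.mean(), s2.sum() / M, s.max() / s.min())
```

For \(c_1=0.71\) the peak-to-trough ratio of \(s_m\) is somewhat above two, which is in line with a pattern that is about twice as high at the open or close as at midday.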

Fig. 1 The asymptotic factor \(\xi \) in (10) and averages of finite sample estimates \({\hat{\xi }}_{M,TP}\), \({\hat{\xi }}_{M,QP}\), \({\hat{\xi }}_{M,RQ}\) for \(M=26,78,390\) depending on parameter \(c_1\) of the intraday profile (25)

The asymptotic scaling factor \(\xi \) defined in (10) is plotted as a function of \(c_1\) in Fig. 1, where we observe that \(\xi \) is substantially larger than one even for \(c_1\) close to one. As the IP is unknown in practice, we construct estimators \({\hat{g}}(\cdot )\) and \({\hat{s}}^2(\cdot )\) by applying the WSD estimator of Boudt et al. (2011), which does not require an a priori specification of the IP functional form; the implementation details for the WSD estimator are provided in the Appendix. Then we compute the average estimates of the scaling factors \({\hat{\xi }}_{M,RQ}\), \({\hat{\xi }}_{M,TP}\), \({\hat{\xi }}_{M,QP}\), which are plotted in Fig. 1. We observe that the factor \({\hat{\xi }}_{M,RQ}\) appears to be very close to \(\xi _M\) even for a fairly small value \(M=26\).

The impact of IP on realized measures

As the RV measure is not affected by IP, we study the IP impact on other measures, such as BV for IV, and TP and QP for IQ. After generating IID intraday returns as specified above, we calculate \(BV_t\), \(TP_t\), and \(QP_t\) as well as the IP-corrected estimators denoted by \({\widetilde{BV}}_{t}\), etc. for each day \(t=1,\ldots ,T\). Additionally, we compute the jump-robust medRV and minRV measures of Andersen et al. (2012). Then we build time averages, e.g. \({\overline{BV}}=(1/T) \cdot \sum _{t=1}^T BV_t\), for all measures. The averages of BV, \({\widetilde{BV}}\), medRV, and minRV are shown in Fig. 2, whereas those of TP, QP, \({\widetilde{TP}}\), and \({\widetilde{QP}}\) are shown in Fig. 3 for different M and \(c_1\) values. All measures should be equal to one in the no-IP case \(c_1=1\).
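The source of the IP-bias in BV can be made explicit with a short sketch. We assume the baseline design \(r_{t,m}\sim {\mathcal {N}}(0,s_m^2/M)\) with independent returns and the BV normalization \(M/(M-1)\cdot \pi /2\) used in (26). Since \(E|r_{t,m}| = \sqrt{2/\pi }\, s_m/\sqrt{M}\), it follows that \(E[BV_t] = (M-1)^{-1}\sum _{m=2}^M s_m s_{m-1}\), which is below one for a U-shaped profile with \(\sum _m s_m^2 = M\). The helper names (`u_shape_s`, `bipower_variation`, `expected_bv`) are hypothetical, and \(g_M\) is again taken as the mean of \(g\).

```python
import numpy as np

def u_shape_s(M, c1):
    """s_m implied by (25), assuming g_M is the mean of g so that sum(s_m^2) = M."""
    m = np.arange(1, M + 1)
    c2 = 12.0 * (1.0 - c1) / (M**2 + 2)
    g = (c1 + c2 * (m - M / 2) ** 2) ** 2
    return np.sqrt(g / g.mean())

def bipower_variation(r):
    """BV per day with the finite-sample factor M/(M-1) and scaling pi/2."""
    M = r.shape[-1]
    a = np.abs(r)
    return (M / (M - 1)) * (np.pi / 2) * np.sum(a[..., 1:] * a[..., :-1], axis=-1)

def expected_bv(s):
    """E[BV] for independent r_m ~ N(0, s_m^2 / M) and IV = 1."""
    return np.sum(s[1:] * s[:-1]) / (s.size - 1)

M, T, c1 = 26, 10_000, 0.5
s = u_shape_s(M, c1)
rng = np.random.default_rng(2)
r = rng.standard_normal((T, M)) * s / np.sqrt(M)

rv = np.sum(r**2, axis=1)
bv = bipower_variation(r)
# Average RV stays near 1, while average BV tracks expected_bv(s), which is below 1.
print(rv.mean(), bv.mean(), expected_bv(s))
```

The ratio of IV to this expectation is the kind of quantity a finite-M correction has to offset; the exact definition of \(\zeta _M\) is the one given in the theoretical part and is not reproduced here.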

Fig. 2 Uncorrected BV, medRV, minRV as well as corrected \({\widetilde{BV}}\) as a function of \(c_1\)

In Fig. 2 the IP-bias in BV is quite pronounced for \(M=26\) and \(M=78\) for large and medium curvatures and is still visible even for \(M=390\). Remarkably, the biases in medRV and minRV are even stronger than in BV. As expected, the bias-corrected mean of \({\widetilde{BV}}\) is close to the true IV for all M, so the suggested correction functions properly.

The averages of the original TP, QP and the scaled \({\widetilde{TP}}\), \({\widetilde{QP}}\) are reported in Fig. 3. The original measures are downward biased for finite \(M=26,78\), whereas for \(M=390\) they are almost unbiased and close to the asymptotic value \(\xi \), see Fig. 1. Remarkably, in the case of 5 min returns with \(M=78\) the bias is still quite substantial for empirically relevant values of the IP curvature parameter \(c_1\in [0.6,0.8]\). As expected, QP is more biased than TP for finite M due to the longer lags involved in its computation. Our scaled \({\widetilde{TP}}\) and \({\widetilde{QP}}\) measures are equal to the value \(\sigma ^4=1\) for all considered values of \(c_1\) and M, which is an appealing property. Summarizing, the IP has a substantial influence on BV, minRV, and medRV, so that IP-corrections are needed to get valid statistical inference on IV.
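The ordering of the TP and QP biases can be illustrated with a sketch that uses the standard tri-power and quad-power quarticity definitions with finite-sample factors \(M/(M-2)\) and \(M/(M-3)\); these normalizations are an assumption and may differ in detail from the definitions in the theoretical part. Under the baseline design with independent \(r_{t,m}\sim {\mathcal {N}}(0,s_m^2/M)\), the expectations are \(E[TP]=(M-2)^{-1}\sum _m (s_m s_{m-1} s_{m-2})^{4/3}\) and \(E[QP]=(M-3)^{-1}\sum _m s_m s_{m-1} s_{m-2} s_{m-3}\), which we compare with the mean of \(s_m^4\). All function names are hypothetical.

```python
import math
import numpy as np

def abs_moment(p):
    """E|Z|^p for Z ~ N(0, 1)."""
    return 2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)

def tripower_quarticity(r):
    M = r.shape[-1]
    a = np.abs(r) ** (4.0 / 3.0)
    prod = a[..., 2:] * a[..., 1:-1] * a[..., :-2]
    return M * abs_moment(4.0 / 3.0) ** -3 * (M / (M - 2)) * prod.sum(axis=-1)

def quadpower_quarticity(r):
    M = r.shape[-1]
    a = np.abs(r)
    prod = a[..., 3:] * a[..., 2:-1] * a[..., 1:-2] * a[..., :-3]
    return M * (math.pi ** 2 / 4) * (M / (M - 3)) * prod.sum(axis=-1)

def expected_tp(s):
    p = s ** (4.0 / 3.0)
    return np.sum(p[2:] * p[1:-1] * p[:-2]) / (s.size - 2)

def expected_qp(s):
    return np.sum(s[3:] * s[2:-1] * s[1:-2] * s[:-3]) / (s.size - 3)

def u_shape_s(M, c1):
    """s_m implied by (25), assuming g_M is the mean of g."""
    m = np.arange(1, M + 1)
    c2 = 12.0 * (1.0 - c1) / (M**2 + 2)
    g = (c1 + c2 * (m - M / 2) ** 2) ** 2
    return np.sqrt(g / g.mean())

for M in (26, 78, 390):
    s = u_shape_s(M, c1=0.7)
    # expected TP, expected QP, and the mean of s_m^4 they are compared with
    print(M, expected_tp(s), expected_qp(s), np.mean(s**4))
```

In this sketch the gap to the mean of \(s_m^4\) comes mainly from the intervals near the open and close, which carry the largest \(s_m\) but enter fewer products; QP loses more of them than TP, and the gap shrinks as M grows.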

Fig. 3 Original QP and TP as well as scaled \({\widetilde{QP}}\) and \({\widetilde{TP}}\) as a function of \(c_1\)

In order to shed light on the finite sample validity of the result in Proposition 3, we provide exemplary QQ-plots in Fig. 4 for the statistics \({\hat{\zeta }}_M\) and \({\hat{\xi }}_M\) for the case of \(M=78\) and \(c_1 = 0.6\), which are based on the WSD estimator of IP (cf. Boudt et al. 2011). Being standardized properly, both statistics seem to approach normality as the estimation period T increases; in particular, even for \(T=250\) the QQ-plots show a decent fit.

Fig. 4 QQ-plots for the statistics \({\hat{\zeta }}_M\) (top panel) and \({\hat{\xi }}_M\) (bottom panel) for various values of T, \(M=78\) and \(c_1 = 0.6\)

The comparison of IP-bias corrected estimators

In Proposition 1 we show analytically that the original BV is downward biased, so that the true level of risk measured by IV is underestimated. This is a rather undesirable scenario from the risk management point of view, and it makes an MSE comparison of biased and unbiased measures unreasonable because the MSE treats upward and downward biases symmetrically. For this reason, we provide an MSE comparison only for IP-bias corrected estimators.

In particular, we contrast the relative bias and MSE of our corrected estimator \({\widetilde{BV}}_t\) in (11) with those of Boudt et al. (2011) where the estimated IP component \({\hat{s}}_m\) is immediately removed from intraday returns by computing \(r^*_{t,m}=r_{t,m}/{\hat{s}}_m\). Then the BCL-estimator \(BV^*_t\) is given as

$$\begin{aligned} BV^*_t = \frac{M}{M-1} \cdot \frac{\pi }{2}\cdot \sum _{m=2}^M |r^*_{t,m}||r^*_{t,m-1}|, \qquad \text{ with } \qquad E[BV^*_t]=IV_t. \end{aligned}$$
(26)

This immediate removal of IP appears to be a common approach in the current literature (cf. Golosnoy et al. 2012; Bekierman and Gribisch 2021; Christensen et al. 2018), whereby the IP estimates \({\hat{s}}_m\) are based on historical data.
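The filtering step behind (26) can be sketched as follows. The IP estimate used here, `simple_ip_estimate`, is a hypothetical moment-based stand-in (interval-wise averages of squared returns, normalized so that \(\sum _m {\hat{s}}_m^2=M\)) and is not the WSD estimator used in the simulations; the two-level true profile is likewise only a toy example.

```python
import numpy as np

def simple_ip_estimate(r_hist):
    """Stand-in IP estimate: interval-wise mean of r^2, scaled so that sum(s_hat^2) = M."""
    v = np.mean(r_hist**2, axis=0)
    return np.sqrt(v / v.mean())

def bcl_bipower(r, s_hat):
    """BV* as in (26): bipower variation of the filtered returns r / s_hat."""
    a = np.abs(r / s_hat)
    M = a.shape[-1]
    return (M / (M - 1)) * (np.pi / 2) * np.sum(a[..., 1:] * a[..., :-1], axis=-1)

M = 78
idx = np.arange(M)
# Toy true profile (hypothetical): s_m^2 = 2 near the open and close, 0.5 otherwise.
s = np.sqrt(np.where((idx < 13) | (idx >= M - 13), 2.0, 0.5))

rng = np.random.default_rng(4)
r_hist = rng.standard_normal((250, M)) * s / np.sqrt(M)    # pre-sample days
s_hat = simple_ip_estimate(r_hist)

r_new = rng.standard_normal((5_000, M)) * s / np.sqrt(M)   # days of interest, IV = 1
# Average BV* with estimated IP versus unfiltered BV (s_hat replaced by ones).
print(bcl_bipower(r_new, s_hat).mean(), bcl_bipower(r_new, np.ones(M)).mean())
```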

To contrast the relative biases and MSEs of \(BV^*_t\) and \({\widetilde{BV}}_t\), we generate intraday returns for \(T=250\) pre-sample days with the IP parameter value \(c^h_1=0.5\) as in (4) and (25) and use them to get the IP estimates \({\hat{s}}_m\). We denote by \(c^h_1\) the ‘historical’ value of \(c_1\), assumed to be constant during the pre-sample period. Then, we focus on the next day’s IP, whose functional form is described by the ‘current value’ of \(c_1\) denoted by \(c^c_1\). That is, for this next (single) day of interest, intraday returns are generated with either an unchanged current IP parameter \(c^c_1=0.5=c^h_1\) or a changed parameter \(c^c_1\ne c^h_1\). The latter case could occur on some ‘special’ days characterized by announcements, unexpected events etc. Note that the values \(c^c_1=0.1\) (extremely pronounced IP) and \(c^c_1=0.9\) (almost no IP) are not empirically relevant but are considered for illustration purposes only. We calculate \(BV^*\) and \({\widetilde{BV}}\) for this day of interest using the historical IP estimates \({\hat{s}}_m\), so there is an IP misspecification when \(c^c_1\ne c^h_1\). We repeat the procedure (generating \(T=250\) pre-sample days and an additional ‘day of interest’) \(10^4\) times and report the computed relative biases and MSEs in Table 1, where we show results for constant intraday volatility in Block A and for intraday SV in Block B.
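The effect of an IP change on the bias can be sketched without simulation if the estimation noise in \({\hat{s}}_m\) is ignored, i.e. if the historical profile is assumed to be known exactly. For independent \(r_{t,m}\sim {\mathcal {N}}(0,(s^c_m)^2/M)\) with \(IV=1\), filtering by the historical profile \(s^h_m\) gives \(E[BV^*]=(M-1)^{-1}\sum _m \rho _m\rho _{m-1}\) with \(\rho _m=s^c_m/s^h_m\). For comparison, we also compute the expectation of BV rescaled by a multiplicative factor that is unbiased under the historical profile; this ratio-type rescaling is our stand-in for the correction in (11), whose exact form is not reproduced here. All names are hypothetical, and the sketch covers bias only, not MSE.

```python
import numpy as np

def u_shape_s(M, c1):
    """s_m implied by (25), assuming g_M is the mean of g."""
    m = np.arange(1, M + 1)
    c2 = 12.0 * (1.0 - c1) / (M**2 + 2)
    g = (c1 + c2 * (m - M / 2) ** 2) ** 2
    return np.sqrt(g / g.mean())

def expected_bv_star(s_cur, s_hist):
    """E[BV*] when returns with profile s_cur are filtered by s_hist (no estimation noise)."""
    rho = s_cur / s_hist
    return np.sum(rho[1:] * rho[:-1]) / (rho.size - 1)

def expected_bv_rescaled(s_cur, s_hist):
    """E[BV] under s_cur times a multiplicative correction calibrated on s_hist (stand-in)."""
    return np.sum(s_cur[1:] * s_cur[:-1]) / np.sum(s_hist[1:] * s_hist[:-1])

M, c1_hist = 78, 0.5
s_hist = u_shape_s(M, c1_hist)
for c1_cur in (0.1, 0.5, 0.7, 0.9):
    s_cur = u_shape_s(M, c1_cur)
    # relative bias of the filtered and of the rescaled estimator (IV = 1)
    print(c1_cur,
          expected_bv_star(s_cur, s_hist) - 1.0,
          expected_bv_rescaled(s_cur, s_hist) - 1.0)
```

Both expectations equal one when \(c^c_1=c^h_1\); when the profile changes, the filtered estimator inherits the product of the ratios \(\rho _m\) across the whole day, whereas the rescaled estimator is affected only through the average of \(s_m s_{m-1}\).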

Hence, we study both the effect of estimation risk in case of unchanged IP and the effect of a change in the IP form. The latter is of much practical importance, as there are many empirical confirmations for time variability of IP even after excluding days with important macroeconomic announcements, see Andersen et al. (2001), Hecq et al. (2012), or Andersen et al. (2019).

Table 1 Relative bias and MSE of \(BV^*\) and \({\widetilde{BV}}\). The IP parameter \(c_1\) equals \(c^h_1\) during the estimation period of the IP components and \(c^c_1\) when the realized measures are calculated

We observe in Table 1 that in the case of no IP change with \(c^h_1=c^c_1\), the realized measure \(BV^*\) of Boudt et al. (2011) is preferable in terms of relative bias and MSE. These findings are in line with the recent theoretical results of Ghysels et al. (2021), who show under similar assumptions that the most efficient quarticity estimators (in terms of MSE) can be obtained by an immediate adjustment of intraday returns for the IP, as e.g. in Boudt et al. (2011). However, even a comparatively small change in IP, e.g. from \(c^h_1=0.5\) to \(c^c_1=0.7\), leads to a substantial increase in the relative bias and MSE of \(BV^*\). This evidence also holds for intraday SV in Block B. Hence, our IP-correction approach is more robust than the procedure of Boudt et al. (2011), which is rather sensitive to even small changes in the IP form.

Empirical study

In our application we work with intraday returns for the Dow Jones Industrial Average Index with the focus on measuring daily IV. Our dataset consists of intraday observations from January 1996 to December 2010, with days of non-regular trading time skipped. Days with non-regular trading times are those on which the trading time has been substantially shorter than the usual 6.5 hours for one reason or another. We consider 15 min, 10 min, 5 min and 2 min intraday returns; 5 min returns are the most popular choice in practice. To avoid opening bias effects, we skip the first daily observation, which is common practice for escaping the impact of noisy overnight quotes. The final sample consists of 3329 days with \(M=24, 37, 76\) or 193 observations for the 15, 10, 5 or 2 min frequency, respectively.

We estimate the IP components \(s_m\) and calculate the IP correction factors \(\zeta \) and \(\xi \). Then we present descriptive statistics for both uncorrected and IP-corrected realized measures. Finally, we discuss the impact of the IP bias on some further realized measures which are proposed in the literature.

Estimation of intraday pattern and descriptive statistics

We estimate the IP with the non-parametric WSD estimator of Boudt et al. (2011) and show the estimates in Fig. 5, with components \({\hat{s}}_m\) normalized such that \(\sum _{m=1}^M {\hat{s}}^2_m=M\). The IP has a convex U-shape for all considered sampling frequencies: it is high during the morning and afternoon hours and low during the lunch break. In numerical terms, it is about twice as high at the morning peak as at the trough in the middle of the day. Hence, in our empirical application to the Dow Jones index the IP estimate approximately corresponds to \(c_1\approx 0.7\) in our simulation study. However, the IP curvature could be more pronounced for individual stocks. For example, in Christensen et al. (2018) the IP curvature parameter takes values around \(c_1=0.60\) for some assets from the Dow Jones index.

Fig. 5 Full sample estimated intraday patterns of the Dow Jones index

Given these IP estimates, we calculate the finite M scaling factors for IP bias correction, all reported in Table 2 with 95% confidence intervals obtained by the bootstrap procedure with \(10^3-1\) replications outlined in Dette et al. (2022). The estimated factor \({\hat{\xi }}_{M,RQ}\) is larger than one and numerically similar for all sampling frequencies, which corresponds to the evidence of similar patterns in Fig. 5. We also estimate finite sample corrections for BV, TP, and QP. The correction estimate \({\hat{\zeta }}_M\) for BV gets closer to one with increasing sampling frequency. The same holds for the other finite M scaling factors. For example, we get \({\hat{\xi }}_{M,QP}\approx 1.23\) for the 15 min frequency but only \({\hat{\xi }}_{M,QP}\approx 1.03\) for 5 min returns. This evidence indicates that the IP bias problem should not be very acute for the U.S. stock market during the considered period of time.

Table 2 Estimated empirical IP factors with bootstrapped 95% confidence intervals in parentheses

Next, we provide descriptive statistics of the realized estimators. In Table 3 we report the full sample averages of both uncorrected and IP-corrected realized measures of daily IV and IQ. The average RV is stable across all sampling frequencies. The average BV is biased downwards compared to RV even after the IP correction; however, the corrected BV gets numerically closer to RV. The average values for TP and QP are almost the same for the 15 and 10 min frequencies, but are much higher for 5 min and 2 min returns. Note that there are reported difficulties in estimating IQ based on frequencies of 5 min or higher (Andersen et al. 2014). Another important quantity is the average relative component \((RV-BV)/RV\) which, as expected, gets substantially smaller with the IP-bias corrected \({\widetilde{BV}}\) for all sampling frequencies.

Table 3 Empirical averages of both uncorrected and IP-corrected realized measures

Discussion and further extensions

Our study is primarily focused on the popular realized measures which are widely used in practice. Recently, there have been many developments in the estimation of IV and IQ based on high frequency observations. In particular, we would like to mention the robust threshold power variations (Corsi et al. 2010), the minRV and medRV estimators (Andersen et al. 2012), and the nearest neighborhood generalizations (Andersen et al. 2014). Moreover, in order to mitigate the MMN problems which arise when using ultra high frequency returns, several noise robust methods have been developed, as e.g. the pre-averaged BV (Podolskij and Vetter 2009), pre-averaged threshold RV (cf. Aït-Sahalia and Jacod 2014), or pre-averaged threshold BV (Christensen et al. 2014). The IP bias problem is also present for some of these more advanced realized measures, as can be shown by a simple Monte Carlo simulation exercise. To illustrate this point, we plot the IP-biases for the minRV and medRV measures of Andersen et al. (2012) in Fig. 2 and observe that they are even larger than for the original BV estimator. The same argument applies to the minRQ and medRQ measures of IQ which are elaborated by Andersen et al. (2014). Of course, the IP-correction factors should be newly derived for each particular measure; however, this challenging task is beyond the scope of our paper and is left for future research, as is e.g. the task of deriving IP correction factors for realized portfolio weights (Golosnoy et al. 2019, 2020, or Golosnoy and Gribisch 2022). The impact of IP on these additional estimators is investigated in the follow-up paper of Dette et al. (2022).

Summary

The availability of intraday high frequency returns on risky assets allows the construction of precise realized estimators such as realized volatility (RV) and bipower variation (BV), which are commonly applied for the estimation of daily integrated volatility (IV), or tri-power (TP) and quad-power (QP) variations, which serve as measures of daily integrated quarticity (IQ).

In this paper we investigate the impact of intraday periodicity (IP) on the finite sample properties of these realized measures. For our analysis we assume a discrete time model for intraday returns on risky assets and postulate a multiplicative deterministic IP component which is often of U-shape empirically. For a number of intraday returns \(M\rightarrow \infty \) the impact of IP is asymptotically negligible. However, we show that the IP-impact should be taken into account in the practically relevant situation with finite M. In particular, we prove that finite sample corrections of BV as well as of the TP and QP measures are necessary to obtain valid statistical inferences concerning daily IV. We derive analytically the factors for IP-correction and analyze their stochastic properties. Our results are illustrated by means of a Monte Carlo simulation study for both constant and stochastic intraday volatility models. Finally, we evaluate IP correction factors empirically for daily IV of the Dow Jones Industrial Average Index.