1 Introduction

Risk management has spurred a vast literature in financial econometrics aimed at meeting the challenges posed by the Basel-II and Basel-III agreements and at developing model-based approaches to forecast regulatory capital requirements (Kinateder, 2016). For tail market risk, special attention has been devoted to the Value-at-Risk (VaR) measure at a given confidence level \(\tau \), VaR(\(\tau \)), defined as the return threshold that will not be breached with probability \(1-\tau \) over a specific horizon (Jorion, 1997). The VaR is complemented by another tail risk measure, the Expected Shortfall (ES), defined as the conditional expectation of returns in excess of the VaR (see Acerbi & Tasche, 2002a, Rockafellar & Uryasev, 2002, among others). Unlike the VaR, the ES is a coherent risk measure (Artzner et al., 1999; Acerbi & Tasche, 2002b) and provides deeper information on the shape and heaviness of the tail of the loss distribution. Together, these measures represent the most popular benchmark in risk management practice (Christoffersen & Gonçalves, 2005; Sarykalin et al., 2008).

Being the \(\tau \)-quantile of the portfolio return distribution, the VaR(\(\tau \)) can be predicted as the product of the portfolio volatility forecast and the quantile of the hypothesized innovation distribution. For the first component, volatility clustering, modeled by conditionally autoregressive models (such as the ARCH/GARCH - Engle, 1982; Bollerslev, 1986), produces good forecasts capable of reproducing well-known stylized facts of financial time series, including skewed behavior and fat tails (Cont, 2001, Engle & Patton, 2001, among others). Further improvements were made possible by the direct predictability of realized measures of financial volatility (Andersen et al., 2006b). While the choice of a specific parametric distribution for the innovation term may have little influence on model parameter estimation (Bollerslev & Wooldridge, 1992), barring a few extreme events (e.g. the Flash Crash of May 2010 or the presence of outliers, Carnero et al., 2012), a wrong distributional choice for the innovation term delivers inaccurate quantiles and hence inadequate VaR(\(\tau \)) forecasts: see for example Manganelli and Engle (2001) and El Ghourabi et al. (2016).

As an alternative, the VaR(\(\tau \)) can be derived directly through quantile regression methods (Koenker & Bassett, 1978; Engle & Manganelli, 2004), where no distributional hypothesis is required. A first suggestion in this direction comes from Koenker and Zhao (1996), who use quantile regression for a particular class of ARCH models, the Linear ARCH models (Taylor, 1986), chosen for their tractability in deriving theoretical properties. Subsequent refinements include, for instance, Xiao and Koenker (2009), Lee and Noh (2013) and Zheng et al. (2018) for GARCH models, Noh and Lee (2016) who consider asymmetry, Chen et al. (2012) who consider a nonlinear quantile regression approach with intra-day prices, Bayer (2018) who combines VaR forecasts via penalized quantile regressions, Taylor (2019) who exploits the Asymmetric Laplace distribution to jointly estimate VaR and ES, and the multivariate generalization of Merlo et al. (2021).

A relatively recent stream of literature investigates the value of information provided by data available at both high- and low-frequency incorporated into the same model in assessing the dynamics of financial market activity: this is the case of the GARCH-MIDAS model proposed by Engle et al. (2013) (building on the MI(xed)-DA(ta) Sampling approach by Ghysels et al., 2007), the regime switching GARCH-MIDAS of Pan et al. (2017), the recent paper by Xu et al. (2021) who consider a MIDAS component in the Conditional Autoregressive Value-at-Risk (CAViaR) of Engle and Manganelli (2004), the work of Pan et al. (2021) where the parameters of the GARCH-MIDAS models for jointly calculating VaR and ES are obtained through the loss function of Fissler and Ziegel (2016), and the contribution of Xu et al. (2022) who calculate the weekly tail risks of three market indices using information from daily variables.

The main contribution of this paper is a novel Mixed-Frequency Quantile Regression model (MF-QR, extending Koenker & Zhao, 1996): we show how the constant term in the quantile regression can be written as a function of data sampled at lower frequencies (and hence becomes a low-frequency component), while the high-frequency component is regulated by the daily data. As a result, with the aim of capturing dependence on the business cycle, we benefit from the information contained in low-frequency variables (cf. Mo et al., 2018, Conrad & Loch, 2015, among others), and we achieve a rather flexible representation of volatility dynamics. Since both components enter additively, our model can be seen as a quantile model version of the Component GARCH by Engle and Lee (1999).

In the proposed model, we also include a predetermined variable observed daily, typically a realized measure: this adds the “–X” component in the resulting MF-QR-X model. Such a variable can capture extra information useful in modeling and forecasting future volatility and may improve the accuracy of tail risk forecasts. Its use in the quantile regression framework is not new in itself: Gerlach and Wang (2020) jointly forecast VaR and ES, and Zhu et al. (2021) predict the VaR by adopting a GARCH-X model for the volatility term. Žikeš and Baruník (2016) also use realized measures in a quantile regression context to investigate the features of the conditional quantiles of realized volatility and asset returns.

The proposed MF-QR-X specification and its nested alternatives (including the QR version of Koenker and Zhao, 1996) belong to the class of semi-parametric models: they do not resort to restrictive assumptions about the error term distribution and calculate the VaR directly. The model can also jointly forecast the VaR and ES via the Asymmetric Laplace distribution, as proposed by Taylor (2019).

From a theoretical point of view, we provide conditions for the weak stationarity of the suggested daily return process. The finite sample properties are investigated through an extensive Monte Carlo exercise. The empirical application focuses on the VaR and ES predictive capability for two energy commodities, the West Texas Intermediate (WTI) Crude Oil and the Reformulated Blendstock for Oxygenate Blending (RBOB) Gasoline futures, both observed daily. The period under investigation starts in January 2010 and ends in July 2022, covering both the Covid-19 pandemic and some consequences of the Russian invasion of Ukraine. The competing models include many common parametric, semi-parametric and non-parametric choices. Some parametric models, like the GARCH-MIDAS, use the same low-frequency variable employed in the proposed MF-QR-X specification. Given our empirical interest in evaluating risks related to energy commodities, a relevant choice for such a variable is the geopolitical risk (GPR) index proposed by Caldara and Iacoviello (2022), observed monthly. The resulting VaR and ES predictions are evaluated in- and out-of-sample, according to customary backtesting procedures: our out-of-sample period starts in January 2017 and ends in July 2022, and the VaR and ES forecasts are obtained using a rolling window that updates the parameter estimates every five, ten and twenty days. The results show that our MF-QR-X outperforms all the other competing models considered, confirming the merits of resorting to a mixed-frequency source of information. From a risk management perspective, the useful contribution of a low-frequency variable thus lies in its capability of capturing secular movements in the conditional distribution related to risk factors that shift slowly through time.

The rest of the paper is organized as follows. In Sect. 2 we introduce the notation and the basis for a dynamic model for the VaR and ES and we provide details of the conditional quantile regression approach. Section 3 presents our MF-QR-X model. Section 4 is devoted to the Monte Carlo experiment. Section 5 details the backtesting procedures. Section 6 illustrates the empirical application. Conclusions follow.

2 Approaches to VaR and ES estimation

For the purposes of this paper we will adopt a double time index, it, where \(t=1,\ldots ,T\) scans a low frequency time scale (i.e., monthly) and \(i=1,\ldots ,N_t\) identifies the day of the month, with a varying number of days \(N_t\) in the month t, and an overall number N of daily observations \(N=\sum _{t=1}^T N_t\). Let the daily returns \(r_{i,t}\) be, as customarily defined, the log-first differences of prices of an asset or a market index, and let the information available at time it be \(\mathcal {F}_{i,t}\). In what follows, we are interested in the conditional distribution of returns, with the assumption:

$$\begin{aligned} r_{i,t} = \sigma _{i,t} z_{i,t} \quad \text {with} \quad t=1,\ldots ,T, \ i=1,\ldots ,N_t, \end{aligned}$$
(1)

where \(z_{i,t} \overset{iid}{\sim }(0,1)\) have a cumulative distribution function denoted by \(F(\cdot )\). The zero conditional mean assumption in Eq. (1) is not restrictive; in fact, when explicitly modeled, such a conditional mean is very close to zero, consistently with the market efficiency hypothesis.

Based on this setup, the conditional (one-step-ahead) VaR for day it at \(\tau \) level (\(VaR_{i,t}(\tau )\)) for \(r_{i,t}\) is defined as

$$\begin{aligned} Pr(r_{i,t} < VaR_{i,t}(\tau )|\mathcal {F}_{i-1,t})= \tau , \end{aligned}$$

i.e., the \(\tau \)-th conditional quantile of the series \(r_{i,t}\), given \(\mathcal {F}_{i-1,t}\); consequently, we can write

$$\begin{aligned} VaR_{i,t}(\tau )\equiv Q_{r_{i,t}}\left( \tau |\mathcal {F}_{i-1,t}\right) = \sigma _{i,t} F^{-1}(\tau ), \end{aligned}$$
(2)

where \({F^{-1}(\tau )}= \inf \left\{ z_{i,t}: F(z_{i,t}) \ge \tau \right\} \). For a given \(\tau \), the traditional volatility–quantile approach to estimating the \(VaR_{i,t}(\tau )\) is thus based on modeling \(\sigma _{i,t}\) either through a dynamic model of the conditional variance of returns (following Engle, 1982, Bollerslev, 1986) or as the conditional expectation of a realized measure (Andersen et al., 2006a), and on retrieving the constant \(F^{-1}(\tau )\) either parametrically or nonparametrically. In either case, from an empirical point of view, it turns out that distribution tests mostly reject specific parametric choices, while using the empirical distribution is prone to bias/variance problems and a lack of stability through time.

Alternatively, we can estimate \(Q_{r_{i,t}}\left( \tau |\mathcal {F}_{i-1,t}\right) \) directly using a quantile regression approach (Koenker & Bassett, 1978; Engle & Manganelli, 2004), which has become a widely used technique in many theoretical problems and empirical applications. While classical regression aims at estimating the mean of a variable of interest conditional on regressors, quantile regression models the conditional quantiles of a response variable with respect to a set of covariates, giving a more robust and complete picture of the entire conditional distribution. This approach is well suited to situations where features like skewness, fat tails, outliers, truncation, censoring and heteroskedasticity are present. The basic idea behind the quantile regression approach, as shown by Koenker and Bassett (1978), is that the \(\tau \)-th quantile of a variable of interest (in our case \(r_{i,t}\)), conditional on the information set \(\mathcal {F}_{i-1,t}\), can be directly expressed as a linear combination of a \((q+1)\)-vector of variables \({x}_{i-1,t}\) (including a constant term), with parameters \({\Theta }_\tau \), that is:

$$\begin{aligned} Q_{r_{i,t}}\left( \tau |\mathcal {F}_{i-1,t}\right) = {x}_{i-1,t}' {\Theta }_\tau . \end{aligned}$$
(3)

An estimator for the \((q+1)\) vector of coefficients \({{\Theta }}_\tau \) is obtained minimizing a suitable loss function (also known as check function):

$$\begin{aligned} \hat{{\Theta }}_\tau = {\mathop {\mathrm{arg\,min}}\limits _{{\Theta }_\tau }} \sum _{i,t} \rho _{\tau } \left( r_{i,t} - {x}_{i-1,t}' {\Theta }_\tau \right) , \end{aligned}$$
(4)

with \(\rho _\tau (u)=u\left( \tau - \mathbbm {1}\left( u<0\right) \right) \), where \(\mathbbm {1}\left( \cdot \right) \) denotes an indicator function. In our context, the advantage of such an approach is to avoid the need to specify the distribution of \(z_{i,t}\) in Eq. (1), either parametrically or nonparametrically.
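As a minimal illustration of the check-function minimization in Eq. (4), the following Python sketch (assuming only numpy and scipy) fits a quantile regression by derivative-free optimization; the toy data and function names are purely illustrative and do not reproduce the estimation code used in the paper.

```python
import numpy as np
from scipy.optimize import minimize

def check_loss(u, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def fit_quantile_regression(y, X, tau, theta0=None):
    """Minimize sum rho_tau(y - X theta) over theta, as in Eq. (4)."""
    theta0 = np.zeros(X.shape[1]) if theta0 is None else theta0
    objective = lambda theta: np.sum(check_loss(y - X @ theta, tau))
    # Nelder-Mead avoids derivatives of the non-smooth objective.
    res = minimize(objective, theta0, method="Nelder-Mead",
                   options={"maxiter": 20000, "xatol": 1e-8, "fatol": 1e-8})
    return res.x

# Toy usage: tau = 0.05 quantile of returns regressed on a constant and |r_{i-1,t}|.
rng = np.random.default_rng(0)
r = 0.01 * rng.standard_t(df=5, size=1000)
X = np.column_stack([np.ones(len(r) - 1), np.abs(r[:-1])])
theta_tau_hat = fit_quantile_regression(r[1:], X, tau=0.05)
```

In applied work, linear-programming solvers (as implemented in standard quantile regression routines) are preferable for the non-smooth objective; the sketch only conveys the structure of the problem in Eq. (4).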

Following the approach by Koenker and Zhao (1996), we assume a dependence of \(\sigma _{i,t}\) on past absolute values of returns:

$$\begin{aligned} \sigma _{i,t} =\beta _0 + \beta _1 |r_{i-1,t}| + \cdots + \beta _q |r_{i-q,t}|, \quad \text { with } \quad t=1,\ldots ,T, \ i=1,\ldots ,N_t, \end{aligned}$$
(5)

with \(0<\beta _0<\infty \), \(\beta _1,\ldots ,\beta _q\ge 0\). Thus, substituting the generic term \({x}_{i-1,t}\) in Eq. (3) with the specific vector in Eq. (5), we have

$$\begin{aligned} \sigma _{i,t}=\left( 1, |r_{i-1,t}|,\ldots ,|r_{i-q,t}|\right) '\left( \beta _0,\beta _1,\ldots ,\beta _q\right) = x_{i-1,t}'\Theta . \end{aligned}$$
(6)

Such an approach turns out to be convenient, since it allows for a direct comparability of the two setups to estimate the VaR(\(\tau \)) in Eq. (2):

$$\begin{aligned} VaR_{i,t}(\tau ) = \left\{ \begin{array}{ll} x_{i-1,t}'\Theta \ F^{-1}(\tau )&{}\text {volatility--quantile} \\ x_{i-1,t}'\Theta _\tau &{}\text {conditional quantile regression}, \end{array} \right. \end{aligned}$$
(7)

which establishes the equivalence \(\Theta \, F^{-1}(\tau )=\Theta _\tau \), which will prove useful later in our Monte Carlo simulations. Moreover, as also pointed out by Koenker and Zhao (1996), what we estimate in the conditional quantile regression framework are the parameters in \(\Theta _{\tau }\), which differ from the parameters in \(\Theta \) of the volatility–quantile context. While the parameters in \(\Theta \) are constrained to be non-negative, the parameters in \(\Theta _{\tau }\) may be negative, depending on the value of \(\tau \). The volatility–quantile and the conditional quantile regression options in Eq. (7) give rise to the so-called parametric and semi-parametric models for the VaR, respectively. The most prominent example of a non-parametric approach to derive the VaR is, instead, the Historical Simulation (HS - Hendricks, 1996), which calculates this risk measure as the empirical quantile over a window of returns of length w, that is:

$$\begin{aligned} VaR_{i,t}(\tau )={Q}_{{\varvec{r}}_{i,t}^{w}}(\tau ), \end{aligned}$$
(8)

where \({\varvec{r}}_{i,t}^{w} = (r_{i-w,t},r_{i-w+1,t},\dots ,r_{i-1,t})\).

The linear representation in (5) can be further justified by noting that the term \(\sigma _{i,t}\) defining the volatility of returns can also be seen as the conditional expectation of absolute returns in the Multiplicative Error Model representation used by Engle and Gallo (2006):

$$\begin{aligned} |r_{i,t}|=\sigma _{i,t} \eta _{i,t}. \end{aligned}$$
(9)

The term \(\eta _{i,t}\) is an i.i.d. innovation with non-negative support and unit expectation, and Eq. (9) can be used to derive an estimate of the VaR. The representation in (5) can also be seen as a simple and convenient nonlinear autoregressive model for \(|r_{i,t}|\) with multiplicative errors, which we hold as the maintained base specification to explore the merits of our proposal. Moreover, this lays the groundwork for extending the approach, using other specifications for \(\sigma _{i,t}\) in Eq. (5) as functions of past volatility-related observable variables. For example, as an alternative, we can consider:

$$\begin{aligned} \sigma _{i,t} =\omega + \alpha _1 rv_{i-1,t} + \cdots + \alpha _q rv_{i-q,t}, \quad \text { with } \quad t=1,\ldots ,T, \ i=1,\ldots ,N_t, \end{aligned}$$

with \(rv_{i,t}\) the daily realized volatility.

A similar framework can be adopted to calculate the ES, following, again, the same parametric, non-parametric and semi-parametric approaches as before. The parametric models with Gaussian error distribution calculate the ES through:

$$\begin{aligned} \text {ES}_{i,t}(\tau )= -h_{i,t}^{1/2} \frac{\phi (\Phi ^{-1}(\tau ))}{\tau } \,, \end{aligned}$$
(10)

where \(h_{i,t}\) is the conditional variance, \(\phi (\cdot )\) and \(\Phi ^{-1}(\tau )\) are the probability density function (PDF) and quantile function of the standard Gaussian distribution, respectively. The parametric models with Student’s t error distribution calculate the ES via:

$$\begin{aligned} \text {ES}_{i,t}(\tau )= -h_{i,t}^{1/2} \left( \frac{g_{\nu }(G_{\nu }^{-1}(\tau ))}{\tau } \right) \left( \frac{\nu + (G_{\nu }^{-1}(\tau ))^{2}}{\nu -1} \right) \sqrt{\frac{\nu -2}{\nu }} \,, \end{aligned}$$
(11)

where \(g_{\nu }\) and \(G_{\nu }^{-1}(\tau )\) are the PDF and quantile function of the Student’s t with \(\nu \) degrees of freedom, respectively.
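As a concrete illustration, the closed-form expressions in Eqs. (10) and (11) translate directly into code; the following Python sketch (scipy assumed) is illustrative and not the paper's implementation.

```python
import numpy as np
from scipy.stats import norm, t

def es_gaussian(h, tau):
    """Eq. (10): ES with Gaussian errors; h is the conditional variance."""
    return -np.sqrt(h) * norm.pdf(norm.ppf(tau)) / tau

def es_student_t(h, tau, nu):
    """Eq. (11): ES with standardized Student's t errors, nu degrees of freedom."""
    q = t.ppf(tau, df=nu)
    return (-np.sqrt(h) * (t.pdf(q, df=nu) / tau)
            * ((nu + q ** 2) / (nu - 1)) * np.sqrt((nu - 2) / nu))

# Example: tau = 0.05 with a conditional variance of 0.0004 (2% daily volatility).
print(es_gaussian(0.0004, 0.05), es_student_t(0.0004, 0.05, nu=7))
```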

The HS calculates the ES as follows:

$$\begin{aligned} {ES}_{i,t}(\tau )= \frac{\sum _{j=1}^w r_{i-j,t}\, \mathbbm {1}_{(r_{i-j,t} \le {VaR}_{i,t}(\tau ))}}{\sum _{j=1}^w \mathbbm {1}_{(r_{i-j,t} \le {VaR}_{i,t}(\tau ))}}, \end{aligned}$$
(12)

where \({VaR}_{i,t}(\tau )\) is the VaR obtained through Eq. (8).
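A minimal sketch of the HS calculations in Eqs. (8) and (12), assuming a NumPy array containing the w most recent returns; names and toy data are illustrative.

```python
import numpy as np

def hs_var_es(window_returns, tau):
    """Historical Simulation: VaR as the empirical tau-quantile of the window
    (Eq. 8) and ES as the average of the returns at or below the VaR (Eq. 12)."""
    var = np.quantile(window_returns, tau)
    tail = window_returns[window_returns <= var]
    es = tail.mean() if tail.size > 0 else var
    return var, es

# Toy usage on a 500-day window of simulated returns.
rng = np.random.default_rng(1)
window = 0.01 * rng.standard_t(df=5, size=500)
var_5, es_5 = hs_var_es(window, tau=0.05)
```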

Following Taylor (2019), the quantile regression framework allows the VaR and ES to be estimated jointly by maximizing the likelihood based on the following Asymmetric Laplace density (ALD):

$$\begin{aligned} f(r_{i,t} \mid VaR_{i,t}(\tau ), \tau ) = \frac{\tau -1}{ES_{i,t}(\tau )} \exp \left( \frac{\left( r_{i,t} - {VaR}_{i,t}(\tau )\right) \left( \tau - \mathbbm {1}_{(r_{i,t} \le {VaR}_{i,t}(\tau ))}\right) }{\tau ES_{i,t}(\tau )} \right) , \end{aligned}$$
(13)

where the ES in (13) is calculated as:

$$\begin{aligned} ES_{i,t}(\tau ) = \left( 1 + \exp \left( \gamma _{\text {ES}}\right) \right) VaR_{i,t}(\tau ). \end{aligned}$$
(14)
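For illustration, the following sketch writes down the negative log-likelihood implied by Eqs. (13) and (14), to be minimized with respect to \(\gamma _{\text {ES}}\) jointly with the VaR parameters; it assumes left-tail forecasts (negative VaR), and the function and argument names are illustrative.

```python
import numpy as np

def negative_al_loglik(r, var, gamma_es, tau):
    """Negative log-likelihood of the AL density in Eq. (13), with the ES linked
    to the VaR through Eq. (14): ES = (1 + exp(gamma_es)) * VaR.
    r and var are arrays of returns and (negative) VaR forecasts."""
    es = (1.0 + np.exp(gamma_es)) * var            # Eq. (14): |ES| exceeds |VaR|
    hit = (r <= var).astype(float)                 # 1{r <= VaR}
    loglik = (np.log((tau - 1.0) / es)
              + (r - var) * (tau - hit) / (tau * es))
    return -np.sum(loglik)
```

Minimizing this objective over \(\gamma _{\text {ES}}\) and the parameters driving the VaR forecasts mirrors the joint estimation strategy of Taylor (2019) described above.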

We now move to the introduction of our MIDAS extension of the model in (5) in a quantile regression framework, taking advantage of the well-known predictive power of low-frequency variables for volatility observed at the daily frequency (e.g. Conrad & Kleen, 2020). We also add an “–X” term to the proposed specification. This additional high-frequency variable could be a lagged realized measure of volatility (see also Gerlach & Wang, 2020, within a CAViaR context), so as to add the informational content of a more accurate measure to the volatility dynamics, or a volatility index, like the VIX; it can even accommodate asymmetric effects associated with negative returns.

3 The MF-QR-X model

3.1 Model specification and properties

In order to take advantage of the information coming from variables observed at a different frequency, we introduce a low-frequency component in model (5). This low-frequency term is a one-sided filter of K lagged realizations of a given low-frequency variable \(MV_t\), obtained through a weighting function \(\delta (\omega )\), where \(\omega =(\omega _1, \omega _2)\). The resulting Mixed-Frequency Quantile Regression (MF-QR) model becomes:

$$\begin{aligned} r_{i,t}= & {} \left[ \left( \beta _0 + \theta \Big |\sum _{k=1}^K \delta _k(\omega )MV_{t-k}\Big |\right) + \left( \beta _1 |r_{i-1,t}| + \cdots + \beta _q |r_{i-q,t}| \right) \right] z_{i,t} \end{aligned}$$
(15)
$$\begin{aligned}\equiv & {} \left[ (\beta _0 + \theta |WS_{t-1}|) + (\beta _1 |r_{i-1,t}| + \cdots + \beta _q |r_{i-q,t}| ) \right] z_{i,t}, \end{aligned}$$
(16)

where the parameter \(\theta \) measures the impact of the weighted sum of the K past realizations of \(MV_t\), observed at each period t, that is, \(WS_{t-1}= \sum _{k=1}^K \delta _k(\omega )MV_{t-k}\). The weight attached to each lagged realization of \(MV_t\) is governed by \(\delta (\omega )\), which can be specified as a Beta or exponential Almon lag function (see, for instance, Ghysels & Qian, 2019). Here we use the former, that is:

$$\begin{aligned} \delta _k(\omega )=\frac{(k/K)^{\omega _1-1} (1-k/K)^{\omega _2-1}}{\sum _{j=1}^K (j/K)^{\omega _1-1}(1-j/K)^{\omega _2-1}}. \end{aligned}$$
(17)

Equation (17) is a rather flexible function able to accommodate various weighting schemes. Here we follow the literature and give a larger weight to the most recent observations, that is, we set \(\omega _1=1\) and \(\omega _2 \ge 1\). The resulting weights \(\delta _k(\omega )\) lie between zero and one and sum to one, so that \( \sum _{k=1}^K \delta _k(\omega )MV_{t-k}\) is a convex combination of \(\left( MV_{t-1},\ldots ,MV_{t-K}\right) \).
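A minimal sketch of the Beta weighting scheme in Eq. (17) and of the resulting weighted sum \(WS_{t-1}\) (Python, numpy assumed; names are illustrative):

```python
import numpy as np

def beta_weights(K, omega1=1.0, omega2=5.0):
    """Beta lag weights of Eq. (17): with omega1 = 1 and omega2 >= 1 the weights
    decay monotonically, putting more mass on the most recent lags; they lie in
    [0, 1] and sum to one."""
    k = np.arange(1, K + 1)
    w = (k / K) ** (omega1 - 1.0) * (1.0 - k / K) ** (omega2 - 1.0)
    return w / w.sum()

def weighted_sum(mv_lags, omega2, K):
    """WS_{t-1} = sum_k delta_k(omega) MV_{t-k}; mv_lags = (MV_{t-1},...,MV_{t-K})."""
    return beta_weights(K, 1.0, omega2) @ np.asarray(mv_lags)
```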

In order to refine the VaR dynamics in our model, we include a predetermined variable \(X_{i,t}\), so as to explore the empirical merits of an extended specification already present in the GARCH and MEM literature (Han & Kristensen, 2015; Engle & Gallo, 2006). Such a variable may be the realized volatility of the asset or a market volatility index (see the use of the VIX in Amendola et al., 2021, among others). The resulting eXtended Mixed-Frequency Quantile Regression model, labelled MF-QR-X, becomes:

$$\begin{aligned} r_{i,t} = \left[ (\beta _0 + \theta |WS_{t-1}|) + (\beta _1 |r_{i-1,t}| + \cdots + \beta _q |r_{i-q,t}| + \beta _X |X_{i-1,t}| ) \right] z_{i,t}. \end{aligned}$$
(18)

In either Eq. (16) or Eq. (18), the first component (including the constant) depends only on the low-frequency term (changing at every t, according to the term \(WS_{t-1}\)), while the second comprises variables changing daily (i.e., every it) and includes the lagged returns and the high-frequency term. In such a representation, the two components enter additively, in the spirit of the component model of Engle and Lee (1999):

$$\begin{aligned} r_{i,t} = \left[ \sigma _t^{LF}+\sigma _{i,t}^{HF}\right] z_{i,t}, \end{aligned}$$
(19)

which, for the MF-QR-X model, becomes

$$\begin{aligned} r_{i,t}= & {} \left[ \underbrace{(\beta _0 + \theta |WS_{t-1}|)}_{\sigma _t^{LF}} + \underbrace{(\beta _1 |r_{i-1,t}| + \ldots + \beta _q |r_{i-q,t}| + \beta _X |X_{i-1,t}| )}_{\sigma _{i,t}^{HF}} \right] z_{i,t}. \end{aligned}$$
(20)

In the following theorem we show that, under mild conditions, the process in (20) is weakly stationary:

Theorem 1

Let \(MV_{t}\) and \(X_{i,t}\) be weakly stationary processes. Assume that \(\beta _0 >0\), \(\beta _1,\ldots ,\beta _q,\beta _X \ge 0\) and \(\theta \ge 0\). Let \(z^*\equiv \left( E|z_{i,t}|^{p} \right) ^{1/p}<\infty \), for \(p\in \left\{ 1,2 \right\} \), and assume that the polynomial

$$\begin{aligned} \phi (\lambda )=z^* \left( \beta _1 \lambda ^{q+1} + \beta _2 \lambda ^{q} + \cdots + \beta _q \lambda ^{2}\right) - \lambda ^{q+2} \end{aligned}$$
(21)

has all roots \(\lambda \) inside the unit circle. Then the process \(r_{i,t}\) in (20) is weakly stationary.

Proof: see “Appendix A”.

3.2 Inference on the MF-QR-X Model

In order to make inference on the MF-QR-X model, we need to solve Eq. (4) where

$$\begin{aligned} {x}_{i-1,t}= & {} \left( 1,|WS_{t-1}|,|r_{i-1,t}|,\ldots ,|r_{i-q,t}|,|X_{i-1,t}|\right) ' \end{aligned}$$
(22)
$$\begin{aligned} {\Theta }_\tau= & {} \left( \beta _{0,\tau },\theta _{\tau }, \beta _{1,\tau }, \ldots , \beta _ {q,\tau }, \beta _{X,\tau } \right) . \end{aligned}$$
(23)

The estimation of the vector \({\Theta }_\tau \) is encumbered by the fact that the mixed-frequency term \(WS_{t-1}\) is not observable, as it depends on the unknown parameter \(\omega _2\) of the weighting function \(\delta _k(\omega )\), which also has to be estimated. To make estimation feasible, we resort to the expedient of profiling out the parameter \(\omega _2\) through a two-step procedure: we first fix \(\omega _2\) at an initial arbitrary value, say \(\omega _2^{(b)}\), which turns the vector \(x_{i-1,t}\) into a completely observable counterpart, in short \({x}_{i-1,t}^{(b)}\). This gives a solution to the minimization of the loss function, which depends on \(\omega _2^{(b)}\), that is,

$$\begin{aligned} \widehat{\Theta }_\tau (\omega _2^{(b)}) \equiv \widehat{\Theta }^{(b)}_\tau = {\mathop {\mathrm{arg\,min}}\limits _{\Theta _\tau }} \sum \rho _{\tau } \left( r_{i,t} - \left( {x}_{i-1,t}^{(b)}\right) ' {\Theta }_\tau \right) . \end{aligned}$$
(24)

This procedure is repeated over a grid of B values for \(\omega _2\), so that we have \(\left\{ \widehat{\Theta }^{(b)}_\tau \right\} _{b=1}^B\), and the chosen overall estimator is \(\left( \hat{\omega }_2^*,\widehat{\Theta }^{(*)}_\tau \right) \), corresponding to the smallest overall value of the loss function.
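The profiling procedure can be sketched as follows, reusing the `beta_weights`, `check_loss` and `fit_quantile_regression` helpers sketched earlier; the data arguments (a matrix of the K lagged monthly \(MV\) values aligned to each day, a matrix of the q lagged absolute returns, and the lagged daily X variable) and the grid for \(\omega _2\) are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def profile_omega2(r, abs_ret_lags, mv_lags, x_lag, tau, omega2_grid, K):
    """Profile out omega_2 over a grid (Eq. 24): for each candidate omega_2^(b),
    build the observable regressor vector x^(b), solve the quantile regression,
    and keep the (omega_2, Theta_tau) pair with the smallest check loss."""
    best = None
    for omega2 in omega2_grid:
        ws = np.abs(mv_lags @ beta_weights(K, 1.0, omega2))     # |WS_{t-1}| for each day
        X = np.column_stack([np.ones(len(r)), ws, abs_ret_lags, np.abs(x_lag)])
        theta = fit_quantile_regression(r, X, tau)              # step one, Eq. (24)
        loss = np.sum(check_loss(r - X @ theta, tau))
        if best is None or loss < best[0]:
            best = (loss, omega2, theta)                        # step two: keep the minimum
    return best[1], best[2]                                     # (omega_2*, Theta_tau*)

# An illustrative B-point grid for omega_2.
omega2_grid = np.linspace(1.001, 20.0, 50)
```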

Accordingly, the MF-QR-X estimator of the VaR is

$$\begin{aligned} \widehat{Q}_{r_{i,t}}\left( \tau |\mathcal {F}_{i-1,t}\right) = \left( {x}_{i-1,t}^{(*)}\right) ' \widehat{\Theta }^{(*)}_\tau . \end{aligned}$$
(25)

Summarizing, the proposed MF-QR-X is a flexible VaR model that requires no distributional assumption on the error term and accommodates both low-frequency and high-frequency additional variables. In Sect. 6, we will elaborate on its capability to jointly estimate the VaR and ES, adopting the approach proposed by Taylor (2019).

To obtain reliable VaR and ES estimates from our model (25), an important issue is the choice of the optimal number of lags q of the daily absolute returns in Eq. (5). To that end, we select the lag order suggested by a sequential likelihood-ratio (LR) test on individual lagged coefficients (see also Koenker & Machado, 1999). In particular, for a given \(\tau \), at each step j of the testing sequence over a range of J values, we compare an unrestricted model with j lags (labelled U, with associated loss function \({V}_{U,\tau }^{(j)}\)) against a restricted model with \(j-1\) lags (labelled R, with associated loss function \({V}_{R,\tau }^{(j-1)}\)). In this setup, the null hypothesis of interest is

$$\begin{aligned} H_0: \beta _j=0, \end{aligned}$$
(26)

i.e., the coefficient on the most remote lag is zero. The procedure starts by contrasting a lag-1 model against a model with just a constant, then a lag-2 model against a lag-1 model, and so on.

For a given \(\tau \), at each step j, we calculate the test statistic

$$\begin{aligned} LR_{\tau }^{(j)}= \frac{2 \left( {V}_{R,\tau }^{(j-1)}-{V}_{U,\tau }^{(j)} \right) }{\tau \left( 1-\tau \right) s(\tau )}, \end{aligned}$$
(27)

where \(s(\tau )\) is the so-called sparsity function, estimated according to Siddiqui (1960) and Koenker and Zhao (1996). Under the adopted configuration, \(LR_\tau ^{(j)}\) is asymptotically distributed as a \(\chi _1^{2}\), so that we select q as the last value of j in the sequence for which the null hypothesis is rejected.
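The test statistic in Eq. (27) can be sketched as follows; the Siddiqui-type difference-quotient estimate of the sparsity, the residuals it is applied to and the bandwidth choice are assumptions of this sketch rather than the paper's exact implementation.

```python
import numpy as np

def siddiqui_sparsity(resid, tau, bandwidth):
    """Difference-quotient (Siddiqui-type) estimate of the sparsity s(tau), based
    on empirical quantiles of the quantile regression residuals."""
    lo = np.quantile(resid, max(tau - bandwidth, 1e-4))
    hi = np.quantile(resid, min(tau + bandwidth, 1.0 - 1e-4))
    return (hi - lo) / (2.0 * bandwidth)

def lr_statistic(loss_restricted, loss_unrestricted, tau, sparsity):
    """Eq. (27): LR = 2 (V_R - V_U) / (tau (1 - tau) s(tau)), compared with chi2(1)."""
    return 2.0 * (loss_restricted - loss_unrestricted) / (tau * (1.0 - tau) * sparsity)
```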

Table 1 Percentage of rejection of the LR test for the null \(\beta _j=0\)

4 Monte Carlo simulation

The finite sample properties of the sequential test and of the estimator of the MF-QR model can be investigated by means of a Monte Carlo experiment. In what follows, we consider \(R=5000\) replications of the data generating process (DGP):

$$\begin{aligned} r_{i,t}=\left( \beta _0 + \theta |WS_{t-1}| + \beta _1|r_{i-1,t}| +\beta _2|r_{i-2,t}| +\beta _3|r_{i-3,t}| + \beta _4|r_{i-4,t}| \right) z_{i,t}, \end{aligned}$$

where we assume a \(\mathcal {N}(0,1)\) distribution for \(z_{i,t}\) and set the relevant initial values of \(r_{i,t}\) to zero. Moreover, the stationary variable \(MV_t\) entering the weighted sum \(WS_{t-1}\) is drawn from an AR(1) process \(MV_t = \varphi MV_{t-1} + e_t\), with \(\varphi =0.7\) and the error term \(e_t\) following a skewed t-distribution (Hansen, 1994), with degrees of freedom \(df=7\) and skewing parameter \(sp=-6\). The frequency of \(MV_{t}\) is monthly and \(K=24\). The values of the parameters (collected in a vector \(\Theta \)) are detailed in the first column of Tables 2, 3 and 4. For the simulation exercise we consider \(N=1250\), \(N=2500\) and \(N=5000\) observations, to mimic realistic daily samples. Having fixed \(K=24\) (that is, two years of monthly data), the number of daily observations should be large enough to allow for model estimation; in our case, we set this lower limit to 1250 daily observations. Finally, three different VaR coverage levels \(\tau \) are chosen: 0.01, 0.05, and 0.10.
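The DGP can be sketched as follows (reusing the `beta_weights` helper of Sect. 3); as a simplification, the sketch replaces Hansen's skewed t innovations for \(MV_t\) with a plain Student's t, and the parameter values shown are illustrative rather than those of Tables 2, 3 and 4.

```python
import numpy as np

def simulate_mf_qr(beta, theta, omega2, K=24, T=120, days_per_month=21, seed=0):
    """Simulate the fourth-order MF-QR DGP of Sect. 4 with Gaussian z_{i,t} and an
    AR(1) low-frequency variable MV_t (phi = 0.7). For simplicity, MV innovations
    are plain Student's t with 7 d.o.f. instead of Hansen's skewed t."""
    rng = np.random.default_rng(seed)
    q = len(beta) - 1                                  # beta = (beta_0, beta_1, ..., beta_q)
    mv = np.zeros(T + K)                               # monthly low-frequency variable
    for t in range(1, T + K):
        mv[t] = 0.7 * mv[t - 1] + rng.standard_t(df=7)
    w = beta_weights(K, 1.0, omega2)                   # Beta lag weights, Eq. (17)
    r, r_lags = [], np.zeros(q)                        # initial return values set to zero
    for t in range(K, K + T):                          # months with a full MV history
        ws = np.abs(w @ mv[t - K:t][::-1])             # |WS_{t-1}| from MV_{t-1},...,MV_{t-K}
        for _ in range(days_per_month):
            sigma = beta[0] + theta * ws + beta[1:] @ np.abs(r_lags)
            r_new = sigma * rng.standard_normal()
            r_lags = np.r_[r_new, r_lags[:-1]]
            r.append(r_new)
    return np.array(r)

# Roughly N = 2500 daily observations (120 months of 21 days); parameter values illustrative.
r_sim = simulate_mf_qr(beta=np.array([0.1, 0.2, 0.15, 0.1, 0.05]), theta=0.3, omega2=3.0)
```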

In the Monte Carlo experiment, we start by evaluating the features of the LR test for lag selection in Eq. (27). To that end, we test \(H_0:\beta _j=0\) sequentially over J steps at a significance level \(\alpha \). Since the DGP is a fourth-order process, we expect a high rejection rate when the null involves a zero restriction on the coefficients \(\beta _j, \ j=1,\ldots ,4\). In order to confirm the expected low rejection rate beyond the fourth lag, we extend the testing sequence to further \(\beta _j\)'s, up to \(J=6\).

Table 2 Monte Carlo estimates, \(\tau =0.01\)
Table 3 Monte Carlo estimates, \(\tau =0.05\)

Looking at Table 1, where we report the percentage of rejections across replications for the VaR coverage levels \(\tau =0.01, 0.05, 0.1\) at the nominal significance level \(\alpha =5\%\), we validate the good behavior of the test. Overall, the sequential test procedure satisfactorily identifies the number of lags to be included in the MF-QR model, with the performance improving with the number of observations, especially for \(H_0:\beta _4=0\); for the latter case, the percentage of rejections of the null increases considerably across coverage levels when \(N=5000\).

Turning to the small sample properties of our estimator, the evaluation is done in terms of the original coefficients in the DGP, collected in the vector \(\Theta = \left( \beta _0,\theta ,\beta _1,\ldots ,\beta _q\right) \), using the relationship with the quantile regression parameters \(\Theta _\tau \), i.e., \(\Theta =\Theta _\tau / F^{-1}(\tau )\). In Tables 2, 3 and 4 we report the Monte Carlo averages of the parameter estimates (\(\hat{\Theta }\)) across replications for the three levels of \(\tau \), together with the estimated Mean Squared Errors (MSE) relative to the true values.

Overall, the proposed model shows good finite sample properties: independently of the chosen \(\tau \) level, the estimates appear slightly biased in small samples, although, reassuringly, the MSE of the estimates relative to the true values always decreases as the sample size increases.

Table 4 Monte Carlo estimates, \(\tau =0.1\)

5 Model evaluation

In order to evaluate the quality of the tail risk estimates, we resort to a set of tests suited to the needs of risk management. Backtesting procedures are the most popular tools for evaluating the performance of risk measures (see the reviews of Campbell, 2006, Nieto & Ruiz, 2016, among others). For our model we use the Actual over Expected (AE) exceedance ratio and five tests in this class: the Unconditional Coverage (UC, Kupiec, 1995), Conditional Coverage (CC, Christoffersen, 1998) and Dynamic Quantile (DQ, Engle & Manganelli, 2004) tests for the VaR, and the UC and CC tests for the ES (Acerbi & Szekely, 2014).

The AE exceedance ratio is the number of observed VaR violations over the number of expected violations: the closer the ratio is to one, the better the model forecasts the VaR. The UC test is an LR-based test whose null hypothesis is that the actual frequency of VaR violations equals the chosen \(\tau \) level. Formally, the null hypothesis of the UC test is

$$\begin{aligned} H_0: \pi = \tau , \end{aligned}$$

where \(\pi =\mathbb {E}[L_{i,t}(\tau )]\), with \(L_{i,t}(\tau )=\mathbbm {1}_{\left( r_{i,t}<VaR_{i,t}(\tau )\right) }\) representing the series of VaR violations. The UC test statistic is asymptotically \(\chi ^2\) distributed, with one degree of freedom, assuming independence of the \(L_{i,t}(\tau )\) series.

Another critical aspect to test is the independence of the VaR violations over time: the main idea is to discard models whose VaR forecasts are violated on subsequent days. Moreover, if the violations are not independent, the asymptotic distribution of the UC test may fail to hold. The independence test used in this context is that of Christoffersen (1998), where the null hypothesis is independence of \(L_{i,t}(\tau )\), while the alternative is that \(L_{i,t}(\tau )\) follows a first-order Markov chain. Under \(H_0\), the LR-based test is asymptotically \(\chi ^2\) distributed, with one degree of freedom.

An overall assessment of the VaR measures is given by the CC test, which tests the null hypotheses of the UC and independence tests jointly (the test statistic is asymptotically \(\chi ^2\) distributed, with two degrees of freedom).
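Minimal sketches of the UC and CC test statistics (Python, scipy assumed), ignoring degenerate cases such as samples with no violations:

```python
import numpy as np
from scipy.stats import chi2

def kupiec_uc(hits, tau):
    """Kupiec (1995) unconditional coverage LR test on a 0/1 violation series."""
    n, x = len(hits), int(np.sum(hits))
    pi_hat = x / n
    log_l0 = (n - x) * np.log(1 - tau) + x * np.log(tau)
    log_l1 = (n - x) * np.log(1 - pi_hat) + x * np.log(pi_hat)
    lr = -2.0 * (log_l0 - log_l1)
    return lr, chi2.sf(lr, df=1)

def christoffersen_cc(hits, tau):
    """Christoffersen (1998) conditional coverage test: Kupiec UC plus first-order
    Markov independence, compared with a chi2(2)."""
    h = np.asarray(hits, dtype=int)
    n00 = np.sum((h[:-1] == 0) & (h[1:] == 0)); n01 = np.sum((h[:-1] == 0) & (h[1:] == 1))
    n10 = np.sum((h[:-1] == 1) & (h[1:] == 0)); n11 = np.sum((h[:-1] == 1) & (h[1:] == 1))
    pi = (n01 + n11) / (n00 + n01 + n10 + n11)
    pi01, pi11 = n01 / (n00 + n01), n11 / (n10 + n11)
    log_l0 = (n00 + n10) * np.log(1 - pi) + (n01 + n11) * np.log(pi)
    log_l1 = (n00 * np.log(1 - pi01) + n01 * np.log(pi01)
              + n10 * np.log(1 - pi11) + n11 * np.log(pi11))
    lr_cc = kupiec_uc(h, tau)[0] - 2.0 * (log_l0 - log_l1)
    return lr_cc, chi2.sf(lr_cc, df=2)
```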

Like the CC test, the DQ test jointly addresses the independence of the VaR violations and the correctness of their number, but it was shown (Berkowitz et al., 2011) to have more power. In particular, the DQ test consists of running a linear regression where the dependent variable is the sequence of VaR violations and the covariates are past violations and possibly other explanatory variables. More in detail, let \(Hit_{i,t}(\tau )=L_{i,t}(\tau )-\tau \) be the so-called hit variable. Under correct specification, this series should have zero mean, be serially uncorrelated and, moreover, be uncorrelated with any other past observed variable. The DQ test can be carried out via the following OLS regression:

$$\begin{aligned} Hit_{i,t}(\tau )=\beta _0+\sum _{k=1}^{K_1} \beta _k \, Hit_{i-k,t}(\tau ) + \sum _{k=1}^{K_2} \gamma _k \, Z_{i-k,t}(\tau ) + u_{i,t}, \end{aligned}$$
(28)

where \(u_{i,t}\) is the error term and \(Z_{i,t}(\tau )\)’s include potentially relevant variables belonging to the available information set, like, for instance, previous Hits, lagged VaR or past returns. In matrix notation, the OLS regression in (28) becomes:

$$\begin{aligned} {\varvec{Hit}}={\varvec{Z}}{\varvec{\psi }} + {\varvec{u}}, \end{aligned}$$
(29)

where the vector \({\varvec{Hit}}\) has dimension N (with N indicating the total number of observations), the matrix of predictors \({\varvec{Z}}\) has dimension \(N \times (K_1+K_2+1)\), the vector \({\varvec{\psi }}=\left( \beta _0,\beta _1,\ldots ,\beta _{K_1},\gamma _1,\ldots ,\gamma _{K_2} \right) \) has dimension \((K_1+K_2+1)\), and the error vector \({\varvec{u}}\) has dimension N. Under correct specification we test the null \({\varvec{\psi }}={\varvec{0}}\) with a test statistic:

$$\begin{aligned} DQ_{CC}=\frac{\hat{{\varvec{\psi }}}^{'}{\varvec{Z}}^{'}{\varvec{Z}}\hat{{\varvec{\psi }}}}{\tau (1-\tau )}\overset{d}{\rightarrow }\chi _{K_1+K_2+1}^2, \end{aligned}$$

where \(\hat{{\varvec{\psi }}}\) is the estimated vector of coefficients obtained from the OLS regression in (29).
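A sketch of the DQ statistic, using \(K_1=4\) lags of the hit variable and the lagged VaR forecast as the single Z variable (\(K_2=1\)); this particular configuration is an illustrative assumption.

```python
import numpy as np
from scipy.stats import chi2

def dq_test(hits, var_forecasts, tau, K1=4):
    """Dynamic Quantile test (Eqs. 28-29): regress the demeaned hit series on a
    constant, K1 of its own lags and the lagged VaR forecast, then test psi = 0."""
    hit = np.asarray(hits, dtype=float) - tau            # Hit_{i,t} = L_{i,t}(tau) - tau
    n = len(hit)
    rows = [np.r_[1.0, hit[s - K1:s][::-1], var_forecasts[s - 1]] for s in range(K1, n)]
    Z = np.array(rows)                                   # constant, hit lags, lagged VaR
    y = hit[K1:]
    psi_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)      # OLS estimate of psi
    stat = psi_hat @ Z.T @ Z @ psi_hat / (tau * (1.0 - tau))
    return stat, chi2.sf(stat, df=Z.shape[1])            # chi2 with K1 + K2 + 1 d.o.f.
```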

For the expected shortfall ES, the UC test of Acerbi and Szekely (2014) is based on the following statistic:

$$\begin{aligned} Z_{UC}=\frac{1}{N(1-\tau )}\sum _{i=1}^{N_t}\sum _{t=1}^{T} \frac{r_{i,t}L_{i,t}(\tau )}{ES_{i,t}(\tau )}+1. \end{aligned}$$
(30)

If the distributional assumptions are correct, the expected value of \(Z_{UC}\) is zero, that is \(\mathbb {E}\left( Z_{UC}\right) =0\). The CC test of Acerbi and Szekely (2014) has the following statistic:

$$\begin{aligned} Z_{CC}=\frac{1}{NumFail}\sum _{i=1}^{N_t}\sum _{t=1}^{T} \frac{r_{i,t}L_{i,t}(\tau )}{ES_{i,t}(\tau )}+1, \end{aligned}$$
(31)

where \(NumFail=\sum _{i=1}^{N_t}\sum _{t=1}^{T}L_{i,t}(\tau )\). If the distributional assumptions are correct, the expected value of \(Z_{CC}\), given that there is at least one VaR violation, is zero, i.e. \(\mathbb {E}\left( Z_{CC}|NumFail>0\right) =0\). The UC and CC tests are one-sided and reject the null when the model underestimates the risk (significantly negative test statistic).
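The two statistics, as written in Eqs. (30) and (31), can be sketched as follows; the simulation of the critical values on which the backtesting decision relies is not shown.

```python
import numpy as np

def acerbi_szekely(returns, var_forecasts, es_forecasts, tau):
    """UC and CC statistics as written in Eqs. (30)-(31); significantly negative
    values signal that the model underestimates risk."""
    r = np.asarray(returns)
    hits = (r < np.asarray(var_forecasts)).astype(float)             # L_{i,t}(tau)
    ratio = r * hits / np.asarray(es_forecasts)
    n, num_fail = len(r), hits.sum()
    z_uc = ratio.sum() / (n * (1.0 - tau)) + 1.0                     # Eq. (30)
    z_cc = ratio.sum() / num_fail + 1.0 if num_fail > 0 else np.nan  # Eq. (31)
    return z_uc, z_cc
```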

6 Empirical analysis

In this section, we apply the MF-QR-X model to estimate the VaR and ES of the daily log-returns of two energy commodities: the WTI Crude Oil and the RBOB Gasoline futures. The low-frequency variable is the monthly GPR index, which enters our mixed-frequency models as its first difference divided by its one-period lagged value. The “–X” variable is the VIX index. The period of investigation covers more than twelve years of daily data, from January 2010 to July 2022, split between an in-sample period (from January 2010 to December 2016) and an out-of-sample period (from January 2017 to July 2022). The data are summarized in Table 5 and plotted in Fig. 1.

Table 5 Summary statistics
Fig. 1 Crude Oil, Gasoline, VIX and GPR

We compare the estimated VaR and ES with those of several well-known competing specifications belonging to the parametric class (GARCH, GJR (Glosten et al., 1993) and GARCH-MIDAS, with Gaussian and Student's t error distributions), the non-parametric class (HS) and the semi-parametric class (the Symmetric Absolute Value (SAV), Asymmetric Slope (AS) and Indirect GARCH (IG) specifications of the CAViaR of Engle and Manganelli (2004)). As for the mixed-frequency specifications, the same low-frequency variable (the GPR index) enters the GARCH-MIDAS specifications as well as our proposed MF-QR and MF-QR-X models. The functional forms of all these models are reported in Table 6.

Table 6 Model specifications

6.1 In-sample analysis

Table 7 reports the p-values of the LR test (Eq. (27)) with \(\tau =0.05\), for the period from 2010 to 2016 and for the two commodities under investigation: they suggest the inclusion of six and five lagged daily log-returns in the models for the Crude Oil and Gasoline futures, respectively.

Table 7 LR test, p-values of the null \(\beta _j=0\)

As regards the number of lagged realizations entering the low-frequency component, we choose \(K=36\) for all mixed-frequency models. The in-sample parameter estimates for the parametric models (with Quasi Maximum Likelihood standard errors, cf. Bollerslev & Wooldridge, 1992) and the semi-parametric models (with bootstrap-based standard errors, as done also by Xu et al., 2021) are reported in Tables 8 (Crude Oil) and 9 (Gasoline). The algorithm used to obtain the bootstrap standard errors is sketched in “Appendix B”. Note that, for the proposed MF-QR-X model, the low-frequency parameters as well as the parameters associated with the “–X” variable are generally significant.

Table 8 In-sample estimates for Crude Oil
Table 9 In-sample estimates for Gasoline

The in-sample backtesting evaluations are reported in Tables 10 (Crude Oil) and 11 (Gasoline). All models pass the chosen backtesting procedures (p-values in columns 3–7), with a strong preference for the longer windows in the case of the non-parametric HS model.

Table 10 In-sample backtesting for Crude Oil
Table 11 In-sample backtesting for Gasoline

6.2 Out-of-sample evaluation

We complete the empirical analysis with an out-of-sample evaluation. In line with Lazar and Xue (2020), the one-step-ahead VaR and ES forecasts of the parametric and semi-parametric models are obtained with parameters re-estimated every five days over a rolling window of 1500 observations. For our main MF-QR-X model, the VaR and ES forecasts are plotted in Fig. 2.

Fig. 2 MF-QR-X VaR and ES forecasts. Notes: Plot of the Crude Oil (top) and Gasoline (bottom) daily log-returns (black lines) and of the VaR (red lines) and ES (blue lines) forecasts obtained from the MF-QR-X model. Sample period: from January 2017 to July 2022

The out-of-sample evaluations are synthesized in Tables 12 (Crude Oil) and 13 (Gasoline). The AE ratios closest to one are obtained by the GM-N model for Crude Oil (Table 12) and by the QR model for Gasoline (Table 13); a more formal statistical evaluation of the VaR and ES performance of the different models is, however, given by the backtesting procedures. Contrary to the in-sample period, where almost all the models passed the backtesting procedures, out-of-sample the proposed MF-QR-X is the only model for which the null is never rejected in any of the VaR and ES tests for both the Crude Oil and the Gasoline log-returns (the QR model passes all tests only for the latter); the evidence for the other models is more scattered and less systematic, with GM-t, the short-window HS, SAV, AS and IG consistently failing all the tests. In “Appendix C”, we also report the backtesting evaluations for lower parameter-updating frequencies (every ten and twenty days). As shown in Tables 14, 15, 16 and 17, the results are quite robust to the different updating schemes.

Table 12 Out-of-sample backtesting for Crude Oil
Table 13 Out-of-sample backtesting for Gasoline

7 Concluding remarks

This paper has suggested the inclusion of mixed-frequency (MF) components in a quantile regression (QR) approach to VaR and ES estimation, within a dynamic model of volatility featuring the original introduction of a low-frequency and a high-frequency (“–X”) component: the outcome was labelled the MF-QR-X model. Given its quantile regression nature, no explicit distribution for the returns is needed, and robustness to outliers in the data is guaranteed.

Starting from the assessment of the weak stationarity conditions of our semi-parametric MF-QR-X process, we suggested an estimation procedure whose finite sample performance was investigated through an extensive Monte Carlo exercise. Overall, the estimates have satisfactory properties, and the resulting VaR forecasts are robust to some misspecification of the weighting parameter entering the mixed-frequency component.

Energy commodities (Crude Oil and Gasoline futures) take center stage in the illustration of the empirical performance, both in- and out-of-sample, of the proposed MF-QR-X model, contrasted against several popular parametric, non-parametric and semi-parametric alternatives. The results are encouraging, since ours is the only model consistently passing all the VaR and ES backtesting procedures out-of-sample for the Crude Oil log-returns (together with the QR model for the Gasoline log-returns). The empirical results support the use of MF-QR-X models to exploit the information content of mixed-frequency data in a risk management framework.

Further research may focus on the multivariate extension of the tail risk forecasts, as done by Torres et al. (2015), Di Bernardino et al. (2015), Bernardi et al. (2017), and Petrella and Raponi (2019), among others. Another interesting point would be the investigation of the performance of the MF-QR-X with an asymmetric term, both for what concerns the daily returns and the low-frequency component, as done by Amendola et al. (2019), for instance.