El Niño and forecastability of oil-price realized volatility

We forecast monthly realized volatility (RV) of the oil price based on an extended heterogeneous autoregressive (HAR)-RV model that incorporates the role of the El Niño Southern Oscillation (ENSO), as captured by the Equatorial Southern Oscillation Index (EQSOI). Based on the period covering 1986 January to 2020 December and studying various rolling-estimation windows and forecast horizons, we find that the EQSOI has predictive value for oil-price RV, particularly at forecast horizons from 2 to 4 years and for rolling-estimation windows of length 4 to 6 years. We show that this result holds not only based on standard tests of out-of-sample predictability, but also under an asymmetric loss function.


Introduction
The El Niño Southern Oscillation (ENSO) is an irregularly periodic variation in winds and sea surface temperatures over the tropical eastern Pacific Ocean, affecting the climate of much of the tropics and subtropics (Trenberth et al. 2007). The warming phase of the sea temperature is known as El Niño and the cooling phase as La Niña. The two phases last several months each and typically occur every few years with varying intensity. The ENSO changes the global climate pattern (Martin et al. 2013; Staupe-Delgado et al. 2018; Rojas et al. 2019), which in turn affects demand and supply in the oil market, resulting in movements in the oil price (Changnon 1999; Cruz and Krausmann 2013; Cashin et al. 2017; Qin et al. 2020). Because the ENSO can cause severe natural disasters (for example, droughts, floods, and hurricanes; Cane 2004; Alajo et al. 2006; Miyakawa et al. 2017; Hu and Fedorov 2019), it can also affect the volatility of the oil price, as Demirer et al. (2020a) recently show. As explained earlier by Demirer et al. (2018) in their analysis of the relationship between the oil market and rare disaster risks, disaster risks contribute to jump risk in oil prices, which in turn impacts oil-price volatility, given that jumps account for a large part of the variation in crude oil prices (Gkillas et al. Forthcoming). 1 Against the backdrop of the recent evidence of predictability (based on a k-th order nonparametric causality-in-quantiles test) of the ENSO for oil-price volatility (besides returns) provided by Demirer et al. (2020a), we aim in this paper to extend this line of research by examining whether this evidence also translates into out-of-sample forecasting gains.
This is important from a statistical perspective because in-sample predictability does not guarantee out-of-sample forecasting gains from a specific predictor, and because out-of-sample forecasting tends to provide a more robust test of the appropriateness of an econometric model and the predictor (Campbell 2008). In light of this, we forecast monthly realized volatility (RV) of the West Texas Intermediate (WTI) oil price, using an extended version of the heterogeneous autoregressive (HAR)-RV model of Corsi (2009), which incorporates the role of the ENSO, over the period from 1986 January to 2020 December. Note that measuring volatility using RV, which in our case is captured by the sum of squared returns over a month (following Andersen and Bollerslev 1998), provides an observable and unconditional metric of volatility, which is otherwise a latent process. 2 At the same time, the benchmark HAR-RV model can capture long-memory and multi-scaling properties of oil market volatility (McAleer 2019, 2020), despite having a simple structure. In this regard, the key feature of the HAR-RV model is that it uses volatilities from different time resolutions to forecast the realized oil-price volatility. The model, thereby, captures the main idea motivating the heterogeneous market hypothesis (Müller et al. 1997), which states that different classes of market participants populate the oil market and differ in their sensitivity to information flows at different time horizons. 3 Understanding which factors, in our case the ENSO, help to forecast oil-price volatility accurately also has economic implications and is a pertinent question for both policymakers and investors. This is because the second-moment movements in crude oil can predict recessions (Elder and Serletis 2010; van Eyden et al. 2019; Pierdzioch and Gupta 2020).
Moreover, the oil market's recent financialization has led to increased participation of hedge funds, pension funds, and insurance companies in the market, thus rendering oil a profitable alternative investment in the portfolio decisions of financial institutions (Bahloul et al. 2018; Bonato 2019). Given that volatility, when interpreted as uncertainty, becomes a key input to investment decisions and portfolio choices (Poon and Granger 2003), precise forecasts of oil-price volatility are of vital importance to oil traders.
To the best of our knowledge, this is the first paper that evaluates the out-of-sample forecasting power of the ENSO for oil-price volatility based on a HAR-RV model. Our paper, thus, adds to the already existing large literature on the forecastability of oil-price volatility based on a wide array of models and macroeconomic, financial,
The remainder of the paper is organized as follows: In Section 2, we describe our data. In Section 3, we briefly discuss the HAR-RV model. In Section 4, we present the results from our forecasting experiment. In Section 5, we conclude.
2 Conventionally, the time-varying volatility is modeled, and the fit assessed, using various generalized autoregressive conditional heteroscedastic (GARCH) models, under which the conditional variance is a deterministic function of model parameters and past data. Alternatively, some recent papers have considered stochastic volatility (SV) models, where the volatility is a latent variable that follows a stochastic process. Irrespective of whether we use GARCH or SV models as widely used in the oil-volatility literature (see Chan and Grant (2016) for a detailed review in this regard), the underlying estimate of volatility is not model-free as in the case of RV.
3 For example, oil traders and speculators are very sensitive to short-term investment horizons, whereas oil investors are more concerned with long-term investment horizons.

Data
As far as the crude oil price is concerned, we use the daily data of the nominal West Texas Intermediate (WTI) oil price (as used by Demirer et al. 2020a), which is available from 2nd January, 1986, until 31st December, 2020, at the time of writing this paper, and is derived from the United States (US) Energy Information Administration (EIA). 4 After computing the daily log-returns, we get the monthly realized volatility by taking the sum of daily squared returns over a specific month to cover the period from 1986 January to 2020 December. The upper panel of Fig. 1 shows the realized oil-price volatility.
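The construction of monthly realized volatility described above can be illustrated with a minimal Python sketch. It is not the authors' code (the paper reports using R); the function name, the toy prices, and the convention of assigning each daily return to the month of the day on which it is realized are assumptions made for illustration only.

```python
import math
from collections import defaultdict

def monthly_realized_volatility(dates, prices):
    """Sum of squared daily log-returns within each calendar month.

    dates  : list of (year, month, day) tuples in ascending order
    prices : list of strictly positive daily prices, same length as dates
    """
    rv = defaultdict(float)
    for i in range(1, len(prices)):
        log_return = math.log(prices[i] / prices[i - 1])
        year, month, _ = dates[i]
        # Assumption: a return belongs to the month of the day it is realized.
        rv[(year, month)] += log_return ** 2
    return dict(rv)

# Hypothetical toy prices, not actual WTI data.
dates = [(1986, 1, 2), (1986, 1, 3), (1986, 1, 6), (1986, 2, 3), (1986, 2, 4)]
prices = [25.0, 26.0, 25.5, 24.0, 24.5]
rv = monthly_realized_volatility(dates, prices)
```

The result is a dictionary keyed by (year, month), with one realized-volatility value per month.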
For the metric of the ENSO cycle, again consistent with Demirer et al. (2020a), we use the Southern Oscillation Index (SOI), obtained from the Bureau of Meteorology, Government of Australia. 5 The SOI gives an indication of the development and intensity of El Niño or La Niña events in the Pacific Ocean. The SOI is calculated using the pressure differences between Tahiti and Darwin. Sustained negative values of the SOI below −7 often indicate El Niño episodes, while sustained positive values above +7 often indicate La Niña episodes. Low atmospheric pressure tends to occur over warm water and high pressure occurs over cold water, in part because of deep convection over the warm water. El Niño episodes are defined as sustained warming of the central and eastern tropical Pacific Ocean, and La Niña episodes are defined as sustained cooling of the central and eastern tropical Pacific Ocean, thus resulting in a decrease and an increase in the strength of the Pacific trade winds, respectively.
The reliability of the SOI, however, is considered limited because both Darwin and Tahiti are located well south of the equator, so that the surface air pressure at both locations is less directly related to ENSO. To overcome this issue, a new index named the Equatorial Southern Oscillation Index (EQSOI) was created. 6 To generate the data for this index, two new regions, centered on the equator, were delimited, with the western one located over Indonesia and the eastern one located over the equatorial Pacific, close to the South American coast. The EQSOI is obtained from the Climate Prediction Center (National Weather Service) of the National Oceanic and Atmospheric Administration (US Department of Commerce). 7 The lower panel of Fig. 1 shows the EQSOI data.
In our main analysis, we use the EQSOI as a measure of the ENSO, while the SOI is used as a robustness check. The results for the SOI corroborate the results that we obtain for the EQSOI. 8

The HAR-RV Model
We use a variant of the popular HAR-RV model developed by Corsi (2009) to model the RV of oil-price movements. The benchmark HAR-RV model is given by
$$RV_{t+h} = \beta_0 + \beta_1 RV_t + \beta_2 RV_{q,t} + \beta_3 RV_{y,t} + \epsilon_{t+h}, \tag{1}$$
where $h$ denotes the forecast horizon, $RV_{t+h}$ denotes the average realized volatility as observed during the forecast horizon, $RV_{q,t}$ denotes the quarterly realized volatility (that is, the average from month $t - 3$ to $t - 1$), and $RV_{y,t}$ denotes the yearly realized volatility (the average from month $t - 12$ to $t - 1$). The coefficients to be estimated are $\beta_j$, $j = 0, 1, 2, 3$, and $\epsilon_{t+h}$ denotes a disturbance term.
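As an illustration of how the regressors and the target of the benchmark model could be assembled from a monthly RV series, consider the following minimal Python sketch. The function name `har_design` is hypothetical, and the sketch assumes that the average realized volatility over the forecast horizon is the mean of RV over months t + 1 to t + h; the paper does not spell out this detail.

```python
def har_design(rv, h):
    """Build regressor rows and targets for the benchmark HAR-RV regression.

    rv : list of monthly realized volatilities, indexed by month t
    h  : forecast horizon in months
    Each row is [1, RV_t, RV_q,t, RV_y,t]; the matching target is the
    average of RV over months t+1 .. t+h (an assumption of this sketch).
    """
    rows, targets = [], []
    for t in range(12, len(rv) - h):
        rv_q = sum(rv[t - 3:t]) / 3.0      # average of months t-3 .. t-1
        rv_y = sum(rv[t - 12:t]) / 12.0    # average of months t-12 .. t-1
        target = sum(rv[t + 1:t + h + 1]) / h
        rows.append([1.0, rv[t], rv_q, rv_y])
        targets.append(target)
    return rows, targets

# Hypothetical toy series (1.0, 2.0, ..., 20.0), not actual RV data.
toy_rv = [float(v) for v in range(1, 21)]
rows, targets = har_design(toy_rv, h=2)
```

The first usable month is t = 12 (zero-based), because the yearly component needs twelve preceding observations.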
6 See the discussion of Anthony Barnston of the National Oceanic and Atmospheric Administration here: https://www.climate.gov/newsfeatures/blogs/enso/why-are-there-so-many-enso-indexes-instead-justone for further details.
7 https://www.cpc.ncep.noaa.gov/data/indices/.
8 See the end of the paper (Appendix) for results using the SOI data.
We compare the predictive accuracy of the benchmark HAR-RV model with the predictive accuracy of the following HAR-RV model extended to include EQSOI:
$$RV_{t+h} = \beta_0 + \beta_1 RV_t + \beta_2 RV_{q,t} + \beta_3 RV_{y,t} + \beta_4 EQSOI_t + \epsilon_{t+h}. \tag{2}$$
We are interested in the out-of-sample predictive accuracy of the HAR-RV models given in Eqs. 1 and 2. In order to generate out-of-sample forecasts, we estimate the models using rolling-estimation windows of different lengths, and then use the estimation results to forecast the average realized volatility over the forecast horizon being studied.
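The rolling-window forecasting scheme can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `rolling_forecasts` is hypothetical, estimation is by ordinary least squares via `numpy.linalg.lstsq`, and at each forecast origin the estimation window contains only rows whose h-month-ahead target has already been observed.

```python
import numpy as np

def rolling_forecasts(X, y, window, h):
    """Rolling-window OLS forecasts.

    X      : (n, k) regressors dated t (first column of ones)
    y      : (n,) targets; row i holds the outcome realized h rows later
    window : number of rows in each estimation window
    h      : forecast horizon in rows (months)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    origins, forecasts = [], []
    for i in range(window + h - 1, len(y)):
        last = i - h  # last row whose target is observed at origin i
        est = slice(last - window + 1, last + 1)
        beta, *_ = np.linalg.lstsq(X[est], y[est], rcond=None)
        origins.append(i)
        forecasts.append(float(X[i] @ beta))
    return origins, forecasts

# Hypothetical toy data with an exact linear relation y = 2 + 3x.
x = np.arange(20.0)
X_toy = np.column_stack([np.ones(20), x])
y_toy = 2.0 + 3.0 * x
origins, forecasts = rolling_forecasts(X_toy, y_toy, window=5, h=2)
```

The same function applies to the extended model by adding an EQSOI column to `X`.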

Empirical results
In order to account for a potentially time-varying predictive value of EQSOI for the realized oil-price volatility, we use rolling-estimation windows of different lengths to compute out-of-sample forecasts, where the sample period starts in 1986 January. We use the R language for statistical computing (R Core Team 2019) to carry out our forecasting experiments. Table 1 summarizes basic forecasting statistics. Specifically, we report the root-mean-squared forecasting error (RMSFE) and the mean absolute forecasting error (MAFE) for different forecast horizons (h = 1, 3, 12, 24, 36, 48) and different lengths of the rolling-estimation window (48, 60, 72, 96, and 120 months). We report the forecasting statistics in relative terms, that is, we divide the RMSFE (MAFE) statistic for the baseline HAR-RV model by the RMSFE (MAFE) statistic that we obtain for the HAR-RV model extended to include EQSOI. As a result, a forecasting statistic that exceeds unity indicates a superior forecasting performance of the extended model. The forecasting statistics mainly exceed unity for the longer forecast horizons (h = 24, 36, 48) and the short and intermediate rolling-estimation windows (48 to 72 months), and, thus, including EQSOI in the HAR-RV model gives more accurate forecasts in terms of the RMSFE and MAFE statistics in these cases.
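The relative forecasting statistics can be illustrated with a short Python sketch. The function names and toy numbers are hypothetical. The sketch assumes that the ratio is formed as the benchmark statistic over the extended-model statistic, so that values above unity favor the extended model, in line with the interpretation given in the text.

```python
import math

def rmsfe(actual, forecast):
    """Root-mean-squared forecasting error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mafe(actual, forecast):
    """Mean absolute forecasting error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def relative_stat(stat, actual, benchmark_fc, extended_fc):
    """Benchmark statistic divided by extended-model statistic.

    Assumption: values above 1 mean the extended model is more accurate.
    """
    return stat(actual, benchmark_fc) / stat(actual, extended_fc)

# Hypothetical toy values, not results from the paper.
actual = [1.0, 2.0, 3.0, 4.0]
benchmark_fc = [1.5, 2.5, 2.5, 4.5]      # absolute errors of 0.5
extended_fc = [1.25, 2.25, 2.75, 4.25]   # absolute errors of 0.25
rel_rmsfe = relative_stat(rmsfe, actual, benchmark_fc, extended_fc)
rel_mafe = relative_stat(mafe, actual, benchmark_fc, extended_fc)
```

In this toy example both ratios equal 2, because the extended forecasts have half the error of the benchmark forecasts.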
We report results (p-values, computed using robust standard errors) of the Clark and West (2007) test for an equal mean-squared prediction error (MSPE) in Table 2. We use the classic HAR-RV model as the benchmark model, and the HAR-RV model that features EQSOI as the rival model. The alternative hypothesis is that the rival model has a smaller MSPE than the benchmark model. The test results corroborate the results for the RMSFE and MAFE statistics. The test results are significant for the three shortest rolling-estimation windows and the three longest forecast horizons.
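A minimal Python sketch of the Clark-West statistic is given below. It forms the MSPE-adjusted loss differential and tests whether its mean is positive. As a simplification, it uses a plain (non-robust) standard error and a standard normal reference distribution, whereas the paper reports p-values based on robust standard errors; for multi-step forecasts with overlapping horizons, a robust variance estimator would be the more appropriate choice. The function name and toy data are hypothetical.

```python
import math

def clark_west(actual, benchmark_fc, rival_fc):
    """Return (t-statistic, one-sided p-value) for the Clark-West test.

    Alternative hypothesis: the rival (larger) model has a smaller MSPE.
    Simplification: plain standard error, not a robust one.
    """
    f = []
    for a, b, r in zip(actual, benchmark_fc, rival_fc):
        # Adjusted differential: e_bench^2 - (e_rival^2 - (bench_fc - rival_fc)^2)
        f.append((a - b) ** 2 - ((a - r) ** 2 - (b - r) ** 2))
    n = len(f)
    mean_f = sum(f) / n
    var_f = sum((v - mean_f) ** 2 for v in f) / (n - 1)
    t_stat = mean_f / math.sqrt(var_f / n)
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(t_stat / math.sqrt(2.0))
    return t_stat, p_value

# Hypothetical toy values, not results from the paper.
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
benchmark_fc = [2.0, 1.0, 4.5, 3.0, 6.5]
rival_fc = [1.1, 1.9, 3.2, 3.9, 5.1]
t_stat, p_value = clark_west(actual, benchmark_fc, rival_fc)
```

A positive t-statistic with a small p-value points to a smaller MSPE for the rival model.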
As an alternative test of the contribution of EQSOI to forecast accuracy, we report in Table 3 the results of the Diebold and Mariano (1995) test. The results are obtained using the modified version of the test (M-DM test) proposed by Harvey et al. (1997). The p-values are computed using the R package "forecast" (Hyndman 2017; Hyndman and Khandakar 2008). Again, the baseline HAR-RV model forms the benchmark model. We report test results (p-values based on robust standard errors) for the case that a forecaster has a loss function that depends on the squared forecast error and, as a robustness check, the case that the loss function depends on the absolute forecast error. On balance, the results corroborate the results of the Clark-West test. When the squared forecast error is relevant for a forecaster, the test results are significant for the two shortest rolling-estimation windows and the three longest forecast horizons. When the absolute forecast error is relevant for a forecaster, in turn, the test results are also significant for a rolling-estimation window of 72 months and h = 36, 48.
In Fig. 2, we generalize the M-DM test for a forecaster who has an asymmetric loss function. Specifically, we consider a forecaster who has a loss function of the following format (Elliott et al. 2005, 2008):
$$L(FE_{t+h}) = \left[\alpha + (1 - 2\alpha)\,\mathbf{1}(FE_{t+h} < 0)\right] |FE_{t+h}|^{s},$$
where we denote the forecast error (that is, actual value of RV minus forecast of RV) by $FE_{t+h}$ and $\mathbf{1}(\cdot)$ denotes the indicator function. This loss function is of the so-called lin-lin type when we set $s = 1$, and of the so-called quad-quad type when we set $s = 2$, and the parameter $\alpha \in (0, 1)$ governs its shape. We obtain a symmetric loss function when $\alpha = 0.5$. When we assume $\alpha = 0.5$ and $s = 1$, then the symmetric loss function depends on the absolute forecast error, while it depends on the squared forecast error when we set $\alpha = 0.5$ and $s = 2$ (see Table 3). When we assume $\alpha > 0.5$ ($\alpha < 0.5$), the loss of an underprediction (overprediction) of realized volatility outweighs the loss of an overprediction (underprediction) of the same size.
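The asymmetric loss function can be written as a short Python function. The sketch assumes the loss takes the form [α + (1 − 2α)·1(FE < 0)]·|FE|^s, which is the form used by Elliott et al. and is consistent with the properties described above; the function name is hypothetical.

```python
def asymmetric_loss(forecast_error, alpha, s):
    """Lin-lin (s = 1) or quad-quad (s = 2) loss of a forecast error.

    forecast_error : actual value minus forecast
    alpha          : asymmetry parameter in (0, 1); 0.5 gives a symmetric loss
    s              : 1 for lin-lin, 2 for quad-quad
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # The indicator equals 1 for an overprediction (negative forecast error).
    indicator = 1.0 if forecast_error < 0 else 0.0
    weight = alpha + (1.0 - 2.0 * alpha) * indicator
    return weight * abs(forecast_error) ** s

# With alpha = 0.8, an underprediction costs more than an overprediction.
loss_under = asymmetric_loss(1.0, alpha=0.8, s=1)    # weight alpha
loss_over = asymmetric_loss(-1.0, alpha=0.8, s=1)    # weight 1 - alpha
```

With α = 0.5, the loss is half the absolute error (s = 1) or half the squared error (s = 2), so rankings of forecasts coincide with those under the symmetric loss functions used in Table 3.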
The results that we summarize in Fig. 2 show that the results given in Table 3 also hold when a forecaster has an asymmetric loss function. We observe significant test results mainly for the short rolling-estimation windows and the three longer forecast horizons. Exceptions arise in some cases when the asymmetry parameter approaches the lower and/or upper boundary of its domain.
Table/figure note: The null hypothesis is that the accuracy of forecasts extracted from a model that does not feature EQSOI as a predictor is equal to the accuracy of forecasts computed by means of a model that uses EQSOI as a predictor. The alternative hypothesis is that the latter forecasts are more accurate than the former. The p-values are based on robust standard errors.
As a robustness check, we summarize in Table 4 results for the Clark-West and M-DM tests (squared error loss) that we obtain when we forecast the square root of realized volatility. Such a nonlinear transformation mitigates the effect of the large outburst of realized volatility at the end of the sample period. The results of this robustness check corroborate that EQSOI significantly contributes to predictive accuracy in the case of the three longest forecast horizons and the two or three shortest rolling-estimation windows. For the Clark-West test, we further observe that the test results are also significant for a forecast horizon of 3 months, further strengthening the claim that EQSOI has predictive value for the realized oil-price volatility.

Conclusion
In recent literature, researchers have reported evidence that ENSO has in-sample predictive value for oil-price volatility. Because in-sample predictability does not necessarily translate into out-of-sample gains, we have extended this line of research by forecasting monthly realized oil-price volatility based on an extended HAR-RV model that incorporates the role of the ENSO, as captured by the Equatorial Southern Oscillation Index (EQSOI). Based on the period covering 1986 January to 2020 December and studying various rolling-estimation windows and forecast horizons, we find that the EQSOI has predictive value for subsequent realized oil-price volatility. The predictive value of the EQSOI data is concentrated at forecast horizons from 2 to 4 years and for rolling-estimation windows of length 4 to 6 years. We have shown that this result holds not only when we consider standard tests of out-of-sample predictability, but also when we study an asymmetric loss function.
Our results can be used by policymakers to obtain information on the future path of oil-price volatility due to changes in climate patterns. This knowledge, in turn, may be useful to anticipate economic activity, given that oil-price movements are known to lead business cycles. Our results, therefore, may help policymakers to make appropriate policy choices in the wake of changing climate patterns and the resulting oil-price movements. Moreover, with volatility being a key input in portfolio decisions, the forecastability of oil-price volatility due to ENSO may be of vital importance to oil investors.
Future research can build on the results documented in this paper by considering the volatility of non-energy commodities (such as agricultural commodities), given previous evidence on their price sensitivity to weather variability.
Table note (squared error loss): The null hypothesis is that the accuracy of forecasts extracted from a model that does not feature EQSOI as a predictor is equal to the accuracy of forecasts computed by means of a model that uses EQSOI as a predictor. The alternative hypothesis is that the latter forecasts are more accurate than the former. The parameter h denotes the forecast horizon (in months). The p-values are based on robust standard errors.