Introduction

Oil is an essential commodity for the US economy, and in consequence, there are well-developed oil derivative markets. To price and hedge derivative contracts written on oil, it is necessary to understand the price dynamics. The derivative pricing literature for oil is relatively sparse compared to the existing literature on equity derivatives. Casassus and Collin-Dufresne (2005)Footnote 1 and Christoffersen et al. (2016) argue that allowing for jumps is crucial for modeling crude oil futures. In Casassus and Collin-Dufresne, where parameters are assumed to be constant, two jumps are found to be significant. In Christoffersen et al., the conditional variance of the normal innovation is governed by GARCH-type dynamics, a single jump process is assumed, and the arrival intensity of the jump process is assumed to be proportional to the conditional variance of the normal innovation. In both papers, the arrival intensities are endogenous.Footnote 2

Press (1967) argued that to accommodate the random arrival of information events, the price change over an interval can be represented as a normal innovation and a Poisson compound process. The compound process represents the impact of major information events, and the normal innovation represents normal information events.Footnote 3 There is a large literature that starts with the premise that information events affect asset prices; however, lacking access to news data banks, some form of proxy is used for the intensity of the arrival of information events.Footnote 4

With the collection and analysis of large data banks detailing different types of news articles,Footnote 5 there is a growing amount of empirical evidence that the arrival of information affects asset prices—see, for example, Tetlock et al. (2008), Heston and Sinha (2004), Borovkova and Lammiman (2015), Boudoukh et al. (2019), Sinha (2016) and Fisher et al. (2020). This has important implications for the modeling of asset pricing dynamics, as it is now possible to directly incorporate the impact of information events. This paper introduces a new description of asset pricing dynamics that explicitly considers the arrival of different types of news information. News information is taken from the Thomson Reuters News Analytics database (TRNA), which collects time-stamped news feeds. For each news article, a linguistic analysis is performed, and an assessment is made of the relative importance of the news article. Following Press (1967), we allow for asymmetric responses, by dividing the news items into major news and normal (nonmajor) news. We give an operational definition of what constitutes major news later in the paper. Sensitive to the growing evidence of an asymmetric response to negative and positive news, we further divide major news into two separate groups: major negative news (futures prices decrease) and major positive news (futures prices increase).

In a preliminary analysis of oil futures prices, we show in a model-free setting that negative (positive) news items are associated with lower (higher) returns and higher (lower) realized variance. Employing the methodology described in Engle et al. (2012), we show that for oil futures, information arrival plays a role in explaining volatility clustering. However, unlike Engle et al., who aggregate all news items, we show that negative news plays a more important explanatory role than positive news. We find that the term structure of futures prices is impacted by negative and positive news. The effects are asymmetric and decrease in magnitude with increased time to maturity, consistent with the Samuelson (1965) hypothesis.

In our pricing model, we describe the conditional variance of the normal innovations using a standard GARCH model. For the major innovations, two Poisson compound processes are introduced: one for important negative news items and one for important positive items. The number of news articles varies over time; there are quiet periods and busy periods.Footnote 6 Major information events typically generate news articles lasting over several periods and may generate new information events. We count the number of negative and positive important news items each period. The arrival processes are autocorrelated, implying that all the moments for the return distribution will also vary over time. It also implies that for major news periods, volatility clustering is more likely to occur. To the best of our knowledge, this is the first study that explicitly considers how the dynamics of information flow and the different types of information directly influence the price dynamics.

Our paper differs from extant literature in four ways. First, we show in a model-free framework that negative (positive) net news is associated with low (high) returns. This is not surprising. A finding that is of more interest is that negative (positive) net news is associated with higher (lower) realized variance. Second, we explicitly model the impact of news items on futures price dynamics. Unlike extant literature that assumes the arrival intensities of jump processes are endogenous, in our model the intensities of the news arrival processes for jumps are exogenous. This difference is important. Our model naturally incorporates quiet periods, such as weekends and holidays, when news flows decrease; it also incorporates the impact of major information events that can often generate many news articles. Third, the two Poisson compound processes are important in describing the price dynamics, consistent with the finding of Casassus and Collin-Dufresne (2005). The arrival intensities are positive and statistically significant, and their prices of risk are statistically significant. The moments of the distribution of returns vary with the cycle of news items, and our model does a reasonable job in predicting the fluctuations in sign of the moments of the return distribution. Fourth, the distributions describing the impact of negative and positive news are different, and the autoregressive news arrival processes are different for each type of news. These results are consistent with our model-free results that show an asymmetry in response to negative and positive news. Our results offer a possible explanation of the asymmetric response in volatility to positive and negative price shocks found in Christoffersen et al. (2016). They did not explicitly consider the impact of news but rather modeled exogenous price shocks, independent of an underlying source.Footnote 7

A growing number of empirical studies examine how soft information and hard information affect asset prices and volatility. Two approaches have evolved in the analysis of news data: a bag of words and machine learning. In the bag-of-words approach, a dictionary of words is compiled, and the words are classified as conveying either positive or negative meaning.Footnote 8 The second approach, and the one employed in this study, formalizes the techniques used in the bag-of-words approach by employing machine learning and using a training set to calibrate what is neutral, negative or positive news. The TRNA database is created by applying a neural network methodology based on named entity recognition and part-of-speech tagging to produce measures of the relevance and the sentiment of news data. For each news item, TRNA estimates the probability that the item has negative, neutral and positive sentiment.Footnote 9 Heston and Sinha (2004) compare the performance of the Harvard-IV-4 dictionary of words, the financial dictionary used by Loughran and McDonald (2011) and the TRNA neural network methodology. In all cases, there is evidence that news has some predictive power for stock returns over event windows of one to four days. When portfolios are formed on a weekly basis, the TRNA data outperform the other methodologies by predicting positive returns up to 13 weeks after a news story release.

A machine learning approach has generated some criticism because the way it classifies documents in a nontransparent fashion can lead to greater misspecification; see Boudoukh et al. (2019). It is far from clear why using neural networks for classification leads to a greater misspecification. Misspecification will occur in any approach for a variety of reasons: for example, a poor initial training set or a failure to interpret words within context. For the bag-of-words approach, the training set is the list of words used to identify meaning. Given that the algorithm has been calibrated using a training sample, if the new sample is very different from the training sample, then misclassification will occur. This will in general militate against finding statistical significance. Interpreting words within the correct context is essential. For example, news about a firm cutting its dividend is usually interpreted as bad news. However, during a recession, such news might be interpreted as good news, as it shows the firm is conserving cash. Simple bag-of-words approaches fail to consider the context.

The choice of oil futures brings benefits and challenges. Unlike an equity stock, at any date there is a term structure of contracts (oil futures contracts that mature at different dates). Over the sample horizon, there were periods when the term structure was upward sloping and periods when it was downward sloping. Fitting the whole term structure makes the estimation more challenging than separately estimating the dynamics of individual contracts. The estimation facilitates investigating how news affects the dynamics of the different contracts, issues that are important for pricing, hedging and risk management—see Samuelson (1965) and Trolle and Schwartz (2009).

The evidence of the impact of news on commodity prices is relatively small compared to equity and fixed income markets. Dreibus (2014) documents that wheat futures prices increased following a February 10, 2014, US Department of Agriculture report that forecasted an increase for US wheat exports. Wexler (2014) reports that prices of Arabica coffee bean futures increased by 9.1%, the largest one-day price increase since November 2004, following news that a forecasted weekend rainfall was not realized. The importance of oil to the US economy suggests that it should be sensitive to the impact of macroeconomic announcements. However, the evidence is mixed. Roache and Rossi (2010), Kilian and Vega (2011) and Chatrath et al. (2012) find little evidence of oil prices responding to announcements. In contrast, Elder et al. (2013), Amengual and Xiu (2018), Scrimgeour (2014) and Heath (2018) find that macroeconomic announcements can have an impact on oil prices and volatility.

Borovkova and Lammiman (2015) use the TRNA data to construct a daily sentiment index, characterizing negative and positive news, for all news items that refer to crude oil. An event study methodology is used to study the days when the news sentiment index occupies the extreme percentiles. They find that returns increase (decrease) for positive (negative) news. However, there is an asymmetric response to positive and negative news. The losses are larger for negative news compared to the gains from positive news.

The remainder of the paper proceeds as follows. "Data" section describes the data. "Preliminary analysis" section describes the results from the model-free preliminary analysis. In "Model" section, the model is described. "Estimation" section discusses the estimation and results. "Robustness" section describes different robustness tests, and in "Summary" section, a summary is given.

Data

Futures data

We use a dataset of Light Sweet Crude oil futures (symbol WTI) contracts traded on the Chicago Mercantile Exchange (CME). We have daily settlement prices, trading volume and open interest from January 5, 2004, to December 28, 2012. There are 2261 trading days in our sample.Footnote 10 Contracts mature approximately three weeks into the delivery month, and delivery can take place no earlier than the first calendar day of the delivery month. Consecutive months are listed for the current year and the next five years.Footnote 11 From information provided by the CME, the first six months tend to be the most liquid. For contracts with maturities greater than six months, trading is concentrated in contracts expiring March, June, September and December. (See Trolle and Schwartz (2009) for a discussion of these issues.)

We roll futures contracts before they have less than one month to maturity (approximately 22 business days), following Gorton et al. (2013). We define maturity as the first day of the delivery month.Footnote 12 This avoids the delivery period.Footnote 13 Let F(t, T) denote the futures price at time t of a contract that matures at time T, \((t\le T)\). We compute the return defined as the logarithm of \(F(t+1,T)/F(t,T),\) and provide summary statistics for the first 12 contracts in Table 1. In Panel A, we see that the standard deviation of the return decreases with the maturity of the contract. This is consistent with Samuelson’s (1965) hypothesis. Skewness is negative and decreases with maturity. Kurtosis on average decreases with maturity. These properties for the four moments are consistent with Christoffersen et al. (2016). Panel B considers the nearest to maturity contract and provides the first four moments for each year in our sample. There is variability in all four moments over the nine-year period with normalized skewness and excess kurtosis fluctuating in sign. Note that the kurtosis for year 2012 is 4.021, up from 1.770 in the previous year. This increase is attributable to one observation on June 29, when oil increased by 3.2% following positive US manufacturing data.
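
The return definition and the four moments reported in Table 1 can be illustrated with a minimal sketch. The function names and the example prices below are hypothetical and are not taken from the paper's data. The sketch assumes that each return is computed from two settlement prices of the same contract, so that a roll never mixes prices from different maturities, and it reports normalized skewness and excess kurtosis.

```python
import numpy as np

def log_returns(prices):
    """Daily log returns ln(F(t+1,T)/F(t,T)) for one contract's settlement prices."""
    p = np.asarray(prices, dtype=float)
    return np.log(p[1:] / p[:-1])

def four_moments(returns):
    """Mean, sample standard deviation, normalized skewness and excess kurtosis."""
    r = np.asarray(returns, dtype=float)
    mean = r.mean()
    std = r.std(ddof=1)
    # Standardize with the population standard deviation for the shape moments.
    z = (r - mean) / r.std(ddof=0)
    skew = float(np.mean(z ** 3))
    excess_kurt = float(np.mean(z ** 4) - 3.0)
    return mean, std, skew, excess_kurt

# Hypothetical settlement prices, for illustration only.
example_prices = [100.0, 101.0, 99.5, 102.0, 101.2, 103.0]
example_returns = log_returns(example_prices)
example_stats = four_moments(example_returns)
```

Because log returns add up over time, the sum of the daily returns equals the log of the ratio of the last price to the first price, which gives a quick consistency check on any return series built this way.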

Table 1 Crude oil futures data

We employ the Zhang et al. (2005) methodology to estimate the daily realized volatility for the nearest to maturity contract using tick data provided by TickData. In Fig. 1, we plot price, returns, the price ratio of the sixth contract in the futures maturity term structure to the nearest to maturity contract (F6/F1) and realized volatility. Panel A shows a doubling in oil prices over the period from December 1, 2007, to January 30, 2009, followed by a collapse. (Khan (2009) describes the run-up in prices as an oil bubble.) Not surprisingly, there are large fluctuations in returns (Panel B). For most of the period, the term structure of futures is upward sloping (in contango), as captured by the F6/F1 ratio being greater than one (Panel C). Realized volatility is shown in Panel D. During the collapse of the oil bubble period, there was a sustained period of high volatility.
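
For readers unfamiliar with the Zhang et al. (2005) estimator, the following is a minimal sketch of its basic two-scale form: the realized variance averaged over K sparse subgrids, minus a bias correction based on the realized variance computed from all ticks. The number of subgrids K, the function name and the absence of any small-sample adjustment are assumptions made for illustration; the text does not report the settings actually used.

```python
import numpy as np

def two_scale_realized_variance(log_prices, K=5):
    """Basic two-scale realized variance from one day's intraday log prices.

    K (the number of subgrids) is an illustrative choice, not a value from the paper.
    """
    y = np.asarray(log_prices, dtype=float)
    n = y.size - 1  # number of tick-by-tick returns
    if K < 1 or n < K:
        raise ValueError("need at least K returns and K >= 1")
    # Realized variance on the full (fast) grid, dominated by noise at high frequency.
    rv_all = np.sum(np.diff(y) ** 2)
    # Realized variance on each of the K sparse subgrids, then averaged (slow scale).
    rv_avg = np.mean([np.sum(np.diff(y[k::K]) ** 2) for k in range(K)])
    # Average number of returns per subgrid.
    n_bar = (n - K + 1) / K
    # Subtract the noise-induced bias estimated from the fast scale.
    return rv_avg - (n_bar / n) * rv_all
```

Averaging over subgrids uses all of the tick data while sampling sparsely enough to limit microstructure noise, and the second term removes the remaining noise bias.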

Fig. 1

Price, return, term structure and realized variance of the nearest-to-maturity contract (F1). This figure provides daily time-series data for the nearest-to-maturity futures contract (F1). Panel A plots the settlement price. Panel B plots the log return, \(\ln (F_{t+1,T}/F_{t,T})\), where \(F_{t,T}\) is the time t settlement price of the contract that matures at time T. Panel C plots the price ratio of F6 and F1, where F6 is the sixth contract in the futures maturity term structure and F1 is the nearest-to-maturity contract. The solid horizontal line at one indicates equal price, and a ratio greater (less) than one indicates an upward (downward) sloping term structure. Panel D plots realized variance calculated using intraday price data as described in the text. The sample period is January 5, 2004, to December 28, 2012 (2261 trading days)

Figure 2, Panel A (Panel B), plots the daily settlement price (returns) for the nearby contract over the period December 3, 2007, to June 30, 2009. In Panel B, we identify the five largest positive (negative) returns. The table below the panels (a) provides the date of the large return; (b) identifies news events for each of these days that can explain the direction of the return; and (c) provides the returns over the three days prior to the spike in the return. For example, on December 11, 2008, Saudi Arabia announced a big oil production cut. This is consistent with the spike P2 in Fig. 2, Panel B. On January 7, 2009, the US Energy Information Administration reported a large increase in oil inventories, consistent with the drop in prices denoted by spike N4.

Fig. 2

Extreme oil futures returns and associated major news event around oil bubble peak. Panel A plots the daily settlement price, and Panel B plots daily return \(\ln (F_{t+1,T}/F_{t,T})\), where \(F_{t,T}\) is the time t settlement price of the contract that matures at time T, for the nearest-to-maturity futures contract (F1). The five largest positive (P1 through P5) and the five largest negative (N1 through N5) return days are labeled. The table provides the returns, \(R_t\), on these extreme days, the returns of the previous three days (\(R_{t-3},R_{t-2},R_{t-1}\)) and the associated major news event(s). The time period is December 1, 2007, to June 30, 2009 (397 trading days)

Figure 3 provides a similar analysis for realized volatility. Panel A plots the realized volatility during the bubble period, and Panel B identifies the five largest positive changes in volatility. The table below the panels identifies the major news event from the time of the change and provides the level of realized volatility just prior to the jump in volatility. In all cases, the level of volatility more than doubles on the arrival of major news information.

Fig. 3

Extreme realized variance daily changes and associated major news event around oil bubble peak. Panel A plots daily realized variance, \(\sigma ^2_t\), and Panel B plots daily realized variance changes, \(\Delta _t^{\sigma ^2}=\sigma ^2_t-\sigma ^2_{t-1}\), for the nearest to maturity futures contract, F1. The five largest positive (P1 through P5) realized variance change days are labeled on Panel B. The table provides the realized variance changes, \(\Delta _t^{\sigma ^2}\), on these extreme days, the realized variance change of the previous day, \(\Delta _{t-1}^{\sigma ^2}=\sigma ^2_{t-1}-\sigma ^2_{t-2}\), and the contemporaneous and lag 1 and lag 2 realized variances, \(\sigma ^2_{t-2},\sigma ^2_{t-1},\sigma ^2_{t}\) and the associated major news event(s). IEA is the International Energy Agency, and EIA is the US Energy Information Administration. The time period is December 1, 2007, to June 30, 2009 (397 trading days)

Interest rate data

We use 3-month Treasury bill interest rate data from the Federal Reserve Bank of St. Louis Web site. Table 2 provides summary statistics for these data. Interest rates varied greatly over our sample period, as the Federal Reserve increased rates from 2004 to mid-2007 and, in response to the credit crisis, dropped rates quickly in 2008. See Table 2, Panel A. (We discuss below the estimation results provided in Panel B.)

Table 2 Interest rate data

News data

We obtain news data from the Thomson Reuters News Analytics (TRNA) database for the period January 5, 2004, to December 28, 2012. Each news item has a time stamp indicating the day and time the news was published.Footnote 14 TRNA uses an algorithm to analyze each news item. With each news item, it associates a probability \(p^{+}\) that the news has positive sentiment, a probability \(p^{-}\) that it has negative sentiment and a probability \(1-p^{+}-p^{-}\) that the news has neutral sentiment. The analysis also generates a relevance score defined between zero and one. It is calculated by comparing the relative number of occurrences of the asset and the number of occurrences of other organizations and commodities. If the asset is mentioned in the headlines, the relevance is set equal to one. The news items are classified as alerts, articles, append or overwrite.

In order to focus only on major oil news, we filter the dataset to retain only news items with relevance equal to one. In an effort to focus on news that might affect oil prices rather than on reporting about oil prices, we delete any news items with price in the title. Each news item is classified as positive (negative, neutral) if \(p^{+}\) (\(p^{-}\), \(1-p^{+}-p^{-}\)) is greater than the other two probabilities.
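
The relevance filter and the classification rule can be sketched as follows. The `NewsItem` record and its field names are hypothetical and do not correspond to actual TRNA field names. Matching the word "price" as a case-insensitive substring of the headline is an assumption about how the title filter is applied, and the handling of ties between probabilities is not specified in the text.

```python
from dataclasses import dataclass

@dataclass
class NewsItem:
    # Hypothetical record layout; not the TRNA schema.
    headline: str
    relevance: float
    p_pos: float
    p_neg: float

def keep_item(item):
    """Retain items with relevance one whose headline does not mention 'price'."""
    return item.relevance == 1.0 and "price" not in item.headline.lower()

def classify(item):
    """Label an item by the largest of the three sentiment probabilities."""
    p_neu = 1.0 - item.p_pos - item.p_neg
    probs = {"positive": item.p_pos, "negative": item.p_neg, "neutral": p_neu}
    # On an exact tie, max() returns the first label listed above (an arbitrary choice).
    return max(probs, key=probs.get)
```

Under this sketch, an item headed "Oil prices fall on inventory data" is dropped by the title filter even when its relevance is one, since it reports on prices rather than on news that might move them.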

We compare the distribution of unfiltered news with the distribution of our filtered news. See Table 3 Panel A for summary statistics. The numbers in parentheses indicate the percentage of news items in each category as a percentage of the total number under the current filtering condition. The unfiltered news has 808,958 items in total (358 per day), with 370,990 (164 per day) positive news items and 303,701 (134 per day) negative news items; after filtering, we have 519,877 (230 per day) news items, with 216,185 (96 per day) positive news items and 206,432 (91 per day) negative news items. Filtering has a substantial impact on the size of the dataset, reducing the total number of news items by roughly 36%.

Table 3 News data with relevance filter

Table 3 Panel B evaluates how filtering the news affects its explanatory power on returns and realized variances. We compare unfiltered news with news filtered for relevance. The column names denote the dependent variables \((Z_{t})\), either the daily returns or the daily log change in realized variance. The row names indicate the adjusted \(R^{2}\) and diagnostic statistics with different explanatory variable structures: \(R_{A}^{2}\) is the \(R^{2}\) measure of the basic AR(1) model: \(Z_{t}=a_{0}+a_{1}Z_{t-1}+ \epsilon _{t}\). Next, we add the number of contemporaneous positive and negative news items in a given day: \(Z_{t}=a_{0}+a_{1}Z_{t-1}+b^{+}n_{t}^{+}+b^{-}n_{t}^{-}+\epsilon _{t}\). The adjusted \(R^{2}\) is denoted by \(R_{AN}^{2}\). The change in the \(R^{2}\) compared to the AR(1) base case, denoted by the term \(\Delta _{+N}\), is positive for both return and realized variance. The Wald statistic \((W_{+N})\) tests whether the news coefficients \(b^{+}\) and \(b^{-}\) are jointly zero and is described by an F distribution with degrees of freedom \((2,N_{Obs}-3)\), where \(N_{Obs}\) is the number of observations. The Wald statistic is positive and statistically significant at the 1% level for both the return and realized variances, implying that the news terms are important explanatory variables. The term \(R_{AL}^{2}\) denotes the adjusted \(R^{2}\) with the lagged news, \(Z_{t}=a_{0}+a_{1}Z_{t-1}+b^{+}n_{t-1}^{+}+b^{-}n_{t-1}^{-}+\epsilon _{t}\). The change, \(\Delta _{+L}\), in the \(R^{2}\) compared to the AR(1) base case, is negative for returns and positive for realized variance. Lagged news seems to have little predictability for returns and a small impact on realized variance. The Wald statistics \((W_{+L})\) are positive for both return and realized variances but are not significant.
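
The comparison between the AR(1) benchmark and the regression with contemporaneous news can be sketched with ordinary least squares. This is an illustration under stated assumptions, not the estimation code behind Table 3: the function names are hypothetical, a classical F-test of the two news coefficients stands in for the Wald statistic, and the denominator degrees of freedom follow the textbook convention (usable observations minus estimated coefficients), which may differ from the convention reported in the text.

```python
import numpy as np

def ols_rss(y, X):
    """Least-squares coefficients and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

def adjusted_r2(y, rss, n_params):
    """Adjusted R^2 for a regression with n_params coefficients (intercept included)."""
    n = y.size
    tss = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - (rss / (n - n_params)) / (tss / (n - 1))

def ar1_with_news(z, n_pos, n_neg):
    """Compare Z_t = a0 + a1 Z_{t-1} + e_t with the same model plus news counts."""
    z = np.asarray(z, dtype=float)
    n_pos = np.asarray(n_pos, dtype=float)
    n_neg = np.asarray(n_neg, dtype=float)
    y, lag = z[1:], z[:-1]          # one observation is lost to the lag
    const = np.ones_like(y)
    X_ar = np.column_stack([const, lag])
    X_news = np.column_stack([const, lag, n_pos[1:], n_neg[1:]])
    _, rss_ar = ols_rss(y, X_ar)
    beta, rss_news = ols_rss(y, X_news)
    dof = y.size - X_news.shape[1]
    # F-statistic for the joint restriction b+ = b- = 0 (two restrictions).
    f_stat = ((rss_ar - rss_news) / 2.0) / (rss_news / dof)
    return {
        "beta": beta,
        "adj_r2_ar": adjusted_r2(y, rss_ar, 2),
        "adj_r2_news": adjusted_r2(y, rss_news, 4),
        "f_stat": f_stat,
        "dof": (2, dof),
    }
```

Replacing `n_pos[1:]` and `n_neg[1:]` with `n_pos[:-1]` and `n_neg[:-1]` gives the lagged-news variant of the same comparison.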

Many news items have little real news content. To avoid being overwhelmed with noise, we apply a probability threshold filter. We only classify a news item as having positive (negative) tone if the probability value is greater than or equal to a specified threshold, \(p^{+}\ge \pi\) (\(p^{-}\ge \pi\)). In Table 4 Panel A, we investigate the impact of this filter. We report the daily average number of news items in each category for various sentiment probability thresholds. Because the TRNA algorithm assigns to each news item a nonzero probability that the item is negative, neutral or positive, the maximum value of each of these probabilities is less than one. In each of these columns, the number in parentheses is the ratio of the number of news items in that category relative to the number of news items in the same category for the unfiltered data (provided in the first row). For example, when the threshold is set at 0.45, the number of positive news items is reduced to \(93\%\) of the original number and the number of negative news items is reduced to \(92\%\) of the original number. By picking a large value for the threshold, we are left with a sample of news items with a high probability of being either a major positive or major negative news item. If the threshold is set at 0.75, there are 31 (33) major positive (negative) news items per day, on average. The Spearman correlation between the positive and negative news time series, reported in the final column, becomes statistically indistinguishable from zero when the threshold is greater than 0.55, implying orthogonality.
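
The effect of the threshold on the number of retained items can be sketched as below. The function name is hypothetical. The sketch assumes that an item counts as positive (negative) at threshold \(\pi\) when its positive (negative) probability is the largest of the three and is at least \(\pi\); the ratios are taken relative to the items in the same category before thresholding, and each category is assumed to be non-empty.

```python
import numpy as np

def retention_ratio(p_pos, p_neg, thresholds):
    """Share of positive and negative items that survive each probability threshold."""
    p_pos = np.asarray(p_pos, dtype=float)
    p_neg = np.asarray(p_neg, dtype=float)
    p_neu = 1.0 - p_pos - p_neg
    # Category before thresholding: the largest of the three probabilities.
    is_pos = (p_pos > p_neg) & (p_pos > p_neu)
    is_neg = (p_neg > p_pos) & (p_neg > p_neu)
    rows = []
    for pi in thresholds:
        kept_pos = np.sum(is_pos & (p_pos >= pi))
        kept_neg = np.sum(is_neg & (p_neg >= pi))
        rows.append((pi, kept_pos / is_pos.sum(), kept_neg / is_neg.sum()))
    return rows
```

Each returned row holds the threshold followed by the retained share of positive and of negative items, which corresponds to the ratios described for Table 4 Panel A.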

The consequences of increasing the threshold are analyzed further in Panels B and C. To provide a benchmark, we draw upon the results reported in the first row of Table 3 Panel B, where we estimate a simple AR(1) process with no news terms included, \(Z_{t}=a_{0}+a_{1}Z_{t-1}+\epsilon _{t}\). In Table 4 Panels B and C, we consider the impact of news using \(Z_{t}=a_{0}+a_{1}Z_{t-1}+b^{+}{\tilde{n}}_{t}^{+}+b^{-}{\tilde{n}} _{t}^{-}+\epsilon _{t},\) where \({\tilde{n}}_{t}^{+}\)\(({\tilde{n}}_{t}^{-})\) is the number of positive (negative) news items, given a threshold \(\pi\). As the threshold increases, implying that we are filtering out more noise, the \(R^{2}\) increases for the returns, and the Wald statistic, \(W_{+N}\), increases in size and is statistically significant. Similar results are seen in Panel C for the realized variance, although the \(R^{2}\) does not increase after the threshold reaches 0.65. The regression results are insensitive to the threshold when lagged news items are used.

Table 4 News data with sentiment filter

Based on this analysis, we use a threshold of \(\pi =0.75\) to identify major news data. With \(\pi =0.75\), we have 69,173 (31 per day) positive news items and 73,821 (33 per day) negative news items. Summary statistics are provided in Table 5 Panel A. Looking at the 50th quantile in Panel A, from 2004 to 2007, there is more positive than negative oil news. From 2010 to 2012, there is more negative than positive oil news. In the "Robustness" section, we examine the implications of using a lower threshold of \(\pi =0.65\). The estimation results provided in Panel B are discussed below.

Table 5 Major news data

Preliminary analysis

Each day a random number of news items occur. We want to measure only major news and determine whether the aggregate over the day has been positive or negative. We define Net News over day t as

$$\begin{aligned} \text {Net News}_{t} & \equiv \sum _{j}\left[ 1_{(p_{j}^{+}\ge \pi )}-1_{(p_{j}^{-}\ge \pi )}\right] \nonumber \\ & \equiv \text {Positive News}_{t}-\text {Negative News}_{t} \end{aligned}$$
(1)

where the summation is over all major news items that arrived during day t. We can separate Net News into aggregate positive news and negative news. Using the nearest-to-maturity futures contract, we examine how return and realized volatility are affected by Net News. In Fig. 4 Panels A1–A3, we examine how the distribution of returns varies with Net News using box plots.Footnote 15 Panel A1 is for the full sample period (January 5, 2004–December 28, 2012), Panel A2 is for the oil bubble period (January 5, 2004–June 30, 2009) and Panel A3 is for the post-bubble period (July 1, 2009–December 28, 2012). The first quartile has the lowest Net News and the last quartile the highest Net News. We observe that the median of the return distribution increases with Net News: negative Net News is associated with low returns and positive Net News with high returns on average. While not surprising, it does provide confidence in the classification of negative and positive news articles. For equity returns, Dzielinski and Hasseltoft (2013) and Sinha (2016) find a similar relationship.
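
The Net News measure and the grouping of days by Net News quartile can be sketched as follows. The function names, the integer day index and the treatment of quartile boundaries are assumptions made for illustration; the text does not specify how ties at the quartile cut points are handled.

```python
import numpy as np

def net_news(p_pos, p_neg, day_index, n_days, threshold=0.75):
    """Daily count of major positive items minus daily count of major negative items."""
    p_pos = np.asarray(p_pos, dtype=float)
    p_neg = np.asarray(p_neg, dtype=float)
    day_index = np.asarray(day_index, dtype=int)  # day of each item, 0..n_days-1
    pos = np.bincount(day_index, weights=(p_pos >= threshold).astype(float), minlength=n_days)
    neg = np.bincount(day_index, weights=(p_neg >= threshold).astype(float), minlength=n_days)
    return pos - neg

def median_by_quartile(net, outcome):
    """Median of a daily outcome (return or realized variance) within each Net News quartile."""
    net = np.asarray(net, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    edges = np.quantile(net, [0.25, 0.5, 0.75])
    # Group 0 holds days at or below the first cut point; group 3 holds days above the third.
    group = np.searchsorted(edges, net, side="left")
    return [float(np.median(outcome[group == g])) for g in range(4)]
```

With many tied Net News values, a quartile group can be empty under this simple rule, in which case a different tie-breaking convention would be needed.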

Fig. 4

Contemporaneous daily return and realized volatility versus daily net major news quantiles. This figure plots the distribution of daily return (Panels A1, A2 and A3) and realized volatility (Panels B1, B2 and B3) for the nearest to maturity futures contract (F1) versus daily net major news count quantiles for the full sample period, the bubble period and the post-bubble period. Daily return is \(\ln (F_{t+1,T}/F_{t,T})\), where \(F_{t,T}\) is the time t settlement price of the contract that matures at time T. Realized variance is calculated using intraday price data as described in the text. Boxplots give the 25th (Q1) and 75th (Q3) percentile of net major news count with upper and lower end of the whiskers representing \(\mathrm{min}(\mathrm{max(data)},Q3+1.5IQR)\) and \(\mathrm{max}(\mathrm{min(data)},Q1-1.5IQR)\), respectively, and \(IQR=Q3-Q1\). The full sample period is January 5, 2004, through December 28, 2012 (2261 trading days), the bubble period is January 5, 2004, through June 30, 2009 (1378 trading days), and the post-bubble period is July 1, 2009, through December 28, 2012 (883 trading days)

In Panels B1–B3, the relationship between realized variance and Net News is less clear. For the full sample period, panel B1, the median for the first quartile, 0.0566, is the largest and the fourth quartile median, 0.0501, is the smallest. For the oil bubble period, panel B2, the first quartile median, 0.0824, is largest and the fourth quartile median, 0.0530, the smallest. For the post-bubble period, panel B3, the first quartile median, 0.0456, is the largest and the third quartile median, 0.0365, the smallest. These results suggest a negative correlation between Net News and realized variance. The support of the distribution for the first quartile is greater than the other quartiles, indicating that there were some large values of realized variance. Negative Net News appears to be associated with higher realized variance than positive Net News, suggesting an asymmetric response of realized variance to positive and negative news. Panels A and B suggest that the impact of positive and negative Net News on returns and volatility is asymmetric. Borovkova and Lammiman (2015) find asymmetric responses in oil futures returns and volatility to positive and negative news.Footnote 16 Amengual and Xiu (2016) find that negative jumps in volatility, often occurring due to policy changes, play an important role in equity markets.

Volatility clustering

We address the question of whether persistence in volatility is affected by adding measures of news arrival. We use as a benchmark an autoregressive model and then compare it to an autoregressive model with contemporaneous news. Our analysis is similar to that used by Engle et al. (2012), with one important difference: we distinguish between positive and negative news, rather than aggregating all news. If information arrival is clustered, then this will lead to clustering in volatility, so that controlling for news arrival should reduce the persistence attributed to past volatility.

Table 6 shows how news can help explain volatility clustering, by checking the half-life of the impulse response function. First, we estimate a basic autoregressive model of daily realized variance, \(RV_{t} = \sum _{i=1}^{5}a_{i}RV_{t-i}+\epsilon _{t}\), denoted as AR(5).Footnote 17 We calculate the adjusted \(R^{2}\), the half-life, the log-likelihood and the corrected AIC for the model. Half-life here means the time for the magnitude of the forecast to become half that of the initial forecast.Footnote 18 The first column shows that the half-life is 36 days.
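
One way to compute a half-life from estimated autoregressive coefficients is sketched below. The function names are hypothetical, and the sketch assumes that the half-life is the first horizon at which the response to a unit shock falls to one half or less in absolute value; the precise convention used for Table 6 may differ.

```python
def impulse_response(ar_coefs, horizon):
    """Response psi_h of an AR(p) process to a unit shock at h = 0, for h = 0..horizon."""
    p = len(ar_coefs)
    psi = [1.0]
    for h in range(1, horizon + 1):
        # psi_h = a_1 psi_{h-1} + ... + a_p psi_{h-p}, using only available lags.
        psi.append(sum(ar_coefs[i] * psi[h - 1 - i] for i in range(min(p, h))))
    return psi

def half_life(ar_coefs, max_horizon=1000):
    """First horizon at which the response is at most half the initial shock, else None."""
    psi = impulse_response(ar_coefs, max_horizon)
    for h, value in enumerate(psi):
        if abs(value) <= 0.5:
            return h
    return None
```

For an AR(1) process with coefficient 0.9, the response at horizon h is 0.9 raised to the power h, which first drops below one half at h = 7. News regressors do not enter this recursion directly; they affect the half-life through the change in the estimated autoregressive coefficients.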


The term \(A_{+}^{5}\) indicates a model with the original autoregressive structure, but with a contemporaneous positive news factor added, \(RV_{t}=\sum _{i=1}^{5}a_{i}RV_{t-i}+b^{+}n_{t}^{+}+\epsilon _{t}\). The term \(A_{-}^{5}\) is the AR(5) model plus contemporaneous negative news and \(A_{+ \& -}^{5}\) means the AR(5) model with both positive and negative contemporaneous news added. The models \(A_{L+}^{5}\), \(A_{L-}^{5}\) and \(A_{L+ \& -}^{5}\) are similarly defined but using news factors at lag 1 \((n_{t-1}^{+/-})\).

We can observe that adding contemporaneous news reduces the half-life of volatility shocks, indicating that news can help explain the clustering in the realized variance. The effect is asymmetric: the effect for positive news is very small, but for negative news the half-life is reduced by \(8.3\%\). For lagged positive news, there is no change, and for lagged negative news, half-life is reduced by \(2.8\%\).

Our results suggest that volatility clustering is related to information arrival and that the failure to include news effects in standard volatility models results in overestimated persistence. This is not surprising, given the results in Goodhart et al. (1993), Andersen and Bollerslev (1998) and Engle et al. (2012). However, unlike the extant literature, we show that the impact of news on persistence in volatility depends on whether the news is positive or negative.

Table 6 Realized variance clustering and major news

Term structure effects

The impact of news on both return and variance should depend on the maturity of the futures contract. We examine possible term structure effects in Table 7. Panel A reports the Pearson correlation between the number of news items and the return and the squared return for the futures contracts F1 to F12. The standard errors are given in parentheses. For positive news, the correlations for return and squared return are positive and significant. For negative news, the correlations for return are negative and significant and those for squared return are positive and significant. The correlations tend to decrease in absolute magnitude as contract maturity increases. This is consistent with Samuelson's (1965) hypothesis that short-term contracts are more affected by news than long-term contracts. Comparing the orders of magnitude, the correlations for negative news are approximately twice the size of the correlations for positive news. For neutral news, the correlations for both returns and squared returns are positive, though insignificant. These results affirm the conclusion that there is an asymmetric response to positive and negative news.
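The correlation statistics in Panel A can be sketched as follows. The paper does not state how the standard errors are obtained; the sketch assumes the usual large-sample approximation \(\sqrt{(1-r^{2})/(n-2)}\), and the function name and the synthetic inputs are hypothetical.

```python
import numpy as np

def pearson_with_se(x, y):
    """Pearson correlation and its approximate standard error.

    Assumption: SE = sqrt((1 - r^2) / (n - 2)), the textbook approximation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    r = np.corrcoef(x, y)[0, 1]
    # Guard against tiny negative values caused by floating-point rounding
    se = np.sqrt(max(0.0, 1.0 - r ** 2) / (n - 2))
    return r, se

# Synthetic daily data standing in for news counts and futures returns
rng = np.random.default_rng(0)
news_count = rng.poisson(30, size=500)
returns = -0.0002 * (news_count - 30) + rng.normal(0.0, 0.02, size=500)

r_ret, se_ret = pearson_with_se(news_count, returns)       # count vs return
r_sq, se_sq = pearson_with_se(news_count, returns ** 2)    # count vs squared return
```

Repeating the calculation for each contract F1 to F12 and for each news type would reproduce the layout of Panel A.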

In Panel B, we estimate the Pearson correlation between positive, neutral and negative news. The correlation between positive and negative news is not significantly different from zero. These findings are consistent with the results in Table 4 Panel A. The correlations between positive and neutral news and between negative and neutral news are positive and significant.

Table 7 Major news and price movement correlations

Model

In the previous sections, we show, in a model-free environment, that 1) negative (positive) news is associated with lower (higher) returns and higher (lower) realized variance, 2) information arrival, especially negative news, plays a role in explaining volatility clustering and 3) the term structure of futures prices is impacted, asymmetrically, by negative and positive news with decreasing magnitude as contract maturity increases. In this section, we develop a reduced-form model that incorporates these findings to price oil futures.

We consider the pricing of futures contracts in a discrete-time economy. Over a horizon [0, T], we consider m intervals of length \(\Delta\), where \(m\Delta =T\).Footnote 19 The commodity spot price is denoted by \(S_{t}\), and we define \(X_{t}=\ln (S_{t})\). The dynamics of \(X_{t}\) under the natural probability measure P are described byFootnote 20

$$\begin{aligned} X_{t+1}-X_{t}=(r_{t}-\delta _{t})\Delta -l_{t}+e_{t+1}^{X} \end{aligned}$$
(2)

where \(X_{t+1}\) represents the value of X at time \(t+\Delta\), \(r_{t}\) denotes the instantaneous risk-free rate of interest, \(\delta _{t}\) the convenience yield and \(\exp (l_{t})\equiv E_{t}^{Q}[\exp (e_{t+1}^{X})]\), with the expectation taken under the pricing measure Q. The random variable \(e_{t+1}^{X}\) is defined below. Expression (2) implies

$$\begin{aligned} E_{t}^{Q}[S_{t+1}]=S_{t}\exp [(r_{t}-\delta _{t})\Delta ] \end{aligned}$$
(3)

This is the discrete-time equivalent of the continuous-time expression (4) in Casassus and Collin-Dufresne (2005).
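The step from (2) to (3) can be written out explicitly. Exponentiating (2), taking the conditional expectation under Q, and using the fact that \(r_{t}\), \(\delta _{t}\) and \(l_{t}\) are known at time t together with the definition of \(l_{t}\), gives

$$\begin{aligned} E_{t}^{Q}[S_{t+1}]&= E_{t}^{Q}[\exp (X_{t+1})] \\&= S_{t}\exp [(r_{t}-\delta _{t})\Delta -l_{t}]\,E_{t}^{Q}[\exp (e_{t+1}^{X})] \\&= S_{t}\exp [(r_{t}-\delta _{t})\Delta ]. \end{aligned}$$

The term \(l_{t}\) therefore acts as a convexity correction that removes the expected exponential of the shock.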

Each day a random number of major news items arrives. Some major news items are more important than others; that is, they have a greater impact on price. Ideally, we would like to incorporate this type of filtering by having separate processes for the different types of information events. Due to the challenges of estimation, we decided to leave this type of filtering to future research. During any period, say \([t-1,t)\), there are \(n_{t}^{+}\) \((n_{t}^{-})\) positive (negative) news items. Define the set \(A\equiv \{+,-\}\), where the elements refer to positive and negative news, respectively. Following Press (1967), we model the structure of asset returns as a mixture of a conditional normal distribution and two Poisson compound processes, one for positive news and one for negative news. The arrival processes of the Poisson distributions are taken as exogenous. A direct implication of the compound processes is that the volatility of the return is directly affected by the frequencies of the arrival processes (see Clark 1973; Ross 1989 and Andersen 1996). The results in Engle et al. (2012) are consistent with this type of model. We assume that for the conditional normal process, the variance is described by a GARCH process. Our model differs from the model described in Christoffersen et al. (2016). They assume a normal distribution with variance governed by a GARCH process and a single Poisson compound process. In their model, the arrival process for the Poisson distribution is endogenous; it is not explicitly identified with news arrival.

The arrival of each type of news item is modeled as a Poisson process with intensity \(\lambda _{t}^{j}\) per unit time, \(j\in A\). The intensity \(\lambda _{t}^{j}\) can vary over time, depending on the news cycle. As an information event develops, the number of news items usually increases, reaches a plateau and then fades. Major information events typically evolve, generating new series of news articles. To model this phenomenon, we assume

$$\begin{aligned} \lambda _{t}^{j}=y_{0}^{j}+y_{1}^{j}n_{t}^{j} \end{aligned}$$
(4)

where \(n_{t}^{j}\) is the number of jumps recorded over the interval up to time t and is measurable at time t and \(y_{0}^{j}\) and \(y_{1}^{j}\) are nonnegative constants. This implies that there is a base rate of information arrival, \(y_{0}^{j}\). The term \(y_{1}^{j}n_{t}^{j}\) implies that the news arrival intensity changes over a news cycle and this will affect the moments of the distribution of returns. The impact of each news item on the level of the futures prices is described by \(\theta _{k}^{j}\), \(k=1,\ldots ,n_{t}^{j}\), \(j\in A\), where \(\theta _{k}^{j}\) is assumed to be a normally distributed random variable, \(\theta _{k}^{j}\sim N(\mu _{j},\sigma _{j}^{2})\), and the \(\{\theta _{k}^{j}\}\) are independent. News from particular sources and about particular events may have greater impact on price compared to other news items that have the same probability of being positive (negative). We did not explicitly consider how the “quality” of a news item affects returns, as this also entails modeling the arrival process of quality news items and the impact on returns, greatly increasing the number of parameters requiring estimation. To accommodate the fact that different types of news articles may have different impacts on the return distribution, we allow the impact of a news item to be described by a random variable. Consequently, if the news items were not equivalent, this would be reflected in the moments of the distribution. The impact of news articles during any finite period is unlikely to be independent. The assumptions of independence and the same distribution were made for the sake of mathematical simplicity. It is expected that the mean for positive news will be greater than the mean for negative news: \(\mu _{+}>\mu _{-}\).

We assume that the error term \(e_{t+1}^{X}\) depends on a stochastic volatility term described by a GARCH(1, 1) process and a term that depends on the impact of the jumps caused by the random arrival of news items. We write this in the form

$$\begin{aligned} e_{t}^{X}=e_{t}^{V}+e_{t}^{J} \end{aligned}$$
(5)

where

$$\begin{aligned} e_{t}^{V}=\sqrt{h_{t-1}\Delta }{\bar{e}}_{t}^{V} \text { and } e_{t}^{J}=\sum _{k=1}^{N_{t}^{+}}\theta _{k}^{+}+\sum _{k=1}^{N_{t}^{-}}\theta _{k}^{-} \end{aligned}$$

and \({\bar{e}}_{t}^{V}\sim N(0,1)\) and \(e_{t}^{J}\) represents the sum of two compound processes: one for positive news and one for negative news. The variance, \(h_{t-1}\), is described by a GARCH(1, 1) process, following Heston and Nandi (2000),

$$\begin{aligned} h_{t}=w_{h}+b_{h}h_{t-1}+a_{h}({\bar{e}}_{t}^{V}-c_{h}\sqrt{h_{t-1}})^{2} \end{aligned}$$
(6)

where \(w_{h}\), \(b_{h}\), \(a_{h}\) and \(c_{h}\) are constants. Note this incorporates mean reversion of the volatility as a special case. Define \(U_{t}^{j}=\sum _{k=1}^{N_{t}^{j}}\theta _{k}^{j}\), \(j\in A\), so that

$$\begin{aligned} e_{t}^{J}=U_{t}^{+}+U_{t}^{-}. \end{aligned}$$
(7)

This implies that the net effect of negative and positive news stories will affect the distribution of returns. This is consistent with the model in Engle et al. (2012), who assume that the effect of past news on volatility is due to investors taking time to process extant public information.
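The error structure in (5) to (7) can be simulated with a short sketch. It makes two simplifying assumptions that are not part of the model: the arrival intensities are held constant rather than following (4), and all parameter values are hypothetical. The function name `simulate_errors` and the dictionary keys are likewise illustrative.

```python
import numpy as np

def simulate_errors(n_steps, delta, h0, w_h, b_h, a_h, c_h, lam, mu, sigma, seed=0):
    """Simulate e^V, e^J and the variance path h from (5)-(7).

    lam, mu, sigma are dicts keyed by "+" and "-"; intensities are held
    constant here (a simplification of equation (4)).
    """
    rng = np.random.default_rng(seed)
    h = np.empty(n_steps + 1)
    h[0] = h0
    e_v = np.empty(n_steps)
    e_j = np.empty(n_steps)
    for t in range(n_steps):
        z = rng.standard_normal()
        # Normal innovation scaled by the lagged variance
        e_v[t] = np.sqrt(h[t] * delta) * z
        # GARCH(1, 1) update, equation (6)
        h[t + 1] = w_h + b_h * h[t] + a_h * (z - c_h * np.sqrt(h[t])) ** 2
        # Sum of the two compound Poisson processes, equation (7)
        jump = 0.0
        for j in ("+", "-"):
            n_jumps = rng.poisson(lam[j] * delta)
            jump += rng.normal(mu[j], sigma[j], size=n_jumps).sum()
        e_j[t] = jump
    return e_v, e_j, h

# Hypothetical daily parameters, for illustration only
e_v, e_j, h = simulate_errors(
    n_steps=250, delta=1 / 252, h0=0.1,
    w_h=1e-3, b_h=0.9, a_h=1e-3, c_h=0.5,
    lam={"+": 31.0, "-": 33.0},
    mu={"+": 0.002, "-": -0.005},
    sigma={"+": 0.010, "-": 0.012},
)
e_x = e_v + e_j  # equation (5)
```

Replacing the constant intensities by a function of the observed news counts would bring the sketch closer to the specification in (4).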


The dynamics of the convenience yield are described by

$$\begin{aligned} \delta _{t+1}-\delta _{t}=[\kappa (\alpha -\delta _{t})+\psi _{\delta }X_{t}+\psi _{h}h_{t}]\Delta +e_{t+1}^{\delta } \end{aligned}$$
(8)

where \(\kappa\), \(\alpha\), \(\psi _{\delta }\) and \(\psi _{h}\) are constants, \(e_{t}^{\delta }=\beta _{V}^{\delta }e_{t}^{V}+\beta _{J}^{\delta }e_{t}^{J}+{\bar{e}}_{t}^{\delta }\), \({\bar{e}}_{t}^{\delta }\sim N(0,\Delta \sigma _{\delta }^{2})\), and \(\beta _{V}^{\delta }\) and \(\beta _{J}^{\delta }\) are assumed to be constants. The term \(\beta _{V}^{\delta }\) measures the sensitivity to the shocks affecting the asset price. From the work of Liu and Tang (2011), who present evidence of heteroskedasticity in the convenience yield for oil, we would expect the coefficient \(\beta _{V}^{\delta }\) to be positive and statistically significant. When storage is near its lower bound, the spot price and convenience yield should be high, implying that the coefficient \(\psi _{\delta }\) should be positive (see Casassus and Collin-Dufresne 2005). If the volatility \(h_{t}\) increases, as might be the case if storage is low, the convenience yield will increase if \(\psi _{h}\) is positive, implying that the spot price increases relative to the futures price. Geman and Nguyen (2005) use a similar specification.

The dynamics of the spot interest rate, r, are described by

$$\begin{aligned} r_{t+1}-r_{t}=a(m-r_{t})\Delta +{\bar{e}}_{t+1}^{r} \end{aligned}$$
(9)

where a and m are parameters and \({\bar{e}}_{t}^{r}\sim N(0,\Delta \sigma _{r}^{2})\). A similar specification is used in Schwartz (1997).

A summary of the covariances between the spot price, convenience yield and spot interest rate is given below

$$\begin{aligned} e_{t}^{r}= & {} {\bar{e}}_{t}^{r}\nonumber \\ e_{t}^{\delta }= & {} \beta _{V}^{\delta }e_{t}^{V}+\beta _{J}^{\delta }e_{t}^{J}+{\bar{e}}_{t}^{\delta } \nonumber \\ e_{t}^{X}= & {} e_{t}^{V}+e_{t}^{J} \end{aligned}$$
(10)

where \({\bar{e}}_{t}^{r}\), \(e_{t}^{V}\), \(e_{t}^{J}\) and \({\bar{e}}_{t}^{\delta }\) are assumed to be independent. We need to derive results under the pricing measure Q. Following Le et al. (2010), the Radon–Nikodym derivative is assumed to be

$$\begin{aligned} \frac{\Delta Q}{\Delta P}=\frac{\exp (-\Lambda ^{\prime }\nu _{t+1})}{ L(\Lambda ;\nu _{t})} \end{aligned}$$
(11)

where \(L(\Lambda ;\nu _{t})=E_{t}^{P}[\exp (-\Lambda ^{\prime }\nu _{t+1})]\); \(\nu _{t}=(e_{t}^{V},{\bar{e}}_{t}^{r},{\bar{e}}_{t}^{\delta },U_{t}^{+},U_{t}^{-})\) is a (5, 1) vector of error terms and \(\Lambda\) is a (5, 1) vector of prices of risk, assumed constant.

Consider a futures contract that matures at time T. At time t \((t\le T)\), with n intervals to the maturity date, the futures price F(t, T) is given by

$$\begin{aligned} F(t,T) & = E_{t}^{Q}[S_{T}]\\ & = \exp \left( B_{n}+D_{n}^{X}X_{t}+D_{n}^{r}r_{t}-D_{n}^{\delta }\delta _{t}+G_{n}^{h}h_{t}+\sum _{j\in A}D_{n}^{j}n_{t}^{j}\right) \nonumber \end{aligned}$$
(12)

See "Appendix 1" for the derivation and the definitions of the coefficients. Unlike the jump model described in Casassus and Collin-Dufresne (2005), where the effects of jumps only appear in the constant term, in (12) the impact of jumps appears both in the constant term and through the state variables \(\{n_{t}^{j}\}\). Given the asymmetric response to positive and negative news (see Borovkova and Lammiman 2015), one would expect \(D_{n}^{+}>D_{n}^{-}\). The coefficient \(D_{n}^{r}\) is positive if \(a\le 1\). In Schwartz (1997), interest rates have a positive effect on futures prices. The coefficient \(D_{n}^{\delta }\) is positive if \(\kappa \le 1\). One would also expect the coefficient \(D_{n}^{X}\) to be positive.

To examine the factors that affect the variance of the spot price at time t, we use \(X_{t}=\ln \left( S_{t}\right)\), so that

$$\begin{aligned} \mathrm{var}_{t}(X_{t+1})=h_{t}\Delta +\sum _{j\in A}(\mu _{j}^{2}+\sigma _{j}^{2})\lambda _{t}^{j}\Delta \end{aligned}$$
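The jump contribution in this expression is the variance of a compound Poisson sum: with intensity \(\lambda _{t}^{j}\Delta\) over the interval and jump sizes \(N(\mu _{j},\sigma _{j}^{2})\), the variance is \(\lambda _{t}^{j}\Delta (\mu _{j}^{2}+\sigma _{j}^{2})\). The following sketch evaluates the formula and checks the jump term by simulation. The function names and all numerical values are hypothetical.

```python
import numpy as np

def spot_conditional_variance(h_t, delta, lam, mu, sigma):
    """h_t * Delta plus the compound Poisson variance of each news type."""
    jump_part = sum(lam[j] * delta * (mu[j] ** 2 + sigma[j] ** 2) for j in lam)
    return h_t * delta + jump_part

def simulated_jump_variance(lam_j, mu_j, sigma_j, delta, n_draws=200_000, seed=0):
    """Monte Carlo variance of one compound Poisson sum over one interval."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(lam_j * delta, size=n_draws)
    # Given N jumps, the sum of N iid N(mu, sigma^2) draws is N(N*mu, N*sigma^2)
    sums = counts * mu_j + np.sqrt(counts) * sigma_j * rng.standard_normal(n_draws)
    return sums.var()

# Hypothetical inputs: one news type, intensity 4 per unit time, Delta = 0.5
analytic = spot_conditional_variance(
    h_t=0.0, delta=0.5, lam={"+": 4.0}, mu={"+": 0.1}, sigma={"+": 0.2}
)
simulated = simulated_jump_variance(4.0, 0.1, 0.2, 0.5)
```

The simulated value should be close to the analytic one, with a gap that shrinks as `n_draws` grows.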

The spot volatility depends on the volatility \(h_{t}\) and the frequency of the different jumps and the distribution describing the jumps. To examine the variance of the futures price, consider \(\ln (F(t,T))\), so that

$$\begin{aligned} \mathrm{var}_{t}(R_{t+1})= & {} 2(G_{n-1}^{h}a_{h})^{2}+h_{t}(2G_{n-1}^{h}a_{h}c_{h}-u_{n-1}^{V}\sqrt{ \Delta })^{2} \nonumber \\&\quad +\, (D_{n-1}^{r})^{2}\sigma _{r}^{2}\Delta +(D_{n-1}^{\delta })^{2}\sigma _{\delta }^{2}\Delta \nonumber \\&\quad +\, \sum _{j\in A}\lambda _{t}^{j}\Delta [(D_{n-1}^{j}+\mu _{j}u_{n-1}^{J})^{2}+(\sigma _{j}u_{n-1}^{J})^{2}] \end{aligned}$$
(13)

where \(u_{n-1}^{V}=D_{n-1}^{X}-\beta _{V}^{\delta }D_{n-1}^{\delta }\) and \(u_{n-1}^{J}=D_{n-1}^{X}-\beta _{J}^{\delta }D_{n-1}^{\delta }\). All derivations are given in "Appendix 2." If \(a\le 1\) and \(\kappa \le 1\), the coefficients \(D_{n}^{r}\) and \(D_{n}^{\delta }\) will decrease in magnitude as n decreases. An inverse relation between variance and maturity is consistent with Samuelson's (1965) hypothesis that volatility of futures prices decreases with maturity. Here, the net effect will depend on the stochastic volatility term \(h_{t}\) and the frequency and size of the jumps. Both \(h_{t}\) and \(\lambda _{t}^{j}(=y_{0}^{j}+y_{1}^{j}n_{t}^{j})\) vary stochastically over time. In high news periods, the variance is expected to increase in magnitude. A similar comment applies to kurtosis. For skewness, the net effect depends on the relative magnitude of the number of negative news articles and their impact on return \((\mu _{-})\) compared to the number of positive news articles and their impact on the return \((\mu _{+})\). "Appendix 2" provides expressions for these higher moments.

Estimation

In this section, we describe the steps in the estimation of the model. To simplify the estimation process and reduce the number of parameters that we need to estimate simultaneously using Kalman filtering, we separately estimate, under the P probability measure, the interest rate process and the news arrival processes. The total number of parameters that must be estimated for the full model is 20. To reduce the number of parameters further and aid empirical identification, we set the coefficients \(\psi _{\delta }\) and \(\psi _{h}\) to zero in the expression for the convenience yield. The parameters of the interest rate process (9) were estimated using maximum likelihood. These estimation results are provided in Table 2 Panel B. For the full sample and the first subperiod, all coefficients are positive and statistically significant. For the post-bubble period, from July 1, 2009, to December 28, 2012, the Treasury rate was relatively flat and the only statistically significant coefficient is the volatility.
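The separate estimation of the interest rate process (9) can be sketched as follows. The sketch assumes Gaussian innovations, in which case conditional maximum likelihood for (9) coincides with an ordinary least squares regression of \(r_{t+1}-r_{t}\) on a constant and \(r_{t}\). This is one standard way to carry out the step and is not necessarily the procedure used for Table 2 Panel B. The function name and the test series are hypothetical.

```python
import numpy as np

def fit_rate_process(r, delta):
    """Conditional Gaussian ML estimates of (a, m, sigma_r) in equation (9).

    Regression form: r_{t+1} - r_t = a*m*Delta - a*Delta*r_t + e_{t+1},
    with e_{t+1} ~ N(0, Delta * sigma_r^2). Requires a nonzero slope.
    """
    r = np.asarray(r, dtype=float)
    dr = np.diff(r)
    x = r[:-1]
    design = np.column_stack([np.ones_like(x), x])
    (intercept, slope), *_ = np.linalg.lstsq(design, dr, rcond=None)
    a = -slope / delta
    m = intercept / (a * delta)
    resid = dr - design @ np.array([intercept, slope])
    # The ML variance estimate divides by n, which is what ndarray.var does
    sigma_r = np.sqrt(resid.var() / delta)
    return a, m, sigma_r

# Noise-free hypothetical path: the estimates should recover a = 2, m = 0.03
delta = 0.01
path = [0.05]
for _ in range(50):
    path.append(path[-1] + 2.0 * (0.03 - path[-1]) * delta)
a_hat, m_hat, sigma_hat = fit_rate_process(path, delta)
```

With real data the residual variance is positive and standard errors would be obtained from the information matrix or from the regression output.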

For the intensities of the arrival processes for negative and positive news, we count the number of news items of each type, \(n_{t}^{j}\), over the period \([t-\Delta ,t)\), \(j\in A\). The intensity is described by

$$\begin{aligned} \lambda _{t}^{j}=y_{0}^{j}+y_{1}^{j}n_{t}^{j} \end{aligned}$$

where \(y_{0}^{j}\) and \(y_{1}^{j}\) are constants. Maximum likelihood is used to estimate the coefficients under the P measure. The results are shown in Table 5 Panel B. All the coefficients are positive and highly significant over the full sample period and both subperiods. The fact that the coefficients \(y_{1}^{j}\), \(j\in A\), are significant implies that the number of news items arriving in any given period directly affects the futures prices (see expression (12)) and all the moments of the return distribution. From the summary statistics in Table 4 Panel A, on average there are 31 positive news items per day and 33 negative news items. It follows that, over the full sample period, the average intensity of positive news is

$$\begin{aligned} \lambda _{t}^{+}=20.225+(0.339)(31)=30.734 \end{aligned}$$

and of negative news is

$$\begin{aligned} \lambda _{t}^{-}=18.573+(0.424)(33)=32.546. \end{aligned}$$

This implies that on average there is a higher probability of negative news arrival than of positive news arrival.
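The intensity calculation is a direct evaluation of the linear specification \(\lambda _{t}^{j}=y_{0}^{j}+y_{1}^{j}n_{t}^{j}\). A minimal sketch, using the rounded positive-news estimates quoted above and a hypothetical function name, is:

```python
def arrival_intensity(y0, y1, n_items):
    """Linear intensity: base rate y0 plus y1 times the observed news count."""
    return y0 + y1 * n_items

# Positive news, full sample: rounded estimates and average count from the text
lam_pos_avg = arrival_intensity(20.225, 0.339, 31)

# Intensity on a quiet day versus a busy day (hypothetical counts)
lam_pos_quiet = arrival_intensity(20.225, 0.339, 10)
lam_pos_busy = arrival_intensity(20.225, 0.339, 60)
```

The negative-news intensity follows by passing its own estimates and counts to the same function. Because the inputs are rounded, results computed this way may differ in the last digits from values based on unrounded estimates.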

The remaining parameters, including prices of risk for the interest rate and compound processes, were estimated under the Q probability measure using Kalman filtering. The precise details of the Kalman filter are described in "Appendix 3." To describe the evolution of the variance \(\{h_{t}\}\), we need to either estimate \(\{{\bar{e}}_{t}^{V}\}\) (see (6)) or directly estimate the variance. In addition, we also need to estimate parameters for the two compound processes. To aid identification, we assume that for the nearest-to-maturity futures contract, which is the most liquid contract, we can estimate the variance of the return without error. Let \(\sigma _{t,F1}^{2}\) denote the realized variance of the return for contract F1, then by assumption

$$\begin{aligned} \mathrm{var}_{t}(R_{t+1})=\sigma _{t,F1}^{2}. \end{aligned}$$
(14)

Using expression (13), given the parameter values and \(\lambda _{t}^{+}\) and \(\lambda _{t}^{-}\), we can solve for \(h_{t}\).
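Since \(h_{t}\) enters (13) linearly, the inversion is a single rearrangement: subtract the terms that do not involve \(h_{t}\) from the realized variance and divide by the loading \((2G_{n-1}^{h}a_{h}c_{h}-u_{n-1}^{V}\sqrt{\Delta })^{2}\). The sketch below assumes the coefficients from the appendices are supplied as plain numbers; the function name, argument names and test values are hypothetical.

```python
import math

def implied_h(realized_var, delta, G, a_h, c_h, u_V, u_J, D_r, D_delta,
              sigma_r, sigma_delta, lam, D_news, mu, sigma):
    """Solve equation (13) for h_t given the realized variance of F1.

    lam, D_news, mu, sigma are dicts keyed by news type ("+", "-").
    The result is not guaranteed to be positive for arbitrary inputs.
    """
    # Terms of (13) that do not depend on h_t
    const = 2.0 * (G * a_h) ** 2
    const += D_r ** 2 * sigma_r ** 2 * delta
    const += D_delta ** 2 * sigma_delta ** 2 * delta
    jumps = sum(
        lam[j] * delta * ((D_news[j] + mu[j] * u_J) ** 2 + (sigma[j] * u_J) ** 2)
        for j in lam
    )
    # Coefficient multiplying h_t in (13)
    loading = (2.0 * G * a_h * c_h - u_V * math.sqrt(delta)) ** 2
    if loading == 0.0:
        raise ValueError("h_t is not identified: its loading in (13) is zero")
    return (realized_var - const - jumps) / loading

# Hypothetical inputs chosen so that the implied variance is 0.5
h_t = implied_h(
    realized_var=0.135, delta=0.25, G=0.5, a_h=0.2, c_h=1.0,
    u_V=1.0, u_J=1.0, D_r=2.0, D_delta=1.0, sigma_r=0.1, sigma_delta=0.2,
    lam={"+": 4.0}, D_news={"+": 0.0}, mu={"+": 0.1}, sigma={"+": 0.2},
)
```

In an estimation loop this inversion would be repeated at each date for each trial parameter vector.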

The estimation results are shown in Table 8. We first estimate a model ignoring the impact of news arrival, labeled No News, and then estimate a model that includes the effects of news, labeled With News. For the No News model, the constant \(w_{h}\) and the autoregression coefficient, \(b_{h}\), for the GARCH process are statistically significant. The \(a_{h}\) coefficient is not statistically significantly different from zero. The unconditional volatility is 0.368, similar in magnitude to values reported in the literature. The volatility \(\sigma _{\delta }\) and speed of mean reversion \(\kappa\) of the convenience yield process are statistically significant. The covariance term \(\beta _{V}^{\delta }\) is positive and statistically significant, implying that innovations in the logarithm of the spot price are positively correlated with innovations in the convenience yield.

Table 8 Futures pricing model estimation

If the random arrival of news items impacts prices, we would expect the GARCH process to play a less significant role when news information is included. This is indeed the case: the unconditional volatility of the GARCH process decreases from 0.368 to 0.305. The incorporation of news explains approximately 19.56% of the variance of return; negative news explains 10.37% of the variance and positive news explains 9.19% of the variance. News arrival does affect the price dynamics in a nontrivial way. It affects all the moments of the distribution and consequently hedging ratios.

The expected jump size for positive news, \(\mu ^{+},\) is positive though not statistically significant. For negative news, the expected jump size \(\mu ^{-}\) is negative and statistically significant. In absolute terms, the expected jump size and standard deviation for negative news are larger than those for positive news. These results are also consistent with the finding in Fig. 4 Panel A, which suggests that the mean for positive (negative) news should be positive (negative). Our results show that negative news and positive news have an asymmetric impact on returns. This is apparent when we calculate the means and volatilities of the two compound processes.Footnote 21 For negative news, the mean of the compound process is \(-\,0.1905\) and the volatility is 0.1533. For positive news, the mean is 0.0775 and the volatility 0.1438.Footnote 22 The sum of the means of the negative and positive news compound processes is \(-\,0.1130\), implying that over the estimation period, news had on average a negative impact. The mean of the compound process reported in Christoffersen et al. (2016) is of similar magnitude, \(-\,0.07953\).Footnote 23

Both Poisson compound processes affect the distribution of returns, with negative news playing a more significant role. In Fig. 4, Panel B, there is evidence of negative correlation between the net number of news articles and volatility. These two results are consistent with results in Trolle and Schwartz (2009) who find volatility and futures returns are negatively correlated and the existence of two unspanned volatility factors.

The log-likelihood for the With News model is larger than the log-likelihood for the No News model, and the difference is statistically significant with a p value approximately equal to zero. The inclusion of news decreases the Akaike Information Criterion (corrected for additional variables), implying a better fit.Footnote 24
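The two model comparison statistics can be illustrated with a minimal sketch. Two assumptions are made that the text does not confirm: the corrected criterion is taken to be the small-sample AICc, and the significance of the log-likelihood difference is taken to come from a likelihood ratio statistic for nested models. The function names and all numerical inputs are hypothetical.

```python
def aicc(log_lik, n_params, n_obs):
    """Small-sample corrected AIC: AIC + 2k(k+1)/(n-k-1). Requires n > k + 1."""
    aic = 2.0 * n_params - 2.0 * log_lik
    correction = 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return aic + correction

def likelihood_ratio_stat(log_lik_restricted, log_lik_full):
    """Twice the log-likelihood gain of the full model over the restricted one."""
    return 2.0 * (log_lik_full - log_lik_restricted)

# Hypothetical log-likelihoods and parameter counts for the two models
ll_no_news, k_no_news = -110.0, 10
ll_with_news, k_with_news = -100.0, 14
n_obs = 2000

lr = likelihood_ratio_stat(ll_no_news, ll_with_news)
prefers_news = aicc(ll_with_news, k_with_news, n_obs) < aicc(ll_no_news, k_no_news, n_obs)
```

Under the nested-model assumption, the statistic would be compared with a chi-square distribution whose degrees of freedom equal the number of additional news parameters.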

We examine the impact of news on volatility using the estimates in Table 8. Using the No News model, we estimate the conditional variance due to GARCH using filtered values and compare this to the conditional variance due to GARCH estimated using the With News model. The No News model neglects the impact of news, and consequently, the GARCH variance would be expected to increase (relative to the With News model) if the impact of news articles is important. This difference will increase as the variance generated by the arrival of important news increases. We did the analysis for all 12 contracts (F1,..., F12). The results were similar for all contracts, so we randomly picked F3 for reporting. The results are shown in Fig. 5 Panel A. The estimates from the No News model are larger than those of the With News model, implying that news arrival does affect volatility.

Fig. 5 Paired plot of daily conditional variance from GARCH process and total conditional variance. Panel A plots the natural logarithm of daily conditional variance for F3, the third nearest to maturity contract, determined using the stochastic volatility factor from the futures pricing model estimated with news variables versus the same series from the model estimated without news. Panel B plots the natural logarithm of daily total conditional variance for F3 from the futures pricing model estimated with news variables versus the same series from the model estimated without news.

In Panel B, we compute the total conditional variance (see expression (13)) for the two models. The conditional variance from the interest rate process is the same for both models. With the No News model, we must estimate the conditional variance of the convenience yield. For the With News model, we must estimate the conditional variance of the convenience yield and the news processes. In Panel B, it is seen that the total variance of the With News model can be larger than the total variance of the No News model. This will occur if the variance generated from the arrival of news information is relatively large, as the variance of the convenience yield is similar for both models. If the conditional variance of the No News model is relatively large, it is seen that there is little difference between the conditional variances of the two models, the contribution from the news processes being relatively small.

In summary, the takeaway from Fig. 5 is that news is an important determinant of volatility. Engle et al. (2012) find similar results for equity returns.

In Table 9, we examine the ability of the model to reproduce the summary statistics of the data sample shown in Table 1. In Panel A, we compare the first four moments of the returns across the term structure of futures prices. For the first moment, the model estimate is very close to the values reported in Table 1 Panel A. For the second moment, the model overestimates for the first three contracts and underestimates for the remaining contracts. On average, it underestimates. For skewness, the model tends to underestimate. For kurtosis, the model overestimates for the first seven contracts and underestimates for the remaining contracts.

Table 9 Modeled crude oil futures prices

In Panel B, we estimate the first four moments of the nearest to maturity contract, F1, for each year in the 9-year period of 2004 to 2012. From Table 1, signs for normalized skewness and excess kurtosis vary over this period. The model tends to overestimate the variance. For skewness, the model correctly predicts the sign for all years except year 2006. For kurtosis, the model incorrectly predicts the sign twice and there is overestimation over the whole period.

Robustness

We consider three robustness tests: 1) subperiod estimation, 2) sensitivity of the estimation results to the probability threshold value used to define major negative and positive news and 3) the ability of our model to estimate futures prices out of sample.

Subperiod estimation

The price of oil was below $40 a barrel at the beginning of our sample period; within 5 years, it was over $140 and then fell to below $50 and was trading around $80 a barrel at the end of the sample period. (See Fig. 1 Panel A.) The slope of the term structure of futures prices (Fig. 1 Panel C) also varied. It was downward sloping during 2004 to mid-2005 and upward sloping for most of the remaining period. We examine how this price variation affects parameter estimation.

In Table 10, the model is estimated over the two subperiods: the oil bubble period (January 5, 2004, to June 30, 2009) and the post-bubble period (July 1, 2009, to December 28, 2012). From Fig. 1 Panel A, the oil bubble period is characterized by rising oil prices and a dramatic drop at the end of the period. It is also observed in Fig. 1 Panel D that this turbulent period is a period of high variance clustering. During the post-bubble period, oil prices trend upward and there is lower clustering.

Table 10 Futures pricing model subsample estimation

Comparing the No News model over the two periods, the \(w_{h}\) coefficient of the GARCH process is larger during the oil bubble period than in the post-bubble period. The unconditional volatility over the first period is 0.569 and during the second period 0.515. For the With News model, the autoregressive coefficient \(b_{h}\) increases during the second period. The unconditional volatility increases from 0.459 to 0.484 during the second period. The change in the flow of negative and positive news items is reflected in the changes in the coefficients for the intensities for the arrival of negative and positive news during these periods (see Table 5 Panel B). During the bubble period, the average arrival intensity for positive news is slightly larger than that of negative news. In the post-bubble period, the average arrival intensities undergo a substantial decrease, and the average arrival intensity for negative news is larger than that for positive news. In Table 5, Panel A, it is observed that on average there are more negative than positive news items. The expected jump size \(\mu ^{+}\) and volatility \(\sigma ^{+}\) for positive news are slightly larger in magnitude during the oil bubble period than in the post-bubble period. The reverse occurs for negative news: the expected jump size for negative news \(\mu ^{-}\) and volatility \(\sigma ^{-}\) are substantially larger in absolute magnitude during the oil bubble period than in the post-bubble period. The expected value (variance) of the sum of the compound processes for negative and positive news is \(-\,0.0812\) (0.0625) during the oil bubble period and \(-\,0.0278\) (0.0309) during the post-bubble period. News explains 24.29 percent of the annual volatility during the oil bubble period and 18.87 percent in the post-bubble period. For both the No News and With News models, the RMSE for all contracts is larger during the oil bubble period than in the post-bubble period. In both periods, the RMSE is smaller for the With News model compared to the No News model.

Major news filter

We consider the sensitivity of the parameter estimates to a lower probability threshold, namely 0.65. As seen in Table 4 Panel A, this lower threshold substantially increases the number of news items classified as major negative or major positive news and, consequently, the number of neutral news items decreases. The increase in the number of news items now classified as major news implies that these news items have a lower probability of actually being major news. This will likely reduce the absolute magnitude of the mean jump sizes, and the intensities for the arrival processes for negative and positive news should increase. With fewer news items classified as noise, the unconditional volatility of the GARCH process might be expected to decrease. The effect on the other model parameters should be small.

The results are shown in Table 11. From Table 4 Panel A, the correlation between negative and positive news is still zero with a threshold of 0.65. We make the stronger assumption of independence. The parameters (\(y_{0}\) and \(y_{1}\)) for the arrival processes for the negative and positive news increase, as expected. For positive news, the mean jump size \(\mu ^{+}\) decreases in magnitude as the threshold decreases. Similarly, for negative news, the absolute magnitude of the mean jump size \(\mu ^{-}\) decreases. The expected value of the compound process for positive news increases from 0.0775 to 0.0812, even though the mean of the jump size decreases.Footnote 25 This is because the average number of positive news items increases from 31 to 54 as the threshold decreases. This increase, however, is insufficient to increase the variance of the compound process.

For negative news items, the magnitude of the expected value of the compound process decreases from 0.1905 to 0.0856, in absolute value. The increase in the number of negative news items from 33 to 47 is insufficient to offset the decrease in absolute value of the mean of the jump size. Volatility also decreases. The expected value of the sum of the two compound processes changes from \(-\,0.1130\) to \(-\,0.0046\) when the threshold decreases. The proportion of variance explained by the news drops from 19.56% to 14.92%.

For the GARCH process, the unconditional volatility decreases from 0.305 to 0.211. The average RMSE remains approximately constant but there are small changes in the RMSE across the term structure of prices. In summary, the changes in the parameter estimates are as expected.

Table 11 Futures pricing model estimation for different sentiment probability thresholds

Out-of-sample estimation

In Table 12, we examine the ability of the model to estimate futures prices for a sample not used in the estimation. The results from Table 8 are reproduced in Table 12 Panel A, second column. We refer to this model, estimated using all contracts, as odd plus even. We then estimate the model using only odd month contracts. These results are provided in the third column. We refer to this model as odd. A comparison indicates that there are small changes in the coefficient estimates.

Table 12 Futures pricing model estimation using different futures contracts

In Table 13, we report, for each model, the RMSE for the odd contracts and the RMSE for even contracts. We first consider odd contracts. Using the odd plus even model, the average RMSE is 0.00575. For the odd model, the average RMSE is slightly lower, 0.0055. We compute the 95% confidence interval using a bootstrap sample size of 5000 and infer that statistically, there is no difference in the results.
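The bootstrap comparison can be sketched as follows. The paper states only that 5000 bootstrap samples are used; the sketch assumes a paired bootstrap over dates with independent resampling (which ignores any serial dependence in the pricing errors) and a percentile interval for the difference in RMSE. The function name and the synthetic pricing errors are hypothetical.

```python
import numpy as np

def bootstrap_rmse_diff_ci(errors_a, errors_b, n_boot=5000, level=0.95, seed=0):
    """Percentile CI for RMSE(a) - RMSE(b) from a paired bootstrap over dates."""
    errors_a = np.asarray(errors_a, dtype=float)
    errors_b = np.asarray(errors_b, dtype=float)
    rng = np.random.default_rng(seed)
    n = errors_a.shape[0]
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        # Resample dates with replacement, using the same dates for both models
        idx = rng.integers(0, n, size=n)
        rmse_a = np.sqrt(np.mean(errors_a[idx] ** 2))
        rmse_b = np.sqrt(np.mean(errors_b[idx] ** 2))
        diffs[b] = rmse_a - rmse_b
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(diffs, [tail, 1.0 - tail])
    return lower, upper

# Synthetic pricing errors standing in for the two models' errors
rng = np.random.default_rng(1)
err_odd_plus_even = rng.normal(0.0, 0.0057, size=500)
err_odd = err_odd_plus_even + rng.normal(0.0, 0.0005, size=500)
ci_low, ci_high = bootstrap_rmse_diff_ci(err_odd_plus_even, err_odd, n_boot=1000)
```

If the resulting interval contains zero, the two RMSE values are not statistically distinguishable at the chosen level, which is the sense in which the comparison above is interpreted.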

For both models, we then compute the average RMSE for the even contracts. For the odd model, the even month contracts are out of sample. It is not unreasonable to expect the results to be inferior to those of the odd plus even model. This is the case, though the differences are small and generally statistically insignificant. The results provide strong support for the model’s out-of-sample performance.

Table 13 RMSE analysis from estimation results using different contracts

Summary

This paper examines the impact of news on CME oil futures returns. In a model-free environment, we show that negative and positive news items have an asymmetric impact on returns and volatility. The magnitude of the impact depends on the maturity of the futures contracts, decreasing as maturity increases, consistent with Samuelson’s (1965) hypothesis. We show that persistence in the volatility of oil returns is affected by news arrival. Engle et al. (2012) found a similar result for equity returns.

To the best of our knowledge, there are no arbitrage-free models that consider the explicit impact of news on the price dynamics of oil futures. We incorporate our model-free results into a new reduced-form model that formally recognizes the impact of news, to price oil futures. Two independent Poisson compound processes represent the impact of important negative and positive news. The parameters of the model are estimated and show, consistent with our model-free results, that the effects of negative and positive news are described by different processes, that a significant proportion of volatility is caused by news arrival and that the impact of negative news is larger than the impact of positive news. Christoffersen et al. (2016) also found that price shocks have an asymmetric impact on volatility. They did not, however, consider the flow of news information as a source of those shocks.

Our model, incorporating the intensity of news arrival, is estimated over periods of high and low oil prices and high and low volatility, and the results have important implications for modeling and understanding asset pricing dynamics. To make price and volatility predictions, one needs to estimate the number of negative and positive news items. If the number of positive or negative news items is expected to increase, both the return and volatility will be impacted, and the effect of negative news on returns and volatility will be greater in absolute terms than that of positive news. In our sample period, this relationship was observed in the 2009 and 2014 oil price collapses. Anecdotally, this was also true during the 2020 extreme oil price collapse and volatility spike: a high intensity of negative news items was observed.

Our analysis also provides a rich menu for future research. We applied a threshold to filter news items into distinct groups. This approach can be refined to consider the quality of information, recognizing that some sources of information are more important than others. Announcements of macroeconomic conditions and political events impact oil prices, as shown by Heath (2018). News reporting of these changes implies that the intensity of news arrival and the impact of the news may be state dependent. The arrival of these types of news events could be modeled separately, though the estimation burden increases.