Long-term prediction intervals of economic time series
Abstract
We construct long-term prediction intervals for time-aggregated future values of univariate economic time series. We propose computational adjustments of the existing methods to improve coverage probability under a small-sample constraint. A pseudo-out-of-sample evaluation shows that our methods perform at least as well as selected alternative methods based on model-implied Bayesian approaches and bootstrapping. Our most successful method yields prediction intervals for eight macroeconomic indicators over a horizon spanning several decades.
Keywords
Heavy-tailed noise · Long memory · Kernel quantile estimator · Stationary bootstrap · Bayes
JEL Classification
C14 · C15 · C53 · C87 · E27
1 Introduction
Long-term predictions of economic time series are published yearly by the US Congressional Budget Office (CBO) in its Long-term Budget and Economic Outlook.^{1} In this report, the CBO predicts US federal spending and revenue growth in the coming decades under the assumption of stable tax and spending policies. However, structural changes occur over the long run (taking the turbulent period after the Great Moderation as an example), and not only as a result of changes in legislation. The CBO stated in its January 2000 Budget and Economic Outlook that the baseline projections allow for an average recession within the next 10 years (2000–2010). Today, we know that the 2008 recession was more severe than the predicted average recession. Moreover, in its 2011 report,^{2} the US Financial Crisis Inquiry Commission concluded that the crisis would have been avoidable if timely preventive measures had been introduced. We do not link the absence of these measures to the CBO's projections from 18 years ago, but we do believe that accurate long-term economic predictions can trigger correct and timely decisions. Economic predictions for several decades ahead are crucial for strategic decisions made by trust funds, pension management and insurance companies, and for the portfolio management of specific derivatives (Kitsul and Wright 2013) and assets (see Bansal et al. 2016). Several facts hamper the long-term prediction of economic time series: small sample size, because most post-WWII economic indicators are reported on a monthly/quarterly basis; (anti-)persistence^{3} (see Diebold and Rudebusch 1989; Baillie 1996; Diebold and Linder 1996, who also give PIs); heteroscedasticity; and structural change (Cheng et al. 2016), the latter of which is inevitable in the long run (Stock and Watson 2005).

First, since Zhou et al. (2010) evaluate their PIs using only simulated data, we find it necessary to verify their results using real data.

Second, the methods of Zhou et al. (2010), although theoretically valid, do not account for some characteristics of economic time series. Therefore, we propose computational adjustments of the PIs of Zhou et al. (2010) that lead to better predictive performance for small samples and long horizons. Our adjustments employ a stationary bootstrap (Politis and Romano 1994) and kernel quantile estimators (Sheather and Marron 1990).

Third, since neither Zhou et al. (2010) nor Müller and Watson (2016) compare their PIs to any benchmark, we take over this responsibility and conduct an extensive pseudo-out-of-sample (POOS) comparison. We augment the comparison with PIs implied by arfima–garch models, computed in one of two ways: (i) as forecasts for time-aggregated series or (ii) as time-aggregated forecasts of the disaggregated series. To compute (i) and (ii), we use both analytic formulas and bootstrap path simulations (Pascual et al. 2006).

First, our simulation study reveals that the PIs of Zhou et al. (2010) fail to achieve their nominal coverage rate as the horizon grows, owing to their rapidly shrinking width. Under a long-memory DGP in particular, the coverage rate reaches only half of the nominal level.

Second, using the proposed computational adjustments, we achieve an improvement in the coverage rate of up to 20pp, which may, however, still fall below the nominal level.

Third, based on real data (S&P 500 returns and US 3-month TB interest rates), the adjusted PIs of Zhou et al. (2010) provide a valid competitor for those of Müller and Watson (2016). Particularly in the case of asset returns, the PIs of Müller and Watson (2016) provide higher coverage but less precision (larger width), while for assets' volatility, the roles are switched. In both cases, the adjusted Zhou et al. (2010) PIs outperform the bootstrap PIs of Pascual et al. (2006).

Fourth, with the adjusted method of Zhou et al. (2010), we construct long-term prediction intervals for selected US macroeconomic time series including GDP growth, total factor productivity, inflation, population, and others. These PIs provide an alternative to the PIs given by Müller and Watson (2016) in Table 5 on pages 1731–1732 of the referenced paper.
2 Methods and simulations
2.1 Methods for computing prediction intervals of temporal averages
2.1.1 Bootstrap prediction intervals by Pascual et al. (2004, 2006)
In order to obtain PIs for \(\bar{y}_{+1:m}\) with arfima–garch-type models, we can either (i) use averages of the in-sample observations \(\bar{y}_{t(m)}\), as defined above, or (ii) average the forecasts of \(y_t\) over \(t=T+1,\ldots ,T+m\). In both cases, we fit arfima(p, d, q)–garch(P, Q) models to the data with the rugarch R package. As already mentioned, the full resampling scheme accounts for parameter uncertainty, but at a high computational cost for only a minor improvement in performance. Therefore, we use a partial resampling scheme which accounts for the uncertainty due to the unknown distribution of the innovations. The fractional parameter \(d\in \left[ 0,0.5\right) \) is, depending on the series, either fixed to 0 (only for stock returns, see Sect. 3) or estimated by maximum likelihood (ML). The arma orders are restricted to \(p,q\in \left\{ 1,\ldots ,4\right\} \) and are selected by aic. The garch orders are restricted to \((P,Q)\in \left\{ (0,0), (1,1)\right\} \). The details of our implementation follow:
 1.
Compute the series of overlapping rolling averages \(\bar{y}_{t(m)}=m^{-1}\sum _{i=1}^{m} y_{t-i+1}\), for \(t=m,\ldots ,T\).
 2.
Fit the selected arfima–garch model to the series of \(\bar{y}_{t(m)}\).
 3.
Compute
 (anlt)
m-step-ahead MSE forecasts \(\hat{\bar{y}}_{T,+1:m}\) and \(\hat{\bar{\sigma }}^2_{T,+1:m}\) analogously to (2.3) by replacing the observations \(y_t\) with the rolling averages \(\bar{y}_{t(m)}\). Then, PIs are given by (2.5).^{4}
 (boot)
residuals \(\hat{e}_t, t=1,\ldots ,T\), and generate \(b=1,\ldots ,B\) future paths \(\hat{\bar{y}}^b_{T(m),t}, t=T+1,\ldots ,T+m\), recursively using (2.5) and the parameter estimates from the original sample. Obtain the PIs by inverting the empirical distribution of \(\hat{\bar{y}}^b_{T(m),T+m}, b=1,\ldots ,B\).
 1.
Fit the selected arfima–garch model to the series \(y_t\).
 2.
Compute
 (anlt)
\(\hat{\bar{y}}_{T,+1:m}=m^{-1}\sum _{i=1}^m\hat{y}_{T,T+i}\), with \(\hat{y}_{T,T+i}\) the i-step-ahead analytic forecast from (2.3). The scaling factor in the PI \([L,U]=\hat{\bar{y}}_{T,+1:m}+[ Q_t(\alpha /2), Q_t(1-\alpha /2)]\hat{\text {as}}_{T,+1:m}\) is derived in “Appendix C.”
 (boot)
residuals \(\hat{e}_t, t=1,\ldots ,T\) and generate \(b=1,\ldots ,B\) future paths \(\hat{y}^b_{T,t}, t=T+1,\ldots ,T+m\), recursively using (2.5) and the parameter estimates from the original sample. Compute the temporal averages \(\hat{\bar{y}}^b_{T,+1:m}\), as estimators of \(\bar{y}^b_{T,+1:m}\), \(b=1,\ldots ,B\). We obtain the PIs by inverting the empirical distribution of \(\hat{\bar{y}}^b_{T,+1:m}\), \(b=1,\ldots ,B\).
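The bootstrap variant of the above procedure can be sketched in Python as follows. This is only illustrative: a plain AR(1) stands in for the arfima–garch fit from rugarch, only innovation uncertainty is resampled (the partial scheme), and all function names are ours. The helper `rolling_averages` corresponds to step 1 of approach (i); `bootstrap_pi_ar1` mimics the (boot) step of approach (ii).

```python
import numpy as np

def rolling_averages(y, m):
    """Overlapping rolling m-averages: ybar_t = mean(y_{t-m+1..t}), t = m..T."""
    c = np.cumsum(np.concatenate(([0.0], np.asarray(y, float))))
    return (c[m:] - c[:-m]) / m

def bootstrap_pi_ar1(y, m, B=999, alpha=0.10, rng=None):
    """Bootstrap PI for the mean of the next m observations.

    Simplified stand-in for the partial resampling scheme: an AR(1)
    replaces the arfima-garch fit, parameters stay fixed at their sample
    estimates, and only the innovations are resampled."""
    rng = rng if isinstance(rng, np.random.Generator) else np.random.default_rng(rng)
    y = np.asarray(y, float)
    mu = y.mean()
    x = y - mu
    phi = (x[1:] @ x[:-1]) / (x[:-1] @ x[:-1])   # OLS AR(1) estimate
    resid = x[1:] - phi * x[:-1]
    resid = resid - resid.mean()                 # recentre the residuals
    avgs = np.empty(B)
    for b in range(B):
        e = rng.choice(resid, size=m, replace=True)   # resample innovations
        level, s = x[-1], 0.0
        for t in range(m):                             # simulate one future path
            level = phi * level + e[t]
            s += level
        avgs[b] = mu + s / m                           # temporal average of the path
    # invert the empirical distribution of the bootstrap averages
    return np.quantile(avgs, [alpha / 2, 1 - alpha / 2])
```

In the paper, the model fitting, orders, and resampling are handled by rugarch; the sketch only conveys the structure of the algorithm.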
2.1.2 Robust bayes prediction intervals by Müller and Watson (2016)
A natural way to incorporate the uncertainty about \(\vartheta \), which is crucial for the asymptotic predictive distribution of \(\bar{y}_{+1:m}\), is to assume a prior for \(\vartheta \). A practical drawback of such an approach is that the preassigned coverage holds only on average relative to the prior. Hence, Müller and Watson (2016) further robustify their bayes PIs in order to attain “frequentist coverage,” i.e., coverage that holds over the whole parameter space.
 1.
Set q small and compute the cosine transformations \(X=(X_1,\ldots ,X_q)\) of the target series \(y_t\). Standardize them as \(Z=(Z_1,\ldots ,Z_q)^{{\mathsf {T}}} =X/\sqrt{X^{{\mathsf {T}}}X}\).
 2.
For a chosen grid of parameter values \(\vartheta =(b,c,d)\) satisfying \(-0.4\le d\le 1\) and \(b,c\ge 0\), compute the matrices \(\varSigma (m/T,q,\vartheta )\) following formulas (9) and (20) from Müller and Watson (2016) and using, e.g., a numerical integration algorithm. (Details are given in “Appendix” of the original paper.)
 3.
Choose a prior for \(\vartheta =(b,c,d)\) and compute the posterior covariance matrix \(\varSigma =\bigl ({\begin{matrix}\varSigma _{ZZ} &{} \varSigma _{Z\bar{e}}\\ \varSigma _{Z\bar{e}}^{{\mathsf {T}}} &{} \varSigma _{\bar{e}\bar{e}}\end{matrix}}\bigr )\).
 4.
Obtain the covariance matrix of the residuals as \(\varSigma _{UU}=\varSigma _{\bar{e}\bar{e}}-\varSigma _{Z\bar{e}}^{'}\varSigma _{ZZ}^{-1}\varSigma _{Z\bar{e}}\).
 5.
Compute the quantiles \(Q_q^{\text {tmix}}(\alpha /2)\) and \(Q_q^{\text {tmix}}(1-\alpha /2)\) of the conditional (mixture-t) distribution of \(\bar{y}_{+1:m}\) using, e.g., sequential bisection approximation. (Details are given in “Appendix” of the original paper.)
 6.
The PIs are given by \([L,U]=\bar{y}+[Q_q^{\text {tmix}}(\alpha /2),Q_q^{\text {tmix}}(1-\alpha /2)]\sqrt{X^{{\mathsf {T}}}X}\).
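Step 1 above can be sketched in Python as follows. The midpoint discretization of the cosine weights is one common choice and its scaling is immaterial, since the standardization \(Z=X/\sqrt{X^{\mathsf {T}}X}\) removes any constant factor; the function names are ours.

```python
import numpy as np

def cosine_transforms(y, q):
    """Low-frequency cosine transforms of a series: one common midpoint
    discretization, X_j proportional to sum_t cos(pi*j*(t-1/2)/T) * y_t."""
    y = np.asarray(y, float)
    T = len(y)
    t = np.arange(1, T + 1)
    # (q, T) matrix of cosine weights evaluated at the sample midpoints
    psi = np.sqrt(2.0 / T) * np.cos(np.pi * np.outer(np.arange(1, q + 1), (t - 0.5) / T))
    return psi @ y

def standardize(X):
    """Z = X / sqrt(X'X): the scale-invariant vector used in steps 1 and 6."""
    return X / np.sqrt(X @ X)
```

A useful sanity check: for a constant series, every cosine transform is exactly zero, so only deviations from the sample mean carry low-frequency information.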
2.1.3 Prediction intervals by Zhou et al. (2010)
 CLT method (clt): If the process \(e_t\) shows short-range dependence and light-tailed behavior, then in the light of a quenched CLT, Zhou et al. (2010) propose the following PI for \(\bar{e}_{+1:m}\):$$\begin{aligned}{}[L,U]=[Q^N(\alpha /2),Q^N(1-\alpha /2)]\frac{\sigma }{\sqrt{m}}, \end{aligned}$$(2.10)where \(\sigma \) is the long-run standard deviation (sd) of \(e_t\). However, since \(\sigma \) is unknown, it must be estimated. One popular choice is the lag window estimator$$\begin{aligned} \hat{\sigma }^2=\sum _{k=-k_T}^{k_T}\hat{\gamma }_k=\sum _{k=-k_T}^{k_T}\frac{1}{T}\sum _{t=1}^{T-|k|}(\hat{e}_t-\bar{\hat{e}})(\hat{e}_{t+|k|}-\bar{\hat{e}}). \end{aligned}$$(2.11)The PI for \(\bar{y}_{+1:m}\) with nominal coverage \(100(1-\alpha )\%\) is then given by$$\begin{aligned}{}[L,U]=\bar{y}+[Q^N(\alpha /2),Q^N(1-\alpha /2)]\frac{\hat{\sigma }}{\sqrt{m}}. \end{aligned}$$(2.12)
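A minimal Python sketch of the lag window estimator (2.11) and the resulting PI (2.12); the truncation rule \(k_T=\lfloor T^{1/3}\rfloor\) is our illustrative default, not a choice made in the paper.

```python
import numpy as np
from statistics import NormalDist  # stdlib normal quantiles

def longrun_sd(e, k_T=None):
    """Lag window estimator of the long-run sd: sum of sample
    autocovariances gamma_hat_k over |k| <= k_T (truncated flat window).
    The default truncation k_T = floor(T^(1/3)) is our assumption."""
    e = np.asarray(e, float)
    T = len(e)
    if k_T is None:
        k_T = int(T ** (1.0 / 3.0))
    d = e - e.mean()
    s2 = d @ d / T                       # gamma_hat_0
    for k in range(1, k_T + 1):
        s2 += 2.0 * (d[:-k] @ d[k:]) / T  # gamma_hat_k + gamma_hat_{-k}
    return np.sqrt(max(s2, 0.0))

def clt_pi(y, m, alpha=0.10):
    """CLT PI: ybar +/- z_{1-alpha/2} * sigma_hat / sqrt(m), as in (2.12)."""
    y = np.asarray(y, float)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * longrun_sd(y) / np.sqrt(m)
    return y.mean() - half, y.mean() + half
```

For i.i.d. data the estimator reduces (up to sampling noise in the covariance terms) to the usual sample standard deviation, while under positive serial dependence it widens the PI accordingly.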
2.1.4 Practical comparison of the previously discussed methods
Pros and cons of Pascual et al. (2004, 2006) When comparing the bootstrap PIs to the analytic PIs, the former have the advantage of incorporating the uncertainty due to the unknown distribution of the residuals and the unknown parameters into the uncertainty about the target. When the distribution of the residuals is asymmetric and there are doubts about the proximity of the estimated model to the true DGP, the bootstrap approach dominates the analytic one. Furthermore, regarding the implementation, the analytic PIs are more difficult to obtain since we deal with a nonstandard target. By contrast, the bootstrap PIs are readily available in the R package rugarch; hence, their estimation is cheap. Concerning the two ways of fitting the models to data, i.e., (i) using the series of rolling temporal averages or (ii) using the original series and averaging the forecasts, each approach has pros and cons regarding the implementation and the effective use of our relatively small sample. The literature (e.g., page 302 in Lütkepohl 2006; Marcellino 1999) does not provide any conclusion about the superiority of (i) over (ii), or vice versa. Therefore, we include both (i) and (ii) in the POOS comparison in Sect. 3.
Pros and cons of Müller and Watson (2016) Their methods represent the state of the art, being robust against stylized peculiarities of economic time series. Their Bayesian approach accounts for both model and parameter uncertainty, but the focus is only on the parameters ruling the persistence, in contrast to the previously discussed bootstrap approach, where the focus is on the short-term dynamics. To date, no package implementation has been available, which makes the approach less attractive to practitioners. Moreover, the PIs depend on several forecaster-made choices, such as the number of frequencies q to keep, the grid of parameter values, and the choice of prior. Even with these inputs fixed, the computation takes long due to the multiple advanced numerical approximations required for the (bayes) PIs and the further optimization needed to attain the “frequentist coverage.” PIs for the fixed parameters \(q=12\) and \(0.075 \le m /T\le 1.5\) used in their paper (and also in the current paper) are available faster thanks to some precomputed inputs available from the replication files.^{5}
Pros and cons of Zhou et al. (2010) Their methods provide a simple alternative to the previously discussed ones. As to their scope of applicability, the clt method does not require any specific rate of how fast the horizon can grow compared to the sample size. However, the predictive performance heavily depends on the estimator of the long-term volatility \(\sigma \). Furthermore, for some processes with heavy-tailed innovations or long-range dependence, the notion of the long-run variance \(\sigma ^2\) does not exist, and thus this method is not applicable. The attractive features of the qtl PIs are their simplicity and their more general applicability compared to the clt. Their computation requires almost no optimization (at least in our univariate case) and is straightforward. Pascual et al. (2004, 2006) and Müller and Watson (2016) assume that the DGP of \(y_t\) is a (possibly long-memory and heteroscedastic version of an) arma process. Zhou et al. (2010) do not a priori assume any parametrization for the dynamics of \(y_t\), but argue that both qtl and clt PIs are valid for arma processes, whereas only the former should be used for processes with a long memory. The simulations of Zhou et al. (2010) confirm their claims. However, as we demonstrate next, the qtl PIs underperform when T is small and \(m/T\approx 1/2\). Therefore, we propose some computational adjustments and provide a simulation-based justification of their superiority over the original clt and qtl.
2.2 Zhou et al. (2010) under small sample: adjustment and simulations
2.2.1 Computational adjustments
The simulation setup in Zhou et al. (2010), page 1440, assumes \(T=6000\) and horizon \(m=168\). By contrast, in an economic forecasting setup, one typically has only a few hundred observations, while our horizon m stays approximately the same. Here we show how one can easily modify the computation of the clt and qtl PIs in order to enhance their performance. In particular, for qtl, we use the stationary bootstrap (Politis and Romano 1994) with the optimal window width proposed by Politis and White (2004) and Patton et al. (2009) to obtain a set of replicated series. Next, kernel quantile estimators (see Silverman 1986; Sheather and Marron 1990) are used instead of sample quantiles. In order to improve the clt method, we employ a different estimator (cf. 2.18) of \(\sigma \) than (2.11) and account for the estimation uncertainty. These three modifications are then shown, via simulations, to improve the empirical coverage.
Kernel quantile estimation The efficiency of kernel quantile estimators over the usual sample quantiles was established in Falk (1984) and extended to several variants by Sheather and Marron (1990). As shown in the latter, the improvement in MSE is of constant order \(\int uK(u) K^{1}(u)\mathrm{d}u\) for the symmetric kernel K used. The theorems mentioned in Sect. 2 are easily extendable to these kernel quantile estimators. We conjecture that one can use the Bahadur-type representations for kernel quantile estimators shown in Falk (1985) and obtain similar consistency results for at least linear processes. We used the popular Epanechnikov kernel \(K(x)=0.75(1-x^2)_{+}\) for our computations because of its efficiency in terms of mean integrated square error.
 1.
Replicate the series \(e_t\) B times, obtaining \(e_t^b, t=1,\ldots ,T\), \(b=1,\ldots ,B\).
 2.
Compute \(\bar{e}_{t(m)}^b=m^{-1} \sum _{i=1}^m e^b_{t-i+1}\), \(t=m,\ldots ,T\), from every replicated series.
 3.
Estimate the \(\alpha /2\)th and \((1-\alpha /2)\)th quantiles \(\hat{Q}(\alpha /2)\) and \(\hat{Q}(1-\alpha /2)\) using the Epanechnikov kernel quantile estimator from \(\bar{e}_{T(m)}^b\), \(b=1,\ldots ,B\).
 4.
The PI for \(\bar{y}_{+1:m}\) is \([L,U]=\bar{y}+[\hat{Q}(\alpha /2),\hat{Q}(1-\alpha /2)]\).
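Steps 1–4 can be sketched in Python as follows. Two ingredients are our illustrative defaults rather than the paper's choices: the fixed expected block length \(1/p\) stands in for the automatic width of Politis and White (2004)/Patton et al. (2009), and the bandwidth \(h=B^{-1/5}\) is a rough assumption.

```python
import numpy as np

def stationary_bootstrap(x, B, p=0.1, rng=None):
    """Politis-Romano stationary bootstrap: concatenate circular blocks with
    geometric(p) lengths; expected block length 1/p (fixed here for brevity)."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, float)
    T = len(x)
    reps = np.empty((B, T))
    for b in range(B):
        t = 0
        while t < T:
            start = rng.integers(T)                     # random block start
            length = min(int(rng.geometric(p)), T - t)  # random block length
            reps[b, t:t + length] = x[(start + np.arange(length)) % T]
            t += length
    return reps

def kernel_quantile(x, u, h=None):
    """Kernel quantile estimator: Epanechnikov-weighted average of the order
    statistics around the u-th sample quantile (the 0.75 factor cancels in
    the normalization); h = n^(-1/5) is a rough default, an assumption."""
    xs = np.sort(np.asarray(x, float))
    n = len(xs)
    h = h if h is not None else n ** -0.2
    grid = (np.arange(1, n + 1) - 0.5) / n
    w = np.maximum(1.0 - ((grid - u) / h) ** 2, 0.0)
    return (w / w.sum()) @ xs if w.sum() > 0 else np.quantile(xs, u)

def qtl_kernel_boot_pi(y, e, m, B=500, alpha=0.10, rng=None):
    """Steps 1-4: replicate e_t, take the final rolling m-average of each
    replicate, kernel-estimate its quantiles, and centre at ybar."""
    reps = stationary_bootstrap(e, B, rng=rng)
    ebars = reps[:, -m:].mean(axis=1)   # e_bar_{T(m)}^b for each replicate
    return (np.mean(y) + kernel_quantile(ebars, alpha / 2),
            np.mean(y) + kernel_quantile(ebars, 1 - alpha / 2))
```

The combination of the two adjustments corresponds to the kernel-boot variant evaluated in the tables below.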
2.2.2 Simulations
An extensive out-of-sample forecasting evaluation based on independent samples is possible only with artificial data. Our simulation setup is designed to assess the performance of the original methods of Zhou et al. (2010), as described in Sect. 2.1.3, and of the computational modifications described in Sect. 2.2.1. The simulation results provide evidence of the usefulness of these modifications in an artificial setup based on possibly long-memory arma-like processes. This setup would give an advantage to the approaches described in Sects. 2.1.1 and 2.1.2, should we challenge them here. We leave this task for the next section and real data.
 (i)
\(e_t=0.6e_{t-1}+\sigma \epsilon _t\), for i.i.d. mixture-normal \(\epsilon _t\sim 0.5 N(0,1)+0.5 N(0,1.25)\),
 (ii)
\(e_t=\sigma \sum _{j=0}^\infty (j+1)^{-0.8}\epsilon _{t-j}\), with noise as in (i),
 (iii)
\(e_t=0.6e_{t-1}+\sigma \epsilon _t\), for stable \(\epsilon _t\) with heavy-tail index 1.5 and scale 1,
 (iv)
\(e_t=\sigma \sum _{j=0}^\infty (j+1)^{-0.8}\epsilon _{t-j}\), with noise as in (iii).
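Scenarios (i) and (ii) can be simulated as below (light-tailed case only; the heavy-tailed variants (iii)–(iv) would swap in stable innovations with index 1.5). The truncation of the infinite moving average at J terms and the burn-in length are our computational shortcuts, not part of the scenario definitions.

```python
import numpy as np

def mixture_normal(size, rng):
    """Light-tailed innovations of (i)-(ii): 0.5*N(0,1) + 0.5*N(0,1.25)."""
    scale = np.where(rng.random(size) < 0.5, 1.0, np.sqrt(1.25))
    return scale * rng.standard_normal(size)

def simulate(which, T, rng=None, burn=200, J=3000):
    """One path of scenario (i) ('short': AR(1), phi = 0.6, with burn-in) or
    (ii) ('long': MA(inf) with weights (j+1)^(-0.8), truncated at J terms).
    sigma is set to 1 for simplicity."""
    rng = np.random.default_rng(rng)
    if which == "short":
        eps = mixture_normal(T + burn, rng)
        e = np.zeros(T + burn)
        for t in range(1, T + burn):
            e[t] = 0.6 * e[t - 1] + eps[t]
        return e[burn:]
    # 'valid' convolution of T+J-1 innovations with the J truncated weights
    coef = (np.arange(J) + 1.0) ** -0.8
    eps = mixture_normal(T + J - 1, rng)
    return np.convolve(eps, coef, mode="valid")
```

The AR(1) path has unconditional variance \(1.125/(1-0.36)\approx 1.76\) (the innovation variance is \(0.5\cdot 1+0.5\cdot 1.25=1.125\)), which offers a quick sanity check on the simulator.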
Coverage probability \(\hat{p}\)  Relative median width \(\hat{w}\)  

Horizon  20  30  40  60  90  130  20  30  40  60  90  130  
(A) Results of simulated forecasting experiment for nominal coverage probability \(p=1-\alpha =90\%\)  
shortlight  qtloriginal  85.25  82.33  78.85  72.85  62.19  47.97  0.94  0.90  0.86  0.77  0.64  0.47 
qtlkernel  87.53  84.82  81.81  76.28  66.46  51.91  0.99  0.96  0.92  0.83  0.69  0.52  
qtlboot  82.56  80.84  80.45  79.19  76.70  74.70  0.86  0.85  0.85  0.85  0.85  0.84  
kernelboot  85.82  84.28  83.81  82.47  80.38  78.06  0.93  0.92  0.92  0.92  0.92  0.91  
cltoriginal  85.05  83.50  82.52  80.89  78.80  76.44  0.91  0.89  0.89  0.88  0.89  0.88  
clttdist  86.11  84.03  83.50  82.17  79.97  77.51  0.92  0.91  0.91  0.90  0.90  0.89  
longlight  qtloriginal  80.88  75.96  71.72  63.45  51.20  37.31  0.77  0.72  0.66  0.56  0.43  0.29 
qtlkernel  83.45  79.06  74.99  67.89  56.40  41.39  0.82  0.76  0.71  0.61  0.48  0.32  
qtlboot  76.42  72.20  69.25  64.21  58.54  53.78  0.68  0.63  0.59  0.53  0.48  0.44  
kernelboot  80.58  76.02  73.25  68.43  62.87  57.63  0.74  0.69  0.64  0.58  0.52  0.48  
cltoriginal  77.33  70.98  66.96  61.12  55.01  50.18  0.68  0.60  0.55  0.49  0.44  0.40  
clttdist  84.44  78.50  74.80  68.74  63.26  57.96  0.82  0.72  0.66  0.58  0.53  0.48  
shortheavy  qtloriginal  84.39  80.21  76.71  70.12  58.54  44.45  0.99  0.87  0.78  0.66  0.51  0.36 
qtlkernel  85.93  82.15  79.13  73.43  63.37  48.51  1.01  0.91  0.83  0.72  0.57  0.40  
qtlboot  82.40  79.38  78.04  74.96  70.53  66.74  0.87  0.81  0.77  0.72  0.66  0.62  
kernelboot  84.59  82.10  80.70  78.40  74.78  71.46  0.92  0.86  0.82  0.77  0.72  0.67  
cltoriginal  83.43  80.27  78.31  74.76  69.49  64.32  0.80  0.74  0.71  0.66  0.61  0.57  
clttdist  83.92  80.55  78.72  74.99  69.62  64.44  0.81  0.75  0.72  0.67  0.62  0.58  
longheavy  qtloriginal  80.64  76.27  71.27  62.87  49.82  33.62  0.66  0.58  0.52  0.43  0.32  0.21 
qtlkernel  82.53  78.51  74.32  66.80  55.13  37.49  0.69  0.62  0.56  0.47  0.36  0.24  
qtlboot  78.23  74.59  70.16  64.31  57.06  50.64  0.58  0.52  0.47  0.41  0.35  0.31  
kernelboot  80.96  77.22  73.38  68.14  61.08  54.23  0.62  0.55  0.50  0.44  0.38  0.34  
cltoriginal  77.73  71.64  66.70  59.51  51.70  44.34  0.54  0.46  0.41  0.35  0.29  0.26  
clttdist  83.24  78.34  73.81  67.49  60.01  52.46  0.67  0.57  0.50  0.43  0.37  0.32  
(B) Results of simulated forecasting experiment for nominal coverage probability \(p=1-\alpha =67\%\)  
shortlight  qtloriginal  62.79  60.68  59.19  54.49  45.88  33.48  0.97  0.92  0.91  0.85  0.71  0.52 
qtlkernel  65.28  63.12  61.20  56.16  47.76  34.91  1.02  0.97  0.96  0.89  0.74  0.55  
qtlboot  58.82  57.74  56.81  55.63  53.22  50.49  0.88  0.85  0.86  0.85  0.85  0.84  
kernelboot  62.48  61.55  60.63  59.12  56.89  54.13  0.95  0.92  0.93  0.92  0.92  0.92  
cltoriginal  61.52  59.88  58.93  57.21  55.14  52.14  0.93  0.89  0.89  0.88  0.88  0.88  
clttdist  61.81  60.53  59.47  57.61  55.47  52.29  0.93  0.89  0.90  0.89  0.88  0.88  
longlight  qtloriginal  57.98  56.14  53.68  48.54  38.94  27.71  0.79  0.75  0.73  0.66  0.52  0.35 
qtlkernel  60.30  58.05  55.88  49.70  40.39  28.75  0.83  0.79  0.76  0.68  0.54  0.36  
qtlboot  52.78  50.01  47.72  43.92  38.84  35.42  0.69  0.64  0.60  0.55  0.49  0.45  
kernelboot  56.27  53.11  50.60  46.86  41.46  37.82  0.74  0.69  0.65  0.59  0.53  0.48  
cltoriginal  53.22  47.78  44.50  39.71  34.97  31.64  0.68  0.60  0.55  0.49  0.44  0.40  
clttdist  59.29  54.20  50.21  45.16  40.08  36.17  0.79  0.69  0.64  0.57  0.51  0.46  
shortheavy  qtloriginal  62.74  60.31  59.95  54.78  44.39  31.04  0.98  0.95  1.01  0.91  0.72  0.49 
qtlkernel  65.36  63.16  61.99  56.50  46.58  32.43  1.04  1.02  1.04  0.95  0.76  0.52  
qtlboot  60.15  57.80  57.52  55.54  51.06  46.67  0.91  0.89  0.90  0.87  0.81  0.77  
kernelboot  63.89  61.67  60.90  58.78  54.62  50.24  0.99  0.97  0.97  0.94  0.88  0.84  
cltoriginal  64.49  59.31  56.29  51.00  44.63  40.45  1.00  0.91  0.87  0.81  0.75  0.71  
clttdist  65.12  59.64  56.94  51.68  45.07  40.71  1.01  0.92  0.88  0.82  0.75  0.71  
longheavy  qtloriginal  59.02  55.99  54.33  48.08  37.93  24.88  0.74  0.70  0.69  0.61  0.47  0.31 
qtlkernel  61.52  58.50  56.04  49.65  39.27  25.96  0.78  0.74  0.71  0.62  0.48  0.32  
qtlboot  55.02  51.52  49.55  45.38  39.54  33.50  0.65  0.60  0.56  0.50  0.44  0.39  
kernelboot  58.55  55.03  52.49  48.24  41.97  35.92  0.70  0.65  0.60  0.54  0.47  0.42  
cltoriginal  57.44  49.86  45.48  39.22  32.53  27.15  0.66  0.56  0.50  0.42  0.36  0.32  
clttdist  64.32  57.08  51.97  45.33  38.57  32.41  0.79  0.67  0.59  0.50  0.43  0.38 
We focus on the evaluation under the longest horizon \(m=130\).
Scenario (i) When \(m=130\), the original qtl covers the future realizations in only around \(48\%\) of cases, while the nominal coverage is \(90\%\). Employing the kernel quantile adjustment on qtl increases this number by 4 percentage points (pp), and combining it with the bootstrap-based adjustment yields an additional 26pp on top. Intuitively, using Student's t quantiles (instead of normal ones) leads to a higher coverage probability for the clt. As expected, the two methods perform similarly well in this particular scenario.
Scenario (ii) Long memory in the DGP has a strongly negative impact on both methods. The combined kernel-bootstrap adjustment increases the coverage of qtl by 20pp, which is, however, still very low. The same holds for the performance of clt under the t-quantile adjustment.
Scenario (iii) Heavy-tailed noise also has a negative impact on the original clt (its coverage probability falls by 13pp compared to the light-tailed case), whereas qtl, as expected, is more robust (falling by 4–6pp). The kernel-bootstrap adjustment increases the coverage probability by 27pp for the qtl, whereas clttdist yields only a negligible improvement over the original clt.
Scenario (iv) The combined effect of (ii) and (iii) cuts the coverage probabilities further, below \(45\%\). The proposed adjustments increase the coverage probabilities by up to 10pp.
Overall, for the short and medium horizons, i.e., \(m=20,\ldots ,60\), we corroborate the conclusion of Zhou et al. (2010) that the (original) clt loses against the (original) qtl. However, both original methods exhibit a rapid decay in their coverage probabilities as the forecasting horizon grows. For instance, in scenario (iv) the gap between horizons \(m=20\) and \(m=130\) for the qtl is 47pp. Concerning the width of the PIs, both the adjusted and original methods underestimate the dispersion, and the gap between the width of the PIs and the width of the empirical interquantile range increases with the horizon. However, our computational adjustments improve the original methods consistently across all scenarios. The improvement is most remarkable for the combined adjustment (kernelboot).
3 Forecast comparison with long financial time series
 (spret)
S&P 500 value-weighted daily returns including dividends, available from January 2, 1926, till December 31, 2014, with a total of 23,535 observations,
 (spret2)
squared daily returns over the same period, and
 (tb3m)
nominal interest rates for 3month US Treasury Bills available from April 1, 1954, till August 13, 2015, with a total of 15,396 observations.
Real-data results: comparison of coverage probabilities for zxw, mw and prr on each of the three daily time series: spret, spret2, tb3m
Coverage probability \(\hat{p}\)  Relative median width \(\hat{w}\)  

Horizon (days)  20  30  40  60  90  130  20  30  40  60  90  130  
(A) Results of POOS forecasting experiment for nominal coverage probability \(p=1-\alpha =90\%\)  
S&P 500 returns  kernelboot  89.94  89.16  88.30  88.89  87.60  81.56  0.87  0.94  0.91  0.89  0.87  0.77 
clttdist  87.70  87.35  85.89  85.53  83.72  78.21  0.81  0.85  0.84  0.81  0.79  0.70  
robust  92.00  94.58  94.32  94.83  96.90  96.09  1.06  1.37  1.22  1.46  1.57  1.43  
bayes  87.96  92.39  90.71  93.28  92.64  93.30  0.92  1.19  1.02  1.21  1.29  1.19  
naive  86.07  90.19  88.47  92.25  92.25  84.92  0.83  1.07  0.90  1.06  1.09  0.97  
seriesanlt  85.21  84.52  84.51  81.14  79.84  75.98  0.79  0.84  0.83  0.80  0.78  0.69  
seriesboot  85.12  85.16  83.30  81.40  79.84  73.74  0.77  0.83  0.82  0.78  0.77  0.68  
4castanlt  83.32  80.39  81.41  77.78  68.22  68.72  0.88  0.90  0.85  0.77  0.69  0.57  
4castboot  82.46  80.26  81.76  72.09  67.44  62.57  0.86  0.88  0.84  0.77  0.70  0.57  
S&P 500 \(\text {returns}^2\)  kernelboot  92.61  91.87  91.05  90.44  90.31  87.15  0.39  0.37  0.36  0.39  0.36  0.31 
clttdist  91.49  93.03  92.08  91.99  91.86  89.39  0.43  0.43  0.41  0.46  0.43  0.37  
robust  89.51  89.42  87.78  87.60  87.21  86.03  0.35  0.34  0.30  0.33  0.29  0.26  
bayes  88.39  88.26  85.20  86.30  84.50  83.24  0.32  0.32  0.27  0.31  0.28  0.24  
naive  87.79  88.77  81.76  83.46  78.29  68.16  0.30  0.30  0.21  0.24  0.19  0.13  
seriesanlt  41.44  33.55  28.92  23.77  16.67  13.97  0.10  0.08  0.07  0.07  0.06  0.04  
seriesboot  75.00  75.00  70.89  67.00  62.69  52.27  0.32  0.27  0.21  0.31  0.25  0.06  
4castanlt  72.74  70.45  66.95  59.59  57.75  44.13  0.19  0.18  0.14  0.14  0.12  0.09  
4castboot  78.66  77.18  70.78  65.08  58.67  51.67  0.23  0.17  0.24  0.13  0.23  0.10  
TB3M interest rate  kernelboot  89.29  88.69  85.45  85.32  82.74  75.86  0.05  0.06  0.07  0.08  0.11  0.11 
clttdist  92.33  92.46  89.95  89.68  86.31  81.90  0.06  0.07  0.08  0.09  0.12  0.13  
robust  90.34  85.91  86.24  80.16  73.21  72.41  0.08  0.08  0.09  0.09  0.11  0.12  
bayes  90.34  85.52  85.45  79.37  73.21  72.41  0.08  0.08  0.09  0.09  0.11  0.12  
naive  64.42  61.11  45.50  42.86  32.14  25.00  0.15  0.15  0.11  0.11  0.09  0.07  
seriesanlt  84.13  81.94  79.10  75.40  76.19  68.10  0.05  0.06  0.07  0.08  0.11  0.13  
seriesboot  63.93  68.38  60.49  64.41  57.89  47.06  0.06  0.06  0.08  0.08  0.08  0.10  
4castanlt  83.07  80.95  71.69  61.11  52.98  36.21  0.05  0.06  0.07  0.07  0.09  0.09  
4castboot  42.46  41.67  41.80  47.22  45.83  45.69  0.08  0.10  0.10  0.13  0.15  0.16  
(B) Results of POOS forecasting experiment for nominal coverage probability \(p=1-\alpha =67\%\)  
S&P 500 returns  kernelboot  67.50  68.52  67.30  66.67  63.95  61.45  1.01  0.97  1.01  1.00  0.96  0.87 
clttdist  65.43  65.03  64.37  61.50  58.91  60.34  0.94  0.89  0.92  0.92  0.88  0.80  
robust  71.28  79.61  80.21  84.75  84.11  82.12  1.16  1.34  1.37  1.70  1.83  1.69  
bayes  66.38  74.32  68.50  73.13  75.19  72.63  1.03  1.18  1.05  1.27  1.28  1.17  
naive  63.89  72.65  64.20  70.03  70.93  68.16  0.93  1.07  0.95  1.16  1.17  1.08  
seriesanlt  60.19  59.74  58.69  56.59  52.33  58.10  0.82  0.79  0.83  0.82  0.80  0.73  
seriesboot  62.51  61.03  61.45  57.11  53.49  57.54  0.88  0.83  0.88  0.87  0.85  0.77  
4castanlt  56.92  55.74  56.80  53.49  49.61  45.25  0.96  0.87  0.87  0.83  0.74  0.61  
4castboot  59.07  55.61  56.28  51.16  46.90  38.55  0.99  0.91  0.90  0.86  0.77  0.65  
S&P 500 \(\text {returns}^2\)  kernelboot  76.61  78.45  78.31  79.33  78.68  77.65  0.60  0.64  0.63  0.63  0.59  0.63 
clttdist  81.77  81.29  82.44  82.43  82.17  79.89  0.78  0.80  0.78  0.77  0.71  0.72  
robust  75.84  77.03  74.35  74.94  71.32  66.48  0.61  0.65  0.58  0.59  0.52  0.49  
bayes  73.95  74.58  70.91  70.28  67.44  62.57  0.58  0.60  0.51  0.52  0.44  0.42  
naive  71.97  73.42  57.31  60.98  51.94  44.69  0.55  0.57  0.41  0.41  0.32  0.27  
seriesanlt  21.32  14.97  12.74  8.79  6.59  6.15  0.14  0.12  0.11  0.09  0.07  0.07  
seriesboot  50.31  46.82  48.73  45.00  32.84  31.82  0.36  0.35  0.31  0.38  0.26  0.13  
4castanlt  45.06  41.55  39.41  36.01  33.72  25.14  0.28  0.26  0.21  0.18  0.16  0.13  
4castboot  58.84  55.19  51.95  42.06  36.00  31.67  0.50  0.39  0.36  0.36  0.25  0.29  
TB3M interest rate  kernelboot  69.18  67.86  66.93  61.11  57.14  59.48  0.05  0.06  0.07  0.08  0.10  0.12 
clttdist  76.32  74.40  71.16  67.86  63.10  62.93  0.06  0.07  0.08  0.09  0.11  0.14  
robust  71.83  65.87  67.46  56.75  50.00  56.03  0.08  0.08  0.09  0.09  0.11  0.12  
bayes  71.56  65.67  66.93  56.75  50.00  56.03  0.08  0.08  0.09  0.09  0.11  0.12  
naive  34.39  32.54  22.49  18.65  18.45  13.79  0.14  0.14  0.11  0.10  0.08  0.07  
seriesanlt  57.67  59.13  55.03  55.16  51.19  51.72  0.04  0.05  0.06  0.07  0.09  0.11  
seriesboot  48.09  50.43  49.38  49.15  28.95  29.41  0.05  0.06  0.06  0.09  0.10  0.09  
4castanlt  55.82  56.55  49.47  42.06  28.57  20.69  0.04  0.05  0.06  0.07  0.08  0.09  
4castboot  37.17  35.12  34.66  40.48  35.12  36.21  0.08  0.10  0.10  0.13  0.15  0.17 
POOS results As in Sect. 2.2.2, we evaluate the coverage probability (2.20) and the relative median width (2.21) for nominal coverage probabilities of \(90\%\) (see Table 2A) and \(67\%\) (see Table 2B). Overall, the results show a tight competition between mw and zxw. Better coverage probability is generally compensated by a larger width, hence less precision. Only for tb3m does zxw perform better in both aspects. The prr PIs show mixed performance, and it is difficult to draw any general conclusion on whether one should prefer averaging the series (series) or averaging the forecasts (4cast), and whether to use analytic formulas (anlt) rather than bootstrapping (boot) to obtain PIs. We keep our focus on the coverage yielded for the longer horizons.
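A generic POOS loop of this kind can be sketched as follows. The exact definitions (2.20) and (2.21) are not reproduced in this excerpt, so the sketch assumes that coverage is the hit rate over rolling forecast origins and that the median PI width is normalized by the empirical 5–95% interquantile range of the realized averages; the paper's normalization may differ in detail, and the function names are ours.

```python
import numpy as np

def poos_evaluation(y, m, pi_func, window):
    """Pseudo-out-of-sample evaluation: at each origin s, pi_func builds a PI
    for the mean of y_{s+1..s+m} from the preceding `window` observations.
    Returns (empirical coverage, median PI width relative to the empirical
    5-95% interquantile range of the realized m-averages -- our stand-in
    for the normalization in (2.21))."""
    y = np.asarray(y, float)
    hits, widths, realized = [], [], []
    for s in range(window, len(y) - m + 1, m):   # non-overlapping targets
        lo, hi = pi_func(y[s - window:s], m)
        target = y[s:s + m].mean()
        hits.append(lo <= target <= hi)
        widths.append(hi - lo)
        realized.append(target)
    iq = np.quantile(realized, 0.95) - np.quantile(realized, 0.05)
    return float(np.mean(hits)), float(np.median(widths) / iq)
```

Any of the PI constructors discussed above can be plugged in as `pi_func`, which is how a table of coverage/width pairs like Table 2 is assembled.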
spret Based on the simulation results for short-memory and heavy-tailed series, we expect both methods of zxw to give decent coverage probability close to the nominal level. The real-data performance is better than the artificial data suggested, with an average drop of 9 resp. 12pp below the nominal level for kernelboot resp. clttdist. On the other hand, mw exceed the nominal coverage even with the naive method. The difference in coverage probability between robust and kernelboot reaches 15pp. The zxw PIs provide an advantage regarding the width, as robust has twice the width of kernelboot for \(m=130\). The prr gives decent coverage only for short horizons; for medium and long horizons, both the coverage and the width of prr exhibit a rapid decay. Averaging series dominates averaging forecasts by 10pp. To our surprise, the bootstrap PIs do not outperform the analytic PIs. Regarding the width, the mw PIs are 40pp more conservative than the empirical interquantile range of the out-of-sample mean returns, whereas zxw resp. prr are 23–30pp resp. 31–43pp below the interquantile range width.
spret2 Realized volatility is known for its persistence and heavy tails. The mw methods give slightly lower coverage probabilities than zxw, compensated by a relatively smaller width and thus better precision. With a growing horizon, all prr methods suffer a drop in coverage of at least 40pp below the nominal level, accompanied by the largest reduction of width among all methods. The bootstrap PIs dominate the analytic ones, and the competition between 4cast and series is tight. Concerning the relative width, all methods provide very narrow PIs compared to the previous case of returns. We believe that this seemingly shrinking width of the PIs is caused by a larger dispersion of the entire spret2 series [entering the denominator of (2.21)] compared to the dispersion of each local average [the numerator of (2.21)]. Note that for spret2 the denominator of (2.21) does not provide an adequate scale, and the discrepancy becomes even worse for the more persistent tb3m.
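This scaling issue can be illustrated numerically: for a persistent series, local m-averages retain a much larger share of the full series' dispersion, so dividing PI widths by a full-series dispersion measure makes them look deceptively narrow. The following sketch is our own illustration, not taken from the paper; it compares an i.i.d. series with an AR(1) with \(\phi = 0.9\).

```python
import numpy as np

rng = np.random.default_rng(0)

def ar1(n, phi, rng):
    """Simulate an AR(1) series with standard normal innovations."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x

def iqr_ratio(x, m, p=0.90):
    """Dispersion of non-overlapping m-block averages relative to the
    dispersion of the series itself -- the numerator/denominator
    mismatch discussed for (2.21)."""
    blocks = x[: len(x) // m * m].reshape(-1, m).mean(axis=1)
    iqr = lambda v: np.subtract(*np.quantile(v, [(1 + p) / 2, (1 - p) / 2]))
    return iqr(blocks) / iqr(x)

x_iid = ar1(20000, 0.0, rng)        # no persistence
x_per = ar1(20000, 0.9, rng)        # strongly persistent
# block averages of the persistent series retain a far larger share of
# the series' dispersion, so PIs scaled by the full-series IQR look narrow
print(iqr_ratio(x_iid, 100), iqr_ratio(x_per, 100))
```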
tb3m Interest rates exhibit strong persistence, so to enhance the performance of zxw, we again apply differencing, now with \(d=1\). Note that the naive PIs have coverage probabilities as low as \(25\%\). The coverage probabilities of all methods are lower than for the previous two series, but zxw performs better than mw for all horizons. Moreover, zxw gives better results in terms of width. The coverage probability of prr falls far below the nominal level as the horizon grows. Here, series dominates 4cast, and the bootstrap PIs are inferior to the analytic PIs, even though the latter reach only half the nominal coverage.
Except for spret, clttdist gives slightly higher coverage probabilities than kernelboot, at the cost of precision. Eventually, we prefer the kernelboot method and use it in the following section to compute PIs for eight economic time series and S&P 500 returns.
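For readers who wish to experiment, the kernelboot idea, i.e., stationary-bootstrap resampling of the series followed by a kernel quantile estimator applied to the resampled m-averages, can be sketched as follows. This is our simplified reading of the approach, not the authors' replication code; the tuning choices (geometric block parameter `p_geo`, Gaussian weights, bandwidth \(n^{-1/2}\)) are assumptions for illustration.

```python
import numpy as np

def stationary_bootstrap(x, size, p_geo, rng):
    """One stationary-bootstrap resample (Politis and Romano 1994):
    blocks of geometric length, wrapping around the series."""
    n = len(x)
    out = np.empty(size)
    i = rng.integers(n)
    for t in range(size):
        out[t] = x[i % n]
        # start a new block with probability p_geo, else continue block
        i = rng.integers(n) if rng.random() < p_geo else i + 1
    return out

def kernel_quantile(sample, q, bandwidth=None):
    """Kernel quantile estimator: Gaussian-weighted average of order
    statistics, in the spirit of Sheather and Marron (1990)."""
    x = np.sort(sample)
    n = len(x)
    h = bandwidth if bandwidth is not None else n ** (-0.5)
    u = (np.arange(1, n + 1) - 0.5) / n
    w = np.exp(-0.5 * ((u - q) / h) ** 2)
    return np.sum(w * x) / np.sum(w)

def kernelboot_pi(x, m, p=0.90, n_boot=2000, p_geo=0.1, rng=None):
    """Sketch of a kernelboot-style PI for the future m-average:
    bootstrap the distribution of m-averages, read off kernel quantiles."""
    rng = rng or np.random.default_rng(0)
    means = np.array([stationary_bootstrap(x, m, p_geo, rng).mean()
                      for _ in range(n_boot)])
    a = (1 - p) / 2
    return kernel_quantile(means, a), kernel_quantile(means, 1 - a)
```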
4 Prediction intervals for economic series’ growth rates and S&P 500 returns
Prediction intervals for long-run averages of quarterly post-WWII growth rates
\(\hat{p}=67\%\)  \(\hat{p}=90\%\)  

Horizon (years)  10  25  50  10  25  50 
GDP/Pop  [\(-\)0.88, 4.65]  [\(-\)1.11, 4.76]  [\(-\)0.87, 4.68]  [\(-\)2.97, 6.78]  [\(-\)2.96, 6.59]  [\(-\)2.86, 6.48]
Cons/Pop  [0.56, 3.45]  [0.60, 3.45]  [0.59, 3.49]  [\(-\)0.54, 4.53]  [\(-\)0.43, 4.38]  [\(-\)0.40, 4.55]
TF prod.  [\(-\)0.46, 2.92]  [\(-\)0.37, 2.96]  [\(-\)0.40, 2.76]  [\(-\)1.61, 4.16]  [\(-\)1.55, 4.02]  [\(-\)1.49, 3.83]
Labor prod.  [0.89, 3.42]  [0.84, 3.24]  [0.90, 3.37]  [\(-\)0.11, 4.35]  [0.06, 4.11]  [0.08, 4.15]
Population  [0.44, 0.95]  [0.25, 1.00]  [\(-\)0.11, 0.90]  [0.24, 1.17]  [\(-\)0.06, 1.35]  [\(-\)0.50, 1.35]
PCE infl.  [\(-\)4.06, 2.32]  [\(-\)6.01, 3.83]  [\(-\)9.50, 5.15]  [\(-\)7.39, 4.70]  [\(-\)9.74, 7.40]  [\(-\)14.38, 9.86]
CPI infl.  [\(-\)4.75, 1.61]  [\(-\)6.32, 2.04]  [\(-\)9.43, 3.29]  [\(-\)9.00, 4.03]  [\(-\)10.54, 6.57]  [\(-\)14.95, 7.46]
Jap. CPI infl.  [\(-\)5.20, 2.79]  [\(-\)7.12, 4.18]  [\(-\)8.72, 5.85]  [\(-\)8.10, 7.63]  [\(-\)11.38, 10.07]  [\(-\)14.51, 12.19]
Returns  [2.20, 12.20]  [3.50, 10.75]  [4.78, 10.22]  [\(-\)1.93, 15.69]  [0.88, 12.95]  [2.90, 11.71]
Prediction intervals for long-run averages of annual growth rates and annual S&P 500 returns
Horizon (years)  10  25  50  75  

\(67\%\)  GDP/Pop  [\(-\)1.43, 5.61]  [\(-\)1.59, 5.68]  [\(-\)1.85, 5.65]  [\(-\)1.72, 5.36]
Cons/Pop  [\(-\)1.07, 4.27]  [\(-\)1.15, 4.41]  [\(-\)0.96, 4.33]  [\(-\)1.08, 4.26]
Population  [0.33, 0.99]  [0.08, 1.11]  [\(-\)0.21, 1.16]  [\(-\)0.54, 1.15]
CPI infl.  [\(-\)2.72, 6.02]  [\(-\)2.80, 6.21]  [\(-\)3.19, 6.69]  [\(-\)5.27, 9.46]
Returns  [0.38, 13.61]  [3.74, 10.68]  [3.60, 9.67]  [4.44, 8.29]
\(90\%\)  GDP/Pop  [\(-\)5.00, 8.44]  [\(-\)4.30, 8.47]  [\(-\)4.92, 8.24]  [\(-\)4.49, 7.96]
Cons/Pop  [\(-\)3.12, 6.21]  [\(-\)3.03, 6.22]  [\(-\)2.80, 6.03]  [\(-\)2.90, 6.27]
Population  [0.13, 1.23]  [\(-\)0.24, 1.51]  [\(-\)0.63, 1.74]  [\(-\)1.13, 1.81]
CPI infl.  [\(-\)6.02, 12.65]  [\(-\)9.00, 12.13]  [\(-\)8.13, 12.87]  [\(-\)11.26, 16.13]
Returns  [\(-\)3.64, 17.50]  [0.45, 12.62]  [1.61, 11.77]  [2.82, 9.49]
The eight economic time series are real per capita GDP, real per capita consumption expenditures, total factor productivity, labor productivity, population, inflation (PCE^{7}), inflation (CPI^{8}) and Japanese inflation (CPI)—all transformed into log-differences. (Plots are given in “Appendix A.”) The data are available from 1Q1947 till 4Q2014, and we forecast them over the next \(m=10, 25\) and 50 years. For a subset of these series, we report results based on a longer (yearly) sample starting in 1Q1920, and for these yearly series we add the horizon \(m=75\) years.
For per capita real GDP, per capita consumption and productivity, we use differencing with \(d=0.5\) for the kernelboot PIs. Thus, these intervals are wider than in Müller and Watson (2016), especially those for GDP. This case is similar to that of realized volatility in the previous section. Wide PIs are often considered a failure of the forecasting method or model; on the other hand, they can also reflect higher uncertainty about the series' future. The width of the PIs for GDP is not surprising given that, similarly to the CBO, we do not account for possible anti-persistence during the Great Moderation. With the longer yearly sample, our PIs get even wider, as a result of the higher volatility in the early twentieth century. Interestingly, the growth in labor productivity seems to be generally higher than reported by Müller and Watson (2016).
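The fractional-differencing pre-filter with \(d=0.5\) can be implemented via the binomial expansion \((1-B)^d = \sum_k \pi_k B^k\) with \(\pi_0 = 1\) and \(\pi_k = \pi_{k-1}(k-1-d)/k\). The sketch below is a generic implementation with a truncation threshold of our choosing, not the paper's code; for \(d=1\) it reduces to ordinary first differencing.

```python
import numpy as np

def frac_diff(x, d, thresh=1e-5):
    """Fractionally difference x by order d using the truncated
    binomial expansion of (1-B)^d.  Weights are cut off once they
    fall below `thresh` in absolute value (an assumed tolerance)."""
    w = [1.0]
    for k in range(1, len(x)):
        w.append(w[-1] * (k - 1 - d) / k)
        if abs(w[-1]) < thresh:
            break
    w = np.array(w)
    # x_t^(d) = sum_k w_k x_{t-k}; drop the burn-in where the filter
    # would reach before the start of the sample
    return np.array([w @ x[t - len(w) + 1: t + 1][::-1]
                     for t in range(len(w) - 1, len(x))])
```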
Consumption, population and inflation are well known to be quite persistent. Therefore, we would expect that, as in the case of interest rates, kernelboot could give better coverage and possibly narrower PIs than robust. The uncertainty is similarly large according to both our kernelboot and robust, but the location is generally shifted downward, especially for inflation, where the shift is about \(2 \text { pp}\) compared to Müller and Watson (2016).
Finally, for the quarterly returns, we might expect kernelboot to give less conservative, and thus narrower, estimates, and we see this happening, with the discrepancy growing with the forecasting horizon. It is clear that robust is very conservative about the uncertainty in positive returns, where the difference from kernelboot reaches \(11 \text { pp}\). Employing the longer yearly time series makes the difference fall to \(3 \text { pp}\), which is still a lot from an investor's perspective.
5 Discussion
We have constructed prediction intervals for univariate economic time series. Our forecasting comparison shows that even the simple methods of Zhou et al. (2010) provide a valid alternative to the sophisticated prediction intervals designed specifically for the economic framework by Müller and Watson (2016). However, based on our simulation results, we emphasize that both the methods and the series need to be suitably adjusted, especially under the small-sample constraint, which is quite common in practice. Based on the comparison results, we eventually provided alternative long-run prediction intervals for eight US economic indicators.
Forecasting the average growth of economic series over the coming decades is a very ambitious task, and naturally, there are doubts about its usefulness in practice. The test of Breitung and Knüppel (2018) of whether a forecast is informative is based on the prediction error variance. They conclude that economic forecasts become uninformative beyond a horizon of several quarters. At first sight, such a claim seems to be an argument against following the research of Müller and Watson (2016) and Zhou et al. (2010). However, there are differences in the assumptions and targets which have to be carefully analyzed before making such statements. The assumption of a long-memory component is crucial, and it is hard to verify and to distinguish from a possible structural break. In this paper, we did not tackle the issue of whether long-term predictions are informative or not; we instead probed into the existing methods and provided new empirical comparison results.
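The intuition behind such an informativeness test can be made concrete: a forecast carries information only if its mean squared forecast error beats that of the unconditional mean. The snippet below is a deliberately simplified illustration of this idea, not the formal test statistic of Breitung and Knüppel (2018).

```python
import numpy as np

def informativeness_ratio(forecasts, outcomes):
    """Compare the forecast's MSFE with the MSFE of the naive
    unconditional-mean forecast.  A ratio near (or above) 1 flags an
    uninformative forecast.  Illustration only, not the formal test."""
    forecasts = np.asarray(forecasts, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    msfe = np.mean((outcomes - forecasts) ** 2)
    msfe_naive = np.mean((outcomes - outcomes.mean()) ** 2)
    return msfe / msfe_naive
```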
Throughout this paper, we focused on PIs estimated from historical data on the predicted series. A multivariate or high-dimensional extension would, of course, be attractive. It is widely recognized that big data contain additional forecasting power. Unfortunately, in the economic literature, the boom of forecasting with many predictors (e.g., Stock and Watson 2012; Elliott et al. 2013; Kim and Swanson 2014) focuses mainly on short horizons and point forecasting (for an exception, see Bai and Ng 2006). This is not a coincidence. Many economic time series exhibit persistence (of varying degrees), and this is their essential property in the long run. These long-term effects, combined over many series, are difficult to understand, partly due to their dependence on unknown nuisance parameters (see Elliott et al. 2015). The role of cointegration in long-run forecasting is investigated by Christoffersen and Diebold (1998).
We did not consider some other methods, such as quantile (auto)regression (Koenker 2005), in the current study, and the out-of-sample forecasting comparison could be enhanced by statistical tests (see, e.g., Clements and Taylor 2003; Gneiting and Raftery 2007).
An extension (including the theory) of Zhou et al. (2010) to a high-dimensional regression framework using the LASSO estimator is currently being developed. Even more challenging is the case of multivariate target series and the subsequent construction of simultaneous prediction intervals, which can have interesting implications for market trading strategies.
Footnotes
 1.
Available from https://www.cbo.gov/publication/52480.
 2.
Available from https://www.govinfo.gov/app/details/GPOFCIC.
 3.
Anti-persistence can be observed as well, often as a result of (over)differencing.
 4.
Instead of the normal quantiles, we utilize Student's t quantiles, where the degrees of freedom are estimated by ML.
 5.
The replication files for these methods are available in MATLAB from M. Watson’s homepage.
 6.
The value of \(\sigma \) was obtained from an AR(1) model fitted to the full data set of S&P 500 returns.
 7.
Personal consumption expenditure deflator.
 8.
Consumer price index.
Acknowledgements
Open access funding provided by University of Vienna. We would like to thank Jörg Breitung, Francis Diebold, Alexander Kment, Lubos Pastor, Irina Pimenova, Justin Veenstra, Mark Watson as well as the participants of the conference: Big Data in Predictive Dynamic Econometric Modeling held at the University of Pennsylvania and the 1st Vienna Workshop on Economic Forecasting 2018 held at the Institute for Advanced Studies for helpful discussion and for answering our questions. Special thanks go to Eric Nesbitt and the two anonymous referees. We also acknowledge the computational resources provided by the Vienna Scientific Cluster. Note that the opinions expressed in the paper are those of the authors and do not necessarily reflect the opinions of the Institute for Financial Policy.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
Funding
The research was partly supported by grant NSF/DMS 1405410. M. Chudý received financial support from J.W. Fulbright Commission for Educational Exchange in the Slovak Republic, The Ministry of Education, Science, Research and Sport of the Slovak Republic and the Stevanovich Center for Financial Mathematics.
References
 Andersen TG, Bollerslev T, Diebold FX, Labys P (2003) Modeling and forecasting realized volatility. Econometrica 71(2):579–625
 Bai J, Ng S (2006) Confidence intervals for diffusion index forecasts and inference for factor-augmented regressions. Econometrica 74:1133–1150
 Baillie RT (1996) Long memory processes and fractional integration in econometrics. J Econ 73:5–59
 Bansal R, Kiku D, Yaron A (2016) Risks for the long run: estimation with time aggregation. J Monet Econ 82:52–69
 Box GE, Jenkins GM, Reinsel GC, Ljung GM (2015) Time series analysis: forecasting and control. Wiley, London
 Breitung J, Knüppel M (2018) How far can we forecast? Statistical tests of the predictive content. Deutsche Bundesbank Discussion Paper No. 07/2018
 Carlstein E (1986) The use of subseries values for estimating the variance of a general statistic from a stationary sequence. Ann Stat 14:1171–1179
 Chatfield C (1993) Calculating interval forecasts. J Bus Econ Stat 11(2):121–135
 Cheng X, Liao Z, Schorfheide F (2016) Shrinkage estimation of high-dimensional factor models with structural instabilities. Rev Econ Stud 83(4):1511–1543
 Christoffersen PF, Diebold FX (1998) Cointegration and long-horizon forecasting. J Bus Econ Stat 16:450–458
 Clements MP, Taylor N (2003) Evaluating interval forecasts of high-frequency financial data. Appl Econ 18:445–456
 Dehling H, Fried R, Shapirov O, Vogel D, Wornowizki M (2013) Estimation of the variance of partial sums of dependent processes. Stat Probab Lett 83(1):141–147
 Diebold FX, Linder P (1996) Fractional integration and interval prediction. Econ Lett 50:305–313
 Diebold FX, Rudebusch GD (1989) Long memory and persistence in aggregate output. J Monet Econ 24:189–209
 Elliott G, Gargano A, Timmermann A (2013) Complete subset regressions. J Econ 177(2):357–373
 Elliott G, Müller UK, Watson MW (2015) Nearly optimal tests when a nuisance parameter is present under the null hypothesis. Econometrica 83(2):771–811
 Falk M (1984) Relative deficiency of kernel type estimators of quantiles. Ann Stat 12(1):261–268
 Falk M (1985) Asymptotic normality of the kernel quantile estimator. Ann Stat 13(1):428–433
 Ghalanos A (2017) rugarch: Univariate GARCH models. R package version 1.38
 Gneiting T, Raftery AE (2007) Strictly proper scoring rules, prediction, and estimation. J Am Stat Assoc 102(477):359–378
 Gonçalves S, de Jong R (2003) Consistency of the stationary bootstrap under weak moment conditions. Econ Lett 81(2):273–278
 Han H, Linton O, Oka T, Whang YJ (2016) The cross-quantilogram: measuring quantile dependence and testing directional predictability between time series. J Econ 193(1):251–270
 Kim H, Swanson N (2014) Forecasting financial and macroeconomic variables using data reduction methods: new empirical evidence. J Econ 178:352–367
 Kitsul Y, Wright J (2013) The economics of options-implied inflation probability density functions. J Financ Econ 110:696–711
 Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge
 Künsch HR (1989) The jackknife and the bootstrap for general stationary observations. Ann Stat 17(3):1217–1241
 Lahiri SN (2013) Resampling methods for dependent data. Springer, Berlin
 Lütkepohl H (2006) Forecasting with VARMA models. In: Elliott G, Granger CWJ, Timmermann A (eds) Handbook of economic forecasting, vol 1. Elsevier B.V., North Holland, pp 287–325
 Marcellino M (1999) Some consequences of temporal aggregation in empirical analysis. J Bus Econ Stat 17(1):129–136
 Müller U, Watson M (2016) Measuring uncertainty about long-run predictions. Rev Econ Stud 83(4):1711–1740
 Pascual L, Romo J, Ruiz E (2004) Bootstrap predictive inference for ARIMA processes. J Time Ser Anal 25(4):449–465
 Pascual L, Romo J, Ruiz E (2006) Bootstrap prediction for returns and volatilities in GARCH models. Comput Stat Data Anal 50(9):2293–2312
 Pastor L, Stambaugh RF (2012) Are stocks really less volatile in the long run? J Finance 67(2):431–478
 Patton A, Politis DN, White H (2009) Correction to “Automatic block-length selection for the dependent bootstrap”. Econ Rev 28(4):372–375
 Politis DN, Romano JP (1994) The stationary bootstrap. J Am Stat Assoc 89:1303–1313
 Politis DN, White H (2004) Automatic block-length selection for the dependent bootstrap. Econ Rev 23(1):53–70
 Sheather SJ, Marron JS (1990) Kernel quantile estimators. J Am Stat Assoc 85:410–416
 Silverman B (1986) Density estimation for statistics and data analysis. Chapman & Hall/CRC, London
 Stock J, Watson M (2012) Generalised shrinkage methods for forecasting using many predictors. J Bus Econ Stat 30(4):482–493
 Stock JH, Watson MW (2005) Understanding changes in international business cycle dynamics. J Eur Econ Assoc 3:968–1006
 Sun S, Lahiri SN (2006) Bootstrapping the sample quantile of a weakly dependent sequence. Sankhyā Indian J Stat 68:130–166
 White H (2000) A reality check for data snooping. Econometrica 68(5):1097–1126
 Wu WB (2005) Nonlinear system theory: another look at dependence. Proc Natl Acad Sci USA 102(40):14150–14154
 Wu WB, Woodroofe M (2004) Martingale approximations for sums of stationary processes. Ann Probab 32(2):1674–1690
 Zhang T, Wu WB (2015) Time-varying nonlinear regression models: nonparametric estimation and model selection. Ann Stat 43(2):741–768
 Zhou Z, Xu Z, Wu WB (2010) Long-term prediction intervals of time series. IEEE Trans Inform Theory 56(3):1436–1446
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.