Forecasting with large datasets: compressing information before, during or after the estimation?


We study the forecasting performance of three alternative approaches to forecasting with large datasets. All three handle the dimensionality problem posed by a large dataset by compressing its informational content, but at different stages of the forecasting process. We consider different factor models, a large Bayesian vector autoregression and model averaging techniques, in which the data compression takes place before, during and after the estimation of the respective forecasting models. Using a quarterly dataset for Germany that consists of 123 variables, we find that, overall, the large Bayesian vector autoregression and the Bayesian factor augmented vector autoregression provide the most precise forecasts for a set of 11 core macroeconomic variables. Further, we find that the performance of these two models is very robust to the exact specification of the forecasting model.



  1.

    Variable selection methods, such as targeted predictors (Bai and Ng 2008), Bayesian variable selection (Korobilis 2013) or the LASSO approach (Tibshirani 1996), are alternative approaches to solving the dimensionality problem.

  2.

    Beyond pure reduced form forecasting models, Wolters (2015) compared the forecasting accuracy of large Bayesian vector autoregressions with that of dynamic stochastic general equilibrium (DSGE) models and the Fed’s Greenbook projections, and Carriero et al. (2015b) compared different time series models, including a Bayesian vector autoregression, with a DSGE model.

  3.

    Two exceptions to this are Müller-Dröge et al. (2014) and Buchen and Wohlrabe (2014), who evaluate the forecasts for a larger set of German core macroeconomic variables as well. However, both papers have a different methodological focus than this paper.

  4.

    The ifo business climate index is based on a monthly survey of about 7000 firms, which report their assessments of the current business situation and their expectations for the next six months. The overall ifo index is calculated from these two assessments. The out-of-sample predictive ability of the ifo index for German GDP has been widely studied; see, for example, Dreger and Schumacher (2005), Kholodilin and Siliverstovs (2006), Abberger (2007), Drechsel and Scheufele (2012) or Henzel and Rast (2013).

  5.

    For a more detailed description of the different forecasting models, we refer the reader to the earlier working paper versions of this paper (see, for example, Pirschel and Wolters 2014).

  6.

    Of course, this approach is merely an ad hoc rule of thumb. Alternatively, \(\lambda \) could also be chosen to maximize the out-of-sample forecasting performance over a pre-sample as, for example, in Litterman (1986). Giannone et al. (2015) suggested a more sophisticated hierarchical approach to specifying \(\lambda \) which relies on maximizing the marginal likelihood, i.e., the density of the data conditional on \(\lambda \) after integrating out the uncertainty about the parameters of the VAR. However, since we find that the forecasting performance of the large BVAR is very robust to the exact specification of \(\lambda \), we stick to the rule of thumb.

  7.

    In line with common practice, we choose the direct version of the autoregressive model because the iterated variant would require the specification of a subsidiary model for the factors in order to compute forecasts for horizons \(h>1\).

  8.

    We also estimate a FAVAR that includes a small set of core variables (including the variable to be predicted) and the factors (see, for example, Bernanke and Boivin 2003; Banbura et al. 2010). The forecasting performance of this alternative, however, is considerably worse, so that we do not include this model in the main results.

  9.

    The model-specific posterior probability \(P(M_{j})\) is calculated in each estimation period t for each forecasting horizon h. For simplicity, however, we omit the respective subscripts.

  10.

    The underlying idea is to account for the linear dependence between the different variables that might simultaneously drive their MSEs and thus inflate the measure of joint predictive ability. In principle, this is comparable to the approach of computing the variance of the sum of several random variables where a correction term accounting for the covariance of the pairs of variables is needed as well.

  11.

    The “Online Appendix” to this paper contains figures showing the forecasts.

  12.

    However, by construction this model can hardly predict a further deepening of the recession. Since the forecast is computed as \(\varDelta gdp_{t+h} = \hat{\alpha }_{h} + \hat{\beta }_{h} \text {ifo}_t\), the coefficient \(\hat{\beta }_{h}\) would need to increase strongly with the forecasting horizon h to predict the further deepening of the recession.
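The direct forecasting equation in footnote 12, \(\varDelta gdp_{t+h} = \hat{\alpha }_{h} + \hat{\beta }_{h} \text {ifo}_t\), and the direct approach described in footnote 7 can be illustrated with a minimal Python sketch. It assumes plain OLS on two equally long quarterly series; the function name `direct_forecast` and its arguments are hypothetical and are not taken from the paper.

```python
import numpy as np

def direct_forecast(dgdp, ifo, h):
    """Direct h-step forecast: regress dgdp[t+h] on a constant and ifo[t].

    Illustrative only; assumes OLS and aligned series of equal length.
    """
    if h < 1:
        raise ValueError("h must be at least 1")
    dgdp = np.asarray(dgdp, dtype=float)
    ifo = np.asarray(ifo, dtype=float)
    # Pair each indicator value ifo[t] with the target observed h quarters later.
    X = np.column_stack([np.ones(len(ifo) - h), ifo[:-h]])
    y = dgdp[h:]
    # OLS estimates of (alpha_h, beta_h); a separate regression is run per horizon.
    alpha_h, beta_h = np.linalg.lstsq(X, y, rcond=None)[0]
    # The forecast uses only the last observed indicator value.
    return alpha_h + beta_h * ifo[-1], alpha_h, beta_h
```

Because each horizon has its own regression, no subsidiary model for the predictor is needed. The forecast for any horizon depends on the same last observation of the indicator, so only the estimated coefficients can move the forecast path across horizons.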


  1. Abberger K (2007) Forecasting quarter-on-quarter changes of German GDP with monthly business tendency survey results, ifo Working Paper No. 40

  2. Bai J, Ng S (2002) Determining the number of factors in approximate factor models. Econometrica 70(1):191–221

  3. Bai J, Ng S (2007) Determining the number of primitive shocks in factor models. J Bus Econ Stat 25(1):52–60

  4. Bai J, Ng S (2008) Forecasting economic time series using targeted predictors. J Econom 146(2):304–317

  5. Banbura M, Giannone D, Reichlin L (2010) Large Bayesian vector autoregressions. J Appl Econom 25(1):71–92

  6. Bates JM, Granger CWJ (1969) The combination of forecasts. Oper Res Q 20(4):451–468

  7. Berg TO, Henzel SR (2015) Point and density forecasts for the Euro area using many predictors: are large BVARs really superior? Int J Forecast 31(4):1067–1095

  8. Bernanke B, Boivin J (2003) Monetary policy in a data-rich environment. J Monet Econ 50(3):525–546

  9. Bernanke B, Boivin J, Eliasz P (2005) Measuring monetary policy: a factor augmented vector autoregressive (FAVAR) approach. Q J Econ 120(1):387–422

  10. Buchen T, Wohlrabe K (2014) Assessing the macroeconomic forecasting performance of boosting: evidence for the United States, the Euro Area, and Germany. J Forecast 33(4):231–242

  11. Carriero A, Kapetanios G, Marcellino M (2011) Forecasting large datasets with Bayesian reduced rank multivariate models. J Appl Econom 26(5):735–761

  12. Carriero A, Clark TE, Marcellino M (2015a) Bayesian VARs: specification choices and forecast accuracy. J Appl Econom 30(1):46–73

  13. Carriero A, Galvao A, Kapetanios G (2015b) A comprehensive evaluation of macroeconomic forecasting methods. Mimeo

  14. Christoffersen PF, Diebold FX (1998) Cointegration and long-horizon forecasting. J Bus Econ Stat 16(4):450–458

  15. Croushore D (2006) Forecasting with real-time macroeconomic data. In: Elliott G, Granger CWJ, Timmermann A (eds) Handbook of economic forecasting, vol 1. Elsevier, Amsterdam

  16. Del Negro M, Schorfheide F (2013) DSGE model-based forecasting. In: Elliott G, Timmermann A (eds) Handbook of economic forecasting, vol 2. Elsevier, Amsterdam

  17. Diron M (2008) Short-term forecasts of euro area real GDP growth: an assessment of real-time performance based on vintage data. J Forecast 27:371–390

  18. Drechsel K, Scheufele R (2012) The performance of short-term forecasts of the German economy before and during the 2008/2009 recession. Int J Forecast 28(2):428–445

  19. Dreger C, Schumacher C (2005) Out-of-sample performance of leading indicators for the German business cycle: single vs. combined forecasts. J Bus Cycle Meas Anal 1:71–87

  20. Faust J, Wright JH (2009) Comparing Greenbook and reduced form forecasts using a large realtime dataset. J Bus Econ Stat 27(4):468–479

  21. Forni M, Hallin M, Lippi M, Reichlin L (2000) The generalized dynamic-factor model: identification and estimation. Rev Econ Stat 82(4):540–554

  22. Forni M, Hallin M, Lippi M, Reichlin L (2003) Do financial variables help forecasting inflation and real activity in the Euro area? J Monet Econ 50(6):1243–1255

  23. Forni M, Hallin M, Lippi M, Reichlin L (2005) The generalized dynamic factor model: one-sided estimation and forecasting. J Am Stat Assoc 100(471):830–840

  24. Giacomini R, White H (2006) Tests of conditional predictive ability. Econometrica 74(6):1545–1578

  25. Giannone D, Reichlin L, Small D (2008) Nowcasting: the real-time informational content of macroeconomic data. J Monet Econ 55(4):665–676

  26. Giannone D, Lenza M, Primiceri GE (2015) Prior selection for vector autoregressions. Rev Econ Stat 97(2):436–451

  27. Heinisch K, Scheufele R (2017) Bottom-up or direct? Forecasting German GDP in a data-rich environment. Empir Econ. doi:10.1007/s00181-016-1218

  28. Henzel S, Rast S (2013) Prognoseeigenschaften von Indikatoren zur Vorhersage des Bruttoinlandsprodukts in Deutschland. ifo Schnelldienst 66(17):39–46

  29. Kadiyala KR, Karlsson S (1997) Numerical methods for estimation and inference in Bayesian VAR-models. J Appl Econom 12(2):99–132

  30. Kholodilin KA, Siliverstovs B (2006) On the forecasting properties of the alternative leading indicators for the German GDP: recent evidence. J Econ Stat 226(3):234–259

  31. Korobilis D (2013) VAR forecasting using Bayesian variable selection. J Appl Econom 28(2):204–230

  32. Kuzin V, Marcellino M, Schumacher C (2013) Pooling versus model selection for nowcasting GDP with many predictors: empirical evidence for six industrialized countries. J Appl Econom 28(3):392–411

  33. Litterman RB (1986) Forecasting with Bayesian vector autoregressions—five years of experience. J Bus Econ Stat 4(1):25–38

  34. Mincer J, Zarnowitz V (1969) The evaluation of economic forecasts. In: Mincer J (ed) Economic forecasts and expectations. NBER, New York

  35. De Mol C, Giannone D, Reichlin L (2008) Forecasting using a large number of predictors: is Bayesian regression a valid alternative to principal components? J Econom 146(2):318–328

  36. Müller-Dröge HC, Sinclair TM, Stekler HO (2014) Evaluating forecasts of a vector of variables: a German forecasting competition, CAMA Working Paper 55, Centre for Applied Macroeconomic Analysis, Crawford School of Public Policy, The Australian National University

  37. Pirschel I, Wolters MH (2014) Forecasting German key macroeconomic variables using large dataset methods. Kiel Working Paper 1925

  38. Schumacher C (2007) Forecasting German GDP using alternative factor models based on large datasets. J Forecast 26(4):271–302

  39. Schumacher C (2010) Factor forecasting using international targeted predictors: the case of German GDP. Econ Lett 107(2):95–98

  40. Schumacher C (2011) Forecasting with factor models estimated on large datasets: a review of the recent literature and evidence for German GDP. J Econ Stat 231(1):28–49

  41. Schumacher C, Dreger C (2004) Estimating large-scale factor models for economic activity in Germany: Do they outperform simpler models? J Econ Stat 224(6):732–750

  42. Sims C, Zha T (1998) Bayesian methods for dynamic multivariate models. Int Econ Rev 39(4):949–968

  43. Stock JH, Watson MW (2002a) Forecasting using principal components from a large number of predictors. J Am Stat Assoc 97(460):1167–1179

  44. Stock JH, Watson MW (2002b) Macroeconomic forecasting using diffusion indexes. J Bus Econ Stat 20(2):147–162

  45. Stock JH, Watson MW (2003) Forecasting output and inflation: the role of asset prices. J Econ Lit 41(3):788–829

  46. Stock JH, Watson MW (2004) Combination forecasts of output growth in a seven-country data set. J Forecast 23(6):405–430

  47. Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B 58(1):267–288

  48. Timmermann A (2006) Forecast combinations. In: Elliott G, Granger CWJ, Timmermann A (eds) Handbook of economic forecasting, vol 1. Elsevier, Amsterdam

  49. Timmermann A, van Dijk HK (2013) Dynamic econometric modeling and forecasting in the presence of instability. J Econom 177(2):131–133

  50. Wolters MH (2015) Evaluating point and density forecasts of DSGE models. J Appl Econom 30(1):74–96

  51. Wright J (2009) Forecasting U.S. inflation by Bayesian model averaging. J Forecast 28(2):131–144


We thank Christian Schumacher for sharing the dataset used in Schumacher (2007) and for useful comments and discussions. We further thank Jens Boysen-Hogrefe, Kai Carstensen, Domenico Giannone, Nils Jannsen, Martin Plödt, Tim Schwarzmüller, Herman Stekler, Klaus Wohlrabe, the editors and two anonymous referees as well as participants of the 2014 International Symposium on Forecasting in Rotterdam, the 2014 Conference on Advances in Applied Macro-Finance and Forecasting in Istanbul, the 2014 CEF annual conference in Oslo, the 2014 annual conference of the Verein für Socialpolitik in Hamburg, the 2013 DIW macroeconometric workshop in Berlin and the 2013 IWH macroeconomic workshop in Halle for useful comments.

Author information



Corresponding author

Correspondence to Inske Pirschel.

Additional information

Disclaimer: The views expressed in this paper are those of the authors and do not necessarily reflect those of the Swiss National Bank.

Electronic supplementary material

Supplementary material 1 (pdf 171 KB)


Appendix A: Real-time performance-based model specification and pooling over various model specifications

Table 6 Multivariate mean-squared forecast errors with IC, PBRT and pooling

The table displays the absolute multivariate mean-squared forecast errors for the alternative model specification approaches in the quasi-real-time exercise. For the performance-based model selection, we evaluate the performance of the various specifications of the different forecasting models over a subevaluation sample of 4 quarters and choose the specification that yields the smallest subsample MSE to estimate the respective model with information up to T and to compute forecasts for \(T + h\). For forecast pooling, we implement two versions: unweighted pooling (the final forecast is obtained by averaging over the various forecasts computed with different specifications) and MSE-weighted pooling (the final forecast is a weighted mean, where the weight is the inverse of the MSE of the respective model specification over a 4-quarter subevaluation sample). All forecasting models are estimated over a rolling window of 60 quarters. The forecasts obtained by the different models are evaluated over the sample ranging from 1997Q3 until 2013Q3; thus, for each horizon a total of 65 forecasts are computed (Table 6).
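The selection and pooling rules described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names and array layout are hypothetical, the inverse-MSE weights are assumed to be normalised so that they sum to one, and all subsample MSEs are assumed to be strictly positive.

```python
import numpy as np

def subsample_mse(past_errors):
    """MSE of each specification over the subevaluation sample.

    past_errors: array of shape (n_specs, n_quarters), e.g. n_quarters = 4.
    """
    return np.mean(np.asarray(past_errors, dtype=float) ** 2, axis=1)

def select_specification(past_errors):
    """Performance-based selection: index of the smallest subsample MSE."""
    return int(np.argmin(subsample_mse(past_errors)))

def pool_forecasts(forecasts, past_errors=None):
    """Pool the forecasts for T + h across specifications of one model.

    Without past_errors the pooling is unweighted; otherwise the weights
    are proportional to the inverse subsample MSE (normalisation assumed).
    """
    forecasts = np.asarray(forecasts, dtype=float)
    if past_errors is None:
        return float(forecasts.mean())
    inv_mse = 1.0 / subsample_mse(past_errors)  # assumes MSE > 0
    weights = inv_mse / inv_mse.sum()
    return float(weights @ forecasts)
```

For example, with two specifications whose subsample MSEs are 1 and 4, the normalised weights are 0.8 and 0.2, so the specification with the better recent track record dominates the pooled forecast without the other being discarded entirely.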

Table 7 Absolute multivariate mean-squared forecast errors with IC, PBC and PBTV

Appendix B: Ex post best performing model specifications

Panels (b) and (c) in Table 7 contain the results for the different models obtained with their ex post best performing specification that is found using full sample information. In particular, with PBTV the evaluation sample is divided into subsamples covering 4 quarters, and for each of these subsamples, we select the specification for each forecasting model and for each forecasting horizon that minimizes the respective subsample MSE. By contrast, with PBC we choose the specification for each model that minimizes the MSE over the whole evaluation sample for each horizon. All forecasting models are estimated over a rolling window of 60 quarters. The forecasts obtained by the different models are evaluated over the sample ranging from 1994Q4 until 2013Q3; thus, for each horizon, a total of 76 forecasts are computed.
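The difference between the two ex post rules can be made concrete with a short Python sketch. It assumes that the squared forecast errors of every specification of one model at one horizon are stored in an array of shape (number of specifications, number of evaluation quarters); the function names and this layout are hypothetical.

```python
import numpy as np

def pbc_choice(sq_errors):
    """PBC: one specification minimising the MSE over the whole sample."""
    sq_errors = np.asarray(sq_errors, dtype=float)
    return int(np.argmin(sq_errors.mean(axis=1)))

def pbtv_choices(sq_errors, block=4):
    """PBTV: best specification within each consecutive 4-quarter subsample."""
    sq_errors = np.asarray(sq_errors, dtype=float)
    n_periods = sq_errors.shape[1]
    choices = []
    for start in range(0, n_periods, block):
        # Subsample MSE of every specification in this block of quarters.
        block_mse = sq_errors[:, start:start + block].mean(axis=1)
        choices.append(int(np.argmin(block_mse)))
    return choices
```

With 76 forecasts per horizon, the evaluation sample splits into 19 subsamples of 4 quarters, so PBTV allows up to 19 different specifications per model and horizon, whereas PBC fixes a single one.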

About this article

Cite this article

Pirschel, I., Wolters, M.H. Forecasting with large datasets: compressing information before, during or after the estimation? Empir Econ 55, 573–596 (2018).


Keywords

  • Large Bayesian VAR
  • Model averaging
  • Factor models
  • Great Recession
  • Ifo business climate index

JEL Classification

  • C53
  • C55
  • E31
  • E32
  • E37
  • E47