Abstract
In this paper, simple regression estimates and factor-based models are used to produce forecasts of Bahrain's quarterly gross domestic product growth. Using simulated out-of-sample experiments, we assess and compare the performance of the simple regression estimates, which exploit the available information on selected indicator variables, with that of factor-based estimates. The latter use up to 65 variables to obtain new factors that embody most of the potential information and handle it in a systematic way following the Stock–Watson approach. Additionally, we compare the performance of the nowcast factor-MIDAS models with that of the quarterly factor (static-SW) models based on time-aggregated data, which neglect the most recent information. Our empirical findings can be summarised as follows. First, using more information does not help to produce more accurate results: preselected indicator variables clearly improve forecast performance in comparison with the use of a large dataset. Second, quarterly factor models are in general outperformed by the nowcast factor models, which directly relate low-frequency data to high-frequency data. Third, the best forecasting performance is reached using simple regression estimates with a handful of variables. However, this model fails the density forecast evaluation tests. Thus, the alternative factor-MIDAS model is considered the optimal model, as it passes all the performance evaluation tests. Fourth, concerning the difference between MIDAS projection methods, the results indicate that MIDAS with exponential distributed lag functions outperforms MIDAS with unrestricted lag polynomials. The best performing projection based on the number of factors is the model with three factors. These factors can pick up the rapid switch in the utility of the indicators automatically.
Finally, although the most accurate Flash estimates are obtained at 84 days, nowcasting by feeding industrial production into a bridge equation incurs only an insignificant loss in accuracy at 54 days.
Notes
Gupta and Kabundi (2011) is an interesting paper that uses large factor models for forecasting macroeconomic variables for the South African economy.
This started only in 2007.
Mazzi et al. (2009) assess the ability of both regression and factor-based approaches using ‘soft’ and ‘hard’ data to nowcast the Euro-area quarterly GDP growth. The performance of the different statistical nowcasting models varies considerably according to which statistical model is used.
Variables are selected based on in-sample correlation with the dependent variable. For more details, see the data section.
Given that these data are available on a monthly basis, we have also downloaded and treated them to facilitate the adoption of factor-MIDAS models.
All indicator variables \(x_{j}\) that enter equation (1) are differenced, if necessary, until stationary.
The number of lags \(p\) has been selected based on the Akaike and Bayesian information criteria (AIC and BIC, respectively).
The extraction of factors that represent the ‘underlying state of the economy’ has a long tradition going back to Burns and Mitchell (1946). An alternative to principal components analysis is to identify and estimate the factors using a parametric model. For example, the state-space approach can be used when the set of indicator variables is quite small (say \(<\) 12); see, e.g., Stock and Watson (1989) and Camba-Mendez et al. (2001).
Boivin and Ng (2005) show that the key difference between these two approaches is that the latter extracts the factors from the unobserved component that is common to all information variables. In doing so, the dynamic principal component method of [FHLR] imposes the factor structure on the forecasting model. However, there is no empirical evidence that the latter method outperforms the former.
It is worth noting that, if GDP is part of the dataset, the Stock and Watson method can consider quarterly data only.
Stock and Watson proposed using the BIC for selecting the optimal number of factors, but under the restriction that \(N \gg T\).
Bai and Ng (2002) derive information criteria to determine the number of static factors \(r\) in (4). These criteria represent the usual trade-off between goodness of fit and overfitting and can be seen as extensions of the familiar Bayes or Akaike criteria. This method does not impose any restriction on the relationship between \(N\) and \(T\).
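As a rough illustration of how such a criterion trades fit against the number of factors, the following Python sketch evaluates one variant from Bai and Ng (2002), \(IC_{p2}(k) = \ln V(k) + k\,\frac{N+T}{NT}\ln (\min \{N,T\})\), where \(V(k)\) is the average squared residual of a \(k\)-factor fit. The choice of this variant, the function name and the \(V(k)\) values are assumptions made for illustration only; the \(V(k)\) values are invented and are not results from this paper. \(N = 65\) and \(T = 55\) match the panel dimensions given in the data appendix (65 series, 1995:Q1 to 2008:Q3).

```python
import math


def bai_ng_icp2(v_by_k, n_series, n_periods):
    """Return IC_p2(k) = ln V(k) + k * penalty for each candidate k.

    v_by_k maps a candidate number of factors k to V(k), the average
    squared residual left after fitting k factors (hypothetical input).
    """
    penalty = ((n_series + n_periods) / (n_series * n_periods)
               * math.log(min(n_series, n_periods)))
    return {k: math.log(v) + k * penalty for k, v in v_by_k.items()}


# Invented residual variances, for illustration only (not from the paper).
illustrative_v = {1: 0.80, 2: 0.55, 3: 0.50, 4: 0.47}
ic = bai_ng_icp2(illustrative_v, n_series=65, n_periods=55)
best_k = min(ic, key=ic.get)  # the k with the smallest criterion value
print(best_k, {k: round(val, 4) for k, val in ic.items()})
```

With these invented inputs the fit keeps improving as factors are added, but beyond two factors the penalty outweighs the gain, so the criterion selects the smaller model.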
Giannone et al. (2008) were the first to obtain factors that summarise a large monthly dataset and plug them into the state-space framework to generate current-quarter nowcasts of GDP.
Andreou et al. (2013) have enhanced the dynamics of the MIDAS model by augmenting it with lags of the dependent variable (quarterly series), factors extracted from an aggregated quarterly dataset, and daily financial variables. They have shown that the nowcasts obtained from this model improve on the traditional forecasts based on aggregated data.
For quarterly output growth and monthly indicators as explanatory variables, \(m = 3\).
For more details, see Marcellino and Schumacher (2010).
Let \(x_{t}= E(y_{t} \mid \varOmega _{t-1})\) be the predictor of \(y_{t}\) formed with respect to the information set \(\varOmega _{t-1}\), with \(n\) observations \((y_{1},x_{1} ),(y_{2},x_{2} ),\ldots ,(y_{n},x_{n})\) available. The test proposed by Pesaran and Timmermann (1992) is based on the proportion of times that the direction of change in \(y_{t}\) is correctly predicted by \(x_{t}\). The test statistic is computed as \(S_{n}= \frac{P - P^*}{\sqrt{V(P) - V(P^*)}}\sim N(0,1)\), where \(P = \bar{Z} = \frac{1}{n}\Sigma _{i=1}^n Z_{i}\), \(P^*=P_{y} P_{x}+(1 - P_{y} )(1-P_{x} )\), \(V(P)= \frac{1}{n} P^* (1 - P^*)\) and \(V(P^*)= \frac{1}{n} [(2P_{y} - 1)^2 P_{x} (1 - P_{x} )+(2 P_{x} - 1)^2 P_{y} (1 - P_{y})+ \frac{4}{n} P_{y} P_{x} (1 - P_{y})(1 - P_{x})]\). \(Z_{i}\) is an indicator variable that takes the value one when the sign of \(y_{i}\) is correctly predicted by \(x_{i}\) and zero otherwise, \(P_{y}\) is the proportion of times \(y_{t}\) takes a positive value and \(P_{x}\) is the proportion of times \(x_{t}\) takes a positive value. The null hypothesis, which states that \(x_{t}\) and \(y_{t}\) are distributed independently, is set against the alternative that \(x_{t}\) and \(y_{t}\) are not statistically independent.
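The statistic can be computed directly from the sign proportions. The Python sketch below is a minimal illustration, assuming the variance expressions of Pesaran and Timmermann (1992), that is \(V(P)=\frac{1}{n}P^*(1-P^*)\) with the three-term expression for \(V(P^*)\); the function name and the toy series are hypothetical and are not taken from the paper.

```python
import math


def pesaran_timmermann(actual, predicted):
    """Directional-accuracy statistic S_n; approximately N(0,1) under the null."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equally long")
    p_y = sum(1 for v in actual if v > 0) / n      # share of positive outcomes
    p_x = sum(1 for v in predicted if v > 0) / n   # share of positive forecasts
    # Z_i = 1 when outcome and forecast fall on the same side of zero.
    p_hat = sum(1 for a, f in zip(actual, predicted) if (a > 0) == (f > 0)) / n
    p_star = p_y * p_x + (1 - p_y) * (1 - p_x)     # expected hit rate under independence
    var_p = p_star * (1 - p_star) / n
    var_p_star = ((2 * p_y - 1) ** 2 * p_x * (1 - p_x)
                  + (2 * p_x - 1) ** 2 * p_y * (1 - p_y)) / n \
        + 4 * p_y * p_x * (1 - p_y) * (1 - p_x) / n ** 2
    return (p_hat - p_star) / math.sqrt(var_p - var_p_star)


# Toy example: seven of eight directions are predicted correctly.
y = [1, -1, 1, -1, 1, -1, 1, -1]
x = [1, -1, 1, -1, 1, -1, 1, 1]
print(round(pesaran_timmermann(y, x), 3))
```

The statistic is undefined when the variance difference is zero, for example when all forecasts have the same sign, so such cases need to be handled separately in practice.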
The Diebold and Mariano (1995) test examines the null hypothesis of equal forecast accuracy of two competing forecasts. It uses a forecast error loss differential \(d_{t}=g(e_t^A )-g(e_t^B)\), which is assumed to be a weakly stationary process with short memory. The main rationale underlying this test is that forecast errors are usually serially correlated. In multi-step forecasting \((h > 1)\), forecast errors are assumed to be at most \((h - 1)\)-dependent. This is a plausible assumption, since two consecutive \(h\)-step-ahead forecasts have \(h-1\) periods of information in common. The Diebold and Mariano (1995) test is a modified \(t\)-test, whereby the modification accounts for the serial correlation of the loss differential. The sample mean \(\bar{d}\) is assumed to be asymptotically normally distributed, \(\sqrt{T}(\bar{d}- \mu )\rightarrow ^d N(0,\gamma _{0} + 2\Sigma _{\tau =1}^{h-1}\gamma _{\tau })\), so that the variance of the sample mean, corrected for serial correlation, is \(V(\bar{d})= \frac{1}{T}(\gamma _{0} + 2\Sigma _{\tau =1}^{h-1}\gamma _{\tau })\), that is, the sum of the variance and the autocovariances up to lag \(h-1\) divided by \(T\), assuming that there is no autocorrelation at lags equal to or greater than \(h\). Here \(T\) denotes the sample size and the autocovariance is given by \(\gamma _{\tau } = \frac{1}{T}\Sigma _{t=\tau +1}^T (d_{t} - \bar{d})(d_{t-\tau } - \bar{d})\). The test statistic \(DM = \bar{d}/\sqrt{V(\bar{d})}\) is asymptotically standard normal under the null hypothesis \(\mu = 0\). Harvey et al. (1997) argued that the DM test can be considerably oversized in small samples, and that this problem becomes more severe as the forecast horizon increases. They thus suggest a modified DM test: \(DM^*= DM\sqrt{\frac{T+1-2h+\frac{h(h-1)}{T}}{T}}\).
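The following Python sketch shows one way to compute the DM statistic and the small-sample correction of Harvey et al. (1997). It assumes a squared-error loss \(g(e)=e^2\) by default, autocovariances scaled by \(1/T\), and a correction factor that multiplies the DM statistic; the function name and the error series are hypothetical and are not taken from the paper.

```python
import math


def diebold_mariano(errors_a, errors_b, h=1, loss=lambda e: e * e):
    """Return (DM, DM*) for two forecast-error series at horizon h."""
    t_obs = len(errors_a)
    if t_obs != len(errors_b) or t_obs <= h:
        raise ValueError("need two equally long series with more than h observations")
    d = [loss(a) - loss(b) for a, b in zip(errors_a, errors_b)]
    d_bar = sum(d) / t_obs

    def gamma(tau):
        # Sample autocovariance of the loss differential at lag tau.
        return sum((d[t] - d_bar) * (d[t - tau] - d_bar)
                   for t in range(tau, t_obs)) / t_obs

    # Variance of the mean, using autocovariances up to lag h - 1.
    var_d_bar = (gamma(0) + 2 * sum(gamma(tau) for tau in range(1, h))) / t_obs
    dm = d_bar / math.sqrt(var_d_bar)
    correction = math.sqrt((t_obs + 1 - 2 * h + h * (h - 1) / t_obs) / t_obs)
    return dm, dm * correction


# Hypothetical one-step-ahead errors of models A and B.
e_a = [1.0, -2.0, 1.5, -1.0, 2.0, -1.5]
e_b = [0.5, -1.0, 1.0, -0.5, 1.0, -1.0]
dm, dm_star = diebold_mariano(e_a, e_b, h=1)
print(round(dm, 3), round(dm_star, 3))
```

A positive statistic indicates that model A has the larger average loss. Because the truncated variance estimate is not guaranteed to be positive for \(h > 1\), a practical implementation would also need to handle that case.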
A density forecast of the realisation of a random variable at some future time is an estimate of the probability distribution of the possible future values of that variable. It thus provides a complete description of the uncertainty associated with a prediction and stands in contrast to a point forecast, which by itself contains no description of the associated uncertainty. For more details on evaluating econometric forecasts, see Clements (2005).
To describe the distribution, \(q_t(z_t)\), of the probability integral transform.
The null of \(i.i.d.\) uniformity is a joint hypothesis. For more details, see Clements (2005).
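To make the probability integral transform concrete, the Python sketch below computes \(z_t = F_t(y_t)\) for Gaussian density forecasts and measures how far the resulting values are from the uniform distribution. The Gaussian forecast densities, the Kolmogorov-type distance and all names are assumptions made for illustration; they are not necessarily the evaluation tools used in the paper.

```python
from statistics import NormalDist


def pit_values(realisations, means, std_devs):
    """z_t = F_t(y_t): forecast CDF evaluated at the realised value."""
    return [NormalDist(mu, sd).cdf(y)
            for y, mu, sd in zip(realisations, means, std_devs)]


def distance_from_uniform(z):
    """Largest gap between the empirical CDF of z and the U(0,1) CDF."""
    ordered = sorted(z)
    n = len(ordered)
    return max(max((i + 1) / n - v, v - i / n) for i, v in enumerate(ordered))


# Hypothetical realisations with the forecast means and standard deviations.
y = [0.4, -0.2, 1.1, 0.7]
mu = [0.5, 0.0, 0.9, 0.6]
sd = [0.5, 0.5, 0.5, 0.5]
z = pit_values(y, mu, sd)
print([round(v, 3) for v in z], round(distance_from_uniform(z), 3))
```

If the forecast densities are correct, the \(z_t\) values should look like independent draws from \(U(0,1)\), which is why uniformity and independence are assessed jointly.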
The null hypothesis of the Ljung–Box test is \(H_{0}\): all autocorrelation coefficients up to lag ‘\(j\)’ are zero, against \(H_{1}\): not all autocorrelation coefficients up to lag ‘\(j\)’ are zero.
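A minimal Python sketch of the Ljung and Box (1978) statistic, \(Q = n(n+2)\Sigma _{k=1}^{j}\hat{\rho }_{k}^2/(n-k)\), is given below. The function name and the example series are hypothetical; under the null, \(Q\) is compared with a \(\chi ^2\) distribution with \(j\) degrees of freedom.

```python
def ljung_box_q(series, max_lag):
    """Ljung-Box Q statistic for autocorrelation up to max_lag."""
    n = len(series)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(v * v for v in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k.
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


# A strongly alternating (negatively autocorrelated) toy series.
q = ljung_box_q([1, -1, 1, -1, 1, -1, 1, -1], max_lag=1)
print(q)
```

For this toy series the statistic exceeds 3.841, the 5% critical value of the \(\chi ^2\) distribution with one degree of freedom, so the null of no autocorrelation at lag one would be rejected.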
Normality tests are used to assess whether a set of data is well modelled by a normal distribution, or to compute how likely it is that an underlying random variable is normally distributed.
Clements and Smith (2000) use density forecast performance to compare linear models with nonlinear forecasting models of output growth and unemployment.
Classical trial and error test results are obtained using only the first 35 observations (1995:Q1–2003:Q3).
In preliminary steps of this investigation, we utilised different combinations of datasets and found that the root mean squared forecast error differs significantly across them, varying from 1.979 to 13.58511 in some cases. Empirical results using these alternatives can be obtained from the authors upon request. However, these models performed worse than the alternatives presented here.
The model has been implemented in Gauss.
The Diebold and Mariano (1995) test has been applied to all the models against the benchmark model and then to the optimal model against the rest of the models.
This is consistent with the findings of Rünstler and Sédillot (2003) for Eurozone GDP growth.
Stock and Watson (2004) find evidence that simple mean combination forecasts (derived from simple indicator regressions augmented with AR terms, with no more than three indicators) outperform dynamic factor model-based forecasts in many cases.
At this point in time, monthly key indicators are available for all three months of the quarter, and therefore the aggregated monthly indicator variables can be used in factor-MIDAS models.
References
Andreou E, Ghysels E, Kourtellos A (2010) Regression models with mixed sampling frequencies. J Econom 158(2):246–261
Andreou E, Ghysels E, Kourtellos A (2013) Should macroeconomic forecasters use daily financial data and how? J Bus Econ Stat 31(2):240–251
Angelini E, Henry J, Mestre R (2001) Diffusion index-based inflation forecasts for the Euro Area. European Central Bank
Angelini E, Bańbura M, Rünstler G (2008) Estimating and forecasting the Euro Area monthly national accounts from a dynamic factor model. Tech. rep., European Central Bank
Armesto MT, Hernández-Murillo R, Owyang MT, Piger J (2009) Measuring the information content of the beige book: a mixed data sampling approach. J Money Credit Bank 41(1):35–55
Artis M, Banerjee A, Marcellino M (2005) Factor forecasts for the UK. J Forecast 24(4):279–298
Baffigi A, Golinelli R, Parigi G (2004) Bridge models to forecast the Euro area GDP. Int J Forecast 20(3):447–460
Bai J, Ng S (2002) Determining the number of factors in approximate factor models. Econometrica 70(1):191–221
Berkowitz J (2001) Testing density forecasts, with applications to risk management. J Bus Econ Stat 19(4):465–474
Bernanke BS, Boivin J (2003) Monetary policy in a data-rich environment. J Monet Econ 50(3):525–546
Boivin J, Ng S (2005) Understanding and comparing factor-based forecasts. Tech. rep., National Bureau of Economic Research
Boivin J, Ng S (2006) Are more data always better for factor analysis? J Econom 132(1):169–194
Burns AF, Mitchell WC (1946) Measuring business cycles. National Bureau of Economic Research, New York
Caggiano G, Kapetanios G, Labhard V (2011) Are more data always better for factor analysis? results for the Euro area, the six largest Euro area countries and the UK. J Forecast 30(8):736–752
Camba-Mendez G, Kapetanios G, Smith R, Weale M (2001) An automatic leading indicator of economic activity: forecasting GDP growth for European countries. Econom J 4(1):S56–S90
Clements MP (2005) Evaluating econometric forecasts of economic and financial variables. Palgrave Macmillan, Basingstoke, Hampshire
Clements MP, Galvão AB (2008) Macroeconomic forecasting with mixed-frequency data: forecasting output growth in the United States. J Bus Econ Stat 26(4):546–554
Clements MP, Galvão AB (2009) Forecasting US output growth using leading indicators: an appraisal using MIDAS models. J Appl Econom 24(7):1187–1206
Clements M, Hendry D (1996) Intercept corrections and structural change. J Appl Econom 11(5):475–494
Clements M, Hendry D (1999) Forecasting non-stationary economic time series: the Zeuthen lectures on economic forecasting
Clements M, Smith J (2000) Evaluating the forecast densities of linear and non-linear models: applications to output growth and unemployment. J Forecast 19(4):255–276
D’Agostino A, Giannone D (2012) Comparing alternative predictors based on large-panel factor models. Oxf Bull Econ Stat 74(2):306–326
Diebold F, Mariano R (1995) Comparing predictive accuracy. J Bus Econ Stat 13(3):253–263
Diebold F, Gunther T, Tay A (1998) Evaluating density forecasts, with applications to financial risk management. Int Econ Rev 39:863–883
Diron M (2008) Short-term forecasts of Euro area real GDP growth: an assessment of real-time performance based on vintage data. J Forecast 27(5):371–390
Doornik J, Hansen H (1994) A practical test for univariate and multivariate normality. Tech. rep., Discussion paper, Nuffield College
Eickmeier S, Ziegler C (2008) How successful are dynamic factor models at forecasting output and inflation? A meta-analytic approach. J Forecast 27(3):237–265
Forni M, Hallin M, Lippi M, Reichlin L (2000) The generalized dynamic-factor model: identification and estimation. Rev Econ Stat 82(4):540–554
Forni M, Hallin M, Lippi M, Reichlin L (2003) Do financial variables help forecasting inflation and real activity in the Euro area? J Monet Econ 50(6):1243–1255
Forni M, Giannone D, Lippi M, Reichlin L (2009) Opening the black box: structural factor models with large cross-sections. Econom Theory 25(5):1319–1347
Foroni C, Marcellino M, Schumacher C (2015) Unrestricted mixed data sampling (MIDAS): MIDAS regressions with unrestricted lag polynomials. J R Stat Soc Ser A (Stat Soc) 178(1):57–82
Ghysels E, Santa-Clara P, Valkanov R (2004) The MIDAS touch: mixed data sampling regression models. Tech. rep., Anderson Graduate School of Management, UCLA
Ghysels E, Santa-Clara P, Valkanov R (2005) There is a risk-return trade-off after all. J Financ Econ 76(3):509–548
Ghysels E, Santa-Clara P, Valkanov R (2006) Predicting volatility: getting the most out of return data sampled at different frequencies. J Econom 131(1):59–95
Ghysels E, Sinko A, Valkanov R (2007) MIDAS regressions: further results and new directions. Econom Rev 26(1):53–90
Giannone D, Reichlin L, Small D (2008) Nowcasting: the real-time informational content of macroeconomic data. J Monet Econ 55(4):665–676
Gosselin M, Tkacz G (2001) Evaluating factor models: an application to forecasting inflation in Canada. Citeseer
Granger CWJ, Pesaran MH (1999) A decision theoretic approach to forecast evaluation. In: Chan WS, Li WK, Tong H (eds) Statistics and finance: an interface. Imperial College Press, London, pp 261–278
Granger CW, Pesaran MH (2000) Economic and statistical measures of forecast accuracy. J Forecast 19(7):537–560
Grasmann P, Keereman F (2001) An indicator-based short-term forecast for quarterly GDP in the Euro area. Tech. rep., Directorate General Economic and Monetary Affairs, European Commission
Gupta R, Kabundi A (2010) Forecasting macroeconomic variables in a small open economy: a comparison between small-and large-scale models. J Forecast 29(1–2):168–185
Gupta R, Kabundi A (2011) A large factor model for forecasting macroeconomic variables in South Africa. Int J Forecast 27(4):1076–1088
Hansson J, Jansson P, Löf M (2005) Business survey data: do they help in forecasting GDP growth? Int J Forecast 21(2):377–389
Harvey AC (1991) Forecasting, structural time series models and the Kalman filter. Cambridge University Press, Cambridge
Harvey D, Leybourne S, Newbold P (1997) Testing the equality of prediction mean squared errors. Int J Forecast 13(2):281–291
Kuzin V, Marcellino M, Schumacher C (2011) MIDAS vs. mixed-frequency VAR: nowcasting GDP in the Euro Area. Int J Forecast 27(2):529–542
Ljung G, Box G (1978) On a measure of lack of fit in time series models. Biometrika 65(2):297
Marcellino M (1999) Some consequences of temporal aggregation in empirical analysis. J Bus Econ Stat 17(1):129–136
Marcellino M, Schumacher C (2010) Factor MIDAS for nowcasting and forecasting with ragged-edge data: a model comparison for German GDP. Oxf Bull Econ Stat 72(4):518–550
Marcellino M, Stock J, Watson M (2003) Macroeconomic forecasting in the Euro area: country specific versus area-wide information. Eur Econ Rev 47(1):1–18
Mazzi G, Mitchell J, Montana G, Mouratidis K, Weale M (2009) The Euro-area recession and nowcasting GDP growth using statistical models. In: International seminar on early warning and business cycle indicators. Scheveningen, The Netherlands
Mitchell J, Wallis KF (2011) Evaluating density forecasts: forecast combinations, model mixtures, calibration and sharpness. J Appl Econom 26(6):1023–1040
Pesaran MH, Skouras S (2002) Decision-based methods for forecast evaluation. In: Clements MP, Hendry DF (eds) A companion to economic forecasting. Blackwell Publishing, pp 241–267
Pesaran M, Timmermann A (1992) A simple nonparametric test of predictive performance. J Bus Econ Stat 10(4):461–465
Rathjens P, Robins R (1993) Forecasting quarterly data using monthly information. J Forecast 12(3–4):321–330
Rosenblatt M (1952) Remarks on a multivariate transformation. Ann Math Stat 23(3):470–472
Rünstler G, Sédillot F (2003) Short-term estimates of Euro area real GDP by means of monthly data. Tech. rep
Schumacher C (2007) Forecasting German GDP using alternative factor models based on large datasets. J Forecast 26(4):271–302
Schumacher C, Dreger C (2004) Estimating large-scale factor models for economic activity in Germany: do they outperform simpler models? J Econ Stat 224(6):731–750
Sédillot F, Pain N (2003) Indicator models of real GDP growth in selected OECD countries. Tech. rep., OECD Publishing
Siliverstovs B, van Dijk D (2003) Forecasting industrial production with linear, nonlinear, and structural change models. Tech. rep., Erasmus School of Economics (ESE)
Stock J, Watson M (1989) New indexes of coincident and leading economic indicators. In: NBER macroeconomics annual 1989, vol 4. MIT Press, Cambridge, pp 351–409
Stock J, Watson M (1998) Diffusion indexes. Tech. rep., National Bureau of Economic Research
Stock J, Watson M (1999) Forecasting inflation. J Monet Econ 44(2):293–335
Stock J, Watson M (2002a) Forecasting using principal components from a large number of predictors. J Am Stat Assoc 97(460):1167–1179
Stock J, Watson M (2002b) Macroeconomic forecasting using diffusion indexes. J Bus Econ Stat 20(2):147–162
Stock J, Watson M (2004) Combination forecasts of output growth in a seven-country data set. J Forecast 23(6):405–430
Stock J, Watson M (2005) An empirical comparison of methods for forecasting using many predictors. Manuscript, Princeton University
Wallis KF (2003) Chi-squared tests of interval and density forecasts, and the Bank of England’s fan charts. Int J Forecast 19(2):165–175
Watson M (2003) Macroeconomic forecasting using many predictors. Econom Soc Monogr 37:87–114
Acknowledgments
We thank Christian Schumacher for helpful comments and for sharing elements of his Matlab codes. We are also very grateful to the coordinating editor, Prof. Robert Kunst, and the two anonymous referees for their very helpful comments and suggestions which substantially improved the original manuscript.
Appendix
1.1 Data description
This appendix describes the panel of time series for the economy of the Kingdom of Bahrain. The whole dataset for Bahrain contains 65 series over the sample period from 1995:Q1 to 2008:Q3. The sources of the time series are the Central Bank of Bahrain (CBB), the Central Information Organization of Bahrain (CIO), the International Energy Agency (IEA), and the International Monetary Fund (IMF).
Since GDP is the reference series, all time series are taken on a quarterly basis to get a better picture of economic activity. Moreover, natural logarithms were taken of all positive time series. Most of the data taken from the above sources are already seasonally adjusted. Following Stock and Watson (2002), stationarity was obtained by appropriately differencing the time series, as the principal component (PC) estimation of the factors requires stationary time series (Tables 8, 9).
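The transformation step can be sketched in Python as follows: take natural logarithms of a positive series, difference once, and standardise the result before principal components are extracted. A single difference and the final standardisation are assumptions made for illustration (the order of differencing actually applied differs by series), and the function names and the sample values are hypothetical.

```python
import math


def log_difference(levels):
    """First difference of the natural log of a strictly positive series."""
    logs = [math.log(v) for v in levels]
    return [later - earlier for earlier, later in zip(logs, logs[1:])]


def standardise(values):
    """Rescale a series to zero mean and unit (population) variance."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


# Hypothetical quarterly levels of a positive indicator.
levels = [100.0, 110.0, 121.0, 127.0, 140.0]
growth = log_difference(levels)
print([round(g, 4) for g in growth])
print([round(s, 3) for s in standardise(growth)])
```

The log difference approximates the quarterly growth rate, so a series growing at a constant 10% per quarter maps to a constant value of about 0.0953.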
Details on variables and transformation required for stationarity are provided below.
Cite this article
Naser, H. Estimating and forecasting Bahrain quarterly GDP growth using simple regression and factor-based methods. Empir Econ 49, 449–479 (2015). https://doi.org/10.1007/s00181-014-0892-9
DOI: https://doi.org/10.1007/s00181-014-0892-9