
Forecasting US real private residential fixed investment using a large number of predictors

Abstract

This paper employs classical bivariate, slab-and-spike variable selection, Bayesian semi-parametric shrinkage, and factor-augmented predictive regression models to forecast US real private residential fixed investment over an out-of-sample period from 1983Q1 to 2005Q4, based on in-sample estimates for 1963Q1–1982Q4. Large-scale (188 macroeconomic series) and small-scale (20 macroeconomic series) versions of the slab-and-spike variable selection, Bayesian semi-parametric shrinkage, and factor-augmented predictive regressions, together with 20 bivariate regression models, capture the influence of fundamentals in forecasting residential investment. We evaluate the ex post out-of-sample forecast performance of the 26 models using the relative average mean square error for one-, two-, four-, and eight-quarter-ahead forecasts and test their significance based on the McCracken (2004; J Econom 140:719–752, 2007) mean-square-error F statistic. We find that, on average, the slab-and-spike variable selection and Bayesian semi-parametric shrinkage models with 188 variables provide the best forecasts among all the models. Finally, we use these two models to predict the relevant turning points of residential investment, via an ex ante forecast exercise from 2006Q1 to 2012Q4. The 188-variable slab-and-spike variable selection and Bayesian semi-parametric shrinkage models forecast the turning points with similar accuracy. Our results suggest that economy-wide factors, in addition to specific housing market variables, prove important when forecasting in the real estate market.



Notes

  1. See, for example, Egebo et al. (1990), Brayton and Tinsley (1996), Edge (2000), McCarthy and Peach (2002), Berger-Thomas and Ellis (2004), Dynan et al. (2006), Fisher and Gervais (2007), Choy et al. (2011).

  2. The list of references to document the choice of these variables is available from the authors. The 20 variables include interest rates (3-month Treasury bill rate, 3TB), real gross domestic product (RGDP), the consumer price index (CPI), the unemployment rate (UNRATE), the labor force participation rate (LFPR), the 30-year conventional mortgage interest rate (MORTG), the business confidence index (BCON), the real house price index (RHP), the money supply (M1), real private consumption expenditure (RPCON), real government consumption expenditure (RGCON), the real change in private inventories (RCPINV), housing starts (HOUST), real non-residential fixed investment (RNRFINV), the Standard & Poor’s stock price index (S&P), retail sales (RSALES), new private housing units authorized by building permit (PERMIT), number of new houses sold (HSOLD), and the months’ supply of housing ratio (HSUPPLY).

  3. The Dirichlet process, or Ferguson distribution, was developed by Ferguson (1973) as a probability distribution over probability distributions (Shotwell and Slatey 2011) rather than over numbers (real numbers, nonnegative integers, etc.). The usual parameterization includes a concentration parameter and a base measure.

  4. The gamma distribution is a two-parameter family of continuous probability distributions on the positive real line, usually parameterized with (1) shape and scale parameters, (2) shape and inverse scale parameters, or (3) shape and mean parameters (SAS 2012). The inverse gamma distribution is a two-parameter family of continuous probability distributions on the positive real line, which is distributed as the reciprocal of a variable distributed according to the gamma distribution (SAS 2012). The beta distribution is a general statistical distribution that relates to the gamma distribution and contains two free parameters, often used as a prior distribution for binomial proportions in Bayesian analysis (Forbes et al. 2010).

  5. The online Technical Appendix of Korobilis (2013b) details the MCMC method.

  6. The Appendix contains a full description of all variables and the relevant stationarity transformations used.

  7. Based on the suggestions of an anonymous referee, we also considered the double-differencing device (DDD) model proposed by Hendry (2006) as a possible benchmark. Since the AR(1) model consistently outperformed the DDD model at all horizons over the out-of-sample period, however, we still chose the former as our benchmark. Complete details comparing the forecasting results of the AR(1) and DDD models are available from the authors.

  8. The MSE-F statistic uses the loss differential as follows: \(\text{MSE-F}=(T-R-h+1)\left( \bar{d}/M\hat{S}E_1 \right)\), where T equals the number of observations in the total sample, R equals the number of observations used to estimate the model from which we calculate the first forecast (i.e., the in-sample portion of T), h equals the forecast horizon, \(\bar{d}=M\hat{S}E_0 -M\hat{S}E_1\), and \(M\hat{S}E_i =(T-R-h+1)^{-1}\sum _{t=R}^{T-h} (u_{i,t+h})^{2}\) for \(i=0, 1\), with \(u_{i,t+h}\) the h-step-ahead forecast error of model i. \(M\hat{S}E_1\) corresponds to the MSE of the unrestricted model (i.e., the model with the relevant macroeconomic predictor variables), and \(M\hat{S}E_0\) corresponds to the MSE of the restricted model (i.e., the AR(1)-benchmark model). A significant MSE-F statistic indicates that the unrestricted model forecasts are statistically more accurate than those of the restricted model. Note, however, that the MSE-F statistic exhibits a non-standard and non-pivotal limiting distribution in the case of nested models and \(h>1\). Hence, we base our inferences on the bootstrap procedure described in detail in Rapach et al. (2005) and Rapach and Wohar (2006).

  9. For the ex post forecasts, we recursively update the parameter estimates of the forecasting equation to generate the multi-step-ahead forecasts, whereas we produce the ex ante multi-step-ahead forecasts from a specific point in time (generally, the end-point of the data available on the predictors; in our case, the ex ante forecasts cover 2006:Q1-2012:Q4) without updating the parameter estimates. The ex post forecasts give an objective statistical method (approach) to choose the best performing models, which, in turn, we use to predict the turning points.

  10. As indicated earlier, the macroeconomic data commonly used by macroeconomists frequently involve highly correlated variables, a feature that the SSVS model does not account for but that the BSS modeling approach incorporates. While, on average, we observe gains from using the large-scale BSS approach over the corresponding SSVS approach, in our case these two models produce statistically similar accuracy in forecasting real private residential fixed investment.

  11. An earlier version of the paper considered an extended out-of-sample horizon covering 1983:Q1-2011:Q2. Based on this out-of-sample period, the SSVS-Large model performed the best, on average, followed by the bivariate predictive regression model consisting of H4SALE. When we used these two models to compare the turning points over the out-of-sample period, we found that the forecast from the H4SALE model was more volatile relative to the SSVS-Large model, and in general, the SSVS-Large model tracked the turning points reasonably well, except during the recent crisis.
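As an illustration of the parameterizations mentioned in Note 4, the following minimal sketch converts between the shape/scale, shape/inverse-scale, and shape/mean forms of the gamma distribution and draws an inverse gamma variate as the reciprocal of a gamma variate. The function names are hypothetical and chosen only for this example; the sketch relies on Python's standard `random.gammavariate`, whose second argument is a scale parameter.

```python
import random

def scale_to_rate(shape, scale):
    """(shape, scale) -> (shape, inverse scale); the inverse scale is 1/scale."""
    return shape, 1.0 / scale

def scale_to_mean(shape, scale):
    """(shape, scale) -> (shape, mean); the gamma mean is shape * scale."""
    return shape, shape * scale

def inverse_gamma_draw(shape, scale, rng):
    """Draw from an inverse gamma with the given shape and scale.

    If X ~ Gamma(shape, inverse scale = scale), then 1/X follows an
    inverse gamma with that shape and scale. random.gammavariate takes
    a *scale* argument, so we pass 1/scale.
    """
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for reproducibility only
    draws = [inverse_gamma_draw(5.0, 4.0, rng) for _ in range(20000)]
    # The theoretical mean is scale / (shape - 1) = 1.0 for shape > 1.
    print(sum(draws) / len(draws))
```

The sample mean should lie close to scale / (shape − 1), which offers a quick sanity check that the reciprocal construction matches the intended parameterization.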
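The MSE-F statistic defined in Note 8 can be computed directly from two series of out-of-sample forecast errors. The sketch below is a minimal illustration, not the authors' code: the function names and the toy error series are assumptions made for the example, and it returns only the statistic itself. Inference would still require the bootstrap critical values discussed in Note 8.

```python
def mse(errors):
    """Mean squared forecast error over the out-of-sample period."""
    return sum(e * e for e in errors) / len(errors)

def relative_mse(restricted_errors, unrestricted_errors):
    """MSE of the unrestricted model relative to the restricted benchmark."""
    return mse(unrestricted_errors) / mse(restricted_errors)

def mse_f(restricted_errors, unrestricted_errors):
    """MSE-F = (T - R - h + 1) * (MSE_0 - MSE_1) / MSE_1.

    Both inputs hold the h-step-ahead forecast errors for t = R, ..., T - h,
    so their common length equals T - R - h + 1.
    """
    if len(restricted_errors) != len(unrestricted_errors):
        raise ValueError("error series must cover the same forecast dates")
    n_forecasts = len(restricted_errors)
    mse_0 = mse(restricted_errors)    # restricted (benchmark) model
    mse_1 = mse(unrestricted_errors)  # unrestricted model
    return n_forecasts * (mse_0 - mse_1) / mse_1

if __name__ == "__main__":
    # Toy forecast errors, invented purely for illustration.
    u0 = [1.0, -2.0, 1.0, 2.0]   # benchmark errors, MSE_0 = 2.5
    u1 = [1.0, -1.0, 1.0, -1.0]  # predictor-model errors, MSE_1 = 1.0
    print(relative_mse(u0, u1))  # 0.4
    print(mse_f(u0, u1))         # 6.0
```

A relative MSE below one and a positive MSE-F both indicate that the unrestricted model has the smaller out-of-sample MSE.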
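The distinction drawn in Note 9 between recursively updated ex post forecasts and ex ante forecasts made from a single origin can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a direct h-step bivariate regression of \(y_{t+h}\) on one predictor \(x_t\) estimated by ordinary least squares, which stands in for the paper's models, and all function names are hypothetical.

```python
def ols(x, y):
    """Intercept and slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def ex_post_recursive_forecasts(x, y, r, h):
    """Expanding-window forecasts: re-estimate before every forecast.

    With t observations in hand (indices 0..t-1), fit y[s+h] on x[s]
    and forecast y[t-1+h]. Yields T - r - h + 1 forecasts.
    """
    forecasts = []
    for t in range(r, len(y) - h + 1):
        intercept, slope = ols(x[: t - h], y[h:t])
        forecasts.append(intercept + slope * x[t - 1])
    return forecasts

def ex_ante_forecasts(x, y, origin, max_h):
    """Forecasts for horizons 1..max_h from one origin.

    Only data through index origin-1 enter the estimates, and the
    estimates are never updated with later observations.
    """
    forecasts = []
    for h in range(1, max_h + 1):
        intercept, slope = ols(x[: origin - h], y[h:origin])
        forecasts.append(intercept + slope * x[origin - 1])
    return forecasts

if __name__ == "__main__":
    # Toy data in which y[t+1] = 2 + 3 * x[t] holds exactly.
    x = [float(i % 4) for i in range(12)]
    y = [0.0] + [2.0 + 3.0 * xi for xi in x[:-1]]
    print(ex_post_recursive_forecasts(x, y, r=6, h=1))
    print(ex_ante_forecasts(x, y, origin=12, max_h=2))
```

In the recursive scheme each new observation enters the estimation sample before the next forecast is made, which is why the number of forecasts equals T − R − h + 1, the scaling factor in the MSE-F statistic of Note 8.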


Author information


Corresponding author

Correspondence to Goodness C. Aye.

Additional information

We would like to thank two anonymous referees and an Associate Editor for many helpful comments. However, any remaining errors are solely ours.

Appendix


See Appendix Table 3.

Table 3 Description of variables


About this article


Cite this article

Aye, G.C., Miller, S.M., Gupta, R. et al. Forecasting US real private residential fixed investment using a large number of predictors. Empir Econ 51, 1557–1580 (2016). https://doi.org/10.1007/s00181-015-1059-z


  • DOI: https://doi.org/10.1007/s00181-015-1059-z

Keywords

  • Private residential investment
  • Predictive regressions
  • Factor-augmented models
  • Bayesian shrinkage
  • Forecasting

JEL Classification

  • C32
  • E22
  • E27