Forecast Evaluation

  • Mingmian Cheng
  • Norman R. Swanson
  • Chun Yao
Part of the Advanced Studies in Theoretical and Applied Econometrics book series (ASTA, volume 52)


The development of new tests and methods for evaluating time series forecasts and forecasting models remains as important today as it has been for the last 50 years. To paraphrase what Sir Clive W.J. Granger (arguably the father of modern-day time series forecasting) said in the 1990s at a conference in Svinkloev, Denmark: “OK, the model looks like an interesting extension, but can it forecast better than existing models?” Indeed, the forecast evaluation literature continues to expand, with interesting new tests and methods being developed at a rapid pace. In this chapter, we discuss a selected group of predictive accuracy tests and model selection methods that have been developed in recent years and that are now widely used in the forecasting literature. We begin by reviewing several tests for comparing the relative forecast accuracy of different models in the case of point forecasts. We then broaden the scope of our discussion by introducing density-based predictive accuracy tests. Finally, we note that predictive accuracy is typically assessed in terms of a given loss function, such as mean squared forecast error or mean absolute forecast error. Most tests, including those discussed here, are consequently loss function dependent, so the relative forecast superiority of predictive models also depends on the loss function that is specified. In light of this fact, we conclude the chapter by discussing loss function robust predictive density accuracy tests that have recently been developed using principles of stochastic dominance.
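The loss function dependence noted above can be illustrated with a small numerical sketch. The code below is not taken from the chapter. It assumes a Diebold-Mariano-type statistic for equal predictive accuracy, with a Bartlett-weighted long-run variance and a standard normal approximation, and the names dm_statistic, e1 and e2 are hypothetical. The forecast errors are artificial: model 1 is usually very accurate but occasionally misses badly, while model 2 is always moderately off.

```python
import math
import numpy as np

def dm_statistic(e1, e2, loss=np.square, lags=0):
    """Diebold-Mariano-type statistic for equal predictive accuracy.

    e1, e2 : forecast errors of two competing models (same length).
    loss   : elementwise loss applied to the errors (assumed choice).
    lags   : number of autocovariance lags in the Bartlett-weighted
             long-run variance (0 means the plain sample variance).
    Returns the statistic and a two-sided p-value from N(0, 1).
    """
    d = loss(np.asarray(e1, dtype=float)) - loss(np.asarray(e2, dtype=float))
    n = d.size
    d_bar = d.mean()
    u = d - d_bar
    lrv = u @ u / n  # lag-0 autocovariance of the loss differential
    for k in range(1, lags + 1):
        weight = 1.0 - k / (lags + 1)  # Bartlett kernel weight
        lrv += 2.0 * weight * (u[k:] @ u[:-k]) / n
    stat = d_bar / math.sqrt(lrv / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value

# Artificial errors (illustrative only, not data from the chapter).
e1 = np.tile([0.1] * 9 + [5.0], 10)  # mostly tiny errors, one large miss in ten
e2 = np.full(100, 1.2)               # always moderately wrong

stat_sq, p_sq = dm_statistic(e1, e2, loss=np.square)
stat_abs, p_abs = dm_statistic(e1, e2, loss=np.abs)

# A positive statistic means model 1 has the larger average loss.
print("squared loss :", stat_sq, p_sq)
print("absolute loss:", stat_abs, p_abs)
```

In this constructed example the statistic is positive under squared loss and negative under absolute loss, so the ranking of the two models reverses when the loss function changes. This is the kind of sensitivity that motivates loss function robust comparisons.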



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Finance, Lingnan (University) College, Sun Yat-sen University, Guangzhou, China
  2. Department of Economics, School of Arts and Sciences, Rutgers University, New Brunswick, USA
