When and when not to use optimal model averaging

  • Michael Schomaker
  • Christian Heumann
Regular Article


Traditionally, model averaging has been viewed as an alternative to model selection, with the ultimate goal of incorporating the uncertainty associated with the model selection process into standard errors and confidence intervals by using a weighted combination of candidate models. In recent years, a new class of model averaging estimators has emerged in the literature; these estimators combine models such that the squared risk, or another risk function, is minimized. We argue that, contrary to popular belief, these estimators do not necessarily address the challenges induced by model selection uncertainty, but should be regarded as attractive complements for the machine learning and forecasting literature, as well as tools to identify causal parameters. We illustrate our point by means of several targeted simulation studies.
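
To make the idea of risk-minimizing weights concrete, the following is a minimal sketch and not the authors' implementation. It assumes only two nested least-squares candidates (an intercept-only model and a simple linear regression) and a Mallows-type criterion, C(w) = ||y - w*fit1 - (1-w)*fit2||^2 + 2*sigma2*(w*p1 + (1-w)*p2), minimized over w in [0, 1]. All function names and the toy data are hypothetical and chosen for illustration.

```python
# Illustrative sketch: Mallows-type weight for two candidate models.
# Function names and data are assumptions, not taken from the article.

def fit_mean(y):
    """Candidate 1 (p1 = 1 parameter): intercept-only fitted values."""
    m = sum(y) / len(y)
    return [m] * len(y)


def fit_line(x, y):
    """Candidate 2 (p2 = 2 parameters): simple least-squares regression."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return [ybar + slope * (xi - xbar) for xi in x]


def mallows_weight(y, fit1, fit2, p1, p2):
    """Weight on candidate 1 minimizing the Mallows-type criterion on [0, 1].

    sigma2 is estimated from candidate 2, assumed to be the larger model.
    The criterion is a convex quadratic in w, so the constrained minimizer
    is the unconstrained one clipped to [0, 1].
    """
    n = len(y)
    e1 = [yi - f for yi, f in zip(y, fit1)]  # residuals of candidate 1
    e2 = [yi - f for yi, f in zip(y, fit2)]  # residuals of candidate 2
    sigma2 = sum(e * e for e in e2) / (n - p2)
    d = [a - b for a, b in zip(e1, e2)]
    dd = sum(v * v for v in d)
    if dd == 0.0:
        # Identical fits: only the penalty differs, so prefer the smaller model.
        return 1.0 if p1 <= p2 else 0.0
    de2 = sum(a * b for a, b in zip(d, e2))
    w = -(de2 + sigma2 * (p1 - p2)) / dd
    return min(1.0, max(0.0, w))


def average_fit(w, fit1, fit2):
    """Weighted combination of the two candidates' fitted values."""
    return [w * a + (1.0 - w) * b for a, b in zip(fit1, fit2)]


# Toy data: one response with a clear trend, one without.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y_trend = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]
y_noise = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

w_trend = mallows_weight(y_trend, fit_mean(y_trend), fit_line(x, y_trend), 1, 2)
w_noise = mallows_weight(y_noise, fit_mean(y_noise), fit_line(x, y_noise), 1, 2)
averaged = average_fit(w_trend, fit_mean(y_trend), fit_line(x, y_trend))
```

In this sketch the weight on the intercept-only model is close to zero when the trend is strong and is pushed to the boundary when the slope carries little signal. The weights are chosen to reduce estimated squared risk; nothing in the criterion quantifies the uncertainty of the selection step itself, which is the distinction the abstract draws.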


Model selection, Model averaging, Prediction, Machine learning, Causal inference



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Centre for Infectious Disease Epidemiology and Research, University of Cape Town, Cape Town, South Africa
  2. Institut für Statistik, Ludwig-Maximilians Universität München, München, Germany
