Abstract
Most statistical methods are based on models, but most practical applications ignore the fact that the results depend on the model as well as on the data. This paper examines the size of this model dependence, and finds that there can be very considerable variation between the results of fitting different models to the same data, even if the models being considered are restricted to those which give an acceptable fit to the data. Under reasonable regularity conditions, we show that different empirically acceptable models can give rise to non-overlapping confidence intervals for the same parameter. Application papers need to recognize that the validity of conventional statistical results rests on the assumption that the underlying model is known to be correct, and that this is a much stronger requirement than merely confirming that the model gives a good fit to the data. The problem of model dependence is only partially resolved by using formal methods of model selection or model averaging.
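As a rough illustration of the model dependence described in the abstract, the following minimal sketch computes an approximate 95% confidence interval for the same parameter (the population mean) under two different models fitted to one sample. It is not the authors' method. The sample values are hypothetical, the function names are invented for this example, and the lognormal interval uses a standard delta-method approximation on the log scale. No goodness-of-fit check is performed here, so the sketch only shows that the reported interval depends on the assumed model, not that both models are empirically acceptable.

```python
import math
from statistics import mean, stdev

Z = 1.96  # approximate 97.5% quantile of the standard normal distribution


def normal_model_ci(x):
    """Estimate and approximate 95% CI for the population mean, assuming x is normal."""
    n = len(x)
    est = mean(x)
    half_width = Z * stdev(x) / math.sqrt(n)
    return est, (est - half_width, est + half_width)


def lognormal_model_ci(x):
    """Same target parameter, assuming log(x) is normal, so the mean is exp(mu + sigma^2 / 2)."""
    n = len(x)
    logs = [math.log(v) for v in x]
    mu = mean(logs)
    s2 = stdev(logs) ** 2
    log_est = mu + s2 / 2
    # Delta-method style standard error of (mu + s2 / 2) on the log scale.
    se = math.sqrt(s2 / n + s2 ** 2 / (2 * (n - 1)))
    return math.exp(log_est), (math.exp(log_est - Z * se), math.exp(log_est + Z * se))


# Hypothetical positive, right-skewed sample (NOT data from the paper).
sample = [0.6, 0.8, 1.1, 1.3, 1.4, 1.9, 2.2, 2.7, 3.5, 4.1, 5.8, 9.4]

for name, fit in [("normal", normal_model_ci), ("lognormal", lognormal_model_ci)]:
    est, (lo, hi) = fit(sample)
    print(f"{name:>9}: estimate {est:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

Both functions target the same quantity, so any difference between the two printed intervals comes from the model assumption alone and not from the data.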
Acknowledgements
The authors would like to thank the editors and referees for their very helpful comments on an earlier version of this paper.
Additional information
The online version of this article contains supplementary material.
Electronic supplementary material
Supplementary Appendix A, giving the proof of equation (17) in Section 3.3, is available online at the journal website. Similarly, Supplementary Appendix B gives the proof of equation (30) in Section 4.2. (PDF, 84 KB)
About this article
Cite this article
Copas, J., Eguchi, S. Strong model dependence in statistical analysis: goodness of fit is not enough for model choice. Ann Inst Stat Math 72, 329–352 (2020). https://doi.org/10.1007/s10463-018-0691-8
DOI: https://doi.org/10.1007/s10463-018-0691-8