Journal of Quantitative Criminology, Volume 26, Issue 2, pp 217–236

Statistical Inference After Model Selection

  • Richard Berk
  • Lawrence Brown
  • Linda Zhao
Original Paper


Conventional statistical inference requires that a model of how the data were generated be known before the data are analyzed. Yet in criminology, and in the social sciences more broadly, a variety of model selection procedures are routinely undertaken, followed by statistical tests and confidence intervals computed for a “final” model. In this paper, we examine such practices and show how they are typically misguided. The parameters being estimated are no longer well defined, and post-model-selection sampling distributions are mixtures with properties that are very different from what is conventionally assumed. Confidence intervals and statistical tests do not perform as they should. We examine in some detail the specific mechanisms responsible. We also offer some suggestions for better practice and show, through a criminal justice example using real data, how proper statistical inference may, in principle, be obtained.
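The coverage problem described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' criminal justice example. It is a hypothetical setup chosen only for illustration: two correlated predictors, a linear data-generating model with coefficients `beta1` and `beta2`, and a simple selection rule that drops the second predictor whenever its estimated coefficient is not significant at the 5% level. All names and parameter values (`simulate`, `rho`, `beta2 = 0.15`, the normal critical value 1.96) are assumptions. The target of inference is taken to be `beta1` in the full generating model, which is itself a modeling choice.

```python
import math
import random


def ols_two(x1, x2, y):
    """OLS of y on x1 and x2 (no intercept): slopes and their standard errors."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    rss = sum((c - b1 * a - b2 * b) ** 2 for a, b, c in zip(x1, x2, y))
    s2 = rss / (len(y) - 2)  # residual variance estimate
    return b1, b2, math.sqrt(s2 * s22 / det), math.sqrt(s2 * s11 / det)


def ols_one(x1, y):
    """OLS of y on x1 alone (no intercept): slope and its standard error."""
    s11 = sum(a * a for a in x1)
    b1 = sum(a * c for a, c in zip(x1, y)) / s11
    rss = sum((c - b1 * a) ** 2 for a, c in zip(x1, y))
    return b1, math.sqrt(rss / (len(y) - 1) / s11)


def simulate(reps=1000, n=200, beta1=1.0, beta2=0.15, rho=0.7, z=1.96, seed=0):
    """Return (full-model coverage, post-selection coverage, share of reduced models)."""
    rng = random.Random(seed)
    cover_full = cover_post = picked_small = 0
    for _ in range(reps):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        # x2 is correlated with x1 (population correlation rho)
        x2 = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in x1]
        y = [beta1 * a + beta2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

        b1, b2, se1, se2 = ols_two(x1, x2, y)
        cover_full += abs(b1 - beta1) <= z * se1  # interval from the fixed full model

        # Selection step: drop x2 if it is not "significant", then refit
        if abs(b2 / se2) < z:
            b1, se1 = ols_one(x1, y)
            picked_small += 1
        cover_post += abs(b1 - beta1) <= z * se1  # interval from the "final" model

    return cover_full / reps, cover_post / reps, picked_small / reps


if __name__ == "__main__":
    full_cov, post_cov, frac_small = simulate()
    print(f"coverage, full model always:    {full_cov:.3f}")
    print(f"coverage, after model selection: {post_cov:.3f}")
    print(f"share of runs choosing reduced model: {frac_small:.3f}")
```

Under these assumed settings, the interval computed from the always-fitted full model should cover `beta1` at close to the nominal rate, while the interval reported for the selected model should fall well short of it. The post-selection estimator is a mixture of the full-model and reduced-model estimators, and the reduced-model estimator is biased for `beta1` whenever `beta2` and `rho` are both nonzero.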


Model selection · Statistical inference · Mixtures of distributions



Richard Berk’s work on this paper was funded by a grant from the National Science Foundation: SES-0437169, “Ensemble methods for Data Analysis in the Behavioral, Social and Economic Sciences.” The work by Lawrence Brown and Linda Zhao was supported in part by NSF grant DMS-07-07033. Thanks also go to Andreas Buja, Sam Preston, Jasjeet Sekhon, Herb Smith, Phillip Stark, and three reviewers for helpful suggestions about the material discussed in this paper.



Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Department of Statistics, University of Pennsylvania, Philadelphia, USA
  2. Department of Criminology, University of Pennsylvania, Philadelphia, USA
