Simple calculations seem to show that larger studies should have higher statistical power, but empirical meta-analyses of published work in criminology have found zero or weak correlations between sample size and estimated statistical power. This “Weisburd’s paradox” has been attributed by Weisburd et al. (in Crime Justice 17:337–379, 1993) to a difficulty in maintaining quality control as studies get larger, and by Nelson et al. (in J Exp Criminol 11:141–163, 2015) to a negative correlation between sample sizes and the underlying sizes of the effects being measured. We argue that neither explanation is necessary.
We discuss Weisburd’s paradox in light of the concepts of type S and type M errors, and re-examine the publications used in previous studies of the so-called paradox.
We suggest that the apparent Weisburd paradox might be explainable as an artifact of systematic overestimation inherent in post-hoc power calculations, a bias that is large with small N.
Speaking more generally, we recommend abandoning the use of statistical power as a measure of the strength of a study, because implicit in the definition of power is the bad idea of statistical significance as a research goal.
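The overestimation argument can be illustrated with a minimal sketch, assuming an unbiased, normally distributed estimate with known standard error. The function name `retrodesign`, the true effect of 0.2 standard deviations, and the standard error 2/sqrt(N) (a two-group comparison with equal arms and unit variance) are illustrative assumptions, not values taken from the article. The sketch computes the true power, the type S error rate (probability that a significant estimate has the wrong sign), and the type M error or exaggeration ratio (expected absolute size of a significant estimate divided by the true effect).

```python
from math import sqrt
from statistics import NormalDist


def retrodesign(true_effect, se, alpha=0.05):
    """Return (power, type_s, exaggeration) for a normal estimate.

    Assumes true_effect > 0 and estimate ~ Normal(true_effect, se).
    """
    std = NormalDist()
    z = std.inv_cdf(1 - alpha / 2)   # two-sided critical value
    a = z - true_effect / se         # standardized upper threshold
    b = -z - true_effect / se        # standardized lower threshold
    p_hi = 1 - std.cdf(a)            # P(significant, correct sign)
    p_lo = std.cdf(b)                # P(significant, wrong sign)
    power = p_hi + p_lo
    type_s = p_lo / power
    # E[|estimate|; significant], from truncated-normal partial means
    abs_sum = true_effect * (p_hi - p_lo) + se * (std.pdf(a) + std.pdf(b))
    exaggeration = (abs_sum / power) / true_effect
    return power, type_s, exaggeration


# Hypothetical scenario: fixed true effect, growing total sample size N.
TRUE_EFFECT = 0.2
results = []
for n in (50, 200, 800, 3200):
    se = 2 / sqrt(n)
    power, type_s, exaggeration = retrodesign(TRUE_EFFECT, se)
    results.append((n, power, type_s, exaggeration))
    print(f"N={n:5d}  power={power:.2f}  type S={type_s:.3f}  "
          f"exaggeration={exaggeration:.2f}")
```

Under these assumptions the exaggeration ratio is largest at the smallest N. A post-hoc power calculation that plugs in a published, statistically significant estimate therefore starts from an inflated effect size, and the inflation is worst in small studies, which would tend to flatten the observed relationship between sample size and estimated power.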
Our summary is at http://www.stat.columbia.edu/~gelman/documents/weisburd_table_of_studies.pdf.
Brame R, Bushway S, Paternoster R, Turner M (2014) Demographic patterns of cumulative arrest prevalence by ages 18 and 23. Crime Delinquency 60:471–486
Button KS, Ioannidis JPA, Mokrysz C, Nosek BA, Flint J, Robinson ESJ, Munafo MR (2013) Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci 14:365–376
Carroll KM, Easton CJ, Nich C, Hunkele KA, Neavins TM, Sinha R, Ford HL, Vitolo SA, Doebrick CA, Rounsaville BJ (2006) The use of contingency management and motivational/skills-building therapy to treat young adults with marijuana dependence. J Consult Clin Psychol 74:955–966
Carroll KM, Martino S, Ball SA, Nich C, Frankforter T, Anez LM, Paris M, Suarez-Morales L, Szapocznik J, Miller WR, Rosa C, Matthews J, Farentinos C (2009) A multi-site randomised effectiveness trial of motivational enhancement therapy for Spanish-speaking substance users. J Consult Clin Psychol 77(5):993–999
Deschenes EP, Turner S, Greenwood PW (1995) Drug court or probation?: An experimental evaluation of Maricopa County’s drug court. Justice Syst J 18:55–73
Franco A, Malhotra N, Simonovits G (2014) Publication bias in the social sciences: unlocking the file drawer. Science 345:1502–1505
Gelman A (2015) Statistics and the crisis of scientific replication. Significance 12(3):23–25
Gelman A, Carlin JB (2014) Beyond power calculations: Assessing Type S (sign) and Type M (magnitude) errors. Perspect Psychol Sci 9:641–651
Gelman A, Loken E (2014) The statistical crisis in science. Am Sci 102:460–465
Gelman A, Tuerlinckx F (2000) Type S error rates for classical and Bayesian single and multiple comparison procedures. Comput Stat 15:373–390
Gerber AS, Malhotra N (2008a) Publication bias in empirical sociological research: Do arbitrary significance levels distort published results? Sociol Methods Res 37:3–30
Gerber AS, Malhotra N (2008b) Do statistical reporting standards affect what is published? Publication bias in two leading political science journals. Q J Polit Sci 3:313–326
Ginsel B, Aggarwal A, Xuan W, Harris I (2015) The distribution of probability values in medical abstracts: an observational study. BMC Res Notes 8:721
Jager LR, Leek JT (2014) An estimate of the science-wise false discovery rate and application to the top medical literature. Biostatistics 15:1–12
Lewis RV (1983) Scared straight—California style. Evaluation of the San Quentin squires program. Crim Justice Behav 10:209–226
Masicampo EJ, Lalande D (2012) A peculiar prevalence of p values just below .05. Q J Exp Psychol 65:2271–2279
Nelson MS, Wooditch A, Dario LM (2015) Sample size, effect size, and statistical power: a replication study of Weisburd’s paradox. J Exp Criminol 11:141–163
Patrick S, Marsh R (2005) Juvenile diversion: results from a 3-year experimental study. Crim Justice Policy Rev 16:59–73
Piquero AR, Jennings WG, Diamond B, Farrington DP, Tremblay RE, Welsh BC, Reingle Gonzalez JM (2016) A meta-analysis update on the effects of early family/parent training programs on antisocial behavior and delinquency. J Exp Criminol 12:229–248
Rothstein H (2008) Publication bias as a threat to the validity of meta-analytic results. J Exp Criminol 4:61–81
Senn SJ (2002) Power is indeed irrelevant in interpreting completed studies. Br Med J 325:1304
Sherman LW (2007) The power few: experimental criminology and the reduction of harm. The 2006 Joan McCord Prize Lecture. J Exp Criminol 3:299–321
Simmons JP, Nelson LD, Simonsohn U (2011) False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychol Sci 22:1359–1366
Slavin R, Smith D (2009) The relationship between sample sizes and effect sizes in systematic reviews in education. Educ Eval Policy An 31:500–506
Weisburd D, Petrosino A, Mason G (1993) Design sensitivity in criminal justice experiments. Crime Justice 17:337–379
Wilson SJ, Tanner-Smith EE, Lipsey MW, Steinka-Fry K, Morrison J (2011) Dropout prevention and intervention programs: effects on school completion and dropout among school-aged children and youth. Campbell Systematic Reviews 2011:8. Oslo: The Campbell Collaboration
National Science Foundation (Grant No. SES-1534414), Institute of Education Sciences (Grant No. R305D140059-16), Office of Naval Research (Grant No. N00014-15-1-2541), Defense Advanced Research Projects Agency (Grant No. DARPA BAA-16-32)
We thank Justin Pickett and Gary Sweeten for suggesting this topic, several reviewers for helpful comments, and the U.S. National Science Foundation, Institute of Education Sciences, Office of Naval Research, and Defense Advanced Research Projects Agency for partial support of this work.
About this article
Cite this article
Gelman, A., Skardhamar, T. & Aaltonen, M. Type M Error Might Explain Weisburd’s Paradox. J Quant Criminol 36, 295–304 (2020). https://doi.org/10.1007/s10940-017-9374-5
- Weisburd paradox
- Type M error
- Statistical power
- Publication bias