Propensity score analysis: promise, reality and irrational exuberance
- William R. Shadish
The aim of this work is to examine the promise that propensity scores can yield accurate effect estimates in nonrandomized experiments, review research on the conditions that must hold in practice for this promise to be met, and caution against irrational exuberance about their capacity to meet it.
The paper reviews selected experimental work that illustrates both the promise and the realities of propensity score analysis.
Propensity score analysis of nonrandomized experiments can yield the same results as randomized experiments. The accuracy of those estimates depends on meeting the strong ignorability assumption, which requires that the available covariates describe the selection process well, and on using comparison groups that come from the same location and have very similar focal characteristics. When those assumptions are not met, propensity scores may not yield accurate estimates.
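The dependence on covariates that capture selection can be illustrated with a small simulation. The following is a minimal sketch, not taken from the article: it assumes two binary covariates that both drive selection and outcome, estimates the propensity score as the treated share within each covariate cell, and uses inverse-probability weighting. The function names (`simulate`, `ipw_effect`), the effect size, and all coefficients are hypothetical choices made for illustration only.

```python
import random
from collections import defaultdict

TRUE_EFFECT = 2.0  # illustrative value chosen for this sketch, not from the article


def simulate(n, seed=0):
    """Return rows (x1, x2, z, y); both covariates affect selection and outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = int(rng.random() < 0.5)
        x2 = int(rng.random() < 0.5)
        p_treat = 0.2 + 0.3 * x1 + 0.3 * x2  # selection depends on x1 and x2
        z = int(rng.random() < p_treat)
        y = TRUE_EFFECT * z + 1.0 * x1 + 2.0 * x2 + rng.gauss(0.0, 1.0)
        rows.append((x1, x2, z, y))
    return rows


def ipw_effect(rows, covariates):
    """Inverse-probability-weighted difference in mean outcomes.

    `covariates` maps a row to the covariate cell used for the propensity
    score, which is estimated as the treated share within that cell.
    """
    cell = defaultdict(lambda: [0, 0])  # cell -> [units, treated units]
    for r in rows:
        counts = cell[covariates(r)]
        counts[0] += 1
        counts[1] += r[2]

    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}  # arm -> [sum of w*y, sum of w]
    for r in rows:
        n_cell, n_treated = cell[covariates(r)]
        e = n_treated / n_cell  # estimated propensity score for this unit
        z, y = r[2], r[3]
        w = 1.0 / e if z == 1 else 1.0 / (1.0 - e)
        sums[z][0] += w * y
        sums[z][1] += w
    return sums[1][0] / sums[1][1] - sums[0][0] / sums[0][1]


rows = simulate(20000)
full = ipw_effect(rows, lambda r: (r[0], r[1]))  # both selection covariates observed
partial = ipw_effect(rows, lambda r: (r[0],))    # x2 unobserved: ignorability fails
print(f"true effect:             {TRUE_EFFECT:.2f}")
print(f"estimate with x1 and x2: {full:.2f}")
print(f"estimate with x1 only:   {partial:.2f}")
```

In this setup the estimate that conditions on both covariates should land near the simulated effect, while the estimate that omits `x2` should be biased upward, because `x2` raises both the chance of treatment and the outcome. In real data the omitted covariate is unobserved, so the bias cannot be detected from the propensity model alone.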
The use of propensity score analysis has proliferated exponentially, especially in the last decade, but careful attention to its assumptions seems to be very rare in practice. Researchers and policymakers who rely on these extensive propensity score applications may be using evidence of largely unknown validity. All stakeholders should devote far more empirical attention to justifying that each study has met these assumptions.
Journal of Experimental Criminology
Volume 9, Issue 2, pp 129-144
- Springer Netherlands
- Propensity score
- Nonrandomized experiment
- Author Affiliations
- 1. School of Social Sciences, Humanities and Arts, University of California, Merced, 5200 North Lake Rd, Merced, CA, 95343, USA