Journal of Experimental Criminology, Volume 9, Issue 2, pp 129–144

Propensity score analysis: promise, reality and irrational exuberance

  • William R. Shadish



The aim of this work is to examine the promise that propensity scores can yield accurate effect estimates in nonrandomized experiments, review research on the realities of the conditions needed to meet this promise, and caution against irrational exuberance about their capacity to meet this promise.


This paper reviews selected experimental work that illustrates both the promise and the realities of propensity score analysis.


Propensity score analysis of nonrandomized experiments can yield the same results as randomized experiments. Those estimates depend on meeting the strong ignorability assumption that the available covariates describe selection processes well, and on using comparison groups that are from the same location and have very similar focal characteristics. When those assumptions are not met, propensity scores may not yield accurate estimates.
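The dependence on strong ignorability can be illustrated with a small simulation. The following is a minimal sketch, not the procedure used in the studies reviewed: the covariates x and u, the selection probabilities, the true effect of 2.0, and all function names are hypothetical choices made for illustration. It estimates the propensity score as the share of treated units within each covariate cell (a simple stand-in for the logistic regression often used in practice), stratifies on that score, and compares the result when all selection-relevant covariates are available with the result when one of them (u) is unobserved.

```python
import random
from collections import defaultdict


def simulate(n=40000, effect=2.0, seed=0):
    """Simulate a nonrandomized study in which x and u drive both selection and outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)
        u = rng.randint(0, 1)
        p_treat = 0.2 + 0.3 * x + 0.3 * u  # hypothetical selection process
        z = 1 if rng.random() < p_treat else 0
        y = effect * z + 1.5 * x + 1.5 * u + rng.gauss(0.0, 1.0)
        rows.append({"x": x, "u": u, "z": z, "y": y})
    return rows


def mean(values):
    return sum(values) / len(values)


def naive_difference(rows):
    """Unadjusted difference in mean outcomes, treated minus comparison."""
    treated = [r["y"] for r in rows if r["z"] == 1]
    control = [r["y"] for r in rows if r["z"] == 0]
    return mean(treated) - mean(control)


def estimate_scores(rows, covariates):
    """Estimated propensity score: share treated among units with the same covariate values."""
    cells = defaultdict(list)
    for r in rows:
        cells[tuple(r[c] for c in covariates)].append(r["z"])
    share = {cell: mean(zs) for cell, zs in cells.items()}
    return [share[tuple(r[c] for c in covariates)] for r in rows]


def stratified_effect(rows, scores):
    """Stratify on the estimated score and average within-stratum differences.

    Strata are weighted by size. Assumes every stratum contains both
    treated and comparison units (overlap).
    """
    strata = defaultdict(lambda: {0: [], 1: []})
    for r, e in zip(rows, scores):
        strata[e][r["z"]].append(r["y"])
    total = 0.0
    for groups in strata.values():
        size = len(groups[0]) + len(groups[1])
        total += size * (mean(groups[1]) - mean(groups[0]))
    return total / len(rows)


rows = simulate()
naive = naive_difference(rows)
full = stratified_effect(rows, estimate_scores(rows, ["x", "u"]))
partial = stratified_effect(rows, estimate_scores(rows, ["x"]))
print(f"true effect 2.0 | naive {naive:.2f} | score on x,u {full:.2f} | score on x only {partial:.2f}")
```

In this sketch, the score built from both covariates should recover an estimate close to the true effect, whereas the score built from x alone removes only part of the selection bias. Nothing in the observed data signals that u is missing, which is why the assumption has to be justified on substantive grounds in each study.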


The use of propensity score analysis has proliferated exponentially, especially in the last decade, but careful attention to its assumptions seems to be very rare in practice. Researchers and policymakers who rely on these extensive propensity score applications may be using evidence of largely unknown validity. All stakeholders should devote far more empirical attention to justifying that each study has met these assumptions.


Keywords: Propensity score, Nonrandomized experiment, Quasi-experiment



Copyright information

© Springer Science+Business Media Dordrecht 2012

Authors and Affiliations

  1. School of Social Sciences, Humanities and Arts, University of California, Merced, Merced, USA
