Using Propensity Score Analysis for Making Causal Claims in Research Articles



Propensity score analysis (PSA) is used in observational studies primarily for causal inference, and researchers often rely on it to make causal claims in research articles. However, several issues remain for researchers to consider when claiming causality on the basis of PSA results. This summary first briefly reviews PSA and then discusses its effectiveness and limitations. Finally, it provides guidelines on how to address these concerns so that researchers can make appropriate causal claims from PSA results in their research articles.


Propensity score; Observational studies; Causal effects; Causal inference



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. Department of Educational and Human Sciences, University of Central Florida, Orlando, USA
