Abstract
Propensity score methods are popular and effective statistical techniques for reducing selection bias in observational data, thereby increasing the validity of causal inference from observational studies in behavioral and social science research. Some methodologists and statisticians have raised concerns about the rationale and applicability of these methods. In this review, we address these concerns by reviewing the history of development and the assumptions of propensity score methods, followed by their fundamental techniques and the software packages available for them. We give particular attention to the issues in, and debates about, the use of propensity score methods. This review provides useful information about propensity score methods from a historical point of view and helps researchers select appropriate propensity score methods for their observational studies.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
Communicated by Takahiro Hoshino.
About this article
Cite this article
Pan, W., Bai, H. Propensity score methods for causal inference: an overview. Behaviormetrika 45, 317–334 (2018). https://doi.org/10.1007/s41237-018-0058-8