Abstract
This paper examines the use of propensity score matching in economic analyses of observational data. Several excellent papers have already reviewed the practical aspects of propensity score estimation and other parts of the propensity score literature. The purpose of this paper is instead to compare the conceptual foundation of propensity score models with that of alternative estimators of treatment effects. References are provided to empirical comparisons among methods that have appeared in the literature. Such comparisons exist for only a subset of the methods considered here: for some pairs of methods no comparison is yet available, and no study compares all of the methods surveyed. Irrespective of the availability of empirical comparisons, the goal of this paper is to provide some intuition about the relative merits of alternative estimators in health economic evaluations, where nonlinearity, sample size, availability of pre/post data, heterogeneity, and missing variables can have important implications for the choice of methodology. Also considered are potential combinations of propensity score matching with alternative methods, such as differences-in-differences and decomposition methods, that have not yet appeared in the empirical literature.
Research support and potential conflict of interest statement
The author received salary support for developing this paper from Optum Labs. The conclusions are those of the author alone and there are no known potential conflicts of interest.
Cite this article
Crown, W.H. Propensity-Score Matching in Economic Analyses: Comparison with Regression Models, Instrumental Variables, Residual Inclusion, Differences-in-Differences, and Decomposition Methods. Appl Health Econ Health Policy 12, 7–18 (2014). https://doi.org/10.1007/s40258-013-0075-4
DOI: https://doi.org/10.1007/s40258-013-0075-4