Abstract
Regression, propensity score (PS) and double-robust (DR) methods can reduce selection bias when estimating average treatment effects (ATEs). Economic evaluations of health care interventions exemplify complex data structures, in that the covariate–endpoint relationships tend to be highly non-linear and the cost and health outcome endpoints highly skewed. When either the regression or the PS model is correctly specified, DR methods can provide unbiased, efficient estimates of ATEs, but in general the correct specification of both models is unknown. Regression-adjusted matching can also protect against bias from model misspecification, but has not been compared to DR methods. This paper compares regression-adjusted matching to selected DR methods (weighted regression and augmented inverse probability of treatment weighting) as well as to regression and PS methods for addressing selection bias in cost-effectiveness analyses (CEA). We contrast the methods in a CEA of a pharmaceutical intervention, where there are extreme estimated PSs and hence unstable inverse probability of treatment (IPT) weights. The case study motivates a simulation study that considers settings with functional form misspecification in the PS and endpoint regression models (e.g. a cost model with a log instead of an identity link), and with both stable and unstable IPT weights. We find that in the realistic setting of unstable IPT weights and misspecification of the PS and regression models, regression-adjusted matching reports less bias than DR methods. We conclude that regression-adjusted matching is a relatively robust method for estimating ATEs in applications with complex data structures exemplified by CEA.
Notes
Here a calliper is defined as the pre-specified amount by which propensity scores of matched pairs are allowed to differ.
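As an illustration of this definition, the following Python sketch performs one-to-one nearest-neighbour matching with replacement on the propensity score and discards pairs whose scores differ by more than the calliper. The function name `match_within_caliper` and the example scores are hypothetical, and the sketch is not the matching routine used in the paper.

```python
def match_within_caliper(treated_ps, control_ps, caliper):
    """For each treated unit, return the index of the nearest control on the
    propensity score (matching with replacement), or None when even the
    nearest control differs by more than the caliper."""
    matches = []
    for p in treated_ps:
        # Index of the control whose propensity score is closest to p.
        j = min(range(len(control_ps)), key=lambda k: abs(control_ps[k] - p))
        matches.append(j if abs(control_ps[j] - p) <= caliper else None)
    return matches


# Hypothetical propensity scores, for illustration only.
treated = [0.30, 0.55, 0.95]
controls = [0.28, 0.52, 0.70]
print(match_within_caliper(treated, controls, caliper=0.05))  # [0, 1, None]
```

The third treated individual has an extreme propensity score with no control within the calliper, so it is left unmatched rather than paired with a poor match.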
The cross-validation used a twofold split-sample approach and measured goodness of fit by the mean squared prediction error, averaged over 100 iterations.
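A minimal Python sketch of such a procedure is given below, assuming a simple linear regression as the candidate model and assuming that the prediction error is averaged over both halves within each iteration; neither detail is stated in the note. The function names and the simulated data are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def mspe(params, xs, ys):
    """Mean squared prediction error of the fitted line on (xs, ys)."""
    a, b = params
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def twofold_cv_mspe(xs, ys, iterations=100, seed=1):
    """Twofold split-sample cross-validation, averaged over random splits."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    total = 0.0
    for _ in range(iterations):
        rng.shuffle(idx)
        half = len(idx) // 2
        folds = (idx[:half], idx[half:])
        # Fit on one half, predict the other half, then swap the roles.
        for train, test in (folds, folds[::-1]):
            params = fit_line([xs[i] for i in train], [ys[i] for i in train])
            total += mspe(params, [xs[i] for i in test], [ys[i] for i in test])
    return total / (2 * iterations)


# Simulated data for illustration: y = 2 + 3x + standard normal noise.
rng = random.Random(0)
xs = [rng.uniform(0, 1) for _ in range(200)]
ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]
print(twofold_cv_mspe(xs, ys))
```

Competing specifications (for example, different link functions) could be compared by replacing `fit_line` and the prediction step and choosing the model with the lowest averaged error.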
Standardised differences are weighted using matching frequency weights and IPT weights.
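The weighted calculation can be sketched in Python as follows. The sketch assumes the common definition of the standardised difference (difference in weighted means divided by the square root of the average of the two weighted variances) and a weighted variance without small-sample correction; the note does not state which variance formula was used, and the function names are hypothetical.

```python
from math import sqrt


def weighted_mean_var(x, w):
    """Weighted mean and weighted variance (no small-sample correction)."""
    sw = sum(w)
    m = sum(wi * xi for wi, xi in zip(w, x)) / sw
    v = sum(wi * (xi - m) ** 2 for wi, xi in zip(w, x)) / sw
    return m, v


def standardised_difference(x_t, w_t, x_c, w_c):
    """Weighted standardised difference of a covariate between the treated
    (x_t, w_t) and control (x_c, w_c) groups."""
    m_t, v_t = weighted_mean_var(x_t, w_t)
    m_c, v_c = weighted_mean_var(x_c, w_c)
    return (m_t - m_c) / sqrt((v_t + v_c) / 2)


# Hypothetical covariate values. A control matched twice carries frequency
# weight 2, which is equivalent to listing that control twice.
print(standardised_difference([1, 2, 3], [1, 1, 1], [1, 2], [1, 2]))
```

For matching, the weights would be the number of times each individual is used as a match; for IPT weighting, they would be the inverse of the estimated probability of the treatment actually received.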
The copula function can generate draws from a flexible multivariate distribution (in this case, a bivariate distribution) with different marginal distributions (here, the gamma and the normal).
This resulted in a correlation of 0.34 between the cost and QALY variables, which reflects the correlation (0.22) found in the case study.
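The idea can be sketched in Python as below, assuming a Gaussian copula (the note does not name the copula family): two correlated standard normal draws are mapped to uniforms and then through the inverse distribution functions of the gamma and normal marginals. All parameter values and function names are hypothetical, and the gamma quantile is obtained by bisection because the Python standard library has no built-in inverse gamma distribution function.

```python
import math
import random
from statistics import NormalDist


def gamma_cdf(x, shape, scale):
    """Gamma distribution function, from the power series of the
    regularised lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))


def gamma_ppf(u, shape, scale):
    """Gamma quantile function, by bisection on gamma_cdf."""
    lo, hi = 0.0, shape * scale
    while gamma_cdf(hi, shape, scale) < u:
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if gamma_cdf(mid, shape, scale) < u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def draw_cost_qaly(n, rho, shape, scale, mu, sigma, seed=1):
    """Draw n (cost, QALY) pairs: gamma costs and normal QALYs linked by a
    Gaussian copula with correlation parameter rho."""
    rng = random.Random(seed)
    std = NormalDist()
    qaly = NormalDist(mu, sigma)
    draws = []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        # Map to uniforms, kept strictly inside (0, 1) for the quantile step.
        u1 = min(max(std.cdf(z1), 1e-12), 1 - 1e-12)
        u2 = min(max(std.cdf(z2), 1e-12), 1 - 1e-12)
        draws.append((gamma_ppf(u1, shape, scale), qaly.inv_cdf(u2)))
    return draws


# Hypothetical parameter values, for illustration only.
for cost, q in draw_cost_qaly(5, rho=0.5, shape=2.0, scale=1000.0, mu=0.7, sigma=0.2):
    print(round(cost, 1), round(q, 3))
```

Because the gamma transformation is non-linear, the correlation between the simulated endpoints will generally differ somewhat from the copula parameter `rho`.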
The choice of normal distribution for Y E and the identity link function for Y C was made for transparency reasons and to facilitate replication.
The proportion of individuals in the treatment group was typically around 50% (scenarios 1 and 2) and 60% (scenarios 3 and 4), compared with 46% in the case study.
References
Abadie, A., Drukker, D., Herr, J.L., Imbens, G.: Implementing matching estimators for average treatment effects in Stata. Stata J. 4(3), 290–311 (2004a)
Abadie, A., Herr, J.L., Imbens, G.W., Drukker, D.M.: NNMATCH: Stata module to compute nearest-neighbor bias-corrected estimators. http://fmwww.bc.edu/repec/bocode/n/nnmatch.hlp (2004b). Accessed 15 June 2012
Abadie, A., Imbens, G.W.: Large sample properties of matching estimators for average treatment effects. Econometrica 74(1), 235–267 (2006)
Abadie, A., Imbens, G.W.: Bias-corrected matching estimators for average treatment effects. J. Bus. Econ. Stat. 29(1), 1–11 (2011)
Austin, P.C.: A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003. Stat. Med. 27(12), 2037–2049 (2008)
Austin, P.C.: Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples. Stat. Med. 28, 3083–3107 (2009)
Austin, P.C.: Using ensemble-based methods for directly estimating causal effects: an investigation of tree-based G-computation. Multivariate Behav. Res. 47(1), 115–135 (2012)
Bang, H., Robins, J.M.: Doubly robust estimation in missing data and causal inference models. Biometrics 61, 962–972 (2005)
Barber, J., Thompson, S.G.: Multiple regression of cost data: use of generalised linear models. J. Health Serv. Res. Policy 9(4), 197–204 (2004)
Basu, A.: Economics of individualization in comparative effectiveness research and a basis for a patient-centered health care. J. Health Econ. 30(3), 549–559 (2011)
Basu, A., Manca, A.: Regression estimators for generic health-related quality of life and quality-adjusted life years. Med. Decis. Making 32(1), 56–69 (2011)
Basu, A., Manning, W.G.: Issues for the next generation of health care cost analyses. Med. Care 47(7_Supplement_1), S109–S114 (2009)
Basu, A., Polsky, D., Manning, W.: Estimating treatment effects on healthcare costs under exogeneity: is there a ‘magic bullet’? Health Serv. Outcomes Res. Methodol. 11(1), 1–26 (2011). doi:10.1007/s10742-011-0072-8
Basu, A., Rathouz, P.J.: Estimating marginal and incremental effects on health outcomes using flexible link and variance function models. Biostatistics 6(1), 93–109 (2005)
Buntin, M.B., Zaslavsky, A.M.: Too much ado about two-part models and transformation? Comparing methods of modeling Medicare expenditures. J. Health Econ. 23(3), 525–542 (2004)
Busso, M., DiNardo, J., McCrary, J.: New evidence on the finite sample properties of propensity score reweighting and matching estimators. In: Working paper, vol. 3998, 2011
Caliendo, M., Kopeinig, S.: Some practical guidance for the implementation of propensity score matching. J. Econ. Surv. 22(1), 31–72 (2008). doi:10.1111/j.1467-6419.2007.00527.x
Crump, R.K., Hotz, V.J., Imbens, G.W., Mitnik, O.A.: Dealing with limited overlap in estimation of average treatment effects. Biometrika 96(1), 187–199 (2009)
Davison, A., Hinkley, D.: Bootstrap Methods and Their Application. Cambridge University Press, New York (1997)
Dehejia, R.H., Wahba, S.: Propensity score-matching methods for nonexperimental causal studies. Rev. Econ. Stat. 84(1), 151–161 (2002)
Diamond, A., Sekhon, J.S.: Genetic matching for estimating causal effects: a general multivariate matching method for achieving balance in observational studies. Rev. Econ. Stat. 95(3), 932–945 (2013)
Fenwick, E., O’Brien, B., Briggs, A.: Cost-effectiveness acceptability curves—facts, fallacies and frequently asked questions. Health Econ. 13(5), 405–415 (2004)
Freedman, D., Berk, R.A.: Weighting regression by propensity score. Eval. Rev. 32(4), 392–409 (2008)
Fung, V., Brand, R.J., Newhouse, J.P., Hsu, J.: Using medicare data for comparative effectiveness research: opportunities and challenges. Am. J. Manag. Care 17(7), 489–496 (2011)
Funk, M.J., Westreich, D., Wiesen, C., Stürmer, T., Brookhart, M.A., Davidian, M.: Doubly robust estimation of causal effects. Am. J. Epidemiol. 173(7), 761–767 (2011). doi:10.1093/aje/kwq439
Glick, H., Doshi, J., Sonnad, S., Polsky, D.: Economic Evaluation in Clinical Trials. Oxford University Press, Oxford (2007)
Glynn, A.N., Quinn, K.M.: An introduction to the augmented inverse propensity weighted estimator. Political Anal. 18, 36–56 (2010)
Golinelli, D., Ridgeway, G., Rhoades, H., Tucker, J., Wenzel, S.: Bias and variance trade-offs when combining propensity score weighting and regression: with an application to HIV status and homeless men. Health Serv. Outcomes Res. Methodol. 12(2–3), 104–118 (2012)
Grieve, R., Sekhon, J.S., Hu, T.-W., Bloom, J.: Evaluating health care programs by combining cost with quality of life measures: a case study comparing capitation and fee for service. Health Serv. Res. 43(4), 1204–1222 (2008)
Gruber, S., van der Laan, M.J.: An application of collaborative targeted maximum likelihood estimation in causal inference and genomics. Int. J. Biostat. 6(1), Article 18 (2010). doi:10.2202/1557-4679.1182
Hill, J., Reiter, J.P.: Interval estimation for treatment effects using propensity score matching. Stat. Med. 25(13), 2230–2256 (2006)
Hirano, K., Imbens, G.W.: Estimation of causal effects using propensity score weighting: an application to data on right heart catheterization. Health Serv. Outcomes Res. Methodol. 2(3), 259–278 (2001)
Hirano, K., Imbens, G.W., Ridder, G.: Efficient estimation of average treatment effects using the estimated propensity score. Econometrica 71(4), 1161–1189 (2003)
Ho, D.E., Imai, K., King, G., Stuart, E.A.: Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Political Anal. 15(3), 199–236 (2007)
Imbens, G.W., Wooldridge, J.M.: Recent developments in the econometrics of program evaluation. J. Econ. Lit. 47(1), 5–86 (2009)
Jackson, C., Bojke, L., Thompson, S., Claxton, K., Sharples, L.: A framework for addressing structural uncertainty in decision models. Med. Decis. Making 31, 662–674 (2011)
Jones, A., Lomas, J., Rice, N.: Applying beta-type size distributions to healthcare cost regressions. In: HEDG working papers, vol. WP 11/31. HEDG, c/o Department of Economics, University of York, 2011
Jones, A.M.: Models for health care. In: HEDG working papers. HEDG, c/o Department of Economics, University of York, 2010
Kang, J.D.Y., Schafer, J.L.: Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data. Stat. Sci. 22(4), 523–539 (2007)
Kreif, N., Grieve, R., Radice, R., Sadique, Z., Ramsahai, R., Sekhon, J.S.: Methods for estimating subgroup effects in cost-effectiveness analyses that use observational data. Med. Decis. Making 32(6), 750–763 (2012a). doi:10.1177/0272989x12448929
Kreif, N., Grieve, R., Sadique, Z.: Statistical methods for cost-effectiveness analyses that use observational data: a critical appraisal tool and review of current practice. Health Econ. 22(4), 486–500 (2012b). doi:10.1002/hec.2806
Lee, B.K., Lessler, J., Stuart, E.A.: Improving propensity score weighting using machine learning. Stat. Med. 29(3), 337–346 (2010)
Lunceford, J.K., Davidian, M.: Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Stat. Med. 23(19), 2937–2960 (2004)
Manca, A., Austin, P.C.: Using propensity score methods to analyse individual patient-level cost-effectiveness data from observational studies. http://www.york.ac.uk/res/herc/documents/wp/08_20.pdf (2008). Accessed 15 June 2012
Manning, W.G., Basu, A., Mullahy, J.: Generalized modeling approaches to risk adjustment of skewed outcomes data. J. Health Econ. 24(3), 465–488 (2005). doi:10.1016/j.jhealeco.2004.09.011
Mihaylova, B., Briggs, A., O’Hagan, A., Thompson, S.: Review of statistical methods for analysing healthcare resources and costs. Health Econ. (2010). doi:10.1002/hec.1653
NICE: Guide to the methods of technology appraisal 2013. http://www.nice.org.uk/media/D45/1E/GuideToMethodsTechnologyAppraisal2013.pdf (2013). Accessed 10 July 2013
Nixon, R., Wonderling, D., Grieve, R.: Non-parametric methods for cost-effectiveness analysis: the central limit theorem and the bootstrap compared. Health Econ. 19(3), 316–333 (2010)
Nixon, R.M., Thompson, S.G.: Methods for incorporating covariate adjustment, subgroup analysis and between-centre differences into cost-effectiveness evaluations. Health Econ. 14(12), 1217–1229 (2005)
Pearl, J.: Causal diagrams for empirical research. Biometrika 82(4), 669–688 (1995)
Petersen, M.L., Porter, K., Gruber, S., Wang, Y., van der Laan, M.J.: Diagnosing and responding to violations in the positivity assumption. Stat. Methods Med. Res. 21(1), 31–54 (2012)
Porter, K.E., Gruber, S., van der Laan, M.J., Sekhon, J.S.: The relative performance of targeted maximum likelihood estimators. Int. J. Biostat. (2011). doi:10.2202/1557-4679
Quinn, C.: The health-economic applications of copulas: methods in applied econometric research. http://ideas.repec.org/p/yor/hectdg/07-22.html (2007). Accessed 10 Aug 2011
R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna (2011)
Radice, R., Grieve, R., Ramsahai, R., Kreif, N., Sadique, Z., Sekhon, J.S.: Evaluating treatment effectiveness in patient subgroups: a comparison of propensity score methods with an automated matching approach. Int. J. Biostat. 8(1), 25 (2012)
Robins, J., Rotnitzky, A., Zhao, L.P.: Estimation of regression coefficients when some regressors are not always observed. J. Am. Stat. Assoc. 89, 846–866 (1994)
Robins, J., Sued, M., Lei-Gomez, Q., Rotnitzky, A.: Comment: performance of double-robust estimators when “inverse probability” weights are highly variable. Stat. Sci. 22(4), 544–559 (2007)
Robins, J.M., Rotnitzky, A., Zhao, L.P.: Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. J. Am. Stat. Assoc. 90(429), 106–121 (1995)
Rosenbaum, P.R., Rubin, D.B.: The central role of the propensity score in observational studies for causal effects. Biometrika 70(1), 41–55 (1983). doi:10.1093/biomet/70.1.41
Rowan, K., Welch, C., North, E., Harrison, D.: Drotrecogin alfa (activated): real-life use and outcomes for the UK. Crit. Care 12(2), R58 (2008)
Rubin, D.B.: The use of matched sampling and regression adjustment to remove bias in observational studies. Biometrics 29, 185–203 (1973)
Rubin, D.B.: The design versus the analysis of observational studies for causal effects: parallels with the design of randomized trials. Stat. Med. 26(1), 20–36 (2007)
Rubin, D.B.: On the limitations of comparative effectiveness research. Stat. Med. 29, 1991–1995 (2010)
Rubin, D.B., Thomas, N.: Combining propensity score matching with additional adjustments for prognostic covariates. J. Am. Stat. Assoc. 95, 573–585 (2000)
Sadique, M.Z., Grieve, R., Harrison, D., Cuthbertson, B., Rowan, K.: Is Drotrecogin alfa (activated) for adults with severe sepsis, cost-effective in routine clinical practice? Crit. Care 15(5), R228 (2011)
Sekhon, J.S.: Matching: multivariate and propensity score matching with automated balance search. J. Stat. Softw. 42(7), 1–52 (2011)
Sekhon, J.S., Grieve, R.D.: A matching method for improving covariate balance in cost-effectiveness analyses. Health Econ. 21(6), 695–714 (2011). doi:10.1002/hec.1748
StataCorp: Stata Statistical Software: Release 12. StataCorp LP, College Station (2011)
Stuart, E.A.: Matching methods for causal inference: a review and a look forward. Stat. Sci. 25(1), 1–21 (2010)
Trivedi, P.K., Zimmer, D.M.: Copula Modeling: An Introduction to Practitioners, vol. 1. Foundations and Trends in Econometrics. Now Publishing Inc., Delft (2005)
Tunis, S.R., Benner, J., McClellan, M.: Comparative effectiveness research: policy context, methods development and research infrastructure. Stat. Med. 29(19), 1963–1976 (2010). doi:10.1002/sim.3818
van der Laan, M.J.: Targeted maximum likelihood based causal inference: part I. Int. J. Biostat. (2010). doi:10.2202/1557-4679.1211
van der Laan, M.J., Gruber, S.: Collaborative double robust targeted maximum likelihood estimation. Int. J. Biostat. (2010). doi:10.2202/1557-4679.1181
van der Laan, M.J., Polley, E.C., Hubbard, A.E.: Super learner. Stat. Appl. Genet. Mol. Biol. (2007). doi:10.2202/1544-6115.1309
Westreich, D., Cole, S.R.: Invited commentary: positivity in practice. Am. J. Epidemiol. 171(6), 674–677 (2010). doi:10.1093/aje/kwp436
Westreich, D., Lessler, J., Funk, M.: Propensity score estimation: neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression. J. Clin. Epidemiol. 63(8), 826–833 (2010)
Willan, A.R., Briggs, A.H., Hoch, J.S.: Regression methods for covariate adjustment and subgroup analysis for non-censored cost-effectiveness data. Health Econ. 13(5), 461–475 (2004)
Acknowledgments
We thank Zia Sadique (LSHTM) for help in the motivating case study, Roland Ramsahai (University of Cambridge) for valuable comments on the Monte Carlo simulations, Manuel Gomes, Karla Diaz-Ordaz, Adam Steventon, Rhian Daniel (all LSHTM) and Susan Gruber (Harvard School of Public Health) for comments on the manuscript. We also thank David Harrison and Kathy Rowan (ICNARC) for access to the data used in the case study. Funding from the Economic and Social Research Council (Grant no. RES-061-25-0343) is greatly appreciated.
Appendices
Appendix 1
Appendix 2: Code for implementing the methods
This section provides code for the implementation of the combined statistical approaches proposed in the paper, using the R (R Development Core Team 2011) and Stata (StataCorp 2011) statistical software. The user-written functions implemented here call some pre-written R routines, for example “glm” for generalised linear models, or the “Matching” library (Sekhon 2011). When implementing the methods in Stata, we use the NNMATCH routine (Abadie et al. 2004b) for matching.
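As a language-neutral illustration of one of the DR methods compared in the paper, the following Python sketch computes the augmented inverse probability of treatment weighting (AIPTW) estimate of the ATE from quantities assumed to have been estimated already: the propensity scores and the regression predictions of the endpoint under each treatment. It uses the standard form of the estimator, is not the authors' R or Stata code, and all names and numbers are hypothetical.

```python
def aiptw_ate(t, y, ps, m1, m0):
    """Augmented IPT-weighted estimate of the average treatment effect.

    t      : treatment indicators (1 = treated, 0 = control)
    y      : observed endpoints (e.g. cost or QALY)
    ps     : estimated propensity scores, strictly between 0 and 1
    m1, m0 : regression predictions of the endpoint for every individual
             under treatment and under control
    """
    total = 0.0
    for ti, yi, ei, m1i, m0i in zip(t, y, ps, m1, m0):
        # Regression contrast plus IPT-weighted residual corrections.
        total += (m1i - m0i
                  + ti * (yi - m1i) / ei
                  - (1 - ti) * (yi - m0i) / (1 - ei))
    return total / len(y)


# Hypothetical inputs, for illustration only.
t = [1, 1, 0, 0]
y = [10.0, 12.0, 7.0, 9.0]
ps = [0.8, 0.5, 0.5, 0.2]
m1 = [11.0, 11.0, 11.0, 11.0]
m0 = [8.0, 8.0, 8.0, 8.0]
print(aiptw_ate(t, y, ps, m1, m0))
```

When the regression predictions are set to zero the expression reduces to the IPT-weighted estimator, which shows why extreme estimated propensity scores (values of `ps` close to 0 or 1) make the residual corrections, and hence the estimate, unstable.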
Appendix 3: R code for generating data in the simulations
Cite this article
Kreif, N., Grieve, R., Radice, R. et al. Regression-adjusted matching and double-robust methods for estimating average treatment effects in health economic evaluation. Health Serv Outcomes Res Method 13, 174–202 (2013). https://doi.org/10.1007/s10742-013-0109-2
DOI: https://doi.org/10.1007/s10742-013-0109-2