Regression-adjusted matching and double-robust methods for estimating average treatment effects in health economic evaluation

Kreif, Noémi; Grieve, Richard; Radice, Rosalba; Sekhon, Jasjeet S.

doi:10.1007/s10742-013-0109-2

Published: 23 October 2013

Volume 13, pages 174–202 (2013)

Health Services and Outcomes Research Methodology

Abstract

Regression, propensity score (PS) and double-robust (DR) methods can reduce selection bias when estimating average treatment effects (ATEs). Economic evaluations of health care interventions exemplify complex data structures, in that the covariate–endpoint relationships tend to be highly non-linear, with highly skewed cost and health outcome endpoints. When either the regression or PS model is correct, DR methods can provide unbiased, efficient estimates of ATEs, but generally the correct specification of both models is unknown. Regression-adjusted matching can also protect against bias from model misspecification, but has not been compared to DR methods. This paper compares regression-adjusted matching to selected DR methods (weighted regression and augmented inverse probability of treatment weighting) as well as to regression and PS methods for addressing selection bias in cost-effectiveness analyses (CEA). We contrast the methods in a CEA of a pharmaceutical intervention, where there are extreme estimated PSs, hence unstable inverse probability of treatment (IPT) weights. The case study motivates a simulation which considers settings with functional form misspecification in the PS and endpoint regression models (e.g. cost model with log instead of identity link), and with stable and unstable PS weights. We find that in the realistic setting of unstable IPT weights and misspecification of the PS and regression models, regression-adjusted matching yields less bias than DR methods. We conclude that regression-adjusted matching is a relatively robust method for estimating ATEs in applications with complex data structures exemplified by CEA.
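
To make the estimators named in the abstract more concrete, the following minimal Python sketch computes IPT weights and an augmented IPTW (AIPTW) estimate of the ATE from nuisance estimates that are assumed to have been fitted elsewhere (estimated PSs and endpoint regression predictions under each treatment). The function names `iptw_weights` and `aiptw_ate` and all toy numbers are illustrative assumptions; this is not the authors' code, which is written in R and Stata.

```python
from statistics import fmean


def iptw_weights(treated, ps):
    """IPT weights: 1/e for treated individuals, 1/(1 - e) for controls."""
    return [t / e + (1 - t) / (1.0 - e) for t, e in zip(treated, ps)]


def aiptw_ate(treated, outcome, ps, mu1, mu0):
    """AIPTW estimate of the ATE from pre-computed nuisance estimates.

    treated: 0/1 treatment indicators
    outcome: observed endpoint (e.g. cost or QALYs)
    ps:      estimated propensity scores
    mu1/mu0: regression predictions of the endpoint under treatment/control
    """
    terms = []
    for t, y, e, m1, m0 in zip(treated, outcome, ps, mu1, mu0):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        # Regression prediction plus an inverse-probability-weighted residual.
        y1 = m1 + t * (y - m1) / e
        y0 = m0 + (1 - t) * (y - m0) / (1.0 - e)
        terms.append(y1 - y0)
    return fmean(terms)


# Hypothetical toy data: the second treated individual has an extreme PS.
treated = [1, 1, 0, 0]
ps = [0.50, 0.02, 0.50, 0.40]
weights = iptw_weights(treated, ps)  # the extreme PS produces a very large weight
```

In this toy example a treated individual with an estimated PS of 0.02 receives a weight of 50, which illustrates how extreme estimated PSs make IPT weights unstable. If the regression predictions are set to zero, `aiptw_ate` reduces to the plain IPTW estimator.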

Notes

Here a calliper is defined as the pre-specified amount by which propensity scores of matched pairs are allowed to differ.
Further possible ways of balancing with the PS include stratification (blocking) by the quintiles of the PS, and adding the PS as a covariate (Rosenbaum and Rubin 1983). Both have been shown to be dominated by IPTW and matching (Lunceford and Davidian 2004).
The cross-validation used a twofold split sample and measured goodness of fit by the mean squared prediction error, averaged over 100 iterations.
Standardised differences are weighted using matching frequency weights and IPT weights.
The copula function can generate draws from a flexible multivariate distribution (in this case, bivariate) with different marginal distributions (here, the gamma and the normal).
This resulted in a correlation of 0.34 between the cost and QALY variables, which reflects the correlation (0.22) found in the case study.
The choice of the normal distribution for Y_E and the identity link function for Y_C was made for transparency and to facilitate replication.
The proportion of individuals in the treatment group was typically around 50% (scenarios 1 and 2) and 60% (scenarios 3 and 4), compared with 46% in the case study.
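
The note on weighted standardised differences can be illustrated with a short Python sketch. It assumes the common definition of the standardised difference (difference in means divided by the square root of the average of the two group variances, expressed in %), with weights that could be matching frequency weights or IPT weights. The function names are hypothetical, and the weighted variance divides by the total weight, which is a simplification of the sample variance.

```python
from math import sqrt


def weighted_mean_var(x, w):
    """Weighted mean and weighted variance (normalised by the total weight)."""
    total = sum(w)
    mean = sum(wi * xi for xi, wi in zip(x, w)) / total
    var = sum(wi * (xi - mean) ** 2 for xi, wi in zip(x, w)) / total
    return mean, var


def standardised_difference(x, treated, weights=None):
    """Weighted standardised difference (in %) of covariate x between arms.

    weights: matching frequency weights or IPT weights; defaults to 1 for all.
    Assumes the covariate varies in at least one arm (non-zero denominator).
    """
    if weights is None:
        weights = [1.0] * len(x)
    stats = {}
    for arm in (1, 0):
        xs = [xi for xi, t in zip(x, treated) if t == arm]
        ws = [wi for wi, t in zip(weights, treated) if t == arm]
        stats[arm] = weighted_mean_var(xs, ws)
    (m1, v1), (m0, v0) = stats[1], stats[0]
    return 100.0 * (m1 - m0) / sqrt((v1 + v0) / 2.0)


# Hypothetical example: the same covariate before and after weighting.
age = [1.0, 3.0, 0.0, 2.0]
arm = [1, 1, 0, 0]
unweighted = standardised_difference(age, arm)
weighted = standardised_difference(age, arm, weights=[1.0, 1.0, 1.0, 3.0])
```

Here the last control carries a weight of 3 (for instance because it was used as a match three times), which pulls the weighted control mean towards the treated mean and reduces the standardised difference.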

References

Abadie, A., Drukker, D., Herr, J.L., Imbens, G.: Implementing matching estimators for average treatment effects in Stata. Stata J. 4(3), 290–311 (2004a)
Abadie, A., Herr, J.L., Imbens, G.W., Drukker, D.M.: NNMATCH: Stata module to compute nearest-neighbor bias-corrected estimators. http://fmwww.bc.edu/repec/bocode/n/nnmatch.hlp (2004b). Accessed 15 June 2012
Abadie, A., Imbens, G.W.: Large sample properties of matching estimators for average treatment effects. Econometrica 74(1), 235–267 (2006)
Abadie, A., Imbens, G.W.: Bias-corrected matching estimators for average treatment effects. J. Bus. Econ. Stat. 29(1), 1–11 (2011)
Austin, P.C.: A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003. Stat. Med. 27(12), 2037–2049 (2008)
Austin, P.C.: Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples. Stat. Med. 28, 3083–3107 (2009)
Austin, P.C.: Using ensemble-based methods for directly estimating causal effects: an investigation of tree-based G-computation. Multivariate Behav. Res. 47(1), 115–135 (2012)
Bang, H., Robins, J.M.: Doubly robust estimation in missing data and causal inference models. Biometrics 61, 962–972 (2005)
Barber, J., Thompson, S.G.: Multiple regression of cost data: use of generalised linear models. J. Health Serv. Res. Policy 9(4), 197–204 (2004)
Basu, A.: Economics of individualization in comparative effectiveness research and a basis for a patient-centered health care. J. Health Econ. 30(3), 549–559 (2011)
Basu, A., Manca, A.: Regression estimators for generic health-related quality of life and quality-adjusted life years. Med. Decis. Making 32(1), 56–69 (2011)
Basu, A., Manning, W.G.: Issues for the next generation of health care cost analyses. Med. Care 47(7_Supplement_1), S109–S114 (2009)
Basu, A., Polsky, D., Manning, W.: Estimating treatment effects on healthcare costs under exogeneity: is there a ‘magic bullet’? Health Serv. Outcomes Res. Methodol. 11(1), 1–26 (2011). doi:10.1007/s10742-011-0072-8
Basu, A., Rathouz, P.J.: Estimating marginal and incremental effects on health outcomes using flexible link and variance function models. Biostatistics 6(1), 93–109 (2005)
Buntin, M.B., Zaslavsky, A.M.: Too much ado about two-part models and transformation? Comparing methods of modeling Medicare expenditures. J. Health Econ. 23(3), 525–542 (2004)
Busso, M., DiNardo, J., McCrary, J.: New evidence on the finite sample properties of propensity score reweighting and matching estimators. In: Working paper, vol. 3998, 2011
Caliendo, M., Kopeinig, S.: Some practical guidance for the implementation of propensity score matching. J. Econ. Surv. 22(1), 31–72 (2008). doi:10.1111/j.1467-6419.2007.00527.x
Crump, R.K., Hotz, V.J., Imbens, G.W., Mitnik, O.A.: Dealing with limited overlap in estimation of average treatment effects. Biometrika 96(1), 187–199 (2009)
Davison, A., Hinkley, D.: Bootstrap Methods and Their Application. Cambridge University Press, New York (1997)
Dehejia, R.H., Wahba, S.: Propensity score-matching methods for nonexperimental causal studies. Rev. Econ. Stat. 84(1), 151–161 (2002)
Diamond, A., Sekhon, J.S.: Genetic matching for estimating causal effects: a general multivariate matching method for achieving balance in observational studies. Rev. Econ. Stat. 95(3), 932–945 (2013)
Fenwick, E., O’Brien, B., Briggs, A.: Cost-effectiveness acceptability curves—facts, fallacies and frequently asked questions. Health Econ. 13(5), 405–415 (2004)
Freedman, D., Berk, R.A.: Weighting regression by propensity score. Eval. Rev. 32(4), 392–409 (2008)
Fung, V., Brand, R.J., Newhouse, J.P., Hsu, J.: Using Medicare data for comparative effectiveness research: opportunities and challenges. Am. J. Manag. Care 17(7), 489–496 (2011)
Funk, M.J., Westreich, D., Wiesen, C., Stürmer, T., Brookhart, M.A., Davidian, M.: Doubly robust estimation of causal effects. Am. J. Epidemiol. 173(7), 761–767 (2011). doi:10.1093/aje/kwq439
Glick, H., Doshi, J., Sonnad, S., Polsky, D.: Economic Evaluation in Clinical Trials. Oxford University Press, Oxford (2007)
Glynn, A.N., Quinn, K.M.: An introduction to the augmented inverse propensity weighted estimator. Political Anal. 18, 36–56 (2010)
Golinelli, D., Ridgeway, G., Rhoades, H., Tucker, J., Wenzel, S.: Bias and variance trade-offs when combining propensity score weighting and regression: with an application to HIV status and homeless men. Health Serv. Outcomes Res. Methodol. 12(2–3), 104–118 (2012)
Grieve, R., Sekhon, J.S., Hu, T.-W., Bloom, J.: Evaluating health care programs by combining cost with quality of life measures: a case study comparing capitation and fee for service. Health Serv. Res. 43(4), 1204–1222 (2008)
Gruber, S., van der Laan, M.J.: An application of collaborative targeted maximum likelihood estimation in causal inference and genomics. Int. J. Biostat. 6(1), Article 18 (2010). doi:10.2202/1557-4679.1182
Hill, J., Reiter, J.P.: Interval estimation for treatment effects using propensity score matching. Stat. Med. 25(13), 2230–2256 (2006)
Hirano, K., Imbens, G.W.: Estimation of causal effects using propensity score weighting: an application to data on right heart catheterization. Health Serv. Outcomes Res. Methodol. 2(3), 259–278 (2001)
Hirano, K., Imbens, G.W., Ridder, G.: Efficient estimation of average treatment effects using the estimated propensity score. Econometrica 71(4), 1161–1189 (2003)
Ho, D.E., Imai, K., King, G., Stuart, E.A.: Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Political Anal. 15(3), 199–236 (2007)
Imbens, G.W., Wooldridge, J.M.: Recent developments in the econometrics of program evaluation. J. Econ. Lit. 47(1), 5–86 (2009)
Jackson, C., Bojke, L., Thompson, S., Claxton, K., Sharples, L.: A framework for addressing structural uncertainty in decision models. Med. Decis. Making 31, 662–674 (2011)
Jones, A., Lomas, J., Rice, N.: Applying beta-type size distributions to healthcare cost regressions. In: HEDG working papers, vol. WP 11/31. HEDG, c/o Department of Economics, University of York, 2011
Jones, A.M.: Models for health care. In: HEDG working papers. HEDG, c/o Department of Economics, University of York, 2010
Kang, J.D.Y., Schafer, J.L.: Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data. Stat. Sci. 22(4), 523–539 (2007)
Kreif, N., Grieve, R., Radice, R., Sadique, Z., Ramsahai, R., Sekhon, J.S.: Methods for estimating subgroup effects in cost-effectiveness analyses that use observational data. Med. Decis. Making 32(6), 750–763 (2012a). doi:10.1177/0272989x12448929
Kreif, N., Grieve, R., Sadique, Z.: Statistical methods for cost-effectiveness analyses that use observational data: a critical appraisal tool and review of current practice. Health Econ. 22(4), 486–500 (2012b). doi:10.1002/hec.2806
Lee, B.K., Lessler, J., Stuart, E.A.: Improving propensity score weighting using machine learning. Stat. Med. 29(3), 337–346 (2010)
Lunceford, J.K., Davidian, M.: Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Stat. Med. 23(19), 2937–2960 (2004)
Manca, A., Austin, P.C.: Using propensity score methods to analyse individual patient-level cost-effectiveness data from observational studies. http://www.york.ac.uk/res/herc/documents/wp/08_20.pdf (2008). Accessed 15 June 2012
Manning, W.G., Basu, A., Mullahy, J.: Generalized modeling approaches to risk adjustment of skewed outcomes data. J. Health Econ. 24(3), 465–488 (2005). doi:10.1016/j.jhealeco.2004.09.011
Mihaylova, B., Briggs, A., O’Hagan, A., Thompson, S.: Review of statistical methods for analysing healthcare resources and costs. Health Econ. (2010). doi:10.1002/hec.1653
NICE: Guide to the methods of technology appraisal 2013. http://www.nice.org.uk/media/D45/1E/GuideToMethodsTechnologyAppraisal2013.pdf (2013). Accessed 10 July 2013
Nixon, R., Wonderling, D., Grieve, R.: Non-parametric methods for cost-effectiveness analysis: the central limit theorem and the bootstrap compared. Health Econ. 19(3), 316–333 (2010)
Nixon, R.M., Thompson, S.G.: Methods for incorporating covariate adjustment, subgroup analysis and between-centre differences into cost-effectiveness evaluations. Health Econ. 14(12), 1217–1229 (2005)
Pearl, J.: Causal diagrams for empirical research. Biometrika 82(4), 669–688 (1995)
Petersen, M.L., Porter, K., Gruber, S., Wang, Y., van der Laan, M.J.: Diagnosing and responding to violations in the positivity assumption. Stat. Methods Med. Res. 21(1), 31–54 (2012)
Porter, K.E., Gruber, S., van der Laan, M.J., Sekhon, J.S.: The relative performance of targeted maximum likelihood estimators. Int. J. Biostat. (2011). doi:10.2202/1557-4679
Quinn, C.: The health-economic applications of copulas: methods in applied econometric research. http://ideas.repec.org/p/yor/hectdg/07-22.html (2007). Accessed 10 Aug 2011
R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna (2011)
Radice, R., Grieve, R., Ramsahai, R., Kreif, N., Sadique, Z., Sekhon, J.S.: Evaluating treatment effectiveness in patient subgroups: a comparison of propensity score methods with an automated matching approach. Int. J. Biostat. 8(1), 25 (2012)
Robins, J., Rotnitzky, A., Zhao, L.P.: Estimation of regression coefficients when some regressors are not always observed. J. Am. Stat. Assoc. 89, 846–866 (1994)
Robins, J., Sued, M., Lei-Gomez, Q., Rotnitzky, A.: Comment: performance of double-robust estimators when “inverse probability” weights are highly variable. Stat. Sci. 22(4), 544–559 (2007)
Robins, J.M., Rotnitzky, A., Zhao, L.P.: Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. J. Am. Stat. Assoc. 90(429), 106–121 (1995)
Rosenbaum, P.R., Rubin, D.B.: The central role of the propensity score in observational studies for causal effects. Biometrika 70(1), 41–55 (1983). doi:10.1093/biomet/70.1.41
Rowan, K., Welch, C., North, E., Harrison, D.: Drotrecogin alfa (activated): real-life use and outcomes for the UK. Crit. Care 12(2), R58 (2008)
Rubin, D.B.: The use of matched sampling and regression adjustment to remove bias in observational studies. Biometrics 29, 185–203 (1973)
Rubin, D.B.: The design versus the analysis of observational studies for causal effects: parallels with the design of randomized trials. Stat. Med. 26(1), 20–36 (2007)
Rubin, D.B.: On the limitations of comparative effectiveness research. Stat. Med. 29, 1991–1995 (2010)
Rubin, D.B., Thomas, N.: Combining propensity score matching with additional adjustments for prognostic covariates. J. Am. Stat. Assoc. 95, 573–585 (2000)
Sadique, M.Z., Grieve, R., Harrison, D., Cuthbertson, B., Rowan, K.: Is Drotrecogin alfa (activated) for adults with severe sepsis, cost-effective in routine clinical practice? Crit. Care 15(5), R228 (2011)
Sekhon, J.S.: Matching: multivariate and propensity score matching with automated balance search. J. Stat. Softw. 42(7), 1–52 (2011)
Sekhon, J.S., Grieve, R.D.: A matching method for improving covariate balance in cost-effectiveness analyses. Health Econ. 21(6), 695–714 (2011). doi:10.1002/hec.1748
StataCorp: Stata Statistical Software: Release 12. StataCorp LP, College Station (2011)
Stuart, E.A.: Matching methods for causal inference: a review and a look forward. Stat. Sci. 25(1), 1–21 (2010)
Trivedi, P.K., Zimmer, D.M.: Copula Modeling: An Introduction for Practitioners, vol. 1. Foundations and Trends in Econometrics. Now Publishers Inc., Delft (2005)
Tunis, S.R., Benner, J., McClellan, M.: Comparative effectiveness research: policy context, methods development and research infrastructure. Stat. Med. 29(19), 1963–1976 (2010). doi:10.1002/sim.3818
van der Laan, M.J.: Targeted maximum likelihood based causal inference: part I. Int. J. Biostat. (2010). doi:10.2202/1557-4679.1211
van der Laan, M.J., Gruber, S.: Collaborative double robust targeted maximum likelihood estimation. Int. J. Biostat. (2010). doi:10.2202/1557-4679.1181
van der Laan, M.J., Polley, E.C., Hubbard, A.E.: Super learner. Stat. Appl. Genet. Mol. Biol. (2007). doi:10.2202/1544-6115.1309
Westreich, D., Cole, S.R.: Invited commentary: positivity in practice. Am. J. Epidemiol. 171(6), 674–677 (2010). doi:10.1093/aje/kwp436
Westreich, D., Lessler, J., Funk, M.: Propensity score estimation: neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression. J. Clin. Epidemiol. 63(8), 826–833 (2010)
Willan, A.R., Briggs, A.H., Hoch, J.S.: Regression methods for covariate adjustment and subgroup analysis for non-censored cost-effectiveness data. Health Econ. 13(5), 461–475 (2004)

Acknowledgments

We thank Zia Sadique (LSHTM) for help in the motivating case study, Roland Ramsahai (University of Cambridge) for valuable comments on the Monte Carlo simulations, Manuel Gomes, Karla Diaz-Ordaz, Adam Steventon, Rhian Daniel (all LSHTM) and Susan Gruber (Harvard School of Public Health) for comments on the manuscript. We also thank David Harrison and Kathy Rowan (ICNARC) for access to the data used in the case study. Funding from the Economic and Social Research Council (Grant no. RES-061-25-0343) is greatly appreciated.

Author information

Authors and Affiliations

Department of Health Services Research and Policy, London School of Hygiene and Tropical Medicine, 15-17 Tavistock Place, London, WC1H 9SH, UK
Noémi Kreif, Richard Grieve & Rosalba Radice
Departments of Political Science and Statistics, University of California, Berkeley, CA, USA
Jasjeet S. Sekhon

Corresponding author

Correspondence to Noémi Kreif.

Appendices

Appendix 1

See Tables 6 and 7.

Table 6 Monte Carlo simulation results: relative bias and RMSE of the estimated incremental cost

Table 7 Monte Carlo simulation results: relative bias and RMSE of the estimated incremental QALYs

Appendix 2: Code for implementing the methods

This section provides code for the implementation of the combined statistical approaches proposed in the paper, using the R (R Development Core Team 2011) and Stata (StataCorp 2011) statistical software packages. The user-written functions implemented here call some pre-written R routines, for example “glm” for generalised linear models, or the “Matching” library (Sekhon 2011). When implementing the methods in Stata, we use the NNMATCH routine (Abadie et al. 2004b) for matching.
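
The authors' R and Stata code is not reproduced in this extract. As a rough indication of the idea behind regression-adjusted matching, the following Python sketch performs 1:1 nearest-neighbour matching with replacement on a single covariate and then applies a regression bias correction in the style of Abadie and Imbens (2011). It is a simplified stand-in under stated assumptions, not the paper's implementation: the matching variable `x` (which could, for example, be an estimated PS), the linear endpoint regression, and the function names are all hypothetical, and neither the "Matching" library nor NNMATCH is used.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def regression_adjusted_matching_ate(x, treated, y):
    """ATE from 1:1 matching with replacement plus regression adjustment."""
    arm = {a: [i for i, t in enumerate(treated) if t == a] for a in (0, 1)}
    # Endpoint regression fitted separately within each treatment arm.
    fit = {a: fit_line([x[i] for i in arm[a]], [y[i] for i in arm[a]])
           for a in (0, 1)}

    def predict(a, xi):
        intercept, slope = fit[a]
        return intercept + slope * xi

    effects = []
    for i, t in enumerate(treated):
        other = 1 - t
        # Closest individual in the opposite arm (matching with replacement).
        j = min(arm[other], key=lambda k: abs(x[k] - x[i]))
        # Matched outcome, corrected for the remaining covariate gap.
        imputed = y[j] + predict(other, x[i]) - predict(other, x[j])
        effects.append(y[i] - imputed if t == 1 else imputed - y[i])
    return fmean(effects)


# Hypothetical data in which the endpoint is exactly 2 + 3 * x + 5 * treated.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.5]
treated = [1, 0, 1, 0, 1, 0]
y = [2.0 + 3.0 * xi + 5.0 * t for xi, t in zip(x, treated)]
ate = regression_adjusted_matching_ate(x, treated, y)
```

Because the matches in this toy data are inexact, unadjusted matching would leave residual imbalance in `x`; the regression term removes it, and the sketch recovers the treatment effect of 5 built into the data.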

Appendix 3: R code for generating data in the simulations
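
The original R code is not reproduced in this extract. As a hedged illustration of the data-generating idea described in the Notes (a copula linking a gamma-distributed cost to a normally distributed QALY endpoint), here is a minimal Python sketch. The choice of a Gaussian copula, every parameter value, and the function names are assumptions made for illustration only and are not taken from the paper.

```python
import math
import random
from statistics import NormalDist


def gamma_cdf(x, shape, scale):
    """Gamma CDF via the power series of the regularised incomplete gamma."""
    if x <= 0.0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))


def gamma_ppf(u, shape, scale):
    """Gamma quantile function, obtained by bisection on gamma_cdf."""
    lo, hi = 0.0, shape * scale
    while gamma_cdf(hi, shape, scale) < u:
        hi *= 2.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape, scale) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def draw_cost_qaly(n, rho=0.4, shape=2.0, scale=5000.0,
                   qaly_mean=1.5, qaly_sd=0.5, seed=1):
    """Draw (cost, QALY) pairs from a Gaussian copula (all values hypothetical)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    draws = []
    for _ in range(n):
        # Correlated standard normals define the dependence structure.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Gamma marginal for cost: map z1 to a uniform, then to a gamma quantile.
        u1 = min(max(std_normal.cdf(z1), 1e-12), 1.0 - 1e-12)
        cost = gamma_ppf(u1, shape, scale)
        # Normal marginal for QALYs: the quantile transform is linear in z2.
        qaly = qaly_mean + qaly_sd * z2
        draws.append((cost, qaly))
    return draws


sample = draw_cost_qaly(500)
```

The copula parameter `rho` controls the dependence between the two endpoints, while the marginal distributions are set separately; the resulting Pearson correlation between cost and QALYs is generally somewhat smaller than `rho` because the gamma transform is non-linear.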

About this article

Cite this article

Kreif, N., Grieve, R., Radice, R. et al. Regression-adjusted matching and double-robust methods for estimating average treatment effects in health economic evaluation. Health Serv Outcomes Res Method 13, 174–202 (2013). https://doi.org/10.1007/s10742-013-0109-2

Received: 21 January 2013
Revised: 01 October 2013
Accepted: 04 October 2013
Published: 23 October 2013
Issue Date: December 2013
DOI: https://doi.org/10.1007/s10742-013-0109-2
