Regression-based estimation of heterogeneous treatment effects when extending inferences from a randomized trial to a target population

  • ESSAY
European Journal of Epidemiology

Abstract

Most work on extending (generalizing or transporting) inferences from a randomized trial to a target population has focused on estimating average treatment effects (i.e., averaged over the target population’s covariate distribution). Yet, in the presence of strong effect modification by baseline covariates, the average treatment effect in the target population may be less relevant for guiding treatment decisions. Instead, the conditional average treatment effect (CATE) as a function of key effect modifiers may be a more useful estimand. Recent work on estimating target population CATEs using baseline covariate, treatment, and outcome data from the trial and covariate data from the target population only allows for the examination of heterogeneity over distinct subgroups. We describe flexible pseudo-outcome regression modeling methods for estimating target population CATEs conditional on discrete or continuous baseline covariates when the trial is embedded in a sample from the target population (i.e., in nested trial designs). We construct pointwise confidence intervals for the CATE at a specific value of the effect modifiers and uniform confidence bands for the CATE function. Last, we illustrate the methods using data from the Coronary Artery Surgery Study (CASS) to estimate CATEs given history of myocardial infarction and baseline ejection fraction value in the target population of all trial-eligible patients with stable ischemic heart disease.
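The pseudo-outcome regression idea summarized above can be illustrated with a minimal sketch on simulated data. This is not the authors' implementation (their code is in the repository listed under Code availability). It assumes one common doubly robust construction of the pseudo-outcome for nested trial designs, built from a participation model Pr[S=1|X], the randomization probability in the trial, and arm-specific outcome regressions fitted among participants. All function and variable names are hypothetical, the data are simulated rather than taken from CASS, the nuisance models are simple parametric fits, and the sandwich variance treats the estimated nuisance functions as fixed, so the interval is only a rough pointwise approximation.

```python
import numpy as np


def expit(t):
    return 1.0 / (1.0 + np.exp(-t))


def fit_logistic(Z, s, iters=25):
    """Logistic regression fitted by Newton-Raphson; returns coefficients."""
    beta = np.zeros(Z.shape[1])
    for _ in range(iters):
        p = expit(Z @ beta)
        grad = Z.T @ (s - p)
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def fit_ols(Z, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(Z, y, rcond=None)[0]


def cate_pseudo_outcome_regression(x, s, a, y, x0):
    """Linear CATE regression on an assumed doubly robust pseudo-outcome.

    x: baseline covariate (also the effect modifier), observed for everyone
    s: 1 for trial participants, 0 for non-randomized cohort members
    a, y: treatment and outcome, used only where s == 1
    x0: covariate value at which a pointwise interval is reported
    """
    n = x.size
    Z = np.column_stack([np.ones(n), x])
    p = expit(Z @ fit_logistic(Z, s))      # estimated Pr[S=1 | X]
    e1 = a[s == 1].mean()                  # estimated Pr[A=1 | S=1]
    arm1 = (s == 1) & (a == 1)
    arm0 = (s == 1) & (a == 0)
    g1 = Z @ fit_ols(Z[arm1], y[arm1])     # E[Y | X, S=1, A=1] for all rows
    g0 = Z @ fit_ols(Z[arm0], y[arm0])     # E[Y | X, S=1, A=0] for all rows
    # Pseudo-outcome: weighted trial residuals plus outcome-model contrast.
    phi = (s * a / (p * e1) * (y - g1)
           - s * (1 - a) / (p * (1 - e1)) * (y - g0)
           + g1 - g0)
    # Regress the pseudo-outcome on the effect modifier over the whole cohort.
    beta = fit_ols(Z, phi)
    resid = phi - Z @ beta
    bread = np.linalg.inv(Z.T @ Z)
    meat = (Z * resid[:, None] ** 2).T @ Z
    V = bread @ meat @ bread               # heteroskedasticity-robust variance
    z0 = np.array([1.0, x0])
    est = z0 @ beta
    se = np.sqrt(z0 @ V @ z0)
    return beta, est, (est - 1.96 * se, est + 1.96 * se)


# Simulated nested design: the true CATE is 1 + 2 * x by construction.
rng = np.random.default_rng(0)
n = 4000
x = rng.uniform(-1.0, 1.0, n)
s = rng.binomial(1, expit(0.3 + 0.8 * x)).astype(float)
a = s * rng.binomial(1, 0.5, n)            # randomized only within the trial
y = s * (1.0 + x + a * (1.0 + 2.0 * x) + rng.normal(size=n))  # 0 if unobserved

beta_hat, est, ci = cate_pseudo_outcome_regression(x, s, a, y, x0=0.5)
print("CATE coefficients (intercept, slope):", beta_hat)
print("CATE at x = 0.5 with approximate 95% interval:", est, ci)
```

The final regression step is where flexibility would enter: the linear fit here could be replaced with a series or kernel regression, and uniform confidence bands need additional machinery that this sketch does not attempt.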

Data Availability

The analyses in our paper used CASS research materials obtained from the National Heart, Lung, and Blood Institute (NHLBI) Biologic Specimen and Data Repository Information Coordinating Center.

Code availability

Code is available online at GitHub [https://github.com/serobertson/GeneralizabilityCATE].

Abbreviations

CASS: Coronary Artery Surgery Study

CATE: Conditional average treatment effect

MI: Myocardial infarction

Funding

This work was supported in part by Agency for Healthcare Research and Quality (AHRQ) award R36HS028373-01; Patient-Centered Outcomes Research Institute (PCORI) awards ME-1502-27794, ME-2019C3-17875, and ME-2021C2-22365; and National Library of Medicine (NLM) award R01LM013616. The content of this paper is solely the responsibility of the authors and does not necessarily represent the official views of PCORI, the PCORI Board of Governors, the PCORI Methodology Committee, NLM, or the CASS investigators.

Author information

Contributions

All authors were involved in drafting the manuscript and have read and approved the final version submitted. SER conducted the statistical analysis.

Corresponding author

Correspondence to Issa J. Dahabreh.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Ethical approval

Our research is not human subjects research.

Cite this article

Robertson, S.E., Steingrimsson, J.A. & Dahabreh, I.J. Regression-based estimation of heterogeneous treatment effects when extending inferences from a randomized trial to a target population. Eur J Epidemiol 38, 123–133 (2023). https://doi.org/10.1007/s10654-022-00901-5

  • DOI: https://doi.org/10.1007/s10654-022-00901-5
