Matching Methods for Obtaining Survival Functions to Estimate the Effect of a Time-Dependent Treatment


In observational studies of survival time featuring a binary time-dependent treatment, the hazard ratio (an instantaneous measure) is often used to represent the treatment effect. However, investigators are often more interested in the difference in survival functions. We propose semiparametric methods to estimate the causal effect of treatment among the treated with respect to survival probability. The objective is to compare post-treatment survival with the survival function that would have been observed in the absence of treatment. For each patient, we compute a prognostic score (based on the pre-treatment death hazard) and a propensity score (based on the treatment hazard). Each treated patient is then matched with an alive, uncensored and not-yet-treated patient with similar prognostic and/or propensity scores. The experience of each treated and matched patient is weighted using a variant of Inverse Probability of Censoring Weighting to account for the impact of censoring. We propose estimators of the treatment-specific survival functions (and their difference), computed through weighted Nelson–Aalen estimators. Closed-form variance estimators are proposed which take into consideration the potential replication of subjects across matched sets. The proposed methods are evaluated through simulation, then applied to estimate the effect of kidney transplantation on survival among end-stage renal disease patients using data from a national organ failure registry.
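As a rough illustration of the last estimation step described above, the sketch below computes a weighted Nelson–Aalen cumulative hazard and converts it to a survival curve via S(t) = exp(-Λ(t)), once for the treated patients and once for their matched controls. It is a minimal sketch under stated assumptions, not the authors' estimator: the function names, the toy data, and the use of a single time-fixed weight per subject are all assumptions made for illustration. The weighting described in the abstract is a variant of Inverse Probability of Censoring Weighting and may vary over time. Follow-up time is assumed to be measured from the time of matching.

```python
import math


def weighted_nelson_aalen(times, events, weights):
    """Weighted Nelson-Aalen cumulative hazard and the implied survival curve.

    times   : follow-up time per subject (death or censoring), from matching
    events  : 1 if death was observed, 0 if censored
    weights : non-negative subject weights (assumed time-fixed in this sketch)
    """
    # Distinct times at which at least one death was observed.
    event_times = sorted({t for t, d in zip(times, events) if d == 1})
    cum_hazard, survival = [], []
    running = 0.0
    for t in event_times:
        # Weighted number still at risk just before t.
        at_risk = sum(w for s, w in zip(times, weights) if s >= t)
        # Weighted number of deaths exactly at t.
        deaths = sum(
            w for s, d, w in zip(times, events, weights) if s == t and d == 1
        )
        if at_risk > 0:  # skip the increment if all remaining weight is zero
            running += deaths / at_risk
        cum_hazard.append(running)
        survival.append(math.exp(-running))
    return event_times, cum_hazard, survival


def survival_at(t, event_times, survival):
    """Evaluate the right-continuous step function S(t); S = 1 before the first death."""
    s = 1.0
    for u, v in zip(event_times, survival):
        if u <= t:
            s = v
    return s


# Hypothetical toy data: (times, events, weights) for each group.
treated = ([2.0, 5.0, 6.0, 8.0], [0, 1, 0, 1], [1.0, 1.0, 1.2, 1.1])
matched = ([1.0, 3.0, 4.0, 7.0], [1, 1, 0, 1], [1.0, 1.3, 1.0, 1.0])

t_trt, _, s_trt = weighted_nelson_aalen(*treated)
t_ctl, _, s_ctl = weighted_nelson_aalen(*matched)

# Estimated difference in survival probability at a chosen follow-up time.
diff_at_5 = survival_at(5.0, t_trt, s_trt) - survival_at(5.0, t_ctl, s_ctl)
print(diff_at_5)
```

With unit weights the function reduces to the ordinary Nelson–Aalen estimator. The sketch does not address variance estimation; the closed-form variance estimators mentioned above account for subjects appearing in more than one matched set.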






This work was supported in part by National Institutes of Health grant 5R01-DK070869. The authors thank the Canadian Institute for Health Information for access to the Canadian Organ Replacement Register database.

The authors would like to thank Jeffrey Smith and Brenda Gillespie for their valuable input on the work.

Author information



Corresponding author

Correspondence to Yun Li.


About this article

Cite this article

Li, Y., Schaubel, D.E. & He, K. Matching Methods for Obtaining Survival Functions to Estimate the Effect of a Time-Dependent Treatment. Stat Biosci 6, 105–126 (2014).


Keywords

  • Causal inference
  • Matching
  • Observational study
  • Propensity score
  • Survival function
  • Time-dependent treatment