Lifetime Data Analysis, Volume 13, Issue 3, pp 351–369

A marginal regression model for multivariate failure time data with a surviving fraction

  • Yingwei Peng
  • Jeremy M. G. Taylor
  • Binbing Yu


Marginal regression for correlated censored survival data has become a widely used statistical method. Examples in survival analysis range from the early work of Wei et al. (J Am Stat Assoc 84:1065–1073, 1989) to the more recent work of Spiekerman and Lin (J Am Stat Assoc 93:1164–1175, 1998). This approach is particularly useful when a covariate’s population-average effect is of primary interest and the correlation structure is not of interest or cannot be appropriately specified for lack of sufficient information. In this paper, we consider a semiparametric marginal proportional hazards mixture cure model for clustered survival data with a surviving or “cure” fraction. Unlike the clustered data in previous work, the latent binary cure statuses of patients in one cluster tend to be correlated, in addition to the possibly correlated failure times among the patients in the cluster who are not cured. Specifying appropriate correlation structures for the data becomes even more complex if the potential correlation between cure statuses and failure times within the cluster has to be considered, and thus a marginal regression approach is particularly attractive. We formulate the model, obtain estimates using an EM algorithm, and derive expressions for the variance–covariance matrix using sandwich estimators. Simulation studies are conducted to assess the finite-sample properties of the proposed model. The marginal model is applied to a multi-institutional study of local recurrences in tonsil cancer patients who received radiation therapy. It reveals new findings that were not available from previous analyses of this study, which ignored the potential correlation between patients within the same institution.
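
As a rough illustration of the model class named in the abstract, the sketch below uses the standard proportional hazards mixture cure formulation from the literature (for example, Sy and Taylor 2000 in the reference list): a logit link for the probability of being uncured, a population survival function that mixes cured and uncured subjects, and the E-step weight used by EM algorithms for such models. The function names, argument names, and notation are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the paper's marginal estimation procedure or its sandwich variance estimator.

```python
import math


def uncured_probability(gamma, z):
    """Logit link (assumed notation): P(uncured | z) = expit(gamma'z)."""
    eta = sum(g * v for g, v in zip(gamma, z))
    # Numerically stable logistic function for large |eta|.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def population_survival(pi_uncured, surv_uncured):
    """Mixture cure survival: S(t) = (1 - pi) + pi * S_u(t).

    Cured subjects never fail, so they contribute survival 1 with weight
    1 - pi; uncured subjects contribute S_u(t) with weight pi.
    """
    return (1.0 - pi_uncured) + pi_uncured * surv_uncured


def e_step_weight(event, pi_uncured, surv_uncured):
    """Conditional probability that a subject is uncured given the data.

    A subject with an observed failure is uncured with certainty. A censored
    subject is uncured with probability pi * S_u(t) / S(t). The degenerate
    case pi = 1 and S_u(t) = 0 is not handled in this sketch.
    """
    if event:
        return 1.0
    return pi_uncured * surv_uncured / population_survival(pi_uncured, surv_uncured)
```

For instance, a censored subject with an uncured probability of 0.5 and an uncured survival of 0.5 at the censoring time receives weight 0.25 / 0.75 = 1/3. In a full EM fit these weights would feed a weighted logistic regression and a weighted Cox partial likelihood in the M-step.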


Keywords: Copula; Cure; Logit link; Mixture model; Proportional hazards; Sandwich variance estimate; Semiparametric; Tonsil cancer



  1. Chandler RE and Bate S (2007). Inference for clustered data using the independence loglikelihood. Biometrika 94: 167–183
  2. Chatterjee N and Shih J (2001). A bivariate cure-mixture approach for modeling familial association in disease. Biometrics 57: 779–786
  3. Chen M-H, Ibrahim JG and Sinha D (2002). Bayesian inference for multivariate survival data with a cure fraction. J Multivariate Anal 80: 101–126
  4. Clayton DG (1978). A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika 65: 141–152
  5. Cox DR (1972). Regression models and life-tables. J Roy Stat Soc Ser B 34: 187–220
  6. Fang H-B, Li G and Sun J (2005). Maximum likelihood estimation in a semiparametric logistic/proportional-hazards mixture model. Scand J Stat 32: 59–75
  7. Kalbfleisch JD and Prentice RL (2002). The statistical analysis of failure time data, 2nd edn. Wiley, New York
  8. Kuk AYC and Chen C (1992). A mixture model combining logistic regression with proportional hazards regression. Biometrika 79: 531–541
  9. Lee AJ (1993). Generating random binary deviates having fixed marginal distributions and specified degree of association. Am Stat 47: 209–215
  10. Li C-S and Taylor JMG (2002). A semi-parametric accelerated failure time cure model. Stat Med 21: 3235–3247
  11. Liang K-Y and Zeger SL (1986). Longitudinal data analysis using generalized linear models. Biometrika 73: 13–22
  12. Lin DY (1994). Cox regression analysis of multivariate failure time data: the marginal approach. Stat Med 13: 2233–2247
  13. Lipsitz SR, Dear KBG and Zhao L (1994). Jackknife estimators of variance for parameter estimates from estimating equations with applications to clustered survival data. Biometrics 50: 842–846
  14. Lipsitz SR and Parzen M (1996). A jackknife estimator of variance for Cox regression for correlated survival data. Biometrics 52: 291–298
  15. McLachlan G and Peel D (2000). Finite mixture models. Wiley, New York
  16. Oakes D (1999). Direct calculation of the information matrix via the EM algorithm. J Roy Stat Soc Ser B 61: 479–482
  17. Park CG, Park T and Shin DW (1996). A simple method for generating correlated binary variates. Am Stat 50: 306–310
  18. Peng Y (2003). Fitting semiparametric cure models. Comput Stat Data Anal 41: 481–490
  19. Peng Y and Dear KBG (2000). A nonparametric mixture model for cure rate estimation. Biometrics 56: 237–243
  20. Peng Y, Dear KBG and Denham JW (1998). A generalized F mixture model for cure rate estimation. Stat Med 17: 813–830
  21. Royall RM (1986). Model robust confidence intervals using maximum likelihood estimators. Int Stat Rev 54: 221–226
  22. Spiekerman CF and Lin DY (1998). Marginal regression models for multivariate failure time data. J Am Stat Assoc 93: 1164–1175
  23. Sy JP and Taylor JMG (2000). Estimation in a Cox proportional hazards cure model. Biometrics 56: 227–236
  24. Tsodikov AD, Ibrahim JG and Yakovlev AY (2003). Estimating cure rates from survival data: an alternative to two-component mixture models. J Am Stat Assoc 98: 1063–1078
  25. Wei LJ, Lin DY and Weissfeld L (1989). Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J Am Stat Assoc 84: 1065–1073
  26. Wienke A, Lichtenstein P and Yashin AI (2003). A bivariate frailty model with a cure fraction for modeling familial correlation in disease. Biometrics 59: 1178–1183
  27. Withers HR, Peters LJ, Taylor JMG, Owen JB, Morrison WH, Schultheiss TE, Keane T, O’Sullivan B, Gupta N, Wang CC, Jones CU, Doppke KP, Myint S, Thompson M, Parsons JT, Mendenhall WM, Dische S, Aird EGA, Henk JM, Bidmean MAM, Svoboda V, Chon Y, Hanlon AL, Peters TL, Hanks GE and Dyk J (1995). Local control of carcinoma of the tonsil by radiation therapy: an analysis of patterns of fractionation in nine institutions. Int J Radiat Oncol Biol Phys 33: 549–562
  28. Yamaguchi K (1992). Accelerated failure-time regression models with a regression model of surviving fraction: an application to the analysis of ‘permanent employment’ in Japan. J Am Stat Assoc 87: 284–292
  29. Yau KKW and Ng ASK (2001). Long-term survivor mixture model with random effects: application to a multi-centre clinical trial of carcinoma. Stat Med 20: 1591–1607

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Yingwei Peng (1, corresponding author)
  • Jeremy M. G. Taylor (2)
  • Binbing Yu (3)

  1. Department of Community Health and Epidemiology, Queen’s University, Kingston, Canada
  2. Department of Biostatistics, University of Michigan, Ann Arbor, USA
  3. National Institute on Aging, Bethesda, USA
