Statistics and Computing, Volume 24, Issue 4, pp 559–568

Fast accelerated failure time modeling for case-cohort data

  • Sy Han Chiou
  • Sangwook Kang
  • Jun Yan


Semiparametric accelerated failure time (AFT) models directly relate the expected failure times to covariates and are a useful alternative to models that work on the hazard function or the survival function. For case-cohort data, AFT models have received much less development. In addition to covariates being missing for controls outside the sub-cohort, the challenges of AFT model inference with full-cohort data remain. The regression parameter estimator is hard to compute because the most widely used rank-based estimating equations are not smooth. Further, its variance depends on the unspecified error distribution, and most methods rely on a computationally intensive bootstrap to estimate it. We propose fast rank-based inference procedures for AFT models, applying recent methodological advances to the context of case-cohort data. Parameters are estimated with an induced smoothing approach that smooths the estimating functions and facilitates the numerical solution. Variance estimators are obtained through efficient resampling methods for nonsmooth estimating functions that avoid a full-blown bootstrap. Simulation studies suggest that the recommended procedure provides fast and valid inferences among several competing procedures. Application to a tumor study demonstrates the utility of the proposed method in routine data analysis.
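To make the induced smoothing idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single covariate, a Gehan-type rank estimating function, an identity working covariance in the smoothing term, and inverse-probability weights from Bernoulli sub-cohort sampling with known probability `p`. The function names, the simulated data, and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def smoothed_gehan(beta, y, delta, x, w):
    """Weighted, induced-smoothed Gehan estimating function (one covariate).

    The indicator I(e_j >= e_i) of the nonsmooth version is replaced by
    Phi((e_j - e_i) / r_ij), with r_ij = |x_i - x_j| / sqrt(n)
    (assumption: identity working covariance).
    """
    n = len(y)
    e = [y[i] - beta * x[i] for i in range(n)]  # residuals on the log scale
    total = 0.0
    for i in range(n):
        if not delta[i]:  # only observed failures contribute as index i
            continue
        for j in range(n):
            dx = x[i] - x[j]
            if w[j] == 0.0 or dx == 0.0:  # unsampled control or zero term
                continue
            r_ij = abs(dx) / math.sqrt(n)
            total += w[i] * w[j] * dx * norm_cdf((e[j] - e[i]) / r_ij)
    return total / n ** 2


def solve_beta(y, delta, x, w, lo=-10.0, hi=10.0, tol=1e-8):
    """Bisection root search; the smoothed function is nondecreasing in beta."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if smoothed_gehan(mid, y, delta, x, w) < 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical simulated cohort: log T = beta_true * x + error, independent censoring.
random.seed(2013)
n, p, beta_true = 200, 0.3, 1.0
x = [random.gauss(0.0, 1.0) for _ in range(n)]
log_t = [beta_true * xi + random.gauss(0.0, 0.5) for xi in x]
log_c = [random.gauss(-0.8, 1.0) for _ in range(n)]
y = [min(t, c) for t, c in zip(log_t, log_c)]
delta = [t <= c for t, c in zip(log_t, log_c)]

# Case-cohort sampling: all cases are kept (weight 1); controls are kept only
# if they fall in the sub-cohort (weight 1/p); other controls get weight 0.
in_subcohort = [random.random() < p for _ in range(n)]
w = [1.0 if d else (1.0 / p if s else 0.0) for d, s in zip(delta, in_subcohort)]

beta_hat = solve_beta(y, delta, x, w)
print("estimated slope:", beta_hat)
```

Bisection suffices here only because the one-covariate smoothed function is continuous and monotone in the parameter. With several covariates the same smoothed function would be passed to a general nonlinear equation solver instead.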


Keywords: Induced smoothing; Multiplier bootstrap; Resampling



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Department of Statistics, University of Connecticut, Storrs-Mansfield, USA
  2. Institute for Public Health Research, University of Connecticut Health Center, East Hartford, USA
  3. Center for Environmental Sciences & Engineering, University of Connecticut, Storrs-Mansfield, USA
