Fast accelerated failure time modeling for case-cohort data

Statistics and Computing

Abstract

Semiparametric accelerated failure time (AFT) models directly relate the expected failure times to covariates and are a useful alternative to models that work on the hazard function or the survival function. For case-cohort data, AFT models have received much less development. In addition to the covariates that are missing for controls outside the sub-cohort, the challenges of AFT model inference with a full cohort remain. The regression parameter estimator is hard to compute because the most widely used rank-based estimating equations are not smooth. Further, its variance depends on the unspecified error distribution, and most methods rely on a computationally intensive bootstrap to estimate it. We propose fast rank-based inference procedures for AFT models, applying recent methodological advances to the context of case-cohort data. Parameters are estimated with an induced smoothing approach that smooths the estimating functions and facilitates the numerical solution. Variance estimators are obtained through efficient resampling methods for nonsmooth estimating functions that avoid a full-blown bootstrap. Simulation studies suggest that the recommended procedure provides fast and valid inferences among several competing procedures. An application to a tumor study demonstrates the utility of the proposed method in routine data analysis.
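
To make the induced smoothing idea concrete, the following is a minimal sketch for a full cohort, not the authors' implementation. It assumes residuals of the form e_i(β) = log T_i − X_iᵀβ, a Gehan-type weight, and the commonly used smoothing scale r_ij² = (X_i − X_j)ᵀ Σ (X_i − X_j) / n with Σ defaulting to the identity, in the spirit of Brown and Wang (2007). The function and argument names are hypothetical, and case-cohort weights are deliberately left out.

```python
import math

import numpy as np

# Standard normal CDF, vectorized over NumPy arrays (avoids a SciPy dependency).
_norm_cdf = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))


def smoothed_gehan(beta, log_time, delta, X, Sigma=None):
    """Induced-smoothed Gehan-type estimating function (illustrative sketch).

    Assumes e_i(beta) = log_time_i - X_i' beta. The indicator I[e_j >= e_i]
    is replaced by Phi((e_j - e_i) / r_ij), which is smooth in beta.
    """
    X = np.asarray(X, dtype=float)
    delta = np.asarray(delta, dtype=float)
    n, p = X.shape
    if Sigma is None:
        Sigma = np.eye(p)  # assumed default working covariance
    e = np.asarray(log_time, dtype=float) - X @ np.asarray(beta, dtype=float)

    dX = X[:, None, :] - X[None, :, :]  # dX[i, j] = X_i - X_j
    de = e[None, :] - e[:, None]        # de[i, j] = e_j - e_i
    r2 = np.einsum("ijk,kl,ijl->ij", dX, Sigma, dX) / n
    r = np.sqrt(np.maximum(r2, 0.0))

    # Where r_ij = 0 the covariate difference is zero too, so the term vanishes.
    z = np.divide(de, r, out=np.zeros_like(de), where=r > 0)
    w = delta[:, None] * _norm_cdf(z)
    return (w[:, :, None] * dX).sum(axis=(0, 1)) / n
```

Because the returned vector is differentiable in `beta`, a general-purpose nonlinear equation solver can be applied to it directly, which is the practical benefit that smoothing buys over the step-function version.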

References

  • Barlow, W.E.: Robust variance estimation for the case-cohort design. Biometrics 50, 1064–1072 (1994)

  • Breslow, N.E., Lumley, T., Ballantyne, C.M., Chambless, L.E., Kulich, M.: Improved Horvitz–Thompson estimation of model parameters from two-phase stratified samples: applications in epidemiology. Stat. Biosci. 1, 32–49 (2009)

  • Brown, B.M., Wang, Y.-G.: Standard errors and covariance matrices for smoothed rank estimators. Biometrika 92, 149–158 (2005)

  • Brown, B.M., Wang, Y.-G.: Induced smoothing for rank regression with censored survival times. Stat. Med. 26, 828–836 (2007)

  • Chen, H.Y.: Fitting semiparametric transformation regression models to data from a modified case-cohort design. Biometrika 88, 255–268 (2001a)

  • Chen, H.Y.: Weighted semiparametric likelihood method for fitting a proportional odds regression model to data from the case cohort design. J. Am. Stat. Assoc. 96, 1446–1458 (2001b)

  • Chiou, S., Kang, S., Yan, J.: aftgee: Accelerated failure time model with generalized estimating equations. R package version 0.2-27 (2012)

  • D’Angio, G.J., Breslow, N., Beckwith, J.B., Evans, A., Baum, E., Delorimier, A., Fernbach, D., Hrabovsky, E., Jones, B., Kelalis, P., Othersen, H.B., Tefft, M., Thomas, P.R.M.: Treatment of Wilms’ tumor. Results of the third national Wilms’ tumor study. Cancer 64, 349–360 (1989)

  • Green, D., Breslow, N., Beckwith, J., Finklestein, J., Grundy, P., Thomas, P., Kim, T., Shochat, S., Haase, G., Ritchey, M., Kelalis, P., D’Angio, G.: Comparison between single-dose and divided-dose administration of dactinomycin and doxorubicin for patients with Wilms’ tumor: a report from the National Wilms’ Tumor Study Group. J. Clin. Oncol. 16, 237–245 (1998)

  • Hasselman, B.: nleqslv: Solve systems of non linear equations. R package version 1.9.3 (2012). http://CRAN.R-project.org/package=nleqslv

  • Huang, Y.: Calibration regression of censored lifetime medical cost. J. Am. Stat. Assoc. 97, 318–327 (2002)

  • Jin, Z., Lin, D.Y., Wei, L.J., Ying, Z.: Rank-based inference for the accelerated failure time model. Biometrika 90, 341–353 (2003)

  • Johnson, L.M., Strawderman, R.L.: Induced smoothing for the semiparametric accelerated failure time model: asymptotics and extensions to clustered data. Biometrika 96, 577–590 (2009)

  • Kalbfleisch, J.D., Lawless, J.F.: Likelihood analysis of multistate models for disease incidence and mortality. Stat. Med. 7, 149–160 (1988)

  • Kang, S., Cai, J.: Marginal hazards model for case-cohort studies with multiple disease outcomes. Biometrika 96, 887–901 (2009)

  • Kong, L., Cai, J.: Case-cohort analysis with accelerated failure time model. Biometrics 65, 135–142 (2009)

  • Kong, L., Cai, J., Sen, P.K.: Weighted estimating equations for semiparametric transformation models with censored data from a case-cohort design. Biometrika 91, 305–319 (2004)

  • Kulich, M., Lin, D.: Additive hazards regression for case-cohort studies. Biometrika 87, 73–87 (2000)

  • Kulich, M., Lin, D.: Improving the efficiency of relative-risk estimation in case-cohort studies. J. Am. Stat. Assoc. 99, 832–844 (2004)

  • Lin, D.Y., Ying, Z.: Cox regression with incomplete covariate measurements. J. Am. Stat. Assoc. 88, 1341–1349 (1993)

  • Lu, W., Tsiatis, A.A.: Semiparametric transformation models for the case-cohort study. Biometrika 93, 207–214 (2006)

  • Nan, B., Yu, M., Kalbfleisch, J.D.: Censored linear regression for case-cohort studies. Biometrika 93, 747–762 (2006)

  • Nan, B., Kalbfleisch, J.D., Yu, M.: Asymptotic theory for the semiparametric accelerated failure time model with missing data. Ann. Stat. 37, 2351–2376 (2009)

  • Prentice, R.L.: Linear rank tests with right censored data (Corr: V70 p304). Biometrika 65, 167–180 (1978)

  • Prentice, R.L.: A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 73, 1–11 (1986)

  • Self, S.G., Prentice, R.L.: Asymptotic distribution theory and efficiency results for case-cohort studies. Ann. Stat. 16, 64–81 (1988)

  • Sun, J., Sun, L., Flournoy, N.: Additive hazards model for competing risks analysis of the case-cohort design. Commun. Stat., Theory Methods 33, 351–366 (2004)

  • Therneau, T.M., Li, H.: Computing the Cox model for case cohort designs. Lifetime Data Anal. 5, 99–112 (1999)

  • Tsiatis, A.A.: Estimating regression parameters using linear rank tests for censored data. Ann. Stat. 18, 354–372 (1990)

  • Varadhan, R., Gilbert, P.: BB: an R package for solving a large system of nonlinear equations and for optimizing a high-dimensional nonlinear objective function. J. Stat. Softw. 32, 1–26 (2009). http://www.jstatsoft.org/v32/i04/

  • Wacholder, S., Gail, M.H., Pee, D., Brookmeyer, R.: Alternative variance and efficiency calculations for the case-cohort design. Biometrika 76, 117–123 (1989)

  • Wang, Y.-G., Fu, L.: Rank regression for the accelerated failure time model with clustered and censored data. Comput. Stat. Data Anal. 55, 2334–2343 (2011)

  • Ying, Z.: A large sample study of rank estimation for censored regression data. Ann. Stat. 21, 76–99 (1993)

  • Yu, M.: Buckley-James type estimator in censored data with covariates missing by design. Scand. J. Stat. 38, 252–267 (2011)

  • Yu, Q., Wong, G.Y.C., Yu, M.: Buckley-James-type of estimators under the classical case cohort design. Ann. Inst. Stat. Math. 59, 675–695 (2007)

  • Zeng, D., Lin, D.Y.: Efficient resampling methods for nonsmooth estimating functions. Biostatistics 9, 355–363 (2008)

Author information

Corresponding author

Correspondence to Jun Yan.

Appendix: Analytical details

We give the analytical form of the \(S_i(\beta)\) here. Define the general rank-based weighted estimating function (Jin et al. 2003)

$$ U_n(\beta)=\sum_{i=1}^n \Delta_i \varphi_{n,i}(\beta) \biggl[ X_i- \frac{W^{(1)}_{n,i}(\beta)}{W^{(0)}_{n,i}(\beta)} \biggr], $$

where \(\varphi_{n,i}(\beta)\) is a nonnegative weight function and

$$ W^{(k)}_{n,i}(\beta)=\frac{1}{n}\sum _{j=1}^n X_j^k I \bigl[e_j(\beta) \geq e_i(\beta)\bigr], \quad k = 0,1. $$
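
As a minimal sketch, \(U_n(\beta)\) can be evaluated with NumPy as below. The residual definition e_i(β) = log T_i − X_iᵀβ is an assumption here, since it is not restated in this appendix, and the function and argument names are hypothetical. When no weights are supplied, the sketch uses \(\varphi_{n,i}(\beta) = W^{(0)}_{n,i}(\beta)\), the Gehan-type choice.

```python
import numpy as np


def rank_estimating_function(beta, log_time, delta, X, phi=None):
    """Evaluate U_n(beta) = sum_i Delta_i * phi_i * (X_i - W1_i / W0_i).

    Assumes e_i(beta) = log_time_i - X_i' beta (illustrative convention).
    phi: optional array of nonnegative weights; defaults to W0 (Gehan-type).
    """
    X = np.asarray(X, dtype=float)
    delta = np.asarray(delta, dtype=float)
    n = X.shape[0]
    e = np.asarray(log_time, dtype=float) - X @ np.asarray(beta, dtype=float)

    # at_risk[i, j] = I[e_j(beta) >= e_i(beta)]
    at_risk = (e[None, :] >= e[:, None]).astype(float)
    W0 = at_risk.sum(axis=1) / n  # W^(0)_{n,i}; at least 1/n because j = i counts
    W1 = at_risk @ X / n          # W^(1)_{n,i}, one row per subject i

    if phi is None:
        phi = W0
    phi = np.asarray(phi, dtype=float)
    return ((delta * phi)[:, None] * (X - W1 / W0[:, None])).sum(axis=0)
```

The pairwise comparison matrix costs O(n²) memory, which is acceptable for a sketch but would need a sorted, cumulative-sum formulation for large cohorts.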

Equation (1) can be obtained by setting \(\varphi_{n,i}(\beta) = W^{(0)}_{n,i}(\beta)\). On the other hand, the general rank-based weighted estimating function for case-cohort samples has the following form:

$$ U_n^c(\beta) = \sum_{i=1}^n \Delta_i\varphi_{n,i}(\beta) \biggl[X_i- \frac{\hat{W}^{(1)}_{n, i}(\beta)}{\hat{W}^{(0)}_{n, i}(\beta)} \biggr], $$

where

$$ \hat{W}^{(k)}_{n, i}(\beta)=\frac{1}{n} \sum _{j=1}^n h_j X_j^k I\bigl[e_j(\beta)\geq e_i(\beta)\bigr], \quad k = 0,1. $$

Similarly, Eq. (2) can be obtained by setting \(\varphi_{n,i}(\beta) = \hat{W}^{(0)}_{n,i}(\beta)\).
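
The case-cohort version with \(\varphi_{n,i}(\beta) = \hat{W}^{(0)}_{n,i}(\beta)\) can be sketched the same way. The weights \(h_j\) are not defined in this appendix, so the helper below assumes the usual inverse-probability form: cases get weight 1, sub-cohort controls get 1/p for a sub-cohort sampling probability p, and everyone else gets 0. That form, the residual convention e_i(β) = log T_i − X_iᵀβ, and all names are illustrative assumptions.

```python
import numpy as np


def case_cohort_weights(delta, in_subcohort, p):
    """Assumed weight h_i = Delta_i + (1 - Delta_i) * xi_i / p."""
    delta = np.asarray(delta, dtype=float)
    xi = np.asarray(in_subcohort, dtype=float)
    return delta + (1.0 - delta) * xi / p


def case_cohort_gehan(beta, log_time, delta, X, h):
    """Evaluate U_n^c(beta) with phi = W0_hat (Gehan-type weight).

    With this weight, phi_i * (X_i - W1_hat_i / W0_hat_i) simplifies to
    W0_hat_i * X_i - W1_hat_i, so no division is needed.
    """
    X = np.asarray(X, dtype=float)
    delta = np.asarray(delta, dtype=float)
    h = np.asarray(h, dtype=float)
    n = X.shape[0]
    e = np.asarray(log_time, dtype=float) - X @ np.asarray(beta, dtype=float)

    at_risk = (e[None, :] >= e[:, None]).astype(float)  # I[e_j >= e_i]
    W0 = at_risk @ h / n                 # W0_hat_{n,i}
    W1 = at_risk @ (h[:, None] * X) / n  # W1_hat_{n,i}
    return (delta[:, None] * (W0[:, None] * X - W1)).sum(axis=0)
```

Rows with h_j = 0 and Δ_j = 0 drop out of every sum, so covariates that were never measured can be stored as any finite placeholder value (not NaN) without affecting the result.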

With these settings, an explicit form of \(S_i(\beta_0)\) is

[display equation missing from the source]

where \(N_i(\beta; t) = \Delta_i I(e_i(\beta) \le t)\) and \(\lambda(u)\) is the common hazard function of the \(\epsilon_i\).

The unknown quantities in \(S_i(\beta_0)\) include \(\beta_0\), \(w^{(0)}\), \(w^{(1)}\) and \(\lambda(t)\). With the explicit form of \(S_i(\beta_0)\), \(\hat{S}_{i}(\hat{\beta})\) is obtained by replacing these unknown quantities by their sample estimators.

Cite this article

Chiou, S.H., Kang, S. & Yan, J. Fast accelerated failure time modeling for case-cohort data. Stat Comput 24, 559–568 (2014). https://doi.org/10.1007/s11222-013-9388-2

  • DOI: https://doi.org/10.1007/s11222-013-9388-2
