Fast accelerated failure time modeling for case-cohort data

Chiou, Sy Han; Kang, Sangwook; Yan, Jun

doi:10.1007/s11222-013-9388-2

Published: 03 April 2013

Statistics and Computing, volume 24, pages 559–568 (2014)

Sy Han Chiou¹, Sangwook Kang¹ & Jun Yan¹,²,³


Abstract

Semiparametric accelerated failure time (AFT) models directly relate the expected failure times to covariates and are a useful alternative to models that work on the hazard function or the survival function. For case-cohort data, much less development has been done with AFT models. In addition to the missing covariates outside of the sub-cohort in controls, the challenges of AFT model inference with the full cohort are retained. The regression parameter estimator is hard to compute because the most widely used rank-based estimating equations are not smooth. Further, its variance depends on the unspecified error distribution, and most methods rely on a computationally intensive bootstrap to estimate it. We propose fast rank-based inference procedures for AFT models, applying recent methodological advances to the context of case-cohort data. Parameters are estimated with an induced smoothing approach that smooths the estimating functions and facilitates the numerical solution. Variance estimators are obtained through efficient resampling methods for nonsmooth estimating functions that avoid a full-blown bootstrap. Simulation studies suggest that the recommended procedure provides fast and valid inferences among several competing procedures. Application to a tumor study demonstrates the utility of the proposed method in routine data analysis.
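To make the induced smoothing idea concrete, the following is a minimal sketch for the full-cohort Gehan-type estimating function, not the authors' implementation. It assumes the form described by Brown and Wang (2007), in which each indicator I[e_j(β) ≥ e_i(β)] is replaced by a standard normal distribution function Φ((e_j(β) − e_i(β))/r_ij) with r_ij² = (X_i − X_j)ᵀ(X_i − X_j)/n, that is, with an identity working covariance. The residual definition e_i(β) = log T_i − X_iᵀβ and the names `normal_cdf` and `smoothed_gehan` are assumptions made for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def smoothed_gehan(log_times, delta, X, beta):
    """Induced-smoothed Gehan-type estimating function (illustrative sketch).

    log_times : log of observed times, one per subject
    delta     : 1 if the failure was observed, 0 if censored
    X         : list of covariate vectors (lists of floats)
    beta      : candidate regression coefficients
    Assumes e_i(beta) = log_times[i] - X[i]' beta and an identity
    working covariance in the smoothing scale r_ij.
    """
    n = len(log_times)
    p = len(beta)
    e = [y - sum(b * x for b, x in zip(beta, xi))
         for y, xi in zip(log_times, X)]
    U = [0.0] * p
    for i in range(n):
        if not delta[i]:
            continue  # only observed failures contribute
        for j in range(n):
            diff = [X[i][k] - X[j][k] for k in range(p)]
            r2 = sum(d * d for d in diff) / n
            if r2 == 0.0:
                continue  # X_i == X_j, so this pair contributes nothing
            # Smooth replacement for the indicator I[e_j >= e_i]
            s = normal_cdf((e[j] - e[i]) / math.sqrt(r2))
            for k in range(p):
                U[k] += diff[k] * s / n
    return U
```

Because the result is a smooth function of `beta`, a general-purpose nonlinear equation solver can be applied to find its root, which is the computational benefit the abstract refers to.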

References

Barlow, W.E.: Robust variance estimation for the case-cohort design. Biometrics 50, 1064–1072 (1994)
Breslow, N.E., Lumley, T., Ballantyne, C.M., Chambless, L.E., Kulich, M.: Improved Horvitz–Thompson estimation of model parameters from two-phase stratified samples: applications in epidemiology. Stat. Biosci. 1, 32–49 (2009)
Brown, B.M., Wang, Y.-G.: Standard errors and covariance matrices for smoothed rank estimators. Biometrika 92, 149–158 (2005)
Brown, B.M., Wang, Y.-G.: Induced smoothing for rank regression with censored survival times. Stat. Med. 26, 828–836 (2007)
Chen, H.Y.: Fitting semiparametric transformation regression models to data from a modified case-cohort design. Biometrika 88, 255–268 (2001a)
Chen, H.Y.: Weighted semiparametric likelihood method for fitting a proportional odds regression model to data from the case cohort design. J. Am. Stat. Assoc. 96, 1446–1458 (2001b)
Chiou, S., Kang, S., Yan, J.: aftgee: Accelerated failure time model with generalized estimating equations. R package version 0.2-27 (2012)
D’Angio, G.J., Breslow, N., Beckwith, J.B., Evans, A., Baum, E., Delorimier, A., Fernbach, D., Hrabovsky, E., Jones, B., Kelalis, P., Othersen, H.B., Tefft, M., Thomas, P.R.M.: Treatment of Wilms’ tumor. Results of the third national Wilms’ tumor study. Cancer 64, 349–360 (1989)
Green, D., Breslow, N., Beckwith, J., Finklestein, J., Grundy, P., Thomas, P., Kim, T., Shochat, S., Haase, G., Ritchey, M., Kelalis, P., D’Angio, G.: Comparison between single-dose and divided-dose administration of dactinomycin and doxorubicin for patients with Wilms’ tumor: a report from the National Wilms’ Tumor Study Group. J. Clin. Oncol. 16, 237–245 (1998)
Hasselman, B.: nleqslv: Solve systems of non linear equations. R package version 1.9.3 (2012). http://CRAN.R-project.org/package=nleqslv
Huang, Y.: Calibration regression of censored lifetime medical cost. J. Am. Stat. Assoc. 97, 318–327 (2002)
Jin, Z., Lin, D.Y., Wei, L.J., Ying, Z.: Rank-based inference for the accelerated failure time model. Biometrika 90, 341–353 (2003)
Johnson, L.M., Strawderman, R.L.: Induced smoothing for the semiparametric accelerated failure time model: asymptotics and extensions to clustered data. Biometrika 96, 577–590 (2009)
Kalbfleisch, J.D., Lawless, J.F.: Likelihood analysis of multistate models for disease incidence and mortality. Stat. Med. 7, 149–160 (1988)
Kang, S., Cai, J.: Marginal hazards model for case-cohort studies with multiple disease outcomes. Biometrika 96, 887–901 (2009)
Kong, L., Cai, J.: Case-cohort analysis with accelerated failure time model. Biometrics 65, 135–142 (2009)
Kong, L., Cai, J., Sen, P.K.: Weighted estimating equations for semiparametric transformation models with censored data from a case-cohort design. Biometrika 91, 305–319 (2004)
Kulich, M., Lin, D.: Additive hazards regression for case-cohort studies. Biometrika 87, 73–87 (2000)
Kulich, M., Lin, D.: Improving the efficiency of relative-risk estimation in case-cohort studies. J. Am. Stat. Assoc. 99, 832–844 (2004)
Lin, D.Y., Ying, Z.: Cox regression with incomplete covariate measurements. J. Am. Stat. Assoc. 88, 1341–1349 (1993)
Lu, W., Tsiatis, A.A.: Semiparametric transformation models for the case-cohort study. Biometrika 93, 207–214 (2006)
Nan, B., Yu, M., Kalbfleisch, J.D.: Censored linear regression for case-cohort studies. Biometrika 93, 747–762 (2006)
Nan, B., Kalbfleisch, J.D., Yu, M.: Asymptotic theory for the semiparametric accelerated failure time model with missing data. Ann. Stat. 37, 2351–2376 (2009)
Prentice, R.L.: Linear rank tests with right censored data (Corr: V70 p304). Biometrika 65, 167–180 (1978)
Prentice, R.L.: A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 73, 1–11 (1986)
Self, S.G., Prentice, R.L.: Asymptotic distribution theory and efficiency results for case-cohort studies. Ann. Stat. 16, 64–81 (1988)
Sun, J., Sun, L., Flournoy, N.: Additive hazards model for competing risks analysis of the case-cohort design. Commun. Stat., Theory Methods 33, 351–366 (2004)
Therneau, T.M., Li, H.: Computing the Cox model for case cohort designs. Lifetime Data Anal. 5, 99–112 (1999)
Tsiatis, A.A.: Estimating regression parameters using linear rank tests for censored data. Ann. Stat. 18, 354–372 (1990)
Varadhan, R., Gilbert, P.: BB: an R package for solving a large system of nonlinear equations and for optimizing a high-dimensional nonlinear objective function. J. Stat. Softw. 32, 1–26 (2009). http://www.jstatsoft.org/v32/i04/
Wacholder, S., Gail, M.H., Pee, D., Brookmeyer, R.: Alternative variance and efficiency calculations for the case-cohort design. Biometrika 76, 117–123 (1989)
Wang, Y.-G., Fu, L.: Rank regression for the accelerated failure time model with clustered and censored data. Comput. Stat. Data Anal. 55, 2334–2343 (2011)
Ying, Z.: A large sample study of rank estimation for censored regression data. Ann. Stat. 21, 76–99 (1993)
Yu, M.: Buckley-James type estimator in censored data with covariates missing by design. Scand. J. Stat. 38, 252–267 (2011)
Yu, Q., Wong, G.Y.C., Yu, M.: Buckley-James-type of estimators under the classical case cohort design. Ann. Inst. Stat. Math. 59, 675–695 (2007)
Zeng, D., Lin, D.Y.: Efficient resampling methods for nonsmooth estimating functions. Biostatistics 9, 355–363 (2008)

Author information

Authors and Affiliations

Department of Statistics, University of Connecticut, Storrs-Mansfield, CT, USA
Sy Han Chiou, Sangwook Kang & Jun Yan
Institute for Public Health Research, University of Connecticut Health Center, East Hartford, CT, USA
Jun Yan
Center for Environmental Sciences & Engineering, University of Connecticut, Storrs-Mansfield, CT, USA
Jun Yan

Corresponding author

Correspondence to Jun Yan.

Appendix: Analytical details

We give the analytical form of the $S_i(\beta)$'s here. Define the general rank-based weighted estimating function (Jin et al. 2003)

$$ U_n(\beta)=\sum_{i=1}^n \Delta_i \varphi_{n,i}(\beta) \biggl[ X_i- \frac{W^{(1)}_{n,i}(\beta)}{W^{(0)}_{n,i}(\beta)} \biggr], $$

where $\varphi_{n,i}(\beta)$ is a nonnegative weight function and

$$ W^{(k)}_{n,i}(\beta)=\frac{1}{n}\sum _{j=1}^n X_j^k I \bigl[e_j(\beta) \geq e_i(\beta)\bigr], \quad k = 0,1. $$

Equation (1) can be obtained by setting $\varphi_{n,i}(\beta) = W^{(0)}_{n,i}(\beta)$. On the other hand, the general rank-based weighted estimating function for case-cohort samples has the following form:
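As an illustration of the full-cohort definition, the sketch below evaluates $U_n(\beta)$ with the weight $\varphi_{n,i}(\beta) = W^{(0)}_{n,i}(\beta)$, in which case each observed failure contributes $W^{(0)}_{n,i}(\beta) X_i - W^{(1)}_{n,i}(\beta)$. The residual definition $e_i(\beta) = \log T_i - X_i^\top \beta$ is an assumption (it is not stated in this appendix), and the function name `gehan_estimating_function` is hypothetical.

```python
def gehan_estimating_function(log_times, delta, X, beta):
    """Evaluate U_n(beta) with phi_{n,i} = W0_{n,i} (illustrative sketch).

    log_times : log of observed times, one per subject
    delta     : 1 if the failure was observed, 0 if censored
    X         : list of covariate vectors (lists of floats)
    beta      : candidate regression coefficients
    Assumes e_i(beta) = log_times[i] - X[i]' beta.
    """
    n = len(log_times)
    p = len(beta)
    e = [y - sum(b * x for b, x in zip(beta, xi))
         for y, xi in zip(log_times, X)]
    U = [0.0] * p
    for i in range(n):
        if not delta[i]:
            continue  # censored subjects do not contribute a term
        # Subjects whose residual is at least e_i
        risk = [j for j in range(n) if e[j] >= e[i]]
        w0 = len(risk) / n  # W0_{n,i}(beta)
        for k in range(p):
            w1 = sum(X[j][k] for j in risk) / n  # k-th entry of W1_{n,i}(beta)
            # phi * (X_i - W1/W0) with phi = W0 reduces to W0 * X_i - W1
            U[k] += w0 * X[i][k] - w1
    return U
```

The result is a step function of `beta`, since it changes only when the ordering of the residuals changes; this is the nonsmoothness that makes the root hard to compute directly.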

$$ U_n^c(\beta) = \sum_{i=1}^n \Delta_i\varphi_{n,i}(\beta) \biggl[X_i- \frac{\hat{W}^{(1)}_{n, i}(\beta)}{\hat{W}^{(0)}_{n, i}(\beta)} \biggr], $$

where

$$ \hat{W}^{(k)}_{n, i}(\beta)=\frac{1}{n} \sum _{j=1}^n h_j X_j^k I\bigl[e_j(\beta)\geq e_i(\beta)\bigr], \quad k = 0,1. $$

Similarly, Eq. (2) can be obtained by setting $\varphi_{n,i}(\beta) = \hat{W}^{(0)}_{n,i}(\beta)$.
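The case-cohort version differs only in the weights $h_j$ inside $\hat{W}^{(k)}_{n,i}(\beta)$. The sketch below assumes a common inverse-probability form, $h_j = \Delta_j + (1 - \Delta_j)\,\xi_j / p$, where $\xi_j$ indicates sub-cohort membership and $p$ is the sub-cohort sampling proportion; this form is not given in the appendix above and is an assumption here, as are the residual definition $e_j(\beta) = \log T_j - X_j^\top \beta$ and the function names.

```python
def case_cohort_weights(delta, in_subcohort, p_sub):
    """Assumed weights: 1 for cases, 1/p_sub for sub-cohort controls, else 0."""
    return [1.0 if d else (1.0 / p_sub if s else 0.0)
            for d, s in zip(delta, in_subcohort)]


def weighted_gehan(log_times, delta, X, h, beta):
    """Evaluate U_n^c(beta) with phi_{n,i} = What0_{n,i} (illustrative sketch).

    X[j] may be None when h[j] == 0, because covariates are not
    measured for controls outside the sub-cohort.
    """
    n = len(log_times)  # full cohort size, as in the 1/n factor
    p = len(beta)
    observed = [j for j in range(n) if h[j] > 0]
    e = {j: log_times[j] - sum(b * x for b, x in zip(beta, X[j]))
         for j in observed}
    U = [0.0] * p
    for i in observed:
        if not delta[i]:
            continue  # only observed failures contribute a term
        risk = [j for j in observed if e[j] >= e[i]]
        w0 = sum(h[j] for j in risk) / n  # What0_{n,i}(beta)
        for k in range(p):
            w1 = sum(h[j] * X[j][k] for j in risk) / n  # What1, k-th entry
            U[k] += w0 * X[i][k] - w1
    return U
```

A small usage example under the same assumptions: with four subjects, two cases, one sub-cohort control sampled with proportion 0.5, and one control outside the sub-cohort, `case_cohort_weights([1, 0, 1, 0], [False, True, False, False], 0.5)` gives the weights passed as `h`.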

With these settings, an explicit form of $S_i(\beta_0)$ is

where

$N_i(\beta; t) = \Delta_i I(e_i(\beta) \leq t)$ and $\lambda(u)$ is the common hazard function of $\epsilon_i$.

The unknown quantities in $S_i(\beta_0)$ include $\beta_0$, $w^{(0)}$, $w^{(1)}$ and $\lambda(t)$. With the explicit form of $S_i(\beta_0)$, $\hat{S}_{i}(\hat{\beta})$ is obtained by replacing these unknown quantities by their sample estimators.

About this article

Cite this article

Chiou, S.H., Kang, S. & Yan, J. Fast accelerated failure time modeling for case-cohort data. Stat Comput 24, 559–568 (2014). https://doi.org/10.1007/s11222-013-9388-2

Received: 01 June 2012
Accepted: 20 February 2013
Published: 03 April 2013
Issue Date: July 2014
DOI: https://doi.org/10.1007/s11222-013-9388-2
