Marginal semiparametric multivariate accelerated failure time model with generalized estimating equations


The semiparametric accelerated failure time (AFT) model is not as widely used as the Cox relative risk model due to computational difficulties. Recent developments in least squares estimation and induced smoothing estimating equations for censored data provide promising tools to make the AFT models more attractive in practice. For multivariate AFT models, we propose a generalized estimating equations (GEE) approach, extending the GEE to censored data. The consistency of the regression coefficient estimator is robust to misspecification of the working covariance, and the efficiency is higher when the working covariance structure is closer to the truth. The marginal error distributions and regression coefficients are allowed to be unique for each margin or partially shared across margins as needed. The initial estimator is a rank-based estimator with Gehan’s weight, obtained from an induced smoothing approach for computational ease. The resulting estimator is consistent and asymptotically normal, with variance estimated through a multiplier resampling method. In a large-scale simulation study, our estimator was up to three times as efficient as the estimator that ignores the within-cluster dependence, especially when the within-cluster dependence was strong. The methods were applied to the bivariate failure times data from a diabetic retinopathy study.
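As a rough illustration of the induced smoothing idea behind the initial estimator, the sketch below evaluates a smoothed Gehan-type estimating function in the univariate case, following the general form in Brown and Wang (2007). The function names, the simulated data, and the choice of an identity matrix inside the smoothing scale `r` are assumptions made for this example, not the implementation used in the article.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)


def norm_cdf(z):
    """Standard normal CDF, applied elementwise."""
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))


def smoothed_gehan(beta, y, delta, x):
    """Smoothed Gehan estimating function, scaled by n^{-2}.

    The indicator I[e_j >= e_i] of the Gehan estimating function is
    replaced by Phi((e_j - e_i) / r_ij), with
    r_ij^2 = (x_i - x_j)'(x_i - x_j) / n (identity scale: an assumption).
    """
    n = len(y)
    e = y - x @ beta                        # residuals on the log-time scale
    dx = x[:, None, :] - x[None, :, :]      # dx[i, j] = x_i - x_j
    de = e[None, :] - e[:, None]            # de[i, j] = e_j - e_i
    r = np.sqrt((dx ** 2).sum(axis=2) / n)
    # Pairs with r = 0 have dx = 0, so they contribute nothing either way.
    z = np.divide(de, r, out=np.zeros_like(de), where=r > 0)
    w = delta[:, None] * norm_cdf(z)        # only uncensored i contribute
    return (w[:, :, None] * dx).sum(axis=(0, 1)) / n ** 2


# Hypothetical simulated data with light independent censoring.
rng = np.random.default_rng(1)
n, beta0 = 150, np.array([1.0, -0.5])
x = rng.normal(size=(n, 2))
t = x @ beta0 + rng.normal(size=n)          # log failure times
c = rng.normal(loc=2.0, scale=1.0, size=n)  # log censoring times
y, delta = np.minimum(t, c), (t <= c).astype(float)

u_true = smoothed_gehan(beta0, y, delta, x)
u_off = smoothed_gehan(beta0 + 1.0, y, delta, x)
```

The function is smooth in `beta`, so a root can be found with a standard derivative-based solver; here it is only evaluated at the true coefficient and at a shifted one to show that it is close to zero at the former.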


  1. Brown BM, Wang Y-G (2005) Standard errors and covariance matrices for smoothed rank estimators. Biometrika 92(1):149–158
  2. Brown BM, Wang Y-G (2007) Induced smoothing for rank regression with censored survival times. Stat Med 26(4):828–836
  3. Buckley J, James I (1979) Linear regression with censored data. Biometrika 66:429–436
  4. Chiou SH, Kang S, Yan J (2013) Fast accelerated failure time modeling for case-cohort data. Stat Comput. doi:10.1007/s11222-013-9388-2
  5. Cox DR (1972) Regression models and life-tables (with discussion). J R Stat Soc 34:187–220
  6. Diabetic Retinopathy Study Research Group (1976) Preliminary report on effects of photocoagulation therapy. Am J Ophthalmol 81(4):383–396
  7. Gehan EA (1965) A generalized Wilcoxon test for comparing arbitrarily singly-censored samples. Biometrika 52:203–223
  8. Halekoh U, Højsgaard S (2006) The R package geepack for generalized estimating equations. J Stat Softw 15(2):1–11
  9. Hornsteiner U, Hamerle A (1996) A combined GEE/Buckley–James method for estimating an accelerated failure time model of multivariate failure times. Discussion paper 47, Ludwig-Maximilians-Universität München, Collaborative Research Center 386. Accessed 12 Feb 2014
  10. Huang Y (2002) Calibration regression of censored lifetime medical cost. J Am Stat Assoc 97(457):318–327
  11. Huster WJ, Brookmeyer R, Self SG (1989) Modelling paired survival data with covariates. Biometrics 45:145–156
  12. Jin Z, Lin DY, Wei LJ, Ying Z (2003) Rank-based inference for the accelerated failure time model. Biometrika 90(2):341–353
  13. Jin Z, Lin DY, Ying Z (2006a) On least-squares regression with censored data. Biometrika 93(1):147–161
  14. Jin Z, Lin DY, Ying Z (2006b) Rank regression analysis of multivariate failure time data based on marginal linear models. Scand J Stat 33(1):1–23
  15. Johnson LM, Strawderman RL (2009) Induced smoothing for the semiparametric accelerated failure time model: asymptotics and extensions to clustered data. Biometrika 96(3):577–590
  16. Komárek A, Lesaffre E, Hilton JF (2005) Accelerated failure time model for arbitrarily censored data with smoothed error distribution. J Comput Graph Stat 14(3):726–745
  17. Lai TL, Ying Z (1991) Large sample theory of a modified Buckley–James estimator for regression analysis with censored data. Ann Stat 19:1370–1402
  18. Lee EW, Wei LJ, Ying Z (1993) Linear regression analysis for highly stratified failure time data. J Am Stat Assoc 88:557–565
  19. Li H, Yin G (2009) Generalized method of moments estimation for linear regression with clustered failure time data. Biometrika 96(2):293–306
  20. Liang K-Y, Self SG, Chang Y-C (1993) Modelling marginal hazards in multivariate failure time data. J R Stat Soc 55:441–453
  21. Liang K-Y, Zeger SL (1986) Longitudinal data analysis using generalized linear models. Biometrika 73:13–22
  22. Luo X, Huang C-Y (2011) Analysis of recurrent gap time data using the weighted risk set method and the modified within-cluster resampling method. Stat Med 30(4):301–311
  23. Novák P (2013) Goodness-of-fit test for the accelerated failure time model based on martingale residuals. Kybernetika 49:40–59
  24. Prentice RL (1978) Linear rank tests with right censored data (Corr: V70 p304). Biometrika 65:167–180
  25. Qu A, Lindsay BG, Li B (2000) Improving generalised estimating equations using quadratic inference functions. Biometrika 87(4):823–836
  26. Ritov Y (1990) Estimation in a linear regression model with censored data. Ann Stat 18:303–328
  27. Robins JM, Rotnitzky A (1992) Recovery of information and adjustment for dependent censoring using surrogate markers. In: Jewell N, Dietz K, Farewell V (eds) AIDS epidemiology—methodological issues. Birkhäuser, Boston, pp 297–331
  28. Spiekerman CF, Lin DY (1996) Checking the marginal Cox model for correlated failure time data. Biometrika 83:143–156
  29. Strawderman RL (2005) The accelerated gap times model. Biometrika 92(3):647–666
  30. Stute W (1993) Consistent estimation under random censorship when covariables are present. J Multivar Anal 45:89–103
  31. Stute W (1996) Distributional convergence under random censorship when covariables are present. Scand J Stat 23(4):461–471
  32. Tsiatis AA (1990) Estimating regression parameters using linear rank tests for censored data. Ann Stat 18:354–372
  33. Wang M-C, Chang S-H (1999) Nonparametric estimation of a recurrent survival function. J Am Stat Assoc 94:146–153
  34. Wang Y-G, Fu L (2011) Rank regression for accelerated failure time model with clustered and censored data. Comput Stat Data Anal 55(7):2334–2343
  35. Yan J, Fine J (2004) Estimating equations for association structures (Pkg: P859–880). Stat Med 23(6):859–874
  36. Ying Z (1993) A large sample study of rank estimation for censored regression data. Ann Stat 21:76–99
  37. Yu L (2011) Nonparametric quasi-likelihood for right censored data. Lifetime Data Anal 17:594–607
  38. Yu L, Peace KE (2012) Spline nonparametric quasi-likelihood regression within the frame of the accelerated failure time model. Comput Stat Data Anal 56:2675–2687
  39. Zeng D, Lin D (2007) Efficient estimation for the accelerated failure time model. J Am Stat Assoc 102(480):1387–1396
  40. Zhou M (1992) \(M\)-estimation in censored linear models. Biometrika 79:837–841

Author information



Corresponding author

Correspondence to Jun Yan.



Sketches of the Proofs

We impose the following regularity conditions:

  1. A1:

    \(\Vert X_i\Vert \le B\) for all \(i = 1, \ldots , n\) and some nonrandom constant \(B\), where \(\Vert \cdot \Vert \) is a matrix norm.

  2. A2:

    The density function of \(F_{k, \beta }\) exists such that \(\int _{-\infty }^\infty t^2{\mathrm {d}}F_{k, \beta }(t) < \infty \), for \(k=1, \ldots , K\).

  3. A3:

    The distribution function \(F_{k, \beta }\) is twice differentiable with density \(f_{k, \beta }\) such that

    $$\begin{aligned} \int \limits _{-\infty }^\infty \left( \frac{f_{k, \beta }^\prime (t)}{f_{k, \beta }(t)}\right) ^2 {\mathrm {d}}F_{k, \beta }(t) < \infty \end{aligned}$$

    where \(1 \le k \le K\), and both \(f_{k, \beta }(t)\) and \(f^\prime _{k, \beta }(t)\) are bounded functions.

  4. A4:

    \(E[\exp (\theta \epsilon _{ik}^-)]+ \sup _{k\in \{1, \ldots , K\}} E[\exp (\theta C_{ik}^- )] < \infty \) for some \(\theta > 0\), where \(a^-=|a|I_{\{a\le 0\}}\).

  5. A5:

    \(\sup _{| b | < \infty ; -\infty < t < \infty }\sum _{i=1}^n\sum _{k=1}^K \Pr (t \le C_{ik} - X_{ik}^\top b \le t +h) = O(nh)\) as \(h \rightarrow 0\) and \(nh \rightarrow \infty \).

  6. A6:

    As \(n\rightarrow \infty \), \(\hat{\alpha }_n\) is bounded and is \(n^{1/2}\)-consistent for \(\alpha _0\) given \(\beta \).

  7. A7:

    As \(n\rightarrow \infty \), the initial estimator \(b_n\) is \(n^{1/2}\)-consistent for \(\beta _0\) and \(\sqrt{n}( b_n - \beta _0)\) is asymptotically normal with zero mean.

  8. A8:

    The slope matrices \(n^{-1} \partial U_n / \partial \beta \) and \(n^{-1} \partial U_n / \partial b\) evaluated at \((\beta _0, \beta _0, \alpha _0)\) converge to nondegenerate, finite limits \(A\) and \(B\), respectively.

  9. A9:

    The derivative \(\partial \Omega _i^{-1}(\alpha ) / \partial \alpha \) is finite for all \(i = 1, 2, \ldots n\).

Conditions A1–A5 are standard and ensure the existence of a solution to Eq. (2) (Lai and Ying 1991). It is natural to assume that the working covariance matrix \(\Omega \) in Eq. (4) is symmetric and positive definite. Then there exists a \(K\times K\) nonsingular matrix \(\Gamma \) such that \(\Omega (\alpha _0) = \Gamma ^{1/2} \Gamma ^{1/2}\). Let \(\mathbb X_i = \Gamma ^{-1/2} X_i\), \(\mathbb T_i = \Gamma ^{-1/2} Y_i\), \(\mathbb C_i = \Gamma ^{-1/2} C_i\), and \(\omega _i = \Gamma ^{-1/2} \epsilon _i\). Then Eq. (4) evaluated at \(\alpha = \alpha _0\) can be viewed as Eq. (2) with the transformed data \(\mathbb X_i\) and \(\mathbb Y_i = \min (\mathbb T_i, \mathbb C_i)\), with error \(\omega _i\), \(i = 1, \ldots , n\). The existence of a solution to Eq. (4) can be verified by the same arguments as in Lai and Ying (1991), with assumptions similar to A1–A5 imposed on the transformed data. The consistency and asymptotic normality of the estimator given \(\alpha = \alpha _0\) follow from the same arguments as in Jin et al. (2006a).
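The transformation by \(\Gamma ^{-1/2}\) can be illustrated numerically. The sketch below builds a symmetric inverse square root of a positive definite working covariance through its eigendecomposition and applies it to simulated error vectors. The exchangeable structure, the value of `rho`, and all function and variable names are assumptions chosen for this example only.

```python
import numpy as np


def sym_inv_sqrt(omega):
    """Symmetric inverse square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(omega)
    # Divide each eigenvector (column) by the square root of its eigenvalue.
    return (vecs / np.sqrt(vals)) @ vecs.T


K, rho = 3, 0.5
# Exchangeable working covariance: an assumed example structure.
omega = (1.0 - rho) * np.eye(K) + rho * np.ones((K, K))
g_inv_half = sym_inv_sqrt(omega)

rng = np.random.default_rng(0)
# Rows are cluster-level error vectors with covariance omega.
eps = rng.multivariate_normal(np.zeros(K), omega, size=20000)
# Rows are the transformed errors; g_inv_half is symmetric, so right
# multiplication applies it to each row vector.
w = eps @ g_inv_half
```

After the transformation the sample covariance of the rows of `w` is close to the identity, which is why the weighted equation can be read as an unweighted one on transformed data.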

The extra complexity here comes from the fact that Eq. (4) is solved at \(\alpha = \hat{\alpha }_n\), an estimator of \(\alpha _0\). Under condition A9, the \(i\)th term in the summation of \(\partial U_n / \partial \alpha \) evaluated at \((\beta _0, \beta _0, \alpha _0)\) is a linear function of \(\hat{Y}_i(\beta _0)-X_i^\top \beta _0\), \(i = 1, \ldots , n\), with expectation zero. By the law of large numbers, \(n^{-1}\partial U_n/\partial \alpha \) evaluated at \((\beta _0, \beta _0, \alpha _0)\) converges to zero in probability.

Proof of Theorem 1

At the solution \(\hat{\beta }_n^{(1)}\) given \(b_n\) and \(\hat{\alpha }_n\), we have \(n^{-1} U_n(\hat{\beta }_n^{(1)}, b_n, \hat{\alpha }_n) = 0\). Taylor expansion at \((\beta _0, \beta _0, \alpha _0)\) gives

$$\begin{aligned} 0&= \frac{1}{n}U_{n}(\beta _0, \beta _0, \alpha _0) + \frac{1}{n} \frac{\partial }{\partial \beta }\left[ U_n(\beta _0, \beta _0, \alpha _0) \right] (\hat{\beta }_n^{(1)}-\beta _0)\nonumber \\&+\, \frac{1}{n}\frac{\partial }{\partial b}\left[ U_n(\beta _0, \beta _0, \alpha _0) \right] (b_n-\beta _0) + \frac{1}{n} \frac{\partial }{\partial \alpha }\left[ U_n(\beta _0, \beta _0, \alpha _0) \right] (\hat{\alpha }_n-\alpha _0)\nonumber \\&+\,\, o_p(n^{-1/2}) \nonumber \\&= \frac{1}{n}U_n(\beta _0, \beta _0, \alpha _0) + A_n (\hat{\beta }_n^{(1)}-\beta _0) +B_n (b_n-\beta _0)+C_n(\hat{\alpha }_n-\alpha _0) + o_p(n^{-1/2}).\nonumber \\ \end{aligned}$$

With regularity conditions A1–A5, the first term converges in probability to zero by the law of large numbers. The convergence of \(\hat{\alpha }_n\) and \(b_n\) in A6 and A7, combined with the limit conditions in A8 and A9, then gives consistency of \(\hat{\beta }_n^{(1)}\) for \(\beta _0\). By induction, \(\hat{\beta }^{(m)}_n\) is consistent for \(\beta _0\) for every \(m\).

Proof of Theorem 2

Under regularity conditions \(\sqrt{n}(\hat{\beta }_n^{(1)}-\beta _0)\) can be expressed as

$$\begin{aligned}&\sqrt{n}(\hat{\beta }_n^{(1)}-\beta _0)\nonumber \\&\quad =\left[ A_n\right] ^{-1}\left[ \frac{1}{\sqrt{n}} U_n(\beta _0, \beta _0, \alpha _0)+B_n\sqrt{n}(b_n-\beta _0)+ C_n\sqrt{n}(\hat{\alpha }_n - \alpha _0)\right] + o_p(1). \end{aligned} \tag{11}$$

With condition A9, \(C_n\) converges to zero in probability, and, hence, with \(\sqrt{n}\) consistency of \(\hat{\alpha }_n\), \(C_n \sqrt{n} (\hat{\alpha }_n - \alpha _0) = o_p(1)\). Equation (11) is then asymptotically equivalent to

$$\begin{aligned} \left[ A_n\right] ^{-1}\left[ \frac{1}{\sqrt{n}} U_n(\beta _0, \beta _0, \alpha _0)+B_n\sqrt{n}(b_n-\beta _0)\right] . \end{aligned}$$

With the assumption in A7 that \(\sqrt{n}(b_n-\beta _0)\) is asymptotically normal, there exist some nonrandom functions \(\eta _i\) with zero mean such that

$$\begin{aligned} \sqrt{n}(b_n - \beta _0) = n^{-1/2}\sum _{i=1}^n\eta _i + o_p\left( \Vert b_n-\beta _0\Vert \right) . \end{aligned}$$

On the other hand, \(U_n(\beta _0, \beta _0, \alpha _0)\) is a sum of independent and identically distributed quantities with zero mean, denoted by \(\phi _i\)’s, \(i = 1, \ldots , n\). Equation (11) reduces to

$$\begin{aligned} \sqrt{n}(\hat{\beta }_n^{(1)}-\beta _0) = \left[ A_n\right] ^{-1}\left[ n^{-1/2}\sum _{i=1}^n \left( \phi _i+B_n\eta _i\right) \right] + o_p\left( \Vert b_n-\beta _0\Vert \right) . \end{aligned}$$

By the multivariate central limit theorem for sums of independent random vectors, the asymptotic distribution of \(\sqrt{n}(\hat{\beta }_n^{(1)}-\beta _0)\) is zero-mean multivariate normal as \(n\rightarrow \infty \). The limit covariance matrix \(\Sigma \) has the form \(A^{-1}\Phi A^{-1}\), where \(\Phi = \lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^n \imath _i \imath _i^{\top }\) with \(\imath _i = \phi _i + B\eta _i\). Induction then implies that \(\hat{\beta }_n^{(m)}\) is asymptotically multivariate normal for every \(m\).
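The sandwich form of the limit covariance can be sketched in a few lines. The function below takes the slope matrices and the per-cluster terms \(\phi _i\) and \(\eta _i\) as given arrays; how these are obtained in practice is not shown, and the function name and the toy inputs are hypothetical.

```python
import numpy as np


def sandwich_cov(A, B, phi, eta):
    """Plug-in version of A^{-1} Phi A^{-1} with Phi = mean of iota_i iota_i'.

    phi and eta are (n, p) arrays whose rows are phi_i and eta_i, and
    iota_i = phi_i + B eta_i. The form A^{-1} Phi A^{-1} follows the text
    as written.
    """
    iota = phi + eta @ B.T               # row i is phi_i + B eta_i
    Phi = iota.T @ iota / iota.shape[0]  # n^{-1} sum of outer products
    A_inv = np.linalg.inv(A)
    return A_inv @ Phi @ A_inv


# Toy inputs chosen so the result can be checked by hand.
A = 2.0 * np.eye(2)
B = np.zeros((2, 2))
phi = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
eta = np.zeros_like(phi)
sigma = sandwich_cov(A, B, phi, eta)
```

In this toy case \(\Phi \) is half the identity matrix and \(A^{-1}\) is also half the identity, so `sigma` equals 0.125 times the identity. The article estimates the variance by multiplier resampling, so this plug-in form is only meant to make the structure of \(\Sigma \) concrete.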

Analytic details of \(W(t, x)\) in model checking

Using arguments similar to those in Jin et al. (2006a) and Novák (2013), \(W(t, x)\) can be shown to have the same asymptotic distribution as \(\hat{W}(t, x)\), where

$$\begin{aligned} \hat{W}(t, x)&= n^{-1/2}\sum _{i=1}^n\sum _{k=1}^K\int \limits _0^t \left( \omega _{ik}(x) - \frac{\sum _{j=1}^n\sum _{l=1}^K \omega _{jl}(x) I[e_{jl}(\hat{\beta }_n^{(m)})\ge u]}{\sum _{j=1}^n\sum _{l=1}^K I[e_{jl}(\hat{\beta }_n^{(m)})\ge u]}\right) {\mathrm {d}}\hat{M}_{ik}(u, \hat{\beta }_n^{(m)}) (Z_i - 1)\\&\quad -n^{1/2}\left( \hat{f}_N(t, x) + \int \limits _0^t\hat{f}_Y(u, x){\mathrm {d}}\hat{\Lambda }(u, \hat{\beta }_n^{(m)})\right) ^\top (\hat{\beta }_n^{(m)}-\hat{\beta }_n^{(m)*})\\&\quad -n^{-1/2}\int \limits _0^t\sum _{i=1}^n\sum _{k=1}^K\omega _{ik}(x)I[e_{ik}(\hat{\beta }_n^{(m)})\ge u]{\mathrm {d}}\left( \hat{\Lambda }(u, \hat{\beta }_n^{(m)}) - \hat{\Lambda }(u, \hat{\beta }_n^{(m)*})\right) , \end{aligned}$$

where \(\hat{f}_N(t, x) = n^{-1}\sum _{i=1}^n\sum _{k=1}^K\Delta _{ik}\omega _{ik}(x) \hat{f}_0(t) X_{ik}\), \(\hat{f}_Y(t, x) = n^{-1}\sum _{i=1}^n \sum _{k=1}^K \omega _{ik}(x) \hat{g}_0(t) X_{ik}\), and \(f_0(t)\) and \(g_0(t)\) are the baseline densities of \(\epsilon _{ik}\) and \(e_{ik}(\beta _0)\), respectively, with kernel estimates \(\hat{f}_0(t)\) and \(\hat{g}_0(t)\) (e.g., Novák 2013), obtained with \(\beta _0\) replaced by \(\hat{\beta }_{n}^{(m)}\). Note that the multipliers \(Z_i\) used to obtain the bootstrap samples \(\hat{\beta }_{n}^{(m)*}\) are used again here.
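The role of the centered multipliers \(Z_i - 1\) can be seen in a small simulation. The sketch below perturbs a fixed set of per-cluster terms with multipliers drawn from the unit exponential distribution, which has mean one and variance one. The exponential choice, the stand-in terms `phi`, and the function name are assumptions for this example; the source text here does not specify the multiplier distribution.

```python
import numpy as np


def multiplier_draws(phi, n_draws, rng):
    """Return n_draws realizations of n^{-1/2} * sum_i phi_i (Z_i - 1).

    phi is an (n, p) array of fixed per-cluster terms; Z_i are i.i.d.
    unit exponential multipliers (an assumed choice with mean 1, variance 1).
    """
    n = phi.shape[0]
    z = rng.exponential(1.0, size=(n_draws, n))
    return (z - 1.0) @ phi / np.sqrt(n)


rng = np.random.default_rng(2)
phi = rng.normal(size=(200, 2))           # hypothetical stand-in for the phi_i
draws = multiplier_draws(phi, 5000, rng)  # each row is one perturbed sum
target = phi.T @ phi / phi.shape[0]       # n^{-1} sum of phi_i phi_i'
```

Conditional on the data, the covariance of the perturbed sums matches `target`, which is the property that lets the same multipliers approximate the sampling variability of both the estimator and the process \(\hat{W}(t, x)\).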

About this article

Cite this article

Chiou, S.H., Kang, S., Kim, J. et al. Marginal semiparametric multivariate accelerated failure time model with generalized estimating equations. Lifetime Data Anal 20, 599–618 (2014).


  • Buckley-James estimator
  • Efficiency
  • Induced smoothing
  • Least squares
  • Multivariate survival