Accelerated failure time model for data from outcome-dependent sampling

Lifetime Data Analysis

Abstract

Outcome-dependent sampling (ODS) designs such as the case–control or case–cohort design are widely used in epidemiological studies for their cost-effectiveness. In this article, we propose and develop a smoothed weighted Gehan estimating equation approach for inference in an accelerated failure time model under a general failure time outcome-dependent sampling scheme. The proposed estimating equation is continuously differentiable and can be solved by standard numerical methods. In addition to developing the asymptotic properties of the proposed estimator, we propose and investigate a new optimal power-based subsample allocation criterion for the proposed design, obtained by maximizing the power function of a significance test. Simulation results show that the proposed estimator is more efficient than existing competing estimators and that the optimal power-based subsample allocation provides an ODS design that yields improved power for the test of the exposure effect. We illustrate the proposed method with a data set from the Norwegian Mother and Child Cohort Study to evaluate the relationship between exposure to perfluoroalkyl substances and women’s subfecundity.


References

  • Andersen PK, Gill RD (1982) Cox’s regression model for counting processes: a large sample study. Ann Stat 10:1100–1120

  • Breslow NE, Cain KC (1988) Logistic regression for two-stage case-control data. Biometrika 75:11–20

  • Breslow NE, Holubkov R (1997) Maximum likelihood estimation of logistic regression parameters under two-phase, outcome-dependent sampling. J R Stat Soc B 59:447–461

  • Brown BM, Wang YG (2007) Induced smoothing for rank regression with censored survival times. Stat Med 26:828–836

  • Cai J, Zeng D (2007) Power calculation for case-cohort studies with nonrare events. Biometrics 63:1288–1295

  • Chiou S, Kang S, Yan J (2014) Fast accelerated failure time modeling for case-cohort data. Stat Comput 24:559–568

  • Chen K (2001) Generalized case-cohort sampling. J R Stat Soc B 63:791–809

  • Ding J, Zhou H, Liu Y, Cai J, Longnecker M (2014) Estimating effect of environmental contaminants on women’s subfecundity for the MoBa study data with an outcome-dependent sampling scheme. Biostatistics 15:636–650

  • Fleming TR, Harrington DP (1991) Counting processes and survival analysis. Wiley, New York

  • Fygenson M, Ritov Y (1994) Monotone estimating equations for censored data. Ann Stat 22:732–746

  • Hájek J (1960) Limiting distributions in simple random sampling from a finite population. Publ Math Inst Hung Acad Sci 5:361–374

  • Jin Z, Lin DY, Wei LJ, Ying Z (2003) Rank-based inference for the accelerated failure time model. Biometrika 90:341–353

  • Johnson L, Strawderman R (2009) Induced smoothing for the semiparametric accelerated failure time model: asymptotics and extensions to clustered data. Biometrika 96:577–590

  • Kang S, Cai J (2009) Marginal hazards model for case-cohort studies with multiple disease outcomes. Biometrika 96:887–901

  • Kang S, Cai J, Chambless L (2013) Marginal additive hazards model for case-cohort studies with multiple disease outcomes: an application to the Atherosclerosis Risk in Communities (ARIC) study. Biostatistics 14:28–41

  • Kim S, Cai J, Lu W (2013) More efficient estimators for case-cohort studies. Biometrika 100:695–708

  • Kim J, Sit T, Ying Z (2016) Accelerated failure time model under general biased sampling scheme. Biostatistics 17:576–588

  • Kong L, Cai J (2009) Case-cohort analysis with accelerated failure time model. Biometrics 65:135–142

  • Kulich M, Lin DY (2000) Additive hazards regression with covariate measurement error. J Am Stat Assoc 95:238–248

  • Magnus P, Irgens L, Haug K, Nystad W, Skjærven R, Stoltenberg C, The MoBa Study Group (2006) Cohort profile: the Norwegian Mother and Child Cohort Study (MoBa). Int J Epidemiol 35:1146–1150

  • Nan B, Yu M, Kalbfleisch JD (2006) Censored linear regression for case-cohort studies. Biometrika 93:747–762

  • Novák P (2013) Goodness-of-fit test for accelerated failure time model based on martingale residuals. Kybernetika 49:40–59

  • Prentice RL (1978) Linear rank tests with right-censored data. Biometrika 65:167–179

  • Prentice RL (1986) A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 73:1–11

  • Prentice RL, Pyke R (1979) Logistic disease incidence models and case-control studies. Biometrika 66:403–412

  • Pollard D (1990) Empirical processes: theory and applications. Institute of Mathematical Statistics, Hayward

  • Schildcrout JS, Garbett SP, Heagerty PJ (2013) Outcome vector dependent sampling with longitudinal continuous response data: stratified sampling based on summary statistics. Biometrics 69:405–416

  • Schildcrout JS, Rathouz PJ, Zelnick LR, Garbett SP, Heagerty PJ (2015) Biased sampling designs to improve research efficiency: factors influencing pulmonary function over time in children with asthma. Ann Appl Stat 9:731–753

  • Song R, Zhou H, Kosorok M (2009) A note on semiparametric efficient inference for two-stage outcome-dependent sampling with a continuous outcome. Biometrika 96:221–228

  • Tsiatis AA (1990) Estimating regression parameters using linear rank tests for censored data. Ann Stat 18:354–372

  • Tan Z, Qin G, Zhou H (2016) Estimation of a partially linear additive model for data from an outcome-dependent sampling design with a continuous outcome. Biostatistics 17:663–676

  • van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes. Springer, New York

  • Wang X, Zhou H (2010) Design and inference for cancer biomarker study with an outcome and auxiliary-dependent subsampling. Biometrics 66:502–511

  • Weaver MA, Zhou H (2005) An estimated likelihood method for continuous outcome regression models with outcome-dependent sampling. J Am Stat Assoc 100:459–469

  • Weinberg C, Wacholder S (1993) Prospective analysis of case-control data under general multiplicative-intercept risk models. Biometrika 80:461–465

  • Whitworth KW, Haug LS, Baird DD, Becher G, Hoppin JA, Skjaerven R, Thomsen C, Eggesbo M, Travlos G, Wilson R, Longnecker MP (2012) Perfluorinated compounds and subfecundity in pregnant women. Epidemiology 23:257–263

  • Ying Z (1993) A large sample study of rank estimation for censored regression data. Ann Stat 21:76–99

  • Yu J, Liu Y, Cai J, Sandler DP, Zhou H (2016) Design and inference with an outcome-dependent sampling scheme under the Cox proportional hazards model. J Stat Plan Inference 178:24–36

  • Zeng D, Lin DY (2007) Efficient estimation for the accelerated failure time model. J Am Stat Assoc 102:1387–1396

  • Zhou H, Chen J, Rissanen T, Korrick S, Hu H, Salonen J, Longnecker MP (2007) Outcome-dependent sampling: an efficient sampling and inference procedure for studies with a continuous outcome. Epidemiology 18:461–468

  • Zhou H, Qin G, Longnecker M (2011) A partial linear model in the outcome-dependent sampling setting to evaluate the effect of prenatal PCB exposure on cognitive function in children. Biometrics 67:876–885

  • Zhou H, Weaver M, Qin J, Longnecker M, Wang MC (2002) A semiparametric empirical likelihood method for data from an outcome-dependent sampling scheme with a continuous outcome. Biometrics 58:413–421

  • Zhou H, Xu W, Zeng D, Cai J (2014) Semiparametric inference for data with a continuous outcome from a two-phase probability-dependent sampling scheme. J R Stat Soc B 76:197–215

Acknowledgements

This work was partly supported by the National Science Foundation of China Grants 11501578 and 11701571 (for Yu), and National Institutes of Health Grants P42ES031007 Superfund, P30ES010126, and P01 CA142538 (for Cai and Zhou).

Author information

Corresponding author

Correspondence to Jianwen Cai.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 349 KB)

Appendix

In order to establish the asymptotic properties of the proposed estimator, we need the following two lemmas.

Lemma 1

Under Conditions \((C1){-}(C4)\) and (C6), \(m^{-1/2}{\tilde{U}}_{m,G}(\theta _0)\) is asymptotically normal with mean zero and covariance matrix \(\Sigma _{F}(\theta _0)+\Sigma _{O}(\theta _0)\).

Proof

Using the martingale representation \(M_i(\theta _0;t), i=1,\ldots , m\), we have

$$\begin{aligned} m^{-1/2}{\tilde{U}}_{m,G}(\theta _0)= & {} m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty } {\tilde{\psi }}(\theta _0;t)W_i[X_i-{\tilde{X}}(\theta _0;t)]dN_i(\theta _0;t)\nonumber \\= & {} m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }{\tilde{\psi }}(\theta _0;t)W_i {[}X_i-{\tilde{X}}(\theta _0;t)]dM_i(\theta _0;t)\nonumber \\&+\,m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }{\tilde{\psi }}(\theta _0;t)W_i {[}X_i-{\tilde{X}}(\theta _0;t)]d\Lambda _i(\theta _0;t). \end{aligned}$$
(7.1)

Obviously, the second term of (7.1) is equal to zero. Therefore, the term \(m^{-1/2}{\tilde{U}}_{m,G}(\theta _0)\) can be written as

$$\begin{aligned} m^{-1/2}{\tilde{U}}_{m,G}(\theta _0)= & {} m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }{\tilde{\psi }}(\theta _0;t)W_i [X_i-{\tilde{X}}(\theta _0;t)]dM_i(\theta _0;t)\nonumber \\= & {} m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t)W_i [X_i-e_X(\theta _0;t)]dM_i(\theta _0;t)\nonumber \\&+\,m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t) [e_X(\theta _0;t)-{\tilde{X}}(\theta _0;t)]d\left[ W_iM_i(\theta _0;t)\right] \nonumber \\&+\, o_p(1). \end{aligned}$$
(7.2)

Next, we show that the second term of (7.2) is asymptotically negligible, that is,

$$\begin{aligned} \int _{-\infty }^{\infty }s^{(0)}(\theta _0;t) {[}e_X(\theta _0;t)-{\tilde{X}}(\theta _0;t)]d\left[ \frac{1}{\sqrt{m}}\sum \nolimits _{i=1}^{m}W_iM_i(\theta _0;t) \right] =o_p(1). \end{aligned}$$
(7.3)

For each \(i\), \(W_iM_i(\theta _0;t)\) is a zero-mean process that can be expressed as a sum of two monotone processes on the interval \([-Con_M,Con_M]\). Under Conditions (C1) and (C2), and because the follow-up time of the study is bounded, the range of integration \((-\infty ,+\infty )\) in (7.1) can be restricted to \([-Con_M,Con_M]\), a compact subset of \({\mathcal {R}}\), where \(Con_M\) is a positive constant; this is similar to Condition A of Tsiatis (1990). Hence, \(m^{-1/2}\sum _{i=1}^{m} W_iM_i(\theta _0;t)\) converges weakly to a tight Gaussian process with continuous sample paths on \([-Con_M,Con_M]\) by Example 2.11.16 of van der Vaart and Wellner (1996). We assume \(X_i \ge 0\); otherwise, we decompose each \(X_i(\cdot )\) into its positive and negative parts. Moreover, \({\tilde{X}}(\theta _0;t)\) is a product of two monotone processes and converges uniformly in probability to \(e_X(\theta _0;t)\) on the compact set \([-Con_M,Con_M]\). By Lemma A.1 of Kulich and Lin (2000), (7.3) holds. Hence,

$$\begin{aligned} m^{-1/2}{\tilde{U}}_{m,G}(\theta _0)=m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t)W_i {[}X_i-e_X(\theta _0;t)]dM_i(\theta _0;t)+o_p(1).\nonumber \\ \end{aligned}$$
(7.4)

The first term on the right-hand side of (7.4) can be written as

$$\begin{aligned}&m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t) [X_i-e_{X}(\theta _0;t)]dM_i(\theta _0;t)\nonumber \\&\quad +m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t)(1-\delta _i)(\xi _i/(\rho _0\rho _V)-1) [X_i-e_{X}(\theta _0;t)]dM_i(\theta _0;t)\nonumber \\&\quad +m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t) \delta _i(1-\zeta _i)(\xi _i/(\rho _0\rho _V)-1) [X_i-e_{X}(\theta _0;t)]dM_i(\theta _0;t)\nonumber \\&\quad +m^{-1/2}\sum \limits _{i=1}^{m}\sum \limits _{k=\{1,3\}}\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t) \delta _i(1-\xi _i)\zeta _{i,k}\left( \frac{\eta _{i,k}\pi _k(1-\rho _0\rho _V)}{\rho _k\rho _V}-1\right) \nonumber \\&\quad \times [X_i-e_{X}(\theta _0;t)]dM_i(\theta _0;t). \end{aligned}$$
(7.5)

To simplify the notation, define \(H_i(\theta _0)=\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t) [X_i-e_{X}(\theta _0;t)]dM_i(\theta _0;t)\). Then (7.5) can be written as follows:

$$\begin{aligned}&m^{-1/2}\sum \limits _{i=1}^{m}H_i(\theta _0)+ m^{-1/2}\sum \limits _{i=1}^{m}(1-\delta _i)(\frac{\xi _i}{\rho _0\rho _V}-1)H_i(\theta _0)\nonumber \\&\quad +m^{-1/2}\sum \limits _{i=1}^{m}\delta _i(1-\zeta _i)(\frac{\xi _i}{\rho _0\rho _V}-1)H_i(\theta _0)\nonumber \\&\quad +m^{-1/2}\sum \limits _{i=1}^{m}\sum \limits _{k=\{1,3\}} \delta _i(1-\xi _i)\zeta _{i,k}(\frac{\eta _{i,k}\pi _k(1-\rho _0\rho _V)}{\rho _k\rho _V}-1)H_i(\theta _0). \end{aligned}$$
(7.6)

The five terms on the right-hand side of (7.6), counting the summands for \(k=1\) and \(k=3\) separately, have mean zero. Because \(E[\xi /(\rho _0\rho _V)]=1\), the covariance matrix between the first term and the second term is

$$\begin{aligned} E\left[ (1-\delta )(\frac{\xi }{\rho _0\rho _V}-1)H(\theta _0)^{\otimes 2}\right]= & {} E\left\{ E\left[ (1-\delta )(\frac{\xi }{\rho _0\rho _V}-1)H(\theta _0)^{\otimes 2} |X,\delta ,T\right] \right\} \\= & {} E\left\{ (1-\delta )H(\theta _0)^{\otimes 2}E\left[ (\frac{\xi }{\rho _0\rho _V}-1)|X,\delta ,T\right] \right\} \\= & {} E\left\{ (1-\delta )H(\theta _0)^{\otimes 2}E\left[ (\frac{\xi }{\rho _0\rho _V}-1)\right] \right\} \\= & {} 0. \end{aligned}$$

By similar arguments, the five terms on the right-hand side of (7.6) can be shown to be mutually uncorrelated. Moreover, each term is a sum of independent and identically distributed zero-mean random vectors. Using a slight extension of Hájek’s (1960) central limit theorem, \(m^{-1/2}{\tilde{U}}_{m,G}(\theta _{0})\) can be shown to converge in distribution to a zero-mean normal vector with covariance matrix \(\Sigma _{F}(\theta _0)+\Sigma _{O}(\theta _0)\), where \(\Sigma _F(\theta _0)=E[H_1(\theta _0)^{\otimes 2}]\), \(\Sigma _O(\theta _0)=\frac{1-\rho _0\rho _V}{\rho _0\rho _V}E[(1-\delta _1)H_1(\theta _0)^{\otimes 2}]+ \frac{1-\rho _0\rho _V}{\rho _0\rho _V}E[\delta _1(1-\zeta _1)H_1(\theta _0)^{\otimes 2}]+ \sum \limits _{k=\{1,3\}}\frac{(1-\rho _0\rho _V)(\pi _k(1-\rho _0\rho _V)-\rho _k\rho _V)}{\rho _k\rho _V} E[\delta _1\zeta _{1,k}H_1(\theta _0)^{\otimes 2}]\), with \(a^{\otimes 2}=aa^{'}\) for a vector \(a\). Therefore, Lemma 1 holds. \(\square \)
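The factor \(\frac{1-\rho _0\rho _V}{\rho _0\rho _V}\) in the first two terms of \(\Sigma _O(\theta _0)\) can be traced to the second moment of the centred inverse-probability weight. As a short worked sketch, assume (this is an assumption made here for illustration, not a statement from the proof) that \(\xi \) is a 0/1 selection indicator with \(P(\xi =1)=\rho _0\rho _V\) that is independent of \((X,\delta ,T)\), and write \(p=\rho _0\rho _V\) as shorthand. Since \(\xi ^2=\xi \),

$$\begin{aligned} E\left[ \left( \frac{\xi }{p}-1\right) ^2\right] = \frac{E[\xi ]}{p^2}-\frac{2E[\xi ]}{p}+1 = \frac{1}{p}-2+1 = \frac{1-p}{p}. \end{aligned}$$

Under the same assumption, and using \((1-\delta )^2=1-\delta \), conditioning on \((X,\delta ,T)\) gives

$$\begin{aligned} E\left[ (1-\delta )\left( \frac{\xi }{p}-1\right) ^2H(\theta _0)^{\otimes 2}\right] = \frac{1-p}{p}E\left[ (1-\delta )H(\theta _0)^{\otimes 2}\right] , \end{aligned}$$

which has the same form as the first term of \(\Sigma _O(\theta _0)\).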

Lemma 2

Under Conditions \((C1){-}(C4)\), the weighted Gehan estimating function and the smoothed weighted Gehan estimating function are asymptotically equivalent:

$$\begin{aligned} m^{-1/2}{\tilde{U}}_{m,G}(\theta _0)=m^{-1/2}{\bar{U}}_{m,G}(\theta _0)+o_p(1). \end{aligned}$$

Proof

By the induced smoothing method, we have

$$\begin{aligned} m^{-1/2}\left( {\tilde{U}}_{m,G}(\theta _0)-{\bar{U}}_{m,G}(\theta _0)\right)= & {} \frac{1}{m^{3/2}}\sum \limits _{i=1}^{m}\sum \limits _{j=1}^{m}\delta _iW_iW_j(X_i-X_j)\\&\times \left[ I(e_j(\theta _0)-e_i(\theta _0)\ge 0)-\Phi (\frac{e_j(\theta _0) -e_i(\theta _0)}{r_{ij}})\right] . \end{aligned}$$

By the inequality \(I(e_j(\theta _0)-e_i(\theta _0)\ge 0)-\Phi (\frac{e_j(\theta _0) -e_i(\theta _0)}{r_{ij}})\le \Phi (-|\frac{e_j(\theta _0) -e_i(\theta _0)}{r_{ij}}|)\), we obtain

$$\begin{aligned} \Vert m^{-1/2}({\tilde{U}}_{m,G}(\theta _0)-{\bar{U}}_{m,G}(\theta _0))\Vert\le & {} \left\| \frac{1}{m^2}\sum \limits _{i=1}^{m}\sum \limits _{j=1}^{m}\frac{\delta _iW_iW_j (X_{i}-X_{j})\sqrt{m}r_{ij}}{|e_j(\theta _0) -e_i(\theta _0)|}\right\| \\&\times \left| \frac{e_j(\theta _0) -e_i(\theta _0)}{r_{ij}}\right| \Phi \left( -\left| \frac{e_j(\theta _0) -e_i(\theta _0)}{r_{ij}}\right| \right) . \end{aligned}$$

Because \(\Phi (-x)\le (\sqrt{2\pi }x)^{-1}\exp \{-x^2/2\}\) for \(x>0\), we have \(\lim \nolimits _{x\rightarrow +\infty }x\Phi (-x)=0\). Since \(r_{ij}=\sqrt{\frac{(X_j-X_i)^{'}(X_j-X_i)}{m}}\), the term \(|\frac{e_j(\theta _0) -e_i(\theta _0)}{r_{ij}}|\Phi (-|\frac{e_j(\theta _0) -e_i(\theta _0)}{r_{ij}}|)=|\frac{\sqrt{m}[e_j(\theta _0) -e_i(\theta _0)]}{\sqrt{(X_j-X_i)^{'}(X_j-X_i)}}| \Phi (-|\frac{\sqrt{m}[e_j(\theta _0) -e_i(\theta _0)]}{\sqrt{(X_j-X_i)^{'}(X_j-X_i)}}|)\) goes to zero as \(m\) goes to infinity. Therefore, Lemma 2 holds by applying the strong law of large numbers (Pollard 1990). \(\square \)
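The two analytic facts used in this proof can be checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates the tail bound on \(\Phi (-x)\), the gap between the indicator and its smoothed version, and the decay of \(|u/r_{ij}|\Phi (-|u/r_{ij}|)\) as \(m\) grows. The function names and the values of `u` (a residual difference) and `d` (the norm of \(X_j-X_i\)) are hypothetical choices made only for illustration.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF, Phi(x), via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def norm_pdf(x: float) -> float:
    """Standard normal density, phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def smoothing_gap(u: float, r: float) -> float:
    """|I(u >= 0) - Phi(u / r)|: indicator versus its smoothed version."""
    indicator = 1.0 if u >= 0 else 0.0
    return abs(indicator - norm_cdf(u / r))


def tail_term(u: float, d: float, m: int) -> float:
    """|u / r| * Phi(-|u / r|) with bandwidth r = d / sqrt(m)."""
    x = abs(u) * math.sqrt(m) / d
    return x * norm_cdf(-x)


if __name__ == "__main__":
    u, d = 0.3, 1.0  # hypothetical residual difference and covariate distance
    for m in (10, 100, 1_000, 10_000):
        # The printed values shrink towards zero as m increases.
        print(m, tail_term(u, d, m))
```

For a fixed nonzero residual difference the argument of \(\Phi \) grows like \(\sqrt{m}\), which is why the product vanishes in the limit.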

Proof of Theorem 1:

Because \({\tilde{U}}_{m,G}(\theta )\) is the gradient of the convex objective function

$$\begin{aligned} L_m(\theta )=m^{-1}\sum \limits _{i=1}^m \sum \limits _{j=1}^m\delta _iW_iW_j(e_j(\theta )-e_i(\theta ))I(e_j(\theta )-e_i(\theta )\ge 0), \end{aligned}$$
(7.7)

a parameter estimator can be obtained by minimizing \(L_m(\theta )\) with respect to \(\theta \), and the resulting set of minimizers is convex. However, the lack of smoothness of \(L_m(\theta )\) presents computational challenges. Using standard results for normal random variables and integration by parts, we obtain the smoothed objective function

$$\begin{aligned} {\bar{L}}_m(\theta )= & {} m^{-1}\sum \limits _{i=1}^m \sum \limits _{j=1}^m\delta _iW_iW_j\left[ (e_j(\theta )-e_i(\theta ))\Phi \left( \frac{e_j (\theta )-e_i(\theta )}{r_{ij}}\right) \right. \nonumber \\&+r_{ij}\phi \left( \frac{e_j (\theta )-e_i(\theta )}{r_{ij}}\right) \left. \right] , \end{aligned}$$
(7.8)

where \(\phi (\cdot )\) is the standard normal density function. A straightforward calculation shows that \({\bar{U}}_{m,G}(\theta )=\partial {\bar{L}}_m(\theta )/\partial \theta \). The smoothed objective function \({\bar{L}}_m(\theta )\) is convex and continuously differentiable. Hence, standard numerical methods can be used to obtain \({\widehat{\theta }}_m=\arg \min \nolimits _{\theta \in {\mathcal {B}}} {\bar{L}}_m(\theta )\). By Lemmas 1 and 2 of Johnson and Strawderman (2009), the respective minimizers \({\tilde{\theta }}_m\) and \({\widehat{\theta }}_m\) of \(L_m(\theta )\) and \({\bar{L}}_m(\theta )\) converge almost surely to \(\theta _0\) (Andersen and Gill 1982, Corollary II.2). By a Taylor expansion of \({\bar{U}}_{m,G}(\theta )\) around \(\theta _0\), we have

$$\begin{aligned} {\bar{U}}_{m,G}(\theta )- {\bar{U}}_{m,G}(\theta _0)=\frac{\partial {{\bar{U}}_{m,G}}(\theta )}{\partial \theta }|_{\theta ^*} ( \theta -\theta _0), \end{aligned}$$

where \(\theta ^*\) is between \(\theta \) and \(\theta _0\). Inserting \({\widehat{\theta }}_{m}\) in the above equation, we can obtain

$$\begin{aligned} m^{-1/2}{{\bar{U}}_{m,G}}(\theta _0)= \left\{ -m^{-1}\frac{\partial {\bar{U}}_{m,G}(\theta ^*)}{\partial \theta } \right\} \sqrt{m} ( {\widehat{\theta }}_{m}-\theta _0) \end{aligned}$$

with \(\theta ^*\) being between \({\widehat{\theta }}_{m}\) and \(\theta _0\). The asymptotic normality of \({\widehat{\theta }}_{m}\) can be established based on Lemmas 1 and 2, Condition (C5), and the consistency of \({\widehat{\theta }}_{m}\). Hence, Theorem 1 holds. \(\square \)
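The smoothed objective (7.8) and its gradient can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the authors' implementation: the residual is taken to be \(e_i(\theta )=y_i-\theta ^{'}X_i\), with `y` standing for the observed log failure or censoring time (this form is not defined in this appendix), `W` is any vector of nonnegative weights, and all function names are hypothetical. Pairs with \(r_{ij}=0\) fall back to the unsmoothed term.

```python
import math

import numpy as np

_erf = np.vectorize(math.erf)


def norm_cdf(z):
    """Standard normal CDF applied elementwise."""
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density applied elementwise."""
    return np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)


def _pairwise(theta, y, X, delta, W):
    m = len(y)
    e = y - X @ theta                      # assumed residual e_i(theta)
    u = e[None, :] - e[:, None]            # u[i, j] = e_j - e_i
    dX = X[:, None, :] - X[None, :, :]     # dX[i, j] = X_i - X_j
    r = np.sqrt((dX**2).sum(axis=2) / m)   # r_ij as defined in Lemma 2
    w = (delta * W)[:, None] * W[None, :]  # delta_i * W_i * W_j
    return m, u, dX, r, w


def smoothed_objective(theta, y, X, delta, W):
    """Smoothed objective in the form of (7.8)."""
    m, u, dX, r, w = _pairwise(theta, y, X, delta, W)
    safe_r = np.where(r > 0, r, 1.0)       # avoid division by zero
    z = u / safe_r
    smooth = u * norm_cdf(z) + r * norm_pdf(z)
    terms = np.where(r > 0, smooth, np.maximum(u, 0.0))
    return (w * terms).sum() / m


def smoothed_score(theta, y, X, delta, W):
    """Gradient of smoothed_objective with respect to theta."""
    m, u, dX, r, w = _pairwise(theta, y, X, delta, W)
    safe_r = np.where(r > 0, r, 1.0)
    cdf = np.where(r > 0, norm_cdf(u / safe_r), (u >= 0).astype(float))
    return ((w * cdf)[:, :, None] * dX).sum(axis=(0, 1)) / m
```

Because the objective is convex and continuously differentiable, a general-purpose gradient-based optimizer (for example `scipy.optimize.minimize` with `jac=smoothed_score`) could be used to minimize it; that choice is a suggestion, not something specified in the proof.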


Cite this article

Yu, J., Zhou, H. & Cai, J. Accelerated failure time model for data from outcome-dependent sampling. Lifetime Data Anal 27, 15–37 (2021). https://doi.org/10.1007/s10985-020-09508-y
