Accelerated failure time model for data from outcome-dependent sampling

Yu, Jichang; Zhou, Haibo; Cai, Jianwen

doi:10.1007/s10985-020-09508-y

Published: 12 October 2020

Lifetime Data Analysis, Volume 27, pages 15–37 (2021)


Abstract

Outcome-dependent sampling (ODS) designs, such as the case–control and case–cohort designs, are widely used in epidemiological studies because of their cost-effectiveness. In this article, we propose and develop a smoothed weighted Gehan estimating equation approach for inference in an accelerated failure time model under a general failure time outcome-dependent sampling scheme. The proposed estimating equation is continuously differentiable and can be solved by standard numerical methods. In addition to developing the asymptotic properties of the proposed estimator, we propose and investigate a new optimal power-based subsample allocation criterion for the proposed design, obtained by maximizing the power function of a significance test. Simulation results show that the proposed estimator is more efficient than existing competing estimators and that the optimal power-based subsample allocation provides an ODS design that yields improved power for the test of the exposure effect. We illustrate the proposed method with a data set from the Norwegian Mother and Child Cohort Study to evaluate the relationship between exposure to perfluoroalkyl substances and women’s subfecundity.


References

Andersen PK, Gill RD (1982) Cox’s regression model for counting processes: a large sample study. Ann Stat 10:1100–1120
Breslow NE, Cain KC (1988) Logistic regression for two-stage case-control data. Biometrika 75:11–20
Breslow NE, Holubkov R (1997) Maximum likelihood estimation of logistic regression parameters under two-phase, outcome-dependent sampling. J R Stat Soc B 59:447–461
Brown BM, Wang YG (2007) Induced smoothing for rank regression with censored survival times. Stat Med 26:828–836
Cai J, Zeng D (2007) Power calculation for case-cohort studies with nonrare events. Biometrics 63:1288–1295
Chiou S, Kang S, Yan J (2014) Fast accelerated failure time modeling for case-cohort data. Stat Comput 24:559–568
Chen K (2001) Generalized case-cohort sampling. J R Stat Soc B 63:791–809
Ding J, Zhou H, Liu Y, Cai J, Longnecker M (2014) Estimating effect of environmental contaminants on women’s subfecundity for the MoBa study data with an outcome-dependent sampling scheme. Biostatistics 15:636–650
Fleming TR, Harrington DP (1991) Counting processes and survival analysis. Wiley, New York
Fygenson M, Ritov Y (1994) Monotone estimating equations for censored data. Ann Stat 22:732–746
Hájek J (1960) Limiting distributions in simple random sampling from a finite population. Publ Math Inst Hung Acad Sci 5:361–374
Jin Z, Lin DY, Wei LJ, Ying Z (2003) Rank-based inference for the accelerated failure time model. Biometrika 90:341–353
Johnson L, Strawderman R (2009) Induced smoothing for the semiparametric accelerated failure time model: asymptotics and extensions to clustered data. Biometrika 96:577–590
Kang S, Cai J (2009) Marginal hazards model for case-cohort studies with multiple disease outcomes. Biometrika 96:887–901
Kang S, Cai J, Chambless L (2013) Marginal additive hazards model for case-cohort studies with multiple disease outcomes: an application to the Atherosclerosis Risk in Communities (ARIC) study. Biostatistics 14:28–41
Kim S, Cai J, Lu W (2013) More efficient estimators for case-cohort studies. Biometrika 100:695–708
Kim J, Sit T, Ying Z (2016) Accelerated failure time model under general biased sampling scheme. Biostatistics 17:576–588
Kong L, Cai J (2009) Case-cohort analysis with accelerated failure time model. Biometrics 65:135–142
Kulich M, Lin DY (2000) Additive hazards regression with covariate measurement error. J Am Stat Assoc 95:238–248
Magnus P, Irgens L, Haug K, Nystad W, Skjærven R, Stoltenberg C, The MoBa Study Group (2006) Cohort profile: the Norwegian Mother and Child Cohort Study (MoBa). Int J Epidemiol 35:1146–1150
Nan B, Yu M, Kalbfleisch JD (2006) Censored linear regression for case-cohort studies. Biometrika 93:747–762
Novák P (2013) Goodness-of-fit test for accelerated failure time model based on martingale residuals. Kybernetika 49:40–59
Prentice RL (1978) Linear rank tests with right-censored data. Biometrika 65:167–179
Prentice RL (1986) A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 73:1–11
Prentice RL, Pyke R (1979) Logistic disease incidence models and case-control studies. Biometrika 66:403–412
Pollard D (1990) Empirical processes: theory and applications. Institute of Mathematical Statistics, Hayward
Schildcrout JS, Garbett SP, Heagerty PJ (2013) Outcome vector dependent sampling with longitudinal continuous response data: stratified sampling based on summary statistics. Biometrics 69:405–416
Schildcrout JS, Rathouz PJ, Zelnick LR, Garbett SP, Heagerty PJ (2015) Biased sampling designs to improve research efficiency: factors influencing pulmonary function over time in children with asthma. Ann Appl Stat 9:731–753
Song R, Zhou H, Kosorok M (2009) A note on semiparametric efficient inference for two-stage outcome-dependent sampling with a continuous outcome. Biometrika 96:221–228
Tsiatis AA (1990) Estimating regression parameters using linear rank tests for censored data. Ann Stat 18:354–372
Tan Z, Qin G, Zhou H (2016) Estimation of a partially linear additive model for data from an outcome-dependent sampling design with a continuous outcome. Biostatistics 17:663–676
van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes. Springer, New York
Wang X, Zhou H (2010) Design and inference for cancer biomarker study with an outcome and auxiliary-dependent subsampling. Biometrics 66:502–511
Weaver MA, Zhou H (2005) An estimated likelihood method for continuous outcome regression models with outcome-dependent sampling. J Am Stat Assoc 100:459–469
Weinberg C, Wacholder S (1993) Prospective analysis of case-control data under general multiplicative-intercept risk models. Biometrika 80:461–465
Whitworth KW, Haug LS, Baird DD, Becher G, Hoppin JA, Skjaerven R, Thomsen C, Eggesbo M, Travlos G, Wilson R, Longnecker MP (2012) Perfluorinated compounds and subfecundity in pregnant women. Epidemiology 23:257–263
Ying Z (1993) A large sample study of rank estimation for censored regression data. Ann Stat 21:76–99
Yu J, Liu Y, Cai J, Sandler DP, Zhou H (2016) Design and inference with an outcome-dependent sampling scheme under the Cox proportional hazards model. J Stat Plan Inference 178:24–36
Zeng D, Lin DY (2007) Efficient estimation for the accelerated failure time model. J Am Stat Assoc 102:1387–1396
Zhou H, Chen J, Rissanen T, Korrick S, Hu H, Salonen J, Longnecker MP (2007) Outcome-dependent sampling: an efficient sampling and inference procedure for studies with a continuous outcome. Epidemiology 18:461–468
Zhou H, Qin G, Longnecker M (2011) A partial linear model in the outcome-dependent sampling setting to evaluate the effect of prenatal PCB exposure on cognitive function in children. Biometrics 67:876–885
Zhou H, Weaver M, Qin J, Longnecker M, Wang MC (2002) A semiparametric empirical likelihood method for data from an outcome-dependent sampling scheme with a continuous outcome. Biometrics 58:413–421
Zhou H, Xu W, Zeng D, Cai J (2014) Semiparametric inference for data with a continuous outcome from a two-phase probability-dependent sampling scheme. J R Stat Soc B 76:197–215

Acknowledgements

This work is partly supported by the National Science Foundation of China Grants 11501578 and 11701571 (for Yu), and National Institutes of Health Grants P42ES031007 Super fund, P30ES010126, and P01 CA142538 (for Cai and Zhou).

Author information

Authors and Affiliations

School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, 430073, Hubei, China
Jichang Yu
Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
Haibo Zhou & Jianwen Cai

Corresponding author

Correspondence to Jianwen Cai.

Appendix

In order to establish the asymptotic properties of the proposed estimator, we need the following two lemmas.

Lemma 1

Under Conditions (C1)–(C4) and (C6), $m^{-1/2}{\tilde{U}}_{m,G}(\theta _0)$ is asymptotically normal with mean zero and covariance matrix $\Sigma _{F}(\theta _0)+\Sigma _{O}(\theta _0)$.

Proof

Using the martingale representation $M_i(\theta _0;t)$, $i=1,\ldots , m$, we have

$$\begin{aligned} m^{-1/2}{\tilde{U}}_{m,G}(\theta _0)= & {} m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty } {\tilde{\psi }}(\theta _0;t)W_i[X_i-{\tilde{X}}(\theta _0;t)]dN_i(\theta _0;t)\nonumber \\= & {} m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }{\tilde{\psi }}(\theta _0;t)W_i {[}X_i-{\tilde{X}}(\theta _0;t)]dM_i(\theta _0;t)\nonumber \\&+\,m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }{\tilde{\psi }}(\theta _0;t)W_i {[}X_i-{\tilde{X}}(\theta _0;t)]d\Lambda _i(\theta _0;t). \end{aligned}$$

(7.1)

The second term of (7.1) is equal to zero. Therefore, $m^{-1/2}{\tilde{U}}_{m,G}(\theta _0)$ can be written as

$$\begin{aligned} m^{-1/2}{\tilde{U}}_{m,G}(\theta _0)= & {} m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }{\tilde{\psi }}(\theta _0;t)W_i [X_i-{\tilde{X}}(\theta _0;t)]dM_i(\theta _0;t)\nonumber \\= & {} m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t)W_i [X_i-e_X(\theta _0;t)]dM_i(\theta _0;t)\nonumber \\&+\,m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t) [e_X(\theta _0;t)-{\tilde{X}}(\theta _0;t)]d\left[ W_iM_i(\theta _0;t)\right] \nonumber \\&+\, o_p(1). \end{aligned}$$

(7.2)

Next, we show that the second term of (7.2) is asymptotically negligible, that is,

$$\begin{aligned} \int _{-\infty }^{\infty }s^{(0)}(\theta _0;t) {[}e_X(\theta _0;t)-{\tilde{X}}(\theta _0;t)]d\left[ \frac{1}{\sqrt{m}}\sum \nolimits _{i=1}^{m}W_iM_i(\theta _0;t) \right] =o_p(1). \end{aligned}$$

(7.3)

For each $i$, $W_iM_i(\theta _0;t)$ is a zero-mean process that can be expressed as a sum of two monotone processes on the interval $[-Con_M,Con_M]$. Under Conditions (C1) and (C2), and because the follow-up time of the study is bounded, the range of integration $(-\infty ,+\infty )$ in (7.1) can be restricted to $[-Con_M,Con_M]$, a compact subset of ${\mathcal {R}}$, where $Con_M$ is a positive constant; this is similar to Condition A of Tsiatis (1990). Hence, $m^{-1/2}\sum _{i=1}^{m} W_iM_i(\theta _0;t)$ converges weakly to a tight Gaussian process with continuous sample paths on $[-Con_M,Con_M]$ by Example 2.11.16 of van der Vaart and Wellner (1996). Without loss of generality, we assume $X_i \ge 0$; otherwise, we decompose each $X_i(\cdot )$ into its positive and negative parts. Then ${\tilde{X}}(\theta _0;t)$ is a product of two monotone processes and converges uniformly in probability to $e_X(\theta _0;t)$ on the compact set $[-Con_M,Con_M]$. By Lemma A.1 of Kulich and Lin (2000), (7.3) holds. Hence,

$$\begin{aligned} m^{-1/2}{\tilde{U}}_{m,G}(\theta _0)=m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t)W_i {[}X_i-e_X(\theta _0;t)]dM_i(\theta _0;t)+o_p(1).\nonumber \\ \end{aligned}$$

(7.4)

The first term on the right-hand side of (7.4) can be written as

$$\begin{aligned}&m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t) [X_i-e_{X}(\theta _0;t)]dM_i(\theta _0;t)\nonumber \\&\quad +m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t)(1-\delta _i)(\xi _i/(\rho _0\rho _V)-1) [X_i-e_{X}(\theta _0;t)]dM_i(\theta _0;t)\nonumber \\&\quad +m^{-1/2}\sum \limits _{i=1}^{m}\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t) \delta _i(1-\zeta _i)(\xi _i/(\rho _0\rho _V)-1) [X_i-e_{X}(\theta _0;t)]dM_i(\theta _0;t)\nonumber \\&\quad +m^{-1/2}\sum \limits _{i=1}^{m}\sum \limits _{k=\{1,3\}}\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t) \delta _i(1-\xi _i)\zeta _{i,k}\left( \frac{\eta _{i,k}\pi _k(1-\rho _0\rho _V)}{\rho _k\rho _V}-1\right) \nonumber \\&\quad \times [X_i-e_{X}(\theta _0;t)]dM_i(\theta _0;t). \end{aligned}$$

(7.5)

To simplify the expression, we define $H_i(\theta _0)=\int _{-\infty }^{\infty }s^{(0)}(\theta _0;t) [X_i-e_{X}(\theta _0;t)]dM_i(\theta _0;t)$, so that (7.5) can be written as follows:

$$\begin{aligned}&m^{-1/2}\sum \limits _{i=1}^{m}H_i(\theta _0)+ m^{-1/2}\sum \limits _{i=1}^{m}(1-\delta _i)(\frac{\xi _i}{\rho _0\rho _V}-1)H_i(\theta _0)\nonumber \\&\quad +m^{-1/2}\sum \limits _{i=1}^{m}\delta _i(1-\zeta _i)(\frac{\xi _i}{\rho _0\rho _V}-1)H_i(\theta _0)\nonumber \\&\quad +m^{-1/2}\sum \limits _{i=1}^{m}\sum \limits _{k=\{1,3\}} \delta _i(1-\xi _i)\zeta _{i,k}(\frac{\eta _{i,k}\pi _k(1-\rho _0\rho _V)}{\rho _k\rho _V}-1)H_i(\theta _0). \end{aligned}$$

(7.6)

Counting the summands for $k=1$ and $k=3$ separately, the five terms on the right-hand side of (7.6) have mean zero. Because $E[\xi /(\rho _0\rho _V)]=1$, the covariance matrix between the first term and the second term is

$$\begin{aligned} E\left[ (1-\delta )(\frac{\xi }{\rho _0\rho _V}-1)H(\theta _0)^{\otimes 2}\right]= & {} E\left\{ E\left[ (1-\delta )(\frac{\xi }{\rho _0\rho _V}-1)H(\theta _0)^{\otimes 2} |X,\delta ,T\right] \right\} \\= & {} E\left\{ (1-\delta )H(\theta _0)^{\otimes 2}E\left[ (\frac{\xi }{\rho _0\rho _V}-1)|X,\delta ,T\right] \right\} \\= & {} E\left\{ (1-\delta )H(\theta _0)^{\otimes 2}E\left[ (\frac{\xi }{\rho _0\rho _V}-1)\right] \right\} \\= & {} 0. \end{aligned}$$

By similar arguments, it can be shown that the five terms on the right-hand side of (7.6) are mutually uncorrelated. Moreover, each term is a sum of independent and identically distributed zero-mean random vectors. Using a slight extension of Hájek’s (1960) central limit theorem, $m^{-1/2}{\tilde{U}}_{m,G}(\theta _{0})$ can be shown to converge in distribution to a zero-mean normal vector with covariance matrix $\Sigma _{F}(\theta _0)+\Sigma _{O}(\theta _0)$, where $\Sigma _F(\theta _0)=E[H_1(\theta _0)^{\otimes 2}]$, $\Sigma _O(\theta _0)=\frac{1-\rho _0\rho _V}{\rho _0\rho _V}E[(1-\delta _1)H_1(\theta _0)^{\otimes 2}]+ \frac{1-\rho _0\rho _V}{\rho _0\rho _V}E[\delta _1(1-\zeta _1)H_1(\theta _0)^{\otimes 2}]+ \sum \limits _{k=\{1,3\}}\frac{(1-\rho _0\rho _V)(\pi _k(1-\rho _0\rho _V)-\rho _k\rho _V)}{\rho _k\rho _V} E[\delta _1\zeta _{1,k}H_1(\theta _0)^{\otimes 2}]$, and $a^{\otimes 2}=aa^{'}$ for a vector $a$. Therefore, Lemma 1 holds. $\square $
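The factor $\frac{1-\rho _0\rho _V}{\rho _0\rho _V}$ in the first term of $\Sigma _O(\theta _0)$ can be motivated by a short calculation. As a sketch only, assume that $\xi $ is a Bernoulli indicator with success probability $p=\rho _0\rho _V$ that is independent of $(X,\delta ,T)$; this independent-Bernoulli form is an illustrative assumption, and the sampling mechanism in the actual design may differ. Since $\xi ^2=\xi $ and $(1-\delta )^2=1-\delta $,

$$\begin{aligned} E\left[ \left( \frac{\xi }{p}-1\right) ^2\right]&= \frac{E[\xi ]}{p^2}-\frac{2E[\xi ]}{p}+1=\frac{1}{p}-2+1=\frac{1-p}{p},\\ E\left[ (1-\delta )\left( \frac{\xi }{p}-1\right) ^2H(\theta _0)^{\otimes 2}\right]&= \frac{1-p}{p}E\left[ (1-\delta )H(\theta _0)^{\otimes 2}\right] . \end{aligned}$$

With $p=\rho _0\rho _V$, the second line has the same form as the first term of $\Sigma _O(\theta _0)$.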

Lemma 2

Under Conditions $(C1){-}(C4)$, the weighted Gehan estimating function and the smoothed weighted Gehan estimating function are asymptotically equivalent:

$$\begin{aligned} m^{-1/2}{\tilde{U}}_{m,G}(\theta _0)=m^{-1/2}{\bar{U}}_{m,G}(\theta _0)+o_p(1). \end{aligned}$$

Proof

By the induced smoothing method, we have

$$\begin{aligned} m^{-1/2}\left( {\tilde{U}}_{m,G}(\theta _0)-{\bar{U}}_{m,G}(\theta _0)\right)= & {} \frac{1}{m^{3/2}}\sum \limits _{i=1}^{m}\sum \limits _{j=1}^{m}\delta _iW_iW_j(X_i-X_j)\\&\times \left[ I(e_j(\theta _0)-e_i(\theta _0)\ge 0)-\Phi (\frac{e_j(\theta _0) -e_i(\theta _0)}{r_{ij}})\right] . \end{aligned}$$

Because $\left| I(e_j(\theta _0)-e_i(\theta _0)\ge 0)-\Phi (\frac{e_j(\theta _0) -e_i(\theta _0)}{r_{ij}})\right| \le \Phi (-|\frac{e_j(\theta _0) -e_i(\theta _0)}{r_{ij}}|)$, we obtain

$$\begin{aligned} \Vert m^{-1/2}({\tilde{U}}_{m,G}(\theta _0)-{\bar{U}}_{m,G}(\theta _0))\Vert\le & {} \left\| \frac{1}{m^2}\sum \limits _{i=1}^{m}\sum \limits _{j=1}^{m}\frac{\delta _iW_iW_j (X_{i}-X_{j})\sqrt{m}r_{ij}}{|e_j(\theta _0) -e_i(\theta _0)|}\right\| \\&\times \left| \frac{e_j(\theta _0) -e_i(\theta _0)}{r_{ij}}\right| \Phi \left( -\left| \frac{e_j(\theta _0) -e_i(\theta _0)}{r_{ij}}\right| \right) . \end{aligned}$$

Because $\Phi (-x)\le (\sqrt{2\pi }x)^{-1}\exp \{-x^2/2\}$ for $x>0$, we have $\lim \nolimits _{x\rightarrow +\infty }x\Phi (-x)=0$. Since $r_{ij}=\sqrt{\frac{(X_j-X_i)^{'}(X_j-X_i)}{m}}$, the term $|\frac{e_j(\theta _0) -e_i(\theta _0)}{r_{ij}}|\Phi (-|\frac{e_j(\theta _0) -e_i(\theta _0)}{r_{ij}}|)=|\frac{\sqrt{m}[e_j(\theta _0) -e_i(\theta _0)]}{\sqrt{(X_j-X_i)^{'}(X_j-X_i)}}| \Phi (-|\frac{\sqrt{m}[e_j(\theta _0) -e_i(\theta _0)]}{\sqrt{(X_j-X_i)^{'}(X_j-X_i)}}|)$ goes to zero as $m$ goes to infinity. Therefore, Lemma 2 holds by applying the strong law of large numbers (Pollard 1990). $\square $
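The two analytic facts used in the proof of Lemma 2 can be checked numerically. The following is a minimal sketch, not part of the original proof: it evaluates the tail bound $\Phi (-x)\le (\sqrt{2\pi }x)^{-1}\exp \{-x^2/2\}$ on a few positive values of $x$, the product $x\Phi (-x)$, and the pointwise gap between the indicator and its normal smoothing. The grid values and the helper name `smoothing_gap` are illustrative choices.

```python
import numpy as np
from scipy.stats import norm


def smoothing_gap(u):
    """Return |I(u >= 0) - Phi(u)|, the gap between the indicator and its smoothing."""
    u = np.asarray(u, dtype=float)
    return np.abs((u >= 0).astype(float) - norm.cdf(u))


# Tail bound: Phi(-x) <= phi(x) / x = (sqrt(2*pi) * x)^{-1} * exp(-x^2 / 2) for x > 0.
x = np.array([0.5, 1.0, 2.0, 5.0, 10.0, 20.0])
tail = norm.cdf(-x)
mills_bound = norm.pdf(x) / x

# The product x * Phi(-x) that appears in the proof; it vanishes as x grows.
x_times_tail = x * tail

# The gap equals Phi(-|u|), so it is in particular bounded by Phi(-|u|).
u = np.linspace(-6.0, 6.0, 241)
identity_error = np.max(np.abs(smoothing_gap(u) - norm.cdf(-np.abs(u))))

for xi, t, b, p in zip(x, tail, mills_bound, x_times_tail):
    print(f"x={xi:5.1f}  Phi(-x)={t:.3e}  bound={b:.3e}  x*Phi(-x)={p:.3e}")
print("max deviation from Phi(-|u|):", identity_error)
```

In the proof, the argument of $\Phi $ is scaled by $\sqrt{m}$, so for a fixed pair with distinct residuals the product is evaluated at ever larger $x$ as $m$ grows.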

Proof of Theorem 1:

Because ${\tilde{U}}_{m,G}(\theta )$ is the gradient of the convex objective function

$$\begin{aligned} L_m(\theta )=m^{-1}\sum \limits _{i=1}^m \sum \limits _{j=1}^m\delta _iW_iW_j(e_j(\theta )-e_i(\theta ))I(e_j(\theta )-e_i(\theta )\ge 0), \end{aligned}$$

(7.7)

an estimator can be obtained by minimizing $L_m(\theta )$ with respect to $\theta $, and the resulting set of minimizers is convex. However, $L_m(\theta )$ is not smooth, which presents computational challenges. Using standard results for normal random variables and integration by parts, we obtain the smoothed objective function

$$\begin{aligned} {\bar{L}}_m(\theta )= & {} m^{-1}\sum \limits _{i=1}^m \sum \limits _{j=1}^m\delta _iW_iW_j\left[ (e_j(\theta )-e_i(\theta ))\Phi \left( \frac{e_j (\theta )-e_i(\theta )}{r_{ij}}\right) \right. \nonumber \\&+r_{ij}\phi \left( \frac{e_j (\theta )-e_i(\theta )}{r_{ij}}\right) \left. \right] , \end{aligned}$$

(7.8)

where $\phi (\cdot )$ and $\Phi (\cdot )$ are the standard normal density and distribution functions, respectively. A straightforward calculation shows that ${\bar{U}}_{m,G}(\theta )=\partial {\bar{L}}_m(\theta )/\partial \theta $. The smoothed objective function ${\bar{L}}_m(\theta )$ is convex and continuously differentiable. Hence, standard numerical methods can be used to obtain ${\widehat{\theta }}_m=\arg \min \nolimits _{\theta \in {\mathcal {B}}} {\bar{L}}_m(\theta )$. By Lemmas 1 and 2 of Johnson and Strawderman (2009), the respective minimizers ${\tilde{\theta }}_m$ and ${\widehat{\theta }}_m$ of $L_m(\theta )$ and ${\bar{L}}_m(\theta )$ converge almost surely to $\theta _0$ (Andersen and Gill 1982, Corollary II.2). By a Taylor expansion of ${\bar{U}}_{m,G}(\theta )$ around $\theta _0$, we have

$$\begin{aligned} {\bar{U}}_{m,G}(\theta )- {\bar{U}}_{m,G}(\theta _0)=\frac{\partial {{\bar{U}}_{m,G}}(\theta )}{\partial \theta }|_{\theta ^*} ( \theta -\theta _0), \end{aligned}$$

where $\theta ^*$ lies between $\theta $ and $\theta _0$. Substituting ${\widehat{\theta }}_{m}$ for $\theta $ and using ${\bar{U}}_{m,G}({\widehat{\theta }}_{m})=0$, we obtain

$$\begin{aligned} m^{-1/2}{{\bar{U}}_{m,G}}(\theta _0)= \left\{ -m^{-1}\frac{\partial {\bar{U}}_{m,G}(\theta ^*)}{\partial \theta } \right\} \sqrt{m} ( {\widehat{\theta }}_{m}-\theta _0) \end{aligned}$$

where $\theta ^*$ lies between ${\widehat{\theta }}_{m}$ and $\theta _0$. The asymptotic normality of ${\widehat{\theta }}_{m}$ then follows from Lemmas 1 and 2, Condition (C5), and the consistency of ${\widehat{\theta }}_{m}$. Hence, Theorem 1 holds. $\square $
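The minimization of the smoothed objective in (7.8) can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' implementation: it assumes the usual accelerated failure time residual $e_i(\theta )=\log Y_i-X_i^{'}\theta $ with $Y_i$ the observed time (the residual is not defined in this appendix), uses placeholder weights $W_i=1$ instead of the outcome-dependent sampling weights, and simulates its own data. The function names `smoothed_loss` and `smoothed_score` are hypothetical. The finite-difference check at the end illustrates the statement that the smoothed estimating function is the gradient of the smoothed objective.

```python
import numpy as np
from scipy.optimize import check_grad, minimize
from scipy.stats import norm


def _pairs(theta, log_y, X):
    """Pairwise quantities: u[i, j] = e_j - e_i, dX[i, j] = X_i - X_j, r[i, j] = r_ij."""
    e = log_y - X @ theta  # assumed residual e_i(theta) = log Y_i - X_i' theta
    u = e[None, :] - e[:, None]
    dX = X[:, None, :] - X[None, :, :]
    r = np.sqrt((dX ** 2).sum(axis=2) / len(log_y))
    return u, dX, r


def smoothed_loss(theta, log_y, delta, X, W):
    """Smoothed objective in the form of (7.8)."""
    u, _, r = _pairs(theta, log_y, X)
    z = u / np.where(r > 0, r, 1.0)
    # When r_ij = 0 the smoothed term reduces to its unsmoothed limit max(u, 0).
    term = np.where(r > 0, u * norm.cdf(z) + r * norm.pdf(z), np.maximum(u, 0.0))
    pair_w = (delta * W)[:, None] * W[None, :]
    return (pair_w * term).sum() / len(log_y)


def smoothed_score(theta, log_y, delta, X, W):
    """Gradient of smoothed_loss: pairs weighted by Phi((e_j - e_i) / r_ij)."""
    u, dX, r = _pairs(theta, log_y, X)
    z = u / np.where(r > 0, r, 1.0)  # pairs with r_ij = 0 have dX = 0 and drop out
    pair_w = (delta * W)[:, None] * W[None, :] * norm.cdf(z)
    return (pair_w[:, :, None] * dX).sum(axis=(0, 1)) / len(log_y)


# Simulated data (illustrative only): two covariates, independent censoring.
rng = np.random.default_rng(0)
m = 200
theta0 = np.array([1.0, -0.5])
X = rng.normal(size=(m, 2))
log_t = X @ theta0 + 0.5 * rng.normal(size=m)
log_c = rng.normal(loc=1.5, scale=1.0, size=m)
log_y = np.minimum(log_t, log_c)
delta = (log_t <= log_c).astype(float)
W = np.ones(m)  # placeholder; an ODS analysis would supply design-based weights

args = (log_y, delta, X, W)
fit = minimize(smoothed_loss, x0=np.zeros(2), args=args, jac=smoothed_score, method="BFGS")
theta_hat = fit.x

# Finite-difference check that smoothed_score is the gradient of smoothed_loss.
grad_error = check_grad(smoothed_loss, smoothed_score, np.zeros(2), *args)
print("theta_hat:", theta_hat, "gradient check error:", grad_error)
```

Because the smoothed objective is convex and continuously differentiable, a quasi-Newton routine with the analytic gradient is one reasonable choice here; other standard optimizers would also apply.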

About this article

Cite this article

Yu, J., Zhou, H. & Cai, J. Accelerated failure time model for data from outcome-dependent sampling. Lifetime Data Anal 27, 15–37 (2021). https://doi.org/10.1007/s10985-020-09508-y

Received: 26 December 2019
Accepted: 29 September 2020
Published: 12 October 2020
Issue Date: January 2021
DOI: https://doi.org/10.1007/s10985-020-09508-y
