Abstract
In this paper, a copula-graphic estimator is proposed for censored survival data. It is assumed that there is some dependent censoring acting on the variable of interest that may come from an existing competing risk. Furthermore, the full process is independently censored by some administrative censoring time. The dependent censoring is modeled through an Archimedean copula function, which is supposed to be known. An asymptotic representation of the estimator as a sum of independent and identically distributed random variables is obtained, and, consequently, a central limit theorem is established. We investigate the finite sample performance of the estimator through simulations. A real data illustration is included.
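For orientation, the classical copula-graphic estimator that this work generalizes can be sketched in a few lines. The sketch below follows the closed form usually attributed to Rivest and Wells (2001) for an Archimedean generator \(\phi\); it assumes untied observations and ignores the additional administrative censoring treated in the paper. The function names (`copula_graphic`, `independence_generator`, `clayton_generator`) are hypothetical and not taken from the article.

```python
import math

def independence_generator():
    # phi(u) = -log(u): the copula-graphic estimate then reduces to Kaplan-Meier.
    return (lambda u: -math.log(u)), (lambda x: math.exp(-x))

def clayton_generator(theta):
    # Clayton generator for theta > 0 (illustrative choice of a known copula).
    phi = lambda u: (u ** (-theta) - 1.0) / theta
    phi_inv = lambda x: (1.0 + theta * x) ** (-1.0 / theta)
    return phi, phi_inv

def copula_graphic(times, events, phi, phi_inv):
    """Classical copula-graphic survival estimate at each sorted observation.

    times   : observed times Z_i = min(failure, dependent censoring), assumed untied
    events  : 1 if the failure was observed (delta_i = 1), else 0
    phi     : Archimedean generator, decreasing on (0, 1] with phi(1) = 0
    phi_inv : inverse of phi
    Returns a list of (time, survival estimate) pairs in increasing time order.
    """
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    acc = 0.0      # running sum of generator increments at observed failures
    dead = False   # True once the estimate has dropped to 0
    out = []
    for rank, i in enumerate(order):
        at_risk = n - rank  # number of observations with Z_j >= Z_i
        if events[i] == 1 and not dead:
            if at_risk == 1:
                dead = True  # phi(0) is infinite for a strict generator
            else:
                acc += phi((at_risk - 1) / n) - phi(at_risk / n)
        out.append((times[i], 0.0 if dead else phi_inv(acc)))
    return out
```

With the independence generator the increments telescope into the usual product-limit factors, which gives a convenient sanity check for any implementation.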
References
Fleming TR, Harrington DP (1991) Counting processes and survival analysis. Wiley, New York
Földes A, Rejtő L (1981) A LIL type result for the product limit estimator. Z Wahrscheinlichkeitstheor Verw Geb 56:75–86
Kalbfleisch JD, Prentice RL (1980) The statistical analysis of failure time data. Wiley, New York
Kaplan EL, Meier P (1958) Nonparametric estimation from incomplete observations. J Am Stat Assoc 53:457–481
Lakhal L, Rivest LP, Abdous B (2008) Estimating survival and association in a semicompeting risks model. Biometrics 64:180–188
Lo SH, Singh K (1986) The product-limit estimator and the bootstrap: some asymptotic representations. Probab Theory Relat Fields 71:455–465
Major P, Rejtő L (1988) Strong embedding of the estimator of the distribution function under random censorship. Ann Stat 16:1113–1132
Nelsen RB (2006) An introduction to copulas. Springer, New York
Rivest LP, Wells MT (2001) A martingale approach to the copula-graphic estimator for the survival function under dependent censoring. J Multivar Anal 79:138–155
Said M, Ghazzali N, Rivest LP (2009) Score tests for independence in semiparametric competing risks models. Lifetime Data Anal 15:413–444
Sánchez-Sellero C, González-Manteiga W, Van Keilegom I (2005) Uniform representation of product-limit integrals with applications. Scand J Stat 32:563–581
Schäfer H (1986) Local convergence of empirical measures in the random censorship situation with application to density and rate estimators. Ann Stat 14:1240–1245
Stute W (1993) Consistent estimation under random censorship when covariables are present. J Multivar Anal 45:89–103
Stute W (1995) The central limit theorem under random censorship. Ann Stat 23:422–439
Stute W (1996) Distributional convergence under random censorship when covariables are present. Scand J Stat 23:461–471
Tsiatis A (1975) A nonidentifiability aspect of the problem of competing risks. Proc Natl Acad Sci 72:20–22
Van Keilegom I, Veraverbeke N (1997) Estimation and bootstrap with censored data in fixed design nonparametric regression. Ann Inst Stat Math 49:467–491
Zheng M, Klein JP (1995) Estimates of marginal survival for dependent competing risks based on an assumed copula. Biometrika 82:127–138
Acknowledgements
Work supported by the Grant MTM2008-03129 of the Spanish Ministry of Science and Innovation. The first author acknowledges support from the projects MTM2011-23204 of the Spanish Ministry of Science and Innovation (FEDER support included) and 10PXIB300068PR of the Xunta de Galicia. The second author also acknowledges the IAP Research Network P6/03 of the Belgian State (Belgian Science Policy). Noël Veraverbeke is extraordinary professor at the North-West University, Potchefstroom, South Africa.
Appendix 1: Technical lemmas
In this section, we give the technical lemmas needed in the proof of Theorem 1.
Lemma 1
Under the conditions in Theorem 1, we have
Proof
It is easy to see that, with probability 1,
The first term is \(O(n^{-1/2}(\log \log n)^{1/2})\) a.s.; cf. Földes and Rejtő (1981). The same order bound is shown to hold for the second term in Lemma 4 below, and the proof is complete. □
Lemma 2
Under the conditions in Theorem 1, we have
Proof
With probability 1 we have
Then, the assertion of the lemma follows from a result of Földes and Rejtő (1981). □
Lemma 3
Under the conditions in Theorem 1, we have
Proof
Divide \([0,T]\) into \(k_{n}=O(n^{1/2}(\log n)^{-1/2})\) subintervals \([t_{i},t_{i+1}]\) of length \(O(n^{-1/2}(\log n)^{1/2})\). Then, as in the proof of Lo and Singh (1986), we have
For I, we have by Taylor expansion and the fact that \(\sup_{0\leq t\leq T}\vert \overline{H}_{n}(t)-\overline{H}(t)\vert =O(n^{-1/2}(\log \log n)^{1/2})\) a.s. (Földes and Rejtő 1981):
Now further subdivide each interval \([t_{i},t_{i+1}]\) into \(a_{n}=O(n^{1/4}(\log n)^{-1/4})\) subintervals of length \(O(n^{-3/4}(\log n)^{3/4})\). By using Bernstein's inequality we can show that this term is bounded a.s. by
for some constant \(c>0\). By the modulus of continuity result for the Kaplan–Meier estimator (see Schäfer 1986) we obtain that \(I=O(n^{-3/4}(\log n)^{3/4})\) a.s. The term II is treated similarly and leads to the same order. It requires the almost sure behavior of the modulus of continuity of the estimator \(H_{n}^{1}\), and this follows from Lemma 5 below. In that lemma, take \(a_{n}=n^{-1/2}(\log n)^{1/2}\). □
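As a quick check of the orders used in this proof, the two levels of the partition can be worked out explicitly. This is only a worked sketch in which the stated rates are treated as exact sizes rather than \(O\)-bounds:

\[
\frac{T}{n^{-1/2}(\log n)^{1/2}} = T\,n^{1/2}(\log n)^{-1/2},
\qquad
\frac{n^{-1/2}(\log n)^{1/2}}{n^{1/4}(\log n)^{-1/4}} = n^{-3/4}(\log n)^{3/4}.
\]

The first quotient is the number of first-level subintervals of \([0,T]\); the second is the length of each second-level subinterval, which agrees with the final rate obtained for I.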
Lemmas 4 and 5 below are needed for the proofs of Lemmas 1 and 3, respectively. They are of independent interest, since they provide the almost sure rate of convergence and the almost sure behavior of the modulus of continuity for \(H_{n}^{1}\), the estimator of the cumulative incidence function of \(Z\) subject to \(\delta =1\).
Lemma 4
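For \(T<\min (T_{F},T_{G},T_{\widetilde{G}})\), we have
\[
\sup_{0\leq t\leq T}\bigl\vert H_{n}^{1}(t)-H^{1}(t)\bigr\vert =O\bigl(n^{-1/2}(\log \log n)^{1/2}\bigr)\quad \text{a.s.}
\]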
Proof
Define the following empirical estimators for the distribution function \(\widetilde{H}(t)=P(U\leq t)\) and for the subdistribution functions \(\widetilde{H}^{0}(t)=P(U\leq t,\rho =0)\) and \(\widetilde{H}^{11}(t)=P(U\leq t,\rho =1,\delta =1)\):
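In code, empirical estimators of this kind are plain relative frequencies. The sketch below assumes a sample of triples \((U_i,\rho_i,\delta_i)\) stored in three lists of equal length; the function name `empirical_subdistributions` and the data layout are illustrative assumptions, not notation from the article.

```python
def empirical_subdistributions(u, rho, delta):
    """Return empirical versions of H~, H~^0 and H~^11 as callables of t.

    u     : observed times U_i
    rho   : indicators rho_i (0 or 1)
    delta : indicators delta_i (0 or 1), only used when rho_i = 1
    """
    n = len(u)

    def h_tilde(t):
        # Fraction of observations with U_i <= t.
        return sum(1 for ui in u if ui <= t) / n

    def h_tilde_0(t):
        # Fraction with U_i <= t and rho_i = 0.
        return sum(1 for ui, ri in zip(u, rho) if ui <= t and ri == 0) / n

    def h_tilde_11(t):
        # Fraction with U_i <= t, rho_i = 1 and delta_i = 1.
        return sum(
            1 for ui, ri, di in zip(u, rho, delta)
            if ui <= t and ri == 1 and di == 1
        ) / n

    return h_tilde, h_tilde_0, h_tilde_11
```

Each returned function is a right-continuous step function with jumps of size \(1/n\) at the relevant observations.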
Then, \(H^{1}(t)\) can be expressed in terms of \(\widetilde{H}\), \(\widetilde{H}^{0}\), and \(\widetilde{H}^{11}\), and \(H_{n}^{1}(t)\) can be expressed in terms of the corresponding empiricals. Similarly to Stute (1995), we obtain
and
It follows that \(\sup_{0\leq t\leq T}\vert H_{n}^{1}(t)-H^{1}(t)\vert \) is smaller than
The second term in (4) is \(O(n^{-1/2}(\log \log n)^{1/2})\) a.s. For the first term in (4), we use (with obvious abbreviations) that
with \(\theta \) between 0 and \(a-b\). Note that \(\exp (b)\) is uniformly bounded in \([0,T]\). Looking at \((a-b)\), we have
The second term in (5) is \(O(n^{-1/2}(\log \log n)^{1/2})\) a.s. For the first term in (5), we use that, for \(x\geq 0\),
It follows that the first term in (5) is bounded above by
This is \(O(n^{-1/2}(\log \log n)^{1/2})\) a.s. since \(\sup_{0\leq z\leq T}\vert \widetilde{H}_{n}(z)-\widetilde{H}(z)\vert \) has the same order and since \(\widetilde{H}(T)<1\). □
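The mean value step used in this proof can be spelled out as follows. This is a sketch in which \(a\) and \(b\) stand for the two exponents abbreviated in the proof, whatever their exact form:

\[
e^{a}-e^{b}=e^{b}\bigl(e^{a-b}-1\bigr)=e^{b}\,(a-b)\,e^{\theta },\qquad \theta \text{ between } 0 \text{ and } a-b.
\]

Since \(e^{b}\) is uniformly bounded on \([0,T]\) and \(e^{\theta }\) stays bounded once \(a-b\) is small, the difference of exponentials inherits the almost sure order of \(\sup \vert a-b\vert \).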
Lemma 5
Suppose that \(T<\min (T_{F},T_{G},T_{\widetilde{G}})\). Suppose that \(H(t)=P(Z\leq t)\) and \(H^{1}(t)=P(Z\leq t,\delta =1)\) have bounded first derivatives in \([0,T]\). Let \(\{a_{n}\}\) be a sequence of positive constants tending to zero with \(a_{n}n(\log n)^{-5}>\Delta >0\) for all \(n\) sufficiently large. Then
Proof
We make the same partition of the interval \([0,T]\) as in Lemma A.5 of Van Keilegom and Veraverbeke (1997). Exploiting the monotonicity of \(H^{1}(t)\) and \(H_{n}^{1}(t)\) and also the Lipschitz continuity of \(H^{1}(t)\), we obtain that it suffices to prove that
where \(\{t_{ij}\}\), \(i=1,\ldots ,m\), \(j=-b_{n},\ldots ,b_{n}\), is a grid of points with \(m= [ \frac{T}{a_{n}} ] \) (\([\cdot ]\) denoting the integer part) and \(b_{n}\sim a_{n}^{1/2}n^{1/2}(\log n)^{-1/2}\). At this point, we use the almost sure asymptotic representation for \(H_{n}^{1}(t)\), which can be derived as a special case of the more general result of Sánchez-Sellero et al. (2005):
where
with
the function \(\widetilde{C}(t)\) being that in the remark of Sect. 2; and \(\sup_{0\leq t\leq T}\vert R_{n}(t)\vert =O(n^{-1}(\log n)^{3})\) a.s. It follows that it suffices to show that
To achieve this, we use Bernstein's inequality as in Van Keilegom and Veraverbeke (1997). The random variables \(\widetilde{\widetilde{\psi }}_{r}(t_{ik})-\widetilde{\widetilde{\psi }}_{r}(t_{ij})\) are bounded, and \(\operatorname {Var}(\widetilde{\widetilde{\psi }}_{r}(t_{ik})-\widetilde{\widetilde{\psi }}_{r}(t_{ij}))\) is bounded by a constant times \(a_{n}\). The latter fact is shown by checking six appropriate groups of terms in
For example, by direct calculation,
for some constant c>0 by the Lipschitz continuity of H. The other groups of terms are treated similarly. □
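The role of Bernstein's inequality in this argument can be illustrated numerically. The sketch below evaluates the standard Bernstein tail bound for an average of \(n\) independent, centered, bounded terms, together with the grid sizes \(m\) and \(b_{n}\) described in the proof. The function names, the use of a ceiling for \(b_{n}\), and the sample values of the constants are assumptions made for illustration only.

```python
import math

def bernstein_bound(n, var_bound, abs_bound, eps):
    """Bernstein upper bound on P(|average of n terms| > eps).

    Assumes independent, centered terms with variance <= var_bound
    and absolute value <= abs_bound.
    """
    return 2.0 * math.exp(
        -n * eps ** 2 / (2.0 * (var_bound + abs_bound * eps / 3.0))
    )

def lemma5_grid(n, a_n, T):
    """Grid sizes: m = [T / a_n] and b_n of order sqrt(a_n * n / log n)."""
    m = int(T / a_n)
    b_n = math.ceil(math.sqrt(a_n * n / math.log(n)))
    return m, b_n

if __name__ == "__main__":
    n = 10_000
    a_n = math.sqrt(math.log(n) / n)   # the choice of a_n used for Lemma 3
    m, b_n = lemma5_grid(n, a_n, T=1.0)
    # Variance of order a_n (hypothetical constant 1) and a deviation level
    # of order sqrt(a_n * log n / n) (hypothetical constant 4).
    eps = 4.0 * math.sqrt(a_n * math.log(n) / n)
    print(m, b_n, bernstein_bound(n, var_bound=a_n, abs_bound=1.0, eps=eps))
```

With a variance bound proportional to \(a_{n}\) and a deviation level proportional to \(a_{n}^{1/2}n^{-1/2}(\log n)^{1/2}\), the exponent is of order \(\log n\), so the bound decays polynomially in \(n\). That decay is what allows a union bound over the polynomially many grid points.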
Cite this article
de Uña-Álvarez, J., Veraverbeke, N. Generalized copula-graphic estimator. TEST 22, 343–360 (2013). https://doi.org/10.1007/s11749-012-0314-2
DOI: https://doi.org/10.1007/s11749-012-0314-2