Generalized copula-graphic estimator

de Uña-Álvarez, Jacobo; Veraverbeke, Noël

doi:10.1007/s11749-012-0314-2


Original Paper
Published: 18 January 2013

TEST, Volume 22, pages 343–360 (2013)

Jacobo de Uña-Álvarez¹ & Noël Veraverbeke²,³


Abstract

In this paper, a copula-graphic estimator is proposed for censored survival data. It is assumed that there is some dependent censoring acting on the variable of interest that may come from an existing competing risk. Furthermore, the full process is independently censored by some administrative censoring time. The dependent censoring is modeled through an Archimedean copula function, which is supposed to be known. An asymptotic representation of the estimator as a sum of independent and identically distributed random variables is obtained, and, consequently, a central limit theorem is established. We investigate the finite sample performance of the estimator through simulations. A real data illustration is included.
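For orientation, the classical copula-graphic estimator that this paper generalizes can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the estimator proposed here: it uses the closed form of Rivest and Wells (2001) for an Archimedean generator φ, assumes no ties, and ignores the administrative censoring layer. The function name `copula_graphic`, the Clayton parameter value, and the toy data are hypothetical.

```python
import numpy as np

def copula_graphic(times, events, phi, phi_inv):
    """Classical copula-graphic survival estimate (no ties, no administrative censoring).

    Uses phi(S(t)) = sum over uncensored T_i <= t of
    phi((Y_i - 1) / n) - phi(Y_i / n), where Y_i is the number at risk at T_i.
    Returns the sorted times and the estimate evaluated at each of them.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    order = np.argsort(times)
    t, d = times[order], events[order]
    n = len(t)
    at_risk = n - np.arange(n)              # Y_i = n, n-1, ..., 1
    with np.errstate(divide="ignore"):      # phi(0) may be +inf at the last point
        jumps = phi((at_risk - 1) / n) - phi(at_risk / n)
    phi_S = np.cumsum(np.where(d == 1, jumps, 0.0))
    return t, phi_inv(phi_S)

# Hypothetical toy data: event indicator 1 = event of interest, 0 = dependent censoring.
times = [2.0, 5.0, 1.0, 4.0, 3.0]
events = [1, 0, 1, 1, 0]

# Independence generator phi(u) = -log(u).
t_ind, S_ind = copula_graphic(times, events,
                              lambda u: -np.log(u), lambda s: np.exp(-s))

# Clayton generator with an assumed, known parameter theta (Kendall's tau = theta / (theta + 2)).
theta = 2.0
clayton_phi = lambda u: (u ** (-theta) - 1.0) / theta
clayton_phi_inv = lambda s: (1.0 + theta * s) ** (-1.0 / theta)
t_cl, S_cl = copula_graphic(times, events, clayton_phi, clayton_phi_inv)
```

With the independence generator, the sketch reduces to the Kaplan–Meier product-limit estimator, which gives a simple sanity check; with the Clayton generator, the estimate changes only after the first dependent-censoring time.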


References

Fleming TR, Harrington DP (1991) Counting processes and survival analysis. Wiley, New York
Földes A, Rejtő L (1981) A LIL type result for the product limit estimator. Z Wahrscheinlichkeitstheor Verw Geb 56:75–86
Kalbfleisch JD, Prentice RL (1980) The statistical analysis of failure time data. Wiley, New York
Kaplan EL, Meier P (1958) Nonparametric estimation from incomplete observations. J Am Stat Assoc 53:457–481
Lakhal L, Rivest LP, Abdous B (2008) Estimating survival and association in a semicompeting risks model. Biometrics 64:180–188
Lo SH, Singh K (1986) The product-limit estimator and the bootstrap: some asymptotic representations. Probab Theory Relat Fields 71:455–456
Major P, Rejtő L (1988) Strong embedding of the estimator of the distribution function under random censorship. Ann Stat 16:1113–1132
Nelsen RB (2006) An introduction to copulas. Springer, New York
Rivest LP, Wells MT (2001) A martingale approach to the copula-graphic estimator for the survival function under dependent censoring. J Multivar Anal 79:138–155
Said M, Ghazzali N, Rivest LP (2009) Score tests for independence in semiparametric competing risks models. Lifetime Data Anal 15:413–444
Sánchez-Sellero C, González-Manteiga W, Van Keilegom I (2005) Uniform representation of product-limit integrals with applications. Scand J Stat 32:563–581
Schäfer H (1986) Local convergence of empirical measures in the random censorship situation with application to density and rate estimators. Ann Stat 14:1240–1245
Stute W (1993) Consistent estimation under random censorship when covariables are present. J Multivar Anal 45:89–103
Stute W (1995) The central limit theorem under random censorship. Ann Stat 23:422–439
Stute W (1996) Distributional convergence under random censorship when covariables are present. Scand J Stat 23:461–471
Tsiatis A (1975) A nonidentifiability aspect of the problem of competing risks. Proc Natl Acad Sci 72:20–22
Van Keilegom I, Veraverbeke N (1997) Estimation and bootstrap with censored data in fixed design nonparametric regression. Ann Inst Stat Math 49:467–491
Zheng M, Klein JP (1995) Estimates of marginal survival for dependent competing risks based on an assumed copula. Biometrika 82:127–138


Acknowledgements

This work was supported by Grant MTM2008-03129 of the Spanish Ministry of Science and Innovation. The first author acknowledges support from the projects MTM2011-23204 of the Spanish Ministry of Science and Innovation (FEDER support included) and 10PXIB300068PR of the Xunta de Galicia. The second author also acknowledges the IAP Research Network P6/03 of the Belgian State (Belgian Science Policy). Noël Veraverbeke is an extraordinary professor at the North-West University, Potchefstroom, South Africa.

Author information

Authors and Affiliations

Department of Statistics and Operations Research, University of Vigo, 36310, Vigo, Spain
Jacobo de Uña-Álvarez
Center for Statistics, Universiteit Hasselt, Hasselt, Belgium
Noël Veraverbeke
Unit for BMI, North-West University, Potchefstroom, South Africa
Noël Veraverbeke


Corresponding author

Correspondence to Jacobo de Uña-Álvarez.

Appendix 1: Technical lemmas

In this section, we give the technical lemmas needed in the proof of Theorem 1.

Lemma 1

Under the conditions in Theorem 1, we have

$$ \sup_{0\leq t\leq T}\bigl \vert R_{n1}(t)\bigr \vert =O \bigl(n^{-1}\log \log n\bigr)\quad \text{\textit{a.s.}} $$

Proof

It is easy to see that, with probability 1,

The first term is $O(n^{-1/2}(\log \log n)^{1/2})$ a.s., cf. Földes and Rejtő (1981). The same order bound is proved to hold for the second term in Lemma 4 below, and the proof is complete. □

Lemma 2

Under the conditions in Theorem 1, we have

$$ \sup_{0\leq t\leq T}\bigl \vert R_{n2}(t)\bigr \vert =O \bigl(n^{-1}\log \log n\bigr)\quad \text{\textit{a.s.}} $$

Proof

With probability 1 we have

Then, the assertion of the lemma follows from a result of Földes and Rejtő (1981). □

Lemma 3

Under the conditions in Theorem 1, we have

$$ \sup_{0\leq t\leq T}\bigl \vert R_{n3}(t)\bigr \vert =O \bigl(n^{-3/4}(\log n)^{3/4}\bigr)\quad \text{\textit{a.s.}} $$

Proof

Divide $[0,T]$ into $k_{n}=O(n^{1/2}(\log n)^{1/2})$ subintervals $[t_{i},t_{i+1}]$ of length $O(n^{-1/2}(\log n)^{1/2})$. Then, as in the proof of Lo and Singh (1986), we have

For I, we have by Taylor expansion and the fact that $\sup_{0\leq t\leq T}\vert \overline{H}_{n}(t)-\overline{H}(t)\vert =O(n^{-1/2}(\log \log n)^{1/2})$ a.s. (Földes and Rejtő 1981):

Now further subdivide each interval $[t_{i},t_{i+1}]$ into $a_{n}=O(n^{1/4}(\log n)^{-1/4})$ subintervals of length $O(n^{-3/4}(\log n)^{3/4})$. By using Bernstein’s inequality, we can show that this term is bounded a.s. by

$$ c\max_{1\leq i\leq k_{n}}\max_{0\leq j\leq a_{n}-1}\bigl \vert H_{n}(t_{i,j+1})-H(t_{i,j+1})-H_{n}(t_{i})+H(t_{i}) \bigr \vert +O\bigl(n^{-3/4}(\log n)^{3/4}\bigr) $$

for some constant $c>0$. By the modulus of continuity result for the Kaplan–Meier estimator (see Schäfer 1986), we obtain that $I=O(n^{-3/4}(\log n)^{3/4})$ a.s. The II term is treated similarly and leads to the same order. It requires the almost sure behavior of the modulus of continuity of the $H_{n}^{1}$ estimator, and this follows from Lemma 5 below. In that lemma, take $a_{n}=n^{-1/2}(\log n)^{1/2}$. □
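As a quick consistency check on the orders used in this proof (simple arithmetic, not part of the original argument): splitting an interval of length $n^{-1/2}(\log n)^{1/2}$ into $n^{1/4}(\log n)^{-1/4}$ pieces, and evaluating the rate of Lemma 5 at $a_{n}=n^{-1/2}(\log n)^{1/2}$, both give the order claimed in Lemma 3:

$$ \frac{n^{-1/2}(\log n)^{1/2}}{n^{1/4}(\log n)^{-1/4}}=n^{-3/4}(\log n)^{3/4},\qquad \bigl(n^{-1/2}(\log n)^{1/2}\bigr)^{1/2}n^{-1/2}(\log n)^{1/2}=n^{-3/4}(\log n)^{3/4}. $$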

Lemmas 4 and 5 below are needed for the proofs of Lemmas 1 and 3, respectively. They have some independent interest since they provide the almost sure rate of convergence and the almost sure behavior of the modulus of continuity for the estimator of the cumulative incidence function of Z subject to δ=1 ($H_{n}^{1}$).

Lemma 4

For $T<\min (T_{F},T_{G},T_{\widetilde{G}})$, we have

$$ \sup_{0\leq t\leq T}\bigl \vert H_{n}^{1}(t)-H^{1}(t) \bigr \vert =O\bigl(n^{-1/2}(\log \log n)^{1/2}\bigr)\quad \text{\textit{a.s.}} $$

Proof

Define the following empirical estimators for the distribution function $\widetilde{H}(t)=P(U\leq t)$ and for the subdistribution functions $\widetilde{H}^{0}(t)=P(U\leq t,\rho =0)$ and $\widetilde{H}^{11}(t)=P(U\leq t,\rho =1,\delta =1)$:

Then, $H^{1}(t)$ can be expressed in terms of $\widetilde{H}$, $\widetilde{H}^{0}$, and $\widetilde{H}^{11}$, and $H_{n}^{1}(t)$ can be expressed in terms of the corresponding empirical estimators. As in Stute (1995), we obtain

$$ H_{n}^{1}(t)=\int_{0}^{t}\exp \biggl\{ n\int_{0}^{u}\log \biggl(1+ \frac{1}{n(1-\widetilde{H}_{n}(z))}\biggr)\,d\widetilde{H}_{n}^{0}(z) \biggr\} \,d \widetilde{H}_{n}^{11}(u) $$

and

$$ H^{1}(t)=\int_{0}^{t}\exp \biggl\{ \int _{0}^{u}\frac{d\widetilde{H}^{0}(z)}{1-\widetilde{H}(z)} \biggr\} \,d \widetilde{H}^{11}(u). $$

It follows that $\sup_{0\leq t\leq T}\vert H_{n}^{1}(t)-H^{1}(t)\vert $ is smaller than

(4)

The second term in (4) is $O(n^{-1/2}(\log \log n)^{1/2})$ a.s. For the first term in (4), we use (with obvious abbreviations) that

$$ \exp (a)-\exp (b)=\exp (b)\,(a-b)\exp (\theta ) $$

with θ between 0 and a−b. Note that exp(b) is uniformly bounded in [0,T]. Looking at (a−b), we have

(5)

The second term in (5) is $O(n^{-1/2}(\log \log n)^{1/2})$ a.s. For the first term in (5), we use that, for $x\geq 0$,

$$ x-\frac{1}{2}x^{2}\leq \log (1+x)\leq x. $$
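To make the next step explicit (a worked substitution, with $x=1/\{n(1-\widetilde{H}_{n}(z))\}$ taken as the natural choice given the representation of $H_{n}^{1}$), multiplying the inequality by $n$ gives

$$ \frac{1}{1-\widetilde{H}_{n}(z)}-\frac{1}{2n(1-\widetilde{H}_{n}(z))^{2}}\leq n\log \biggl(1+\frac{1}{n(1-\widetilde{H}_{n}(z))}\biggr)\leq \frac{1}{1-\widetilde{H}_{n}(z)}, $$

so that, by the triangle inequality, replacing $n\log (1+x)$ by $1/(1-\widetilde{H}(z))$ costs at most the distance between the two reciprocals plus the quadratic correction term.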

It follows that the first term in (5) is bounded above by

$$ \sup_{0\leq z\leq T}\biggl \vert \frac{1}{1-\widetilde{H}_{n}(z)}-\frac{1}{1-\widetilde{H}(z)}\biggr \vert +\frac{1}{2n}\sup_{0\leq z\leq T}\frac{1}{(1-\widetilde{H}_{n}(z))^{2}}. $$

This is $O(n^{-1/2}(\log \log n)^{1/2})$ a.s. since $\sup_{0\leq z\leq T}\vert \widetilde{H}_{n}(z)-\widetilde{H}(z)\vert $ has the same order and since $\widetilde{H}(T)<1$. □
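The exponential representation of $H_{n}^{1}$ used in this proof can be checked numerically against the familiar Kaplan–Meier weights. The Python sketch below assumes no ties, takes ρ=1 to mean that Z is observed (not administratively censored), and uses hypothetical function names and toy data; it illustrates the algebraic identity only and is not part of the authors' argument.

```python
import numpy as np

def h1_exponential_form(u, rho, delta, t):
    """H_n^1(t) via the exponential representation (no ties assumed)."""
    u, rho, delta = (np.asarray(a) for a in (u, rho, delta))
    order = np.argsort(u)
    u, rho, delta = u[order], rho[order], delta[order]
    n = len(u)
    # n * (1 - H~_n(U_(j))) = n - j for the j-th order statistic (j = 1, ..., n).
    denom = (n - np.arange(1, n + 1)).astype(float)
    terms = np.zeros(n)
    mask = (rho == 0) & (denom > 0)          # censored points; skip the largest one
    terms[mask] = np.log(1.0 + 1.0 / denom[mask])
    inner = np.cumsum(terms) - terms         # sum over censored points before U_(i)
    weights = np.exp(inner) * ((rho == 1) & (delta == 1)) / n
    return weights[u <= t].sum()

def h1_km_weight_form(u, rho, delta, t):
    """Same quantity as a sum of Kaplan-Meier jumps restricted to delta = 1."""
    u, rho, delta = (np.asarray(a) for a in (u, rho, delta))
    order = np.argsort(u)
    u, rho, delta = u[order], rho[order], delta[order]
    n = len(u)
    surv_before = 1.0                        # Kaplan-Meier survival just before U_(i)
    total = 0.0
    for i in range(n):
        at_risk = n - i
        if rho[i] == 1:
            jump = surv_before / at_risk
            if delta[i] == 1 and u[i] <= t:
                total += jump
            surv_before *= 1.0 - 1.0 / at_risk
    return total

# Hypothetical toy sample (delta is only meaningful where rho = 1).
u = [1.0, 2.0, 3.0, 4.0, 5.0]
rho = [1, 0, 1, 1, 0]
delta = [1, 0, 0, 1, 0]
value_exp = h1_exponential_form(u, rho, delta, t=5.0)
value_km = h1_km_weight_form(u, rho, delta, t=5.0)
```

On this toy sample both functions return 7/15, which is what the product form of the Kaplan–Meier weights predicts.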

Lemma 5

Suppose that $T<\min (T_{F},T_{G},T_{\widetilde{G}})$. Suppose that $H(t)=P(Z\leq t)$ and $H^{1}(t)=P(Z\leq t,\delta =1)$ have bounded first derivatives in $[0,T]$. Let $\{a_{n}\}$ be a sequence of positive constants tending to zero with $a_{n}n(\log n)^{-5}>\Delta >0$ for all $n$ sufficiently large. Then

$$ \sup_{0\leq t,s\leq T,\vert t-s\vert \leq a_{n}}\bigl \vert H_{n}^{1}(t)-H_{n}^{1}(s)-H^{1}(t)+H^{1}(s) \bigr \vert =O\bigl(a_{n}^{1/2}n^{-1/2}(\log n)^{1/2}\bigr)\quad \text{\textit{a.s.}} $$

Proof

We make the same partition of the interval $[0,T]$ as in Lemma A.5 of Van Keilegom and Veraverbeke (1997). Exploiting the monotonicity of $H^{1}(t)$ and $H_{n}^{1}(t)$ and also the Lipschitz continuity of $H^{1}(t)$, we obtain that it suffices to prove that

where $\{t_{ij}\}$, $i=1,\ldots ,m$, $j=-b_{n},\ldots ,b_{n}$, is a grid of points with $m= [ \frac{T}{a_{n}} ] $ ([⋅] denoting the integer part) and $b_{n}\sim a_{n}^{1/2}n^{1/2}(\log n)^{-1/2}$. At this point, we use the almost sure asymptotic representation for $H_{n}^{1}(t)$, which can be derived as a special case of the more general result of Sánchez-Sellero et al. (2005):

$$ H_{n}^{1}(t)-H^{1}(t)=\frac{1}{n}\sum _{i=1}^{n}\widetilde{\widetilde{\psi }}_{i}(t)+R_{n}(t), $$

where

with

the function $\widetilde{C}(t)$ being that in the remark of Sect. 2; and $\sup_{0\leq t\leq T}\vert R_{n}(t)\vert =O(n^{-1}(\log n)^{3})$ a.s. It follows that it suffices to show that

$$ \max_{1\leq i\leq m-1}\max_{-b_{n}<j,k<b_{n}}\Biggl \vert \frac{1}{n}\sum_{r=1}^{n}\bigl(\widetilde{\widetilde{ \psi }}_{r}(t_{ik})-\widetilde{\widetilde{\psi }}_{r}(t_{ij})\bigr)\Biggr \vert =O\bigl(a_{n}^{1/2}n^{-1/2}( \log n)^{1/2}\bigr). $$

To achieve this, we use Bernstein’s inequality as in Van Keilegom and Veraverbeke (1997). The random variables $\widetilde{\widetilde{\psi }}_{r}(t_{ik})-\widetilde{\widetilde{\psi }}_{r}(t_{ij})$ are bounded, and $\operatorname {Var}(\widetilde{\widetilde{\psi }}_{r}(t_{ik})-\widetilde{\widetilde{\psi }}_{r}(t_{ij}))$ is bounded by a constant times $a_{n}$. The latter fact is shown by checking six appropriate groups of terms in

For example, by direct calculation,

for some constant c>0 by the Lipschitz continuity of H. The other groups of terms are treated similarly. □
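The way Bernstein’s inequality produces the stated rate can be sketched as follows. This is a heuristic outline with generic constants $M$, $c$, and $K$ (all introduced here for illustration), and it assumes the summands are centered; it is not the authors' detailed argument. For independent centered random variables $X_{1},\ldots ,X_{n}$ with $\vert X_{r}\vert \leq M$ and $\operatorname {Var}(X_{r})\leq \sigma ^{2}$, Bernstein’s inequality gives

$$ P\Biggl(\Biggl \vert \frac{1}{n}\sum_{r=1}^{n}X_{r}\Biggr \vert >\varepsilon \Biggr)\leq 2\exp \biggl\{ -\frac{n\varepsilon ^{2}}{2\sigma ^{2}+\frac{2}{3}M\varepsilon }\biggr\} . $$

Taking $\sigma ^{2}=ca_{n}$ and $\varepsilon =Ka_{n}^{1/2}n^{-1/2}(\log n)^{1/2}$, the exponent becomes

$$ -\frac{K^{2}\log n}{2c+\frac{2}{3}MK\bigl(\log n/(na_{n})\bigr)^{1/2}}, $$

and $\log n/(na_{n})\rightarrow 0$ because $a_{n}n(\log n)^{-5}>\Delta $. Each probability is therefore at most $2n^{-K^{2}/(2c)(1+o(1))}$. Since the number of grid pairs is of order $m b_{n}^{2}\sim Tn/\log n$, a union bound is summable in $n$ once $K$ is chosen large enough, and the Borel–Cantelli lemma yields the almost sure rate.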


About this article

Cite this article

de Uña-Álvarez, J., Veraverbeke, N. Generalized copula-graphic estimator. TEST 22, 343–360 (2013). https://doi.org/10.1007/s11749-012-0314-2


Received: 05 September 2012
Accepted: 21 December 2012
Published: 18 January 2013
Issue Date: June 2013
DOI: https://doi.org/10.1007/s11749-012-0314-2
