Abstract
This paper addresses the problem of estimating, from randomly censored data subject to competing risks, the extreme value index of the (sub-)distribution function associated with one particular cause, in a heavy-tail framework. The proposed estimator has the form of an Aalen-Johansen integral and is the first estimator proposed in this context; its asymptotic normality is established. Estimation of extreme quantiles of the cumulative incidence function is then addressed as a consequence. A small simulation study illustrates the finite-sample performance of the estimator.
References
Aalen O, Johansen S (1978) An empirical transition matrix for nonhomogeneous Markov chains based on censored observations. Scand J Stat 5:141–150
Akritas MG (2000) The central limit theorem under censoring. Bernoulli 6(6):1109–1120
Andersen PK, Borgan O, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer Series in Statistics. Springer, New York
Beirlant J, Dierckx G, Fils-Villetard A, Guillou A (2007) Estimation of the extreme value index and extreme quantiles under random censoring. Extremes 10:151–174
Beyersmann J, Schumacher M (2008) A note on nonparametric quantile inference for risks and more complex multistate models. Biometrika 95(4):1006–1008
Bingham NH, Goldie CM, Teugels JL (1987) Regular variation. Cambridge University Press, Cambridge
Chow YS, Teicher H (1997) Probability theory. Independence, interchangeability, martingales. Springer, New York
Crowder M (2001) Classical competing risks. Chapman and Hall, London
Csörgő M, Szyszkowicz B, Wang Q (2008) Asymptotics of studentized U-type processes for change-point problems. Acta Math Hung 121(4):333–357
de Haan L, Ferreira A (2006) Extreme value theory: an introduction. Springer, New York
Einmahl J, Fils-Villetard A, Guillou A (2008) Statistics of extremes under random censoring. Bernoulli 14:207–227
Fermanian J-D (2003) Nonparametric estimation of competing risks models with covariates. J Multivar Anal 85:156–191
Geffray S (2009) Strong approximations for dependent competing risks with independent censoring. Test 18:76–95
Gerds T, Beyersmann J, Starkopf L, Schumacher M (2017) The Kaplan–Meier integral in the presence of covariates: a review. Chapter 2 of From Statistics to Mathematical Finance: Festschrift in Honour of Winfried Stute, pp 25–41
Moeschberger ML, Klein JP (1995) Statistical methods for dependent competing risks. Lifetime Data Anal 1:195–204
Peng L, Fine JP (2007) Nonparametric quantile inference with competing-risks data. Biometrika 94:735–744
Smith R (1987) Estimating tails of probability distributions. Ann Stat 15(3):1174–1207
Stute W (1994) Strong and weak representations of cumulative hazard function and Kaplan–Meier estimators on increasing sets. J Stat Plan Inference 43:315–329
Stute W (1995) The central limit theorem under random censorship. Ann Stat 23(2):422–439
Suzukawa A (2002) Asymptotic properties of Aalen-Johansen integrals for competing risks data. J Jpn Stat Soc 32(1):77–93
Tsiatis A (1975) A nonidentifiability aspect of the problem of competing risks. Proc Natl Acad Sci USA 72:20–22
Worms J, Worms R (2014) New estimators of the extreme value index under random right censoring, for heavy-tailed distributions. Extremes 17(2):337–358
Worms J, Worms R (2016) A Lynden-Bell integral estimator for extremes of randomly truncated data. Stat Probab Lett 109:106–117
Zhou M (1991) Some properties of the Kaplan–Meier estimator for independent nonidentically distributed random variables. Ann Stat 19(4):2266–2274
Appendix
This Appendix contains various results. Some are used repeatedly in the proof of the main result (in particular Proposition 4 and Lemmas 7 and 10, and to a lesser extent Lemmas 8 and 9); the others concern parts of the main proof that are postponed to the Appendix to keep the main flow of the proof clear (Lemmas 11, 12 and 13). All the proofs can be found in a supplementary material file available on the first author’s webpage at http://lmv.math.cnrs.fr/annuaire/julien-worms/article/julien-worms-english.
Definition 1
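An ultimately positive function f : \(\mathbb {R}^+ \rightarrow \mathbb {R}\) is regularly varying (at infinity) with index \(\alpha \in \mathbb {R}\) if
$$\begin{aligned} \lim _{t \rightarrow +\infty } \frac{f(tx)}{f(t)} = x^{\alpha } \quad \hbox {for all } x>0 . \end{aligned}$$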
This is noted \(f \in RV_{\alpha }\). If \(\alpha =0\), f is said to be slowly varying.
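As a quick numerical illustration of this definition (regular variation with index \(\alpha \) means that \(f(tx)/f(t)\) approaches \(x^{\alpha }\) as t grows), the sketch below uses a hypothetical function \(f(y)=y^{\alpha }\log y\), chosen here only for illustration, with \(\alpha =-2\) and \(x=3\). The logarithmic factor is slowly varying, so convergence is slow but visible.

```python
import math

ALPHA = -2.0  # hypothetical index, chosen only for this illustration


def f(y):
    # y**ALPHA times the slowly varying factor log(y)
    return y ** ALPHA * math.log(y)


def ratio(func, t, x):
    """Return func(t*x) / func(t), the quantity whose limit defines regular variation."""
    return func(t * x) / func(t)


x = 3.0
target = x ** ALPHA  # limit predicted by the definition
ts = (1e2, 1e4, 1e8, 1e16)
errors = [abs(ratio(f, t, x) - target) for t in ts]
for t, err in zip(ts, errors):
    print(f"t = {t:.0e}   |f(tx)/f(t) - x^alpha| = {err:.5f}")
```

For this particular f the error equals \(x^{\alpha }\log (x)/\log (t)\), which explains the slow, logarithmic decay of the printed values.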
Proposition 4
(See de Haan and Ferreira 2006, Proposition B.1.9)
Suppose \(f \in RV_{\alpha }\). If \(x < 1\) and \(\epsilon >0\), then there exists \(t_0=t_0(\epsilon )\) such that for every \(t\ge t_0\),
and if \(x \ge 1\) ,
Lemma 7
Let \(x \in \mathbb {R}_+^* \), \(\alpha \in \mathbb {R}_+\) and \(\beta >-1\), and let f and g be two regularly varying functions at infinity with respective indices a and b, where a and b are real numbers. Then, as \(t \rightarrow + \infty \),
(i) \( \displaystyle J_{\beta }(x) = \int _1^{+\infty } \log ^{\beta } (y) \ y^{-x-1} dy = \frac{\varGamma (\beta +1)}{x^{\beta +1}}\).
(ii) \( \displaystyle I_{\alpha ,a,b} = \int _{1}^{+\infty } \log ^{\alpha } (y) \ \frac{f(yt)}{f(t)} \ \frac{dg(yt)}{g(t)} \rightarrow \frac{b \varGamma (\alpha +1)}{(-a-b)^{\alpha +1}}\), if \(a+b <0\).
(iii) \( \displaystyle J_{a,b} = \int _0^{1} \frac{f(yt)}{f(t)} \ \frac{dg(yt)}{g(t)} \rightarrow \frac{b}{a+b}\), if \(a+b > 0\).
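The three formulas of Lemma 7 can be sanity-checked numerically in the simplest case of pure power functions \(f(y)=y^a\) and \(g(y)=y^b\), for which the ratios \(f(yt)/f(t)\) and \(dg(yt)/g(t)\) do not depend on t and the limits are exact identities. The sketch below is only an illustration under that assumption: the quadrature helper `simpson`, the truncation point and the parameter values are choices made here, not taken from the paper.

```python
import math


def simpson(func, lo, hi, n=20000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3


def J(beta, x, upper=60.0):
    """J_beta(x) after the substitution u = log(y):
    integral over u >= 0 of u**beta * exp(-x*u), truncated at u = upper."""
    return simpson(lambda u: u ** beta * math.exp(-x * u), 0.0, upper)


# (i) with beta = 2, x = 1.5: compare with Gamma(beta + 1) / x**(beta + 1)
beta, x = 2, 1.5
lhs_i = J(beta, x)
rhs_i = math.gamma(beta + 1) / x ** (beta + 1)

# (ii) with f(y) = y**a, g(y) = y**b: the integrand is log(y)**alpha * b * y**(a+b-1),
# so the integral equals b * J_alpha(-(a + b)) when a + b < 0
alpha, a, b = 1, -1.0, -0.5
lhs_ii = b * J(alpha, -(a + b))
rhs_ii = b * math.gamma(alpha + 1) / (-a - b) ** (alpha + 1)

# (iii) with a + b > 0: integral over (0, 1) of y**a * b * y**(b-1)
a3, b3 = 2.0, 1.0
lhs_iii = simpson(lambda y: b3 * y ** (a3 + b3 - 1), 0.0, 1.0)
rhs_iii = b3 / (a3 + b3)

for name, lhs, rhs in (("(i)", lhs_i, rhs_i), ("(ii)", lhs_ii, rhs_ii), ("(iii)", lhs_iii, rhs_iii)):
    print(f"{name}: numerical = {lhs:.8f}   closed form = {rhs:.8f}")
```

For general regularly varying f and g the integrals depend on t and only converge to these values; the pure power case merely confirms the constants.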
Lemma 8
For any \(\delta >0\), let \(C_{\delta }\) denote the function
Under condition (1), this function is regularly varying of order \(\delta /\gamma \) and \(C_{\delta }(t)\sim (\gamma /\gamma _C)/(\delta \overline{H}^{\delta }(t))\), as \(t\rightarrow +\,\infty \).
Remark 3
In the lemma above, \(C_1\) is the important function C introduced at the beginning of Sect. 5, and thus \(C(t)\sim (\gamma /\gamma _C)/\overline{H}(t) = (1-\gamma /\gamma _F)/\overline{H}(t)\), as \(t \rightarrow +\, \infty \). Hence, C is regularly varying at infinity with index \(1/\gamma \), a property which proves useful several times in the main proofs.
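The index \(\delta /\gamma \) in Lemma 8 and the equality between the two constants in Remark 3 can be checked in one line each. This is only a sketch: it assumes that \(\overline{H} \in RV_{-1/\gamma }\) and that \(1/\gamma = 1/\gamma _F + 1/\gamma _C\), which is how the notation is read here (neither relation is restated in this Appendix).

$$\begin{aligned} \frac{C_{\delta }(tx)}{C_{\delta }(t)} \sim \left( \frac{\overline{H}(tx)}{\overline{H}(t)} \right) ^{-\delta } \longrightarrow \left( x^{-1/\gamma } \right) ^{-\delta } = x^{\delta /\gamma } , \qquad \frac{\gamma }{\gamma _C} = \gamma \left( \frac{1}{\gamma } - \frac{1}{\gamma _F} \right) = 1 - \frac{\gamma }{\gamma _F} . \end{aligned}$$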
Lemma 9
Let \(\psi (\phi _n,u)= \int _u^{+\infty } \phi _n(x) d F^{(k)}(x)\), for \(u \ge 0\) and \(\phi _n(u)= \frac{1}{\overline{F}^{(k)}(t_n)} \log (u/t_n) \mathbb {I}_{u>t_n}\). Under condition (1), we have
where \(\epsilon _n(u)\) is a sequence tending to 0 uniformly in u, as \(n\rightarrow \infty \), and \(\delta \) a positive real number such that \(-\frac{1}{\gamma _k} + \delta <0\).
Lemma 10
Recalling that H is a distribution function with infinite right endpoint, we have:
(i) \(\sup _{0\le x<Z^{(n)}} \overline{H}(x)/\overline{H}_n(x) = O_{\mathbb {P}}(1)\).
(ii) For any \(a<1/2\),
$$\begin{aligned} \sqrt{n} \sup _{t \ge 0} \frac{ |\overline{H}_n(t)-\overline{H}(t)|}{(\overline{H}(t))^a} = O_{\mathbb {P}}(1) \quad \hbox {and}\quad \sqrt{n} \sup _{t \ge 0 } \frac{ |\overline{H}_n^{(0)}(t)-\overline{H}^{(0)}(t)|}{(\overline{H}^{(0)}(t))^a} = O_{\mathbb {P}}(1) . \end{aligned}$$
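The first bound in Lemma 10 (ii) can be explored by simulation. The sketch below makes several assumptions that are not in the text: \(\overline{H}_n\) is taken to be the empirical survival function \(\overline{H}_n(t)=n^{-1}\#\{i : Z_i>t\}\), H is a hypothetical strict Pareto distribution with \(\overline{H}(t)=t^{-1/\gamma }\) for \(t\ge 1\), and \(\gamma =0.5\), \(a=0.4\) are arbitrary choices. The supremum is computed exactly: between two consecutive order statistics \(\overline{H}_n\) is constant and the weighted distance is quasi-convex in \(\overline{H}(t)\), so it suffices to examine the interval endpoints.

```python
import math
import random


def weighted_sup_stat(sample, surv, a):
    """sqrt(n) * sup over t >= 0 of |Hn_bar(t) - H_bar(t)| / H_bar(t)**a,
    for a continuous survival function surv with surv(0) = 1."""
    n = len(sample)
    h = [surv(v) for v in sorted(sample)]  # true survival at the order statistics
    best = 0.0
    for i in range(n + 1):
        c = (n - i) / n  # empirical survival when exactly i observations are <= t
        upper = 1.0 if i == 0 else h[i - 1]  # largest value of H_bar on this interval
        lower = h[i] if i < n else 0.0       # limiting smallest value on this interval
        for e in (upper, lower):
            if e > 0.0:  # the endpoint 0 contributes nothing when a < 1
                best = max(best, abs(c - e) / e ** a)
    return math.sqrt(n) * best


GAMMA = 0.5  # hypothetical extreme value index
A = 0.4      # any exponent a < 1/2 is allowed by the lemma

random.seed(12345)
stats = {}
for n in (100, 1000, 10000):
    sample = [random.paretovariate(1.0 / GAMMA) for _ in range(n)]
    stats[n] = weighted_sup_stat(sample, lambda t: t ** (-1.0 / GAMMA), A)
    print(f"n = {n:6d}   statistic = {stats[n]:.3f}")
```

If the lemma applies, the printed values should remain of the same order as n grows rather than drift upwards; a single seeded run is of course only indicative.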
Lemma 11
Under conditions (1) and (2), suppose that \(\alpha \ge 0\) and \(d \ge 1\) are real numbers. If \(\gamma _k < \gamma _C\) and
then we have \( \sum _{i=1}^n X_{i,n}{\mathop {\longrightarrow }\limits ^{\mathbb {P}}} 0\), as n tends to infinity, if \(\alpha \) is 0 or sufficiently close to it.
Lemma 12
Suppose that \(V_1\) and \(W_2\) are independent improper random variables of respective subdistribution functions \(H^{(0)}\) and \(H^{(1,k)}\), and \(Z_3\) is independent of \(V_1\) and \(W_2\) and has distribution H. Consider h, \(\underline{h}\), \({\mathcal {H}}\) and \(\underline{\mathcal {H}}\) the functions defined in (30) and (39).
(i) For any \(d\ge 1\), there exist some positive constants c and \(c'\) such that
$$\begin{aligned} \mathbb {E}\,(\, |{\mathcal {H}}^d(V_1,W_2)|\,)\le & {} c \, \mathbb {E}\,(\, h^d(V_1,W_2)\,) \quad \hbox {and}\quad \mathbb {E}\,(\, |\underline{\mathcal {H}}^d(Z_3,V_1,W_2)|\,) \\\le & {} c' \, \mathbb {E}\,(\, \underline{h}^d(Z_3,V_1,W_2) \,) . \end{aligned}$$
(ii) For any \(d\in ]1,1+(1+2\gamma _k/\gamma _C)^{-1}[\), we have
$$\begin{aligned} \textstyle \mathbb {E}\,(\, h^d(V_1,W_2)\,) = O\left( (\overline{F}^{(k)}(t_n)\overline{G}(t_n))^{2(1-d)} \right) . \end{aligned}$$
In particular, if \(\gamma _k < \gamma _C\), then \(\mathbb {E}(h^{4/3}(V_1,W_2))\) is of the order of \((\overline{F}^{(k)}(t_n)\overline{G}(t_n))^{-2/3}\) and \(\mathbb {E}(h^d(V_1,W_2)) \) is finite whenever d is (greater than but) sufficiently close to 4/3.
(iii) For any \(d\in ]1,1+(1+3\gamma _k/\gamma _C)^{-1}[\), we have
$$\begin{aligned} \textstyle \mathbb {E}\,(\, \underline{h}^d(Z_3,V_1,W_2)\,) = O\left( (\overline{F}^{(k)}(t_n)\overline{G}(t_n))^{3(1-d)} \right) . \end{aligned}$$
In particular, if \(\gamma _k < \gamma _C\), then \(\mathbb {E}(\underline{h}^{6/5}(Z_3,V_1,W_2))\) is of the order of \((\overline{F}^{(k)}(t_n)\overline{G}(t_n))^{-3/5}\) and \(\mathbb {E}(\underline{h}^d(Z_3,V_1,W_2)) \) is finite whenever d is (greater than but) sufficiently close to 6/5.
(iv) For any \(d\in ]1/2,(2\gamma _C^{-1}+\gamma _F^{-1}+\gamma _k^{-1})/(3\gamma _C^{-1}+2\gamma _F^{-1})[\), we have \(\mathbb {E}\left( h^d(V_1,W_2)/\overline{H}^d(V_1)\right) =O\left( (\overline{F}^{(k)}(t_n)\overline{G}(t_n))^{2-3d}\right) \). In particular, if \(\gamma _k<\gamma _C\), then taking d (greater than but) sufficiently close to 4/5 is permitted; otherwise the value is 2/3 instead of 4/5.
(v) The integral \(\theta _n=\iint h(v,w)dH^{(0)}(v)dH^{(1,k)}(w)\) is equivalent, as \(n\rightarrow \infty \), to \(\gamma _k(-\log \overline{G}(t_n))\).
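The exponents and admissible ranges quoted in Lemma 12 (ii) to (iv) are simple arithmetic and can be verified with exact fractions. The sketch below does so; the function names and the example values of \(\gamma _k\), \(\gamma _C\) and \(\gamma _F\) are hypothetical, and the examples take \(\gamma _k \le \gamma _F\), an assumption not stated in this Appendix.

```python
from fractions import Fraction as Fr


def upper_ii(r):
    """Upper end of the range of d in (ii); r stands for gamma_k / gamma_C."""
    return 1 + 1 / (1 + 2 * r)


def upper_iii(r):
    """Upper end of the range of d in (iii); r stands for gamma_k / gamma_C."""
    return 1 + 1 / (1 + 3 * r)


def upper_iv(u, v, w):
    """Upper end of the range of d in (iv); u, v, w stand for 1/gamma_C, 1/gamma_F, 1/gamma_k."""
    return (2 * u + v + w) / (3 * u + 2 * v)


# Exponents at the particular values of d quoted in the lemma
exp_ii = 2 * (1 - Fr(4, 3))    # exponent in (ii) at d = 4/3
exp_iii = 3 * (1 - Fr(6, 5))   # exponent in (iii) at d = 6/5
exp_iv = 2 - 3 * Fr(4, 5)      # exponent in (iv) at d = 4/5

# Hypothetical case gamma_k < gamma_C: gamma_k = gamma_F = 1/2, gamma_C = 1
r = Fr(1, 2)
case_small = (upper_ii(r), upper_iii(r), upper_iv(Fr(1), Fr(2), Fr(2)))

# Hypothetical case gamma_k > gamma_C: gamma_k = gamma_F = 2, gamma_C = 1
case_large = upper_iv(Fr(1), Fr(1, 2), Fr(1, 2))

print("exponents:", exp_ii, exp_iii, exp_iv)
print("gamma_k < gamma_C, upper ends for (ii), (iii), (iv):", *case_small)
print("gamma_k > gamma_C, upper end for (iv):", case_large)
```

In the first example 4/3, 6/5 and 4/5 all lie strictly below the corresponding upper ends, while in the second the upper end of (iv) falls between 2/3 and 4/5, in line with the last sentence of (iv).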
Lemma 13
In this lemma, various notations defined in Sects. 5.2.2–5.2.4 are used.
(i) The variables \({{\mathcal {H}}}^{**}_I\) for \(I\in \{(i,j) \, ; \, 1\le i<j\le n\}\) are centred and uncorrelated. This is also true for the variables \({\underline{\mathcal {H}}}^{**}_I\) for \(I\in \{(i,j,l) \, ; \, 1\le i<j<l\le n\}\).
(ii) We have \(\mathbb {E}\left[ ({{\mathcal {H}}}^{**}(V_1,W_2))^2\right] \le 48 \mathbb {E}[{{\mathcal {H}}}_1^2\mathbb {I}_{|{{\mathcal {H}}}_1|\le M_n}]\).
(iii) We have \(\mathbb {E}\left( \, |{{\mathcal {H}}}_1-{\mathcal {H}}^*(V_1,W_2)+{{\mathcal {H}}}^*_{1\bullet }(V_1)+{{\mathcal {H}}}^*_{{\bullet }1}(W_2)| \,\right) \le 4\mathbb {E}\left( |{{\mathcal {H}}}_1|\mathbb {I}_{|{{\mathcal {H}}}_1|>M_n}\right) \).
Cite this article
Worms, J., Worms, R. Extreme value statistics for censored data with heavy tails under competing risks. Metrika 81, 849–889 (2018). https://doi.org/10.1007/s00184-018-0662-3
DOI: https://doi.org/10.1007/s00184-018-0662-3