Extreme value statistics for censored data with heavy tails under competing risks

Metrika

Abstract

This paper addresses the problem of estimating, from randomly censored data subject to competing risks, the extreme value index of the (sub)-distribution function associated with one particular cause, in a heavy-tail framework. The proposed estimator takes the form of an Aalen-Johansen integral and is the first estimator proposed in this context; its asymptotic normality is established. Estimation of extreme quantiles of the cumulative incidence function is then addressed as a consequence. A small simulation study illustrates the finite-sample performance.

References

  • Aalen O, Johansen S (1978) An empirical transition matrix for nonhomogeneous Markov chains based on censored observations. Scand J Stat 5:141–150

  • Akritas MG (2000) The central limit theorem under censoring. Bernoulli 6(6):1109–1120

  • Andersen PK, Borgan O, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer Series in Statistics. Springer, New York

  • Beirlant J, Dierckx G, Fils-Villetard A, Guillou A (2007) Estimation of the extreme value index and extreme quantiles under random censoring. Extremes 10:151–174

  • Beyersmann J, Schumacher M (2008) A note on nonparametric quantile inference for risks and more complex multistate models. Biometrika 95(4):1006–1008

  • Bingham NH, Goldie CM, Teugels JL (1987) Regular variation. Cambridge University Press, Cambridge

  • Chow YS, Teicher H (1997) Probability theory. Independence, interchangeability, martingales. Springer, New York

  • Crowder M (2001) Classical competing risks. Chapman and Hall, London

  • Csörgő M, Szyszkowicz B, Wang Q (2008) Asymptotics of studentized U-type processes for change-point problems. Acta Math Hungar 121(4):333–357

  • de Haan L, Ferreira A (2006) Extreme value theory: an introduction. Springer, New York

  • Einmahl J, Fils-Villetard A, Guillou A (2008) Statistics of extremes under random censoring. Bernoulli 14:207–227

  • Fermanian J-D (2003) Nonparametric estimation of competing risks models with covariates. J Multivar Anal 85:156–191

  • Geffray S (2009) Strong approximations for dependent competing risks with independent censoring. Test 18:76–95

  • Gerds T, Beyersmann J, Starkopf L, Schumacher M (2017) The Kaplan–Meier integral in the presence of covariates: a review. In: From statistics to mathematical finance: Festschrift in honour of Winfried Stute, chap 2, pp 25–41

  • Moeschberger ML, Klein JP (1995) Statistical methods for dependent competing risks. Lifetime Data Anal 1:195–204

  • Peng L, Fine JP (2007) Nonparametric quantile inference with competing-risks data. Biometrika 94:735–744

  • Smith R (1987) Estimating tails of probability distributions. Ann Stat 15(3):1174–1207

  • Stute W (1994) Strong and weak representations of cumulative hazard function and Kaplan–Meier estimators on increasing sets. J Stat Plan Inference 43:315–329

  • Stute W (1995) The central limit theorem under random censorship. Ann Stat 23(2):422–439

  • Suzukawa A (2002) Asymptotic properties of Aalen-Johansen integrals for competing risks data. J Jpn Stat Soc 32(1):77–93

  • Tsiatis A (1975) A nonidentifiability aspect of the problem of competing risks. Proc Natl Acad Sci USA 72:20–22

  • Worms J, Worms R (2014) New estimators of the extreme value index under random right censoring, for heavy-tailed distributions. Extremes 17(2):337–358

  • Worms J, Worms R (2016) A Lynden-Bell integral estimator for extremes of randomly truncated data. Stat Probab Lett 109:106–117

  • Zhou M (1991) Some properties of the Kaplan–Meier estimator for independent nonidentically distributed random variables. Ann Stat 19(4):2266–2274

Author information

Corresponding author

Correspondence to Julien Worms.

Appendix

This Appendix collects various results. Some of them are used repeatedly in the proof of the main result (in particular Proposition 4 and Lemmas 7 and 10, and to a lesser extent Lemmas 8 and 9); the others concern parts of the main proof that are postponed to the Appendix so as to keep the main flow of the proof clear (Lemmas 11, 12 and 13). All the proofs can be found in a supplementary material file available on the first author’s webpage at http://lmv.math.cnrs.fr/annuaire/julien-worms/article/julien-worms-english.

Definition 1

An ultimately positive function f : \(\mathbb {R}^+ \rightarrow \mathbb {R}\) is regularly varying (at infinity) with index \(\alpha \in \mathbb {R}\), if

$$\begin{aligned} \lim _{t \rightarrow + \infty } \frac{f(tx)}{f(t)} = x^{\alpha } \quad (\forall x >0). \end{aligned}$$

This is denoted \(f \in RV_{\alpha }\). If \(\alpha =0\), f is said to be slowly varying.
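
As a numerical illustration of Definition 1, the following minimal sketch evaluates the ratio \(f(tx)/f(t)\) for a hypothetical test function \(f(t)=t^{\alpha }\log (1+t)\), which is regularly varying with index \(\alpha \) because the logarithmic factor is slowly varying. The function and the values of x and \(\alpha \) are assumptions chosen for illustration only; they do not come from the paper.

```python
import math

def f(t, alpha):
    """Hypothetical test function t**alpha * log(1 + t): regularly varying with index alpha."""
    return t ** alpha * math.log1p(t)

def rv_ratio(x, t, alpha):
    """Ratio f(t x) / f(t), which Definition 1 requires to approach x**alpha."""
    return f(t * x, alpha) / f(t, alpha)

x, alpha = 2.0, -1.5
target = x ** alpha
# Absolute error of the ratio for increasingly large t.
errors = [abs(rv_ratio(x, 10.0 ** k, alpha) - target) for k in (2, 4, 8, 16)]
for k, err in zip((2, 4, 8, 16), errors):
    print(f"t = 1e{k}: |f(tx)/f(t) - x^alpha| = {err:.5f}")
```

The error decreases only at a logarithmic rate here, which reflects the slowly varying factor of this particular example rather than a general property.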

Proposition 4

(See de Haan and Ferreira 2006, Proposition B.1.9)

Suppose \(f \in RV_{\alpha }\) and let \(\epsilon >0\). Then there exists \(t_0=t_0(\epsilon )\) such that, for every \(t\ge t_0\) and every \(x>0\) with \(tx \ge t_0\), the following bounds hold: if \(x < 1\),

$$\begin{aligned} (1-\epsilon ) x^{\alpha +\epsilon }< \frac{f(tx)}{f(t)} < (1+\epsilon ) x^{\alpha -\epsilon } \end{aligned}$$

and, if \(x \ge 1\),

$$\begin{aligned} (1-\epsilon ) x^{\alpha -\epsilon }< \frac{f(tx)}{f(t)} < (1+\epsilon ) x^{\alpha +\epsilon } . \end{aligned} \tag{43}$$
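
The bounds of Proposition 4 can be checked numerically on a simple case. The sketch below uses the hypothetical function \(f(t)=t^{\alpha }\log t\) and the threshold \(t_0=e^{1/\epsilon }\), which can be shown by elementary inequalities to be sufficient for this particular f; both are assumptions of the example, not quantities taken from the paper. The check is restricted to pairs (t, x) for which both t and tx are at least \(t_0\).

```python
import math

def f(t, alpha):
    """Hypothetical regularly varying function t**alpha * log(t), for t > 1."""
    return t ** alpha * math.log(t)

def potter_holds(x, t, alpha, eps):
    """Check the two-sided bound of Proposition 4 for one pair (x, t)."""
    ratio = f(t * x, alpha) / f(t, alpha)
    if x < 1:
        lower_exp, upper_exp = alpha + eps, alpha - eps
    else:
        lower_exp, upper_exp = alpha - eps, alpha + eps
    return (1 - eps) * x ** lower_exp < ratio < (1 + eps) * x ** upper_exp

alpha, eps = -2.0, 0.1
t0 = math.exp(1 / eps)  # threshold valid for this specific f (assumption of the sketch)
xs = [0.01, 0.1, 0.5, 1.0, 2.0, 10.0, 100.0]
ts = [t0 * 10 ** k for k in range(6)]
# Keep only the pairs with t >= t0 and t * x >= t0.
checks = [potter_holds(x, t, alpha, eps) for t in ts for x in xs if t * x >= t0]
print(all(checks), len(checks))
```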

Lemma 7

Let \(x \in \mathbb {R}_+^* \), \(\alpha \in \mathbb {R}_+\) and \(\beta >-1\), and, for real numbers a and b, let f and g be two functions that are regularly varying at infinity with respective indices a and b. Then, as \(t \rightarrow + \infty \),

  1. (i)

    \( \displaystyle J_{\beta }(x) = \int _1^{+\infty } \log ^{\beta } (y) \ y^{-x-1} dy = \frac{\varGamma (\beta +1)}{x^{\beta +1}}\).

  2. (ii)

    \( \displaystyle I_{\alpha ,a,b} = \int _{1}^{+\infty } \log ^{\alpha } (y) \ \frac{f(yt)}{f(t)} \ \frac{dg(yt)}{g(t)} \rightarrow \frac{b \varGamma (\alpha +1)}{(-a-b)^{\alpha +1}}\), if \(a+b <0\)

  3. (iii)

    \( \displaystyle J_{a,b} = \int _0^{1} \frac{f(yt)}{f(t)} \ \frac{dg(yt)}{g(t)} \rightarrow \frac{b}{a+b}\), if \(a+b > 0\)
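
The three statements of Lemma 7 can be sanity-checked numerically in the simplest situation, where f and g are pure powers, \(f(y)=y^a\) and \(g(y)=y^b\). In that case the ratios do not depend on t and \(dg(yt)/g(t)=b\,y^{b-1}dy\), so the limits in (ii) and (iii) are exact identities. The sketch below is only an illustration under this assumption; the parameter values and the midpoint quadrature are choices made for the example.

```python
import math

def midpoint(func, lo, hi, n=200_000):
    """Composite midpoint rule for the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

def J_numeric(beta, x):
    """Item (i) after the substitution y = exp(u); the tail beyond u = 60/x is negligible."""
    return midpoint(lambda u: u ** beta * math.exp(-x * u), 0.0, 60.0 / x)

def I_numeric(alpha, a, b):
    """Item (ii) for f(y) = y**a and g(y) = y**b, again with y = exp(u)."""
    rate = -(a + b)  # positive because a + b < 0
    return midpoint(lambda u: b * u ** alpha * math.exp(-rate * u), 0.0, 60.0 / rate)

def J_ab_numeric(a, b):
    """Item (iii) for the same pure powers: integral of b * y**(a+b-1) over (0, 1)."""
    return midpoint(lambda y: b * y ** (a + b - 1), 0.0, 1.0)

beta, x = 1.5, 2.0
err_i = abs(J_numeric(beta, x) - math.gamma(beta + 1) / x ** (beta + 1))

alpha, a, b = 1.0, -1.0, -2.0  # a + b < 0
err_ii = abs(I_numeric(alpha, a, b) - b * math.gamma(alpha + 1) / (-a - b) ** (alpha + 1))

a3, b3 = 1.0, 0.5  # a + b > 0
err_iii = abs(J_ab_numeric(a3, b3) - b3 / (a3 + b3))

print(err_i, err_ii, err_iii)
```

For general regularly varying f and g, the same limits are approached as t grows, with bounds of the type given in Proposition 4 presumably providing the domination needed to pass to the limit.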

Lemma 8

For any \(\delta >0\), let \(C_{\delta }\) denote the function

$$\begin{aligned} C_{\delta }(t)=\int _0^t \frac{dG(v)}{\overline{G}(v)\overline{H}^{\delta }(v)}. \end{aligned}$$

Under condition (1), this function is regularly varying with index \(\delta /\gamma \) and \(C_{\delta }(t)\sim (\gamma /\gamma _C)/(\delta \overline{H}^{\delta }(t))\), as \(t\rightarrow +\,\infty \).

Remark 3

In the lemma above, \(C_1\) is the important function C introduced at the beginning of Sect. 5, and thus \(C(t)\sim (\gamma /\gamma _C)/\overline{H}(t) = (1-\gamma /\gamma _F)/\overline{H}(t)\), as \(t \rightarrow +\, \infty \). Hence, C is regularly varying at infinity with index \(1/\gamma \), a property which proves useful several times in the main proofs.
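
As a worked illustration of Lemma 8 (a sketch under assumptions that are not made in the text), suppose that the tails are exactly of Pareto type, \(\overline{G}(v)=(1+v)^{-1/\gamma _C}\) and \(\overline{H}(v)=(1+v)^{-1/\gamma }\) for \(v\ge 0\). Then \(dG(v)/\overline{G}(v)=\gamma _C^{-1}(1+v)^{-1}dv\), and the integral can be computed in closed form:

$$\begin{aligned} C_{\delta }(t) = \frac{1}{\gamma _C} \int _0^t (1+v)^{\delta /\gamma -1} dv = \frac{\gamma }{\gamma _C \, \delta } \left( (1+t)^{\delta /\gamma } - 1 \right) = \frac{\gamma /\gamma _C}{\delta \, \overline{H}^{\delta }(t)} \left( 1 - \overline{H}^{\delta }(t) \right) . \end{aligned}$$

Since \(\overline{H}^{\delta }(t)\rightarrow 0\) as \(t\rightarrow +\,\infty \), the last factor tends to 1, which recovers the stated equivalent in this special case, and the leading term \((1+t)^{\delta /\gamma }\) exhibits the regular variation index \(\delta /\gamma \).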

Lemma 9

Let \(\psi (\phi _n,u)= \int _u^{+\infty } \phi _n(s) d F^{(k)}(s)\), for \(u \ge 0\), where \(\phi _n(u)= \frac{1}{\overline{F}^{(k)}(t_n)} \log (u/t_n) \mathbb {I}_{u>t_n}\). Under condition (1), we have

$$\begin{aligned} \psi (\phi _n,u)= & {} \gamma _{n,k}, \hbox { if } u \le t_n \\= & {} \log \left( \frac{u}{t_n} \right) \frac{\overline{F}^{(k)}(u)}{\overline{F}^{(k)}(t_n)} + \gamma _k \left( \frac{u}{t_n} \right) ^{-1/ \gamma _k} + \epsilon _n(u) \left( \frac{u}{t_n} \right) ^{-1/ \gamma _k + \delta }\quad \hbox {if } u > t_n, \end{aligned}$$

where \(\epsilon _n(u)\) is a sequence tending to 0 uniformly in u, as \(n\rightarrow \infty \), and \(\delta \) a positive real number such that \(-\frac{1}{\gamma _k} + \delta <0\).
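
The structure of this expansion can be illustrated in the exact Pareto case \(\overline{F}^{(k)}(s)=p\,s^{-1/\gamma _k}\) for \(s\ge 1\), where p is the total mass of the sub-distribution. This is an assumption of the sketch below, not the setting of the paper: under it the remainder term vanishes, the constant p cancels, and \(\gamma _{n,k}\) (assumed here to denote \(\psi (\phi _n,t_n)\)) equals \(\gamma _k\). The code compares a numerical evaluation of the integral, after the substitution \(s=t_n e^{v}\), with the closed form.

```python
import math

def psi_numeric(u, t_n, gamma_k, n=200_000):
    """Midpoint evaluation of psi(phi_n, u) for an exact Pareto tail.

    With s = t_n * exp(v), the integrand becomes v * exp(-v / gamma_k) / gamma_k,
    integrated from max(0, log(u / t_n)) upwards (truncated far in the tail).
    """
    v0 = max(0.0, math.log(u / t_n))
    v1 = v0 + 60.0 * gamma_k  # the contribution beyond v1 is negligible
    h = (v1 - v0) / n
    total = 0.0
    for i in range(n):
        v = v0 + (i + 0.5) * h
        total += v * math.exp(-v / gamma_k) / gamma_k
    return h * total

def psi_closed_form(u, t_n, gamma_k):
    """Closed form in the exact Pareto case (no remainder term)."""
    if u <= t_n:
        return gamma_k
    r = u / t_n
    return (math.log(r) + gamma_k) * r ** (-1.0 / gamma_k)

t_n, gamma_k = 10.0, 0.5  # hypothetical values
gaps = [abs(psi_numeric(u, t_n, gamma_k) - psi_closed_form(u, t_n, gamma_k))
        for u in (5.0, 10.0, 30.0, 200.0)]
print(gaps)
```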

Lemma 10

Recalling that H is a distribution function with infinite right endpoint, we have:

  1. (i)

    \(\sup _{0\le x<Z^{(n)}} \overline{H}(x)/\overline{H}_n(x) = O_{\mathbb {P}}(1)\)

  2. (ii)

    for any \(a<1/2\),

    $$\begin{aligned} \sqrt{n} \sup _{t \ge 0} \frac{ |\overline{H}_n(t)-\overline{H}(t)|}{(\overline{H}(t))^a} = O_{\mathbb {P}}(1) \quad \hbox {and}\quad \sqrt{n} \sup _{t \ge 0 } \frac{ |\overline{H}_n^{(0)}(t)-\overline{H}^{(0)}(t)|}{(\overline{H}^{(0)}(t))^a} = O_{\mathbb {P}}(1) . \end{aligned}$$
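
Statement (i) can be explored by simulation. The following sketch assumes that \(\overline{H}_n(x)\) is the proportion of observations strictly greater than x and that \(Z^{(n)}\) is the sample maximum, and it draws the sample from a Pareto distribution; these are assumptions made for illustration, since the corresponding definitions are not reproduced in this Appendix. Because \(\overline{H}_n\) is piecewise constant and \(\overline{H}\) is nonincreasing, the supremum is attained at 0 or at one of the first n − 1 order statistics.

```python
import random

def pareto_survival(x, shape):
    """Survival function of a Pareto law on [1, infinity) with extreme value index 1/shape."""
    return 1.0 if x < 1.0 else x ** (-shape)

def sup_ratio(sample, shape):
    """Supremum over 0 <= x < max(sample) of H_bar(x) / H_bar_n(x)."""
    n = len(sample)
    z = sorted(sample)
    best = pareto_survival(0.0, shape)  # H_bar_n equals 1 on [0, Z_(1))
    for i in range(1, n):
        # On [Z_(i), Z_(i+1)) the empirical survival function equals (n - i) / n.
        best = max(best, pareto_survival(z[i - 1], shape) * n / (n - i))
    return best

rng = random.Random(0)  # fixed seed for reproducibility
shape = 2.0
ratios = {}
for n in (100, 1_000, 10_000):
    sample = [rng.paretovariate(shape) for _ in range(n)]
    ratios[n] = sup_ratio(sample, shape)
print(ratios)
```

A single run per sample size only gives an impression; boundedness in probability would be assessed by repeating the experiment many times and checking that the distribution of the ratio does not drift as n grows.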

Lemma 11

Under conditions (1) and (2), suppose that \(\alpha \ge 0\) and \(d \ge 1\) are real numbers. If \(\gamma _k < \gamma _C\) and

$$\begin{aligned} X_{i,n}= \frac{ \sqrt{v_n} }{ n^{1+d}} \frac{ \phi _n(Z_i) }{ \overline{G}(Z_i) (\overline{H}^{(0)}(Z_i))^{d+\alpha } } \mathbb {I}_{\xi _i=k}, \end{aligned}$$

then we have \( \sum _{i=1}^n X_{i,n}{\mathop {\longrightarrow }\limits ^{\mathbb {P}}} 0\), as n tends to infinity, if \(\alpha \) is 0 or sufficiently close to it.

Lemma 12

Suppose that \(V_1\) and \(W_2\) are independent improper random variables with respective subdistribution functions \(H^{(0)}\) and \(H^{(1,k)}\), and that \(Z_3\) is independent of \(V_1\) and \(W_2\) and has distribution function H. Let h, \(\underline{h}\), \({\mathcal {H}}\) and \(\underline{\mathcal {H}}\) denote the functions defined in (30) and (39).

  1. (i)

    For any \(d\ge 1\), there exist some positive constants c and \(c'\) such that

    $$\begin{aligned} \mathbb {E}\,(\, |{\mathcal {H}}^d(V_1,W_2)|\,)\le & {} c \, \mathbb {E}\,(\, h^d(V_1,W_2)\,) \quad \hbox {and}\quad \mathbb {E}\,(\, |\underline{\mathcal {H}}^d(Z_3,V_1,W_2)|\,) \\\le & {} c' \, \mathbb {E}\,(\, \underline{h}^d(Z_3,V_1,W_2) \,) . \end{aligned}$$
  2. (ii)

    For any \(d\in ]1,1+(1+2\gamma _k/\gamma _C)^{-1}[\), we have

    $$\begin{aligned} \textstyle \mathbb {E}\,(\, h^d(V_1,W_2)\,) = O\left( (\overline{F}^{(k)}(t_n)\overline{G}(t_n))^{2(1-d)} \right) . \end{aligned}$$

    In particular, if \(\gamma _k < \gamma _C\), then \(\mathbb {E}(h^{4/3}(V_1,W_2))\) is of the order of \((\overline{F}^{(k)}(t_n)\overline{G}(t_n))^{-2/3}\) and \(\mathbb {E}(h^d(V_1,W_2)) \) is finite whenever d is (greater than but) sufficiently close to 4/3.

  3. (iii)

    For any \(d\in ]1,1+(1+3\gamma _k/\gamma _C)^{-1}[\), we have

    $$\begin{aligned} \textstyle \mathbb {E}\,(\, \underline{h}^d(Z_3,V_1,W_2)\,) = O\left( (\overline{F}^{(k)}(t_n)\overline{G}(t_n))^{3(1-d)} \right) . \end{aligned}$$

    In particular, if \(\gamma _k < \gamma _C\), then \(\mathbb {E}(\underline{h}^{6/5}(Z_3,V_1,W_2))\) is of the order of \((\overline{F}^{(k)}(t_n)\overline{G}(t_n))^{-3/5}\) and \(\mathbb {E}(\underline{h}^d(Z_3,V_1,W_2)) \) is finite whenever d is (greater than but) sufficiently close to 6/5.

  4. (iv)

    For any \(d\in ]1/2,(2\gamma _C^{-1}+\gamma _F^{-1}+\gamma _k^{-1})/(3\gamma _C^{-1}+2\gamma _F^{-1})[\), we have \(\mathbb {E}\left( h^d(V_1,W_2)/\overline{H}^d(V_1)\right) =O\left( (\overline{F}^{(k)}(t_n)\overline{G}(t_n))^{2-3d}\right) \). In particular, if \(\gamma _k<\gamma _C\), then taking d (greater than but) sufficiently close to 4/5 is permitted; otherwise, the same holds with 2/3 in place of 4/5.

  5. (v)

    The integral \(\theta _n=\iint h(v,w)dH^{(0)}(v)dH^{(1,k)}(w)\) is equivalent, as \(n\rightarrow \infty \), to \(\gamma _k(-\log \overline{G}(t_n))\).
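
The admissible ranges and exponents appearing in items (ii) to (iv) of Lemma 12 involve only elementary arithmetic, which the sketch below reproduces with exact fractions. The function names and the numerical values of the indices are hypothetical and serve only to illustrate why 4/3, 6/5 and 4/5 are admissible when \(\gamma _k<\gamma _C\).

```python
from fractions import Fraction

def upper_bound_ii(ratio):
    """Right endpoint 1 + (1 + 2 r)^(-1) of the range in (ii), with r = gamma_k / gamma_C."""
    return 1 + 1 / (1 + 2 * ratio)

def upper_bound_iii(ratio):
    """Right endpoint 1 + (1 + 3 r)^(-1) of the range in (iii), with r = gamma_k / gamma_C."""
    return 1 + 1 / (1 + 3 * ratio)

def upper_bound_iv(inv_gamma_C, inv_gamma_F, inv_gamma_k):
    """Right endpoint of the range in (iv), written in terms of the inverse indices."""
    return (2 * inv_gamma_C + inv_gamma_F + inv_gamma_k) / (3 * inv_gamma_C + 2 * inv_gamma_F)

# Boundary case gamma_k = gamma_C: the endpoints are 4/3 and 5/4, so any
# gamma_k < gamma_C gives ranges that contain 4/3 and 6/5 respectively.
boundary_ii = upper_bound_ii(Fraction(1))    # 4/3
boundary_iii = upper_bound_iii(Fraction(1))  # 5/4

# Hypothetical indices gamma_C = 1 and gamma_F = gamma_k = 1/2 (so gamma_k < gamma_C).
example_iv = upper_bound_iv(Fraction(1), Fraction(2), Fraction(2))  # 6/7, above 4/5

# Exponents of the bounds at the particular values quoted in the lemma.
exponent_ii = 2 * (1 - Fraction(4, 3))   # -2/3
exponent_iii = 3 * (1 - Fraction(6, 5))  # -3/5
print(boundary_ii, boundary_iii, example_iv, exponent_ii, exponent_iii)
```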

Lemma 13

This lemma uses notation defined in Sects. 5.2.2–5.2.4.

  1. (i)

    The variables \({{\mathcal {H}}}^{**}_I\) for \(I\in \{(i,j) \, ; \, 1\le i<j\le n\}\) are centred and uncorrelated. This is also true for the variables \({\underline{\mathcal {H}}}^{**}_I\) for \(I\in \{(i,j,l) \, ; \, 1\le i<j<l\le n\}\).

  2. (ii)

    We have \(\mathbb {E}\left[ ({{\mathcal {H}}}^{**}(V_1,W_2))^2\right] \le 48 \mathbb {E}[{{\mathcal {H}}}_1^2\mathbb {I}_{|{{\mathcal {H}}}_1|\le M_n}]\).

  3. (iii)

    We have \(\mathbb {E}\left( \, |{{\mathcal {H}}}_1-{\mathcal {H}}^*(V_1,W_2)+{{\mathcal {H}}}^*_{1\bullet }(V_1)+{{\mathcal {H}}}^*_{{\bullet }1}(W_2)| \,\right) \le 4\mathbb {E}\left( |{{\mathcal {H}}}_1|\mathbb {I}_{|{{\mathcal {H}}}_1|>M_n}\right) \).

Cite this article

Worms, J., Worms, R. Extreme value statistics for censored data with heavy tails under competing risks. Metrika 81, 849–889 (2018). https://doi.org/10.1007/s00184-018-0662-3

  • DOI: https://doi.org/10.1007/s00184-018-0662-3
