Nonparametric estimation in the illness-death model using prevalent data

Vakulenko-Lagun, Bella; Mandel, Micha; Goldberg, Yair

doi:10.1007/s10985-016-9373-0

Published: 28 June 2016

Volume 23, pages 25–56, (2017)
Lifetime Data Analysis

Bella Vakulenko-Lagun¹,
Micha Mandel¹ &
Yair Goldberg²

Abstract

We study nonparametric estimation of the illness-death model using left-truncated and right-censored data. The general aim is to estimate the multivariate distribution of a progressive multi-state process. Maximum likelihood estimation under censoring suffers from problems of uniqueness and consistency, so instead we review and extend methods that are based on inverse probability weighting. For univariate left-truncated and right-censored data, nonparametric maximum likelihood estimation can be considerably improved when exploiting knowledge on the truncation distribution. We aim to examine the gain in using such knowledge for inverse probability weighting estimators in the illness-death framework. Additionally, we compare the weights that use truncation variables with the weights that integrate them out, showing, by simulation, that the latter performs more stably and efficiently. We apply the methods to intensive care units data collected in a cross-sectional design, and discuss how the estimators can be easily modified to more general multi-state models.

References

Andersen P, Borgan O, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer-Verlag, New York
Asgharian M, M’Lan C, Wolfson D (2002) Length-biased sampling with right censoring: an unconditional approach. J Am Stat Assoc 97:201–209
Chang S, Tzeng S (2006) Nonparametric estimation of sojourn time distributions for truncated serial event data—a weight-adjusted approach. Lifetime Data Anal 12:53–67
Datta S, Satten G (2001) Validity of the Aalen-Johansen estimators of stage occupation probabilities and Nelson-Aalen estimators of integrated transition hazards for non-Markov models. Stat Probab Lett 55:403–411
Gill R (1992) Multivariate survival analysis. Theory Probab Appl 37(1):18–31
Gill R, van der Laan M, Wellner J (1995) Inefficient estimators of the bivariate survival function for three models. Annales de l’Institut Henri Poincaré - Probabilités et Statistiques 31(3):545–597
Hougaard P (2000) Analysis of multivariate survival data. Springer, New York
Huang Y, Wang M-C (1995) Estimating the occurrence rate for prevalent survival data in competing risks model. J Am Stat Assoc 90(432):1406–1415
Kalbfleisch J, Prentice R (2002) The statistical analysis of failure time data. Wiley, Hoboken
Keiding N (1991) Age-specific incidence and prevalence: a statistical perspective. J R Stat Soc Ser A 154(3):371–412
Kosorok M (2008) Introduction to empirical processes and semiparametric inference. Springer, New York
Lin D, Sun W, Ying Z (1999) Nonparametric estimation of the gap time distributions for serial events with censored data. Biometrika 86(1):59–70
Mandel M (2010) The competing risks illness-death model under cross-sectional sampling. Biostatistics 11(2):290–303
Mandel M, Betensky R (2007) Testing goodness of fit of a uniform truncation model. Biometrics 63(2):405–412
Mnatzaganian G, Galai N, Sprung CD, Zitser-Gurevich Y, Mandel M, Ben-Hur D, Gurman G, Klein M, Lev A, Levi L et al (2005) Increased risk of bloodstream and urinary infections in intensive care unit (ICU) patients compared with patients fitting ICU admission criteria treated in regular wards. J Hosp Infect 59:331–342
Neuhaus G (1971) On weak convergence of stochastic processes with multidimensional time parameter. Ann Math Stat 42(4):1285–1295
Prentice R, Moodie Z, Wu J (2004) Nonparametric estimation of the bivariate survivor function. In Lin D, Heagerty P (eds) Proceedings of the second Seattle symposium in Biostatistics. Lecture notes in statistics, vol. 179. Springer, New York
Putter H, Fiocco M, Geskus RB (2007) Tutorial in biostatistics: competing risks and multi-state models. Stat Med 26(11):2389–2430
Qin J, Shen Y (2010) Statistical methods for analyzing right-censored length-biased data under Cox model. Biometrics 66:382–392
Robins J, Rotnitzky A et al (1992) Recovery of information and adjustment for dependent censoring using surrogate markers. In: Jewell N, Dietz K, Farewell V (eds) AIDS epidemiology—methodological issues. Springer, Boston
Rubin D (1981) The Bayesian bootstrap. Ann Stat 9:130–134
Tsai W-Y (1990) Testing the assumption of independence of truncation time and failure time. Biometrika 77(1):169–177
Vakulenko-Lagun B, Mandel M (2016) Comparing estimation approaches for the illness-death model under left truncation and right censoring. Stat Med 35:1533–1548
van der Laan M (1996) Nonparametric estimation of the bivariate survival function with truncated data. J Multivar Anal 58(1):107–131
Wang M-C (1989) A semiparametric model for randomly truncated data. J Am Stat Assoc 84:742–748
Wang M-C (1991) Nonparametric estimation from cross-sectional survival data. J Am Stat Assoc 86:130–143
Wang M-C (1999) Gap time bias in incident and prevalent cohorts. Stat Sin 9:999–1010
Wang M-C, Jewell N, Tsai W-Y (1986) Asymptotic properties of the product limit estimate under random truncation. Ann Stat 14(4):1597–1605
Wang W, Wells M (1998) Nonparametric estimation of successive duration times under dependent censoring. Biometrika 85(3):561–572

Acknowledgments

We thank the two reviewers for their valuable comments and suggestions. The work was supported by The Israel Science Foundation (Grant No. 519/14) and by NSF grant DMS-1407732.

Author information

Authors and Affiliations

Department of Statistics, The Hebrew University of Jerusalem, Jerusalem, Israel
Bella Vakulenko-Lagun & Micha Mandel
Department of Statistics, University of Haifa, Haifa, Israel
Yair Goldberg

Corresponding author

Correspondence to Bella Vakulenko-Lagun.

Appendix

1.1 Convergence to a Gaussian process

Definition 1

Let $X_1,\ldots ,X_n$ be random variables. For every function f, define $Pf\equiv E[f(X)]$. Define $\mathbb {P}_n$ to be the empirical measure, such that $\mathbb {P}_n f\equiv n^{-1} \sum _{i=1}^{n}f(X_i)$. Define $\mathbb {P}_n^{(b)}$ to be the bootstrap empirical measure with weights $(M_{n1},M_{n2},\ldots ,M_{nn})$ which are i.i.d., positive random variables with expectation 1 and finite variance, and independent of $X_1,\ldots ,X_n$, such that $\mathbb {P}_n^{(b)} f\equiv n^{-1} \sum _{i=1}^{n} (M_{ni}/\bar{M}_n)f(X_i)$ where $\bar{M}_n=n^{-1}\sum _{i=1}^n M_{ni}$. The convergence type for bootstrap $\sqrt{n}(\mathbb {P}_n^{(b)}-\mathbb {P}_n)\underset{\text {M}}{\overset{\text {P}}{\rightsquigarrow }}\mathbb {G}$ is defined in Kosorok (2008, pp. 19–20). Finally, for a space $\mathcal {X}$, define $\ell ^{\infty }(\mathcal {X})$ to be the space of all uniformly bounded real functions on $\mathcal {X}$.
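
As a rough illustration of Definition 1, the following Python sketch computes $\mathbb {P}_n f$ and a single draw of $\mathbb {P}_n^{(b)} f$. The Exp(1) weights, the uniform toy sample, the indicator function `f`, and all function names are assumptions made for illustration only; Definition 1 itself requires only i.i.d. positive weights with expectation 1 and finite variance. Normalized Exp(1) weights correspond to the Bayesian bootstrap of Rubin (1981).

```python
import random


def empirical_mean(f, xs):
    """P_n f = n^{-1} * sum_i f(X_i)."""
    return sum(f(x) for x in xs) / len(xs)


def bootstrap_mean(f, xs, weights):
    """P_n^(b) f = n^{-1} * sum_i (M_ni / M_bar_n) * f(X_i)."""
    n = len(xs)
    m_bar = sum(weights) / n  # M_bar_n = n^{-1} * sum_i M_ni
    return sum((m / m_bar) * f(x) for m, x in zip(weights, xs)) / n


rng = random.Random(0)

# Hypothetical toy sample and test function (not taken from the article).
xs = [rng.uniform(0.0, 1.0) for _ in range(200)]
f = lambda x: 1.0 if x <= 0.5 else 0.0

p_n = empirical_mean(f, xs)

# One bootstrap draw with Exp(1) weights: positive, mean 1, variance 1.
weights = [rng.expovariate(1.0) for _ in xs]
p_b = bootstrap_mean(f, xs, weights)
```

Repeating the last two lines many times and taking the empirical variance of `p_b` would give a bootstrap variance estimate for `p_n`; dividing by $\bar{M}_n$ makes the normalized weights average to one, so the bootstrap measure is a probability measure.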

Let

$$\begin{aligned} \widehat{N}(t,u,l)\equiv \frac{1}{n}\sum _{i=1}^{n}F^*_i I(\widetilde{T}^*_i\le t,\widetilde{U}^*_i\le u,L_i^* \le l)\,. \end{aligned}$$

Let N be the expectation of $\widehat{N}$, in other words, $N(t,u,l)\equiv P(T\le t, U\le u,L\le l, C>T+U-L \mid L\le T+U)$. We have $N(t,u,l)=G^*_{\widetilde{T}^*,\widetilde{U}^*,L^*}(t,u,l)$ as defined in Eq. (1). In particular,

$$\begin{aligned} N(t_0,u_0,l_0)=\beta ^{-1}\int _0^{t_0}\int _0^{u_0}\int _0^{l_0} S_C(t+u-l)I(l\le t+u)F_L(dl)G_{T,U}(dt,du). \end{aligned}$$

(10)

Lemma 1

The process

$$\begin{aligned} n^{\frac{1}{2}}\left( \begin{array}{c} \widehat{N}(t,u,l) -N(t,u,l) \\ \widehat{S}_C(s)- S_C(s)\\ \widehat{S}_{T+U}(v)-S_{T+U}(v)\\ \widehat{F}_{L}(w) -F_{L}(w) \end{array}\right) \rightsquigarrow \left( \begin{array}{c} \mathbb {G}_1(t,u,l)\\ \mathbb {G}_2(s) \\ \mathbb {G}_3(v)\\ \mathbb {G}_4(w)\end{array}\right) \end{aligned}$$

where $(\mathbb {G}_1,\ldots ,\mathbb {G}_4)^T \in \ell ^\infty ([0,\tau ]^3)\times (\ell ^\infty [0,\tau ])^3$ is a tight zero-mean Gaussian process with covariance structure that appears in the proof. Moreover, its corresponding bootstrap process

$$\begin{aligned} n^{\frac{1}{2}}\left( \begin{array}{c} \widehat{N}^{(b)}(t,u,l) -\widehat{N}(t,u,l) \\ \widehat{S}_C^{(b)}(s)- \widehat{S}_C(s)\\ \widehat{S}_{T+U}^{(b)}(v)-\widehat{S}_{T+U}(v)\\ \widehat{F}_{L}^{(b)}(w) -\widehat{F}_{L}(w)\end{array}\right) \underset{\text {M}}{\overset{\text {P}}{\rightsquigarrow }}\left( \begin{array}{c} \mathbb {G}_1(t,u,l)\\ \mathbb {G}_2(s) \\ \mathbb {G}_3(v)\\ \mathbb {G}_4(w)\end{array}\right) \,. \end{aligned}$$

(11)

Proof

Write $\widehat{N}(t,u,l) \equiv \mathbb {P}_n f_{t,u,l}(F^*,\widetilde{T}^*,\widetilde{U}^*,L^*)$, where $f_{t,u,l}(F^*,\widetilde{T}^*,\widetilde{U}^*,L^*)=F^* I(\widetilde{T}^*\le t,\widetilde{U}^*\le u,L^*\le l)$. Note that the class

$$\begin{aligned} \mathcal F_1\equiv \left\{ f_{t,u,l}(F^*,\widetilde{T}^*,\widetilde{U}^*, L^*): (t,u,l)\in [0,\tau ]^3,l\le t+u\in [0,\tau ] \right\} \end{aligned}$$

is a P-Donsker class. Hence, $\sqrt{n}(\widehat{N}-N)\rightsquigarrow \mathbb {G}_1$, where $\mathbb {G}_1$ is a Brownian bridge on $\ell ^{\infty }([0,\tau ]^3)$ with covariance

$$\begin{aligned}&\mathrm {Cov}\left( \mathbb {G}_1(t_1,u_1,l_1),\mathbb {G}_1(t_2,u_2,l_2)\right) \nonumber \\&\quad =G_{\widetilde{T}^*,\widetilde{U}^*,L^*}\big (\min (t_1,t_2),\min (u_1,u_2),\min (l_1,l_2)\big )\nonumber \\&\quad \quad -\,G_{\widetilde{T}^*,\widetilde{U}^*,L^*}(t_1,u_1,l_1)G_{\widetilde{T}^*,\widetilde{U}^*,L^*}(t_2,u_2,l_2)\,. \end{aligned}$$

Let $\Lambda _C(s)$ be the cumulative hazard of C, and recall that C is randomly censored by $T^*+U^*-L^*$. Let $\pi (s)=P(\widetilde{T}^*+\widetilde{U}^*-L^*\ge s)$. Write

$$\begin{aligned} \nu (\widetilde{T}^*,\widetilde{U}^*,F^*,L^*,s)&\equiv -S_C(s)\left[ \frac{(1-F^*)I(\widetilde{T}^*+\widetilde{U}^*-L^*\le s)}{\pi (\widetilde{T}^*+\widetilde{U}^*-L^*)}\right. \nonumber \\&\quad \left. -\int _0^s\frac{I(\widetilde{T}^*+\widetilde{U}^*-L^*\ge u)}{\pi (u)}d\Lambda _C(u)\right] \,. \end{aligned}$$

By Kosorok (2008, Chap. 4.3), $\sqrt{n}(\widehat{S}_C-S_C)=\sqrt{n}(\mathbb {P}_n-P)\nu +o_p(1)$. Note that the random process $\nu $, as a process in $s\in [0,\tau ]$, is P-Donsker by Corollary 9.32 combined with Lemma 4.1 of Kosorok (2008). Hence, $\sqrt{n}(\widehat{S}_C-S_C)\rightsquigarrow \mathbb {G}_2$ where $\mathbb {G}_2$ is a Brownian bridge on $\ell ^{\infty }([0,\tau ])$ with covariance

$$\begin{aligned} \mathrm {Cov}\left( \mathbb {G}_2(s_1),\mathbb {G}_2(s_2)\right) =E[\nu (s_1)\nu (s_2)]\,. \end{aligned}$$

For $\mathbb {G}_3$ and $\mathbb {G}_4$ we use results from Wang (1991). Let

$$\begin{aligned} K(s)&\equiv P(\widetilde{T}^*+\widetilde{U}^*\le s, F^*=1)\\ R(s)&\equiv P(L^*\le s\le \widetilde{T}^*+\widetilde{U}^*)\\ F_{L^*}(s)&\equiv P(L^*\le s)\\ \xi (\widetilde{T}^*,\widetilde{U}^*,L^*,F^*,s)&\equiv -S_{T+U}(s)\left[ \frac{I(\widetilde{T}^*+\widetilde{U}^*\le s)F^*}{R(s)}\right. \\&\qquad +\left. \int _0^s\frac{I(\widetilde{T}^*+\widetilde{U}^*\le u)F^*}{R(u)^2}dR(u)\right. \\&\qquad \left. -\int _0^s\frac{I(L^*\le u \le \widetilde{T}^*+\widetilde{U}^*)}{R(u)^2}dK(u) \right] \\ \psi (\widetilde{T}^*,\widetilde{U}^*,L^*,F^*,s)&\equiv \int \frac{1}{S_{T+U}(u)^2}\xi (\widetilde{T}^*,\widetilde{U}^*,L^*,F^*,u)\\&\quad \times \big (F_L(s)-I(u\le s)\big )dF_{L^{*}}(u) \\ \zeta (L^*,s)&\equiv \frac{ I(L^*\le s)-F_L(s)}{S_{T+U}(L^*)}\\ \vartheta (\widetilde{T}^*,\widetilde{U}^*,L^*,F^*,s)&\equiv \beta \left( \psi (\widetilde{T}^*,\widetilde{U}^*,L^*,F^*,s)+\zeta (L^*,s)\right) \end{aligned}$$

where $\beta \equiv P(T+U\ge L)$. By Wang (1991, Sect. 4), $\sqrt{n}(\widehat{S}_{T+U}-S_{T+U})=\sqrt{n}(\mathbb {P}_n-P)\xi +o_p(1)$ and $\sqrt{n}(\widehat{F}_L-F_L)=\sqrt{n}(\mathbb {P}_n-P)\vartheta +o_p(1) $. Note that the random processes $\xi $ and $\psi $, as processes in $s\in [0,\tau ]$, are P-Donsker by Lemma 4.1 of Kosorok (2008). The process $\zeta $ is also P-Donsker by the boundedness of $S_{T+U}$ on $[0,\tau ]$, and therefore so is the sum $\psi +\zeta $. Hence, $\sqrt{n}(\widehat{S}_{T+U}-S_{T+U})\rightsquigarrow \mathbb {G}_3$ where $\mathbb {G}_3$ is a Brownian bridge on $\ell ^{\infty }([0,\tau ])$ with covariance given in Wang (1991, Lemma 4.1). Similarly, $\sqrt{n}(\widehat{F}_L-F_L)\rightsquigarrow \mathbb {G}_4$ where $\mathbb {G}_4$ is a Brownian bridge on $\ell ^{\infty }([0,\tau ])$ with covariance given in Wang (1991, Theorem 4.1).

Since all four Gaussian processes are tight, by Lemmas 7.12 and 7.14 of Kosorok (2008), the joint process $(\mathbb {G}_1,\ldots ,\mathbb {G}_4)^T$ is also tight in $\ell ^{\infty }([0,\tau ]^3)\times (\ell ^{\infty }[0,\tau ])^3$. By the Cramér-Wold device it is also zero-mean Gaussian with covariance

$$\begin{aligned}&\mathrm {Cov}\left( \mathbb {G}_1(t,u,l),\mathbb {G}_2(s)\right) =E[f_{t,u,l}\nu (s)],\quad \mathrm {Cov}\left( \mathbb {G}_1(t,u,l),\mathbb {G}_3(s)\right) =E[f_{t,u,l}\xi (s)]\,,\\&\mathrm {Cov}\left( \mathbb {G}_1(t,u,l),\mathbb {G}_4(s)\right) =E[f_{t,u,l}\vartheta (s)],\quad \mathrm {Cov}\left( \mathbb {G}_2(s),\mathbb {G}_3(t)\right) =E[\nu (s)\xi (t)]\,,\\&\mathrm {Cov}\left( \mathbb {G}_2(s),\mathbb {G}_4(t)\right) =E[\nu (s)\vartheta (t)],\quad \mathrm {Cov}\left( \mathbb {G}_3(s),\mathbb {G}_4(t)\right) =E[\xi (s)\vartheta (t)]\,, \end{aligned}$$

where the expectation is taken with respect to the random variables $\widetilde{T}^*,\widetilde{U}^*,L^*,F^*$. Since all the function classes are Donsker, by Theorem 2.6 of Kosorok (2008), the bootstrap version in (11) also holds.

1.2 Hadamard differentiability

Definition 2

Let $\mathbb {D}$ and $\mathbb {E}$ be normed spaces. Then $\phi :\mathbb {D}\mapsto \mathbb {E}$ is Hadamard differentiable at $A\in \mathbb {D}$ if there exists a linear and continuous function $\phi _A':\mathbb {D}\mapsto \mathbb {E}$ such that

$$\begin{aligned} \frac{\phi (A +h_n a_n)-\phi (A)}{h_n}- \phi _A'(a)\rightarrow 0\,, \end{aligned}$$

for all converging sequences $h_n\rightarrow 0$ and $a_n\rightarrow a$ with $h_n\in \mathbb {R}$, $a_n\in \mathbb {D}$ and $A+h_na_n\in \mathbb {D}$ (Kosorok 2008, Sect. 2.2.4).

Definition 3

The space $D[0,\tau ]$ is the space of all cadlag functions (right continuous functions with left limits) from $[0,\tau ]$ to $\mathbb {R}$ equipped with the sup-norm. Denote by $BV_M[0,\tau ]$ the space of all functions with bounded variation, that is, all the functions $A\in D[0,\tau ]$ such that $\int _0^\tau |dA(t)|\equiv |A(0)|+\int _{(0,\tau )}|dA(t)| <M$ (see Kosorok 2008 Sect. 12.2.2). Finally, the space $D([0,\tau ]^p)$ is the space of all cadlag p-variate functions equipped with the sup-norm (see Neuhaus 1971, for details).

Lemma 2

Let $D_1\equiv \{f\in D[0,\tau ]\, :\, \inf _t|f(t)|>0\} $, $D_2\equiv BV_M[0,\tau ]\times BV_M[0,\tau ]$, $D_3\equiv D[0,\tau ]^2\times D[0,\tau ]^2$. Then

(i)
The function
$$\begin{aligned}&H_1: D[0,\tau ]\times D[0,\tau ]\mapsto D([0,\tau ]^3)\quad ;\\&\quad H_1(\, (A,B)\,)(t,u,l)=A(t+u)B(t+u-l) \end{aligned}$$
is Hadamard differentiable with derivative
$$\begin{aligned} H_{1,(A,B)}'(a,b)(t,u,l)= & {} A(t+u)b(t+u-l)+a(t+u)B(t+u-l)\\= & {} H_1(A,b)(t,u,l)+H_1(a,B)(t,u,l)\,. \end{aligned}$$
(ii)
The function
$$\begin{aligned} H_2: D_1\mapsto D[0,\tau ] \quad ; \quad H_2(A)(t)=\frac{1}{A(t)} \end{aligned}$$
is Hadamard differentiable with derivative $H_{2,(A)}'(a)(t)=-a(t)/A(t)^2$.
(iii)
Let $C(t_0,u_0)=\{(t,u,l)\in [0,\tau ]^3\,:\, l\le t+u\le \tau ,t\le t_0,u\le u_0\}$. The function
$$\begin{aligned} H_3: D_3\mapsto D[0,\tau ]^2\quad ;\quad H_3(\, (A,B)\, )(t_0,u_0)=\int _{C(t_0,u_0)} A(t,u,l)dB(t,u,l) \end{aligned}$$
is Hadamard differentiable with derivative
$$\begin{aligned} H_{3,(A,B)}'(a,b)(t_0,u_0)= & {} \int _{C(t_0,u_0)} A(t,u,l)db(t,u,l)\\&+\int _{C(t_0,u_0)} a(t,u,l)dB(t,u,l)\,. \end{aligned}$$
(iv)
The function
$$\begin{aligned} H_4: D_2\mapsto D([0,\tau ]^3)\quad ;\quad H_4(\, (A,B)\, )(t,u,l)=\int _{(0,t+u]} A(t+u-s)dB(s) \end{aligned}$$
is Hadamard differentiable with derivative
$$\begin{aligned} H_{4,(A,B)}'(a,b)(t,u,l)&=\int _{(0,t+u]} A(t+u-s)db(s)+\int _{(0,t+u]} a(t+u-s)dB(s)\\&= H_4(A,b)(t,u,l)+H_4(a,B)(t,u,l)\,. \end{aligned}$$
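
The derivative formulas in Lemma 2 can be sanity-checked numerically. The sketch below does this for part (ii) only: it compares the difference quotient of $H_2(A)=1/A$ with the claimed derivative $-a/A^2$ in sup-norm on a grid. The particular functions `A` and `a`, the grid, and the fixed direction $a_n=a$ (a special case of $a_n\rightarrow a$) are illustrative assumptions, not part of the original argument.

```python
def h2(A):
    """H_2(A)(t) = 1 / A(t), with A represented by its values on a grid."""
    return [1.0 / v for v in A]


def h2_derivative(A, a):
    """Claimed Hadamard derivative of H_2 at A in direction a: -a(t) / A(t)^2."""
    return [-av / (Av * Av) for Av, av in zip(A, a)]


def sup_norm_diff(x, y):
    """Sup-norm distance between two grid functions."""
    return max(abs(p - q) for p, q in zip(x, y))


grid = [i / 50 for i in range(51)]   # t in [0, 1], so tau = 1 here
A = [1.0 - 0.5 * t for t in grid]    # inf |A| = 0.5 > 0, so A lies in D_1
a = [t * (1.0 - t) for t in grid]    # hypothetical perturbation direction

errors = []
for h in (1e-1, 1e-2, 1e-3):
    perturbed = h2([Av + h * av for Av, av in zip(A, a)])
    quotient = [(p - q) / h for p, q in zip(perturbed, h2(A))]
    # Sup-norm gap between the difference quotient and the claimed derivative.
    errors.append(sup_norm_diff(quotient, h2_derivative(A, a)))
```

The entries of `errors` should shrink roughly in proportion to `h`, which is what differentiability predicts for this smooth example.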

Proof

For the proofs of (i) and (iv), let $h_n\rightarrow 0$ and $(a_n,b_n)\rightarrow (a,b)$ in the appropriate space.

(i)
Write
$$\begin{aligned}&\frac{H_1(A+h_na_n,B+h_nb_n)(t,u,l)-H_{1}(A,B)(t,u,l)}{h_n}-H_{1,(A,B)}'(a,b)(t,u,l)\\&\quad =A(t+u)b_n(t+u-l)+a_n(t+u)B(t+u-l)+h_na_n(t+u)b_n(t+u-l)\\&\quad \quad -\left\{ A(t+u)b(t+u-l)+a(t+u)B(t+u-l)\right\} \rightarrow 0\,. \end{aligned}$$
(ii)
The proof appears in Kosorok (2008, Sect. 2.2.4).
(iii)
The proof appears in Gill et al. (1995, Sect. 2, Illustration 1).
(iv)
Write
$$\begin{aligned}&\frac{H_4( A+h_na_n,B+h_nb_n)(t,u,l)-H_4(A,B)(t,u,l)}{h_n}-H_{4,(A,B)}'(a,b)(t,u,l)\\&=h_n^{-1}\left( \int _0^{t+u} (A+h_na_n)(t+u-s)d(B+h_nb_n)(s)\right. \\&\quad \quad \left. -\int _0^{t+u} A(t+u-s)dB(s)\right) \\&\quad -\left( \int _0^{t+u} A(t+u-s)db(s)+\int _0^{t+u} a(t+u-s)dB(s)\right) \\&=\int _0^{t+u}A(t+u-s)d(b_n-b)(s)+\int _0^{t+u} (a_n-a)(t+u-s)dB(s)\\&\quad +h_n\int _0^{t+u}a_n(t+u-s)db_n(s)\,. \end{aligned}$$
The first term in the above equation goes to zero using the same arguments as in proof 12.3 of Kosorok (2008, p. 242). The second term goes to zero since $a_n\rightarrow a$ and B is bounded in total variation. The third term goes to zero since $h_n\rightarrow 0$ and $a_n$ and $b_n$ are bounded, which completes the proof.

1.3 Computation of the asymptotic variance

Proof of Theorem 1

Let $C(t_0,u_0)=\{(t,u,l)\in [0,\tau ]^3\,:\, l\le t+u\le \tau ,t\le t_0, u\le u_0\}$. Note that

$$\begin{aligned} G_{T,U}(t_0,u_0)=&\int _0^{t_0}\int _0^{u_0}{I(t+u\le \tau )G_{T,U}(dt,du)}\\ =&\int _0^{t_0}\int _0^{u_0}F_L(t+u) \frac{I(t+u\le \tau )G_{T,U}(dt,du)}{F_L(t+u)} \\ =&\int _0^{t_0}\int _0^{u_0}\int _0^{t+u} \frac{S_C(t+u-l)I(t+u\le \tau )F_L(dl)G_{T,U}(dt,du)}{F_L(t+u)S_C(t+u-l)} \\ =&\,\beta \int _{C(t_0,u_0)} \frac{dN(t,u,l)}{F_L(t+u)S_C(t+u-l)}\,. \end{aligned}$$

Therefore we can write $G_{T,U}=\beta \phi (N,S_C,F_L)$, where (compare to (3))

$$\begin{aligned} \phi (N,S_C,F_L)(t_0,u_0)\equiv \int _{C(t_0,u_0)} \frac{dN(t,u,l)}{F_L(t+u)S_C(t+u-l)}\,. \end{aligned}$$

The function $\phi (N,S_C,F_L)$ can be decomposed as a sequence of the following mappings:

$$\begin{aligned} (N,S_C,F_L)\mapsto (N,H_1(F_L,S_C))&\mapsto \left( N,H_2(H_1(F_L,S_C))\right) \\&\mapsto H_3(H_2(H_1(F_L,S_C)),N) \,, \end{aligned}$$

where $H_1$, $H_2$, and $H_3$ are defined in Lemma 2. By (3), using the same mapping for $t_0\le \tau $, $u_0\le \tau $,

$$\begin{aligned} \widehat{G}^{NPE-1}_{T,U}(t_0,u_0)=\beta _n\phi (\widehat{N},\widehat{S}_C,\widehat{F}_L)=\frac{\beta _n}{n}\sum _{i=1}^n \frac{F^*_iI(\widetilde{T}^*_i\le t_0,\widetilde{U}^*_i\le u_0,L_i^*\le \tau )}{\widehat{F}_L(\widetilde{T}^*_i+\widetilde{U}^*_i)\widehat{S}_C(\widetilde{T}^*_i+\widetilde{U}^*_i-L^*_i)}\,, \end{aligned}$$

where

$$\begin{aligned} \beta _n&\equiv \Big \{\frac{1}{n}\sum _{j=1}^n\widehat{S}_{T+U}^{-1}(L^*_j)\Big \}^{-1}\,. \end{aligned}$$

(12)
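
To make the plug-in form of $\widehat{G}^{NPE-1}_{T,U}$ and $\beta _n$ concrete, here is a minimal Python sketch. It assumes that the estimates $\widehat{F}_L$, $\widehat{S}_C$ and $\widehat{S}_{T+U}$ are already available as callables (how they are computed is not shown), and it treats $F^*_i$ as a 0/1 indicator of an uncensored observation, as the definition of $N$ suggests. The function names, the toy data, and the closed-form stand-ins for the three estimates are hypothetical.

```python
import math


def beta_n(l_obs, s_tu_hat):
    """beta_n = { n^{-1} * sum_j 1 / S_{T+U}-hat(L*_j) }^{-1}, as in (12)."""
    n = len(l_obs)
    return 1.0 / (sum(1.0 / s_tu_hat(l) for l in l_obs) / n)


def npe1(t0, u0, data, f_l_hat, s_c_hat, s_tu_hat, tau):
    """Inverse-probability-weighted estimate of G_{T,U}(t0, u0).

    data: list of (t_obs, u_obs, l_obs, uncensored) tuples, uncensored in {0, 1}.
    Assumes the weights F_L-hat(t+u) * S_C-hat(t+u-l) are strictly positive.
    """
    n = len(data)
    b = beta_n([l for (_, _, l, _) in data], s_tu_hat)
    total = 0.0
    for t, u, l, uncensored in data:
        if uncensored and t <= t0 and u <= u0 and l <= tau:
            total += 1.0 / (f_l_hat(t + u) * s_c_hat(t + u - l))
    return b * total / n


# Hypothetical toy inputs, for illustration only.
data = [(1.0, 1.0, 0.5, 1), (2.0, 0.5, 1.0, 0), (0.5, 1.5, 1.0, 1)]
f_l_hat = lambda x: min(x / 4.0, 1.0)   # stand-in for the estimate of F_L
s_c_hat = lambda x: math.exp(-0.1 * x)  # stand-in for the estimate of S_C
s_tu_hat = lambda x: math.exp(-0.5 * x) # stand-in for the estimate of S_{T+U}

estimate = npe1(2.0, 2.0, data, f_l_hat, s_c_hat, s_tu_hat, tau=4.0)
```

When all three callables return 1 (no truncation or censoring correction), the function reduces to the fraction of uncensored observations falling in the rectangle, which is a convenient check of the bookkeeping.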

The derivative of the map $\phi $ at $(N,S_C,F_L)$, $\phi _{(N,S_C,F_L)}'(a,b,c)$ for $(a,b,c)$ in $D([0,\tau ]^3)\times D[0,\tau ]\times D[0,\tau ] $, can be obtained using the chain rule for Hadamard differentiable functions (Kosorok 2008, Lemma 6.19):

$$\begin{aligned}&(a,b,c)\mapsto (a,H_1(F_L,b)+H_1(c, S_C))\mapsto \left( a,-\frac{H_1(F_L,b)+H_1(c,S_C)}{H_1(F_L,S_C)^2}\right) \\&\quad \mapsto \int _{C(t_0,u_0)}\frac{da(t,u,l)}{H_1(F_L,S_C)(t,u,l)}\\&\quad \quad - \int _{C(t_0,u_0)}\left( \frac{H_1(F_L,b)+H_1(c,S_C)}{H_1(F_L,S_C)^2}\right) (t,u,l)dN(t,u,l)\\&\quad =\int _{C(t_0,u_0)}\frac{da(t,u,l)}{F_L(t+u)S_C(t+u-l)}\\&\quad \quad - \int _{C(t_0,u_0)}\left( \frac{F_L(t+u)b(t+u-l)+c(t+u)S_C(t+u-l)}{\left( F_L(t+u)S_C(t+u-l)\right) ^2}\right) dN(t,u,l)\,. \end{aligned}$$

By the functional delta method (Kosorok 2008, Theorem 2.8), together with Slutsky’s Theorem (Kosorok 2008, Theorem 7.15) applied to the multiplication by $\beta _n\rightarrow \beta $,

$$\begin{aligned} \sqrt{n}\left( \widehat{G}^{NPE-1}_{T,U}(t,u)-G_{T,U}(t,u)\right) \rightsquigarrow \beta \phi _{(N,S_C,F_L)}'(\mathbb {G}_1,\mathbb {G}_2,\mathbb {G}_4)\,. \end{aligned}$$

By the functional delta method for bootstrap processes (Kosorok 2008, Theorem 12.1), we also have

$$\begin{aligned} \sqrt{n}\left( \widehat{G}^{NPE-1(b)}_{T,U}(t,u)-\widehat{G}^{NPE-1}_{T,U}(t,u)\right) \underset{\text {M}}{\overset{\text {P}}{\rightsquigarrow }}\beta \phi _{(N,S_C,F_L)}'(\mathbb {G}_1,\mathbb {G}_2,\mathbb {G}_4)\,. \end{aligned}$$

This completes the proof for the first estimator. For the second estimator, note the relation $F_{L^*}(dl)=F_L(dl)S_{T+U}(l)/\int _0^\infty F_L(dy)S_{T+U}(y)$ that follows from the model $L^* \sim L\mid L\le T+U$, where L and $T+U$ are independent. Therefore, the inverse formula is $F_L(dl)=F_{L^*}(dl)S^{-1}_{T+U}(l)/\int _0^\infty F_{L^*}(dy)S^{-1}_{T+U}(y)$. Let $\beta =\int _0^\infty F_L(dy)S_{T+U}(y) = P(L\le T+U)$ as before and $\gamma =\int _0^\infty F_{L^*}(dy)S^{-1}_{T+U}(y)$. Let $C_1(t_0,u_0)=\{(t,u):t+u\le \tau , t\le t_0,u\le u_0\}$.
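
The relation between $F_{L^*}$ and $F_L$ and its inverse can be checked on a small discrete example. In the sketch below the three-point support, the probabilities for $L$, and the exponential form of $S_{T+U}$ are hypothetical choices made only so that the sums can be evaluated exactly; they are not taken from the article.

```python
import math

# Hypothetical discrete truncation distribution F_L(dl) on three points.
support = [1.0, 2.0, 3.0]
f_l = [0.2, 0.3, 0.5]

# Hypothetical survival function of T+U (an assumption for this example).
s_tu = lambda l: math.exp(-0.4 * l)

# Forward formula: F_{L*}(dl) = F_L(dl) * S_{T+U}(l) / beta.
beta = sum(p * s_tu(l) for p, l in zip(f_l, support))
f_l_star = [p * s_tu(l) / beta for p, l in zip(f_l, support)]

# Inverse formula: F_L(dl) = F_{L*}(dl) / S_{T+U}(l) / gamma.
gamma = sum(q / s_tu(l) for q, l in zip(f_l_star, support))
recovered = [q / s_tu(l) / gamma for q, l in zip(f_l_star, support)]
```

In this example `recovered` reproduces `f_l`, and `beta * gamma` equals 1, which follows directly from substituting the forward formula into the definition of $\gamma $.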

Note that

$$\begin{aligned} G_{T,U}(t_0,u_0)=&\int _{C_1(t_0,u_0)} G_{T,U}(dt,du) \\ =&\int _{C_1(t_0,u_0)} \frac{G_{T,U}(dt,du) \int _{l=0}^{t+u}S_C(t+u-l)\gamma ^{-1} F_{L^*}(dl)S^{-1}_{T+U}(l) }{\int _{s=0}^{t+u}S^{-1}_{T+U}(s)S_C(t+u-s)\gamma ^{-1}F_{L^*}(ds)}\\ =&\int _{C_1(t_0,u_0)}\int _{0\le l\le t+u} \frac{S_C(t+u-l)F_L(dl)G_{T,U}(dt,du)}{\int _{s=0}^{t+u}S_C(t+u-s)F_{L}(ds)}\\ =&\,\beta \int _{C(t_0,u_0)} \frac{dN(t,u,l)}{\int _{s=0}^{t+u}S_C(t+u-s)F_{L}(ds)}\,, \end{aligned}$$

where the first equality follows from the definition of the density $G_{T,U}(dt,du)$; the second equality follows by multiplying and dividing by $\gamma ^{-1} \int _0^{t+u}S_C(t+u-l)F_{L^*}(dl)S_{T+U}^{-1}(l)$; the third equality follows from the inverse formula that relates $F_{L^*}$ and $F_L$ above; and the last equality follows from the definition of $N$ in (10).

Write $G_{T,U}(t_0,u_0)=\beta \psi (N,S_C,F_{L})(t_0,u_0)$, where (compare to (6))

$$\begin{aligned} \psi (N,S_C,F_L)(t_0,u_0)\equiv \int _{C(t_0,u_0)} \frac{dN(t,u,l)}{\int _{s=0}^{t+u}S_C(t+u-s)dF_{L}(s)}\,. \end{aligned}$$

The function $\psi (N,S_C,F_L)$ can be decomposed as a sequence of the following mappings

$$\begin{aligned} (N,S_C,F_{L})&\mapsto (N,H_4(S_C,F_{L}))\mapsto \left( N,H_2(H_4(S_C,F_L))\right) \\&\mapsto H_3(H_2(H_4(S_C,F_L)),N)\,, \end{aligned}$$

where $H_2$, $H_3$, and $H_4$ are defined in Lemma 2. By (6), using the same mappings

$$\begin{aligned} \widehat{G}^{NPE-2}_{T,U}(t_0,u_0)=\beta _n \psi (\widehat{N},\widehat{S}_C,\widehat{F}_{L})=\frac{\beta _n}{n}\sum _{i=1}^n \frac{F^*_iI(\widetilde{T}^*_i\le t_0,\widetilde{U}^*_i\le u_0)}{\int _{s=0}^{\widetilde{T}_i^*+\widetilde{U}_i^*}\widehat{S}_C(\widetilde{T}^*_i+\widetilde{U}^*_i-s)d\widehat{F}_{L}(s)} \end{aligned}$$

where $\beta _n$ is defined in (12). The derivative of the map $\psi $ at $(N,S_C,F_{L})$, $\psi _{(N,S_C,F_{L})}'(a,b,c)$ for $(a,b,c)$ in $D([0,\tau ]^3)\times (D[0,\tau ])^2 $ can be obtained using the chain rule for Hadamard differentiable functions (Kosorok 2008, Lemma 6.19).

$$\begin{aligned}&(a,b,c)\mapsto \left( a,H_4(S_C,c)+H_4(b,F_{L})\right) \mapsto \left( a,-\frac{H_4(S_C,c)+H_4(b,F_{L})}{\left( H_4(S_C,F_{L})\right) ^2} \right) \\&\quad \mapsto \int _{C(t_0,u_0)}\frac{da(t,u,l)}{H_4(S_C,F_L)(t,u,l)}\\&\quad \quad -\int _{C(t_0,u_0)}\left( \frac{H_4(S_C,c)+H_4(b,F_{L})}{\left( H_4(S_C,F_{L})\right) ^2}\right) (t,u,l)dN(t,u,l) \\&\quad =\int _{C(t_0,u_0)}\frac{da(t,u,l)}{\int _0^{t+u}S_C(t+u-s)dF_L(s)} \\&\quad \quad - \int _{C(t_0,u_0)}\left( \frac{\int _0^{t+u}S_C(t+u-s)dc(s)+\int _0^{t+u}b(t+u-s)dF_{L}(s)}{\left( \int _0^{t+u}S_C(t+u-s)dF_{L}(s)\right) ^2}\right) dN(t,u,l)\,. \end{aligned}$$

By the functional delta method (Kosorok 2008, Theorem 2.8), together with Slutsky’s Theorem for the convergence of $\beta _n$ to $\beta $ (Kosorok 2008, Theorem 7.15),

$$\begin{aligned} \sqrt{n}\left( \widehat{G}^{NPE-2}_{T,U}(t,u)-G_{T,U}(t,u)\right) \rightsquigarrow \beta \psi _{(N,S_C,F_{L})}'(\mathbb {G}_1,\mathbb {G}_2,\mathbb {G}_4)\,. \end{aligned}$$

By the functional delta method for bootstrap processes (Kosorok 2008, Theorem 12.1) we also have

$$\begin{aligned} \sqrt{n}\left( \widehat{G}^{NPE-2(b)}_{T,U}(t,u)-\widehat{G}^{NPE-2}_{T,U}(t,u)\right) \underset{\text {M}}{\overset{\text {P}}{\rightsquigarrow }}\beta \psi _{(N,S_C,F_{L})}'(\mathbb {G}_1,\mathbb {G}_2,\mathbb {G}_4)\,. \end{aligned}$$

$\square $
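
The second estimator $\widehat{G}^{NPE-2}_{T,U}$ from the proof above replaces the weight $\widehat{F}_L\widehat{S}_C$ by the integral $\int _0^{t+u}\widehat{S}_C(t+u-s)d\widehat{F}_L(s)$. The following Python sketch evaluates it under the assumption that $\widehat{F}_L$ is a step function stored as (location, jump) pairs, so the integral becomes a finite sum over jumps in $(0,t+u]$, following the convention used for $H_4$. As in the article, $F^*_i$ is treated as a 0/1 uncensored indicator; the function names, the toy data, and the supplied value of `beta` are hypothetical.

```python
import math


def conv_weight(x, s_c_hat, f_l_jumps):
    """int_(0,x] S_C-hat(x - s) dF_L-hat(s) for a step function F_L-hat.

    f_l_jumps: list of (location, jump size) pairs of F_L-hat.
    """
    return sum(mass * s_c_hat(x - s) for s, mass in f_l_jumps if 0.0 < s <= x)


def npe2(t0, u0, data, s_c_hat, f_l_jumps, beta):
    """Plug-in estimate of G_{T,U}(t0, u0) with truncation times integrated out.

    data: list of (t_obs, u_obs, l_obs, uncensored) tuples, uncensored in {0, 1}.
    Assumes conv_weight is strictly positive at every uncensored t_obs + u_obs.
    """
    n = len(data)
    total = 0.0
    for t, u, _l, uncensored in data:
        if uncensored and t <= t0 and u <= u0:
            total += 1.0 / conv_weight(t + u, s_c_hat, f_l_jumps)
    return beta * total / n


# Hypothetical toy inputs, for illustration only.
data = [(1.0, 1.0, 0.5, 1), (2.0, 0.5, 1.0, 0), (0.5, 1.5, 1.0, 1)]
f_l_jumps = [(0.5, 0.5), (1.0, 0.5)]    # stand-in for the jumps of F_L-hat
s_c_hat = lambda x: math.exp(-0.1 * x)  # stand-in for the estimate of S_C

estimate = npe2(2.0, 2.0, data, s_c_hat, f_l_jumps, beta=0.8)
```

Note that the observed truncation time of each subject is not used in the weight, only through the estimate of $F_L$, which is the sense in which the truncation variables are integrated out.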

Vakulenko-Lagun, B., Mandel, M. & Goldberg, Y. Nonparametric estimation in the illness-death model using prevalent data. Lifetime Data Anal 23, 25–56 (2017). https://doi.org/10.1007/s10985-016-9373-0

Received: 29 July 2015
Accepted: 17 June 2016
Published: 28 June 2016
Issue Date: January 2017
DOI: https://doi.org/10.1007/s10985-016-9373-0
