Statistical inference based on the nonparametric maximum likelihood estimator under double-truncation

Abstract

Doubly truncated data consist of samples whose observed values fall between the left- and right-truncation limits. With such samples, the distribution function of interest is estimated by the nonparametric maximum likelihood estimator (NPMLE), which is obtained through a self-consistency algorithm. Owing to the complicated asymptotic distribution of the NPMLE, the bootstrap method has been suggested for statistical inference. This paper proposes a closed-form estimator for the asymptotic covariance function of the NPMLE, which is a computationally attractive alternative to bootstrapping. Furthermore, we develop various statistical inference procedures, such as confidence intervals, goodness-of-fit tests, and confidence bands, to demonstrate the usefulness of the proposed covariance estimator. Simulations are performed to compare the proposed method with both the bootstrap and jackknife methods. The methods are illustrated using the childhood cancer dataset.



References

  1. Austin D, Simon DK, Betensky RA (2013) Computationally simple estimation and improved efficiency for special cases of double truncation. Lifetime Data Anal. doi:10.1007/s10985-013-9287-z

  2. Chen YH (2010) Semiparametric marginal regression analysis for dependent competing risks under an assumed copula. J R Stat Soc B 72:235–251

  3. Commenges D (2002) Inference for multi-state models from interval-censored data. Stat Methods Med Res 11:167–182

  4. Efron B, Petrosian V (1999) Nonparametric method for doubly truncated data. J Am Stat Assoc 94:824–834

  5. Emura T, Wang W (2012) Nonparametric maximum likelihood estimation for dependent truncation data based on copulas. J Multivar Anal 110:171–188

  6. Emura T, Konno Y (2012) Multivariate normal distribution approaches for dependently truncated data. Stat Papers 53:133–149

  7. Klein JP, Moeschberger ML (2003) Survival analysis: techniques for censored and truncated data. Springer, New York

  8. Moreira C, Uña-Álvarez J (2010) Bootstrapping the NPMLE for doubly truncated data. J Nonparametr Stat 22:567–583

  9. Moreira C, Uña-Álvarez J (2012) Kernel density estimation with doubly-truncated data. Electron J Stat 6:501–521

  10. Moreira C, Keilegom IV (2013) Bandwidth selection for kernel density estimation with doubly truncated data. Comput Stat Data Anal 61:107–123

  11. Moreira C, Uña-Álvarez J, Meira-Machado L (2014) Nonparametric regression with doubly truncated data. Comput Stat Data Anal. doi:10.1016/j.csda.2014.03.017

  12. Murphy SA (1995) Asymptotic theory for the frailty model. Ann Stat 23:182–198

  13. Nair VN (1984) Confidence bands for survival functions with censored data: a comparative study. Technometrics 26:265–275

  14. Shen PS (2010) Nonparametric analysis of doubly truncated data. Ann Inst Stat Math 62:835–853

  15. Shen PS (2011) Testing quasi-independence for doubly truncated data. J Nonparametr Stat 23:1–9

  16. Shen PS (2012) Empirical likelihood ratio with doubly truncated data. J Appl Stat 38:2345–2353

  17. Stovring H, Wang MC (2007) A new approach of nonparametric estimation of incidence and lifetime risk based on birth rates and incidence events. BMC Med Res Methodol 7:53

  18. van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes. Springer, New York

  19. Zeng D, Lin DY (2006) Efficient estimation of semiparametric transformation models for counting processes. Biometrika 93:627–640

  20. Zhu H, Wang MC (2012) Analyzing bivariate survival data with interval sampling and application to cancer epidemiology. Biometrika 99:345–361

Acknowledgments

We would like to thank the editor, the associate editor, and the two reviewers for their helpful comments and corrections, which greatly improved the manuscript. This work was financially supported by the National Science Council of Taiwan (NSC101-2118-M008-002-MY2) to T. Emura, and by a Grant-in-Aid for a Research Fellow of the Japan Society for the Promotion of Science (No. 23570036) to H. Michimae. The work of Y. Konno was partially supported by Grants-in-Aid for Scientific Research (C) (Nos. 25330043 and 21500283).

Author information

Corresponding author

Correspondence to Takeshi Emura.

Appendices

Appendix A: Bootstrap and jackknife algorithms

Simple bootstrap algorithm (Moreira and Uña-Álvarez 2010); a code sketch follows the two steps below:

  • Step 1: For each \(b=1,\;\ldots ,\;B\), draw a bootstrap resample \(\{\;(U_{jb}^*,\;T_{jb}^*,\;V_{jb}^*):j=1,\;\ldots ,\;n\;\}\) from \(\{\;(U_j ,\;T_j ,\;V_j ):j=1,\;\ldots ,\;n\;\}\), and then compute the NPMLE \(\hat{{F}}_b^*(t)\) from it.

  • Step 2: Compute the bootstrap variance estimator

    $$\begin{aligned} \hat{{V}}_{\mathrm{Boot}} \{\hat{{F}}(t)\}=\frac{1}{B-1}\sum _{b=1}^B {\{\hat{{F}}_b^*(t)-\bar{{F}}^{*}(t)\}^{2}} , \end{aligned}$$

    where \(\bar{{F}}^{*}(t)=\frac{1}{B} \sum _{b=1}^B {\hat{{F}}_b^*(t)}\), and take the \((\alpha /2)\times 100\)% and \((1-\alpha /2)\times 100\)% points of \(\{\;\hat{{F}}_b^*(t):\;b=1,\;\ldots ,\;B\;\}\) for the \((1-\alpha )\times 100\)% confidence interval.
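The two steps above can be sketched in a few lines of Python. The sketch below is our illustration, not the authors' code: it includes a minimal implementation of the self-consistency algorithm of Efron and Petrosian (1999) for the NPMLE, and the function names (npmle, bootstrap_npmle), the convergence tolerance, and the default \(B\) are illustrative choices. The inputs U, T, V are numpy arrays of the truncation limits and observed values.

```python
import numpy as np

def npmle(U, T, V, tol=1e-8, max_iter=5000):
    """NPMLE of the point masses f_j at the observed T_j under double
    truncation, via the self-consistency algorithm (Efron and Petrosian 1999)."""
    J = (U[:, None] <= T[None, :]) & (T[None, :] <= V[:, None])  # J_ij = I(U_i <= T_j <= V_i)
    f = np.full(len(T), 1.0 / len(T))
    for _ in range(max_iter):
        F = J @ f                           # F_i = sum_k J_ik f_k
        f_new = 1.0 / (J.T @ (1.0 / F))     # score equation: 1/f_j = sum_i J_ij / F_i
        f_new /= f_new.sum()                # renormalize to a probability vector
        if np.max(np.abs(f_new - f)) < tol:
            return f_new
        f = f_new
    return f

def bootstrap_npmle(U, T, V, t, B=500, alpha=0.05, seed=1):
    """Steps 1-2: bootstrap variance and percentile confidence interval for F_hat(t)."""
    rng = np.random.default_rng(seed)
    n = len(T)
    F_star = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, size=n)    # resample (U_j, T_j, V_j) triplets
        f_b = npmle(U[idx], T[idx], V[idx])
        F_star[b] = f_b[T[idx] <= t].sum()  # F*_b(t)
    variance = F_star.var(ddof=1)           # (1/(B-1)) sum_b {F*_b(t) - F*_bar(t)}^2
    ci = tuple(np.quantile(F_star, [alpha / 2, 1 - alpha / 2]))
    return variance, ci
```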

Jackknife algorithm (a code sketch follows the two steps below):

  • Step 1: For each \(i=1,\;\ldots ,\;n\), delete the \(i\)th sample from \(\{\;(U_j ,\;T_j ,\;V_j ):j=1,\;\ldots ,\;n\;\}\), and then compute the NPMLE \(\hat{{F}}_{(-i)}(t)\) from the remaining \(n-1\) samples.

  • Step 2: Compute the jackknife variance estimator

    $$\begin{aligned} \hat{{V}}_{\mathrm{Jack}} \{\hat{{F}}(t)\}=\frac{n-1}{n}\sum _{i=1}^n {\{\hat{{F}}_{(-i)} (t)-\bar{{F}}_{(\cdot )} (t)\}^{2}} , \end{aligned}$$

    where \(\bar{{F}}_{(\cdot )} (t)=\frac{1}{n}\sum _{i=1}^n {\hat{{F}}_{(-i)} (t)} \), and construct the log-transformed \((1-\alpha )\times 100\)% confidence interval

    $$\begin{aligned} (\;\hat{{F}}(t)\exp [\;-z_{\alpha /2} \hat{{V}}_{\mathrm{Jack}}^{\mathrm{1/2}} \{\hat{{F}}(t)\}/\hat{{F}}(t)\;],\;\;\hat{{F}}(t)\exp [\;z_{\alpha /2} \hat{{V}}_{\mathrm{Jack}}^{\mathrm{1/2}} \{\hat{{F}}(t)\}/\hat{{F}}(t)\;]\;). \end{aligned}$$
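A matching Python sketch of the jackknife steps; it reuses the npmle function from the bootstrap sketch above, and again the helper names (F_hat, jackknife_npmle) are our own illustrative choices.

```python
import numpy as np
from scipy.stats import norm  # for z_{alpha/2}; npmle() is the sketch from above

def F_hat(U, T, V, t):
    """NPMLE of F(t) = sum_j f_hat_j I(T_j <= t)."""
    return npmle(U, T, V)[T <= t].sum()

def jackknife_npmle(U, T, V, t, alpha=0.05):
    """Steps 1-2: leave-one-out variance and log-transformed confidence interval."""
    n = len(T)
    F_loo = np.array([F_hat(np.delete(U, i), np.delete(T, i), np.delete(V, i), t)
                      for i in range(n)])   # F_hat_(-i)(t), i = 1, ..., n
    variance = (n - 1) / n * np.sum((F_loo - F_loo.mean()) ** 2)
    F0 = F_hat(U, T, V, t)                  # F_hat(t) on the full sample
    delta = norm.ppf(1 - alpha / 2) * np.sqrt(variance) / F0
    return variance, (F0 * np.exp(-delta), F0 * np.exp(delta))
```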

Appendix B: Asymptotic theory

Appendix B1: Weak convergence of \(\sqrt{n}(\;\hat{{F}}(t)-F(t)\;)\)

Although not stated explicitly, we assume that the identifiability conditions (Shen 2010, p. 836) are satisfied. Consider the log-likelihood function

$$\begin{aligned} \ell _n (F)/n=\sum _{j=1}^n {(\log f_j -\log F_j )} /n, \end{aligned}$$

where \(f_j =dF(T_j )\) and \(F_j =\int {\mathbf{I}(U_j \le s\le V_j )dF(s)} \).

For any \(h\in Q\), where \(Q\) is the set of all uniformly bounded functions, let \(H(t)=\int _0^t {h(s)dF(s)} \) and \(\hat{{H}}(t)=\int _0^t {h(s)d\hat{{F}}(s)} \), where \(h\) satisfies the constraint \(\hat{{H}}(\infty )=0\). Suppose that \(\hat{{F}}\) is the maximizer of \(\ell _n (F)\). Then, for any \(h\in Q\) and \(\varepsilon \ge 0\), we have \(\ell _n (\hat{{F}}+\varepsilon \hat{{H}})\le \ell _n (\hat{{F}})\). Hence, the score function \(\partial \ell _n (F+\varepsilon H)/\partial \varepsilon |_{\varepsilon =0} \) is equal to

$$\begin{aligned} \Psi _n (F)[h]\equiv \frac{1}{n}\sum _{i=1}^n {\left[ {h(T_i )-\frac{\int {\mathbf{I}(U_i \le s\le V_i )h(s)dF(s)} }{\int {\mathbf{I}(U_i \le s\le V_i )dF(s)} }} \right] } , \end{aligned}$$

for any \(h\in Q\). Its expectation is defined as

$$\begin{aligned} \Psi (F)[h]\equiv E\left[ {h(T^{*})-\frac{\int {\mathbf{I}(U^{*}\le s\le V^{*})h(s)dF(s)} }{\int {\mathbf{I}(U^{*}\le s\le V^{*})dF(s)} }} \right] . \end{aligned}$$

Consider \(\Psi _n (F)[h]\) as a random function defined on \(Q\). Accordingly, consider a random map \(\Theta \rightarrow l^{\infty }(Q)\), defined by \(F\mapsto \Psi _n (F)[\cdot ]\). Then, the equation \(\Psi _n (F)[\cdot ]=0\) serves as an estimating equation taking values in \(l^{\infty }(Q)\). It follows that the NPMLE is the Z-estimator satisfying \(\Psi _n (\hat{{F}})[\cdot ]=0\) (van der Vaart and Wellner 1996, p. 309). In the following, we assume that the regularity conditions for the asymptotic theory of Z-estimators hold, namely the asymptotic approximation condition, the Fréchet differentiability of the map, and the invertibility of the derivative map.

Then, one can write

$$\begin{aligned} 0=n^{1/2}\Psi _n (\hat{{F}})[h]=n^{1/2}\Psi _n (F)[h]+n^{1/2}\dot{\Psi }_F (\hat{{F}}-F)[h]+o_P (1), \end{aligned}$$
(5)

where \(\dot{\Psi }_F (\hat{{F}}-F)[h]\) is the derivative of \(\Psi (F)[h]\) at \(F\) in the direction \(\hat{{F}}-F\). It follows from the form of \(\Psi (F)[\cdot ]\) that

$$\begin{aligned} \dot{\Psi }_F (\hat{{F}}-F)[h]=\frac{d}{dt}\Psi \{\;F+t(\hat{{F}}-F)\;\}[h]|_{t=0} =-\int {\sigma _F (x)[h]d(\hat{{F}}-F)(x)} . \end{aligned}$$
(6)

It follows from Eqs. (5) and (6) that the NPMLE satisfies the asymptotic linear expression

$$\begin{aligned}&\sqrt{n}\int {\sigma _F (x)[h]d(\hat{{F}}-F)(x)} \nonumber \\&\quad =\frac{1}{\sqrt{n}}\sum _{i=1}^n {\left[ {h(T_i )-\frac{\int {\mathbf{I}(U_i \le s\le V_i )h(s)dF(s)} }{\int {\mathbf{I}(U_i \le s\le V_i )dF(s)} }} \right] } +o_P (1), \end{aligned}$$
(7)

where the right-hand side converges weakly to a mean-zero Gaussian process with covariance structure

$$\begin{aligned}&E\left[ {h(T^{*})-\frac{\int {\mathbf{I}(U^{*}\le s\le V^{*})h(s)dF(s)} }{\int {\mathbf{I}(U^{*}\le s\le V^{*})dF(s)} }} \right] \left[ {{h}'(T^{*})-\frac{\int {\mathbf{I}(U^{*}\le s\le V^{*}){h}'(s)dF(s)} }{\int {\mathbf{I}(U^{*}\le s\le V^{*})dF(s)} }} \right] \\&\quad =\int {\sigma _F (x)[h]{h}'(x)dF(x)} , \end{aligned}$$

for bounded functions \(h\) and \({h}'\). The desired weak convergence of \(\sqrt{n}(\;\hat{{F}}(t)-F(t)\;)\) is obtained by setting \(h=\sigma _F^{-1} (w_t )\) in Eq. (7).

Appendix B2: Proof of \(\sum _{j=1}^n {w_s (T_j )\hat{{\sigma }}_F^{-1} (w_t )(T_j )\hat{{f}}_j } =\mathbf{W}_s^\mathrm{T} \left\{ {\frac{i_n ({\hat{\mathbf{f}}})}{n}} \right\} ^{-1}\mathbf{W}_t \)

Using the self-consistency equation \(1/\hat{{f}}_j =\sum _{i=1}^n {J_{ij} /\hat{{F}}_i } \), it follows that

$$\begin{aligned} \hat{{\sigma }}_F (T_j )[h]=\frac{1}{n}\sum _{i=1}^n {J_{ij} \left\{ {\frac{h_j }{\hat{{F}}_i }-\frac{1}{\hat{{F}}_i^2 }\sum _{k=1}^n {J_{ik} h_k \hat{{f}}_k } } \right\} }=\frac{1}{n}\left[ {\frac{h_j \hat{{f}}_j }{\hat{{f}}_j^2 }-\sum _{i=1}^n {\sum _{k=1}^n {\frac{J_{ij} J_{ik} }{\hat{{F}}_i^2 }h_k \hat{{f}}_k } } } \right] . \end{aligned}$$
(8)

Note that

$$\begin{aligned} J^{\mathrm{T}}\hbox {diag}\left( {\frac{1}{\mathbf{F}^{2}}} \right) J=\left[ {{\begin{array}{c@{\quad }c@{\quad }c} {\sum _{i=1}^n {\frac{J_{i1} J_{i1} }{F_i^2 }} }&{} \cdots &{} {\sum _{i=1}^n {\frac{J_{i1} J_{in} }{F_i^2 }} } \\ \vdots &{} \ddots &{} \vdots \\ {\sum _{i=1}^n {\frac{J_{in} J_{i1} }{F_i^2 }} }&{} \cdots &{} {\sum _{i=1}^n {\frac{J_{in} J_{in} }{F_i^2 }} } \\ \end{array} }} \right] . \end{aligned}$$

Hence, Eq. (8), applied with \(h\) chosen such that \(\hat{{\sigma }}_F (x)[h]=w_t (x)=\mathbf{I}(x\le t)\), yields

$$\begin{aligned} \left[ {{\begin{array}{c} {w_t (T_1 )} \\ \vdots \\ {w_t (T_n )} \\ \end{array} }} \right]&= \frac{1}{n}\left[ {\left. {\left\{ {\hbox {diag}\left( {\frac{1}{{\hat{\mathbf{f}}}^{2}}} \right) -J^{\mathrm{T}}\hbox {diag}\left( {\frac{1}{{\hat{\mathbf{F}}}^{2}}} \right) J} \right\} } \right| _{\hat{{f}}_n =1-\mathbf{1}_{n-1}^\mathrm{T} {\hat{\mathbf{f}}}} } \right] \left[ {{\begin{array}{c} {h_1 \hat{{f}}_1 } \\ \vdots \\ {h_n \hat{{f}}_n } \\ \end{array} }} \right] \\&= \frac{1}{n}\left[ {\left. {\left\{ {\hbox {diag}\left( {\frac{1}{{\hat{\mathbf{f}}}^{2}}} \right) -J^{\mathrm{T}}\hbox {diag}\left( {\frac{1}{{\hat{\mathbf{F}}}^{2}}} \right) J} \right\} } \right| _{\hat{{f}}_n =1-\mathbf{1}_{n-1}^\mathrm{T} {\hat{\mathbf{f}}}} } \right] D^{\mathrm{T}}\left[ {{\begin{array}{c} {h_1 \hat{{f}}_1 } \\ \vdots \\ {h_{n-1} \hat{{f}}_{n-1} } \\ \end{array} }} \right] , \end{aligned}$$

where the last equation uses the constraint \(\sum _{j=1}^n {h_j \hat{{f}}_j } =0\). Multiplying both sides by \(D\) and inverting the information matrix, we obtain

$$\begin{aligned} \left[ {{\begin{array}{c} {\hat{{\sigma }}_F^{-1} (w_t )(T_1 )\hat{{f}}_1 } \\ \vdots \\ {\hat{{\sigma }}_F^{-1} (w_t )(T_{n-1} )\hat{{f}}_{n-1} } \\ \end{array} }} \right] =\left\{ {\frac{i_n ({\hat{\mathbf{f}}})}{n}} \right\} ^{-1}\left[ {{\begin{array}{c} {w_t (T_1 )-w_t (T_n )} \\ \vdots \\ {w_t (T_{n-1} )-w_t (T_n )} \\ \end{array} }} \right] . \end{aligned}$$

It follows that

$$\begin{aligned}&\sum _{j=1}^n {w_s (T_j )\hat{{\sigma }}_F^{-1} (w_t )(T_j )\hat{{f}}_j } =\sum _{j=1}^{n-1} {\{\;w_s (T_j )-w_s (T_n )\;\}\hat{{\sigma }}_F^{-1} (w_t )(T_j )\hat{{f}}_j } \\&\quad =\left[ {{\begin{array}{lll} {w_s (T_1 )-w_s (T_n )}&{} \cdots &{} {w_s (T_{n-1} )-w_s (T_n )\;} \\ \end{array} }} \right] \left\{ {\frac{i_n ({\hat{\mathbf{f}}})}{n}} \right\} ^{-1}\left[ {{\begin{array}{c} {w_t (T_1 )-w_t (T_n )} \\ \vdots \\ {w_t (T_{n-1} )-w_t (T_n )} \\ \end{array} }} \right] \\&\quad =\mathbf{W}_s^\mathrm{T} \left\{ {\frac{i_n ({\hat{\mathbf{f}}})}{n}} \right\} ^{-1}\mathbf{W}_t . \end{aligned}$$
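As a numerical companion to this identity, the Python sketch below (our illustration, reusing npmle from Appendix A) assembles \(i_n ({\hat{\mathbf{f}}})/n\) from the bracketed matrix in Eq. (8) and evaluates the quadratic form \(\mathbf{W}_s^\mathrm{T} \{i_n ({\hat{\mathbf{f}}})/n\}^{-1}\mathbf{W}_t \); dividing the result by \(n\) gives an estimate of \(\hbox {Cov}\{\hat{{F}}(s),\hat{{F}}(t)\}\).

```python
import numpy as np  # assumes npmle() from the sketch in Appendix A

def covariance_quadratic_form(U, T, V, s, t):
    """W_s' {i_n(f_hat)/n}^{-1} W_t: estimated asymptotic covariance of
    sqrt(n)(F_hat - F) at (s, t). Divide by n for Cov{F_hat(s), F_hat(t)}."""
    n = len(T)
    f = npmle(U, T, V)
    J = (U[:, None] <= T[None, :]) & (T[None, :] <= V[:, None])
    F = J @ f
    # Bracketed matrix in Eq. (8): diag(1/f^2) - J' diag(1/F^2) J
    M = np.diag(1.0 / f**2) - J.T @ np.diag(1.0 / F**2) @ J
    # D = [I_{n-1}, -1] encodes the constraint f_n = 1 - 1'f
    D = np.hstack([np.eye(n - 1), -np.ones((n - 1, 1))])
    info = D @ M @ D.T / n                                  # i_n(f_hat)/n
    W_s = (T[:-1] <= s).astype(float) - float(T[-1] <= s)   # w_s(T_j) - w_s(T_n)
    W_t = (T[:-1] <= t).astype(float) - float(T[-1] <= t)
    return W_s @ np.linalg.solve(info, W_t)
```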


Cite this article

Emura, T., Konno, Y. & Michimae, H. Statistical inference based on the nonparametric maximum likelihood estimator under double-truncation. Lifetime Data Anal 21, 397–418 (2015). https://doi.org/10.1007/s10985-014-9297-5


Keywords

  • Asymptotic variance
  • Bootstrap
  • Confidence band
  • Goodness-of-fit test
  • Survival analysis