Abstract
Popular stochastic measures of the distance between the density of the lifetimes and its estimator are the integrated square error (ISE) and the Hellinger distance (HD). In this paper, we focus on the right-censored model when the censoring indicators are missing at random. Based on two density estimators defined by Wang et al. (J Multivar Anal 100:835–850, 2009) and a new kernel estimator of the density, we establish the asymptotic normality of the ISE and HD for the proposed estimators. In addition, the uniform strong consistency of the new kernel estimator is discussed, and a simulation study is conducted to compare the finite-sample performance of the proposed estimators.
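To make the two distance measures concrete, the following is a minimal numerical sketch rather than the paper's procedure. It assumes complete (uncensored) data, a Gaussian kernel, and the standard unweighted forms \(\int (\widehat{f}-f)^2dt\) for the ISE and \(\int (\sqrt{\widehat{f}}-\sqrt{f})^2dt\) for the HD. The function names, the bandwidth and the grid-based integration are illustrative choices, and the paper's own definitions may carry a weight function.

```python
import numpy as np

def normal_pdf(t, mean=0.0, sd=1.0):
    """Density of N(mean, sd^2) evaluated at t."""
    z = (t - mean) / sd
    return np.exp(-0.5 * z ** 2) / (sd * np.sqrt(2.0 * np.pi))

def kde(grid, sample, h):
    """Gaussian kernel density estimate on `grid` with bandwidth h
    (complete data only; no censoring adjustment, by assumption)."""
    u = (grid[:, None] - sample[None, :]) / h
    return normal_pdf(u).mean(axis=1) / h

def ise(f_hat, f, grid):
    """Unweighted integrated square error, Riemann sum on an equispaced grid."""
    return np.sum((f_hat - f) ** 2) * (grid[1] - grid[0])

def hellinger(f_hat, f, grid):
    """Unweighted squared Hellinger-type distance, Riemann sum on the grid."""
    return np.sum((np.sqrt(f_hat) - np.sqrt(f)) ** 2) * (grid[1] - grid[0])

# Illustrative data: 500 draws from N(0, 1), bandwidth chosen by hand.
grid = np.linspace(-8.0, 9.0, 3401)
rng = np.random.default_rng(0)
sample = rng.normal(size=500)
f_true = normal_pdf(grid)
f_hat = kde(grid, sample, h=0.4)
print(ise(f_hat, f_true, grid), hellinger(f_hat, f_true, grid))
```

Both quantities are random because they depend on the sample through \(\widehat{f}\); the asymptotic normality results concern their fluctuations around a centering term.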
References
Beran RJ (1977) Minimum Hellinger distance estimates for parametric models. Ann Stat 5:445–463
Chatrabgoun O, Parham G, Chinipardaz R (2017) A Legendre multiwavelets approach to copula density estimation. Stat Papers 58(3):673–690
Dinse GE (1982) Nonparametric estimation for partially-complete time and type of failure data. Biometrics 38:417–431
Hall P (1984) Central limit theorem for integrated square error of multivariate nonparametric density estimators. J Multivar Anal 14:1–16
Hosseinioun N, Doosti H, Nirumand HA (2012) Nonparametric estimation of the derivatives of a density by the method of wavelet for mixing sequences. Stat Papers 53(1):195–203
Jomhoori S, Fakoor V, Azarnoosh H (2012) Central limit theorem for ISE of kernel density estimators in censored dependent model. Commun Stat Theory Methods 41:1334–1349
Koul H, Susarla V, van Ryzin J (1981) Regression analysis with randomly right censored data. Ann Stat 9:1276–1288
Liang HY, Liu AA (2013) Kernel estimation of conditional density with truncated, censored and dependent data. J Multivar Anal 120:40–58
Liang HY, Baek JI (2016) Asymptotic normality of conditional density estimation with left-truncated and dependent data. Stat Papers 57(1):1–20
Little RJA, Rubin DB (1987) Statistical analysis with missing data. Wiley, New York
Liu W, Lu X (2011) Empirical likelihood for density-weighted average derivatives. Stat Papers 52(2):391–412
Luo S, Zhang CY (2016) Nonparametric M-type regression estimation under missing response data. Stat Papers 57(3):641–664
McKeague IW, Subramanian S (1998) Product-limit estimators and Cox regression with missing censoring information. Scand J Stat 25(4):589–601
Nguyen TT, Tsoy Y (2017) A kernel PLS based classification method with missing data handling. Stat Papers 58(1):211–225
Robins JM, Rotnitzky A (1992) Recovery of information and adjustment for dependent censoring using surrogate markers. In: Jewell N, Dietz K, Farewell V (eds) AIDS epidemiology: methodological issues. Birkhäuser, Boston, pp 237–331
Satten GA, Datta S (2001) The Kaplan-Meier estimator as an inverse-probability-of-censoring weighted average. Am Stat 55:207–210
Sun LQ, Zheng ZG (1999) The asymptotics of the integrated square error for the kernel hazard rate estimators with left truncated and right censored data. Syst Sci Math Sci 12(3):251–262
Subramanian S (2004) Asymptotically efficient estimation of a survival function in the missing censoring indicator model. J Nonparametr Stat 16(5):797–817
Subramanian S (2006) Survival analysis for the missing censoring indicator model using kernel density estimation techniques. Stat Methodol 3(2):125–136
Tenreiro C (2001) On the asymptotic behavior of the integrated square error of kernel density estimators with data-dependent bandwidth. Stat Probab Lett 53:283–292
van der Laan MJ, McKeague IW (1998) Efficient estimation from right-censored data when failure indicators are missing at random. Ann Stat 26:164–182
Wang Q, Ng KW (2008) Asymptotically efficient product-limit estimators with censoring indicators missing at random. Stat Sin 18:749–768
Wang Q, Liu W, Liu C (2009) Probability density estimation for survival data with censoring indicators missing at random. J Multivar Anal 100:835–850
Wied D, Weißbach R (2012) Consistency of the kernel density estimator: a survey. Stat Papers 53(1):1–21
Xu HX, Fan GL, Liang HY (2017) Hypothesis test on response mean with inequality constraints under data missing when covariables are present. Stat Papers 58(1):53–75
Yang H, Liu H (2016) Penalized weighted composite quantile estimators with missing covariates. Stat Papers 57(1):69–88
Zamini R, Fakoor V, Sarmad M (2015) On estimation of a density function in multiplicative censoring. Stat Papers 56(3):661–676
Zhang B (1998) A note on the integrated square errors of kernel density estimators under random censorship. Stoch Process Their Appl 75:225–234
Zhao L, Wu C (2001) Central limit theorem for integrated square error of kernel estimators of spherical density. Sci China Ser A Math 44(4):474–483
Zou YY, Liang HY (2014) Global \(L_2\) error of wavelet density estimator with truncated and strong mixing observations. Int J Wavelets Multiresolut Inf Process 12(3):1450033, 1-12
Zou YY, Liang HY, Zhang JJ (2015) Nonlinear wavelet density estimation with data missing at random when covariates are present. Metrika 78(8):967–995
Zou YY, Liang HY (2017a) Convergence rate of wavelet density estimator with data missing randomly when covariables are present. Commun Stat Theory Methods 46(2):1007–1023
Zou YY, Liang HY (2017b) Wavelet estimation of density for censored data with censoring indicator missing at random. Statistics 51(6):1214–1237
Acknowledgements
The first author was supported by the Major Research Plan of the National Social Science Foundation of China (18ZD05). The second author was supported by the National Natural Science Foundation of China (11671299) and the National Social Science Foundation of China (17BTJ032).
Appendix
In this section, we give some preliminary lemmas that are used in Sect. 4.
Lemma 5.1
(Hall (1984), Theorem 1, p. 3) Let \(X_1,\ldots ,X_n\) be i.i.d. random variables. Assume \(\widetilde{H}_n(x,y)\) is symmetric, \(E[\widetilde{H}_n(X_1,X_2)|X_1]=0\) almost surely and \(E\widetilde{H}_n^2(X_1,X_2)<\infty \) for each \(n\). Define \(\widetilde{G}_n(x,y)=E[\widetilde{H}_n(X_1,x)\widetilde{H}_n(X_1,y)]\). If
\[\frac{E\widetilde{G}_n^2(X_1,X_2)+n^{-1}E\widetilde{H}_n^4(X_1,X_2)}{[E\widetilde{H}_n^2(X_1,X_2)]^2}\rightarrow 0 \quad \text{as } n\rightarrow \infty ,\]
then \(U_n=\sum _{1\le i<j \le n}\widetilde{H}_n(X_i,X_j)\) is asymptotically normally distributed with zero mean and variance given by \(\frac{1}{2}n^2E\widetilde{H}_n^2(X_1,X_2)\).
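As an illustration of the objects in Lemma 5.1, the sketch below builds one concrete symmetric, degenerate kernel and evaluates \(U_n\). It is a toy setting and not the kernel used in the proofs: the data are assumed to be N(0, 1) and the smoothing kernel Gaussian with bandwidth h, so that \(\widetilde{H}(x,y)=\int [K_h(t-x)-EK_h(t-X)][K_h(t-y)-EK_h(t-X)]dt\) has a closed form. The names `h_tilde`, `u_statistic` and `conditional_mean` are hypothetical helpers.

```python
import numpy as np

def normal_pdf(x, sd):
    """Density of N(0, sd^2) evaluated at x."""
    return np.exp(-0.5 * (x / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def h_tilde(x, y, h):
    """Closed form of int [K_h(t-x) - E K_h(t-X)] [K_h(t-y) - E K_h(t-X)] dt
    for a Gaussian kernel K_h and X ~ N(0, 1) (illustrative assumptions)."""
    s = np.sqrt(1.0 + 2.0 * h ** 2)
    c = normal_pdf(0.0, np.sqrt(2.0 * (1.0 + h ** 2)))
    return (normal_pdf(x - y, np.sqrt(2.0) * h)
            - normal_pdf(x, s) - normal_pdf(y, s) + c)

def u_statistic(sample, h):
    """U_n = sum over pairs i < j of H(X_i, X_j)."""
    pairwise = h_tilde(sample[:, None], sample[None, :], h)
    return np.triu(pairwise, k=1).sum()

def conditional_mean(x, h):
    """Numerical check of degeneracy: E[H(x, X_2)] for X_2 ~ N(0, 1)."""
    grid = np.linspace(-10.0, 10.0, 20001)
    step = grid[1] - grid[0]
    return np.sum(h_tilde(x, grid, h) * normal_pdf(grid, 1.0)) * step

rng = np.random.default_rng(1)
n, h = 400, 0.3
# Monte Carlo estimate of E H^2(X_1, X_2) from independent pairs.
x1, x2 = rng.normal(size=(2, 200_000))
eh2 = np.mean(h_tilde(x1, x2, h) ** 2)
# U_n standardised by the variance (1/2) n^2 E H^2 stated in the lemma.
z = u_statistic(rng.normal(size=n), h) / np.sqrt(0.5 * n ** 2 * eh2)
print(z)
```

Repeating the last two lines over many independent samples, with h shrinking as n grows, gives an empirical view of the normal approximation. For a fixed kernel that does not depend on n, such as \(\widetilde{H}(x,y)=xy\), the ratio in the lemma does not tend to zero and the limit of \(U_n\) is not normal.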
Lemma 5.2
Let \(V_n(X_i)=\frac{1}{nb_{n}}\sum _{j=1}^n\xi _j\Omega (\frac{X_i-X_j}{b_{n}})\) and \(V(X_i)=\pi (X_i)h(X_i)\). If (A2), (A4), (A5) and (A7) hold, then
(a) \(\max _{1\le i\le n}|V_n(X_i)-V(X_i)|I(X_i\le \tau )=O(\gamma _{n})~a.s.\)
(b) \(\max _{1\le i\le n}|\widehat{m}_n(X_i)-m(X_i)|I(X_i\le \tau )=O(\gamma _{n})~a.s.\)
(c) \(\max _{1\le i\le n}|\widehat{G}_n(X_i-)-G(X_i-)|I(X_i\le \tau )=O(\gamma _{n})~a.s.\)
(d) \(\max _{1\le i\le n}|\widehat{\pi }_n(X_i)-\pi (X_i)|I(X_i\le \tau )=O(\eta _{n})~a.s.\)
Proof
Following the proof of Lemma A.3 in Zou and Liang (2017b), one can prove this lemma. \(\square \)
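The quantity \(V_n(\cdot )\) in Lemma 5.2 can be computed directly from its definition. The sketch below is illustrative only: it reads \(\xi _j\) as the indicator that the censoring indicator \(\delta _j\) is observed (the usual convention in the missing-at-random literature, assumed here), and it uses a Gaussian function as a stand-in for \(\Omega \), whereas the paper's conditions may call for a different, possibly higher-order, kernel. The simulated data and the bandwidth are hypothetical.

```python
import numpy as np

def gaussian_kernel(u):
    """Stand-in for the kernel Omega (an assumption, not the paper's choice)."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def v_n(x_eval, x, xi, b_n, omega=gaussian_kernel):
    """V_n(x) = (n b_n)^{-1} sum_j xi_j Omega((x - X_j) / b_n),
    evaluated at every point of x_eval."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    u = (x_eval[:, None] - x[None, :]) / b_n
    return (xi[None, :] * omega(u)).sum(axis=1) / (len(x) * b_n)

# Hypothetical data: observed times and missingness indicators.
rng = np.random.default_rng(2)
n = 2000
x = rng.exponential(size=n)
xi = rng.binomial(1, 0.7, size=n)   # 1 if delta_j is observed (assumed rate 0.7)
print(v_n(x, x, xi, b_n=0.2)[:5])   # V_n(X_i) for the first five observations
```

Part (a) of the lemma bounds the maximal deviation of these values from \(V(X_i)=\pi (X_i)h(X_i)\) over the observations with \(X_i\le \tau \).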
Lemma 5.3
Suppose that (A2), (A4), (A5) and (A7) hold, then
where \(\max _{1\le i\le n}|R_{n}(X_i)I(X_i\le \tau )|=o_p(n^{-1/2})\) and
Proof
Following the line of proof of Theorem 3.2 in Wang and Ng (2008), one can verify Lemma 5.3. \(\square \)
Lemma 5.4
If (A3), (A5) and (A7) hold, then we have \( \widehat{f}_R(t):=\bar{f}_R(t)+\widetilde{f}_R(t)+r_{1n}(t)+r_{2n}(t), \) where \(P_1(t)=[\pi (t)(1-G(t))]^{-1}\) and
Proof
From the definition of \(\widehat{f}_R(t)\) in Sect. 2, one can rewrite
From the definition of \(\widehat{m}_n(\cdot )\) in Sect. 2, we have
and denote the first term as \(D_{2n}(t)\), then we have
Collecting the results above, this lemma is proved. \(\square \)
Lemma 5.5
With \(r_{1n}(t)\) as defined in Lemma 5.4, if (A0) and (A2)–(A8) hold, then \(\int r_{1n}^2(t)w(t)dt=o_p(n^{-1/2}h_n^2).\)
Proof
From the definition of \(r_{1n}(t)\) in Lemma 5.4, write \(r_{1n}(t):=\sum _{l=1}^5r_{1nl}(t)\). By the independence of \(\{X_j,\delta _j,\xi _j\}\), \(j=1,\ldots ,n\), the MAR assumption and (A5)–(A7), we have
Similarly, we have \(\int E[r_{1n2}(t)]^2w(t)dt=O(n^{-1}b_n^{2r}h_n^{-(2r+1)})\).
In order to evaluate \(r_{1n3}(t)\), let \(U_j=\xi _j(\delta _j-m(X_j))\) and
it is easy to verify \(E[U_j^2 Y_{ij}^2]=O(h_nb_n)\), then
For \(r_{1n4}(t)\), we write
Using the Taylor expansion, it follows that \(\int E[r_{1n4}^{(1)}(t)]^2w(t)dt=O(b_n^{2r})\).
Let \(A_i=\frac{1}{V(X_i)[1-G(X_i)]}K(\frac{t-X_i}{h_{n}})\) and
It can be verified that \(E[A_i^2B_{ij}^2]=O(h_nb_n^3)\) and \(E[A_{i_1}A_{i_2}B_{i_1j}B_{i_2j}]=O(h_n^2b_n^4)\). Then
For \(r_{1n5}(t)\), we observe
Recalling the definition of \(V_n(\cdot )\) in Lemma 5.2, it follows that
If \(i=j\), it is easy to verify that \(r_{1n5}^{(11)}(t)=0\), \(\int E[r_{1n5}^{(12)}(t)]^2w(t)dt=O(\gamma _n^2n^{-3}b_n^{-2} h_n^{-1})\) and \(\int E[r_{1n5}^{(13)}(t)]^2w(t)dt=O(n^{-5}b_n^{-4}h_n^{-1}).\)
If \(i\ne j\), we have \(\int E[r_{1n5}^{(11)}(t)]^2w(t)dt=O(\gamma _n^2b_n^{2r}h_n^{-1})\). Let \(S_i=\frac{ V(X_i)-\frac{1}{nb_n}\sum _{k\ne j}^n\xi _k\Omega (\frac{X_i-X_k}{b_n})}{V^2(X_i)[1-G(X_i)]}K\big (\frac{t-X_i}{h_{n}}\big )\),
It can be verified that \(E[S_i^2Y_{ij}^2]=O(\gamma _n^2h_nb_n)\) and \(E[S_{i_1}S_{i_2}Y_{i_1j}Y_{i_2j}]=O(\gamma _n^2h_nb_n^2)\). Then
Similarly, \(\int E[r_{1n5}^{(13)}(t)]^2w(t)dt=O(n^{-2}b_n^{-2}).\) From Lemma 5.2 (a), one gets \(\int E[r_{1n5}^{(2)}(t)]^2w(t)dt=O(\gamma _n^4).\) Collecting the results above and using (A8), we obtain
Lemma 5.6
Under (A3), (A5) and (A7), we have \( \widehat{f}_I(t):=\bar{f}_I(t)+\widetilde{f}_I(t)+s_{1n}(t)+s_{2n}(t), \) where
Proof
Following the same line of proof as for Lemma 5.4, one can verify Lemma 5.6. \(\square \)
Lemma 5.7
By a simple calculation, we have \(\widehat{f}_W(t):=\bar{f}_W(t)+\widetilde{f}_W(t)+v_{1n}(t)+v_{2n}(t),\) where
Proof
Lemma 5.7 follows from a simple computation, so we omit the details. \(\square \)
Zou, YY., Liang, HY. CLT for integrated square error of density estimators with censoring indicators missing at random. Stat Papers 61, 2685–2714 (2020). https://doi.org/10.1007/s00362-018-01065-9