Partitioning estimation of local variance based on nearest neighbors under censoring


In a nonparametric and heteroscedastic setting, we study estimation of the local variance when the response variable is subject to right censoring. The proposed partitioning estimators of the local variance, based on the first and second nearest neighbors, involve transformations of the observed censoring times that use their estimated survival functions. Consistency and rates of convergence of the estimators are proved. Moreover, local variance estimation is illustrated on real survival data.



  1. Brown LD, Levine M (2007) Variance estimation in nonparametric regression via the difference sequence method. Ann Stat 35:2219–2232

  2. Buckley J, James I (1979) Linear regression with censored data. Biometrika 66:429–436

  3. Cai T, Levine M, Wang L (2009) Variance function estimation in multivariate nonparametric regression with fixed design. J Multivar Anal 100:126–136

  4. Cox DR (1972) Regression models and life-tables. J R Stat Soc B 34:187–220

  5. Evans D (2005) Estimating the variance of multiplicative noise. In: 18th international conference on noise and fluctuations. ICNF, in AIP conference proceedings, vol 780, pp 99–102

  6. Evans D, Jones A (2008) Non-parametric estimation of residual moments and covariance. Proc R Soc Lond Ser A Math Phys Eng Sci 464:2831–2846

  7. Fan J, Gijbels I (1994) Censored regression: local linear approximations and their applications. J Am Stat Assoc 89(426):560–570

  8. Ferrario PG (2013) Local variance estimation for uncensored and censored observations. Springer Vieweg Verlag, Wiesbaden

  9. Ferrario PG, Walk H (2012) Nonparametric partitioning estimation of residual and local variance based on first and second nearest neighbors. J Nonparametric Stat 24:1019–1039

  10. Györfi L, Kohler M, Krzyżak A, Walk H (2002) A distribution-free theory of nonparametric regression. Springer Series in Statistics, Springer, New York

  11. Hall P, Carroll RJ (1989) Variance function estimation in regression: the effect of estimating the mean. J R Stat Soc Ser B 51:3–14

  12. Härdle W, Tsybakov A (1997) Local polynomial estimators of the volatility function in nonparametric autoregression. J Econom 81:223–242

  13. Kaplan EL, Meier P (1958) Nonparametric estimation from incomplete observations. J Am Stat Assoc 53:457–481

  14. Kohler M (2006) Nonparametric regression with additional measurement errors in the dependent variable. J Stat Plan Inference 136:3339–3361

  15. Kohler M, Krzyżak A, Walk H (2006) Rates of convergence for partitioning and nearest neighbor regression estimates with unbounded data. J Multivar Anal 97:311–323

  16. Koul H, Susarla V, Van Ryzin J (1981) Regression analysis with random rightly-censored data. Ann Stat 9:1276–1288

  17. Lemdani M, Saïd EU (2015) Nonparametric robust regression estimation for censored data. Stat Papers 1:1–21

  18. Li J, Zheng M (2009) Robust estimation of multivariate regression model. Stat Papers 50:81–100

  19. Liitiäinen E, Corona F, Lendasse A (2007) Non-parametric residual variance estimation in supervised learning. In: IWANN’07 Proceedings of the 9th international work-conference on artificial neural networks. Lecture Notes in Computer Science: Computational and Ambient Intelligence, vol 4507, pp 63–71

  20. Liitiäinen E, Corona F, Lendasse A (2008) On nonparametric residual variance estimation. Neural Process Lett 28:155–167

  21. Liitiäinen E, Corona F, Lendasse A (2010) Residual variance estimation using a nearest neighbor statistic. J Multivar Anal 101:811–823

  22. Loprinzi CL, Laurie JA, Wieand HS, Krook JE, Novotny PJ, Kugler JW, Bartel J, Law M, Bateman M, Klatt NE et al, North Central Cancer Treatment Group (1994) Prospective evaluation of prognostic variables from patient-completed questionnaires. J Clin Oncol 12(3):601–607

  23. Luo S, Zhang C-Y (2015) Nonparametric M-type regression estimation under missing response data. Stat Papers 1–23

  24. Mathe K (2006) Regressionanalyse mit zensierten Daten. PhD Thesis. Institute of Stochastics and Applications, Universität Stuttgart

  25. Müller HG, Stadtmüller U (1987) Estimation of heteroscedasticity in regression analysis. Ann Stat 15:610–625

  26. Müller HG, Stadtmüller U (1993) On variance function estimation with quadratic forms. J Stat Plan Inference 35:213–231

  27. Munk A, Bissantz N, Wagner T, Freitag G (2005) On difference based variance estimation in nonparametric regression when the covariate is high dimensional. J R Stat Soc Ser B 67:19–41

  28. Neumann M (1994) Fully data-driven nonparametric variance estimators. Statistics 25:189–212

  29. Ruppert D, Wand M, Holst U, Hössjer O (1997) Local polynomial variance-function estimation. Technometrics 39:262–273

  30. Spokoiny V (2002) Variance estimation for high-dimensional regression models. J Multivar Anal 82:111–133

  31. Steele JM (1986) An Efron-Stein inequality for nonsymmetric statistics. Ann Stat 14:753–758

  32. Strobel M (2008) Estimation of minimum mean squared error with variable metric from censored observations. PhD Thesis. Institute of Stochastics and Applications, Universität Stuttgart

  33. Wang L, Brown LD, Cai T, Levine M (2008) Effect of mean on variance function estimation in nonparametric regression. Ann Stat 36:646–664


The author gratefully acknowledges many helpful suggestions of two anonymous referees. Moreover, the author wishes to express her gratitude to Prof. em. Dr. Harro Walk and Dr. Maik Döring for essential help and support.

Author information



Corresponding author

Correspondence to Paola Gloria Ferrario.



This appendix collects some technical lemmas.

The first three lemmas concern the conditional expectations in the context of first and second nearest neighbors.

Lemma 1

It holds that

$$\begin{aligned} {\mathbf E}\left\{ \frac{\delta _i T_i}{G(T_i)}\frac{\delta _i^{'} T_i^{'}}{G(T_i^{'})}\Bigg |X_i\right\} ={\mathbf E}\left\{ Y_i Y_i^{'}|X_i\right\} \quad (i=1,\dots ,n). \end{aligned}$$


Let \(N_i'\) denote the index of the first nearest neighbor of \(X_i.\) Consider that

$$\begin{aligned}&{\mathbf E}\left\{ \frac{\delta _i T_i}{G(T_i)}\frac{\delta _i^{'} T_i^{'}}{G(T_i^{'})}\Bigg |X_i\right\} ={\mathbf E}\left\{ \sum _{l\in \{1,\dots ,n\}\setminus \{i\}}\frac{\delta _i T_i}{G(T_i)}\frac{\delta _l T_l}{G(T_l)}1_{\{N_i'=l\}}\Bigg |X_i\right\} \\= & {} \sum _{l\in \{1,\dots ,n\}\setminus \{i\}}{\mathbf E}\left\{ \frac{\delta _i T_i}{G(T_i)}\bigg |X_i\right\} {\mathbf E}\left\{ \frac{\delta _l T_l}{G(T_l)}\bigg |X_l\right\} 1_{\{N_i'=l\}}\\&\text { (by the independence assumption)}\\= & {} {\mathbf E}\{Y_i|X_i\}\sum _{l\in \{1,\dots ,n\}\setminus \{i\}}{\mathbf E}\{Y_l|X_l\}1_{\{N_i'=l\}}\\&\text {(the latter by (3), twice applied)}\\= & {} \sum _{l\in \{1,\dots ,n\}\setminus \{i\}}{\mathbf E}\left\{ Y_i Y_l1_{\{N_i'=l\}}|X_i\right\} ={\mathbf E}\left\{ Y_i Y_i^{'}|X_i\right\} . \end{aligned}$$

\(\square \)

Arguing as in the above lemma, one obtains the following two lemmas.

Lemma 2

It holds that

$$\begin{aligned} {\mathbf E}\left\{ \frac{\delta _i T_i}{G(T_i)}\frac{\delta _i^{''} T_i^{''}}{G(T_i^{''})}\Bigg |X_i\right\} ={\mathbf E}\left\{ Y_i Y_i^{''}|X_i\right\} \quad (i=1,\dots ,n). \end{aligned}$$

Lemma 3

It holds that

$$\begin{aligned} {\mathbf E}\left\{ \frac{\delta _i^{'} T_i^{'}}{G(T_i^{'})}\frac{\delta _i^{''} T_i^{''}}{G(T_i^{''})}\Bigg |X_i\right\} ={\mathbf E}\left\{ Y_i^{'} Y_i^{''}|X_i\right\} \quad (i=1,\dots ,n). \end{aligned}$$

The proofs of both lemmas are analogous to that of Lemma 1 and are therefore omitted.
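The identity of Lemma 1 can be checked numerically. The following Python sketch is only an illustration under assumptions that are not taken from the paper: a one-dimensional covariate, a bounded response, exponentially distributed censoring times independent of \((X,Y)\), and a known survival function \(G(t)=e^{-t}\). The helper names `first_nn_index_1d` and `lemma1_check` are hypothetical. It compares the sample means of \(\frac{\delta _i T_i}{G(T_i)}\frac{\delta _i^{'} T_i^{'}}{G(T_i^{'})}\) and of \(Y_iY_i^{'}\), which should agree up to Monte Carlo error.

```python
import numpy as np

def first_nn_index_1d(x):
    """Index of the first nearest neighbor of each x[i] among the other points
    (one-dimensional data, ties assumed absent)."""
    order = np.argsort(x)
    xs = x[order]
    n = len(xs)
    gaps = np.diff(xs)
    left = np.concatenate(([np.inf], gaps))   # distance to the left neighbor
    right = np.concatenate((gaps, [np.inf]))  # distance to the right neighbor
    pos = np.arange(n)
    nn_sorted = np.where(left <= right, pos - 1, pos + 1)
    nn = np.empty(n, dtype=int)
    nn[order] = order[nn_sorted]              # map sorted positions back to original indices
    return nn

def lemma1_check(n=100_000, seed=0):
    """Return the sample means of the transformed product and of Y_i * Y_i'."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, n)
    y = 0.5 + 0.5 * x + 0.25 * rng.uniform(0.0, 1.0, n)  # bounded response (assumed model)
    c = rng.exponential(1.0, n)                           # censoring times, independent of (X, Y)
    t = np.minimum(y, c)                                  # observed time T = min(Y, C)
    delta = (y <= c).astype(float)                        # censoring indicator
    g = np.exp(-t)                                        # known G(t) = P{C > t} (assumption)
    y_tilde = delta * t / g                               # transformed observation
    nn = first_nn_index_1d(x)
    return np.mean(y_tilde * y_tilde[nn]), np.mean(y * y[nn])

lhs, rhs = lemma1_check()
print(lhs, rhs)
```

In this sketch \(G\) is known; with an estimated \(G\) the agreement would hold only approximately.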

Definition 1

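The Kaplan–Meier estimators (Kaplan and Meier 1958) of the survival functions \(F(t)={\mathbf P}\{Y>t\}\) and \(G(t)={\mathbf P}\{C>t\}\) are given by

$$\begin{aligned} F_n(t)= & {} {\left\{ \begin{array}{ll} \prod \limits _{i:\,T_{(i)}\le t}\left( \frac{n-i}{n-i+1}\right) ^{\delta _{(i)}} &{} \text {if } t\le T_{(n)},\\ 0 &{} \text {otherwise,} \end{array}\right. }\\ G_n(t)= & {} {\left\{ \begin{array}{ll} \prod \limits _{i:\,T_{(i)}\le t}\left( \frac{n-i}{n-i+1}\right) ^{1-\delta _{(i)}} &{} \text {if } t\le T_{(n)},\\ 0 &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$

respectively, where \((T_{(1)},\delta _{(1)}),\dots ,(T_{(n)},\delta _{(n)})\) are the n pairs of observed \((T_i,\delta _i)\) set in increasing order.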

One has \(T_K\ge T_{(n)}.\)
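As a rough illustration of Definition 1, the following Python sketch computes product-limit estimates in the usual ordered-sample form: the i-th ordered observation contributes the factor \((n-i)/(n-i+1)\), raised to \(\delta _{(i)}\) for the survival function of Y and to \(1-\delta _{(i)}\) for that of C, and the estimate is set to 0 beyond \(T_{(n)}\). The function name `kaplan_meier`, the flag `censoring`, the no-ties assumption and the convention beyond the largest observation are assumptions of this sketch.

```python
import numpy as np

def kaplan_meier(times, delta, censoring=False):
    """Return a function t -> product-limit estimate of a survival function.

    censoring=False estimates P{Y > t}; censoring=True estimates P{C > t}
    by swapping the roles of the indicators. Ties are assumed absent.
    """
    times = np.asarray(times, dtype=float)
    delta = np.asarray(delta, dtype=int)
    order = np.argsort(times, kind="stable")
    t_sorted = times[order]
    d_sorted = delta[order]
    if censoring:
        d_sorted = 1 - d_sorted
    n = len(t_sorted)
    i = np.arange(1, n + 1)
    factors = ((n - i) / (n - i + 1.0)) ** d_sorted
    # surv[k] is the product over the first k ordered observations (surv[0] = 1)
    surv = np.concatenate(([1.0], np.cumprod(factors)))

    def estimate(t):
        t = np.asarray(t, dtype=float)
        k = np.searchsorted(t_sorted, t, side="right")  # number of T_(i) <= t
        return np.where(t <= t_sorted[-1], surv[k], 0.0)

    return estimate

# Small example: four observations, the second one censored.
T = [1.0, 2.0, 3.0, 4.0]
D = [1, 0, 1, 1]
F_n = kaplan_meier(T, D)
G_n = kaplan_meier(T, D, censoring=True)
print(F_n([0.5, 1.0, 2.5, 3.0, 4.0, 5.0]))
print(G_n([1.0, 2.0, 4.0, 5.0]))
```

In the transformed observations \(\delta _iT_i/G(T_i)\), an unknown \(G\) would be replaced by such an estimate evaluated at the observed times.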

Lemma 4

Under the conditions of Theorem 1, \(~ \widehat{\widehat{\sigma }}_n^2(x)\) is consistent, i.e.,

$$\begin{aligned} \int |\widehat{\widehat{\sigma }}_n^2(x)-\sigma ^2(x)|\mu (dx)\mathop {\rightarrow }\limits ^{P} 0. \end{aligned}$$

If additionally X is assumed to be bounded, one has

$$\begin{aligned} \int |\widehat{\widehat{\sigma }}_n^2(x)-{\mathbf E}\widehat{\widehat{\sigma }}_n^2(x)|\mu (dx)= O\left( \sqrt{\frac{l_n}{n}}\right) . \end{aligned}$$

Proof of Lemma 4

Let \(c,~c_1\) be suitable constants. The variance of the estimator can be bounded by

$$\begin{aligned} {\mathbf {Var}}\left\{ \widehat{\widehat{\sigma }}_n^2(x)\right\} \le c\frac{1}{n\mu (A_n(x))}. \end{aligned}\tag{24}$$

In fact, it holds that

$$\begin{aligned} {\mathbf {Var}}\left\{ \widehat{\widehat{\sigma }}_n^2(x)\right\}\le & {} \frac{4}{n^2\mu (A_n(x))^2}\left[ {\mathbf {Var}}\left\{ {\sum _{i=1}^n}\frac{\delta _i T_i^21_{A_n(x)}(X_i)}{G(T_i)}\right\} \right. \\&+\,{\mathbf {Var}}\left\{ {\sum _{i=1}^n}\frac{\delta _i T_i}{G(T_i)}\frac{\delta _i^{'} T_i^{'}1_{A_n(x)}(X_i)}{G(T_i^{'})}\right\} \\&+\,{\mathbf {Var}}\left\{ {\sum _{i=1}^n}\frac{\delta _i T_i}{G(T_i)}\frac{\delta _i^{''} T_i^{''}1_{A_n(x)}(X_i)}{G(T_i^{''})}\right\} \\&\left. +\,{\mathbf {Var}}\left\{ {\sum _{i=1}^n}\frac{\delta _i^{'}T_i^{'}}{G(T_i^{'})}\frac{\delta _i^{''}T_i^{''}1_{A_n(x)}(X_i)}{G(T_i^{''})}\right\} \right] . \end{aligned}$$

Each of the four variances on the right-hand side is bounded by \(c_1n\mu (A_n(x)).\) We show this only for the second variance; the other three can be treated in the same way. We apply the Efron–Stein inequality in Steele’s version (Steele 1986), following the argument in the proof of Equation (12) in Ferrario and Walk (2012).

Let \(n\ge 2\) be fixed. Replacement of \((X_j,Y_j,C_j)\) by \((X^*_j,Y^*_j,C^*_j)\) for fixed \(j\in \{1,\dots ,n\}\) (where \((X_1,Y_1,C_1),\dots ,(X_n,Y_n,C_n),(X^*_1,Y^*_1,C^*_1),\dots ,(X^*_n,Y^*_n,C^*_n)\) are i.i.d.) leads, for fixed x,  from

$$\begin{aligned} U_n:={\sum _{i=1}^n}\frac{\delta _i T_i}{G(T_i)}\cdot \frac{\delta _i^{'}T_i^{'}}{G(T_i^{'})}1_{A_n(x)}(X_i) \end{aligned}$$

and \(N_i'\) to \(U^*_{n,j},~N_{i,j}^{'*} ~(i \in \{1,\dots ,n\}),\) respectively. By the Efron–Stein inequality,

$$\begin{aligned} {\mathbf {Var}}\left\{ U_n\right\} \le \frac{1}{2}\sum _{j=1}^n {\mathbf E}\left\{ |U_n-U_{n,j}^*|^2\right\} . \end{aligned}$$

With \(T_j^*:=\min \left\{ Y_j^*,C_j^*\right\} ,\) \(\delta ^*_j:=1_{\{Y_j^*\le C^*_j\}},\) and \({\widetilde{Y}}_j^*:=\frac{\delta _j^*T_j^*}{G(T_j^*)},\) noticing (10), we obtain

$$\begin{aligned} |U_n-U_{n,j}^*|\le B_{n,j}+C_{n,j}+D_{n,j}+E_{n,j}, \end{aligned}$$

where

$$\begin{aligned} B_{n,j}= & {} \sum _{l\in \{1,\dots ,n\}\setminus \{j\}} |{\widetilde{Y}}_j|1_{A_n(x)}(X_j) |{\widetilde{Y}}_l| 1_{\left\{ N'_j=l\right\} }\\\le & {} \frac{L^2}{G(L)^2}1_{A_n(x)}(X_j)\sum _{l\in \{1,\dots ,n\}\setminus \{j\}} 1_{\left\{ N'_j=l\right\} } =\frac{L^2}{G(L)^2}1_{A_n(x)}(X_j),\\ C_{n,j}= & {} \sum _{l\in \{1,\dots ,n\}\setminus \{j\}} |{\widetilde{Y}}_j^*|1_{A_n(x)}(X_j^*) |{\widetilde{Y}}_l| 1_{\left\{ N^{'*}_{j,j}=l\right\} }\\\le & {} \frac{L^2}{G(L)^2}1_{A_n(x)}(X_j^*)\sum _{l\in \{1,\dots ,n\}\setminus \{j\}} 1_{\left\{ N^{'*}_{j,j}=l\right\} } =\frac{L^2}{G(L)^2}1_{A_n(x)}(X_j^{*}),\\ D_{n,j}= & {} \sum _{i\in \{1,\dots ,n\}\setminus \{j\}} |{\widetilde{Y}}_i|1_{A_n(x)}(X_i) |{\widetilde{Y}}_j| 1_{\left\{ N'_i=j\right\} }\\\le & {} \frac{L^2}{G(L)^2}\sum _{i\in \{1,\dots ,n\}\setminus \{j\}} 1_{A_n(x)}(X_i)1_{\left\{ N'_i=j\right\} }, \\ E_{n,j}= & {} \sum _{i\in \{1,\dots ,n\}\setminus \{j\}} |{\widetilde{Y}}_i|1_{A_n(x)}(X_i) |{\widetilde{Y}}_j^*| 1_{\left\{ N^{'*}_{i,j}=j\right\} }\\\le & {} \frac{L^2}{G(L)^2}\sum _{i\in \{1,\dots ,n\}\setminus \{j\}} 1_{A_n(x)}(X_i)1_{\left\{ N^{'*}_{i,j}=j\right\} }. \end{aligned}$$

Then \(|U_n-U_{n,j}^*|^2\le 4\left( B^2_{n,j}+C^2_{n,j}+D^2_{n,j}+E^2_{n,j}\right) .\) By the Cauchy–Schwarz inequality applied to the sums in the bounds of \(D_{n,j}\) and \(E_{n,j},\) and noticing that

$$\begin{aligned} {\sum _{i=1}^n}1_{\left\{ N^{'}_{i}=j\right\} }\le \gamma _d \quad (j=1,\dots ,n) \end{aligned}$$

(the latter by Györfi et al. (2002), Corollary 6.1, already used above), we obtain

$$\begin{aligned} {\mathbf E}\left\{ |U_n-U_{n,j}^*|^2\right\} \le 4 \frac{L^4}{G(L)^4}\left[ 2 \mu (A_n(x))+2\gamma _d{\mathbf E}\left\{ \sum _{i\in \{1,\dots ,n\}}1_{A_n(x)}(X_i)1_{\left\{ N^{'}_{i}=j\right\} }\right\} \right] . \end{aligned}$$

Then, because of \(\sum _{j\in \{1,\dots ,n\}}1_{\left\{ N^{'}_{i}=j\right\} }=1,\)

$$\begin{aligned} \sum _{j=1}^n{\mathbf E}\left\{ |U_n-U_{n,j}^*|^2\right\} \le 8\frac{L^4}{G(L)^4}(1+\gamma _d)n\mu (A_n(x)) \end{aligned}$$

which gives the above bound for the second variance and, correspondingly, (24).
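The counting bound \(\sum _{i}1_{\{N'_i=j\}}\le \gamma _d\) used in this argument can be illustrated in dimension one, where a point can be the first nearest neighbor only of its left and right neighbors in the ordered sample, so the count never exceeds 2. The following Python sketch checks this by brute force on simulated data; the function name `nn_in_degrees`, the uniform sample and the sample size are assumptions made for illustration only.

```python
import numpy as np

def nn_in_degrees(x):
    """For one-dimensional points x, count how often each point is the
    first nearest neighbor of another point (brute force, O(n^2) memory)."""
    x = np.asarray(x, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])
    np.fill_diagonal(dist, np.inf)   # a point is not its own neighbor
    nn = dist.argmin(axis=1)         # index N'_i of the first nearest neighbor of x[i]
    return np.bincount(nn, minlength=len(x))

rng = np.random.default_rng(1)
counts = nn_in_degrees(rng.uniform(size=2000))
# Largest count, and total count (every point has exactly one first nearest neighbor)
print(counts.max(), counts.sum())
```

The total count equals n, which is the identity \(\sum _{j}1_{\{N'_i=j\}}=1\) summed over i, and it is what turns the sum over j into the factor \(n\mu (A_n(x))\).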

From (24) we get

$$\begin{aligned} {\mathbf E}|\widehat{\widehat{\sigma }}_n^2(x)-{\mathbf E}\widehat{\widehat{\sigma }}_n^2(x)|\le \sqrt{{\mathbf {Var}}(\widehat{\widehat{\sigma }}_n^2(x))}\le \sqrt{c}\frac{1}{\sqrt{n\mu (A_n(x))}}. \end{aligned}$$

By the triangle inequality

$$\begin{aligned} {\mathbf E}|\widehat{\widehat{\sigma }}_n^2(x)-\sigma ^2(x)|\le {\mathbf E}|\widehat{\widehat{\sigma }}_n^2(x)-{\mathbf E}\widehat{\widehat{\sigma }}_n^2(x)|+|{\mathbf E}\widehat{\widehat{\sigma }}_n^2(x)-\sigma ^2(x)|, \end{aligned}$$

one has

$$\begin{aligned} K_n:= & {} \int |{\mathbf E}\widehat{\widehat{\sigma }}_n^2(x)-\sigma ^2(x)| \mu (dx)\nonumber \\\le & {} \int \left| \frac{\int \sigma ^2(z)1_{A_n(x)}(z)\mu (dz)}{\mu (A_n(x))}-\sigma ^2(x)\right| \mu (dx)\nonumber \\&+ \int \int \frac{{\mathbf E}\left\{ |(m(X_1)-m(X_1'))(m(X_1)-m(X_1''))| \big | X_1=z\right\} 1_{A_n(x)}(z)}{\mu (A_n(x))}\mu (dz)\mu (dx)\nonumber \\= & {} J_n+L_n. \end{aligned}$$

It holds that \(J_n\rightarrow 0\) by Györfi et al. (2002), pp. 461, 462 (there with m instead of \(\sigma ^2\)).

In order to show \(L_n\rightarrow 0,\) for an arbitrary \(\epsilon >0,\) one chooses a continuous function \(\widetilde{m}\) with compact support such that \(\int |m(x)-\widetilde{m}(x)|\mu (dx)<\epsilon .\) Then, noticing that \(\mu (A_n(x))=\mu (A_n(z))\) for \(z\in A_n(x),\) one has

$$\begin{aligned} L_n= & {} {\mathbf E}\left\{ |(m(X_1)-m(X_1'))(m(X_1)-m(X_1''))|\right\} \\\le & {} 2L {\mathbf E}\left\{ |m(X_1)-m(X_1')|\right\} \\&\text {(by (A1))}\\\le & {} 2L {\mathbf E}\left\{ |\widetilde{m}(X_1)-\widetilde{m}(X_1')|\right\} \\&+2L {\mathbf E}\left\{ |m(X_1)-\widetilde{m}(X_1)|\right\} +2L {\mathbf E}\left\{ |m(X_1')-\widetilde{m}(X_1')|\right\} \\\le & {} o(1)+2L (1+\gamma _d){\mathbf E}\left\{ |m(X_1)-\widetilde{m}(X_1)|\right\} \end{aligned}$$

because of continuity and boundedness of \(\widetilde{m}\) together with \(X_1'\rightarrow X_1\) \((n\rightarrow \infty )\) a.s. guaranteed by Lemma 6.1 and because of Lemma 6.3 in Györfi et al. (2002).

Hence \(L_n\) has the upper bound \(o(1)+2L(1+\gamma _d)\epsilon .\) Since \(\epsilon >0\) was arbitrary, \(L_n\rightarrow 0\) and thus \(K_n\rightarrow 0.\)

We note

$$\begin{aligned} {\mathbf E}\left\{ \left| \widehat{\widehat{\sigma }}_n^2(x)-{\mathbf E}\widehat{\widehat{\sigma }}_n^2(x)\right| \right\} \le 2{\mathbf E}\widehat{\widehat{\sigma }}_n^2(x)\le c_2, \end{aligned}$$

the latter because of (A2). For \(\epsilon >0\) choose a sphere S centered at 0 such that \(\mu (S^c)\le \epsilon .\) Then

$$\begin{aligned} I_n:=\int {\mathbf E}\left\{ \left| \widehat{\widehat{\sigma }}_n^2(x)-{\mathbf E}\widehat{\widehat{\sigma }}_n^2(x)\right| \right\} \mu (dx)\le c_2 \epsilon +\sqrt{c}\frac{1}{\sqrt{n}} \int _S\frac{1}{\sqrt{\mu (A_n(x))}}\mu (dx). \end{aligned}\tag{27}$$

Set \({\mathbf R}_n:=\{j: A_{n,j}\cap S\ne \emptyset \}\) and \(l_n:=\# {\mathbf R}_n.\) Now

$$\begin{aligned} \sqrt{c}\frac{1}{\sqrt{n}} \int _S\frac{1}{\sqrt{\mu (A_n(x))}}\mu (dx)\le \sqrt{c}\frac{1}{\sqrt{n}} \sqrt{\int _S\frac{1}{\mu (A_n(x))}\mu (dx)}=O\left( \sqrt{\frac{l_n}{n}}\right) . \end{aligned}\tag{28}$$
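The two steps in this display can be spelled out as follows, assuming (as the notation suggests) that the \(A_{n,j}\) are the cells of the partition, so that \(A_n(x)=A_{n,j}\) for \(x\in A_{n,j}.\) The first inequality is the Cauchy–Schwarz inequality together with \(\mu (S)\le 1,\) and the order of the bound follows by splitting the integral over the cells that meet S:

$$\begin{aligned} \int _S\frac{1}{\sqrt{\mu (A_n(x))}}\mu (dx)\le & {} \sqrt{\mu (S)}\sqrt{\int _S\frac{1}{\mu (A_n(x))}\mu (dx)}\le \sqrt{\int _S\frac{1}{\mu (A_n(x))}\mu (dx)},\\ \int _S\frac{1}{\mu (A_n(x))}\mu (dx)\le & {} \sum _{j\in {\mathbf R}_n:\,\mu (A_{n,j})>0}\int _{A_{n,j}}\frac{1}{\mu (A_{n,j})}\mu (dx)=\sum _{j\in {\mathbf R}_n:\,\mu (A_{n,j})>0}1\le l_n. \end{aligned}$$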

Then, by (6), one obtains \(I_n\rightarrow 0.\) This, together with \(K_n\rightarrow 0\) yields the first part of the assertion.

In the case that X is bounded, one chooses a sphere S in \({\mathbf R}^d\) centered at 0 which contains the support of \(P_X=\mu .\) Then (27) with \(\epsilon =0\) and (28) yield the second part of the assertion. \(\square \)

Cite this article

Ferrario, P.G. Partitioning estimation of local variance based on nearest neighbors under censoring. Stat Papers 59, 423–447 (2018).


  • Local variance
  • Censoring
  • Partitioning estimation
  • Nearest neighbors
  • Weak consistency
  • Rate of convergence
  • Survival data analysis