Right-censored nonparametric regression with measurement error

Aydın, Dursun; Yılmaz, Ersin; Chamidah, Nur; Lestari, Budi; Budiantara, I. Nyoman

doi:10.1007/s00184-024-00953-5

Published: 05 March 2024

Metrika (2024)

Dursun Aydın¹,
Ersin Yılmaz¹ (ORCID: orcid.org/0000-0002-9871-4700),
Nur Chamidah²,
Budi Lestari³ &
I. Nyoman Budiantara⁴


Abstract

This study focuses on estimating a nonparametric regression model with right-censored data when the covariate is subject to measurement error. Doing so requires addressing both censoring and measurement error, two problems that many researchers ignore. Measurement error leads to biased and inconsistent parameter estimates, and nonparametric regression techniques cannot be applied directly to right-censored observations. To handle censoring, we consider an updated response variable obtained with the Buckley–James method (BJM), which is essentially based on the Kaplan–Meier estimator. The measurement error problem is then handled using the kernel deconvolution method, a tool designed specifically for this problem. Accordingly, three deconvoluted estimators based on BJM are introduced using kernel smoothing, local polynomial smoothing, and B-spline techniques, each incorporating both the updated response variable and kernel deconvolution. The performance of these estimators is compared in a detailed simulation study. In addition, a real-world data example is presented using a Covid-19 dataset.

References

Afshin A, Jorge MB (2020) COVID-19 data set resulted from a study on the quality of Novel Corona-virus official datasets. Mendeley Data, V1 https://doi.org/10.17632/nw5m4hs3jr.1
Aydin D, Yilmaz E (2018) Modified spline regression based on randomly right-censored data: a comparative study. Commun Stat Simul Comput 47(9):2587–2611
Buckley J, James I (1979) Linear regression with censored data. Biometrika 66(3):429–436
Buja A, Hastie T, Tibshirani R (1989) Linear smoothers and additive models. Ann Stat 17(2):453–510
Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM (2006) Measurement error in nonlinear models: a modern perspective. Chapman and Hall/CRC, New York
Craven P, Wahba G (1979) Smoothing noisy data with spline functions. Numer Math 31(4):377–403
Delaigle A, Meister A (2007) Nonparametric regression estimation in the heteroscedastic errors-in-variables problem. J Am Stat Assoc 102(480):1416–1426
Delecroix M, Lopez O, Patilea V (2008) Nonlinear censored regression using synthetic data. Scand J Stat 35(2):248–265
De Boor C (1978) A practical guide to splines, vol 27. Springer, New York, p 325
Fan J (1991) On the optimal rates of convergence for nonparametric deconvolution problems. Ann Stat 19(3):1257–1272
Belomestny D, Goldenshluger A (2021) Density deconvolution under general assumptions on the distribution of measurement errors. Ann Stat 49(2):615–649
Fan J, Gijbels I (1994) Censored regression: local linear approximations and their applications. J Am Stat Assoc 89(426):560–570
Fan J, Truong YK (1993) Nonparametric regression with errors in variables. Ann Stat 21(4):1900–1925
Fan J, Gijbels I, Hu TC, Huang LS (1996) A study of variable bandwidth selection for local polynomial regression. Stat Sin 6(1):113–127
Ghouch AE, Keilegom IV (2008) Non-parametric regression with dependent censored data. Scand J Stat 35(2):228–247
Glasson S (2007) Censored regression techniques for credit scoring. Doctoral dissertation, RMIT University
Han K, Park BU (2018) Smooth backfitting for errors-in-variables additive models. Ann Stat 46:2216–2250
Hazelton ML, Turlach BA (2009) Nonparametric density deconvolution by weighted kernel estimators. Stat Comput 19(3):217–228
James IR, Smith PJ (1984) Consistency results for linear regression with censored data. Ann Stat 12(2):590–600
Khardani S, Lemdani M, Said EO (2012) On the strong uniform consistency of the mode estimator for censored time series. Metrika 75(2):229–241
Koul H, Susarla V, Van Ryzin J (1981) Regression analysis with randomly right-censored data. Ann Stat 9(6):1276–1288
Lee YK, Mammen E, Park BU (2010) Backfitting and smooth backfitting for additive quantile models. Ann Stat 38:2857–2883
Liang H, Wang N (2005) Partially linear single-index measurement error models. Stat Sin 15(1):99–116
Li T, Vuong Q (1998) Nonparametric estimation of the measurement error model using multiple indicators. J Multivar Anal 65(2):139–165
Meier P (2011) Estimation of a distribution function from incomplete observations. J Appl Probab 12(S1):67–87
Miller RG (1976) Least squares regression with censored data. Biometrika 63(3):449–464
Moffatt JL, Scarf P (2016) Sequential regression measurement error models with application. Stat Model 16(6):454–476
Nadaraya EA (1964) On estimating regression. Theory Probab Appl 9(1):141–142
Osman M, Ghosh SK (2012) Nonparametric regression models for right-censored data using Bernstein polynomials. Comput Stat Data Anal 56(3):559–573
Orbe J, Ferreira E, Núñez-Antón V (2003) Censored partial regression. Biostatistics 4(1):109–121
Ruppert D, Wand MP, Carroll RJ (2003) Semiparametric regression, vol 12. Cambridge University Press, Cambridge
Stefanski LA, Carroll RJ (1990) Deconvolving kernel density estimators. Statistics 21(2):169–184
Stute W (1999) Nonlinear censored regression. Stat Sin 9(4):140–159
Tekwe CD, Carter RL, Cullings HM (2016) Generalized multiple indicators, multiple causes measurement error models. Stat Model 16(2):140–159
Wang XF, Wang B (2011) Deconvolution estimation in measurement error models: the R package decon. J Stat Softw 39(10):i10
Watson GS (1964) Smooth regression analysis. Sankhya Indian J Stat Ser A 26(4):359–372
Aydin D, Ahmed SE, Yilmaz E (2021) Right-censored time series modeling by modified semi-parametric A-spline estimator. Entropy 23(12):1586
Zhang S, Karunamuni RJ (2009) Deconvolution boundary kernel method in nonparametric density estimation. J Stat Plan Inference 139(7):2269–2283

Acknowledgements

We express our sincere gratitude to the Editor and anonymous reviewers whose meticulous and insightful feedback significantly contributed to the refinement and enhancement of this paper.

Author information

Nur Chamidah, Budi Lestari and I. Nyoman Budiantara contributed equally to this work.

Authors and Affiliations

Department of Statistics, Faculty of Science, Mugla Sitki Kocman University, Mugla, Turkey
Dursun Aydın & Ersin Yılmaz
Department of Mathematics, Airlangga University, Surabaya, Indonesia
Nur Chamidah
Department of Mathematics, The University of Jember, Jember, Indonesia
Budi Lestari
Department of Statistics, Sepuluh Nopember Institute of Technology, Surabaya, Indonesia
I. Nyoman Budiantara

Corresponding author

Correspondence to Ersin Yılmaz.

Ethics declarations

Conflicts of interest

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A1. Proof of Lemma 4.1

Differentiating ${\text {AMISE}}\left\{ {\hat{m}}_{h}(x)\right\}$ with respect to $h$ and setting the derivative equal to zero yields

$$\begin{aligned} \frac{\partial }{\partial h} {\text {AMISE}}\left\{ {\hat{m}}_{h}(x)\right\} =-\frac{R(K) V(x)}{n h^{2}}+h^{3} V(K)^{2} R\left( m^{\prime \prime }\right) =0 \end{aligned}$$

(A2.1)

Multiplying Eq. (A2.1) through by $n h^{2}$ and rearranging, we obtain

$$\begin{aligned} n h^{5} V(K)^{2} R\left( m^{\prime \prime }\right) =R(K) V(x) \end{aligned}$$

Solving for $h$ gives the optimal value of the bandwidth parameter,

$$\begin{aligned} h_{o p t}=\left[ \frac{R(K) V(x)}{V(K)^{2} R\left( m^{\prime \prime }\right) n}\right] ^{1 / 5}, \end{aligned}$$

as claimed.
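
The minimization can also be checked numerically. The sketch below assumes that the AMISE has the form $R(K)V(x)/(nh)+h^{4}V(K)^{2}R(m^{\prime \prime })/4$, which is the form whose derivative matches Eq. (A2.1); Lemma 4.1 itself is not reproduced in this excerpt, so that form is an assumption. Reading $V(K)$ as the second moment of the kernel is also an assumption, and the values of `n`, `Vx` and `Rm2` are hypothetical numbers chosen only for illustration.

```python
import math

def amise(h, n, RK, Vx, VK, Rm2):
    """Assumed AMISE form: variance term R(K)V(x)/(nh) plus squared-bias term."""
    return RK * Vx / (n * h) + (h ** 4) * (VK ** 2) * Rm2 / 4.0

def h_opt(n, RK, Vx, VK, Rm2):
    """Closed-form minimizer stated at the end of the proof."""
    return (RK * Vx / (VK ** 2 * Rm2 * n)) ** 0.2

# Hypothetical inputs (not taken from the paper).
n = 200                                   # sample size
RK = 1.0 / (2.0 * math.sqrt(math.pi))     # R(K) = integral of K^2 for a Gaussian kernel
VK = 1.0                                  # second moment of a Gaussian kernel
Vx = 0.5                                  # stand-in for V(x)
Rm2 = 3.0                                 # stand-in for R(m'')

h_star = h_opt(n, RK, Vx, VK, Rm2)

# Brute-force search over a fine grid of bandwidths in [0.01, 2.01).
grid = [0.01 + 0.0001 * k for k in range(20000)]
h_grid = min(grid, key=lambda h: amise(h, n, RK, Vx, VK, Rm2))

print("closed form:", h_star)
print("grid search:", h_grid)
```

Because the assumed AMISE is convex in $h>0$, the grid minimizer should agree with the closed-form value up to the grid spacing.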

Appendix A2. Proof of Lemma 4.2

To prove Lemma 4.2, one needs strong assumptions and conditions regarding nonparametric smoothing, measurement error, censoring, and the Buckley–James procedure. These restrictions are as follows:

Conditions for the Buckley–James (BJ) Method

Following James and Smith (1984) and Meier (1975), let $\Xi =\sup \{\xi :F(\xi )<1\}<\infty $ and let $N(\xi )$ denote the expected number of data points (censored or not) for which $\varepsilon ^*=Y^*-m(W)>\xi $. Finally, let $\varepsilon _C=C-m(W)$. To obtain the variance expression given in Lemma 4.2, the following assumptions must hold:

BJ1. $N(\xi ) \rightarrow \infty $ as $n \rightarrow \infty $ for all $\xi <\Xi $.

BJ2. $\sum _{i=1}^{n}(W_i-{\bar{W}})^2\rightarrow \infty $ as $n \rightarrow \infty $.

BJ3. $\limsup _{n\rightarrow \infty }\left\{ \frac{\sum _{i=1}^{n} F(\varepsilon _{C_i}) \left| W_i-{\bar{W}}\right| }{\sum _{i=1}^{n} (W_i-{\bar{W}})^2}\right\} <\infty $.

BJ4. For BJ3 to hold, the following condition is needed: $\liminf _{n\rightarrow \infty } (1/n_c)\sum _{i=1}^{n}(W_i-{\bar{W}})^2>0$, where $n_c$ is the number of censored observations.

In addition to these conditions, assumptions (A4) and (B1–B5) must hold regarding $\hat{{\textbf{m}}}_h(W)\rightarrow \hat{{\textbf{m}}}_h(X)$ based on the corresponding smoothing matrix, and assumptions (A1–A3) are needed to account for censoring. Under these restrictions, and writing $\left( {\textbf{S}}_{h} {\textbf{Y}}^{*}\right) _{i}$ for the $i$th element of ${\textbf{S}}_{h} {\textbf{Y}}^{*}$, ${\text {MSSE}}(\hat{{\textbf{m}}}_h)$ can be decomposed as follows:

$$\begin{aligned} {\text {MSSE}}\left( {\hat{m}}_{h}\right)&=\sum \limits _{i=1}^{n} E\left[ \left\{ {\hat{m}}_{h}\left( W_{i}\right) -m\left( W_{i}\right) \right\} ^{2}\right] =\sum \limits _{i=1}^{n} {\text {MSE}}\left[ {\hat{m}}_{h}\left( W_{i}\right) \right] \\&=\sum \limits _{i=1}^{n}\left[ \left\{ E\left( {\hat{m}}_{h}\left( W_{i}\right) \right) -m\left( W_{i}\right) \right\} ^{2}+{\text {Var}}\left[ {\hat{m}}_{h}\left( W_{i}\right) \right] \right] \\&=\sum \limits _{i=1}^{n}\left\{ E\left[ \left( {\textbf{S}}_{h} {\textbf{Y}}^{*}\right) _{i}\right] -m(W_{i})\right\} ^{2}+\sum \limits _{i=1}^{n}{\text {Var}}\left[ \left( {\textbf{S}}_{h} {\textbf{Y}}^{*}\right) _{i}\right] \\&=\sum \limits _{i=1}^{n}\left\{ E\left[ \left( {\textbf{S}}_{h} {\textbf{Y}}^{*}\right) _{i}\right] -m(W_{i})\right\} ^{2}+{\text {tr}}\left[ {\text {Cov}}\left( {\textbf{S}}_{h} {\textbf{Y}}^{*}\right) \right] \\&=\left\| E\left( \hat{{\varvec{m}}}_{h}\right) -{\varvec{m}}\right\| ^{2}+{\text {tr}}\left[ {\text {Cov}}\left( {\textbf{S}}_{h} {\textbf{Y}}^{*}\right) \right] \\&=\left\| \left( {\textbf{S}}_{h}-{\textbf{I}}\right) {\varvec{m}}\right\| ^{2}+{\text {tr}}\left[ {\textbf{S}}_{h} {\text {Cov}}({\textbf{Y}}^{*}) {\textbf{S}}_{h}^{\prime }\right] \end{aligned}$$

The last equality takes $E({\textbf{Y}}^{*})={\varvec{m}}$, so that $E(\hat{{\varvec{m}}}_{h})={\textbf{S}}_{h}{\varvec{m}}$.

Because of censoring and the transformed response variable ${\textbf{Y}}^*$, the covariance ${\text {Cov}}({\textbf{Y}}^{*})$ is written as in (4.20), ${\text {Cov}}({\textbf{Y}}^{*})=\sigma _{*}^{2} {\textbf{I}}$ with $\sigma _{*}^{2}=\sigma ^2_\varepsilon -{\text {Var}}(Y \mid Y>C){\bar{F}}(C)$, which produces

$$\begin{aligned} {\text {MSSE}}\left( \widehat{{\varvec{m}}}_{h}\right) =\left\| \left( {\textbf{S}}_{h}-{\textbf{I}}\right) {\varvec{m}}\right\| ^{2}+\sigma _{*}^{2} {\text {tr}}\left( {\textbf{S}}_{h} {\textbf{S}}_{h}^{\prime }\right) \end{aligned}$$

(A3.1)

This proves Eq. (4.19), as claimed.
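
The decomposition in (A3.1) holds for any linear smoother once $E({\textbf{Y}}^{*})={\varvec{m}}$ and ${\text {Cov}}({\textbf{Y}}^{*})=\sigma _{*}^{2}{\textbf{I}}$ are granted, and this can be illustrated with a small Monte Carlo check. The sketch below is only an illustration under those two assumptions: it uses a Nadaraya–Watson smoother matrix with a Gaussian kernel as a stand-in for ${\textbf{S}}_{h}$, and it draws the responses directly as $m(W_i)$ plus homoscedastic Gaussian noise, so it implements neither the Buckley–James update nor kernel deconvolution. The regression function, bandwidth, sample size and `sigma_star` are hypothetical.

```python
import numpy as np

def nw_smoother_matrix(w, h):
    """Nadaraya-Watson smoother matrix with a Gaussian kernel; each row sums to 1."""
    d = (w[:, None] - w[None, :]) / h
    k = np.exp(-0.5 * d ** 2)
    return k / k.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)

# Hypothetical design, regression function and noise level.
n, h, sigma_star = 50, 0.1, 0.3
w = np.sort(rng.uniform(0.0, 1.0, n))
m = np.sin(2.0 * np.pi * w)
S = nw_smoother_matrix(w, h)

# Right-hand side of (A3.1): squared bias plus sigma_*^2 tr(S S').
bias_sq = np.sum(((S - np.eye(n)) @ m) ** 2)
var_term = sigma_star ** 2 * np.trace(S @ S.T)
msse_formula = bias_sq + var_term

# Monte Carlo estimate of sum_i E[(m_hat(W_i) - m(W_i))^2],
# with Y* simulated so that E(Y*) = m and Cov(Y*) = sigma_*^2 I.
reps = 20000
Y = m + sigma_star * rng.standard_normal((reps, n))
fits = Y @ S.T                      # row r holds S @ Y[r]
msse_mc = np.mean(np.sum((fits - m) ** 2, axis=1))

print("formula    :", msse_formula)
print("Monte Carlo:", msse_mc)
```

The two printed values should agree up to simulation error, which shrinks as `reps` grows.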

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Aydın, D., Yılmaz, E., Chamidah, N. et al. Right-censored nonparametric regression with measurement error. Metrika (2024). https://doi.org/10.1007/s00184-024-00953-5

Received: 19 October 2022
Accepted: 25 January 2024
Published: 05 March 2024
DOI: https://doi.org/10.1007/s00184-024-00953-5
