Kernel regression estimation for incomplete data with applications

Mojirsheibani, Majid; Reese, Timothy

doi:10.1007/s00362-015-0693-z

Regular Article
Published: 02 July 2015

Volume 58, pages 185–209, (2017)

Statistical Papers

Majid Mojirsheibani¹ &
Timothy Reese¹

Abstract

Methods are proposed to construct kernel estimators of a regression function in the presence of incomplete data. Furthermore, exponential upper bounds are derived on the performance of the $L_p$ norms of the proposed estimators, which can then be used to establish various strong convergence results. Incomplete data points are handled by a Horvitz–Thompson-type inverse weighting approach, where the unknown selection probabilities are estimated by both kernel regression and least-squares methods. As an immediate application of these results, the problem of nonparametric classification with partially observed data is studied.
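
To make the inverse weighting idea in the abstract concrete, the following is a minimal sketch and not the authors' estimator. It assumes a simplified setting that the abstract does not spell out: a scalar covariate that is always observed, a response that may be missing at random, a Gaussian kernel, and selection probabilities estimated by kernel regression of the missingness indicator on the covariate. The function names, the bandwidths `h` and `h_prob`, and the truncation level `floor` are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the ratios below."""
    return np.exp(-0.5 * u ** 2)


def nw_estimate(x_query, x, y, h):
    """Plain Nadaraya-Watson estimate of E[y | x] at each point of x_query."""
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))
    x = np.asarray(x, dtype=float)
    k = gaussian_kernel((x_query[:, None] - x[None, :]) / h)
    return (k @ y) / k.sum(axis=1)


def ht_kernel_regression(x_query, x, y, delta, h, h_prob, floor=1e-3):
    """Inverse-weighted kernel regression with possibly missing responses.

    delta[i] is 1 when y[i] is observed and 0 when it is missing
    (y[i] may then hold any placeholder, including NaN).
    """
    delta = np.asarray(delta, dtype=float)
    # Step 1 (assumption): estimate pi(x) = P(delta = 1 | X = x) by kernel
    # regression of delta on x, truncated away from zero for stability.
    pi_hat = np.clip(nw_estimate(x, x, delta, h_prob), floor, 1.0)
    # Step 2: replace each response by delta * y / pi_hat, so that missing
    # responses contribute zero and observed ones are up-weighted.
    y_weighted = np.where(delta == 1, y, 0.0) / pi_hat
    # Step 3: smooth the weighted responses with the usual kernel estimator.
    return nw_estimate(x_query, x, y_weighted, h)
```

When every response is observed, the estimated selection probabilities equal one and the sketch reduces to the ordinary Nadaraya-Watson estimator. The truncation at `floor` is a practical safeguard added here so that very small estimated probabilities do not produce extreme weights.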

Acknowledgments

This work is supported in part by the National Science Foundation Grant DMS-1407400 of Majid Mojirsheibani.

Author information

Authors and Affiliations

Department of Mathematics, California State University, Northridge, CA, 91330, USA
Majid Mojirsheibani & Timothy Reese

Corresponding author

Correspondence to Majid Mojirsheibani.

About this article

Cite this article

Mojirsheibani, M., Reese, T. Kernel regression estimation for incomplete data with applications. Stat Papers 58, 185–209 (2017). https://doi.org/10.1007/s00362-015-0693-z

Received: 11 December 2014
Revised: 24 May 2015
Published: 02 July 2015
Issue Date: March 2017
DOI: https://doi.org/10.1007/s00362-015-0693-z

Keywords

Mathematics Subject Classification

62G08
