Statistics and Computing, Volume 19, Issue 3, pp 217–228

Nonparametric density deconvolution by weighted kernel estimators

  • Martin L. Hazelton
  • Berwin A. Turlach

Abstract

Nonparametric density estimation in the presence of measurement error is considered. The usual kernel deconvolution estimator seeks to account for the contamination in the data by employing a modified kernel. In this paper a new approach based on a weighted kernel density estimator is proposed. Theoretical motivation is provided by the existence of a weight vector that perfectly counteracts the bias in density estimation without generating an excessive increase in variance. In practice a data-driven method of weight selection is required. Our strategy is to minimize the discrepancy between a standard kernel estimate from the contaminated data on the one hand, and the convolution of the weighted deconvolution estimate with the measurement error density on the other hand. We consider a direct implementation of this approach, in which the weights are optimized subject to sum and non-negativity constraints, and a regularized version in which the objective function includes a ridge-type penalty. Numerical tests suggest that weighted kernel estimation can lead to tangible improvements in performance over the usual kernel deconvolution estimator. Furthermore, weighted kernel estimates are free from the problem of negative estimation in the tails that can occur when using modified kernels. The weighted kernel approach generalizes to the case of multivariate deconvolution density estimation in a very straightforward manner.
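
The weight-selection strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made here for illustration: a Gaussian kernel, Gaussian measurement error with known standard deviation sigma, fixed bandwidths h and h_y, a discrepancy measured as a squared difference integrated on a grid, a ridge-type penalty of the form lam * sum((w_i - 1/n)^2), and a simple projected-gradient solver in place of a dedicated quadratic programming routine. All function and parameter names are hypothetical.

```python
import numpy as np

def gauss(u, sd):
    """Density of N(0, sd^2) evaluated at u."""
    return np.exp(-0.5 * (u / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

def fit_weights(y, sigma, h, h_y, lam=0.0, grid_size=200, n_iter=2000):
    """Choose weights so that the weighted estimate, convolved with the
    (assumed Gaussian) error density, matches a standard kernel estimate
    of the contaminated data y."""
    n = y.size
    s = np.sqrt(h ** 2 + sigma ** 2)
    grid = np.linspace(y.min() - 4.0 * s, y.max() + 4.0 * s, grid_size)
    dx = grid[1] - grid[0]
    diff = grid[:, None] - y[None, :]
    # Column i: Gaussian kernel centred at y_i convolved with N(0, sigma^2),
    # which is again Gaussian with standard deviation sqrt(h^2 + sigma^2).
    A = gauss(diff, s)
    # Standard (equally weighted) kernel estimate of the contaminated density.
    g = gauss(diff, h_y).mean(axis=1)
    # Objective: dx * ||A w - g||^2 + lam * ||w - 1/n||^2 (assumed penalty form).
    H = dx * (A.T @ A) + lam * np.eye(n)
    b = dx * (A.T @ g) + lam / n
    step = 1.0 / np.linalg.eigvalsh(H)[-1]  # 1 / largest eigenvalue of H
    w = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        # Gradient step on half the objective, then enforce the constraints.
        w = project_simplex(w - step * (H @ w - b))
    return w

def weighted_kde(x, y, w, h):
    """Weighted kernel density estimate sum_i w_i K_h(x - y_i)."""
    x = np.asarray(x, dtype=float)
    return gauss(x[:, None] - y[None, :], h) @ w

# Simulated example: true values X ~ N(0, 1), observed Y = X + N(0, 0.5^2) error.
rng = np.random.default_rng(0)
y_obs = rng.normal(size=50) + rng.normal(scale=0.5, size=50)
w_hat = fit_weights(y_obs, sigma=0.5, h=0.3, h_y=0.4, lam=0.01)
f_hat = weighted_kde(np.linspace(-4.0, 4.0, 9), y_obs, w_hat, h=0.3)
```

Because the weights are non-negative and sum to one, the resulting estimate is itself a density and cannot take negative values in the tails. Under the sum constraint, the assumed penalty differs from lam * sum(w_i^2) only by a constant, so either form gives the same minimizer.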

Keywords

Density estimation, Errors in variables, Integrated square error, Measurement error, Weights

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  1. Institute of Fundamental Sciences, Massey University, Palmerston North, New Zealand
  2. Department of Statistics and Applied Probability, National University of Singapore, Singapore, Singapore
  3. School of Mathematics and Statistics (MO19), The University of Western Australia, Crawley, Australia
