Empirical Economics, Volume 13, Issue 3–4, pp 187–208

Automatic smoothing parameter selection: A survey

  • J. S. Marron

Abstract

This is a survey of recent developments in smoothing parameter selection for curve estimation. The first goal of this paper is to provide an introduction to the available methods, with discussion at both a practical and a nontechnical level, including a comparison of methods. The second goal is to provide access to the literature, especially on smoothing parameter selection, but also on curve estimation in general. The two main settings considered here are nonparametric regression and probability density estimation, although the points made apply to other settings as well. These points also apply to many different estimators, although the focus is on kernel estimators, because they are the most easily understood and motivated and have been at the heart of development in the field.

Keywords

Probability Density; Economic Theory; Density Estimation; Parameter Selection; Nonparametric Regression

Copyright information

© Physica-Verlag 1988

Authors and Affiliations

  • J. S. Marron, Department of Statistics, University of North Carolina, USA
