On robust cross-validation for nonparametric smoothing
An essential problem in nonparametric smoothing of noisy data is the proper choice of the bandwidth or window width, which depends on a smoothing parameter \(k\). One way to choose \(k\) from the data is leave-one-out cross-validation. The choice of the cross-validation criterion is as important as the choice of the smoother. In particular, when outliers are present, robust cross-validation criteria are needed. So far, little is known about the behaviour of robust cross-validated smoothers in the presence of discontinuities in the regression function. We combine different smoothing procedures based on local constant fits with each of several cross-validation criteria. These combinations are compared in a simulation study under a broad variety of data situations with outliers and abrupt jumps. No single cross-validation criterion is best overall, but we find that Boente cross-validation performs well when the percentage of outliers is large, and that the Tukey criterion performs well in data situations with jumps, even if the data are contaminated with outliers.
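As a rough illustration of the idea, the following minimal sketch selects \(k\) by leave-one-out cross-validation for a running-median smoother (one possible local constant fit), scoring the predictions with a mean absolute error criterion. The function names `loo_fit`, `cv_score` and `select_k`, the symmetric window of half-width `k`, the absolute-error loss and the simulated data are all assumptions made for this example; they are not the smoothers or the Boente and Tukey criteria studied in the paper.

```python
import numpy as np

def loo_fit(y, i, k, center=np.median):
    """Local constant fit at index i from the window [i-k, i+k], leaving y[i] out."""
    lo, hi = max(0, i - k), min(len(y), i + k + 1)
    neighbours = np.concatenate((y[lo:i], y[i + 1:hi]))
    return center(neighbours)

def cv_score(y, k, center=np.median, loss=np.abs):
    """Average leave-one-out prediction loss for window half-width k (k >= 1)."""
    residuals = np.array([y[i] - loo_fit(y, i, k, center) for i in range(len(y))])
    return float(np.mean(loss(residuals)))

def select_k(y, candidates, center=np.median, loss=np.abs):
    """Return the candidate k with the smallest cross-validation score, plus all scores."""
    scores = {k: cv_score(y, k, center, loss) for k in candidates}
    return min(scores, key=scores.get), scores

# Hypothetical test data: a step function with Gaussian noise and a few large outliers.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)
signal = np.where(x < 0.5, 0.0, 2.0)
y = signal + rng.normal(scale=0.2, size=x.size)
y[rng.choice(x.size, size=10, replace=False)] += 8.0

best_k, scores = select_k(y, candidates=range(1, 21))
print("selected half-width k:", best_k)
```

Passing `center=np.mean` and `loss=np.square` instead gives the classical least-squares variant, which makes it easy to see how strongly a non-robust criterion can react to the outliers in such data.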
Keywords: Nonparametric regression, Jump-preserving smoothers, Outliers, Robust bandwidth selection, Structural breaks
This work has been supported in part by the Collaborative Research Center “Statistical modeling of nonlinear dynamic processes” (SFB 823) of the German Research Foundation (DFG). The helpful and stimulating comments of the referees and the associate editor are also acknowledged.