Abstract.
This paper concerns the cubic smoothing spline approach to nonparametric regression. After first deriving sharp asymptotic formulas for the eigenvalues of the smoothing matrix, the paper uses these formulas to investigate the efficiency of different selection criteria for choosing the smoothing parameter. Special attention is paid to the generalized maximum likelihood (GML), C_p and extended exponential (EE) criteria and their marginal Bayesian interpretation. It is shown that (a) when the Bayesian model that motivates GML is true, using C_p to estimate the smoothing parameter results in a loss of efficiency by a factor of 10/3, proving and strengthening a conjecture proposed in Stein (1990); (b) when the data indeed come from the C_p density, using GML results in an infinite loss of efficiency; (c) the loss of efficiency of the EE criterion is at most 1.543 when the data are sampled from its consistent density family. The paper studies not only equally spaced observations (the setting of Stein, 1990) but also general sampling schemes for the design points, and shows that the efficiency results are the same in both cases.
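As a rough illustration of how a smoothing parameter can be chosen from the eigenvalues of a smoothing matrix, the sketch below compares C_p and GML on simulated data. It is not the paper's construction: a discrete second-difference (Whittaker-type) smoother stands in for the cubic smoothing spline, the noise variance is assumed known, and C_p and GML are written in their standard textbook forms (Mallows, 1973; Wahba, 1985). All function names, the test signal and the grid of candidate values are hypothetical choices made for the example.

```python
import numpy as np

def penalty_eigen(n):
    """Eigen-decomposition of the second-difference penalty K = D'D.

    Assumption: this discrete penalty is a stand-in for the cubic
    smoothing spline roughness penalty on equally spaced points.
    """
    D = np.diff(np.eye(n), n=2, axis=0)   # (n-2) x n second-difference operator
    k, U = np.linalg.eigh(D.T @ D)        # ascending order; the first two are ~0
    return np.clip(k, 0.0, None), U

def smoother_eigenvalues(k, lam):
    """Eigenvalues of the smoothing matrix A_lam = (I + lam*K)^{-1}."""
    return 1.0 / (1.0 + lam * k)

def cp(z, k, lam, sigma2):
    """Mallows' C_p with known noise variance: RSS + 2*sigma2*tr(A) - n*sigma2."""
    a = smoother_eigenvalues(k, lam)
    rss = np.sum(((1.0 - a) * z) ** 2)
    return rss + 2.0 * sigma2 * np.sum(a) - len(z) * sigma2

def gml(z, k, lam, m=2):
    """GML score: y'(I - A)y divided by det+(I - A)^(1/(n - m)).

    The m = 2 unpenalized directions (constant and linear) are skipped.
    """
    b = 1.0 - smoother_eigenvalues(k, lam)[m:]   # nonzero eigenvalues of I - A
    quad = np.sum(b * z[m:] ** 2)
    return quad / np.exp(np.mean(np.log(b)))

def select(criterion, lams):
    """Return the candidate value that minimizes the criterion."""
    values = np.array([criterion(lam) for lam in lams])
    return lams[int(np.argmin(values))]

# Simulated data (hypothetical signal and noise level).
rng = np.random.default_rng(0)
n, sigma = 100, 0.3
x = np.linspace(0.0, 1.0, n)
y = np.sin(2.0 * np.pi * x) + sigma * rng.standard_normal(n)

k, U = penalty_eigen(n)
z = U.T @ y                               # data rotated into the eigenbasis
lams = np.logspace(-2, 6, 81)             # candidate smoothing parameters

lam_cp = select(lambda lam: cp(z, k, lam, sigma ** 2), lams)
lam_gml = select(lambda lam: gml(z, k, lam), lams)
print("C_p choice:", lam_cp, " GML choice:", lam_gml)
```

Working in the eigenbasis reduces each criterion to sums over eigenvalues, which is why asymptotic formulas for those eigenvalues are the natural tool for comparing the criteria.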
References
Akaike, H.: A new look at statistical model identification. IEEE Trans. Auto. Cont. AU-19, 716–722 (1974)
Billingsley, P.: Probability and Measure, 3rd ed. Wiley, New York, 1995
Bowman, A., Azzalini, A.: Applied smoothing techniques for data analysis: the kernel approach with S-Plus illustrations. Oxford University Press, New York, 1997
Craven, P., Wahba, G.: Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation. Numer. Math. 31, 377–403 (1979)
Culpin, D.: Calculation of cubic smoothing splines for equally spaced data. Numer. Math. 48, 627–638 (1986)
Demmler, A., Reinsch, C.: Oscillation matrices with spline smoothing. Numer. Math. 24, 375–382 (1975)
Efron, B.: Selection criteria for scatterplot smoothers. Ann. Statist. 29, 470–504 (2001)
Eubank, R.: Spline Smoothing and Nonparametric Regression. Marcel Dekker, New York, 1988
Eubank, R.: Nonparametric Regression and Spline Smoothing, 2nd ed. Marcel Dekker, New York, 1999
Fan, J.: Prospects of nonparametric modeling. J. Amer. Statist. Assoc. 95, 1296–1300 (2000)
Fan, J., Gijbels, I.: Local Polynomial Modelling and Its Applications. Chapman and Hall, London, 1996
Feller, W.: An Introduction to Probability Theory and Its Applications, Vol. II. Wiley, New York, 1971
Gradshteyn, I., Ryzhik, I.: Table of Integrals, Series, and Products. Academic Press, Boston, 1994
Green, P., Silverman, B.: Nonparametric Regression and Generalized Linear Models. Chapman and Hall, London, 1994
Hall, P.: Biometrika century: nonparametrics. Biometrika 88, 143–165 (2001)
Hall, P., Johnstone, I.: Empirical functionals and efficient smoothing parameter selection (with discussion). J. Roy. Statist. Soc. B 54, 475–530 (1992)
Härdle, W.: Applied Nonparametric Regression. Cambridge University Press, Cambridge, 1990
Härdle, W., Hall, P., Marron, S.: How far are the optimally chosen smoothing parameters from their optimum? (with discussion). J. Amer. Statist. Assoc. 83, 86–101 (1988)
Hastie, T., Tibshirani, R.: Generalized Additive Models. Chapman and Hall, London, 1990
Kimeldorf, G., Wahba, G.: A correspondence between Bayesian estimation on stochastic processes and smoothing by splines. Ann. Math. Statist. 41, 495–502 (1970)
Kneip, A.: Ordered linear smoothers. Ann. Statist. 22, 835–866 (1994)
Kou, S.C.: Extended exponential criterion: a new selection procedure for scatterplot smoothers. Ph. D. thesis, Stanford University, 2001
Kou, S.C., Efron, B.: Smoothers and the C_p, GML and EE criteria: A geometric approach. J. Amer. Statist. Assoc. 97, 766–782 (2002)
Li, K.-C.: Asymptotic optimality of C_L and generalized cross-validation in ridge regression with application to spline smoothing. Ann. Statist. 14, 1101–1112 (1986)
Li, K.-C.: Asymptotic optimality for C_p, C_L, cross-validation and generalized cross-validation: discrete index set. Ann. Statist. 15, 958–975 (1987)
Mallows, C.: Some comments on C_p. Technometrics 15, 661–675 (1973)
Nussbaum, M.: Spline smoothing in regression models and asymptotic efficiency in L_2. Ann. Statist. 13, 984–997 (1985)
Rosenblatt, M.: Stochastic Curve Estimation. NSF-CBMS Regional Conference Series in Probability and Statistics, Volume 3. IMS, Hayward, 1991
Reinsch, C.: Smoothing by spline functions. Numer. Math. 10, 177–183 (1967)
Schoenberg, I.: Spline functions and the problem of graduation. Proc. Nat. Acad. Sci. USA. 52, 947–950 (1964a)
Schoenberg, I.: On interpolation by spline functions and its minimum properties. Internat. Ser. Numer. Anal. 5, 109–129 (1964b)
Silverman, B.: A fast and efficient cross-validation method for smoothing parameter choice in spline regression. J. Amer. Statist. Assoc. 79, 584–589 (1984)
Silverman, B.: Some aspects of the spline smoothing approach to nonparametric regression curve fitting (with discussion). J. Roy. Statist. Soc. B 47, 1–52 (1985)
Simonoff, J.: Smoothing Methods in Statistics. Springer-Verlag, New York, 1996
Speckman, P.: Efficient nonparametric regression with cross-validated smoothing splines. Unpublished manuscript, 1983
Speckman, P.: Spline smoothing and optimal rates of convergence in nonparametric regression models. Ann. Statist. 13, 970–983 (1985)
Speckman, P., Sun, D.: Asymptotic properties of smoothing parameter selection in spline regression. Preprint, 2001
Stein, M.: A comparison of generalized cross validation and modified maximum likelihood for estimating the parameters of a stochastic process. Ann. Statist. 18, 1139–1157 (1990)
Stein, M.: Spline smoothing with an estimated order parameter. Ann. Statist. 21, 1522–1544 (1993)
Utreras, F.: Cross-validation techniques for smoothing spline functions in one or two dimensions. In: Smoothing Techniques for Curve Estimation (T. Gasser, M. Rosenblatt, eds.), Springer-Verlag, Heidelberg, 1979, pp. 196–232
Utreras, F.: Sur le choix du parametre d'ajustement dans le lissage par fonctions spline. Numer. Math. 34, 15–28 (1980)
Utreras, F.: Optimal smoothing of noisy data using spline functions. SIAM J. Sci. and Statist. Comput. 2, 349–362 (1981)
Utreras, F.: Boundary effects on convergence rates for Tikhonov regularization. J. Approx. Theor. 54, 235–249 (1988)
Wahba, G.: Smoothing noisy data by spline functions. Numer. Math. 24, 383–393 (1975)
Wahba, G.: Optimal smoothing of density estimates. In: Classification and Clustering (J. Van Ryzin, ed.), Academic Press, New York, 1977a, pp. 423–458
Wahba, G.: A survey of some smoothing problems and the method of generalized cross-validation for solving them. In: Applications of Statistics (P. R. Krishnaiah, ed.), North Holland, Amsterdam, 1977b, pp. 507–523
Wahba, G.: A comparison of GCV and GML for choosing the smoothing parameter in the generalized spline smoothing problem. Ann. Statist. 13, 1378–1402 (1985)
Wahba, G.: Spline Models for Observational Data. CBMS-NSF Regional Conference Series in Applied Mathematics, 59. SIAM, Philadelphia, 1990
Wecker, W., Ansley, C.: The signal extraction approach to nonlinear regression and spline smoothing. J. Amer. Statist. Assoc. 78, 81–89 (1983)
Whittaker, E.: On a new method of graduation. Proc. Edinburgh Math. Soc. 41, 63–75 (1923)
Additional information
This work is supported in part by NSF grant DMS-0204674 and Harvard University Clark-Cooke Fund.
Mathematics Subject Classification (2000): Primary: 62G08; Secondary: 62G20
Cite this article
Kou, S. On the efficiency of selection criteria in spline regression. Probab. Theory Relat. Fields 127, 153–176 (2003). https://doi.org/10.1007/s00440-003-0277-z
DOI: https://doi.org/10.1007/s00440-003-0277-z
Key words or phrases:
- Smoothing splines
- Extended exponential criterion
- Cp
- Generalized maximum likelihood
- Eigenvalue
- Robustness
- Sampling scheme