Annals of the Institute of Statistical Mathematics, Volume 67, Issue 1, pp. 93–127

# Sparse and efficient estimation for partial spline models with increasing dimension

• Guang Cheng
• Hao Helen Zhang
• Zuofeng Shang

## Abstract

We consider model selection and estimation for partial spline models and propose a new regularization method in the context of smoothing splines. The regularization method has a simple yet elegant form, consisting of a roughness penalty on the nonparametric component and a shrinkage penalty on the parametric components, which achieves function smoothing and sparse estimation simultaneously. We establish the convergence rate and oracle properties of the estimator under weak regularity conditions. Remarkably, the estimated parametric components are sparse and efficient, and the nonparametric component can be estimated at the optimal rate. The procedure also has attractive computational properties. Using the representer theorem for smoothing splines, we reformulate the objective function as a LASSO-type problem, which enables us to use the LARS algorithm to compute the solution path. We then extend the procedure to the setting in which the number of predictors increases with the sample size and investigate its asymptotic properties in that context. Finite-sample performance is illustrated by simulations.
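To make the two-penalty structure described above concrete, one plausible form of the doubly penalized least-squares objective is sketched below. The notation (weights $$w_j$$, smoothing parameter $$\lambda_1$$, shrinkage parameter $$\lambda_2$$, Sobolev space $$W_2^m$$) is assumed for illustration and may differ from the paper's exact formulation.

```latex
% Sketch of a doubly penalized partial spline objective (notation assumed):
% y_i response, x_i \in R^p linear covariates, t_i nonparametric covariate,
% lambda_1 controls roughness of f, lambda_2 controls sparsity of beta,
% w_j are (possibly adaptive) penalty weights.
\min_{\beta \in \mathbb{R}^p,\; f \in W_2^m}\;
  \frac{1}{n}\sum_{i=1}^{n} \bigl( y_i - x_i^\top \beta - f(t_i) \bigr)^2
  \;+\; \lambda_1 \int \bigl( f^{(m)}(t) \bigr)^2 \, dt
  \;+\; \lambda_2 \sum_{j=1}^{p} w_j \lvert \beta_j \rvert
```

Under such a formulation, the representer theorem (Kimeldorf and Wahba 1971) implies that the minimizing $$f$$ is a finite linear combination of kernel sections, so $$f$$ can be profiled out and the remaining problem in $$\beta$$ is a weighted LASSO, whose entire solution path in $$\lambda_2$$ is computable by LARS.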

## Keywords

Smoothing splines · Semiparametric models · RKHS · High dimensionality · Solution path · Oracle property · Shrinkage methods

## References

1. Abramowitz, M., Stegun, I. (1964). Handbook of mathematical functions with formulas, graphs, and mathematical tables. New York: Dover.
2. Breiman, L. (1995). Better subset regression using the nonnegative garrote. Technometrics, 37, 373–384.
3. Bickel, P. J., Ritov, Y., Tsybakov, A. B. (2009). Simultaneous analysis of Lasso and Dantzig selector. Annals of Statistics, 37, 1705–1732.
4. Craven, P., Wahba, G. (1979). Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik, 31, 377–403.
5. Denby, L. (1984). Smooth regression functions. Ph.D. thesis, Department of Statistics, University of Michigan.
6. Efron, B., Hastie, T., Johnstone, I., Tibshirani, R. (2004). Least angle regression. Annals of Statistics, 32, 407–451.
7. Fan, J., Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348–1360.
8. Fan, J., Li, R. (2004). New estimation and model selection procedures for semiparametric modeling in longitudinal data analysis. Journal of the American Statistical Association, 99, 710–723.
9. Fan, J., Lv, J. (2008). Sure independence screening for ultra-high dimensional feature space. Journal of the Royal Statistical Society, Series B (with discussion), 70, 849–911.
10. Fan, J., Peng, H. (2004). Nonconcave penalized likelihood with a diverging number of parameters. Annals of Statistics, 32, 928–961.
11. Green, P. J., Silverman, B. W. (1994). Nonparametric regression and generalized linear models. London: Chapman and Hall.
12. Gu, C. (2002). Smoothing spline ANOVA models. New York: Springer.
13. Heckman, N. (1986). Spline smoothing in a partly linear model. Journal of the Royal Statistical Society, Series B, 48, 244–248.
14. Huang, J., Horowitz, J., Ma, S. (2008a). Asymptotic properties of bridge estimators in sparse high-dimensional regression models. Annals of Statistics, 36, 587–613.
15. Huang, J., Ma, S., Zhang, C. H. (2008b). Adaptive LASSO for sparse high-dimensional regression. Statistica Sinica, 18, 1603–1618.
16. Kimeldorf, G., Wahba, G. (1971). Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications, 33, 82–95.
17. Mammen, E., van de Geer, S. (1997). Penalized quasi-likelihood estimation in partially linear models. Annals of Statistics, 25, 1014–1035.
18. Ni, X., Zhang, H. H., Zhang, D. (2009). Automatic model selection for partially linear models. Journal of Multivariate Analysis, 100, 2100–2111.
19. Portnoy, S. (1984). Asymptotic behavior of M-estimators of $$p$$ regression parameters when $$p^2/n$$ is large. I. Consistency. Annals of Statistics, 12, 1298–1309.
20. Rice, J. (1986). Convergence rates for partially splined models. Statistics and Probability Letters, 4, 203–208.
21. Ruppert, D., Wand, M. P., Carroll, R. J. (2003). Semiparametric regression. Cambridge: Cambridge University Press.
22. Shang, Z., Cheng, G. (2013). Local and global asymptotic inference in smoothing spline models. Annals of Statistics (to appear).
23. Shao, J. (2003). Mathematical statistics (2nd ed.). New York: Springer.
24. Shiau, J., Wahba, G. (1988). Rates of convergence for some estimates of a semi-parametric model. Communications in Statistics - Simulation and Computation, 17, 111–113.
25. Speckman, P. (1988). Kernel smoothing in partial linear models. Journal of the Royal Statistical Society, Series B, 50, 413–436.
26. Stamey, T., Kabalin, J., McNeal, J., Johnstone, I., Freiha, F., Redwine, E., et al. (1989). Prostate specific antigen in the diagnosis and treatment of adenocarcinoma of the prostate. II. Radical prostatectomy treated patients. Journal of Urology, 141, 1076–1083.
27. Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58, 267–288.
28. van der Vaart, A. W., Wellner, J. A. (1996). Weak convergence and empirical processes: With applications to statistics. New York: Springer.
29. Wahba, G. (1984). Partial spline models for the semiparametric estimation of functions of several variables. In H. A. David, H. T. David (Eds.), Statistics: An appraisal, proceedings of the 50th anniversary conference. Ames: Iowa State University Press.
30. Wahba, G. (1990). Spline models for observational data. CBMS-NSF Regional Conference Series in Applied Mathematics, Vol. 59. Philadelphia: SIAM.
31. Wang, H., Li, R., Tsai, C. L. (2007a). Tuning parameter selectors for the smoothly clipped absolute deviation method. Biometrika, 94, 553–568.
32. Wang, H., Li, G., Jiang, G. (2007b). Robust regression shrinkage and consistent variable selection via the LAD-LASSO. Journal of Business & Economic Statistics, 25, 347–355.
33. Wang, H., Li, B., Leng, C. (2009). Shrinkage tuning parameter selection with a diverging number of parameters. Journal of the Royal Statistical Society, Series B, 71, 671–683.
34. Yatchew, A. (1997). An elementary estimator of the partial linear model. Economics Letters, 57, 135–143.
35. Zhang, H. H., Lu, W. (2007). Adaptive LASSO for Cox's proportional hazards model. Biometrika, 94, 691–703.
36. Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101, 1418–1429.
37. Zou, H., Zhang, H. H. (2009). On the adaptive elastic-net with a diverging number of parameters. Annals of Statistics, 37, 1733–1751.