Abstract
One popular method for fitting a regression function is regularization: minimizing an objective function that enforces a roughness penalty in addition to coherence with the data. This is the case when formulating penalized likelihood regression for exponential families. Most smoothing methods employ quadratic penalties, which lead to linear estimates and are in general incapable of recovering discontinuities or other important attributes of the regression function. In contrast, non-linear estimates are generally more accurate. In this paper, we focus on non-parametric penalized likelihood regression methods using splines and a variety of non-quadratic penalties, pointing out common basic principles. We present an asymptotic analysis of convergence rates that justifies the approach. We report on a simulation study that includes comparisons between our method and some existing ones. We illustrate our approach with an application to Poisson non-parametric regression modeling of frequency counts of reported acquired immune deficiency syndrome (AIDS) cases in the UK.
Cite this article
Antoniadis, A., Gijbels, I. & Nikolova, M. Penalized likelihood regression for generalized linear models with non-quadratic penalties. Ann Inst Stat Math 63, 585–615 (2011). https://doi.org/10.1007/s10463-009-0242-4
DOI: https://doi.org/10.1007/s10463-009-0242-4