Penalized likelihood regression for generalized linear models with non-quadratic penalties

Antoniadis, Anestis; Gijbels, Irène; Nikolova, Mila

doi:10.1007/s10463-009-0242-4

Published: 09 June 2009

Volume 63, pages 585–615 (2011)
Annals of the Institute of Statistical Mathematics

Anestis Antoniadis¹,
Irène Gijbels² &
Mila Nikolova³

Abstract

One popular method for fitting a regression function is regularization: minimizing an objective function that enforces a roughness penalty in addition to coherence with the data. This is the case when formulating penalized likelihood regression for exponential families. Most smoothing methods employ quadratic penalties, which lead to linear estimates and are in general incapable of recovering discontinuities or other important attributes of the regression function. In contrast, non-linear estimates are generally more accurate. In this paper, we focus on non-parametric penalized likelihood regression methods using splines and a variety of non-quadratic penalties, pointing out common basic principles. We present an asymptotic analysis of convergence rates that justifies the approach. We report on a simulation study that includes comparisons between our method and some existing ones. We illustrate our approach with an application to Poisson non-parametric regression modeling of frequency counts of reported acquired immune deficiency syndrome (AIDS) cases in the UK.
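
The contrast between quadratic and non-quadratic penalties can be made concrete with a small sketch. It is not the authors' estimator: it assumes one coefficient per observation instead of a spline basis, a Poisson model with log link, and a penalty on first differences of the coefficients. The function names, the toy data, and the particular bounded penalty t²/(1+t²) are illustrative choices, not taken from the article.

```python
import math

def poisson_neg_loglik(y, eta):
    """Negative Poisson log-likelihood with log link (mean = exp(eta)),
    dropping the log(y!) term, which does not depend on eta."""
    return sum(math.exp(e) - yi * e for yi, e in zip(y, eta))

def quadratic(t):
    """Quadratic penalty: cost grows with the square of a jump."""
    return t * t

def bounded_nonconvex(t):
    """Illustrative non-quadratic penalty: roughly t^2 for small t,
    but bounded by 1, so a single large jump is cheap."""
    return t * t / (1.0 + t * t)

def roughness(theta, psi):
    """Sum of psi over first differences of the coefficient vector."""
    return sum(psi(b - a) for a, b in zip(theta, theta[1:]))

def objective(theta, y, lam, psi):
    """Penalized objective: data fidelity plus lam times roughness.
    Assumption: the linear predictor equals theta (no spline basis)."""
    return poisson_neg_loglik(y, theta) + lam * roughness(theta, psi)

# Two candidate log-mean curves with the same end points (hypothetical values).
step = [0.0, 0.0, 0.0, 3.0, 3.0, 3.0]   # one sharp discontinuity
ramp = [0.0, 0.6, 1.2, 1.8, 2.4, 3.0]   # the same rise spread out evenly

# Toy counts that jump between the third and fourth observation.
counts = [1, 1, 1, 20, 20, 20]

for name, psi in [("quadratic", quadratic), ("bounded", bounded_nonconvex)]:
    print(name,
          "step:", roughness(step, psi),
          "ramp:", roughness(ramp, psi),
          "objective(step):", objective(step, counts, 1.0, psi),
          "objective(ramp):", objective(ramp, counts, 1.0, psi))
```

Under these assumptions the quadratic penalty charges the step 9 and the ramp about 1.8, so it favors smoothing the jump away, whereas the bounded penalty charges the step 0.9 and the ramp about 1.32, so the discontinuity is the cheaper shape.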

Author information

Authors and Affiliations

Laboratoire Jean Kuntzmann, Department de Statistique, Université Joseph Fourier, Tour IRMA, B.P. 53, 38041, Grenoble Cedex 9, France
Anestis Antoniadis
Department of Mathematics, Leuven Statistics Research Centre (LStat), Katholieke Universiteit Leuven, Celestijnenlaan 200B, Box 2400, 3001, Leuven, Belgium
Irène Gijbels
Centre de Mathématiques et de Leurs Applications, CNRS-ENS de Cachan, PRES UniverSud, 61 av. du Président Wilson, 94235, Cachan Cedex, France
Mila Nikolova

Corresponding author

Correspondence to Irène Gijbels.

About this article

Cite this article

Antoniadis, A., Gijbels, I. & Nikolova, M. Penalized likelihood regression for generalized linear models with non-quadratic penalties. Ann Inst Stat Math 63, 585–615 (2011). https://doi.org/10.1007/s10463-009-0242-4

Received: 27 March 2008
Revised: 08 January 2009
Published: 09 June 2009
Issue Date: June 2011
DOI: https://doi.org/10.1007/s10463-009-0242-4
