Penalized likelihood regression for generalized linear models with non-quadratic penalties

Annals of the Institute of Statistical Mathematics

Abstract

A popular method for fitting a regression function is regularization: minimizing an objective function that enforces a roughness penalty in addition to coherence with the data. This is the case when formulating penalized likelihood regression for exponential families. Most smoothing methods employ quadratic penalties, which lead to linear estimates and are in general incapable of recovering discontinuities or other important attributes of the regression function. In contrast, non-linear estimates are generally more accurate. In this paper, we focus on non-parametric penalized likelihood regression methods using splines and a variety of non-quadratic penalties, pointing out common basic principles. We present an asymptotic analysis of convergence rates that justifies the approach. We report on a simulation study that includes comparisons between our method and some existing ones. We illustrate our approach with an application to Poisson non-parametric regression modeling of frequency counts of reported acquired immune deficiency syndrome (AIDS) cases in the UK.
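As a rough illustration of the kind of objective described in the abstract, the following Python sketch combines a Poisson negative log-likelihood with a roughness penalty. It is not the authors' estimator: applying the penalty to first differences of the linear predictor (rather than to spline coefficients), the log link, and the names `poisson_nll`, `roughness`, `objective`, `step`, and `ramp` are all assumptions made for illustration. The sketch only shows why a quadratic penalty discourages a sharp jump more than a gradual ramp, while an absolute-value penalty does not.

```python
import math


def poisson_nll(y, eta):
    """Poisson negative log-likelihood with log link (mu = exp(eta)),
    dropping the log(y!) term, which does not depend on eta."""
    return sum(math.exp(e) - yi * e for yi, e in zip(y, eta))


def roughness(eta, psi):
    """Sum of psi over first differences of the linear predictor.
    Penalizing first differences is an illustrative assumption."""
    return sum(psi(b - a) for a, b in zip(eta, eta[1:]))


def objective(y, eta, lam, psi):
    """Penalized objective: data fit plus lam times the roughness penalty."""
    return poisson_nll(y, eta) + lam * roughness(eta, psi)


def quadratic(t):
    return t * t


def absolute(t):
    return abs(t)


# Two hypothetical candidate predictors with the same total rise of 1:
step = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]   # one sharp jump
ramp = [0.0, 0.0, 0.25, 0.5, 0.75, 1.0, 1.0]  # rise spread over four steps

print(roughness(step, quadratic), roughness(ramp, quadratic))  # 1.0 0.25
print(roughness(step, absolute), roughness(ramp, absolute))    # 1.0 1.0
```

Under the quadratic penalty the jump costs four times as much as the ramp, so a fit is pushed toward smoothing the discontinuity away. Under the absolute-value penalty both candidates cost the same, so the data-fit term alone decides between them.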

Corresponding author

Correspondence to Irène Gijbels.

About this article

Cite this article

Antoniadis, A., Gijbels, I. & Nikolova, M. Penalized likelihood regression for generalized linear models with non-quadratic penalties. Ann Inst Stat Math 63, 585–615 (2011). https://doi.org/10.1007/s10463-009-0242-4

  • DOI: https://doi.org/10.1007/s10463-009-0242-4
