Computational Statistics, Volume 25, Issue 2, pp 341–361

Penalized regression with individual deviance effects

  • Aris Perperoglou
  • Paul H. C. Eilers
Open Access
Original Paper

Abstract

This work addresses model estimation and computation for discrete data when some covariates are modeled smoothly using splines. We propose to introduce and explicitly estimate individual deviance effects (one for each observation), constrained by a ridge penalty. This turns out to be an effective way to absorb excess variation in the model and to detect systematic patterns. Large but very sparse systems of penalized likelihood equations have to be solved. We present fast and compact algorithms for fitting, estimation, and computation of the effective dimension. Applications to count, binomial, and survival data illustrate the practical use of this model.
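
The idea of one ridge-penalized effect per observation can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a Poisson model with log link and a plain linear predictor (no spline terms), it uses dense matrices rather than the sparse systems the abstract mentions, and the function name, the penalty weight `kappa`, and the simulated data are all hypothetical. The sketch maximizes sum(y * eta - exp(eta)) - 0.5 * kappa * sum(gamma^2), where eta = X beta + gamma, by penalized iteratively reweighted least squares, and reports the effective dimension as the trace of the hat matrix.

```python
import numpy as np


def fit_poisson_with_individual_effects(X, y, kappa=1.0, n_iter=100, tol=1e-10):
    """Penalized IRLS for log(mu) = X @ beta + gamma, ridge penalty on gamma.

    Hypothetical helper for illustration only.
    Returns (beta, gamma, mu, effective_dimension).
    """
    n, p = X.shape
    C = np.hstack([X, np.eye(n)])      # combined design [X | I_n]
    P = np.zeros((p + n, p + n))
    P[p:, p:] = kappa * np.eye(n)      # penalize only the individual effects
    eta = np.log(y + 0.5)              # common IRLS starting value
    for _ in range(n_iter):
        mu = np.exp(eta)
        z = eta + (y - mu) / mu        # working response
        A = C.T @ (mu[:, None] * C) + P
        theta = np.linalg.solve(A, C.T @ (mu * z))
        eta_new = C @ theta
        converged = np.max(np.abs(eta_new - eta)) < tol
        eta = eta_new
        if converged:
            break
    mu = np.exp(eta)
    G = C.T @ (mu[:, None] * C)
    ed = np.trace(np.linalg.solve(G + P, G))   # trace of the hat matrix
    return theta[:p], theta[p:], mu, ed


# Simulated counts (assumed data, not taken from the paper).
rng = np.random.default_rng(0)
n = 30
x = np.linspace(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
y = rng.poisson(np.exp(1.0 + 0.8 * x))

kappa = 1.0
beta, gamma, mu, ed = fit_poisson_with_individual_effects(X, y, kappa=kappa)
print("beta:", beta)
print("effective dimension:", ed)
```

At the solution each individual effect satisfies kappa * gamma_i = y_i - mu_i, so a large `kappa` shrinks the effects toward zero and the fit approaches the ordinary Poisson regression, while a small `kappa` lets them absorb most of the residual variation.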

Keywords

Generalized linear models, Smoothing, Effective dimension, Penalized regression

Notes

Acknowledgments

The authors would like to thank the two referees for their helpful suggestions and comments on the original manuscript.

Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Copyright information

© The Author(s) 2009

Authors and Affiliations

  1. Department of Statistics and Actuarial Financial Mathematics, University of the Aegean, Samos, Greece
  2. Department of Biostatistics, Erasmus Medical Center, Rotterdam, The Netherlands
