Abstract
This work addresses model estimation and computation for discrete data when some covariates are modeled smoothly using splines. We propose to introduce and explicitly estimate individual deviance effects (one for each observation), constrained by a ridge penalty. This turns out to be an effective way to absorb excess variation in the model and to detect systematic patterns. Fitting requires solving large but very sparse systems of penalized likelihood equations. We present fast and compact algorithms for fitting, estimation, and computation of the effective dimension. Applications to count, binomial, and survival data illustrate the practical use of this model.
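The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, and every specific choice here is an assumption made for illustration: Poisson counts with a log link, a degree-1 B-spline (hat function) basis with a second-order difference penalty for the smooth term, fixed penalty weights `lam` and `kappa`, and the hypothetical function names `fit_ide`, `hat_basis`, `solve`, `weighted_gram`, and `linear_part`. Because each individual effect `gamma[i]` touches only one observation, its block of the penalized system is diagonal and can be eliminated in closed form, leaving only a small system for the spline coefficients. The effective dimension is taken as the trace of the hat matrix of the final weighted ridge step, which is one common definition.

```python
import math

def hat_basis(x, knots):
    """Degree-1 B-spline (hat function) basis on equally spaced knots."""
    h = knots[1] - knots[0]
    return [[max(0.0, 1.0 - abs(xi - k) / h) for k in knots] for xi in x]

def solve(A, b):
    """Solve a small dense system A x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def weighted_gram(B, w):
    """Return B' diag(w) B."""
    m = len(B[0])
    return [[sum(wi * row[r] * row[c] for wi, row in zip(w, B)) for c in range(m)]
            for r in range(m)]

def linear_part(B, a):
    """Return the smooth part of the linear predictor, B a."""
    return [sum(bj * aj for bj, aj in zip(row, a)) for row in B]

def fit_ide(x, y, n_knots=6, lam=1.0, kappa=1.0, n_iter=100, tol=1e-10):
    """Penalized IRLS for Poisson counts with log(mu_i) = B_i a + gamma_i."""
    n, m = len(y), n_knots
    lo, hi = min(x), max(x)
    knots = [lo + (hi - lo) * j / (m - 1) for j in range(m)]
    B = hat_basis(x, knots)
    # Second-order difference penalty lam * D'D on the spline coefficients.
    P = [[0.0] * m for _ in range(m)]
    for j in range(m - 2):
        d = ((j, 1.0), (j + 1, -2.0), (j + 2, 1.0))
        for r, dr in d:
            for c, dc in d:
                P[r][c] += lam * dr * dc

    def reduced_system(mu):
        # Eliminating gamma leaves weights v_i = kappa * w_i / (w_i + kappa), w_i = mu_i.
        v = [kappa * mi / (mi + kappa) for mi in mu]
        G = [[g + p for g, p in zip(gr, pr)] for gr, pr in zip(weighted_gram(B, v), P)]
        return v, G

    a = [math.log(sum(y) / n)] * m  # start from a constant fit
    gamma = [0.0] * n               # individual effects, one per observation
    for _ in range(n_iter):
        smooth = linear_part(B, a)
        mu = [math.exp(s + g) for s, g in zip(smooth, gamma)]
        # Working response of the IRLS step.
        z = [s + g + (yi - mi) / mi for s, g, yi, mi in zip(smooth, gamma, y, mu)]
        v, G = reduced_system(mu)
        rhs = [sum(vi * row[r] * zi for vi, row, zi in zip(v, B, z)) for r in range(m)]
        a_new = solve(G, rhs)
        s_new = linear_part(B, a_new)
        # Closed-form ridge update of the individual effects.
        g_new = [mi / (mi + kappa) * (zi - si) for mi, zi, si in zip(mu, z, s_new)]
        change = max(abs(p - q) for p, q in zip(a_new + g_new, a + gamma))
        a, gamma = a_new, g_new
        if change < tol:
            break

    mu = [math.exp(s + g) for s, g in zip(linear_part(B, a), gamma)]
    # Effective dimension: trace of the hat matrix at the final weights.
    _, G = reduced_system(mu)
    M2 = weighted_gram(B, [mi * kappa ** 2 / (mi + kappa) ** 2 for mi in mu])
    ed = sum(mi / (mi + kappa) for mi in mu)
    for c in range(m):
        ed += solve(G, [M2[r][c] for r in range(m)])[c]
    return {"a": a, "gamma": gamma, "mu": mu, "ed": ed}
```

For example, `fit_ide(list(range(20)), counts, kappa=1.0)` returns the spline coefficients, one effect per observation, the fitted means, and the effective dimension. A larger `kappa` shrinks the individual effects toward zero, while a smaller one lets them absorb more of the variation around the smooth trend. The dense solver is used only to keep the sketch dependency-free; it is no substitute for the sparse algorithms the abstract refers to.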
Acknowledgments
The authors would like to thank the two referees for their helpful suggestions and comments on the original manuscript.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Perperoglou, A., Eilers, P.H.C. Penalized regression with individual deviance effects. Comput Stat 25, 341–361 (2010). https://doi.org/10.1007/s00180-009-0180-x
DOI: https://doi.org/10.1007/s00180-009-0180-x