Abstract
We study joint nonparametric estimators of the mean and the dispersion functions in extended double exponential family models. The starting point is the exponential family and the generalized linear model setting. The extended models allow for overdispersion, underdispersion, or a combination of both. We estimate the mean function and the dispersion function simultaneously, using P-splines with a difference-type penalty to avoid overfitting. Special attention is given to smoothing parameter selection and to implementation issues. The performance of the method is investigated via simulations and compared with that of other available methods. We provide applications to several data sets, including continuous data, counts and proportions.
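To make the P-spline ingredient concrete, the following is a minimal sketch and not the joint mean-dispersion estimator studied in the paper. It assumes a Gaussian response with constant dispersion, so that the penalized likelihood reduces to penalized least squares, and it uses a B-spline basis on equally spaced knots with a difference penalty on adjacent coefficients. The function names `bspline_basis` and `pspline_fit`, the smoothing parameter `lam`, and all default values are illustrative assumptions.

```python
import numpy as np


def bspline_basis(x, xl, xr, n_seg=20, degree=3):
    """B-spline basis on equally spaced knots via the Cox-de Boor recursion."""
    x = np.asarray(x, dtype=float)
    dx = (xr - xl) / n_seg
    # Knots extend `degree` intervals beyond each end of [xl, xr].
    knots = xl + dx * np.arange(-degree, n_seg + degree + 1)
    K = len(knots)
    # Degree-0 basis: indicators of the half-open knot intervals.
    B = ((x[:, None] >= knots[None, :-1]) & (x[:, None] < knots[None, 1:])).astype(float)
    for d in range(1, degree + 1):
        left = (x[:, None] - knots[None, :K - d - 1]) / (d * dx) * B[:, :-1]
        right = (knots[None, d + 1:] - x[:, None]) / (d * dx) * B[:, 1:]
        B = left + right
    return B  # shape (len(x), n_seg + degree)


def pspline_fit(x, y, lam=1.0, n_seg=20, degree=3, diff_order=2):
    """Penalized least squares: minimize ||y - B a||^2 + lam * ||D a||^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    B = bspline_basis(x, x.min(), x.max(), n_seg, degree)
    # D takes differences of order `diff_order` of the coefficient vector.
    D = np.diff(np.eye(B.shape[1]), n=diff_order, axis=0)
    coef = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return B @ coef, coef


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 200)
    truth = np.sin(2 * np.pi * x)
    y = truth + rng.normal(scale=0.3, size=x.size)
    fitted, _ = pspline_fit(x, y, lam=1.0)
    print("MSE of raw data:", np.mean((y - truth) ** 2))
    print("MSE of P-spline fit:", np.mean((fitted - truth) ** 2))
```

Larger values of `lam` pull the coefficients towards a polynomial of degree `diff_order - 1`, which is how the penalty guards against overfitting. In the generalized linear model setting the same penalty would be added to a log-likelihood and the system solved iteratively, and choosing `lam` from the data is the smoothing parameter selection problem mentioned above.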
Cite this article
Gijbels, I., Prosdocimi, I. & Claeskens, G. Nonparametric estimation of mean and dispersion functions in extended generalized linear models. TEST 19, 580–608 (2010). https://doi.org/10.1007/s11749-010-0187-1
Keywords
- Double exponential family
- Extended quasi-likelihood
- Nonparametric regression
- Overdispersion
- P-splines
- Variance estimation
- Underdispersion