Modelling count data with overdispersion and spatial effects

  • Regular Article
  • Statistical Papers

Abstract

In this paper we consider Bayesian regression models for count data that allow for overdispersion. We account for unobserved heterogeneity in the data in two ways. First, we consider models more flexible than the common Poisson model, which accommodate overdispersion in different ways. In particular, we address the negative binomial and the generalized Poisson (GP) distributions, in which overdispersion is modelled by an additional parameter, and zero-inflated models, in which overdispersion is assumed to be caused by an excess of zeros. Second, we capture extra spatial variability in the data by adding correlated spatial random effects to the models. This approach allows for an underlying spatial dependency structure, which is modelled using a conditional autoregressive prior based on Pettitt et al. (Stat Comput 12(4):353–367, 2002). In an application, the presented models are used to analyse the number of invasive meningococcal disease cases in Germany in 2004. Models are compared using the deviance information criterion (DIC) suggested by Spiegelhalter et al. (J R Stat Soc B 64(4):583–640, 2002) and proper scoring rules; see, for example, Gneiting and Raftery (Technical Report no. 463, University of Washington, 2004). We observe a rather high degree of overdispersion in the data, which is captured best by the GP model when spatial effects are neglected. While adding spatial effects to the models that allow for overdispersion yields little or no improvement, spatial Poisson models with spatially correlated or uncorrelated random effects are preferred over all other models according to the considered criteria.
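
To illustrate how an additional parameter can model overdispersion, the following minimal sketch evaluates the generalized Poisson distribution in the form of Consul and Jain (1973), P(Y = y) = θ(θ + λy)^(y−1) exp(−θ − λy) / y!, and checks its mean and variance numerically. The parameterization (θ, λ), the function names and the truncation point `y_max` are assumptions made for this illustration; the article itself may use a different parameterization, and the sketch covers neither the regression nor the spatial part of the models.

```python
import math


def gp_logpmf(y, theta, lam):
    """Log-pmf of the generalized Poisson distribution (Consul-Jain form).

    Assumes theta > 0 and 0 <= lam < 1; lam = 0 recovers the Poisson case.
    """
    return (math.log(theta) + (y - 1) * math.log(theta + lam * y)
            - theta - lam * y - math.lgamma(y + 1))


def gp_moments(theta, lam, y_max=2000):
    """Total mass, mean and variance by summing over y = 0..y_max.

    y_max is an illustrative truncation point (hypothetical choice); it must
    be large enough that the neglected tail mass is negligible.
    """
    probs = [math.exp(gp_logpmf(y, theta, lam)) for y in range(y_max + 1)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var


# Theory: mean = theta / (1 - lam), variance = theta / (1 - lam)**3,
# so the variance exceeds the mean by the factor 1 / (1 - lam)**2.
total, mean, var = gp_moments(theta=2.0, lam=0.5)
print(total, mean, var, var / mean)
```

With λ = 0 the variance equals the mean, as for the Poisson distribution; as λ approaches 1 the ratio of variance to mean, 1/(1 − λ)², grows without bound, which is the sense in which the extra parameter captures overdispersion.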

References

  • Agarwal DK, Gelfand AE, Citron-Pousty S (2002) Zero-inflated models with application to spatial count data. Environ Ecol Stat 9:341–355

  • Angers JF, Biswas A (2003) A Bayesian analysis of zero-inflated generalized Poisson model. Comput Stat Data Anal 42:37–46

  • Banerjee S, Carlin B, Gelfand A (2004) Hierarchical modeling and analysis for spatial data. Chapman & Hall/CRC, New York

  • Besag J, Kooperberg C (1995) On conditional and intrinsic autoregressions. Biometrika 82:733–746

  • Brier G (1950) Verification of forecasts expressed in terms of probability. Mon Weather Rev 78(1):1–3

  • Consul P (1989) Generalized Poisson distributions: properties and applications. Marcel Dekker, New York

  • Consul P, Jain G (1973) A generalization of the Poisson distribution. Technometrics 15:791–799

  • Czado C, Prokopenko S (2004) Modeling transport mode decisions using hierarchical binary spatial regression models with cluster effects. Discussion paper 406, SFB 386 Statistische Analyse diskreter Strukturen. http://www.stat.uni-muenchen.de/sfb386/

  • Famoye F, Singh K (2003a) On inflated generalized Poisson regression models. Adv Appl Stat 3(2):145–158

  • Famoye F, Singh K (2003b) Zero inflated generalized Poisson regression model (submitted)

  • Gelman A, Carlin J, Stern H, Rubin D (2004) Bayesian data analysis, 2nd edn. Chapman & Hall/CRC, Boca Raton

  • Gilks W, Wild P (1992) Adaptive rejection sampling for Gibbs sampling. Appl Stat 41(2):337–348

  • Gilks W, Richardson S, Spiegelhalter D (1996) Markov Chain Monte Carlo in Practice. Chapman & Hall/CRC, Boca Raton

  • Gneiting T, Raftery AE (2004) Strictly proper scoring rules, prediction and estimation. Technical Report no. 463, Department of Statistics, University of Washington

  • Gschlößl S (2006) Hierarchical Bayesian spatial regression models with applications to non-life insurance. PhD thesis, Munich University of Technology

  • Han C, Carlin B (2001) Markov chain Monte Carlo methods for computing Bayes factors: a comparative review. J Am Stat Assoc 96:1122–1132

  • Hoeting J, Madigan D, Raftery A, Volinsky C (1999) Bayesian model averaging: a tutorial. Stat Sci 14(4):382–417

  • Jin X, Carlin B, Banerjee S (2005) Generalized hierarchical multivariate CAR models for areal data. Biometrics 61:950–961

  • Joe H, Zhu R (2005) Generalized Poisson distribution: the property of mixture of Poisson and comparison with Negative Binomial distribution. Biometric J 47:219–229

  • Kass R, Raftery A (1995) Bayes factors and model uncertainty. J Am Stat Assoc 90:773–795

  • Lambert D (1992) Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34(1):1–14

  • van der Linde A (2005) DIC in variable selection. Statistica Neerlandica 59(1):45–56

  • Pettitt A, Weir I, Hart A (2002) A conditional autoregressive Gaussian process for irregularly spaced multivariate data with application to modelling large sets of binary data. Stat Comput 12(4):353–367

  • Rodrigues J (2003) Bayesian analysis of zero-inflated distributions. Commun Stat 32(2):281–289

  • Spiegelhalter D, Best N, Carlin B, van der Linde A (2002) Bayesian measures of model complexity and fit. J R Stat Soc B 64(4):583–640

  • Sun D, Tsutakawa RK, Kim H, He Z (2000) Bayesian analysis of mortality rates with disease maps. Stat Med 19:2015–2035

  • Winkelmann R (2003) Econometric analysis of count data, 4th edn. Springer, Berlin Heidelberg, Germany

Author information

Corresponding author

Correspondence to Claudia Czado.

About this article

Cite this article

Gschlößl, S., Czado, C. Modelling count data with overdispersion and spatial effects. Statistical Papers 49, 531–552 (2008). https://doi.org/10.1007/s00362-006-0031-6

  • DOI: https://doi.org/10.1007/s00362-006-0031-6
