Abstract
In recent years, the availability of spatial count data has increased massively. Because over- or under-dispersion is ubiquitous in count data, we propose a Bayesian hierarchical modeling approach based on renewal theory, which relates nonexponential waiting times between events to the distribution of the counts and relaxes the assumption of equi-dispersion at the cost of one additional parameter. In particular, we extend the methodology for analyzing spatial count data under the assumption of gamma-distributed waiting times. The model can be formulated as a latent Gaussian model, so fast computation is possible with the integrated nested Laplace approximation method. The analysis of a groundwater quality dataset and a simulation study show a significant improvement over both Poisson and negative binomial models. Supplementary materials accompanying this paper appear online.
References
Bakka H, Rue H, Fuglstad G-A, Riebler A, Bolin D, Illian J, Krainski E, Simpson D, Lindgren F (2018) Spatial modeling with R-INLA: a review. Wiley Interdiscipl Rev Comput Stat 10(6):e1443
Bear J (1979) Hydraulics of groundwater. Dover Publications, New York
Breslow NE, Clayton DG (1993) Approximate inference in generalized linear mixed models. J Am Stat Assoc 88(421):9–25
Cameron A, Trivedi P (2013) Regression analysis of count data. Cambridge University Press, Cambridge
Cotruvo J, Fawell J, Giddings M, Jackson P, Magara Y, Festo Ngowi A, Ohanian E (2011) Background document for development of WHO guidelines for drinking-water quality. Technical report
Cox DR (1962) Renewal theory. Methuen, London
Cressie N (1993) Statistics for spatial data. Wiley, London
Dawid AP (1984) Present position and potential developments: some personal views: statistical theory: the prequential approach. J R Stat Soc Ser A (General) 147(2):278–290
Fahrmeir L, Kneib T, Lang S (2004) Penalized structured additive regression for space-time data: a Bayesian perspective. Stat Sinica 731–761
Fuglstad G-A, Simpson D, Lindgren F, Rue H (2019) Constructing priors that penalize the complexity of Gaussian random fields. J Am Stat Assoc 114(525):445–452
Gonzales-Barron U, Butler F (2011) Characterisation of within-batch and between-batch variability in microbial counts in foods using Poisson-gamma and Poisson-Lognormal regression models. Food Control 22:1268–1278
Hornberger GM, Wiberg PL, Raffensperger JP, D’Odorico P (2014) Elements of physical hydrology. JHU Press
Krainski E, Gómez-Rubio V, Bakka H, Lenzi A, Castro-Camilo D, Simpson D, Lindgren F, Rue H (2018) Advanced spatial modeling with stochastic partial differential equations using R and INLA. Chapman and Hall/CRC, London
Lewis B (2003) Small dams. CRC Press/Balkema, London
Lindgren F (2012) Continuous domain spatial models in R-INLA. ISBA Bull 19(4):14–20
Lindgren F, Rue H, Lindström J (2011) An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. J R Stat Soc Ser B (Stat Methodol) 73(4):423–498
McShane B, Adrian M, Bradlow ET, Fader PS (2008) Count models based on Weibull interarrival times. J Bus Econ Stat 26(3):369–378
Nelder JA, Wedderburn RW (1972) Generalized linear models. J R Stat Soc Ser A (General) 135(3):370–384
Pearson K, Henrici OMFE (1894) III. Contributions to the mathematical theory of evolution. Philos Trans R Soc Lond (A) 185:71–110
Pettit LI (1990) The conditional predictive ordinate for the normal distribution. J Roy Stat Soc: Ser B (Methodol) 52(1):175–184
Rapant S, Cvečková V, Fajčíková K, Sedláková D, Stehlíková B (2017) Impact of calcium and magnesium in groundwater and drinking water on the health of inhabitants of the Slovak republic. Int J Environ Res Public Health 14(3):278
Ridout MS, Besbeas P (2004) An empirical model for underdispersed count data. Stat Model 4(1):77–89
Rue H, Held L (2005) Gaussian Markov random fields: theory and applications. Chapman & Hall/CRC Press, London
Rue H, Martino S, Chopin N (2009) Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J Roy Stat Soc B 71:319–392
Rue H, Riebler A, Sørbye SH, Illian JB, Simpson DP, Lindgren FK (2017) Bayesian computing with INLA: a review. Ann Rev Stat Appl 4(1):395–421
Sellers KF, Shmueli G (2010) A flexible regression model for count data. Ann Appl Stat 943–961
Simpson D, Rue H, Riebler A, Martins TG, Sørbye SH (2017) Penalising model component complexity: a principled, practical approach to constructing priors. Stat Sci 32(1)
Sørbye SH, Illian JB, Simpson DP, Burslem D, Rue H (2019) Careful prior specification avoids incautious inference for log-Gaussian Cox point processes. J Roy Stat Soc: Ser C (Appl Stat) 68(3):543–564
Spiegelhalter DJ, Best N, Carlin BP, Linde A (2002) Bayesian measures of model complexity and fit (with discussion). J Roy Stat Soc B 64:1–34
Stein ML (1999) Interpolation of spatial data: some theory for kriging. Springer, Cham
Vesali Naseh MR, Noori R, Berndtsson R, Adamowski J, Sadatipour E (2018) Groundwater pollution sources apportionment in the Ghaen Plain, Iran. Int J Environ Res Public Health 15(1):172
Watanabe S (2012) A widely applicable Bayesian information criterion. J Mach Learn Res 14
Winkelmann R (1995) Duration dependence and dispersion in count-data models. J Bus Econ Stat 13(4):467–474
Winkelmann R (2013) Econometric analysis of count data. Springer, Berlin
Zeviani WM, Ribeiro PJ Jr, Bonat WH, Shimakura SE, Muniz JA (2014) The gamma-count distribution in the analysis of experimental underdispersed data. J Appl Stat 41(12):2616–2626
Zhu R, Joe H (2009) Modelling heavy-tailed count data using a generalised Poisson-inverse Gaussian family. Stat Probab Lett 79:1695–1703
Acknowledgements
We would like to thank the Associate Editor and a referee for their helpful comments and suggestions, which improved the paper. We also thank Professor Håvard Rue for adding the gamma-count (GC) model to the R-INLA package as a new family argument named “gammacount”.
Ethics declarations
Conflict of interest statement
The authors declare that there is no conflict of interest.
Appendices
Matérn covariance function
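An isotropic Matérn spatial covariance function is given by
\[ C(h)=\frac{\sigma ^2}{2^{\nu -1}\Gamma (\nu )}\,(\kappa \Vert h\Vert )^{\nu }K_{\nu }(\kappa \Vert h\Vert ), \]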
where \(\Vert h\Vert \) denotes the Euclidean distance between any two locations \(s,s^\prime \in \Re ^d\), \(h=s-s^\prime \), \(\Gamma (\cdot )\) is the gamma function, and \(K_{\nu }(\cdot )\) is the modified Bessel function of the second kind of order \(\nu \). For the Matérn covariance function, \(\sigma ^2\) is the marginal variance, and \(\nu \) measures the degree of smoothness, which is usually fixed. In the INLA-SPDE methodology, for \(d=2\), the smoothness is fixed at \(\nu =1\). Further, \(\kappa > 0\) is the scaling parameter, with an empirical range \(r=\sqrt{8\nu }/\kappa \) at which the spatial correlation is close to 0.1 for all \(\nu \) (Lindgren et al. 2011). For a fixed smoothness parameter \(\nu \), the larger the value of r, the stronger the spatial correlation at any given distance.
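To make the roles of \(\kappa \), \(\nu \), and the empirical range concrete, the following is a minimal Python sketch of the standard isotropic Matérn covariance, \(\sigma ^2/(2^{\nu -1}\Gamma (\nu ))\,(\kappa \Vert h\Vert )^{\nu }K_{\nu }(\kappa \Vert h\Vert )\), evaluated at the distance \(r=\sqrt{8\nu }/\kappa \). The function names `bessel_k`, `matern_cov`, and `empirical_range` are illustrative assumptions and are not part of R-INLA. The Bessel function is approximated by a plain trapezoid rule on its integral representation so that the snippet needs only the standard library; a dedicated routine such as `scipy.special.kv` would normally be preferred.

```python
import math


def bessel_k(nu, x, steps=4000):
    """Approximate K_nu(x), x > 0, via the integral representation
    K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt (trapezoid rule).
    Illustrative only, not a production-grade special-function routine."""
    if x <= 0:
        raise ValueError("x must be positive")
    # Past t_max the factor exp(-x cosh t) is far below machine precision.
    t_max = math.acosh(max(60.0 / x, 1.0)) + 1.0
    h = t_max / steps

    def integrand(t):
        return math.exp(-x * math.cosh(t)) * math.cosh(nu * t)

    total = 0.5 * (integrand(0.0) + integrand(t_max))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h


def matern_cov(dist, sigma2=1.0, kappa=1.0, nu=1.0):
    """Isotropic Matern covariance at Euclidean distance `dist`."""
    if dist == 0.0:
        return sigma2  # the covariance at zero distance is the marginal variance
    x = kappa * dist
    scale = sigma2 / (2.0 ** (nu - 1.0) * math.gamma(nu))
    return scale * x ** nu * bessel_k(nu, x)


def empirical_range(kappa, nu=1.0):
    """Empirical range r = sqrt(8 nu) / kappa."""
    return math.sqrt(8.0 * nu) / kappa


# Correlation (sigma2 = 1) at the empirical range for two smoothness values.
for nu in (0.5, 1.0):
    r = empirical_range(kappa=2.0, nu=nu)
    print(nu, r, matern_cov(r, kappa=2.0, nu=nu))
```

For \(\nu =1/2\) the Matérn correlation reduces to \(\exp (-\kappa \Vert h\Vert )\), so its value at the empirical range is \(e^{-2}\approx 0.135\), in line with the "close to 0.1" rule of thumb.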
Hazard function of gamma-distributed waiting times
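For a gamma-distributed waiting time \(\tau \) with shape parameter \(\alpha \), density \(f_{\tau }\), and distribution function \(F_{\tau }\), the hazard function is given by
\[ h_{\tau }(t)=\frac{f_{\tau }(t)}{1-F_{\tau }(t)},\quad t>0. \]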
It can be shown that \(h_{\tau }(t)\) is monotonically increasing for \(\alpha >1\), decreasing for \(\alpha <1\), and constant for \(\alpha =1\). Figure 6 shows the curve of \(h_{\tau }(t)\) for different parameter values.
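The monotonicity claim above can be checked numerically. The sketch below assumes a shape-rate parameterization of the gamma distribution, where the rate symbol \(\beta \) is an assumption introduced here for illustration. Dividing the survival function by the density and substituting \(s=t+u\) gives \(1/h_{\tau }(t)=\int _0^{\infty }e^{-\beta u}(1+u/t)^{\alpha -1}\,du\). The integrand decreases in t when \(\alpha >1\) and increases in t when \(\alpha <1\), which is one way to see the stated behavior. The function name `gamma_hazard` is hypothetical, and the integral is approximated with a simple truncated trapezoid rule.

```python
import math


def gamma_hazard(t, alpha, beta=1.0, steps=20000):
    """Hazard of a gamma(shape=alpha, rate=beta) waiting time at t > 0,
    computed from 1/h(t) = int_0^inf exp(-beta*u) * (1 + u/t)**(alpha-1) du.
    The rate `beta` and the truncation point are illustrative assumptions."""
    if t <= 0:
        raise ValueError("t must be positive")
    u_max = 60.0 / beta  # exp(-beta*u) is negligible beyond this point
    h = u_max / steps

    def integrand(u):
        return math.exp(-beta * u) * (1.0 + u / t) ** (alpha - 1.0)

    total = 0.5 * (integrand(0.0) + integrand(u_max))
    for i in range(1, steps):
        total += integrand(i * h)
    return 1.0 / (total * h)


# Hazard curves on a small grid for three shape values.
grid = [0.5, 1.0, 2.0, 4.0]
for alpha in (0.5, 1.0, 2.0):
    print(alpha, [round(gamma_hazard(t, alpha), 4) for t in grid])
```

With \(\alpha =1\) the integral equals \(1/\beta \), so the hazard is the constant \(\beta \) of the exponential case; with \(\alpha =2\) and \(\beta =1\) the integral is \(1+1/t\), giving \(h_{\tau }(t)=t/(1+t)\), which is increasing in t.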
About this article
Cite this article
Nadifar, M., Baghishani, H. & Fallah, A. A Flexible Generalized Poisson Likelihood for Spatial Counts Constructed by Renewal Theory, Motivated by Groundwater Quality Assessment. JABES 28, 726–748 (2023). https://doi.org/10.1007/s13253-023-00550-5
DOI: https://doi.org/10.1007/s13253-023-00550-5
Keywords
- Gamma-count distribution
- Integrated nested Laplace approximation
- Spatial count data
- Stochastic partial differential equations