Zero-Inflated Spatial Models: Application and Interpretation

  • L. M. Ainsworth
  • C. B. Dean
  • R. Joy
Conference paper
Part of the Lecture Notes in Statistics book series (LNS, volume 218)

Abstract

Many environmental applications, such as species abundance studies, rainfall monitoring, or tornado count reports, yield data with a preponderance of zero counts. Although standard statistical distributions may not fit these data, a large body of literature has been dedicated to methods for modeling zero-inflated data. One type of regression model for zero-inflated data is categorized as a mixture model. Mixture models postulate two types of zeros, represented using a latent variable, and model their probabilities separately. The latent classification of zeros may be of particular interest as it can provide important clues to physical characteristics associated with, for example, habitat suitability or resistance to disease or pest infestations. Different zero-inflated models can be developed depending on the biological and physical characteristics of the application at hand. Here, several zero-inflated spatial models are applied to a case study of spruce weevil (Pissodes strobi) infestations in a Sitka spruce tree plantation. The data illustrate the unique features distinguished by various models and show the importance of using expert knowledge to inform model structures that in turn provide insight into underlying biological processes driving the probability of belonging to the zero, resistant, component. For instance, one model focuses on individually resistant trees located among infested trees. Another focuses on clusters of resistant trees which are likely located in unsuitable habitats. We apply six models: a standard generalized linear model (GLM); an overdispersion model; a random effects zero-inflated model; a conditional autoregressive random effects model (CAR); a multivariate CAR (MCAR) model; and a model developed using discrete random effects to accommodate spatial outliers. We discuss the distinct features identified by the zero-inflated spatial models and make recommendations regarding their application in general.
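
As a rough illustration of the mixture idea described above, the following minimal sketch evaluates a generic zero-inflated binomial probability and the posterior probability that an observed zero belongs to the resistant class. It is not the authors' model: the function names, the parameters `p_resistant` and `q_attack`, and the interpretation of `n` as a number of observation occasions per tree are all assumptions made for illustration, and no covariates or spatial random effects are included.

```python
from math import comb


def zib_pmf(y, n, p_resistant, q_attack):
    """P(Y = y) under a simple zero-inflated binomial mixture (illustrative only).

    p_resistant: assumed probability of the latent "resistant" class (always zero).
    q_attack:    assumed per-occasion attack probability in the susceptible class.
    n:           assumed number of observation occasions.
    """
    binom = comb(n, y) * q_attack**y * (1 - q_attack) ** (n - y)
    susceptible = (1 - p_resistant) * binom
    # A zero can arise from either class; positive counts only from the susceptible class.
    return p_resistant + susceptible if y == 0 else susceptible


def prob_resistant_given_zero(n, p_resistant, q_attack):
    """Posterior probability that an observed zero comes from the resistant class."""
    return p_resistant / zib_pmf(0, n, p_resistant, q_attack)


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the functions.
    print(prob_resistant_given_zero(n=5, p_resistant=0.3, q_attack=0.2))
```

In this sketch the two kinds of zeros are separated only through Bayes' rule; the spatial models named in the abstract would additionally let the class probability vary across locations.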

Keywords

Autocovariate; Discrete mixture model; Hierarchical Bayesian model; Mixed binomial model; Spatial autocorrelation; Spruce weevil

Notes

Acknowledgements

We would like to thank the Natural Sciences and Engineering Research Council of Canada for research funding. We thank Erin Lundy and Alisha Albert-Green for their assistance with the literature review for this paper. We would also like to thank all those who have provided valuable feedback on this work: Giovani da Silva for his review and helpful comments, the ISS-2015 Symposium audience for their thoughtful questions, and the ISS reviewer and Proceedings editor for their useful comments and suggestions.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, Canada
  2. Department of Statistical and Actuarial Science, University of Western Ontario, London, Canada
  3. SMRU Consulting Ltd., Vancouver, Canada
