Abstract
In biomedical research, data generated by a count process often contain an ‘excess’ of zeros (e.g. geographical incidence rates, hospital death rates). Strategies exist for analysing such data, but some can be biased when the underlying data generation process is not carefully considered. The problem can be exacerbated when the data are also multilevel, since hierarchical extensions to zero-inflated models do not always satisfy the underlying model assumptions. We therefore review zero-inflated modelling strategies for single-level data and show why standard Poisson and binomial zero-inflated models (i.e. where one latent class has a central location of zero) require class membership to be predicted by covariates in the standard regression part of the model. We also introduce generic mixture models and reveal limitations in their interpretation in a number of circumstances. For nested or hierarchical count data with an excess of zeros, the upper-level distributional assumptions of standard multilevel models may not hold, so alternative strategies are required; in Chap. 7 we introduce and illustrate the semi-parametric multilevel model as a solution to this problem.
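As a rough illustration of what an ‘excess’ of zeros means, the sketch below evaluates the textbook zero-inflated Poisson probability mass function, in which a latent class located at zero is mixed with an ordinary Poisson class. The names `pi` (probability of belonging to the zero class) and `lam` (Poisson mean), and the values 0.3 and 2.0, are assumptions chosen for illustration and are not taken from the chapter. Class membership is held constant here for simplicity, whereas the abstract concerns models in which it is predicted by covariates.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k) for an ordinary Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zip_pmf(k: int, pi: float, lam: float) -> float:
    """P(Y = k) for a zero-inflated Poisson mixture.

    With probability pi the observation belongs to the class that is
    always zero; with probability 1 - pi it is drawn from Poisson(lam).
    """
    poisson_part = (1.0 - pi) * poisson_pmf(k, lam)
    # Zeros can arise from either class; positive counts only from the Poisson class.
    return pi + poisson_part if k == 0 else poisson_part


# Illustrative (assumed) parameter values, not estimates from any data set.
pi, lam = 0.3, 2.0

p_zero_poisson = poisson_pmf(0, lam)   # zeros expected under a plain Poisson
p_zero_zip = zip_pmf(0, pi, lam)       # zeros expected under the mixture
zip_mean = (1.0 - pi) * lam            # overall mean of the mixture

print(f"P(Y=0), Poisson: {p_zero_poisson:.3f}")
print(f"P(Y=0), ZIP:     {p_zero_zip:.3f}")
print(f"ZIP mean:        {zip_mean:.3f}")
```

Comparing the two zero probabilities shows how far the observed share of zeros can exceed what a single Poisson distribution with the same `lam` would predict.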
Copyright information
© 2012 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Gilthorpe, M.S., Frydenberg, M., Cheng, Y., Baelum, V. (2012). Modelling Data That Exhibit an Excess Number of Zeros: Zero-Inflated Models and Generic Mixture Models. In: Tu, YK., Greenwood, D. (eds) Modern Methods for Epidemiology. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-3024-3_6
DOI: https://doi.org/10.1007/978-94-007-3024-3_6
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-3023-6
Online ISBN: 978-94-007-3024-3
eBook Packages: Biomedical and Life Sciences (R0)