Abstract
In this paper, we introduce a flexible class of regression models for counts with a high inflation of zeros that cannot be adequately captured by the Poisson, zero-inflated Poisson, negative binomial, or Poisson-inverse Gaussian regression models. The proposed models are based on a class of zero-inflated mixed Poisson distributions that contains, among others, the zero-inflated negative binomial (ZINB) and the zero-inflated Poisson-inverse Gaussian (ZIPIG) distributions as particular cases. We consider regression structures for the mean, the dispersion, and the zero-inflation parameters. Consequently, we generalize existing models, such as the ZINB regression (with non-varying dispersion), and also open the possibility of introducing new models, such as the ZIPIG and the zero-inflated generalized hyperbolic secant regressions. We propose an Expectation-Maximization (EM) algorithm for estimating the parameters and obtain the associated information matrix. Simulation results are presented to compare the finite-sample performance of the proposed EM algorithm with a direct maximization of the log-likelihood function based on the GAMLSS approach. These results show some advantages of the EM algorithm over the GAMLSS approach. We also discuss a measure of influence based on the EM approach and propose simulated envelopes for checking the adequacy of our zero-inflated regression models. An empirical application to the number of roots produced by 270 micropropagated shoots of the columnar apple cultivar Trajan illustrates the usefulness of the proposed class of regression models for count data with a high inflation of zeros, and shows that numerical problems can make the GAMLSS approach unusable in some practical situations.
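To make the ZINB special case mentioned above concrete, the following is a minimal sketch of its probability mass function with a mean, a dispersion, and a zero-inflation parameter. It assumes the common mean-dispersion parameterization in which the negative binomial component has mean μ and variance μ + μ²/φ; the abstract does not state the parameterization, so this choice, the function name `zinb_pmf`, and the numerical values are illustrative assumptions rather than the authors' implementation.

```python
import math


def zinb_pmf(y, mu, phi, pi):
    """P(Y = y) for a zero-inflated negative binomial (illustrative sketch).

    Assumed parameterization (not taken from the paper):
      mu  > 0      mean of the negative binomial component
      phi > 0      dispersion; component variance is mu + mu**2 / phi
      0 <= pi < 1  probability of a structural (extra) zero
    """
    if y < 0:
        return 0.0
    # Negative binomial pmf computed on the log scale for numerical stability.
    log_nb = (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + phi * math.log(phi / (mu + phi))
        + y * math.log(mu / (mu + phi))
    )
    nb = math.exp(log_nb)
    # Mixture: a point mass at zero with weight pi, the count model otherwise.
    return (pi if y == 0 else 0.0) + (1.0 - pi) * nb


# Hypothetical parameter values, chosen only for illustration.
probs = [zinb_pmf(y, mu=2.0, phi=1.5, pi=0.3) for y in range(500)]
total = sum(probs)                              # should be close to 1
mean = sum(y * p for y, p in enumerate(probs))  # should be close to (1 - pi) * mu
```

In a regression setting, `mu`, `phi`, and `pi` would each be linked to covariates, which is what the regression structures for the mean, dispersion, and zero-inflation parameters refer to; the link functions are not specified in the abstract and are left out of this sketch.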
Acknowledgements
We would like to thank the referee for the useful comments and suggestions that led to a substantial improvement of the paper. We also acknowledge the financial support from CAPES (Brazil), CNPq (Brazil) and FAPEMIG (Brazil).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Gonçalves, J.N., Barreto-Souza, W. Flexible regression models for counts with high-inflation of zeros. METRON 78, 71–95 (2020). https://doi.org/10.1007/s40300-020-00163-9
DOI: https://doi.org/10.1007/s40300-020-00163-9