Mean and Variance for Count Regression Models Based on Reparameterized Distributions

Sankhya B

Abstract

We introduce a new regression model for count data in which the response variable belongs mainly to the class of inflated-parameter generalized power series (IGPS) distributions, which automatically account for both dispersion and zero inflation. We use an original parameterization of these distributions, indexed by the mean and the variance, two parameters that are in general not functionally linked to each other. An advantage of our approach is that the regression coefficients can be interpreted directly in terms of the mean and the variance, in contrast, for instance, to the popular generalized linear models. The methodology is simple and applies to many models. Some new mathematical and practical properties of the IGPS distributions are studied, including the quantile function and the dispersion and zero-inflation indexes. Three basic IGPS models, namely the geometric, Bernoulli and Poisson cases, are investigated in detail. For the corresponding count regression models, the model parameters are estimated by maximum likelihood, and simulation studies are conducted to evaluate the finite-sample performance of the estimators. Finally, we highlight the ability of some reparameterized IGPS regression models to handle count data that are overdispersed and zero-inflated, and we compare them with standard models such as the zero-inflated Poisson and the negative binomial, which are likewise reparameterized in terms of the mean and the variance.
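As an illustration of what reparameterizing a count distribution by its mean and variance involves, the following minimal sketch uses the textbook zero-inflated Poisson (ZIP) model, with inflation probability pi and Poisson rate lam, so that the mean is mu = (1 - pi) * lam and the variance is sigma2 = mu * (1 + pi * lam). Inverting these two identities gives lam = mu + sigma2 / mu - 1 and pi = 1 - mu / lam, valid when sigma2 > mu. This is not taken from the article, whose own parameterization of the IGPS family and of the comparison models may differ; the function names are hypothetical.

```python
import math


def zip_params_from_moments(mu, sigma2):
    """Map (mean, variance) to the usual ZIP parameters (pi, lam).

    Assumes the standard ZIP model: P(Y=0) = pi + (1-pi)*exp(-lam).
    Only overdispersed targets (sigma2 > mu > 0) are representable.
    """
    if mu <= 0:
        raise ValueError("mean must be positive")
    if sigma2 <= mu:
        raise ValueError("ZIP requires variance > mean")
    lam = mu + sigma2 / mu - 1.0  # from sigma2 = mu * (1 + lam - mu)
    pi = 1.0 - mu / lam           # from mu = (1 - pi) * lam
    return pi, lam


def zip_pmf(y, mu, sigma2):
    """ZIP probability mass at y, indexed by mean and variance."""
    pi, lam = zip_params_from_moments(mu, sigma2)
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    return pi * (1.0 if y == 0 else 0.0) + (1.0 - pi) * poisson


def zip_moments(mu, sigma2, upper=500):
    """Numerically recover mean and variance by summing the pmf."""
    probs = [zip_pmf(y, mu, sigma2) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


mean, var = zip_moments(2.0, 5.0)
print(mean, var)
```

With the mean and variance as the indexing parameters, a regression model can link each of them to covariates separately, which is what makes coefficients directly interpretable on those two scales.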

Acknowledgements

We sincerely thank the Associate Editor and two anonymous referees for their valuable comments. This work was performed while the third author was at the LmB of the Université de Franche-Comté (UBFC) as a visiting professor, partly funded by FeProMath of UBFC. The LmB receives support from the EIPHI Graduate School (contract ANR-17-EURE-0002). Marcelo Bourguignon gratefully acknowledges partial financial support of the Brazilian agency Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq: grant 304140/2021-0).

Author information

Corresponding author

Correspondence to Marcelo Bourguignon.

Ethics declarations

Disclosure statement

No potential conflict of interest was reported by the author(s).

About this article

Cite this article

Kokonendji, C.C., de Medeiros, R.M.R. & Bourguignon, M. Mean and Variance for Count Regression Models Based on Reparameterized Distributions. Sankhya B 86, 280–310 (2024). https://doi.org/10.1007/s13571-024-00325-z

  • DOI: https://doi.org/10.1007/s13571-024-00325-z