Abstract
Poisson and negative binomial distributions are frequently used to fit count data. A limitation of the Poisson distribution is that the mean and the variance are assumed to be equal, an assumption that is unrealistic in many practical applications. The negative binomial distribution is more commonly used in cases of overdispersion, given that its variance is greater than its mean. The two-parameter double Poisson distribution introduced by Efron may be considered a useful alternative to the Poisson and negative binomial distributions, given that it can account for both overdispersion and underdispersion. In this article, we obtain maximum likelihood and Bayesian estimates for the double Poisson distribution. We also extend the proposed methodology to the situation in which there is an excess of zeros in a sample. Applications of the double Poisson distribution are considered using simulated and real data sets.
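As an illustration of how the double Poisson distribution accommodates both kinds of dispersion, the following minimal Python sketch evaluates Efron's density with mean parameter mu and dispersion parameter theta, and normalizes it by direct summation over a truncated support. The function names, the truncation point y_max, and the choice of numerical normalization are assumptions made for this illustration; they are not taken from the article, whose estimation procedures are not reproduced here.

```python
import math


def dp_log_kernel(y, mu, theta):
    """Log of Efron's unnormalized double Poisson density at count y.

    Kernel: theta^(1/2) * exp(-theta*mu) * (exp(-y) * y^y / y!) * (e*mu/y)^(theta*y)
    """
    base = 0.5 * math.log(theta) - theta * mu
    if y == 0:
        # All terms in y vanish, using the convention 0 * log(0) = 0.
        return base
    return (base - y + y * math.log(y) - math.lgamma(y + 1)
            + theta * y * (1.0 + math.log(mu) - math.log(y)))


def dp_pmf(mu, theta, y_max=500):
    """Normalized probabilities on 0..y_max (illustrative truncation)."""
    kernel = [math.exp(dp_log_kernel(y, mu, theta)) for y in range(y_max + 1)]
    total = sum(kernel)
    return [k / total for k in kernel]


def mean_var(pmf):
    """Mean and variance of a pmf given as a list indexed by the count."""
    m = sum(y * p for y, p in enumerate(pmf))
    v = sum((y - m) ** 2 * p for y, p in enumerate(pmf))
    return m, v


if __name__ == "__main__":
    for theta in (0.5, 1.0, 2.0):
        m, v = mean_var(dp_pmf(mu=5.0, theta=theta))
        print(f"theta={theta}: mean={m:.3f}, variance={v:.3f}")
```

With theta = 1 the kernel reduces exactly to the Poisson probability function, so the mean and variance coincide. Values of theta below 1 give a variance above the mean (overdispersion) and values above 1 give a variance below the mean (underdispersion); the variance is approximately mu/theta.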
References
Cameron, A. C., and P. Johansson. 1997. Count data regression using series expansions: With applications. Journal of Applied Econometrics 12:203–23. doi:https://doi.org/10.1002/(ISSN)1099-1255.
Cameron, A. C., and P. K. Trivedi. 2013. Regression analysis of count data. 2nd ed. New York: Cambridge University Press.
Chang, H. Y., C. M. Suchindran, and W. H. Pan. 2001. Using the overdispersed exponential family to estimate the distribution of usual daily intakes of people aged between 18 and 28 in Taiwan. Statistics in Medicine 20:2337–50. doi:https://doi.org/10.1002/sim.838.
Chow, N. T., and D. Steenhard. 2009. A flexible count data regression model using SAS* PROC NLMIXED. SAS Global Forum: Statistics and Data Analysis, p. 1–14.
Conde, D. M., L. Costa-Paiva, E. Z. Martinez, and A. M. Pinto-Neto. 2012. Low bone mineral density in middle-aged breast cancer survivors: Prevalence and associated factors. Breast Care 7:121–25. doi:https://doi.org/10.1159/000337763.
Conway, R. W., and W. L. Maxwell. 1962. A queuing model with state dependent service rates. Journal of Industrial Engineering 12:132–36.
Efron, B. 1985. Double exponential families and their use in generalized linear regression. Technical Report no. 107, Stanford University, Department of Statistics. https://statistics.stanford.edu/sites/default/files/BIO107.pdf
Efron, B. 1986. Double exponential families and their use in generalized linear regression. Journal of the American Statistical Association 81:709–21. doi:https://doi.org/10.1080/01621459.1986.10478327.
El-Shaarawi, A. H., S. R. Esterby, and B. J. Dutka. 1981. Bacterial density in water determined by Poisson or negative binomial distributions. Applied and Environmental Microbiology 41:107–16.
Gardner, W., E. P. Mulvey, and E. C. Shaw. 1995. Regression analyses of counts and rates: Poisson, overdispersed Poisson, and negative binomial models. Psychological Bulletin 118:392–404.
Gelman, A., and J. B. Carlin. 2013. Bayesian data analysis. 3rd ed. New York: Chapman and Hall.
Ghosh, S. K., P. Mukhopadhyay, and J. C. Lu. 2006. Bayesian analysis of zero-inflated regression models. Journal of Statistical Planning and Inference 136:1360–75. doi:https://doi.org/10.1016/j.jspi.2004.10.008.
Henningsen, A., and O. Toomet. 2011. maxLik: A package for maximum likelihood estimation in R. Computational Statistics 26:443–58. doi:https://doi.org/10.1007/s00180-010-0217-1.
Hosmer, D. W., Jr., S. Lemeshow, and R. X. Sturdivant. 2013. Applied logistic regression. 3rd ed. Chichester: Wiley.
Lambert, D. 1992. Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34:1–14. doi:https://doi.org/10.2307/1269547.
Millar, R. B. 2011. Maximum likelihood estimation and inference: With examples in R, SAS and ADMB, Vol. 111. Chichester: John Wiley & Sons.
Nelder, J. A., and Y. Lee. 1992. Likelihood, quasi-likelihood and pseudolikelihood: Some comparisons. Journal of the Royal Statistical Society, Series B 54:273–84.
Pradhan, N. C., and P. Leung. 2006. A Poisson and negative binomial regression model of sea turtle interactions in Hawaii’s longline fishery. Fisheries Research 78:309–22. doi:https://doi.org/10.1016/j.fishres.2005.12.013.
Sellers, K. F., and G. Shmueli. 2010. A flexible regression model for count data. Annals of Applied Statistics 4:943–61. doi:https://doi.org/10.1214/09-AOAS306.
Spiegelhalter, D. J., N. G. Best, B. P. Carlin, and A. van der Linde. 2002. Bayesian measures of model complexity and fit, (with discussion and rejoinder). Journal of the Royal Statistical Society, Series B 64:583–639. doi:https://doi.org/10.1111/1467-9868.00353.
Ver Hoef, J. M., and P. L. Boveng. 2007. Quasi-poisson vs. negative binomial regression: How should we model overdispersed count data? Ecology 88:2766–72.
Zou, Y., S. R. Geedipally, and D. Lord. 2013. Evaluating the double Poisson generalized linear model. Accident Analysis & Prevention 59:497–505. doi:https://doi.org/10.1016/j.aap.2013.07.017.
Funding
This work was supported by the Conselho Nacional de Desenvolvimento Científico e Tecnológico (307767/2015-9).
Cite this article
Aragon, D.C., Achcar, J.A. & Martinez, E.Z. Maximum likelihood and Bayesian estimators for the double Poisson distribution. J Stat Theory Pract 12, 886–911 (2018). https://doi.org/10.1080/15598608.2018.1489919
DOI: https://doi.org/10.1080/15598608.2018.1489919
Keywords
- Maximum likelihood estimator
- Bayes estimator
- Double Poisson distribution
- Count data
- Markov-chain Monte Carlo (MCMC)