Abstract
Non-monotonic rate sequences of pure birth processes have recently attracted much attention in the analysis of count data because they can produce a combination of over-, under-, and equidispersed distributions without reusing covariates, as traditional methods do. They also permit the modeling of excess counts, a frequent issue when using count models based on monotonic rate sequences such as the Poisson, gamma, Weibull, Conway-Maxwell-Poisson (CMP), and Faddy (1997) models. Matrix-exponential approaches have always been used to compute the probabilities of count models based on pure birth processes, although no algorithm has been proposed specifically for these models. Evaluating the pure birth probabilities numerically from their analytic form is impractical because severe numerical cancellation may occur. We circumvent this difficulty by exploiting a Taylor series expansion, from which a new analytic form is derived. We develop a simple algorithm that implements the new formula efficiently and conduct numerical experiments to study its efficiency and accuracy. The results indicate that the new approach is faster and more accurate than the matrix-exponential methods.
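To make the computational problem concrete, the following is a minimal Python sketch of two standard routes to the pure birth probabilities. It is not the algorithm proposed in this article. The first function uses uniformization (Jensen 1953), one common way of evaluating the action of a matrix exponential, as a stand-in for the matrix-exponential family. The second evaluates the classical partial-fraction closed form for distinct rates, whose alternating sum is the kind of expression that can suffer from cancellation. The function names, the truncation tolerance, and the example rates are illustrative assumptions.

```python
import math


def birth_probs_uniformization(rates, t, tol=1e-12, max_terms=10_000):
    """P(N(t) = n), n = 0..len(rates)-1, for a pure birth process started at 0.

    rates[n] is the birth rate out of state n. Mass that moves beyond the
    last tracked state is dropped, so the result may sum to less than 1.
    Assumes max(rates) * t is moderate, so exp(-q * t) does not underflow.
    """
    n_states = len(rates)
    q = max(rates)                          # uniformization rate
    v = [1.0] + [0.0] * (n_states - 1)      # state distribution after k jumps
    if q <= 0.0 or t <= 0.0:
        return v
    probs = [0.0] * n_states
    weight = math.exp(-q * t)               # Poisson(q t) pmf at k = 0
    covered = 0.0                           # Poisson mass accumulated so far
    for k in range(max_terms):
        for n in range(n_states):
            probs[n] += weight * v[n]
        covered += weight
        if covered >= 1.0 - tol:
            break
        # One step of the uniformized chain: move up w.p. rate/q, else stay.
        nxt = [0.0] * n_states
        for n in range(n_states):
            up = rates[n] / q
            nxt[n] += v[n] * (1.0 - up)
            if n + 1 < n_states:
                nxt[n + 1] += v[n] * up
        v = nxt
        weight *= q * t / (k + 1)           # Poisson pmf recurrence
    return probs


def birth_probs_partial_fractions(rates, t):
    """Classical closed form; requires pairwise distinct rates.

    P_n(t) = (prod_{i<n} rates[i]) * sum_{j<=n} exp(-rates[j] t)
             / prod_{k<=n, k!=j} (rates[k] - rates[j]).
    The terms alternate in sign, so digits are lost when rates are close.
    """
    probs = []
    for n in range(len(rates)):
        lead = math.prod(rates[:n])
        total = 0.0
        for j in range(n + 1):
            denom = math.prod(
                rates[k] - rates[j] for k in range(n + 1) if k != j
            )
            total += math.exp(-rates[j] * t) / denom
        probs.append(lead * total)
    return probs


if __name__ == "__main__":
    example_rates = [1.0, 2.0, 3.5, 5.0, 7.0]   # non-constant, well separated
    print(birth_probs_uniformization(example_rates, 0.7))
    print(birth_probs_partial_fractions(example_rates, 0.7))
```

With well-separated rates, as in the example, the two routes should agree closely. When rates are nearly equal the denominators in the closed form become small and the alternating terms grow large, whereas uniformization only adds non-negative terms, at the cost of more arithmetic.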
References
Ball F (1995) A note on variation in birth processes. Math Sci 20:50–55
Banks WD, Martin G (2013) Optimal primitive sets with restricted primes. Integers 13:A69
Barreto-Souza W, Simas AB (2016) General mixed Poisson regression models with varying dispersion. Stat Comput 26(6):1263–1280
Bartlett MS (1978) An introduction to stochastic processes. Cambridge University Press, Cambridge
Bourguignon M, de Medeiros RMR (2022) A simple and useful regression model for fitting count data. TEST 31(3):790–827
Chanialidis C, Evers L, Neocleous T et al (2018) Efficient Bayesian inference for COM-Poisson regression models. Stat Comput 28:595–608
Cheney EW, Kincaid DR (2004) Numerical mathematics and computing, 5th edn. Brooks/Cole Publishing Company, Belmont
Conway RW, Maxwell WL (1962) A queuing model with state dependent service rates. J Ind Eng 12(2):132–136
Cox DR, Miller H (1965) The theory of stochastic processes. Chapman and Hall, London
Crawford FW, Ho LST, Suchard MA (2018) Computational methods for birth-death processes. WIREs Comput Stat 10(2):1–22
Daniels HE (1982) The saddlepoint approximation for a general birth process. J Appl Probab 19(1):20–28
Eddelbuettel D, Francois R, Allaire J, et al (2023) Rcpp: seamless R and C++ integration. https://CRAN.R-project.org/package=Rcpp, R package version 1.0.10
Faddy MJ (1994) On variation in Poisson processes. Math Sci 19(1):47–51
Faddy MJ (1997) Extended Poisson process modelling and analysis of count data. Biom J 39(4):431–440
Faddy MJ, Smith DM (2008) Extended Poisson process modelling of dilution series data. J Roy Stat Soc: Ser C (Appl Stat) 57(4):461–471
Faddy MJ, Smith DM (2011) Analysis of count data with covariate dependence in both mean and variance. J Appl Stat 38(12):2683–2694
Faddy MJ, Smith DM (2012) Extended Poisson process modelling and analysis of grouped binary data. Biom J 54(3):426–435
Forthmann B, Gühne D, Doebler P (2020) Revisiting dispersion in count data item response theory models: The Conway-Maxwell-Poisson counts model. Br J Math Stat Psychol 73:32–50
Goulet V, Dutang C, Maechler M, et al (2022) expm: matrix exponential, log, ‘etc’. https://CRAN.R-project.org/package=expm, R package version 0.999-7
Guikema SD, Goffelt JP (2008) A flexible count data regression model for risk analysis. Risk Anal Int J 28(1):213–223
Higham NJ (2005) The scaling and squaring method for the matrix exponential revisited. SIAM J Matrix Anal Appl 26(4):1179–1193
Higham NJ (2008) Functions of matrices: theory and computation. Society for Industrial and Applied Mathematics, Philadelphia
Higham NJ (2009) The scaling and squaring method for the matrix exponential revisited. SIAM Rev 51(4):747–764
Huang J, Zhu F (2021) A new first-order integer-valued autoregressive model with Bell innovations. Entropy 23(6):713
Jensen A (1953) Markoff chains as an aid in the study of Markoff processes. Scand Actuar J 1953(sup1):87–91
Jung RC, Ronning G, Tremayne AR (2005) Estimation in conditional first order autoregression with discrete support. Stat Pap 46(2):195–224
Kharrat T, Boshnakov GN (2022) Countr: flexible univariate count models based on renewal processes. https://CRAN.R-project.org/package=Countr, R package version 3.5.6
Kharrat T, Boshnakov GN, McHale I et al (2019) Flexible regression models for count data based on renewal processes. J Stat Softw 90(13):1–35
Maechler M, Maechler MM, MPFR S, et al (2024) Package ‘Rmpfr’. https://CRAN.R-project.org/package=Rmpfr, R package version 0.9-5
MATLAB (2017) version 9.2.0.538062 (R2017a). The MathWorks, Inc., Natick, Massachusetts, United States
McShane B, Adrian M, Bradlow ET et al (2008) Count models based on Weibull interarrival times. J Bus Econ Stat 26(3):369–378
Minka T, Shmueli G, Kadane J, et al (2003) Computing with the COM-Poisson distribution. Technical report CMU Statistics Department 776
Moler C, Van Loan C (2003) Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Rev 45(1):3–49
Morales-Otero M, Núñez-Antón V (2021) Comparing Bayesian spatial conditional overdispersion and the Besag-York-Mollié models: application to infant mortality rates. Mathematics 9(3):282
Peluso A, Vinciotti V, Yu K (2019) Discrete Weibull generalized additive model: an application to count fertility data. J Roy Stat Soc Ser C (Appl Stat) 68(3):565–583
Philipson P, Huang A (2023) A fast look-up method for Bayesian mean-parameterised Conway-Maxwell-Poisson regression models. Stat Comput 33(81):1–16
Podlich HM, Faddy MJ, Smyth GK (2004) Semi-parametric extended Poisson process models for count data. Stat Comput 14:311–321
R Core Team (2022) R: a language and environment for statistical computing. R foundation for statistical computing, Vienna, Austria, https://www.R-project.org/
Renshaw E (2011) Stochastic population processes: analysis, approximations, simulations. Oxford University Press, Oxford
Rovenţa I, Temereancă LE (2019) A note on the positivity of the even degree complete homogeneous symmetric polynomials. Mediterr J Math 16(1):1–16
Saez-Castillo AJ, Conde-Sanchez A (2013) A hyper-Poisson regression model for overdispersed and underdispersed count data. Comput Stat Data Anal 61:148–157
Sellers KF, Shmueli G (2009) A regression model for count data with observation-level dispersion. In: Paper presented at the 24th international workshop on statistical modelling, Cornell University, New York, Accessed 20–24 July 2009
Sellers KF, Shmueli G (2010) A flexible regression model for count data. Ann Appl Stat 4(2):943–961
Sellers KF, Lotze T, Raim AM (2019) COMPoissonReg: Conway-Maxwell Poisson (COM-Poisson) regression. https://CRAN.R-project.org/package=COMPoissonReg, R package version 0.7.0
Skulpakdee W, Hunkrajok M (2022a) 3: a sue-Poisson INARCH(1) model. In: Paper presented at the 6th international conference on compute and data analysis, Shanghai, China. Accessed 25–27 February 2022
Skulpakdee W, Hunkrajok M (2022b) Unusual-event processes for count data. SORT Stat Oper Res Trans 46(1):39–66
Smith DM, Faddy MJ (2016) Mean and variance modelling of under- and overdispersed count data. J Stat Softw 69(6):1–23
Smith DM, Faddy MJ (2018) CountsEPPM: Mean and variance modeling of count data. https://CRAN.R-project.org/package=CountsEPPM, R package version 3.0
Smith DM, Faddy MJ (2019a) BinaryEPPM: mean and variance modeling of binary data. https://CRAN.R-project.org/package=BinaryEPPM, R package version 2.3
Smith DM, Faddy MJ (2019b) Mean and variance modeling of under-dispersed and over-dispersed grouped binary data. J Stat Softw 90(8):1–20
Smyth GK, Podlich HM (2002) An improved saddlepoint approximation based on the negative binomial distribution for the general birth process. Comput Stat 17(1):17–28
Watkins DS (2010) Fundamentals of matrix computations, 3rd edn. Wiley, New Jersey
Weiß CH (2018) An introduction to discrete-valued time series, 1st edn. Wiley, New Jersey
Weiß CH, Zhu F, Hoshiyar A (2022) Softplus INGARCH models. Stat Sin 32(2):1099–1120
Winkelmann R (1995) Duration dependence and dispersion in count-data models. J Bus Econ Stat 13(4):467–474
Acknowledgements
The authors thank the associate editor and anonymous reviewers for their valuable comments and suggestions.
About this article
Cite this article
Hunkrajok, M., Skulpakdee, W. A simple algorithm for computing the probabilities of count models based on pure birth processes. Comput Stat (2024). https://doi.org/10.1007/s00180-024-01491-4
DOI: https://doi.org/10.1007/s00180-024-01491-4