Jeffreys Prior for Negative Binomial and Zero Inflated Negative Binomial Distributions

Abstract

The negative binomial distribution adequately fits many real datasets, for example RNA sequence data. When the data contain many zeros, it is customary to fit a zero inflated negative binomial distribution instead. In this note, we study the effect of assuming the Jeffreys prior on the parameters of these two distributions. Under this prior, we derive a closed form expression for the Bayes factor of the zero inflated negative binomial against the negative binomial distribution. We demonstrate the effectiveness of our findings through simulations and real data analyses.


Acknowledgements

The authors thank the editor and the anonymous referees for their careful review and constructive comments, which substantially improved this article.

Funding

None received.

Author information

Corresponding author

Correspondence to Arnab Kumar Maity.

Ethics declarations

Conflict of Interests

None declared.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Lemma 2.1.

$$ \begin{array}{@{}rcl@{}} f_{0}(y|\lambda, p) &=& (1 + \lambda/\tau)^{-\tau} I(y = 0) + f_{0}(y|\lambda) \\ &=& (1 + \lambda/\tau)^{-\tau} I(y = 0) + (1 - (1 + \lambda/\tau)^{-\tau}) \frac{f_{0}(y|\lambda)}{(1 - (1 + \lambda/\tau)^{-\tau})} \\ &=& (1 + \lambda/\tau)^{-\tau} I(y = 0) + (1 - (1 + \lambda/\tau)^{-\tau}) f^{T}(y|\lambda) \\ &=& f_{0}^{*}(y|\lambda, p^{\ast}) , y = 0, 1, 2, \ldots. \end{array} $$

Proof of Lemma 2.2.

$$ \begin{array}{@{}rcl@{}} f_{1}(y|\lambda, p) &=& \{p + (1-p) (1+ \lambda/\tau)^{-\tau} \} I(y=0) + (1-p) f_{0}(y|\lambda) \\ &=& \{p + (1-p) (1+ \lambda/\tau)^{-\tau} \} I(y=0) \\&&+ (1-p) \{1 - (1+ \lambda/\tau)^{-\tau}\} f^{T}(y|\lambda) \\ &=& \{p + (1-p) (1+ \lambda/\tau )^{-\tau}\} I(y=0) \\&&+ \{(1-p) - (1-p)(1 + \lambda/\tau)^{-\tau}\} f^{T}(y|\lambda) \\ &=&\{p + (1 - p) (1 + \lambda/\tau)^{-\tau}\} I(y=0) \\&&+ [1- \{p + (1-p)(1+ \lambda/\tau)^{-\tau}\}] f^{T}(y|\lambda) \\ &=& p^{\ast} I(y=0) + (1-p^{\ast}) f^{T}(y|\lambda), y = 0, 1, 2, {\ldots} \\ &=&f_{1}^{*}(y|\lambda, p^{\ast}) , y = 0, 1, 2, \ldots. \end{array} $$
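
The reparameterization in Lemmas 2.1 and 2.2 can be checked numerically. The following is a minimal Python sketch, assuming the negative binomial is parameterized by its mean λ and size τ, so that the probability of zero is (1 + λ/τ)^{-τ} as in the proofs above. The explicit form of the pmf and all function names are assumptions made for illustration; they are not taken from the article. Setting p = 0 recovers the negative binomial case of Lemma 2.1.

```python
import math


def nb_pmf(y, lam, tau):
    """Negative binomial pmf with mean lam and size tau (assumed parameterization)."""
    log_coef = math.lgamma(y + tau) - math.lgamma(tau) - math.lgamma(y + 1)
    log_prob = tau * math.log(tau / (lam + tau)) + y * math.log(lam / (lam + tau))
    return math.exp(log_coef + log_prob)


def truncated_nb_pmf(y, lam, tau):
    """Zero-truncated negative binomial pmf f^T, supported on y = 1, 2, ..."""
    if y == 0:
        return 0.0
    prob_zero = (1 + lam / tau) ** (-tau)
    return nb_pmf(y, lam, tau) / (1 - prob_zero)


def zinb_pmf(y, lam, tau, p):
    """Zero inflated pmf f_1: point mass p at zero mixed with the negative binomial."""
    return p * (y == 0) + (1 - p) * nb_pmf(y, lam, tau)


def zinb_pmf_reparam(y, lam, tau, p):
    """Same pmf written as p* I(y = 0) + (1 - p*) f^T(y | lam), as in Lemma 2.2."""
    prob_zero = (1 + lam / tau) ** (-tau)
    p_star = p + (1 - p) * prob_zero
    return p_star * (y == 0) + (1 - p_star) * truncated_nb_pmf(y, lam, tau)


# Illustrative parameter values (hypothetical, not from the article).
lam, tau, p = 2.5, 1.7, 0.3
max_gap = max(
    abs(zinb_pmf(y, lam, tau, p) - zinb_pmf_reparam(y, lam, tau, p))
    for y in range(50)
)
max_gap_nb = max(
    abs(nb_pmf(y, lam, tau) - zinb_pmf_reparam(y, lam, tau, 0.0))
    for y in range(50)
)
```

Both gaps should be at the level of floating point rounding, which is what the two lemmas assert.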

Proof of Theorem 2.2.

$$ \begin{array}{@{}rcl@{}} p(\boldsymbol{y}|\mathcal{M}_{1}) &=& \int f_{1}(y|\lambda, p) \pi_{1}(\lambda, p) dp d\lambda \\ &=& {\int}_{\lambda} {\int}_{p} C_{0}(y, \tau, n) \{p + (1-p) (1+ \lambda/\tau)^{-\tau}\}^{k} (1-p)^{n-k} \\&&\{\tau/(\lambda+\tau) \}^{(n-k)\tau} \{\lambda/(\lambda+\tau) \}^{s} \\ && \sqrt{\frac{\tau}{\lambda(\lambda+\tau)} } I(0 < p \leq 1) dp d\lambda \\ &=& C_{0}(y, \tau, n) {\int}_{0}^{\infty} {{\int}_{0}^{1}} \{p + (1-p) (1+ \lambda/\tau)^{-\tau}\}^{k} (1-p)^{n-k} \\&&\{\tau/(\lambda+\tau) \}^{(n-k)\tau} \{\lambda/(\lambda+\tau) \}^{s} \\ && \sqrt{\frac{\tau}{\lambda(\lambda+\tau)} } dp d\lambda \\ &=& C_{0}(y, \tau, n) {\int}_{0}^{\infty} \{\tau/(\lambda+\tau) \}^{(n-k)\tau} \{\lambda/(\lambda+\tau) \}^{s} \sqrt{\frac{\tau}{\lambda(\lambda+\tau)} } \\ && {{\int}_{0}^{1}} \{p + (1-p) (1+ \lambda/\tau)^{-\tau}\}^{k} (1-p)^{n-k} dp d\lambda \\ &=& C_{0}(y, \tau, n) {\int}_{0}^{\infty} \tau^{(n-k)\tau + 1/2} \lambda^{s- 1/2} (\lambda+\tau)^{-s- 1/2-(n-k)\tau} \\ && {\sum}_{j = 0}^{k} \frac{k!}{j!(k-j)!} \{\tau/(\lambda+\tau)\}^{(k-j)\tau} {{\int}_{0}^{1}} p^{j} (1-p)^{k-j} \\&&(1-p)^{n-k} dp d\lambda \\ &=& C_{0}(y, \tau, n) {\sum}_{j = 0}^{k} \frac{k!}{j!(k-j)!} {\int}_{0}^{\infty} \tau^{(n-k)\tau + (k-j)\tau + \frac{1}{2}} \lambda^{s- 1/2}\\&& (\lambda+\tau)^{-s-\frac{1}{2}-(n-k)\tau - (k-j)\tau} \\ && {{\int}_{0}^{1}} p^{j} (1-p)^{n-j} dp d\lambda \\ &=& C_{0}(y, \tau, n) {\sum}_{j = 0}^{k} \frac{k!}{j!(k-j)!} \tau^{(n-j)\tau + \frac{1}{2}} {\int}_{0}^{\infty} \lambda^{s- 1/2}\\&& (\lambda+\tau)^{-s-\frac{1}{2}-(n-j)\tau} \frac{{\varGamma}(j+1) {\varGamma}(n-j+1)}{{\varGamma}(n+2)} d\lambda \\ &=& C_{0}(y, \tau, n) {\sum}_{j = 0}^{k} \frac{k!}{j!(k-j)!} \frac{j! (n-j)!}{(n+1)!} \tau^{(n-j)\tau + 1/2} \\ && {\int}_{0}^{\infty} \lambda^{s- 1/2} (\lambda+\tau)^{-s-1/2-(n-j)\tau} d\lambda \\ &=& C_{0}(y, \tau, n) \frac{k!}{(n+1)!} {\sum}_{j = 0}^{k} \frac{(n-j)!}{(k-j)!} \tau^{(n-j)\tau + 1/2} {\int}_{0}^{\infty} \lambda^{s- 1/2}\\&& (\lambda+\tau)^{-s- 1/2-(n-j)\tau} d\lambda . \end{array} $$
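
The step that makes this proof work is the inner integral over p: the binomial expansion of {p + (1-p)q}^k, with q = (1 + λ/τ)^{-τ} held fixed, turns it into a finite sum of Beta integrals, each equal to j!(n-j)!/(n+1)!. The following Python sketch checks that identity against a simple midpoint rule. The function names and the test values are hypothetical, chosen only for illustration.

```python
import math


def inner_integral_closed_form(k, n, q):
    """Sum over j of C(k, j) q^(k-j) j! (n-j)! / (n+1)!, from the binomial expansion."""
    total = 0.0
    for j in range(k + 1):
        beta_integral = math.factorial(j) * math.factorial(n - j) / math.factorial(n + 1)
        total += math.comb(k, j) * q ** (k - j) * beta_integral
    return total


def inner_integral_midpoint(k, n, q, steps=20000):
    """Midpoint rule for the integral of {p + (1-p) q}^k (1-p)^(n-k) over p in (0, 1)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h
        total += (p + (1 - p) * q) ** k * (1 - p) ** (n - k)
    return h * total


# Illustrative values: k zeros out of n observations, q = P(Y = 0) at a fixed lambda.
k, n, q = 3, 8, 0.4
closed = inner_integral_closed_form(k, n, q)
numeric = inner_integral_midpoint(k, n, q)
```

The two values should agree up to the quadrature error of the midpoint rule.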

Proof of Theorem 2.3.

$$ \begin{array}{@{}rcl@{}} B_{10} &=& \frac{p(\boldsymbol{y}|\mathcal{M}_{1})}{p(\boldsymbol{y}|\mathcal{M}_{0})} \\ &=& \frac{ C_{0}(y, \tau, n) \frac{k!}{(n+1)!} {\sum}_{j = 0}^{k} \frac{(n-j)!}{(k-j)!} \tau^{(n-j)\tau + 1/2} {\int}_{0}^{\infty} \lambda^{s- 1/2} (\lambda+\tau)^{-s- 1/2-(n-j)\tau} d\lambda}{C_{0}(y, \tau, n) \tau^{n\tau+ 1/2} {\int}_{0}^{\infty} (\lambda+\tau)^{-n\tau-s- 1/2} \lambda^{s- 1/2} d\lambda} \\ &=& \frac{k!}{(n+1)!} {\sum}_{j = 0}^{k} \frac{(n-j)!}{(k-j)!} \tau^{-j\tau} \frac{ {\int}_{0}^{\infty} \lambda^{s- 1/2} (\lambda+\tau)^{-s- 1/2-(n-j)\tau} d\lambda}{{\int}_{0}^{\infty} (\lambda+\tau)^{-n\tau-s- 1/2} \lambda^{s- 1/2} d\lambda} \\ &=& \frac{k!}{(n+1)!} {\sum}_{j = 0}^{k} \frac{(n-j)!}{(k-j)!} \tau^{-j\tau} \frac{I_{1}}{I_{2}} . \end{array} $$
(3)

The integrals in $B_{10}$ in (3) can be evaluated using the Beta function (http://mathworld.wolfram.com/BetaFunction.html), $\text{Beta}(m+1, n+1) = {\int}_{0}^{\infty} u^{m} (1+u)^{-m-n-2} du$, after the substitution $\lambda = \tau u$. The substitution contributes a power of $\tau$ to each integral. Hence,

$$ \begin{array}{@{}rcl@{}} I_{1} & = & {\int}_{0}^{\infty} \lambda^{s- 1/2} (\lambda+\tau)^{-s- 1/2-(n-j)\tau} d\lambda \\ & = & {\int}_{0}^{\infty} \lambda^{m_{1}^{*}} (\lambda+\tau)^{-m_{1}^{*}-n_{1}^{*}-2} d\lambda, \quad m_{1}^{*}= s- 1/2, n_{1}^{*}= (n-j)\tau - 1\\ & = & \tau^{-(n_{1}^{*}+1)} {\int}_{0}^{\infty} \frac{u^{m_{1}^{*}}}{(1+u)^{m_{1}^{*}+n_{1}^{*}+2} } du, \quad \lambda = \tau u \\ & = & \tau^{-(n-j)\tau} \text{Beta}(m_{1}^{*}+1, n_{1}^{*}+1). \\ I_{2} & = & {\int}_{0}^{\infty} \lambda^{s- 1/2} (\lambda+\tau)^{-s- 1/2-n\tau} d\lambda \\ & = & {\int}_{0}^{\infty} \lambda^{m_{2}^{*}} (\lambda+\tau)^{-m_{2}^{*}-n_{2}^{*}-2} d\lambda, \quad m_{2}^{*}= s- 1/2, n_{2}^{*}= n\tau - 1 \\ & = & \tau^{-(n_{2}^{*}+1)} {\int}_{0}^{\infty} \frac{u^{m_{2}^{*}}}{(1+u)^{m_{2}^{*}+n_{2}^{*}+2} } du, \quad \lambda = \tau u \\ & = & \tau^{-n\tau} \text{Beta}(m_{2}^{*}+1, n_{2}^{*}+1). \end{array} $$

The powers of $\tau$ cancel, since $\tau^{-j\tau} \tau^{-(n-j)\tau} / \tau^{-n\tau} = 1$. Hence, $B_{10}$ in (3) can be written as

$$ \begin{array}{@{}rcl@{}} B_{10} &=& \frac{k!}{(n+1)!} {\sum}_{j = 0}^{k} \frac{(n-j)!}{(k-j)!} \tau^{-j\tau} \frac{I_{1}}{I_{2}} \\ &=& \frac{k!}{(n+1)!} {\sum}_{j = 0}^{k} \frac{(n-j)!}{(k-j)!} \tau^{-j\tau} \frac{\tau^{-(n-j)\tau} \text{Beta}(m_{1}^{*}+1, n_{1}^{*}+1)}{\tau^{-n\tau} \text{Beta}(m_{2}^{*}+1, n_{2}^{*}+1)} \\ &=& \frac{k!}{(n+1)!} {\sum}_{j = 0}^{k} \frac{(n-j)!}{(k-j)!} \frac{\text{Beta}(m_{1}^{*}+1, n_{1}^{*}+1)}{\text{Beta}(m_{2}^{*}+1, n_{2}^{*}+1)} . \end{array} $$
(4)
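
The sum in (3) can be evaluated on the log scale. The following Python sketch assumes that y is a list of non-negative integer counts, that τ is known, and that n, k and s denote the sample size, the number of zeros and the sum of the counts; these meanings are inferred from the proofs above rather than stated in this appendix. It evaluates each integral through the substitution λ = τu, which gives ∫ λ^{a-1} (λ+τ)^{-a-b} dλ = τ^{-b} Beta(a, b) over (0, ∞), so the factor τ^{-jτ} I_1/I_2 in (3) reduces to a ratio of Beta functions. A trapezoid rule in t = log λ is included as an independent check of that identity. All function names are hypothetical.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def shifted_integral_closed_form(a, b, tau):
    """tau^(-b) Beta(a, b): integral of lam^(a-1) (lam+tau)^(-a-b) over (0, inf)."""
    return math.exp(-b * math.log(tau) + log_beta(a, b))


def shifted_integral_numeric(a, b, tau, t_max=60.0, h=0.01):
    """Trapezoid rule for the same integral after substituting lam = exp(t)."""
    steps = int(round(2 * t_max / h))
    total = 0.0
    for i in range(steps + 1):
        t = -t_max + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        # d(lam) = lam dt, so the integrand in t is lam^a (lam + tau)^(-a-b).
        total += weight * math.exp(a * t - (a + b) * math.log(math.exp(t) + tau))
    return h * total


def bayes_factor_10(y, tau):
    """Bayes factor of zero inflated NB against NB, evaluated from (3)."""
    n = len(y)
    k = sum(1 for value in y if value == 0)
    s = sum(y)
    if k == n:
        # With no positive counts the j = n term has (n - j) tau = 0 and diverges.
        raise ValueError("at least one positive count is required")
    log_front = math.lgamma(k + 1) - math.lgamma(n + 2)  # log of k! / (n+1)!
    total = 0.0
    for j in range(k + 1):
        log_term = (
            math.lgamma(n - j + 1) - math.lgamma(k - j + 1)  # (n-j)! / (k-j)!
            + log_beta(s + 0.5, (n - j) * tau)               # tau^(-j tau) I_1 / I_2 ...
            - log_beta(s + 0.5, n * tau)                     # ... as a Beta ratio
        )
        total += math.exp(log_front + log_term)
    return total


# Illustrative data (hypothetical counts, not from the article).
counts = [0, 0, 3, 1, 0, 5, 2]
b10 = bayes_factor_10(counts, tau=1.5)
```

When the data contain no zeros (k = 0), the sum has a single term and the function returns 1/(n+1), which is a convenient sanity check. Working with log-gamma avoids overflow in the factorials and Beta functions for larger n.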

About this article

Cite this article

Maity, A.K., Paul, E. Jeffreys Prior for Negative Binomial and Zero Inflated Negative Binomial Distributions. Sankhya A (2022). https://doi.org/10.1007/s13171-022-00286-3

  • DOI: https://doi.org/10.1007/s13171-022-00286-3

Keywords

  • Bayes factor
  • marginal likelihood
  • negative binomial distribution
  • Jeffreys prior
  • zero inflated negative binomial distribution

PACS Nos

  • Primary: 62XX; Secondary: 62-08