Abstract
We introduce a new algorithm to compute the difference between values of the \(\log \Gamma\)-function at nearby points, where \(\Gamma\) denotes Euler’s gamma function. As a consequence, we obtain a way of computing the Dirichlet-multinomial log-likelihood function that is more accurate, has better computational complexity, and has a wider range of application than previously known methods.
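To make concrete which quantity is at stake, the following minimal sketch evaluates the Dirichlet-multinomial log-likelihood (without the multinomial coefficient) in two elementary ways. It is not the algorithm introduced in this paper: the function names `dm_loglik_naive` and `dm_loglik_sum` are hypothetical, and the sketch assumes the standard parametrisation with non-negative integer counts \(x_k\) and positive parameters \(\alpha_k\). The first routine subtracts `lgamma` values directly, which can lose accuracy when the two arguments are large and close; the second uses the identity \(\log \Gamma (u+x) - \log \Gamma (u) = \sum _{j=0}^{x-1} \log (u+j)\) for integer \(x \ge 0\), which avoids that cancellation but costs time proportional to the counts.

```python
import math


def dm_loglik_naive(counts, alphas):
    """Log-likelihood via direct differences of lgamma values.

    Computes log G(A) - log G(A + N) + sum_k [log G(a_k + x_k) - log G(a_k)],
    where A = sum(alphas), N = sum(counts) and G is the gamma function.
    """
    n = sum(counts)
    a = sum(alphas)
    total = math.lgamma(a) - math.lgamma(a + n)
    for x, al in zip(counts, alphas):
        total += math.lgamma(al + x) - math.lgamma(al)
    return total


def dm_loglik_sum(counts, alphas):
    """Same quantity, with each lgamma difference expanded as a sum of logs.

    Valid only for non-negative integer counts; cost grows with the counts.
    """
    n = sum(counts)
    a = sum(alphas)
    total = -sum(math.log(a + j) for j in range(n))
    for x, al in zip(counts, alphas):
        total += sum(math.log(al + j) for j in range(x))
    return total


if __name__ == "__main__":
    counts = [3, 0, 2]
    alphas = [0.5, 1.2, 2.0]
    print(dm_loglik_naive(counts, alphas))
    print(dm_loglik_sum(counts, alphas))
```

For moderate arguments the two routines agree to rounding error; they are meant only as a baseline against which a faster and more accurate method can be compared.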
Notes
In the literature the digamma and polygamma functions are usually denoted by \(\psi (u)\) and \(\psi ^{(\ell )}(u)\) respectively, but in this context \(\psi\) has a different meaning.
See the previous footnote.
In fact, Pari/GP does not have a direct implementation of \(\digamma ^{(\ell )}\), \(\ell \in \mathbb {N}^*\), but these functions can be obtained from the Hurwitz zeta function \(\zeta (s, u)\), \(s>1, u>0\), since \(\digamma ^{(\ell )} (u) = (-1)^{\ell -1} (\ell !)\zeta (\ell +1, u)\). So it is more precise to say that in Pari/GP the implementation of \(\zeta (2, u)\) does not seem to be optimised, at least for large values of u.
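The identity in the note above can be checked numerically with a short sketch. The helper `hurwitz_zeta` below is a hypothetical, deliberately simple approximation (a direct sum of the first terms plus a short Euler-Maclaurin tail); it is not how Pari/GP evaluates \(\zeta (s, u)\), and the truncation parameter `n_terms` is an assumption chosen for illustration.

```python
import math


def hurwitz_zeta(s, u, n_terms=20):
    """Approximate zeta(s, u) for s > 1, u > 0.

    Sums the first n_terms terms directly and estimates the remaining tail
    with the Euler-Maclaurin formula, using the Bernoulli numbers B_2, B_4, B_6.
    """
    head = sum((u + k) ** (-s) for k in range(n_terms))
    a = u + n_terms
    tail = a ** (1.0 - s) / (s - 1.0) + 0.5 * a ** (-s)
    # B_{2j} / (2j)! for j = 1, 2, 3
    coeffs = [1.0 / 12.0, -1.0 / 720.0, 1.0 / 30240.0]
    rising = s  # holds s (s+1) ... (s+2j-2)
    for j, c in enumerate(coeffs, start=1):
        tail += c * rising * a ** (-s - 2 * j + 1)
        rising *= (s + 2 * j - 1) * (s + 2 * j)
    return head + tail


def polygamma(ell, u):
    """Polygamma function of order ell >= 1 via the Hurwitz zeta identity."""
    if ell < 1:
        raise ValueError("ell must be a positive integer")
    sign = 1.0 if ell % 2 == 1 else -1.0  # equals (-1)^(ell - 1)
    return sign * math.factorial(ell) * hurwitz_zeta(ell + 1, u)


if __name__ == "__main__":
    # The trigamma function at 1 equals zeta(2) = pi^2 / 6.
    print(polygamma(1, 1.0), math.pi ** 2 / 6.0)
```

The case \(\ell = 1\) corresponds to \(\zeta (2, u)\), the function singled out in the note.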
References
Agresti A (2002) Categorical data analysis. Wiley series in probability and statistics, 2nd edn. Wiley, Hoboken
Barnes E (1900) The theory of the gamma function. Messenger Math. 29:64–128
Bouguila N (2008) Clustering of count data using generalized Dirichlet multinomial distributions. IEEE Trans. Knowl. Data Eng. 20:462–474
Bouneffouf D (2020) Computing the Dirichlet-multinomial log-likelihood function. http://arxiv.org/abs/2007.11967
Brown M et al (1993) Using Dirichlet mixture priors to derive hidden Markov models for protein families. In: Proceedings of the first international conference on intelligent systems for molecular biology. AAAI Press, pp 47–55
Cameron CA, Trivedi PK (2013) Regression analysis of count data. Econometric society monographs, 2nd edn. Cambridge University Press, Cambridge
Giudici P, Castelo R (2003) Improving Markov chain Monte Carlo model search for data mining. Mach. Learn. 50:127–158
Hermite MCh (1895) Sur la fonction \(\log \Gamma (a)\). J. Reine Angew. Math. 115:201–208
Hooper PM (2004) Dependent Dirichlet priors and optimal linear estimators for belief net parameters. In: Proceedings of the 20th conference on uncertainty in artificial intelligence. AUAI Press, pp 251–259
Leckenby JD, Kishi S (1984) The Dirichlet multinomial distribution as a magazine exposure model. J. Mark. Res. 21:100–106
MacKay DJC, Bauman Peto LC (1994) A hierarchical Dirichlet language model. Nat. Lang. Eng. 1:1–19
Madsen RE et al (2005) Modeling word burstiness using the Dirichlet distribution. In: Proceedings of the 22nd international conference on machine learning. ICML’05. ACM, New York, NY, pp 545–552
Mimno D, McCallum A (2008) Topic models conditioned on arbitrary features with Dirichlet-multinomial regression. In: Proceedings of the twenty-fourth conference on uncertainty in artificial intelligence (UAI2008), Helsinki, Finland
Sjölander K et al (1996) Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology. Comput. Appl. Biosci. 12:327–345
Stoer J, Bulirsch R (2002) Introduction to numerical analysis, 3rd edn. Springer, New York
The PARI Group (2022) Pari/GP version 2.13.4. Bordeaux. http://pari.math.u-bordeaux.fr
Winkelmann R (2008) Econometric analysis of count data, 5th edn. Springer, Berlin
Yee TW (2010) The VGAM package for categorical data analysis. J. Stat. Softw. 32:1–34
Yee TW (2022) VGAM: vector generalized linear and additive models. R package version 1.1-7. https://cran.r-project.org/web/packages/VGAM/VGAM.pdf
Yee TW, Wild CJ (1996) Vector generalized additive models. J. R. Stat. Soc. B 58:481–493
Yu P, Shaw C (2014) An efficient algorithm for accurate computation of the Dirichlet-multinomial log-likelihood function. Bioinformatics 30:1547–1554
Acknowledgements
The authors would like to thank Marco Vianello (Università di Padova) for suggesting one of the references, and the anonymous referees for their suggestions.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Languasco, A., Migliardi, M. On the fast computation of the Dirichlet-multinomial log-likelihood function. Comput Stat 38, 1995–2013 (2023). https://doi.org/10.1007/s00180-022-01311-7
DOI: https://doi.org/10.1007/s00180-022-01311-7