Abstract
In this paper, we prove a local limit theorem and a refined continuity correction for the negative binomial distribution. We present two applications of the results. First, we find the asymptotics of the median for a \(\textrm{NegativeBinomial}(r,p)\) random variable jittered by a \(\textrm{Uniform}(0,1)\), which answers a problem left open in Coeurjolly and Trépanier (Metrika 83(7):837–851, 2020). This is used to construct a simple, robust and consistent estimator of the parameter p, when \(r > 0\) is known. The case where r is unknown is also briefly covered. Second, we find an upper bound on the Le Cam distance between negative binomial and normal experiments.
Notes
Note that \(\{X\in B_{r,p}(1/2)\} \triangle \{K\in B_{r,p}(1/2)\} \subseteq \{K\in B_{r,p}(3/4)\}\) assuming that r is large enough, simply because \(|X - K| \le \frac{1}{2}\).
References
Abramowitz M, Stegun IA (1964) Handbook of mathematical functions with formulas, graphs, and mathematical tables. National Bureau of Standards Applied Mathematics Series, vol 55. U.S. Government Printing Office, Washington, DC. http://www.ams.org/mathscinet-getitem?mr=MR0167642
Adell JA, Alzer H (2009) Inequalities for the median of the gamma distribution. J Comput Appl Math 232(2):481–495
Adell JA, Jodrá P (2005) The median of the Poisson distribution. Metrika 61(3):337–346
Adell JA, Jodrá P (2005) Sharp estimates for the median of the \(\Gamma (n+1,1)\) distribution. Stat Probab Lett 71(2):185–191
Adell JA, Jodrá P (2008) On a Ramanujan equation connected with the median of the gamma distribution. Trans Am Math Soc 360(7):3631–3644
Alm SE (2003) Monotonicity of the difference between median and mean of gamma distributions and of a related Ramanujan sequence. Bernoulli 9(2):351–371
Alzer H (2005) Proof of the Chen–Rubin conjecture. Proc R Soc Edinb Sect A 135(4):677–688
Alzer H (2006) A convexity property of the median of the gamma distribution. Stat Probab Lett 76(14):1510–1513
Berg C, Pedersen HL (2006) The Chen–Rubin conjecture in a continuous setting. Methods Appl Anal 13(1):63–88
Berg C, Pedersen HL (2008) Convexity of the median in the gamma distribution. Ark Mat 46(1):1–6
Brown LD, Carter AV, Low MG, Zhang C-H (2004) Equivalence theory for density estimation, Poisson processes and Gaussian white noise with drift. Ann Stat 32(5):2074–2097
Carter AV (2002) Deficiency distance between multinomial and multivariate normal experiments. Dedicated to the memory of Lucien Le Cam. Ann Stat 30(3):708–730
Chen C-P (2017) The median of gamma distribution and a related Ramanujan sequence. Ramanujan J 44(1):75–88
Chen J, Rubin H (1986) Bounds for the difference between median and mean of gamma and Poisson distributions. Stat Probab Lett 4(6):281–283
Choi KP (1994) On the medians of gamma distributions and an equation of Ramanujan. Proc Am Math Soc 121(1):245–251
Coeurjolly J-F, Trépanier JR (2020) The median of a jittered Poisson distribution. Metrika 83(7):837–851
Cramér H (1928) On the composition of elementary errors. Scand Actuar J 1928(1):13–74. https://doi.org/10.1080/03461238.1928.10416862
Cramér H (1937) Random variables and probability distributions. In: Cambridge Tracts in Mathematics and Mathematical Physics, vol. 36, 1st ed. Cambridge Univ. Press
Cressie N (1978) A finely tuned continuity correction. Ann Inst Stat Math 30(3):435–442
Esseen C-G (1945) Fourier analysis of distribution functions. A mathematical study of the Laplace-Gaussian law. Acta Math 77:1–125
Göb R (1994) Bounds for median and \(50\) percentage point of binomial and negative binomial distribution. Metrika 41(1):43–54
Govindarajulu Z (1965) Normal approximations to the classical discrete distributions. Sankhyā Ser A 27:143–172
Groeneveld RA, Meeden G (1977) The mode, median, and mean inequality. Am Stat 31(3):120–121
Hamza K (1995) The smallest uniform upper bound on the distance between the mean and the median of the binomial and Poisson distributions. Stat Probab Lett 23(1):21–25
Jodrá P (2012) Computing the asymptotic expansion of the median of the Erlang distribution. Math Model Anal 17(2):281–292
Kolassa JE (1994) Series approximation methods in statistics. Lecture Notes in Statistics, vol 88. Springer, New York. http://www.ams.org/mathscinet-getitem?mr=MR1295242
Lyon RF (2021) On closed-form tight bounds and approximations for the median of a gamma distribution. PLoS ONE 16(5):e0251626. https://doi.org/10.1371/journal.pone.0251626
Mariucci E (2016) Le Cam theory on the comparison of statistical models. Grad J Math 1(2):81–91
Nussbaum M (1996) Asymptotic equivalence of density estimation and Gaussian white noise. Ann Stat 24(6):2399–2430
Ouimet F (2021) On the Le Cam distance between Poisson and Gaussian experiments and the asymptotic properties of Szasz estimators. J Math Anal Appl 499(1):125033
Ouimet F (2021) A precise local limit theorem for the multinomial distribution and some applications. J Stat Plann Inference 215:218–233
Patil GP (1960) On the evaluation of the negative binomial distribution with examples. Technometrics 2:501–505
Payton ME, Young LJ, Young JH (1989) Bounds for the difference between median and mean of beta and negative binomial distributions. Metrika 36(6):347–354
Pearson K (1933) On the applications of the double Bessel function \(K_{\tau _1,\tau _2}(x)\) to statistical problems. Biometrika 25(1–2):158–178. https://doi.org/10.1093/biomet/25.1-2.158
Pinelis I (2021) Monotonicity properties of the gamma family of distributions. Stat Probab Lett 171:109027
Teicher H (1955) An inequality on Poisson probabilities. Ann Math Stat 26:147–149
van de Ven R, Weber NC (1993) Bounds for the median of the negative binomial distribution. Metrika 40(3–4):185–189
van der Vaart AW (1998) Asymptotic statistics. Cambridge Series in Statistical and Probabilistic Mathematics, vol 3. Cambridge University Press, Cambridge. http://www.ams.org/mathscinet-getitem?mr=MR1652247
You X (2017) Approximation of the median of the gamma distribution. J Number Theory 174:487–493
Acknowledgements
We thank the referee for carefully reading the manuscript and for the helpful comments and suggestions that improved the presentation of this paper.
Funding
The author was previously supported by a postdoctoral fellowship from the NSERC (PDF) and the FRQNT (B3X supplement). The author is currently supported by a postdoctoral fellowship (CRM-Simons) from the Centre de recherches mathématiques (Université de Montréal) and the Simons Foundation.
Ethics declarations
Conflict of interest
The author declares no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A Proof of the refined continuity correction
Proof of Lemma 1
By taking the logarithm in (1), we have
Stirling’s formula yields
see, e.g., Abramowitz and Stegun (1964), p.257. Hence, we get
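For the reader's convenience, the Stirling series from Abramowitz and Stegun (1964, p.257) reads, as \(z \rightarrow \infty \),

\[
\ln \Gamma (z) = \Big (z - \frac{1}{2}\Big ) \ln z - z + \frac{1}{2} \ln (2\pi ) + \frac{1}{12 z} + \mathcal {O}(z^{-3}).
\]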
By writing
the above is
Now, note that for \(y \ge \eta - 1\), Lagrange’s error bound for Taylor expansions yields
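For instance, when \(\ln (1+y)\) is expanded to order \(n\) around \(0\): since \(1 + \xi \ge \eta \) for any \(\xi \) between \(0\) and such a \(y\), the Lagrange form of the remainder gives

\[
\Big | \ln (1+y) - \sum _{j=1}^{n} \frac{(-1)^{j-1} y^j}{j} \Big | \le \frac{|y|^{n+1}}{(n+1) \, \eta ^{n+1}}.
\]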
By applying these approximations in (28), we obtain
After some cancellations, we get
which proves (3). To obtain (4) and conclude the proof, we take the exponential on both sides of the last equation and we expand the right-hand side with
For r large enough and uniformly for \(|\delta _k| \le \eta \, r^{1/6} p^{1/2}\), the right-hand side of (29) is \(\mathcal {O}_p(1)\). When this bound is taken as y in (30), it explains the error in (4). \(\square \)
Proof of Theorem 1
Let \(c\in \mathbb {R}\). Note that (8) is a trivial consequence of (7), so we only need to prove (7). By decomposing \([\delta _{a - c},\infty )\) into small intervals, we get
for a small enough constant \(\beta = \beta (p) > 0\), where the exponential error comes from the contributions outside of the bulk. The Taylor expansion of \(\phi (x)\) around any \(x_0\in \mathbb {R}\) is
By taking \(x_0 = \delta _{k}\) in (32) and integrating on \([\delta _{k-\frac{1}{2}}, \delta _{k+\frac{1}{2}}]\), the first and third order derivatives disappear because of the symmetry. We have
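Schematically, for a symmetric window of half-width \(h\) around \(x_0\) (here \(h = \frac{1}{2}(\delta _{k+\frac{1}{2}} - \delta _{k-\frac{1}{2}})\)), the odd-order terms integrate to zero, leaving

\[
\int _{x_0 - h}^{x_0 + h} \phi (x) \, \textrm{d} x = 2 h \, \phi (x_0) + \frac{h^3}{3} \, \phi ''(x_0) + \mathcal {O}\Big (h^5 \sup _{x\in \mathbb {R}} |\phi ^{(4)}(x)|\Big ).
\]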
Similarly, by taking \(x_0 = \delta _{k}\) in (32) and integrating on \([\delta _{a - c}, \delta _{a - \frac{1}{2}}]\), we have
Using (33), (34), and the expression of \(P_{r,p}(k)\) from Lemma 1 when k is in the bulk, the right-hand side of (31) is equal to
where \({\widetilde{a}} :=a - \tfrac{1}{2}\). For \(d\in \mathbb {R}\), consider
in (35). The terms of order \((r p)^{-1/2}\) cancel out and the d that cancels the terms of order \((r p)^{-1}\) is
Now, using the fact that, for \(a\in \mathbb {R}\),
where \(\Psi \) denotes the survival function of the standard normal distribution, the c that cancels both braces in (35) is
This ends the proof. \(\square \)
Appendix B Moments of the negative binomial distribution
In the lemma below, we compute the second, third, fourth and sixth central moments. It is used to control some expectations in (26) and the \(\asymp _p r^{-1}\) errors in (22) of the proof of Theorem 3. It is also a preliminary result for the proof of Corollary 2 below, where the central moments are bounded on various events.
Lemma 2
(Central moments) Let \(K\sim \textrm{NegBin}(r, p)\) for some \(r > 0\) and \(p\in (0,1)\). We have
Proof of Lemma 2
This was calculated using Mathematica. \(\square \)
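As a sanity check (not part of the original argument), the closed forms can be verified numerically. The sketch below assumes the parametrization in which \(K\) counts failures before the r-th success, so that \(\mathbb {E}[K] = r(1-p)/p\); the truncation bound kmax is an arbitrary choice.

```python
import math

def negbin_pmf(k, r, p):
    # P(K = k) for K ~ NegBin(r, p), K counting failures before the r-th success
    return math.exp(math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
                    + r * math.log(p) + k * math.log(1 - p))

def central_moment(m, r, p, kmax=2000):
    # numerical m-th central moment by truncated summation over the lattice
    mu = r * (1 - p) / p
    return sum((k - mu) ** m * negbin_pmf(k, r, p) for k in range(kmax))

r, p = 2.5, 0.4
# standard closed forms under this parametrization:
#   variance: r(1-p)/p^2,  third central moment: r(1-p)(2-p)/p^3
print(central_moment(2, r, p), r * (1 - p) / p ** 2)            # both ~9.375
print(central_moment(3, r, p), r * (1 - p) * (2 - p) / p ** 3)  # both ~37.5
```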
Next, we bound the first and third central moments on various events. The corollary below is used to control the \(\asymp _p r^{-1/2}\) errors in (22) of the proof of Theorem 3.
Corollary 2
Let \(K\sim \textrm{NegBin}(r, p)\) for some \(r > 0\) and \(p\in (0,1)\), and let \(A\in {\mathscr {B}}(\mathbb {R})\) be a Borel set. Then,
Proof of Corollary 2
This follows from Lemma 2 and Hölder's inequality. \(\square \)
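A sketch of the argument, with one admissible choice of exponents: writing \(\mu = \mathbb {E}[K]\), the Cauchy–Schwarz case of Hölder's inequality gives

\[
\mathbb {E}\big [|K - \mu | \, \mathbb {1}_A\big ] \le \big (\mathbb {E}\big [(K - \mu )^2\big ]\big )^{1/2} \, \mathbb {P}(A)^{1/2}, \qquad \mathbb {E}\big [|K - \mu |^3 \, \mathbb {1}_A\big ] \le \big (\mathbb {E}\big [(K - \mu )^6\big ]\big )^{1/2} \, \mathbb {P}(A)^{1/2},
\]

where the moments on the right-hand sides are supplied by Lemma 2.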
Appendix C Short proof for the asymptotics of the median of a jittered Poisson random variable
In this section, we present a short proof for the asymptotics of the median of a Poisson random variable jittered by a uniform (Theorem 5), using the same technique introduced in Section 3.1. Our statement is slightly weaker than Theorem 1 in Coeurjolly and Trépanier (2020), but the proof is conceptually simpler.
Theorem 5
Let \(N_{\lambda }\sim \textrm{Poisson}(\lambda )\) and \(U\sim \text {Uniform}(0,1)\). Then, as \(\lambda \rightarrow \infty \), we have
Proof
By conditioning on U and using the local limit theorem from Lemma 2.1 in Ouimet (2021), we want to find \(t = \textrm{Median}(N_{\lambda } + U) > 0\) such that
where \(\{t\}\) denotes the fractional part of t, and
Now, we have the following Taylor series expansion for \(\Phi \) at 0:
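For reference, the expansion in question is, with \(\phi (0) = (2\pi )^{-1/2}\),

\[
\Phi (x) = \frac{1}{2} + \phi (0) \left( x - \frac{x^3}{6} + \frac{x^5}{40} - \cdots \right), \quad x\in \mathbb {R},
\]

obtained by integrating the series \(e^{-t^2/2} = \sum _{j\ge 0} (-t^2/2)^j / j!\) term by term.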
Therefore, (36) becomes
After rearranging some terms, this is equivalent to
By applying the expression for \(c_{\lambda }^{\star }\) in (37), this yields \(t - \lambda = \frac{1}{2} - \frac{1}{6} + \mathcal {O}(\lambda ^{-1}) = \frac{1}{3} + \mathcal {O}(\lambda ^{-1})\). \(\square \)
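The asymptotics can be checked numerically. The sketch below (an illustration, not from the paper) computes the exact median of \(N_{\lambda } + U\) by solving \(\mathbb {P}(N_{\lambda } \le \lfloor t\rfloor - 1) + \{t\} \, \mathbb {P}(N_{\lambda } = \lfloor t\rfloor ) = \frac{1}{2}\) for t:

```python
import math

def poisson_pmf(k, lam):
    # log-space evaluation to avoid overflow for large lam
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def jittered_median(lam):
    """Median t of N + U with N ~ Poisson(lam), U ~ Uniform(0, 1):
    accumulate the CDF until the level 1/2 falls inside the current cell,
    then interpolate linearly (the jitter makes the CDF piecewise linear)."""
    cdf, k = 0.0, 0
    while True:
        pk = poisson_pmf(k, lam)
        if cdf + pk >= 0.5:
            return k + (0.5 - cdf) / pk
        cdf += pk
        k += 1

lam = 400.0
t = jittered_median(lam)
# Theorem 5 predicts t - lam = 1/3 + O(1/lam)
print(t - lam)
```

For \(\lambda = 400\), the printed value is close to \(1/3\), consistent with Theorem 5.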
Let \(N_1,N_2,\dots ,N_n\sim \textrm{Poisson}(\lambda )\) and \(U_1,U_2,\dots ,U_n\sim \textrm{Uniform}(0,1)\) be i.i.d., and define \(Z_i :=N_i + U_i\) for all \(i\in \{1,2,\dots ,n\}\). Then, we have the convergence in probability of the sample median:
see, e.g., van der Vaart (1998), p.47. We deduce the following corollary.
Corollary 3
With the above notation,
as \(n\rightarrow \infty \) and then \(\lambda \rightarrow \infty \).
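A quick simulation of the consistency statement (illustrative only; the sample size and \(\lambda \) below are arbitrary choices):

```python
import bisect
import math
import random

def poisson_cdf_table(lam, tol=1e-12):
    # cumulative probabilities P(N <= k) until the tail is negligible
    cdf, k, table = 0.0, 0, []
    while cdf < 1.0 - tol:
        cdf += math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
        table.append(cdf)
        k += 1
    return table

random.seed(0)
lam, n = 100.0, 50_000
table = poisson_cdf_table(lam)
# Z_i = N_i + U_i: inverse-CDF sampling of N_i plus an independent jitter U_i
z = sorted(bisect.bisect_left(table, random.random()) + random.random()
           for _ in range(n))
sample_median = 0.5 * (z[n // 2 - 1] + z[n // 2])
# Corollary 3: the sample median concentrates near lam + 1/3
print(sample_median - lam)
```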
Cite this article
Ouimet, F. A refined continuity correction for the negative binomial distribution and asymptotics of the median. Metrika 86, 827–849 (2023). https://doi.org/10.1007/s00184-023-00897-2
Keywords
- Local limit theorem
- Continuity correction
- Quantile coupling
- Negative binomial distribution
- Gaussian approximation
- Median
- Comparison of experiments
- Le Cam distance
- Total variation