Abstract
In this paper, we prove a local limit theorem and a refined continuity correction for the negative binomial distribution. We present two applications of the results. First, we find the asymptotics of the median for a \(\textrm{NegativeBinomial}(r,p)\) random variable jittered by a \(\textrm{Uniform}(0,1)\), which answers a problem left open in Coeurjolly and Trépanier (Metrika 83(7):837–851, 2020). This is used to construct a simple, robust and consistent estimator of the parameter p, when \(r > 0\) is known. The case where r is unknown is also briefly covered. Second, we find an upper bound on the Le Cam distance between negative binomial and normal experiments.
Notes
Note that \(\{X\in B_{r,p}(1/2)\} \triangle \{K\in B_{r,p}(1/2)\} \subseteq \{K\in B_{r,p}(3/4)\}\) assuming that r is large enough, simply because \(|X - K| \le \frac{1}{2}\).
References
Abramowitz M, Stegun IA (1964) Handbook of mathematical functions with formulas, graphs, and mathematical tables. National Bureau of Standards Applied Mathematics Series, vol 55. U.S. Government Printing Office, Washington, DC. http://www.ams.org/mathscinet-getitem?mr=MR0167642
Adell JA, Alzer H (2009) Inequalities for the median of the gamma distribution. J Comput Appl Math 232(2):481–495
Adell JA, Jodrá P (2005) The median of the Poisson distribution. Metrika 61(3):337–346
Adell JA, Jodrá P (2005) Sharp estimates for the median of the \(\Gamma (n+1,1)\) distribution. Stat Probab Lett 71(2):185–191
Adell JA, Jodrá P (2008) On a Ramanujan equation connected with the median of the gamma distribution. Trans Am Math Soc 360(7):3631–3644
Alm SE (2003) Monotonicity of the difference between median and mean of gamma distributions and of a related Ramanujan sequence. Bernoulli 9(2):351–371
Alzer H (2005) Proof of the Chen–Rubin conjecture. Proc R Soc Edinb Sect A 135(4):677–688
Alzer H (2006) A convexity property of the median of the gamma distribution. Stat Probab Lett 76(14):1510–1513
Berg C, Pedersen HL (2006) The Chen–Rubin conjecture in a continuous setting. Methods Appl Anal 13(1):63–88
Berg C, Pedersen HL (2008) Convexity of the median in the gamma distribution. Ark Mat 46(1):1–6
Brown LD, Carter AV, Low MG, Zhang C-H (2004) Equivalence theory for density estimation, Poisson processes and Gaussian white noise with drift. Ann Stat 32(5):2074–2097
Carter AV (2002) Deficiency distance between multinomial and multivariate normal experiments. Dedicated to the memory of Lucien Le Cam. Ann Stat 30(3):708–730
Chen C-P (2017) The median of gamma distribution and a related Ramanujan sequence. Ramanujan J 44(1):75–88
Chen J, Rubin H (1986) Bounds for the difference between median and mean of gamma and Poisson distributions. Stat Probab Lett 4(6):281–283
Choi KP (1994) On the medians of gamma distributions and an equation of Ramanujan. Proc Am Math Soc 121(1):245–251
Coeurjolly J-F, Trépanier JR (2020) The median of a jittered Poisson distribution. Metrika 83(7):837–851
Cramér H (1928) On the composition of elementary errors. Scand Actuar J 1928(1):13–74. https://doi.org/10.1080/03461238.1928.10416862
Cramér H (1937) Random variables and probability distributions. In: Cambridge Tracts in Mathematics and Mathematical Physics, vol. 36, 1st ed. Cambridge Univ. Press
Cressie N (1978) A finely tuned continuity correction. Ann Inst Stat Math 30(3):435–442
Esseen C-G (1945) Fourier analysis of distribution functions. A mathematical study of the Laplace-Gaussian law. Acta Math 77:1–125
Göb R (1994) Bounds for median and \(50\) percentage point of binomial and negative binomial distribution. Metrika 41(1):43–54
Govindarajulu Z (1965) Normal approximations to the classical discrete distributions. Sankhyā Ser A 27:143–172
Groeneveld RA, Meeden G (1977) The mode, median, and mean inequality. Am Stat 31(3):120–121
Hamza K (1995) The smallest uniform upper bound on the distance between the mean and the median of the binomial and Poisson distributions. Stat Probab Lett 23(1):21–25
Jodrá P (2012) Computing the asymptotic expansion of the median of the Erlang distribution. Math Model Anal 17(2):281–292
Kolassa JE (1994) Series approximation methods in statistics. Lecture Notes in Statistics, vol 88. Springer, New York. http://www.ams.org/mathscinet-getitem?mr=MR1295242
Lyon RF (2021) On closed-form tight bounds and approximations for the median of a gamma distribution. PLoS ONE 16(5):e0251626. https://doi.org/10.1371/journal.pone.0251626
Mariucci E (2016) Le Cam theory on the comparison of statistical models. Grad J Math 1(2):81–91
Nussbaum M (1996) Asymptotic equivalence of density estimation and Gaussian white noise. Ann Stat 24(6):2399–2430
Ouimet F (2021) On the Le Cam distance between Poisson and Gaussian experiments and the asymptotic properties of Szasz estimators. J Math Anal Appl 499(1):125033
Ouimet F (2021) A precise local limit theorem for the multinomial distribution and some applications. J Stat Plann Inference 215:218–233
Patil GP (1960) On the evaluation of the negative binomial distribution with examples. Technometrics 2:501–505
Payton ME, Young LJ, Young JH (1989) Bounds for the difference between median and mean of beta and negative binomial distributions. Metrika 36(6):347–354
Pearson K (1933) On the applications of the double Bessel function \(K_{\tau _1,\tau _2}(x)\) to statistical problems. Biometrika 25(1–2):158–178. https://doi.org/10.1093/biomet/25.1-2.158
Pinelis I (2021) Monotonicity properties of the gamma family of distributions. Stat Probab Lett 171:109027
Teicher H (1955) An inequality on Poisson probabilities. Ann Math Stat 26:147–149
van de Ven R, Weber NC (1993) Bounds for the median of the negative binomial distribution. Metrika 40(3–4):185–189
van der Vaart AW (1998) Asymptotic statistics. Cambridge Series in Statistical and Probabilistic Mathematics, vol 3. Cambridge University Press, Cambridge. http://www.ams.org/mathscinet-getitem?mr=MR1652247
You X (2017) Approximation of the median of the gamma distribution. J Number Theory 174:487–493
Acknowledgements
We thank the referee for carefully reading the manuscript and for the helpful comments and suggestions that improved the presentation of this paper.
Funding
The author was previously supported by a postdoctoral fellowship from the NSERC (PDF) and the FRQNT (B3X supplement). The author is currently supported by a postdoctoral fellowship (CRM-Simons) from the Centre de recherches mathématiques (Université de Montréal) and the Simons Foundation.
Ethics declarations
Conflict of interest
The author declares no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A Proof of the refined continuity correction
Proof of Lemma 1
By taking the logarithm in (1), we have
Stirling’s formula yields
see, e.g., Abramowitz and Stegun (1964), p.257. Hence, we get
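For the reader's convenience, the Stirling series from Abramowitz and Stegun (1964, p.257) reads, as \(z \rightarrow \infty \),

\[
\ln \Gamma (z) = \Big (z - \frac{1}{2}\Big ) \ln z - z + \frac{1}{2} \ln (2\pi ) + \frac{1}{12 z} + \mathcal {O}(z^{-3}).
\]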
By writing
the above is
Now, note that for \(y \ge \eta - 1\), Lagrange’s error bound for Taylor expansions yields
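For instance, when \(\ln (1+y)\) is expanded to order \(n\) around \(0\): since \(1 + \xi \ge \eta \) for any \(\xi \) between \(0\) and such a \(y\), the Lagrange form of the remainder gives

\[
\Big | \ln (1+y) - \sum _{j=1}^{n} \frac{(-1)^{j-1} y^j}{j} \Big | \le \frac{|y|^{n+1}}{(n+1) \, \eta ^{n+1}}.
\]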
By applying these approximations in (28), we obtain
After some cancellations, we get
which proves (3). To obtain (4) and conclude the proof, we take the exponential on both sides of the last equation and we expand the right-hand side with
For r large enough and uniformly for \(|\delta _k| \le \eta \, r^{1/6} p^{1/2}\), the right-hand side of (29) is \(\mathcal {O}_p(1)\). When this bound is taken as y in (30), it explains the error in (4). \(\square \)
Proof of Theorem 1
Let \(c\in \mathbb {R}\). Note that (8) is a trivial consequence of (7), so we only need to prove (7). By decomposing \([\delta _{a - c},\infty )\) into small intervals, we get
for a small enough constant \(\beta = \beta (p) > 0\), where the exponential error comes from the contributions outside of the bulk. The Taylor expansion of \(\phi (x)\) around any \(x_0\in \mathbb {R}\) is
By taking \(x_0 = \delta _{k}\) in (32) and integrating on \([\delta _{k-\frac{1}{2}}, \delta _{k+\frac{1}{2}}]\), the first and third order derivatives disappear because of the symmetry. We have
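Schematically, for a symmetric window of half-width \(h\) around \(x_0\) (here \(h = \frac{1}{2}(\delta _{k+\frac{1}{2}} - \delta _{k-\frac{1}{2}})\)), the odd-order terms integrate to zero, leaving

\[
\int _{x_0 - h}^{x_0 + h} \phi (x) \, \textrm{d} x = 2 h \, \phi (x_0) + \frac{h^3}{3} \, \phi ''(x_0) + \mathcal {O}\Big (h^5 \sup _{x\in \mathbb {R}} |\phi ^{(4)}(x)|\Big ).
\]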
Similarly, by taking \(x_0 = \delta _{k}\) in (32) and integrating on \([\delta _{a - c}, \delta _{a - \frac{1}{2}}]\), we have
Using (33), (34), and the expression of \(P_{r,p}(k)\) from Lemma 1 when k is in the bulk, the right-hand side of (31) is equal to
where \({\widetilde{a}} :=a - \tfrac{1}{2}\). For \(d\in \mathbb {R}\), consider
in (35). The terms of order \((r p)^{-1/2}\) cancel out and the d that cancels the terms of order \((r p)^{-1}\) is
Now, using the fact that, for \(a\in \mathbb {R}\),
where \(\Psi \) denotes the survival function of the standard normal distribution, the c that cancels both braces in (35) is
This ends the proof. \(\square \)
Appendix B Moments of the negative binomial distribution
In the lemma below, we compute the second, third, fourth and sixth central moments. It is used to control some expectations in (26) and the \(\asymp _p r^{-1}\) errors in (22) of the proof of Theorem 3. It is also a preliminary result for the proof of Corollary 2 below, where the central moments are bounded on various events.
Lemma 2
(Central moments) Let \(K\sim \textrm{NegBin}(r, p)\) for some \(r > 0\) and \(p\in (0,1)\). We have
Proof of Lemma 2
This was calculated using Mathematica. \(\square \)
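As a sanity check (not part of the original argument), the closed forms can be verified numerically. The sketch below assumes the parametrization in which \(K\) counts failures before the r-th success, so that \(\mathbb {E}[K] = r(1-p)/p\); the truncation bound kmax is an arbitrary choice.

```python
import math

def negbin_pmf(k, r, p):
    # P(K = k) for K ~ NegBin(r, p), K counting failures before the r-th success
    return math.exp(math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
                    + r * math.log(p) + k * math.log(1 - p))

def central_moment(m, r, p, kmax=2000):
    # numerical m-th central moment by truncated summation over the lattice
    mu = r * (1 - p) / p
    return sum((k - mu) ** m * negbin_pmf(k, r, p) for k in range(kmax))

r, p = 2.5, 0.4
# standard closed forms under this parametrization:
#   variance: r(1-p)/p^2,  third central moment: r(1-p)(2-p)/p^3
print(central_moment(2, r, p), r * (1 - p) / p ** 2)            # both ~9.375
print(central_moment(3, r, p), r * (1 - p) * (2 - p) / p ** 3)  # both ~37.5
```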
Next, we bound the first and third central moments on various events. The corollary below is used to control the \(\asymp _p r^{-1/2}\) errors in (22) of the proof of Theorem 3.
Corollary 2
Let \(K\sim \textrm{NegBin}(r, p)\) for some \(r > 0\) and \(p\in (0,1)\), and let \(A\in {\mathscr {B}}(\mathbb {R})\) be a Borel set. Then,
Proof of Corollary 2
This follows from Lemma 2 and Hölder's inequality. \(\square \)
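A sketch of the argument, with one admissible choice of exponents: writing \(\mu = \mathbb {E}[K]\), the Cauchy–Schwarz case of Hölder's inequality gives

\[
\mathbb {E}\big [|K - \mu | \, \mathbb {1}_A\big ] \le \big (\mathbb {E}\big [(K - \mu )^2\big ]\big )^{1/2} \, \mathbb {P}(A)^{1/2}, \qquad \mathbb {E}\big [|K - \mu |^3 \, \mathbb {1}_A\big ] \le \big (\mathbb {E}\big [(K - \mu )^6\big ]\big )^{1/2} \, \mathbb {P}(A)^{1/2},
\]

where the moments on the right-hand sides are supplied by Lemma 2.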
Appendix C Short proof for the asymptotics of the median of a jittered Poisson random variable
In this section, we present a short proof for the asymptotics of the median of a Poisson random variable jittered by a uniform (Theorem 5), using the same technique introduced in Section 3.1. Our statement is slightly weaker than Theorem 1 in Coeurjolly and Trépanier (2020), but the proof is conceptually simpler.
Theorem 5
Let \(N_{\lambda }\sim \textrm{Poisson}(\lambda )\) and \(U\sim \text {Uniform}(0,1)\). Then, as \(\lambda \rightarrow \infty \), we have
Proof
By conditioning on U and using the local limit theorem from Lemma 2.1 in Ouimet (2021), we want to find \(t = \textrm{Median}(N_{\lambda } + U) > 0\) such that
where \(\{t\}\) denotes the fractional part of t, and
Now, we have the following Taylor series expansion for \(\Phi \) at 0:
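For reference, the expansion in question is, with \(\phi (0) = (2\pi )^{-1/2}\),

\[
\Phi (x) = \frac{1}{2} + \phi (0) \left( x - \frac{x^3}{6} + \frac{x^5}{40} - \cdots \right), \quad x\in \mathbb {R},
\]

obtained by integrating the series \(e^{-t^2/2} = \sum _{j\ge 0} (-t^2/2)^j / j!\) term by term.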
Therefore, (36) becomes
After rearranging some terms, this is equivalent to
By applying the expression for \(c_{\lambda }^{\star }\) in (37), this yields \(t - \lambda = \frac{1}{2} - \frac{1}{6} + \mathcal {O}(\lambda ^{-1}) = \frac{1}{3} + \mathcal {O}(\lambda ^{-1})\). \(\square \)
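The asymptotics can be checked numerically. The sketch below (an illustration, not from the paper) computes the exact median of \(N_{\lambda } + U\) by solving \(\mathbb {P}(N_{\lambda } \le \lfloor t\rfloor - 1) + \{t\} \, \mathbb {P}(N_{\lambda } = \lfloor t\rfloor ) = \frac{1}{2}\) for t:

```python
import math

def poisson_pmf(k, lam):
    # log-space evaluation to avoid overflow for large lam
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def jittered_median(lam):
    """Median t of N + U with N ~ Poisson(lam), U ~ Uniform(0, 1):
    accumulate the CDF until the level 1/2 falls inside the current cell,
    then interpolate linearly (the jitter makes the CDF piecewise linear)."""
    cdf, k = 0.0, 0
    while True:
        pk = poisson_pmf(k, lam)
        if cdf + pk >= 0.5:
            return k + (0.5 - cdf) / pk
        cdf += pk
        k += 1

lam = 400.0
t = jittered_median(lam)
# Theorem 5 predicts t - lam = 1/3 + O(1/lam)
print(t - lam)
```

For \(\lambda = 400\), the printed value is close to \(1/3\), consistent with Theorem 5.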
Let \(N_1,N_2,\dots ,N_n\sim \textrm{Poisson}(\lambda )\) and \(U_1,U_2,\dots ,U_n\sim \textrm{Uniform}(0,1)\) be i.i.d., and define \(Z_i :=N_i + U_i\) for all \(i\in \{1,2,\dots ,n\}\). Then, we have the convergence in probability of the sample median:
see, e.g., van der Vaart (1998), p.47. We deduce the following corollary.
Corollary 3
With the above notation,
as \(n\rightarrow \infty \) and then \(\lambda \rightarrow \infty \).
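A quick simulation of the consistency statement (illustrative only; the sample size and \(\lambda \) below are arbitrary choices):

```python
import bisect
import math
import random

def poisson_cdf_table(lam, tol=1e-12):
    # cumulative probabilities P(N <= k) until the tail is negligible
    cdf, k, table = 0.0, 0, []
    while cdf < 1.0 - tol:
        cdf += math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
        table.append(cdf)
        k += 1
    return table

random.seed(0)
lam, n = 100.0, 50_000
table = poisson_cdf_table(lam)
# Z_i = N_i + U_i: inverse-CDF sampling of N_i plus an independent jitter U_i
z = sorted(bisect.bisect_left(table, random.random()) + random.random()
           for _ in range(n))
sample_median = 0.5 * (z[n // 2 - 1] + z[n // 2])
# Corollary 3: the sample median concentrates near lam + 1/3
print(sample_median - lam)
```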
Cite this article
Ouimet, F. A refined continuity correction for the negative binomial distribution and asymptotics of the median. Metrika 86, 827–849 (2023). https://doi.org/10.1007/s00184-023-00897-2
Keywords
- Local limit theorem
- Continuity correction
- Quantile coupling
- Negative binomial distribution
- Gaussian approximation
- Median
- Comparison of experiments
- Le Cam distance
- Total variation