
A refined continuity correction for the negative binomial distribution and asymptotics of the median

Metrika

Abstract

In this paper, we prove a local limit theorem and a refined continuity correction for the negative binomial distribution. We present two applications of the results. First, we find the asymptotics of the median for a \(\textrm{NegativeBinomial}(r,p)\) random variable jittered by a \(\textrm{Uniform}(0,1)\), which answers a problem left open in Coeurjolly and Trépanier (Metrika 83(7):837–851, 2020). This is used to construct a simple, robust and consistent estimator of the parameter p, when \(r > 0\) is known. The case where r is unknown is also briefly covered. Second, we find an upper bound on the Le Cam distance between negative binomial and normal experiments.


Notes

  1. Note that \(\{X\in B_{r,p}(1/2)\} \triangle \{K\in B_{r,p}(1/2)\} \subseteq \{K\in B_{r,p}(3/4)\}\) assuming that r is large enough, simply because \(|X - K| \le \frac{1}{2}\).


Acknowledgements

We thank the referee for carefully reading the manuscript and for his/her helpful comments and suggestions which led to improvements in the writing of this paper.

Funding

The author was previously supported by a postdoctoral fellowship from the NSERC (PDF) and the FRQNT (B3X supplement). The author is currently supported by a postdoctoral fellowship (CRM-Simons) from the Centre de recherches mathématiques (Université de Montréal) and the Simons Foundation.

Author information

Corresponding author

Correspondence to Frédéric Ouimet.

Ethics declarations

Conflict of interest

The author declares no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A Proof of the refined continuity correction

Proof of Lemma 1

By taking the logarithm in (1), we have

$$\begin{aligned} \begin{aligned} \log \big (P_{r,p}(k)\big )&= \log \Gamma (r + k) - \log \Gamma (r) - \log k! + r \log q + k \log p. \end{aligned} \end{aligned}$$

Stirling’s formula yields

$$\begin{aligned} \begin{aligned} \log \Gamma (x)&= \frac{1}{2} \log (2\pi ) + (x - \tfrac{1}{2}) \log x - x + \frac{1}{12 x} + \mathcal {O}(x^{-3}), \quad x > 0, \\ \log k!&= \frac{1}{2} \log (2\pi ) + (k + \tfrac{1}{2}) \log k - k + \frac{1}{12 k} + \mathcal {O}(k^{-3}), \quad k\in \mathbb {N}, \end{aligned} \end{aligned}$$

see, e.g., Abramowitz and Stegun (1964), p.257. Hence, we get

$$\begin{aligned} \log \big (P_{r,p}(k)\big )&= - \frac{1}{2} \log (2\pi ) + (r + k) \log (r + k) - r \log (r) - k \log k \\&\quad - \frac{1}{2} \log (r + k) + \frac{1}{2} \log (r) - \frac{1}{2} \log k + r \log q + k \log p \\&\quad + \frac{1}{12} \big [(r + k)^{-1} - r^{-1} - k^{-1}\big ] + \mathcal {O}\big ((r + k)^{-3} + r^{-3} + k^{-3}\big ). \end{aligned}$$
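As a quick numerical sanity check on the Stirling expansion used above, the following minimal sketch compares the truncated formula for \(\log \Gamma (x)\) with Python's built-in `math.lgamma`. The helper name `stirling_log_gamma` is hypothetical and not part of the paper; the only claim illustrated is that the truncation error decays like \(x^{-3}\).

```python
import math


def stirling_log_gamma(x: float) -> float:
    """Stirling expansion of log Gamma(x), truncated after the 1/(12x) term.

    Hypothetical helper for illustration only.
    """
    return 0.5 * math.log(2.0 * math.pi) + (x - 0.5) * math.log(x) - x + 1.0 / (12.0 * x)


for x in (5.0, 20.0, 80.0):
    err = abs(math.lgamma(x) - stirling_log_gamma(x))
    # If the remainder is O(x^{-3}), then x^3 * err should stay bounded.
    print(f"x = {x:5.1f}   |error| = {err:.3e}   x^3 * |error| = {x**3 * err:.5f}")
```

The rescaled error in the last column stays bounded as x grows, in line with the \(\mathcal {O}(x^{-3})\) remainder.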

By writing

$$\begin{aligned} r + k = \frac{r}{q} \Big (1 + \frac{\delta _k}{\sqrt{r p^{-1}}}\Big ) \quad \text {and} \quad k = \frac{r p}{q} \Big (1 + \frac{\delta _k}{\sqrt{r p}}\Big ), \end{aligned}$$

the above is

$$\begin{aligned} \log \big (P_{r,p}(k)\big )&= - \frac{1}{2} \log (2 \pi ) - \frac{1}{2} \log \Big (\frac{r p}{q^2}\Big ) \nonumber \\&\quad + \frac{r}{q} \bigg (1 + \frac{\delta _k}{\sqrt{r p^{-1}}}\bigg ) \log \bigg (1 + \frac{\delta _k}{\sqrt{r p^{-1}}}\bigg ) \nonumber \\&\quad - \frac{r p}{q} \bigg (1 + \frac{\delta _k}{\sqrt{r p}}\bigg ) \log \bigg (1 + \frac{\delta _k}{\sqrt{r p}}\bigg ) \nonumber \\&\quad - \frac{1}{2} \log \bigg (1 + \frac{\delta _k}{\sqrt{r p^{-1}}}\bigg ) - \frac{1}{2} \log \bigg (1 + \frac{\delta _k}{\sqrt{r p}}\bigg ) \nonumber \\&\quad + \frac{q}{12 r} \bigg [\bigg (1 + \frac{\delta _k}{\sqrt{r p^{-1}}}\bigg )^{-1} - \frac{1}{q} - \frac{1}{p} \bigg (1 + \frac{\delta _k}{\sqrt{r p}}\bigg )^{-1}\bigg ] \nonumber \\&\quad + \mathcal {O}\bigg (\frac{q^3}{r^3} \bigg [\bigg (1 + \frac{\delta _k}{\sqrt{r p^{-1}}}\bigg )^{-3} + \frac{1}{q^3} + \frac{1}{p^3} \bigg (1 + \frac{\delta _k}{\sqrt{r p}}\bigg )^{-3}\bigg ]\bigg ). \end{aligned}$$
(28)

Now, note that for \(y \ge \eta - 1\), Lagrange’s error bound for Taylor expansions yields

$$\begin{aligned} \begin{aligned} (1 + y) \log (1 + y)&= y + \frac{y^2}{2} - \frac{y^3}{6} + \frac{y^4}{12} + \mathcal {O}\bigg (\frac{y^5}{\eta ^4}\bigg ), \\ \log (1 + y)&= y - \frac{y^2}{2} + \mathcal {O}\bigg (\frac{y^3}{\eta ^3}\bigg ), \\ (1 + y)^{-1}&= 1 + \mathcal {O}\bigg (\frac{y}{\eta ^2}\bigg ). \end{aligned} \end{aligned}$$

By applying these approximations in (28), we obtain

$$\begin{aligned}&\log \big (P_{r,p}(k)\big ) \\&= - \frac{1}{2} \log \Big (2 \pi \frac{r p}{q^2}\Big ) \\&+ \frac{r}{q} \left\{ \frac{\delta _k}{\sqrt{r p^{-1}}} + \frac{1}{2} \Big (\frac{\delta _k}{\sqrt{r p^{-1}}}\Big )^2 - \frac{1}{6} \Big (\frac{\delta _k}{\sqrt{r p^{-1}}}\Big )^3 + \frac{1}{12} \Big (\frac{\delta _k}{\sqrt{r p^{-1}}}\Big )^4 + \mathcal {O}\Big (\frac{1}{\eta ^4} \Big (\frac{\delta _k}{\sqrt{r p^{-1}}}\Big )^5\Big )\right\} \\&- \frac{r p}{q} \left\{ \frac{\delta _k}{\sqrt{r p}} + \frac{1}{2} \Big (\frac{\delta _k}{\sqrt{r p}}\Big )^2 - \frac{1}{6} \Big (\frac{\delta _k}{\sqrt{r p}}\Big )^3 + \frac{1}{12} \Big (\frac{\delta _k}{\sqrt{r p}}\Big )^4 + \mathcal {O}\Big (\frac{1}{\eta ^4} \Big (\frac{\delta _k}{\sqrt{r p}}\Big )^5\Big )\right\} \\&- \frac{1}{2} \left\{ \frac{\delta _k}{\sqrt{r p^{-1}}} - \frac{1}{2} \bigg (\frac{\delta _k}{\sqrt{r p^{-1}}}\bigg )^2 + \mathcal {O}\bigg (\frac{1}{\eta ^3} \bigg (\frac{\delta _k}{\sqrt{r p^{-1}}}\bigg )^3\bigg )\right\} \\&- \frac{1}{2} \left\{ \frac{\delta _k}{\sqrt{r p}} - \frac{1}{2} \bigg (\frac{\delta _k}{\sqrt{r p}}\bigg )^2 + \mathcal {O}\bigg (\frac{1}{\eta ^3} \bigg (\frac{\delta _k}{\sqrt{r p}}\bigg )^3\bigg )\right\} \\&+ \frac{q}{12 r} \left\{ 1 - q^{-1} - p^{-1}\right\} + \mathcal {O}\bigg (\frac{\delta _k}{r^{3/2} \eta ^2}\bigg ) + \mathcal {O}_p\bigg (\frac{1}{r^3 \eta ^3}\bigg ). \end{aligned}$$

After some cancellations, we get

$$\begin{aligned} \log \bigg (\frac{P_{r,p}(k)}{\frac{q}{\sqrt{r p}} \phi (\delta _k)}\bigg )&= \left\{ - \frac{p^2}{6 q \sqrt{p}} \frac{\delta _k^3}{r^{1/2}} + \frac{p^2}{12 q} \frac{\delta _k^4}{r} + \mathcal {O}_p\bigg (\frac{\delta _k^5}{r^{3/2} \eta ^4}\bigg )\right\} \nonumber \\&\quad + \left\{ \frac{1}{6 q \sqrt{p}} \frac{\delta _k^3}{r^{1/2}} - \frac{1}{12 p q} \frac{\delta _k^4}{r} + \mathcal {O}_p\bigg (\frac{\delta _k^5}{r^{3/2} \eta ^4}\bigg )\right\} \nonumber \\&\quad - \frac{1}{2} \left\{ \frac{\delta _k}{\sqrt{r p^{-1}}} - \frac{1}{2} \frac{\delta _k^2}{r p^{-1}} + \mathcal {O}_p\bigg (\frac{\delta _k^3}{r^{3/2} \eta ^3}\bigg )\right\} \nonumber \\&\quad - \frac{1}{2} \left\{ \frac{\delta _k}{\sqrt{r p}} - \frac{1}{2} \frac{\delta _k^2}{r p} + \mathcal {O}_p\bigg (\frac{\delta _k^3}{r^{3/2} \eta ^3}\bigg )\right\} \nonumber \\&\quad - \frac{p^2 + q}{12 r p} + \mathcal {O}_p\bigg (\frac{1 + |\delta _k|}{r^{3/2} \eta ^3}\bigg ) \nonumber \\&= (r p)^{-1/2} \left\{ \frac{1 + p}{6} \delta _k^3 - \frac{1 + p}{2} \delta _k\right\} \nonumber \\&\quad + (r p)^{-1} \left\{ - \frac{1 + p + p^2}{12} \delta _k^4 + \frac{p^2 + 1}{4} \delta _k^2 - \frac{p^2 + q}{12}\right\} \nonumber \\&\quad + \mathcal {O}_p\bigg (\frac{1 + |\delta _k|^5}{r^{3/2} \eta ^4}\bigg ), \end{aligned}$$
(29)

which proves (3). To obtain (4) and conclude the proof, we take the exponential on both sides of the last equation and we expand the right-hand side with

$$\begin{aligned} e^y = 1 + y + \frac{y^2}{2} + \mathcal {O}(e^{{\widetilde{\eta }}} y^3), \quad \text {for } -\infty < y \le {\widetilde{\eta }}. \end{aligned}$$
(30)

For r large enough and uniformly for \(|\delta _k| \le \eta \, r^{1/6} p^{1/2}\), the right-hand side of (29) is \(\mathcal {O}_p(1)\). Taking the right-hand side of (29) as y in (30) then yields the error term in (4). \(\square \)
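The expansion (29) can be illustrated numerically. The sketch below assumes the parametrization read off from the first display of the proof, namely \(P_{r,p}(k) = \Gamma (r+k) / (\Gamma (r) k!) \cdot q^r p^k\) with \(q = 1 - p\) and \(\delta _k = (k - r p q^{-1}) / (\sqrt{r p} \, q^{-1})\). All function names are hypothetical. It compares the exact log-probability with the plain normal term and with the normal term plus the two correction braces in (29).

```python
import math


def log_pmf(k: int, r: float, p: float) -> float:
    """Exact log P_{r,p}(k), computed through log-gamma functions."""
    q = 1.0 - p
    return (math.lgamma(r + k) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(q) + k * math.log(p))


def delta(x: float, r: float, p: float) -> float:
    """Standardized point: (x - mean) / (standard deviation)."""
    q = 1.0 - p
    return (x - r * p / q) / (math.sqrt(r * p) / q)


def log_normal_term(k: int, r: float, p: float) -> float:
    """log of (q / sqrt(r p)) * phi(delta_k)."""
    q = 1.0 - p
    d = delta(k, r, p)
    return math.log(q / math.sqrt(r * p)) - 0.5 * math.log(2.0 * math.pi) - 0.5 * d * d


def log_correction(k: int, r: float, p: float) -> float:
    """The (r p)^{-1/2} and (r p)^{-1} braces on the right-hand side of (29)."""
    q = 1.0 - p
    d = delta(k, r, p)
    first = ((1 + p) / 6 * d**3 - (1 + p) / 2 * d) / math.sqrt(r * p)
    second = (-(1 + p + p * p) / 12 * d**4 + (p * p + 1) / 4 * d**2 - (p * p + q) / 12) / (r * p)
    return first + second


r, p, k = 500.0, 0.4, 350  # illustrative values, chosen inside the bulk
exact = log_pmf(k, r, p)
plain_err = abs(exact - log_normal_term(k, r, p))
refined_err = abs(exact - log_normal_term(k, r, p) - log_correction(k, r, p))
print(f"log-scale error, normal term only:      {plain_err:.3e}")
print(f"log-scale error, with (29) corrections: {refined_err:.3e}")
```

For points in the bulk, the second error should be markedly smaller than the first, consistent with the \(r^{-3/2}\) remainder in (29).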

Proof of Theorem 1

Let \(c\in \mathbb {R}\). Note that (8) is a trivial consequence of (7), so we only need to prove (7). By decomposing \([\delta _{a - c},\infty )\) into small intervals, we get

$$\begin{aligned} \begin{aligned}&\sum _{k=a}^{\infty } P_{r,p}(k) - \int _{\delta _{a - c}}^{\infty } \phi (y) \textrm{d}y ~= \sum _{\begin{array}{c} k=a \\ k\in B_{r,p}(1/2) \end{array}}^{\infty } \Big [P_{r,p}(k) - \int _{\delta _{k-\frac{1}{2}}}^{\delta _{k+\frac{1}{2}}} \phi (y) \textrm{d}y\Big ] \\&~\qquad \qquad - \int _{\delta _{a-c}}^{\delta _{a-\frac{1}{2}}} \phi (y) \textrm{d}y + \mathcal {O}(e^{-\beta r}), \end{aligned} \end{aligned}$$
(31)

for a small enough constant \(\beta = \beta (p) > 0\), where the exponential error comes from the contributions outside of the bulk. The Taylor expansion of \(\phi (x)\) around any \(x_0\in \mathbb {R}\) is

$$\begin{aligned} \begin{aligned} \phi (x)&= \phi (x_0) + \phi '(x_0) (x - x_0) + \tfrac{1}{2} \phi ''(x_0) (x - x_0)^2 + \tfrac{1}{6} \phi '''(x_0) (x - x_0)^3 \\&\quad + \mathcal {O}(|x - x_0|^4). \end{aligned} \end{aligned}$$
(32)

By taking \(x_0 = \delta _{k}\) in (32) and integrating on \([\delta _{k-\frac{1}{2}}, \delta _{k+\frac{1}{2}}]\), the terms involving the first- and third-order derivatives vanish by symmetry. We have

$$\begin{aligned} \int _{\delta _{k-\frac{1}{2}}}^{\delta _{k+\frac{1}{2}}} \phi (y) \textrm{d}y&= \frac{q}{\sqrt{r p}} \phi (\delta _{k}) + \frac{\phi ''(\delta _{k})}{2} \int _{-q/(2\sqrt{r p})}^{q/(2\sqrt{r p})} x^2 \textrm{d}x + \mathcal {O}_p\bigg (\frac{1 + |\delta _k|^5}{r^{5/2}}\bigg ) \nonumber \\&= \frac{q}{\sqrt{r p}} \phi (\delta _{k}) \bigg \{1 + \frac{q^2}{24 r p}(\delta _k^2 - 1) + \mathcal {O}_p\bigg (\frac{1 + |\delta _k|^5}{r^2}\bigg )\bigg \}. \end{aligned}$$
(33)

Similarly, by taking \(x_0 = \delta _{a}\) in (32) and integrating on \([\delta _{a - c}, \delta _{a - \frac{1}{2}}]\), we have

$$\begin{aligned} \int _{\delta _{a-c}}^{\delta _{a-\frac{1}{2}}} \phi (y) \textrm{d}y = \frac{q}{\sqrt{r p}} \phi (\delta _{a}) \left\{ \left( c - \frac{1}{2}\right) + \frac{q}{\sqrt{r p}} \frac{\delta _{a}}{2} \left( c^2 - \frac{1}{4}\right) + \mathcal {O}_p\left( \frac{1 + |\delta _a|^2}{r}\right) \right\} . \nonumber \\ \end{aligned}$$
(34)

Using (33), (34), and the expression of \(P_{r,p}(k)\) from Lemma 1 when k is in the bulk, the right-hand side of (31) is equal to

$$\begin{aligned} \begin{aligned}&(r p)^{-1/2} \left\{ \begin{array}{l} \frac{1 + p}{6} \int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^3 \phi (y) \textrm{d}y - \frac{1 + p}{2} \int _{\{y \ge \delta _{{\widetilde{a}}}\}} y \, \phi (y) \textrm{d}y - q (c - \frac{1}{2}) \phi (\delta _{{\widetilde{a}}}) \end{array} \right\} \\ + ~&(r p)^{-1} \left\{ \begin{array}{l} \frac{(1 + p)^2}{72} \int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^6 \phi (y) \textrm{d}y - \frac{2 + 3 p + 2 p^2}{12} \int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^4 \phi (y) \textrm{d}y \\ + \frac{3 + 2 p + 3 p^2}{8} \int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^2 \phi (y) \textrm{d}y - \frac{p^2 + q}{12} \int _{\{y \ge \delta _{{\widetilde{a}}}\}} \phi (y) \textrm{d}y \\ - \frac{q^2}{24} \int _{\{y \ge \delta _{{\widetilde{a}}}\}} (y^2 - 1) \phi (y) \textrm{d}y \\ + \tfrac{q}{2} (c - \frac{1}{2}) \delta _{{\widetilde{a}}} \phi (\delta _{{\widetilde{a}}}) - \tfrac{q^2}{2} (c^2 - \frac{1}{4}) \delta _{{\widetilde{a}}} \phi (\delta _{{\widetilde{a}}}) \end{array} \right\} + \mathcal {O}_p(r^{-3/2}), \end{aligned}\nonumber \\ \end{aligned}$$
(35)

where \({\widetilde{a}} :=a - \tfrac{1}{2}\). For \(d\in \mathbb {R}\), consider

$$\begin{aligned} c = \frac{1}{2} + \bigg [\frac{1 + p}{6 q} \cdot \frac{\int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^3 \phi (y) \textrm{d}y}{\phi (\delta _{{\widetilde{a}}})} - \frac{1 + p}{2 q} \cdot \frac{\int _{\{y \ge \delta _{{\widetilde{a}}}\}} y \, \phi (y) \textrm{d}y}{\phi (\delta _{{\widetilde{a}}})}\bigg ] + \frac{d}{q \sqrt{r p}}, \end{aligned}$$

in (35). The terms of order \((r p)^{-1/2}\) cancel out and the d that cancels the terms of order \((r p)^{-1}\) is

$$\begin{aligned} d_{r,p}^{\star }(a) = \left\{ \begin{array}{l} \frac{(1 + p)^2}{72} \int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^6 \phi (y) \textrm{d}y - \frac{2 + 3 p + 2 p^2}{12} \int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^4 \phi (y) \textrm{d}y \\ + \frac{3 + 2 p + 3 p^2}{8} \int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^2 \phi (y) \textrm{d}y - \frac{p^2 + q}{12} \int _{\{y \ge \delta _{{\widetilde{a}}}\}} \phi (y) \textrm{d}y \\ - \frac{q^2}{24} \int _{\{y \ge \delta _{{\widetilde{a}}}\}} (y^2 - 1) \phi (y) \textrm{d}y \\ + \frac{p}{2} \Big [\frac{1 + p}{6} \cdot \frac{\int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^3 \phi (y) \textrm{d}y}{\phi (\delta _{{\widetilde{a}}})} - \frac{1 + p}{2} \cdot \frac{\int _{\{y \ge \delta _{{\widetilde{a}}}\}} y \, \phi (y) \textrm{d}y}{\phi (\delta _{{\widetilde{a}}})}\Big ] \delta _{{\widetilde{a}}} \phi (\delta _{{\widetilde{a}}}) \\ - \tfrac{1}{2} \Big [\frac{1 + p}{6} \cdot \frac{\int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^3 \phi (y) \textrm{d}y}{\phi (\delta _{{\widetilde{a}}})} - \frac{1 + p}{2} \cdot \frac{\int _{\{y \ge \delta _{{\widetilde{a}}}\}} y \, \phi (y) \textrm{d}y}{\phi (\delta _{{\widetilde{a}}})}\Big ]^2 \delta _{{\widetilde{a}}} \phi (\delta _{{\widetilde{a}}}) \end{array} \right\} \frac{1}{\phi (\delta _{{\widetilde{a}}})}. \end{aligned}$$

Now, using the fact that, for \(a\in \mathbb {R}\),

$$\begin{aligned} \begin{aligned}&\int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^6 \phi (y) \textrm{d}y = (15 \delta _{{\widetilde{a}}} + 5 \delta _{{\widetilde{a}}}^3 + \delta _{{\widetilde{a}}}^5) \phi (\delta _{{\widetilde{a}}}) + 15 \Psi (\delta _{{\widetilde{a}}}), \\&\int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^4 \phi (y) \textrm{d}y = (3 \delta _{{\widetilde{a}}} + \delta _{{\widetilde{a}}}^3) \phi (\delta _{{\widetilde{a}}}) + 3 \Psi (\delta _{{\widetilde{a}}}), \\&\int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^3 \phi (y) \textrm{d}y = (2 + \delta _{{\widetilde{a}}}^2) \phi (\delta _{{\widetilde{a}}}), \\&\int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^2 \phi (y) \textrm{d}y = \delta _{{\widetilde{a}}} \phi (\delta _{{\widetilde{a}}}) + \Psi (\delta _{{\widetilde{a}}}), \\&\int _{\{y \ge \delta _{{\widetilde{a}}}\}} y \phi (y) \textrm{d}y = \phi (\delta _{{\widetilde{a}}}), \end{aligned} \end{aligned}$$

where \(\Psi \) denotes the survival function of the standard normal distribution, the c that cancels both braces in (35) is

$$\begin{aligned} c_{r,p}^{\star }(a)&= \frac{1}{2} + \bigg [\frac{1 + p}{6 q} \cdot \frac{\int _{\{y \ge \delta _{{\widetilde{a}}}\}} y^3 \phi (y) \textrm{d}y}{\phi (\delta _{{\widetilde{a}}})} - \frac{1 + p}{2 q} \cdot \frac{\int _{\{y \ge \delta _{{\widetilde{a}}}\}} y \, \phi (y) \textrm{d}y}{\phi (\delta _{{\widetilde{a}}})}\bigg ] + \frac{d_{r,p}^{\star }(a)}{q \sqrt{r p}}\nonumber \\&= \frac{1}{2} + \frac{1+p}{6 q} \big [\delta _{{\widetilde{a}}}^2 - 1\big ] + \frac{1}{q \sqrt{r p}} \left\{ \begin{array}{l} \left[ \begin{array}{l} \frac{(1 + p)^2}{72} \cdot 5 - \frac{2 + 3 p + 2 p^2}{12} \\ + \frac{p(1 + p)}{12} \cdot (1 - 3) \\ + \frac{(1 + p)^2}{72} \cdot (-4 + 6) \end{array} \right] \delta _{{\widetilde{a}}}^3 \\[7mm] \left[ \begin{array}{l} \frac{(1 + p)^2}{72} \cdot 15 - \frac{2 + 3 p + 2 p^2}{12} \cdot 3 \\ + \frac{3 + 2 p + 3 p^2}{8} - \frac{q^2}{24} \\ + \frac{p(1 + p)}{12} \cdot (2 - 3) \\ + \frac{(1 + p)^2}{72} \cdot (-4 + 6 \cdot 2 - 9) \end{array} \right] \delta _{{\widetilde{a}}} \end{array} \right\} \\&= \frac{1}{2} + \frac{1+p}{6 q} \big [\delta _{{\widetilde{a}}}^2 - 1\big ] + \frac{1}{q \sqrt{r p}} \left\{ \begin{array}{l} - \frac{1}{72} \big [5 + 16 p + 17 p^2\big ] \delta _{{\widetilde{a}}}^3 \\ + \frac{1}{36} \big [1 - 4 p - 2 p^2\big ] \delta _{{\widetilde{a}}} \end{array} \right\} .\nonumber \end{aligned}$$

This ends the proof. \(\square \)
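To illustrate the effect of the correction, the sketch below compares the exact survival function \(\sum _{k \ge a} P_{r,p}(k)\) with \(\Psi (\delta _{a - c})\) for the standard choice \(c = \frac{1}{2}\) and for the leading-order part of \(c_{r,p}^{\star }(a)\), namely \(\frac{1}{2} + \frac{1+p}{6 q} [\delta _{{\widetilde{a}}}^2 - 1]\). This is a minimal sketch under stated assumptions: the term of order \((r p)^{-1/2}\) in \(c_{r,p}^{\star }(a)\) is deliberately omitted, the infinite sum is truncated far in the upper tail, and all function names are hypothetical.

```python
import math


def neg_bin_pmf(k: int, r: float, p: float) -> float:
    """P_{r,p}(k) = Gamma(r+k) / (Gamma(r) k!) * q^r * p^k, with q = 1 - p."""
    q = 1.0 - p
    return math.exp(math.lgamma(r + k) - math.lgamma(r) - math.lgamma(k + 1)
                    + r * math.log(q) + k * math.log(p))


def delta(x: float, r: float, p: float) -> float:
    """Standardized point: (x - r p / q) / (sqrt(r p) / q)."""
    q = 1.0 - p
    return (x - r * p / q) / (math.sqrt(r * p) / q)


def normal_sf(x: float) -> float:
    """Survival function Psi of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def exact_sf(a: int, r: float, p: float) -> float:
    """Sum of P_{r,p}(k) over k >= a, truncated 40 standard deviations above the mean."""
    q = 1.0 - p
    upper = int(r * p / q + 40.0 * math.sqrt(r * p) / q) + 1
    return sum(neg_bin_pmf(k, r, p) for k in range(a, upper + 1))


def leading_order_c(a: int, r: float, p: float) -> float:
    """Leading-order part of the refined correction: 1/2 + (1+p)/(6q) * (delta^2 - 1)."""
    q = 1.0 - p
    d = delta(a - 0.5, r, p)
    return 0.5 + (1.0 + p) / (6.0 * q) * (d * d - 1.0)


r, p, a = 500.0, 0.4, 350  # illustrative values
exact = exact_sf(a, r, p)
err_standard = abs(exact - normal_sf(delta(a - 0.5, r, p)))
err_refined = abs(exact - normal_sf(delta(a - leading_order_c(a, r, p), r, p)))
print(f"error with c = 1/2:             {err_standard:.3e}")
print(f"error with leading-order c*(a): {err_refined:.3e}")
```

Adding the \((r p)^{-1/2}\) term from the final display of the proof would refine the approximation further; it is left out here to keep the sketch short.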

Appendix B Moments of the negative binomial distribution

In the lemma below, we compute the second, third, fourth and sixth central moments. It is used to control some expectations in (26) and the \(\asymp _p r^{-1}\) errors in (22) of the proof of Theorem 3. It is also a preliminary result for the proof of Corollary 2 below, where the central moments are bounded on various events.

Lemma 2

(Central moments) Let \(K\sim \textrm{NegBin}(r, p)\) for some \(r > 0\) and \(p\in (0,1)\). We have

$$\begin{aligned} \begin{aligned}&\mathbb {E}[(K - r p q^{-1})^2] = r p q^{-2}, \\&\mathbb {E}[(K - r p q^{-1})^3] = r p q^{-2} \cdot (1 + p) q^{-1}, \\&\mathbb {E}[(K - r p q^{-1})^4] = 3 (r p q^{-2})^2 + \mathcal {O}_p(r), \\&\mathbb {E}[(K - r p q^{-1})^6] = 15 (r p q^{-2})^3 + \mathcal {O}_p(r^2). \end{aligned} \end{aligned}$$

Proof of Lemma 2

This was calculated using Mathematica. \(\square \)
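As an independent sanity check on Lemma 2, the central moments can also be approximated by direct summation of the probability mass function. The sketch below is illustrative only: the function names and the truncation point `k_max` are assumptions, and the parametrization \(P_{r,p}(k) = \Gamma (r+k) / (\Gamma (r) k!) \cdot q^r p^k\) is the one used in Appendix A.

```python
import math


def neg_bin_pmf(k: int, r: float, p: float) -> float:
    """P_{r,p}(k) = Gamma(r+k) / (Gamma(r) k!) * q^r * p^k, with q = 1 - p."""
    q = 1.0 - p
    return math.exp(math.lgamma(r + k) - math.lgamma(r) - math.lgamma(k + 1)
                    + r * math.log(q) + k * math.log(p))


def central_moment(order: int, r: float, p: float, k_max: int = 5000) -> float:
    """E[(K - r p / q)^order], with the sum truncated at k_max (tail mass is negligible here)."""
    q = 1.0 - p
    mean = r * p / q
    return sum((k - mean) ** order * neg_bin_pmf(k, r, p) for k in range(k_max + 1))


r, p = 50.0, 0.3  # illustrative values
q = 1.0 - p
var = r * p / q**2

m2 = central_moment(2, r, p)
m3 = central_moment(3, r, p)
m4 = central_moment(4, r, p)
m6 = central_moment(6, r, p)

print(f"m2 / (r p q^-2)               = {m2 / var:.6f}")
print(f"m3 / (r p q^-2 (1 + p) q^-1)  = {m3 / (var * (1 + p) / q):.6f}")
print(f"m4 / (3 (r p q^-2)^2)         = {m4 / (3 * var**2):.6f}")
print(f"m6 / (15 (r p q^-2)^3)        = {m6 / (15 * var**3):.6f}")
```

The first two ratios should equal one up to rounding, while the last two only approach one as r grows, since the fourth and sixth moments are stated up to \(\mathcal {O}_p(r)\) and \(\mathcal {O}_p(r^2)\) remainders.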

Next, we bound the first and third central moments on various events. The corollary below is used to control the \(\asymp _p r^{-1/2}\) errors in (22) of the proof of Theorem 3.

Corollary 2

Let \(K\sim \textrm{NegBin}(r, p)\) for some \(r > 0\) and \(p\in (0,1)\), and let \(A\in {\mathscr {B}}(\mathbb {R})\) be a Borel set. Then,

Proof of Corollary 2

This follows from Lemma 2 and Hölder’s inequality. \(\square \)

Appendix C Short proof for the asymptotics of the median of a jittered Poisson random variable

In this section, we present a short proof for the asymptotics of the median of a Poisson random variable jittered by a uniform (Theorem 5), using the technique introduced in Section 3.1. Our statement is slightly weaker than Theorem 1 in Coeurjolly and Trépanier (2020), but the proof is conceptually simpler.

Theorem 5

Let \(N_{\lambda }\sim \textrm{Poisson}(\lambda )\) and \(U\sim \text {Uniform}(0,1)\). Then, as \(\lambda \rightarrow \infty \), we have

$$\begin{aligned} \textrm{Median}(N_{\lambda } + U) - \lambda = \frac{1}{3} + \mathcal {O}(\lambda ^{-1}). \end{aligned}$$

Proof

By conditioning on U and using the local limit theorem from Lemma 2.1 in Ouimet (2021), we want to find \(t = \textrm{Median}(N_{\lambda } + U) > 0\) such that

$$\begin{aligned} \frac{1}{2}&= \int _0^1 \mathbb {P}(N_{\lambda } \le t - u) \, \textrm{d}u \nonumber \\&= \mathbb {P}(N_{\lambda } \le \lfloor t \rfloor ) \cdot \{t\} + \mathbb {P}(N_{\lambda } \le \lfloor t \rfloor - 1) \cdot (1 - \{t\}) \nonumber \\&= \Phi (\delta _{\lfloor t \rfloor + 1 - c_{\lambda }^{\star }(\lfloor t \rfloor + 1)}) \cdot \{t\} + \Phi (\delta _{\lfloor t \rfloor - c_{\lambda }^{\star }(\lfloor t \rfloor )}) \cdot (1 - \{t\}) + \mathcal {O}(\lambda ^{-3/2}), \end{aligned}$$
(36)

where \(\{t\}\) denotes the fractional part of t, and

$$\begin{aligned} c_{\lambda }^{\star }(a) = \frac{1}{2} + \frac{1}{6} \big [\lambda ^{-1} (a - \lambda )^2 - 1\big ] + \mathcal {O}(\lambda ^{-1}), \quad \text {for } a - \lambda \asymp 1. \end{aligned}$$
(37)

Now, we have the following Taylor series expansion for \(\Phi \) at 0:

$$\begin{aligned} \Phi (x) = \frac{1}{2} + \frac{x}{\sqrt{2\pi }} + \mathcal {O}(x^3). \end{aligned}$$

Therefore, (36) becomes

$$\begin{aligned} 0 = \frac{1}{\sqrt{2\pi }} \left[ \delta _{\lfloor t \rfloor + 1 - c_{\lambda }^{\star }(\lfloor t \rfloor + 1)} \cdot \{t\} + \delta _{\lfloor t \rfloor - c_{\lambda }^{\star }(\lfloor t \rfloor )} \cdot (1 - \{t\})\right] + \mathcal {O}(\lambda ^{-3/2}). \end{aligned}$$

After rearranging some terms, this is equivalent to

$$\begin{aligned} t - \lambda = c_{\lambda }^{\star }(\lfloor t \rfloor + 1) \cdot \{t\} + c_{\lambda }^{\star }(\lfloor t \rfloor ) \cdot (1 - \{t\}) + \mathcal {O}(\lambda ^{-1}). \end{aligned}$$

By applying the expression for \(c_{\lambda }^{\star }\) in (37), this is \(t - \lambda = \frac{1}{2} - \frac{1}{6} + \mathcal {O}(\lambda ^{-1})\). \(\square \)
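The statement of Theorem 5 can be checked numerically without any asymptotics. On each interval \([m, m+1)\), the distribution function of \(N_{\lambda } + U\) is linear in t, equal to \(\mathbb {P}(N_{\lambda } \le m - 1) + \{t\} \, \mathbb {P}(N_{\lambda } = m)\), which is the second line of (36). The minimal sketch below solves for the median exactly on that piecewise-linear function; the function name is hypothetical.

```python
import math


def jittered_poisson_median(lam: float) -> float:
    """Median of N + U, with N ~ Poisson(lam) and U ~ Uniform(0, 1) independent.

    Walks through m = 0, 1, 2, ... until the CDF of N first reaches 1/2,
    then interpolates linearly inside the interval [m, m + 1).
    """
    cdf_prev = 0.0  # P(N <= m - 1)
    m = 0
    while True:
        pmf = math.exp(-lam + m * math.log(lam) - math.lgamma(m + 1))  # P(N = m)
        if cdf_prev + pmf >= 0.5:
            return m + (0.5 - cdf_prev) / pmf
        cdf_prev += pmf
        m += 1


for lam in (10.0, 50.0, 200.0, 1000.0):
    gap = jittered_poisson_median(lam) - lam
    print(f"lambda = {lam:7.1f}   median - lambda = {gap:.6f}")
```

The printed differences should settle near 1/3 as \(\lambda \) grows, as Theorem 5 predicts.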

Let \(N_1,N_2,\dots ,N_n\sim \textrm{Poisson}(\lambda )\) and \(U_1,U_2,\dots ,U_n\sim \textrm{Uniform}(0,1)\) be i.i.d., and define \(Z_i :=N_i + U_i\) for all \(i\in \{1,2,\dots ,n\}\). Then, we have the convergence in probability of the sample median:

$$\begin{aligned} \widehat{\textrm{Median}}(Z_1, Z_2, \dots , Z_n) {\mathop {\longrightarrow }\limits ^{\mathbb {P}}} \textrm{Median}(Z_1), \quad \text {as } n\rightarrow \infty , \end{aligned}$$

see, e.g., van der Vaart (1998), p.47. We deduce the following corollary.

Corollary 3

With the above notation,

$$\begin{aligned} \widehat{\textrm{Median}}(Z_1, Z_2, \dots , Z_n) - \lambda {\mathop {\longrightarrow }\limits ^{\mathbb {P}}} \frac{1}{3}, \end{aligned}$$

as \(n\rightarrow \infty \) and then \(\lambda \rightarrow \infty \).
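Corollary 3 can be illustrated by simulation. The sketch below draws jittered Poisson observations by inverting a tabulated Poisson distribution function and reports the sample median minus \(\lambda \). The seed, sample size, table cut-off and function names are all illustrative assumptions, and the output fluctuates with the random draws.

```python
import bisect
import math
import random
import statistics


def poisson_cdf_table(lam: float, k_max: int) -> list:
    """Cumulative probabilities P(N <= k) for k = 0, ..., k_max."""
    cdf, total = [], 0.0
    for k in range(k_max + 1):
        total += math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
        cdf.append(total)
    return cdf


def sample_jittered_poisson(lam: float, n: int, rng: random.Random) -> list:
    """Draw n values of N + U by inverse-CDF sampling of N plus an independent uniform."""
    k_max = int(lam + 20.0 * math.sqrt(lam)) + 20  # mass beyond this point is negligible
    cdf = poisson_cdf_table(lam, k_max)
    # bisect_left returns the smallest k with P(N <= k) >= u, i.e. the inverse CDF at u.
    return [bisect.bisect_left(cdf, rng.random()) + rng.random() for _ in range(n)]


rng = random.Random(12345)  # fixed seed, for reproducibility only
lam, n = 50.0, 20001
z = sample_jittered_poisson(lam, n, rng)
print(f"sample median - lambda = {statistics.median(z) - lam:.4f}   (limit: 1/3)")
```

For a fixed \(\lambda \), the sample median converges to \(\textrm{Median}(Z_1)\), which is itself within \(\mathcal {O}(\lambda ^{-1})\) of \(\lambda + \frac{1}{3}\) by Theorem 5.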

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ouimet, F. A refined continuity correction for the negative binomial distribution and asymptotics of the median. Metrika 86, 827–849 (2023). https://doi.org/10.1007/s00184-023-00897-2


  • DOI: https://doi.org/10.1007/s00184-023-00897-2
