Abstract
The Sibuya distribution arises as the distribution of the waiting time for the first success in a sequence of Bernoulli trials in which the probability of success is inversely proportional to the trial number. We study a generalization that can be viewed as the distribution of the excess random variable \(N-k\) given \(N>k\), where N has the Sibuya distribution and k is an integer. We summarize basic facts about this distribution and provide several new results and characterizations, shedding more light on its origin and possible applications. In particular, we emphasize the role the Sibuya distribution plays in extreme value theory and point out its invariance under the random thinning operation.
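The waiting-time description translates directly into a sampling scheme: trial n succeeds with probability \(\alpha /n\), and N is the index of the first success. The following minimal Python sketch (an illustration only; the product form below is the standard Sibuya PMF, \(\mathbb P(N=n)=(\alpha /n)\prod _{j=1}^{n-1}(1-\alpha /j)\)) compares empirical frequencies against that product form. The loop is capped because the Sibuya tail is heavy.

```python
import random

def sibuya_sample(alpha, rng, nmax=10_000):
    """Waiting time for the first success when trial n succeeds w.p. alpha/n.
    The loop is capped at nmax (the tail is heavy); returns nmax + 1 if capped."""
    for n in range(1, nmax + 1):
        if rng.random() < alpha / n:
            return n
    return nmax + 1

def sibuya_pmf(alpha, n):
    """P(N = n) = (alpha/n) * prod_{j=1}^{n-1} (1 - alpha/j)."""
    p = alpha / n
    for j in range(1, n):
        p *= 1 - alpha / j
    return p

rng = random.Random(42)
alpha = 0.5
samples = [sibuya_sample(alpha, rng) for _ in range(50_000)]
emp1 = sum(s == 1 for s in samples) / len(samples)  # should be close to alpha
```

The cap does not bias the estimate of \(\mathbb P(N=1)\), since capped samples are still counted in the denominator.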
References
Aban, I. B., Meerschaert, M. M., Panorska, A. K. (2006). Parameter estimation for the truncated Pareto distribution. Journal of the American Statistical Association, 101, 270–277.
Bondesson, L. (1992). Generalized gamma convolutions and related classes of distributions and densities. Lecture notes in statistics (Vol. 76). Berlin: Springer.
Buddana, A., Kozubowski, T. J. (2014). Discrete Pareto distributions. Economic Quality Control, 29(2), 143–156.
Christoph, G., Schreiber, K. (1998). Discrete stable random variables. Statistics and Probability Letters, 37, 243–247.
Christoph, G., Schreiber, K. (2000). Shifted and scaled Sibuya distribution and discrete self-decomposability. Statistics and Probability Letters, 48(2), 181–187.
Clauset, A., Shalizi, C. R., Newman, M. E. J. (2009). Power-law distributions in empirical data. SIAM Review, 51(4), 661–703.
Devroye, L. (1993). A triptych of discrete distributions related to the stable law. Statistics and Probability Letters, 18, 349–351.
Gabaix, X. (2009). Power laws in economics and finance. Annual Review of Economics, 1, 255–293.
Gradshteyn, I. S., Ryzhik, I. M. (2007). Tables of integrals, series, and products (7th ed.). Amsterdam: Academic Press.
Huillet, T. E. (2012). On Linnik’s continuous-time random walks (updated version of: Huillet, T. E. (2000). On Linnik’s continuous-time random walks. Journal of Physics A: Mathematical and General, 33(14), 2631–2652). Available at http://www.researchgate.net/publication/231129053.
Huillet, T. E. (2016). On Mittag-Leffler distributions and related stochastic processes. Journal of Computational and Applied Mathematics, 296, 181–211.
Johnson, N. L., Kotz, S., Kemp, A. W. (1993). Univariate discrete distributions. New York: John Wiley & Sons.
Johnson, N. L., Kotz, S., Balakrishnan, N. (1994). Continuous univariate distributions (Vol. 1). New York: John Wiley & Sons.
Kozubowski, T.J., Podgórski, K. (2016). Certain bivariate distributions and random processes connected with maxima and minima. Working papers in statistics 2016:9, Department of Statistics, School of Economics and Management, Lund University.
Krishna, H., Singh Pundir, P. (2009). Discrete Burr and discrete Pareto distributions. Statistical Methodology, 6, 177–188.
Lehmann, E. L. (1983). Theory of point estimation. New York: John Wiley & Sons.
Newman, M. E. J. (2005). Power laws, Pareto distributions and Zipf’s law. Contemporary Physics, 46, 323–351.
Pakes, A. G. (1995). Characterization of discrete laws via mixed sums and Markov branching processes. Stochastic Processes and their Applications, 55, 285–300.
Pillai, R. N., Jayakumar, K. (1995). Discrete Mittag-Leffler distributions. Statistics and Probability Letters, 23, 271–274.
Rényi, A. (1976). On outstanding values of a sequence of observations. In Selected papers of A. Rényi (Vol. 3, pp. 50–65). Budapest: Akadémiai Kiadó.
Satheesh, S., Nair, N. U. (2002). Some classes of distributions on the non-negative lattice. Journal of the Indian Statistical Association, 40(1), 41–58.
Sibuya, M. (1979). Generalized hypergeometric, digamma, and trigamma distributions. Annals of the Institute of Statistical Mathematics, 31, 373–390.
Sibuya, M., Shimizu, R. (1981). The generalized hypergeometric family of distributions. Annals of the Institute of Statistical Mathematics, 33, 177–190.
Sornette, D. (2006). Critical phenomena in natural sciences: Chaos, fractals, self-organization and disorder: Concepts and tools (2nd ed.). Berlin: Springer.
Steutel, F. W., van Harn, K. (1979). Discrete analogues of self-decomposability and stability. Annals of Probability, 7, 893–899.
Steutel, F. W., van Harn, K. (2004). Infinite divisibility of probability distributions on the real line. New York: Marcel Dekker.
Stumpf, M. P. H., Porter, M. A. (2012). Critical truths about power laws. Science, 335, 665–666.
Yule, G. U. (1925). A mathematical theory of evolution based on the conclusions of Dr. J.C. Willis, F.R.S. Philosophical Transactions of the Royal Society of London Series B., 213, 21–87.
Zipf, G. K. (1949). Human behavior and the principle of least effort. Cambridge, MA: Addison-Wesley.
Acknowledgements
The authors thank two anonymous referees for their helpful comments. Research of the first author was partially funded by the European Union’s Seventh Framework Programme for research, technological development and demonstration under Grant Agreement No 318984 - RARE. The second author was partially supported by Riksbankens Jubileumsfond Grant Dnr: P13-1024.
Appendix
Proof of Proposition 1
For the variables X and Y in (7), we have
where \(G_{N_\alpha }\) is the PGF given in (11). Thus,
as required. \(\square \)
Proof of Proposition 2
Suppose that, for some \(\alpha \in (0,1)\), N has the Sibuya distribution \(GS_1(\alpha ,0)\), given by the PMF (5). Then, for each \(t\in \mathbb R_+\), the value of the process N(t) defined by (8) admits the stochastic representation (45) with \(c(t)=\mathbb P(X_j>t)\). Since for Sibuya-distributed N we have (42) with \(c=c(t)\), which in turn implies (43), N(t) satisfies (9), as desired.
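This direction of the argument can also be checked by simulation: binomially thinning a Sibuya(\(\alpha \)) count and conditioning on a positive result returns a Sibuya(\(\alpha \)) count. (With the standard Sibuya PGF \(1-(1-s)^\alpha \), thinning with retention probability p gives \(1-p^\alpha (1-s)^\alpha \), and conditioning on positivity removes the factor \(p^\alpha \).) A Monte Carlo sketch, illustrative only; the sampler inverts a truncated CDF to avoid the heavy-tailed waiting-time loop:

```python
import bisect
import random

alpha, p = 0.6, 0.4
rng = random.Random(7)

def sibuya_cdf(a, nmax):
    """CDF of Sibuya(a): p_n = P(N > n-1) * a/n, P(N > n) = prod_{j<=n}(1 - a/j)."""
    cdf, tail, acc = [], 1.0, 0.0
    for n in range(1, nmax + 1):
        acc += tail * a / n
        tail *= 1 - a / n
        cdf.append(acc)
    return cdf

cdf = sibuya_cdf(alpha, 10**5)   # truncation error is negligible here

def sibuya_sample():
    return bisect.bisect_left(cdf, rng.random()) + 1

def thin(n):
    # binomial thinning: keep each of the n points independently w.p. p
    return sum(rng.random() < p for _ in range(n))

kept = []
while len(kept) < 20_000:
    m = thin(sibuya_sample())
    if m > 0:                    # condition on a positive thinned count
        kept.append(m)
emp1 = sum(m == 1 for m in kept) / len(kept)   # should be close to alpha
```

Since \(\mathbb P(M=1\mid M>0)=\alpha \) under the PGF computation above, the empirical frequency of ones among the conditioned thinned counts estimates \(\alpha \) itself.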
Next, assume that N(t) satisfies equation (9). Thus, for each \(t\in \mathbb R_+\), we have
Using a standard conditioning argument, write
Noting that \(\mathbb P(N(t)=n|N=k)=0\) for \(k<n\), while for \(k\ge n\) the conditional distribution of N(t) given \(N=k\) is binomial with parameters k and \(p=1-F(t)\), where F is the common CDF of the \(X_j\)’s, we conclude that
For \(n=0\), we have
We now write \(s=F(t)\in (0,1)\) and \(p_n=\mathbb P(N=n)\) and substitute (58) and (59) into (57), which results in the following equation
Further, by expanding the term \((1-s)^n\) into a power series in s and changing the index of the summation on the left-hand side of (60) to \(j=k-n\), we conclude that
Using the standard result for power series, which states that the coefficients \(c_k\) of the product
are given by
after some algebra we conclude that the left-hand side of (61) is of the form (62) with
Thus, in view of the above, coupled with (61) and the uniqueness of power series expansions, we conclude that
In particular, for \(k=1\), relation (63) reduces to
leading to
It now follows by induction that the \(\{p_n\}\) coincide with Sibuya probabilities (5), where \(\alpha =p_1=\mathbb P(N=1)\). This concludes the proof. \(\square \)
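The probabilities obtained by this induction can be sanity-checked numerically: taking \(p_1=\alpha \) and iterating the ratio \(p_{n+1}/p_n=(n-\alpha )/(n+1)\) (the form taken by the Sibuya PMF (5) in the standard case, assumed here), the generating function \(\sum _n p_n s^n\) should match the standard Sibuya PGF \(1-(1-s)^\alpha \). A quick sketch:

```python
alpha, s = 0.3, 0.7
p, total = alpha, 0.0
for n in range(1, 400):
    total += p * s**n            # accumulate sum of p_n * s^n
    p *= (n - alpha) / (n + 1)   # Sibuya ratio p_{n+1} / p_n
closed = 1 - (1 - s)**alpha      # standard Sibuya PGF at s
```

For \(s<1\) the series converges geometrically, so a few hundred terms suffice.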
Proof of Proposition 4
Since, in view of (13), the results of Proposition 4 and Corollary 1 are equivalent, it is enough to establish (16). First, using the well-known property of the gamma function,
the generalized Sibuya SF (12) can be written as
Next, since for any \(\gamma >0\) we have the asymptotic representation of the gamma function (see, e.g., Gradshteyn and Ryzhik 2007, formula 8.328.2, p. 895)
the right-hand side of (7) divided by the right-hand side of (16) converges to 1 as \(n\rightarrow \infty \), as desired. \(\square \)
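The asymptotic relation used here, \(\Gamma (x+\gamma )/\Gamma (x)\sim x^{\gamma }\) as \(x\rightarrow \infty \), is easy to verify numerically via the log-gamma function (a sketch, not tied to the specific constants of the proposition):

```python
import math

def scaled_gamma_ratio(x, g):
    # Gamma(x + g) / Gamma(x) * x**(-g); tends to 1 as x -> infinity
    return math.exp(math.lgamma(x + g) - math.lgamma(x) - g * math.log(x))

# evaluate along x = 10, 100, 1000, 10000 for gamma = 0.7
vals = [scaled_gamma_ratio(10.0**k, 0.7) for k in range(1, 5)]
```

The deviation from 1 is of order \(1/x\), so each decade of x shrinks it by roughly a factor of ten.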
Proof of Proposition 5
By Proposition 7, we have the equality in distribution \(N\mathop {=}\limits ^{d}N(X)\), where \(\{N(t), \,t>0\}\) is a standard Poisson process independent of \(X\mathop {=}\limits ^{d}X_1 X_{\nu -\alpha +1}/X_\alpha \), where the three variables on the right-hand side are independent and gamma distributed with scale one and the shape parameters indicated by their subindices. The result now follows from (17), the representation (18) for the integer-order moments of N(t), and the well-known moment formulas for the gamma distribution, which produce
\(\square \)
Proof of Proposition 6
By Proposition 7, the PGF of N is given by (33), where \(\phi _X(\cdot )\) is the LT of the variable X defined in (27). To prove the result, it is enough to show that the LT of X is given by (34). To establish the latter, we condition on \(T_{\alpha ,\nu }\) when computing the LT of X, leading to
where f(x) is given in (28) and E is standard exponential with the LT
Thus,
where B(a, b) is the beta function (29). The result now follows by the integration formula 3.227.1 p. 320 of Gradshteyn and Ryzhik (2007). \(\square \)
Proof of Proposition 7
It is known (see, e.g., Devroye 1993) that the generalized hypergeometric distribution of type B3, given in (30) with X as in (32), is of the form
Setting \(a=1\), \(b=1-\alpha +\nu \), and \(c=\alpha \) in (64) produces the \(GS_0(\alpha , \nu )\) distribution. \(\square \)
Proof of Proposition 9
We proceed by showing that the PMF of the variable [W] coincides with that of the \(GS_0(\alpha ,\nu )\) distribution. First, using a standard conditioning argument, write
where E has the standard exponential distribution and g is the PDF of \(V_{\alpha ,\nu }\), given by (36). Since
the probability (65) takes on the form
where
Noting that the function \(g(\cdot )\) in (36) is a genuine PDF for each \(\nu \ge 0\) and \(0<\alpha <\nu +1\), we conclude that
A substitution of (66) into (7), followed by some algebra, produces the \(GS_0(\alpha ,\nu )\) distribution. This concludes the proof. \(\square \)
Proof of Proposition 11
To prove the result, we shall use the following sufficient condition for this property to hold (Bondesson 1992, p. 28): A strictly decreasing PMF \(\{p_n\}\), \(n\in \mathbb N_0\), is DSD if
First, we shall show that the generalized Sibuya PMF is strictly decreasing in n. To this end, note that the ratio
is strictly increasing in \(n\in \mathbb N_0\). Indeed, the derivative of the function
is positive for all \(x\in \mathbb R_+\), which can be checked by straightforward algebra. Since the ratio (68) converges to 1 as \(n\rightarrow \infty \), we conclude that \(p_{n+1}/p_n<1\) for all \(n\in \mathbb N_0\), showing that the sequence \(\{p_n\}\), \(n\in \mathbb N_0\), is strictly decreasing. This also shows that the maximum on the left-hand side of (67) is attained at \(n=j\), so that the condition (67) becomes
After some algebra, condition (69) can be restated as
Since
and the function
is non-decreasing in \(x\in \mathbb R_+\), we obtain (70). This concludes the proof. \(\square \)
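In the standard Sibuya case (\(\nu =0\)), where the ratio (68) takes the explicit form \((n-\alpha )/(n+1)\) (assumed here for illustration), the monotonicity step used above is straightforward to verify numerically. A quick sketch:

```python
alpha = 0.4
# ratio p_{n+1}/p_n for the standard Sibuya PMF, n = 1, ..., 199
ratios = [(n - alpha) / (n + 1) for n in range(1, 200)]
increasing = all(a < b for a, b in zip(ratios, ratios[1:]))  # strictly increasing
below_one = all(r < 1 for r in ratios)                       # bounded by 1
```

Since the ratios increase toward 1 while staying below it, the PMF is strictly decreasing, as claimed.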
Proof of Proposition 12
According to the remarks following the statement of Proposition 12, condition (42) implies (44), which, in view of (45), is equivalent to (9). The result now follows from Proposition 2. \(\square \)
Proof of Proposition 14
For \(n=1\), the statement is trivial. To prove the result for general \(n\in \mathbb N\), it is enough to show the following fact:
(A) For each \(n\ge 2\), the conditional distribution of \(T_n\) given the \(n-1\) values \(0< t_1< \cdots<t_{n-1}<1\) of the previous jump locations has a uniform distribution on the interval \((t_{n-1}, 1)\).
Indeed, if (A) is true, the PDF of the joint distribution of \((T_1, \ldots ,T_n)\) is easily seen to be given by (48). This, in turn, is the joint PDF of the random vector on the right-hand side of (47). To see this, consider a random vector \((\Gamma _1, \ldots , \Gamma _n)\) of successive arrivals of a standard Poisson process, so that \(\Gamma _i=W_1+\cdots + W_i\), \(i=1,\ldots ,n\), where the \(\{W_i\}\) are IID standard exponential variables. Routine calculations show that the PDF of \((\Gamma _1, \ldots , \Gamma _n)\) is of the form
Consider the one-to-one transformation \(T_i=H(\Gamma _i)\), \(i=1,\ldots ,n\), where \(H(x)=1-e^{-x}\) is the common CDF of the \(W_i\)’s, with inverse \(H^{-1}(t)=-\log (1-t)\). Since the Jacobian of the inverse transformation is the product
the PDF of \((T_1, \ldots , T_n)\) becomes
which produces (48).
To establish the claim (A) above, we start with \(n=2\), and consider the conditional probability \(\mathbb P(T_2>t|T_1=t_1)\) for \(t_1<t<1\). Using the law of total probability, we obtain
where \((K_i,R_i)\) are the random pairs of record times and their sizes (with \(R_i=1-T_i\)), connected with the sequence \(\{nU_n\}\) (as described in Sect. 5). Note that the probability under the above sum can be written in terms of the \(\{U_n\}\) as
or, equivalently, as
where
When compared with (4), \(p(r_1, k)\) is recognized as the PMF of \(S\sim GS_0(r_1,1)\) and consequently,
Since the quantity on the right-hand side above is the survival function of the uniform distribution on the interval \((t_1,1)\), the result holds for \(n=2\). The proof in the case \(n>2\) is similar.
Under the same notation and using again the law of total probability, we have
where \(A_{n-1}\) denotes the condition \(T_1=t_1, \ldots , T_{n-1}=t_{n-1}\). Similarly as before, the conditional probabilities under the double sum above can be expressed as
where
is recognized as the probability \(\mathbb P(S=m)\) with \(S\sim GS_1(r_1,k)\). Since these probabilities sum up to one across the values of \(m\in \mathbb N_0\), and so do the probabilities \(\mathbb P(K_{n-1}=k)\) across the values of \(k\ge n-1\), we obtain
Since the quantity on the right-hand side above is the survival function of the uniform distribution on the interval \((t_{n-1},1)\), the result follows. \(\square \)
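Claim (A) can also be probed by simulation: with \(T_i=H(\Gamma _i)\) as in the proof, \(T_2=1-(1-T_1)e^{-W_2}\), so the rescaled increment \((T_2-T_1)/(1-T_1)=1-e^{-W_2}\) should be uniform on (0, 1) regardless of \(T_1\). A Monte Carlo sketch (illustrative only):

```python
import math
import random

rng = random.Random(123)

def jump_pair():
    # first two Poisson arrivals transformed by H(x) = 1 - e^{-x}
    g1 = rng.expovariate(1.0)
    g2 = g1 + rng.expovariate(1.0)
    return 1 - math.exp(-g1), 1 - math.exp(-g2)

pairs = [jump_pair() for _ in range(50_000)]
# claim (A) for n = 2: given T_1 = t_1, T_2 is uniform on (t_1, 1),
# so the rescaled increment should be uniform on (0, 1)
u = [(t2 - t1) / (1 - t1) for t1, t2 in pairs]
mean_u = sum(u) / len(u)                # should be close to 1/2
mean_u2 = sum(x * x for x in u) / len(u)  # should be close to 1/3
```

Matching the first two moments of the uniform law is of course only a sanity check, not a proof; the full argument is the one given above.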
Proof of Proposition 17
Write the estimators as
where
whenever \(y_2-2y_1^2+y_1>0\), while otherwise \(H_1(y_1,y_2)=0\) and \(H_2(y_1,y_2)=1/y_1\) (with \(y_1, y_2\ge 1\)). To prove consistency, apply the law of large numbers to the sequence \(Z_i=(X_i, X_i^2)'\) and conclude that the sample mean \(\overline{Z}_n=(M_1,M_2)'\) converges in probability to the population mean \(m_Z=\mathbb E(Z_i) = (\mu _1,\mu _2)'\), where
are the first two moments of the \(GS_1(\beta , \theta )\) distribution (well defined when \(\theta >2\beta \)). Since the function H is continuous at \(m_Z\), by the continuous mapping theorem, the sequence (71) converges in probability to \(H(m_Z) = (\beta , \theta )\). The last equality follows by straightforward, albeit tedious, algebra. This proves that the estimators are consistent.
Next, we establish their asymptotic normality. Assuming the fourth moment of the \(\{X_i\}\) is finite (\(\theta >4\beta \)), by the classical multivariate central limit theorem, we have the convergence in distribution \(\sqrt{n} (\overline{Z}_n - m_Z) \mathop {\rightarrow }\limits ^{d} \text{ N }(0,\Sigma )\), where the right-hand side denotes the bivariate normal distribution with mean vector zero and covariance matrix
A straightforward calculation, facilitated by Propositions 5 and 3, along with basic properties of expectation, shows that
Thus, since the function H is differentiable at \(m_Z\), the standard multivariate delta method leads to the conclusion that, as \(n\rightarrow \infty \), the variables
converge in distribution to a bivariate normal vector with mean vector zero and covariance matrix \(\Sigma _{MME} = D\Sigma D'\), where
is the matrix of partial derivatives of the vector-valued function H evaluated at \(m_Z\). A routine, rather lengthy calculation yields
Finally, straightforward matrix multiplication produces the asymptotic covariance matrix \(\Sigma _{MME}\). \(\square \)
Cite this article
Kozubowski, T.J., Podgórski, K. A generalized Sibuya distribution. Ann Inst Stat Math 70, 855–887 (2018). https://doi.org/10.1007/s10463-017-0611-3