On a General Class of Discrete Bivariate Distributions

Kundu, Debasis

doi:10.1007/s13571-019-00194-x


Published: 09 March 2019

Volume 82, pages 270–304, (2020)

Sankhya B

Debasis Kundu ORCID: orcid.org/0000-0002-9141-422X¹


Abstract

In this paper we develop a general class of bivariate discrete distributions. The basic idea is quite simple: the marginals are obtained by taking a random geometric sum of the baseline random variables. The proposed class is flexible in the sense that the marginals can take a variety of shapes. The probability mass functions of the marginals can be heavy tailed, unimodal, or multimodal, and they can be over-dispersed as well as under-dispersed. We discuss different properties of the proposed class of bivariate distributions, which also has some interesting physical interpretations. Further, we consider two specific baseline distributions for illustrative purposes: the Poisson and the negative binomial distributions. Both of them are infinitely divisible. The maximum likelihood estimators of the unknown parameters cannot be obtained in closed form. They can be obtained by solving three- and five-dimensional non-linear optimization problems, respectively. To avoid that, we propose to use an expectation maximization (EM) algorithm, and it is observed that the proposed algorithm can be implemented quite easily in practice. We have performed some simulation experiments to see how the proposed EM algorithm performs, and it works quite well in both cases. The analysis of one real data set has been performed to show the effectiveness of the proposed class of models. Finally, we discuss some open problems and conclude the paper.


References

Adamidis, K. (1999). An EM algorithm for estimating negative binomial parameters. Australian New Zealand J. Statist. 41, 213–221.
Barreto-Souza, W. (2012). Bivariate gamma-geometric law and its induced Levy process. J. Multivar. Anal. 109, 130–145.
Basu, A.P. and Dhar, S.K. (1995). Bivariate geometric distribution. J. Appl. Statist. Sci. 2, 33–34.
Campbell, J.T. (1934). The Poisson correlation function. In Proceedings of the Edinburgh Mathematical Society, Series 2, vol. 4, pp. 18–26.
Chahkandi, M. and Ganjali, M. (2009). On some lifetime distributions with decreasing failure rate. Comput. Statist. Data Anal. 53, 4433–4440.
Davis, C.S. (2002). Statistical Methods for the Analysis of Repeated Measurements. Springer, New York.
Holgate, B. (1964). Estimation for the bivariate Poisson distribution. Biometrika 51, 241–245.
Jayakumar, K. and Mundassery, D.A. (2007). On a bivariate geometric distribution. Statistica LXVII, 389–404.
Kemp, A.W. (2013). New discrete Appell and Humbert distributions with relevance to bivariate accident data. J. Multivar. Anal. 113, 2–6.
Kocherlakota, S. (1995). Discrete bivariate weighted distributions under multiplicative weight function. Commun. Statist. Theory Methods 24, 533–551.
Kocherlakota, S. and Kocherlakota, K. (1992). Bivariate Discrete Distributions. Marcel Dekker, New York.
Kostadinova, K. and Minkova, L. (2014). On a bivariate Poisson negative binomial process. Biomath 3, 1–6.
Kumar, C.S. (2008). A unified approach to bivariate discrete data. Metrika 67, 113–123.
Kundu, D. (2014). Geometric skewed normal distribution. Sankhya, Ser. B. 76, 167–189.
Kundu, D. (2017). Multivariate geometric skew normal distribution. Statistics 51, 1377–1397.
Kundu, D. and Nekoukhou, V. (2018). Univariate and bivariate geometric discrete generalized exponential distributions. J. Statist. Theory Pract. 12, 595–614.
Kozubowski, T.J., Panorska, A.K. and Podgorski, K. (2008). A bivariate Levy process with negative binomial and gamma marginals. J. Multivar. Anal. 99, 1418–1437.
Kozubowski, T.J., Panorska, A.K. and Qeadan, F. (2011). A new multivariate model involving geometric sums and maxima of exponentials. J. Statist. Planning Inference 141, 2353–2367.
Lee, H. and Cha, J.H. (2014). On construction of general class of bivariate distributions. J. Multivar. Anal. 127, 151–159.
Lee, H. and Cha, J.H. (2015). On two general classes of discrete bivariate distributions. Am. Statist. 69, 221–230.
Minkova, L.D. and Balakrishnan, N. (2014). Bivariate Pólya-Aeppli distribution. Commun. Statist. Theory Methods 43, 5026–5038.
Nekoukhou, V. and Kundu, D. (2017). Bivariate discrete generalized exponential distribution. Statistics 51, 1143–1158.
Ozel, G. (2011). On certain properties of a class of bivariate compound Poisson distribution and an application to earthquake data. Revista Colombiana de Estadística 34, 545–566.
Piperigou, V.E. and Papageorgiou, H. (2003). On bivariate discrete distributions: A unified treatment. Metrika 58, 221–233.
Ye, Y. (2015). Likelihood Inference for Type-I Bivariate Pólya-Aeppli Distribution. MS Thesis, McMaster University.

Acknowledgements

The author would like to thank the anonymous reviewers for providing constructive suggestions that improved the paper significantly.

Author information

Authors and Affiliations

Department of Mathematics and Statistics, Indian Institute of Technology Kanpur, Kanpur, Pin 208016, India
Debasis Kundu


Corresponding author

Correspondence to Debasis Kundu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Exact Expressions of C(a, j)

Note that

$$ C(a,0) = \sum\limits_{k = 1}^{\infty} e^{-a k} = \frac{e^{-a}}{1 - e^{-a}} = \frac{1}{e^{a}-1}. $$

To compute C(a,1), first observe that

$$ C(a,1) = \sum\limits_{k = 1}^{\infty} e^{-a k} k = \frac{1}{e^{a}} + \frac{2}{e^{2a}} + \frac{3}{e^{3a}} + \cdots $$

and

$$ e^{a} C(a,1) = 1 + \frac{2}{e^{a}} + \frac{3}{e^{2a}} + \frac{4}{e^{3a}} + \cdots. $$

Hence

$$ (e^{a} - 1) C(a,1) = 1 + \frac{1}{e^{a}} + \frac{1}{e^{2a}} + \frac{1}{e^{3a}} + {\cdots} = \frac{e^{a}}{e^{a} - 1}. $$

Therefore

$$ C(a,1) = \frac{e^{a}}{(e^{a} - 1)^{2}}. $$

Now to compute C(a,2), note that

$$ C(a,2) = \sum\limits_{k = 0}^{\infty} e^{-a k} k^{2} = \sum\limits_{k = 0}^{\infty} e^{-a k} k(k-1) + \sum\limits_{k = 0}^{\infty} e^{-a k} k. $$

If we denote $\displaystyle S = \sum \limits _{k = 0}^{\infty } e^{-a k} k(k-1)$, then

$$ S = \frac{2 \cdot 1}{e^{2a}} + \frac{3 \cdot 2}{e^{3a}} + \frac{4 \cdot 3}{e^{4a}} + \cdots $$

and

$$ e^{a} S = \frac{2 \cdot 1}{e^{a}} + \frac{3 \cdot 2}{e^{2a}} + \frac{4 \cdot 3}{e^{3a}} + \cdots. $$

Hence

$$ S(e^{a}-1) = \frac{2 \cdot 1}{e^{a}} + \frac{2 \cdot 2}{e^{2a}} + \frac{2 \cdot 3}{e^{3a}} + {\cdots} = 2 \sum\limits_{k = 1}^{\infty} k e^{-ak} = \frac{2 e^{a}}{(e^{a}-1)^{2}}. $$

Therefore,

$$ C(a,2) = \frac{2 e^{a}}{(e^{a}-1)^{3}} + \frac{e^{a}}{(e^{a}-1)^{2}} = \frac{e^{2a} + e^{a}}{(e^{a}-1)^{3}}. $$
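The three closed forms above can be checked numerically. The following is a minimal sketch, assuming $C(a,j) = \sum_{k \ge 1} k^{j} e^{-ak}$ with $a > 0$, as the displayed sums suggest; the function names `c_direct` and `c_closed` are hypothetical and used only for illustration.

```python
import math

def c_direct(a, j, terms=2000):
    """Truncated version of C(a, j) = sum_{k>=1} k^j e^{-a k} (assumes a > 0)."""
    return sum(k ** j * math.exp(-a * k) for k in range(1, terms + 1))

def c_closed(a, j):
    """Closed forms for j = 0, 1, 2 as derived in this appendix."""
    e = math.exp(a)
    if j == 0:
        return 1.0 / (e - 1.0)
    if j == 1:
        return e / (e - 1.0) ** 2
    if j == 2:
        return (e * e + e) / (e - 1.0) ** 3
    raise ValueError("closed form only coded for j = 0, 1, 2")

if __name__ == "__main__":
    # Print the truncated sum next to the closed form for a = 0.5.
    for j in range(3):
        print(j, c_direct(0.5, j), c_closed(0.5, j))
```

The truncation at 2000 terms is an arbitrary choice; for small values of $a$ more terms would be needed.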

We will use the following notations:

$$ \begin{array}{@{}rcl@{}} S_{0}(a) &=& \sum\limits_{k = 0}^{\infty} k e^{-k a}, \ \ S_{1}(a) = \sum\limits_{k = 0}^{\infty} k(k-1) e^{-k a}, \ldots,\\ S_{m}(a) &=& \sum\limits_{k = 0}^{\infty} k(k-1) {\cdots} (k-m) e^{-k a}. \end{array} $$

Then, using the fact that

$$ e^{a} S_{m}(a) = \sum\limits_{k = 0}^{\infty} k(k-1) {\cdots} (k-m) e^{-(k-1) a}, $$

we can easily obtain the following relation

$$ S_{m}(a)(e^{a}-1) = (m + 1) S_{m-1}(a). $$

Further note that if we denote

$$ \begin{array}{@{}rcl@{}} k^{m} &=& C_{0m} k(k-1) {\cdots} (k-m + 1) + C_{1m} k(k-1) {\cdots} (k-m + 2) \\&&+ {\cdots} + C_{m-2,m} k (k-1) + C_{m-1,m} k, \end{array} $$

then $C_{0m}, C_{1m}, \ldots, C_{m-1,m}$ can be obtained recursively from the following set of linear equations: $C_{0m} = 1$ and

$$ \begin{array}{@{}rcl@{}} -C_{0m} \sum\limits_{1 \le i_1 \le m-1} i_1 + C_{1m} \!\!\!& = &\!\!\! 0 \\ C_{0m} \sum\limits_{1 \le i_1 < i_2 \le m-1} i_1 i_2 - C_{1m} \sum\limits_{1 \le i_1 \le m-2} i_1 + C_{2m} \!\!\!& = &\!\!\! 0 \\ - C_{0m} \sum\limits_{1 \le i_1 < i_2 < i_3 \le m-1} i_1 i_2 i_3 + C_{1m} \sum\limits_{1 \le i_1 < i_2 \le m-2} i_1 i_2 - C_{2m} \sum\limits_{1 \le i_1 \le m-3} i_1 + C_{3m} \!\!\!& = &\!\!\! 0 \\ \!\!\!& {\vdots} &\!\!\! \\ (-1)^{m-1} C_{0m} \prod\limits_{i = 1}^{m-1} i + (-1)^{m-2} C_{1m} \prod\limits_{i = 1}^{m-2} i + (-1)^{m-3} C_{2m} \prod\limits_{i = 1}^{m-3} i + {\cdots} - C_{m-2,m} + C_{m-1,m} \!\!\!& = &\!\!\! 0. \end{array} $$

If we use the following notation for $n < m$,

$$ a_{nm} = \sum\limits_{1 \le i_{1} < i_{2} < {\ldots} < i_{n} \le m} i_{1} i_{2} {\cdots} i_{n}, $$

and $\displaystyle a_{mm} = \prod \limits _{i = 1}^{m} i$, then clearly

$$ a_{n,m + 1} = a_{n,m} + (m + 1)a_{n-1,m}, $$

and we obtain

$$ \begin{array}{@{}rcl@{}} C_{1m} & = & a_{1,m-1} \\ C_{2m} & = & C_{1m} a_{1,m-2} - a_{2,m-1} \\ C_{3m} & = & C_{2m} a_{1,m-3} - C_{1m} a_{2,m-2} + a_{3,m-1} \\ & {\vdots} & \\ C_{m-1,m} & = & C_{m-2,m} a_{1,1} - C_{m-3,m} a_{2,2} + {\cdots} + (-1)^{m-2} a_{m-1,m-1}. \end{array} $$

Since we have

$$ C(a,m) = C_{0m} S_{m-1}(a) + C_{1m}S_{m-2}(a) + {\ldots} + C_{m-1,m} S_{0}(a), $$

we can obtain recursively C(a, m + 1) from C(a, m).
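The recursion can be sketched in a few lines. This is an illustrative sketch rather than the author's implementation: it assumes $C(a,m) = \sum_{k \ge 1} k^{m} e^{-ak}$, expands $k^{m}$ in the falling factorials $k(k-1)\cdots(k-j)$, builds $a_{nm}$ from $a_{n,m+1} = a_{n,m} + (m+1) a_{n-1,m}$, and obtains $S_{m}(a)$ from $S_{0}(a) = C(a,1)$ and $S_{m}(a)(e^{a}-1) = (m+1) S_{m-1}(a)$. The names `a_coef`, `c_coefs`, `s_falling` and `c_exact` are hypothetical.

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def a_coef(n, m):
    """a_{nm}: sum of all products of n distinct integers from {1, ..., m}."""
    if n == 0:
        return 1
    if n > m:
        return 0
    # a_{n,m} = a_{n,m-1} + m * a_{n-1,m-1}
    return a_coef(n, m - 1) + m * a_coef(n - 1, m - 1)

def c_coefs(m):
    """Coefficients [C_{0m}, ..., C_{m-1,m}] of k^m in falling factorials."""
    c = [1]
    for j in range(1, m):
        c.append(sum((-1) ** (i + 1) * c[j - i] * a_coef(i, m - 1 - (j - i))
                     for i in range(1, j + 1)))
    return c

def s_falling(m, a):
    """S_m(a) = sum_k k(k-1)...(k-m) e^{-ka}, built from S_0 by the recursion."""
    e = math.exp(a)
    s = e / (e - 1.0) ** 2          # S_0(a) = C(a, 1)
    for i in range(1, m + 1):
        s *= (i + 1) / (e - 1.0)    # S_i = (i + 1) S_{i-1} / (e^a - 1)
    return s

def c_exact(a, m):
    """C(a, m) for integer m >= 0 and a > 0."""
    if m == 0:
        return 1.0 / (math.exp(a) - 1.0)
    c = c_coefs(m)
    return sum(c[j] * s_falling(m - 1 - j, a) for j in range(m))
```

For example, `c_coefs(3)` gives the coefficients of $k^{3} = k(k-1)(k-2) + 3k(k-1) + k$.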

Appendix B: Expressions of P(N = n|X = x, Y = y)

In this appendix we provide the expressions of P(N = n|X = x, Y = y) for both BPG and BNBG models. Suppose (X, Y ) ∼ BPG(λ₁, λ₂, p), then

$$ \begin{array}{@{}rcl@{}} P(N=n| X= x, Y= y) & = & \frac{P(X=x, Y=y|N=n) P(N = n)}{P(X = x, Y = y)} \\ & = & \frac{n^{x+y} e^{-n(\lambda_1+\lambda_2)} (1-p)^n}{C(\lambda_1+\lambda_2-\ln (1-p), x+y)}. \end{array} $$

Now to compute $\arg\max_{n} P(N = n|X = x, Y = y)$, we consider

$$ g(n) = \frac{P(N=n + 1| X= x, Y= y)}{P(N=n| X= x, Y= y)} = \left( \frac{n + 1}{n} \right)^{x+y} e^{-(\lambda_{1}+\lambda_{2})} (1-p). $$

It is immediate that $g(n)$ is a non-increasing function of $n$, so $P(N = n|X = x, Y = y)$ is either a decreasing or a unimodal function of $n$. Hence, if $n^{*} = \arg\max_{n} P(N = n|X = x, Y = y)$ and $x + y > 0$, then $n^{*}$ is the smallest integer greater than

$$ \left( \left[ \frac{e^{\lambda_{1}+\lambda_{2}}}{1-p} \right]^{1/(x+y)} - 1 \right)^{-1}. $$
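As an illustration, the threshold above translates directly into code. The sketch below assumes the BPG posterior weights are proportional to $n^{x+y} e^{-n(\lambda_1+\lambda_2)} (1-p)^{n}$ for $n = 1, 2, \ldots$; the function name `bpg_mode` is hypothetical, and the case $x + y = 0$ (where the weights are strictly decreasing in $n$) is handled separately by returning 1.

```python
import math

def bpg_mode(x, y, lam1, lam2, p):
    """Mode n* of P(N = n | X = x, Y = y) under the BPG model (sketch)."""
    s = x + y
    if s == 0:
        # Weights are geometric in n, so the mode is at n = 1.
        return 1
    # [e^{lam1 + lam2} / (1 - p)]^{1 / (x + y)}, computed on the log scale.
    ratio = math.exp((lam1 + lam2 - math.log1p(-p)) / s)
    threshold = 1.0 / (ratio - 1.0)
    # Smallest integer strictly greater than the threshold.
    return math.floor(threshold) + 1

if __name__ == "__main__":
    print(bpg_mode(5, 7, 0.8, 1.1, 0.3))
```

If the threshold happens to be an exact integer $t$, then $g(t) = 1$ and both $t$ and $t + 1$ maximize the posterior; the sketch returns $t + 1$ in that case.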

Now suppose (X, Y ) ∼BNBG(r₁, 𝜃₁, r₂, 𝜃₂, p), then

$$ \begin{array}{@{}rcl@{}} P(N=n| X= x, Y= y) & = & \frac{P(X=x, Y=y|N=n) P(N = n)}{P(X = x, Y = y)} \\ & = & \frac{(1-p)^n (1-\theta_1)^{nr_1} (1-\theta_2)^{nr_2}}{D(r_1, r_2, \theta_1, \theta_2, x, y, p)} \\&&\times \frac{{\Gamma}(x+nr_1)}{x! {\Gamma}(nr_1)} \frac{{\Gamma}(y+nr_2)}{y! {\Gamma}(nr_2)}. \end{array} $$

Hence,

$$ \begin{array}{@{}rcl@{}} g(n) & = & \frac{P(N=n + 1| X= x, Y= y)}{P(N=n| X= x, Y= y)} \\ & = & (1-p) (1-\theta_1)^{r_1} (1-\theta_2)^{r_2} \times \\ & & \frac{{\Gamma}(x+(n + 1)r_1) {\Gamma}(y+(n + 1)r_2)}{{\Gamma}((n + 1)r_1) {\Gamma}((n + 1)r_2)} \times \frac{{\Gamma}(nr_1) {\Gamma}(nr_2)}{{\Gamma}(x+nr_1) {\Gamma}(y+nr_2)}. \end{array} $$

In this case, because of the complicated nature of $g(n)$, it is not possible to show that $P(N = n|X = x, Y = y)$, as a function of $n$, has a unique maximum. But in all our numerical experiments it has been observed to have a unique maximum. We have chosen $n^{*}$ to be the minimum $n$ such that $g(n) < 1$.
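The search for the minimum $n$ with $g(n) < 1$ can be sketched numerically as follows. This is an illustration under stated assumptions, not the author's code: $g(n)$ is evaluated on the log scale with `math.lgamma` to avoid overflow in the gamma functions, and the names `log_g_bnbg`, `bnbg_mode` and the search cap `n_max` are hypothetical.

```python
import math

def log_g_bnbg(n, x, y, r1, th1, r2, th2, p):
    """log g(n) = log P(N = n + 1 | x, y) - log P(N = n | x, y) for BNBG."""
    const = (math.log1p(-p) + r1 * math.log1p(-th1)
             + r2 * math.log1p(-th2))

    def lg_ratio(count, r):
        # log of Gamma(count + (n+1) r) Gamma(n r)
        #        / (Gamma((n+1) r) Gamma(count + n r))
        return (math.lgamma(count + (n + 1) * r) - math.lgamma((n + 1) * r)
                - math.lgamma(count + n * r) + math.lgamma(n * r))

    return const + lg_ratio(x, r1) + lg_ratio(y, r2)

def bnbg_mode(x, y, r1, th1, r2, th2, p, n_max=100000):
    """Minimum n >= 1 such that g(n) < 1 (sketch, linear search)."""
    for n in range(1, n_max + 1):
        if log_g_bnbg(n, x, y, r1, th1, r2, th2, p) < 0.0:
            return n
    raise RuntimeError("no n with g(n) < 1 found below n_max")

if __name__ == "__main__":
    print(bnbg_mode(6, 9, 1.5, 0.4, 2.0, 0.3, 0.2))
```

A linear search is used for simplicity; for very large counts a bracketing search on `log_g_bnbg` would be a natural refinement.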

Appendix C: EM Algorithm for Non-Identical NB Distribution

In this appendix we show how to obtain the MLEs of $r$ and $\theta$ based on a sample $\{x_{1}, \ldots, x_{m}\}$, when $X_{i} \sim \text{NB}(n_{i} r, \theta)$, the $X_{i}$'s are independent, and the $n_{i}$'s are known for $i = 1, \ldots, m$. We use an EM algorithm very similar to that of Adamidis (1999). We use the following notation: $\alpha = -(\ln(1-\theta))^{-1}$ and $r = \alpha \lambda$, and provide the algorithm to compute the MLEs of $\lambda$ and $\theta$. In this case, at each E-step the corresponding M-step can be obtained in explicit form. Using the same notation as in Adamidis (1999), it can be easily seen that for $i = 1, \ldots, m$,

$$ X_{i} \overset{d}{=} \sum\limits_{j = 1}^{M_{i}} Y_{ij}, $$

where the $Y_{ij}$'s are i.i.d. random variables following the logarithmic series distribution (LSD) with probability mass function

$$ f_{LSD}(y; \theta) = \frac{\alpha \theta^{y}}{y}; \ \ \ y \in \mathbb{N} = \{1, 2, 3, \ldots\}, $$

$M_{i} \sim \text{PO}(n_{i} \lambda)$, and all the random variables are independently distributed. Further, if the $Z_{ij}$'s are i.i.d. random variables with PDF

$$ f(z; \theta) = \alpha^{-1} \frac{(1-\theta)^{z}}{\theta}; \ \ \ z \in (0,1), $$

and 0 otherwise, then the log-likelihood function of the 'complete data' $(Y_{ij}, Z_{ij}, M_{i}; i = 1, \ldots, m, j = 1, \ldots, M_{i})$, without the additive constant, can be written as

$$ \begin{array}{@{}rcl@{}} l^{*}(\lambda, \theta) &=& - \lambda \widetilde{n} + \ln \lambda \sum\limits_{i = 1}^{m} m_{i} + \ln \theta \left[ \sum\limits_{i = 1}^{m} \sum\limits_{j = 1}^{m_{i}} y_{ij} - \sum\limits_{i = 1}^{m} m_{i} \right] \\&&+ \ln (1-\theta) \left[ \sum\limits_{i = 1}^{m} \sum\limits_{j = 1}^{m_{i}} z_{ij} \right], \end{array} $$

where $\displaystyle \widetilde {n} = \sum \limits _{i = 1}^{m} n_{i}$. Hence, the MLEs of $\lambda$ and $\theta$ based on the complete observations can be easily obtained as

$$ \widehat{\lambda} = \frac{{\sum}_{i = 1}^{m} m_{i}}{\widetilde{n}} \ \ \ \text{and} \ \ \ \widehat{\theta} = \frac{{\sum}_{i = 1}^{m} {\sum}_{j = 1}^{m_{i}} y_{ij} - {\sum}_{i = 1}^{m} m_{i}}{{\sum}_{i = 1}^{m} {\sum}_{j = 1}^{m_{i}} y_{ij} + {\sum}_{i = 1}^{m} {\sum}_{j = 1}^{m_{i}} z_{ij} - {\sum}_{i = 1}^{m} m_{i}}. $$

Hence, proceeding in the same way as in Adamidis (1999), it can be easily seen that if at the $k$-th stage the estimates of $\lambda$ and $\theta$ are $\lambda^{(k)}$ and $\theta^{(k)}$, respectively, with $\alpha^{(k)} = -(\ln(1-\theta^{(k)}))^{-1}$, and if we denote

$$ a_{i}(\alpha, \lambda) = n_{i} \alpha \lambda \sum\limits_{l = 1}^{x_{i}} (\alpha n_{i} \lambda + l-1)^{-1}, $$

then

$$ \begin{array}{@{}rcl@{}} \lambda^{(k + 1)} &=& \frac{{\sum}_{i = 1}^{m} a_{i}(\alpha^{(k)}, \lambda^{(k)})}{\widetilde{n}} \ \ \ \text{and} \\ \theta^{(k + 1)} &=& \frac{\widetilde{x} - {\sum}_{i = 1}^{m}a_{i}(\alpha^{(k)}, \lambda^{(k)})}{\widetilde{x} + {\sum}_{i = 1}^{m} a_{i}(\alpha^{(k)}, \lambda^{(k)}) \left( \alpha^{(k)} - \frac{1-\theta^{(k)}}{\theta^{(k)}} -1 \right)}. \end{array} $$

Here $\displaystyle \widetilde {x} = \sum \limits _{i = 1}^{m} x_{i}$, and $\alpha^{(k)} - (1-\theta^{(k)})/\theta^{(k)}$ is the mean of $Z_{ij}$ under the density $f(z; \theta)$ given above, evaluated at $\theta = \theta^{(k)}$.
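The iteration can be sketched as follows. This is a minimal illustration, not the author's implementation. It assumes the NB$(s, \theta)$ probability mass function $\Gamma(x+s)\theta^{x}(1-\theta)^{s}/(x!\,\Gamma(s))$, takes $a_{i}$ as the conditional expectation of $M_{i}$ given $X_{i} = x_{i}$, and computes the mean of $Z_{ij}$ directly from the density $f(z;\theta)$ above, which gives $\alpha - (1-\theta)/\theta$. The names `nb_loglik`, `em_step`, `fit_nb_em`, the starting values and the stopping rule are all hypothetical choices.

```python
import math

def nb_loglik(x, n, r, theta):
    """Observed-data log-likelihood for independent X_i ~ NB(n_i r, theta)."""
    total = 0.0
    for xi, ni in zip(x, n):
        s = ni * r
        total += (math.lgamma(xi + s) - math.lgamma(s) - math.lgamma(xi + 1)
                  + xi * math.log(theta) + s * math.log1p(-theta))
    return total

def em_step(x, n, lam, theta):
    """One EM update (lam, theta) -> (lam_new, theta_new)."""
    alpha = -1.0 / math.log1p(-theta)
    # E-step: a_i = n_i alpha lam * sum_{l=1}^{x_i} 1 / (n_i alpha lam + l - 1)
    sum_a = 0.0
    for xi, ni in zip(x, n):
        s = ni * alpha * lam
        sum_a += s * sum(1.0 / (s + l - 1) for l in range(1, xi + 1))
    # Mean of Z under f(z; theta) = (1 - theta)^z / (alpha theta) on (0, 1).
    mean_z = alpha - (1.0 - theta) / theta
    x_tot = sum(x)
    # M-step: explicit maximizers of the expected complete log-likelihood.
    lam_new = sum_a / sum(n)
    theta_new = (x_tot - sum_a) / (x_tot + sum_a * (mean_z - 1.0))
    return lam_new, theta_new

def fit_nb_em(x, n, lam=1.0, theta=0.5, max_iter=5000, tol=1e-10):
    """Iterate em_step until the parameters stabilise; returns (r, theta)."""
    for _ in range(max_iter):
        lam_new, theta_new = em_step(x, n, lam, theta)
        done = abs(lam_new - lam) + abs(theta_new - theta) < tol
        lam, theta = lam_new, theta_new
        if done:
            break
    r = -lam / math.log1p(-theta)   # r = alpha * lam
    return r, theta
```

The update for $\theta$ needs at least one $x_{i} \ge 2$, since otherwise $\widetilde{x}$ equals the sum of the $a_{i}$'s and the numerator is zero. As with any EM scheme, monitoring `nb_loglik` across iterations is a simple check that the observed-data log-likelihood does not decrease.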


About this article

Cite this article

Kundu, D. On a General Class of Discrete Bivariate Distributions. Sankhya B 82, 270–304 (2020). https://doi.org/10.1007/s13571-019-00194-x


Received: 22 November 2018
Published: 09 March 2019
Issue Date: November 2020
DOI: https://doi.org/10.1007/s13571-019-00194-x

Keywords and phrases.

AMS (2000) subject classification.
