Discrete dispersion models and their Tweedie asymptotics

Abstract

We introduce a class of two-parameter discrete dispersion models, obtained by combining convolution with a factorial tilting operation, similar to exponential dispersion models, which combine convolution and exponential tilting. The equidispersed Poisson model has a special place in this approach, whereas several overdispersed discrete distributions, such as the Neyman Type A, Pólya–Aeppli, negative binomial and Poisson-inverse Gaussian, turn out to be Poisson–Tweedie factorial dispersion models with power dispersion functions, analogous to ordinary Tweedie exponential dispersion models with power variance functions. Using the factorial cumulant generating function as a tool, we introduce a dilation operation as a discrete analogue of scaling, generalizing binomial thinning. The Poisson–Tweedie factorial dispersion models are closed under dilation, which in turn leads to a Poisson–Tweedie asymptotic framework where Poisson–Tweedie models appear as dilation limits. This unifies many discrete convergence results and leads to Poisson and Hermite convergence results, similar to the law of large numbers and the central limit theorem, respectively. The dilation operator also leads to a duality transformation which in some cases transforms overdispersion into underdispersion and vice versa. Finally, we consider the multivariate factorial cumulant generating function and introduce a multivariate notion of over- and underdispersion, as well as a multivariate zero inflation index.

References

  • Barreto-Souza, W., Bourguignon, M.: A skew INAR(1) process on \(\mathbb{Z}\). AStA Adv. Stat. Anal. 99, 189–208 (2015)

  • Bryc, W.: Free exponential families as kernel families. Demonstr. Math. XLII, 657–672 (2009)

  • Dobbie, M.J., Welsh, A.H.: Models for zero-inflated count data using the Neyman type A distribution. Stat. Model. 1, 65–80 (2001)

  • El-Shaarawi, A.H., Zhu, R., Joe, H.: Modelling species abundance using the Poisson-Tweedie family. Environmetrics 22, 152–164 (2011)

  • Giles, D.E.: Hermite regression analysis of multi-modal count data. Econ. Bull. 30, 2936–2945 (2010)

  • Grandell, J.: Mixed Poisson Processes. Chapman & Hall, London (1997)

  • Harremoës, P., Johnson, O., Kontoyiannis, I.: Thinning, entropy, and the law of thin numbers. IEEE Trans. Inf. Theory 56, 4228–4244 (2010)

  • Hougaard, P., Lee, M.-L.T., Whitmore, G.A.: Analysis of overdispersed count data by mixtures of Poisson variables and Poisson processes. Biometrics 53, 1225–1238 (1997)

  • Jensen, S.T., Nielsen, B.: On convergence of multivariate Laplace transforms. Stat. Probab. Lett. 33, 125–128 (1997)

  • Johnson, N.L., Kotz, S., Balakrishnan, N.: Discrete Multivariate Distributions. Wiley, New York (1997)

  • Johnson, N.L., Kemp, A.W., Kotz, S.: Univariate Discrete Distributions, 3rd edn. Wiley, Hoboken (2005)

  • Jørgensen, B.: The Theory of Dispersion Models. Chapman & Hall, London (1997)

  • Jørgensen, B., Kokonendji, C.C.: Dispersion models for geometric sums. Braz. J. Probab. Stat. 25, 263–293 (2011)

  • Jørgensen, B., Martínez, J.R.: Multivariate exponential dispersion models. In: Kollo, T. (ed.) Multivariate Statistics: Theory and Applications. Proceedings of the IX Tartu Conference on Multivariate Statistics & XX International Workshop on Matrices and Statistics, pp. 73–98. World Scientific, Singapore (2013)

  • Jørgensen, B., Martínez, J.R., Tsao, M.: Asymptotic behaviour of the variance function. Scand. J. Stat. 21, 223–243 (1994)

  • Jørgensen, B., Goegebeur, Y., Martínez, J.R.: Dispersion models for extremes. Extremes 13, 399–437 (2010)

  • Jørgensen, B., Demétrio, C.G.B., Kristensen, E., Banta, G.T., Petersen, H.C., Delefosse, M.: Bias-corrected Pearson estimating functions for Taylor’s power law applied to benthic macrofauna data. Stat. Probab. Lett. 81, 749–758 (2011)

  • Kalashnikov, V.: Geometric Sums: Bounds for Rare Events with Applications. Kluwer Academic Publishers, Dordrecht (1997)

  • Karlis, D., Xekalaki, E.: Mixed Poisson distributions. Int. Stat. Rev. 73, 35–58 (2005)

  • Kemp, A.W.: Characterizations of a discrete normal distribution. J. Stat. Plan. Inference 63, 223–229 (1997)

  • Kemp, C.D., Kemp, A.W.: Some properties of the ‘Hermite’ distribution. Biometrika 52, 381–394 (1965)

  • Khatri, C.G.: On certain properties of power-series distributions. Biometrika 46, 486–490 (1959)

  • Kokonendji, C.C., Pérez-Casany, M.: A note on weighted count distributions. J. Stat. Theory Appl. 11, 337–352 (2012)

  • Kokonendji, C.C., Dossou-Gbété, S., Demétrio, C.G.B.: Some discrete exponential dispersion models: Poisson-Tweedie and Hinde-Demétrio classes. Stat. Oper. Res. Trans. (SORT) 28, 201–214 (2004)

  • Kokonendji, C.C., Mizère, D., Balakrishnan, N.: Connections of the Poisson weight function to overdispersion and underdispersion. J. Stat. Plan. Inference 138, 1287–1296 (2008)

  • Massé, J.-C., Theodorescu, R.: Neyman type A distribution revisited. Stat. Neerl. 59, 206–213 (2005)

  • McKenzie, E.: Some simple models for discrete variate time series. Water Resour. Bull. 21, 645–650 (1985)

  • Mora, M.: La convergence des fonctions variance des familles exponentielles naturelles. Ann. Fac. Sci. Toulouse 11(5), 105–120 (1990)

  • Pistone, G., Wynn, H.P.: Finitely generated cumulants. Stat. Sinica 9, 1029–1052 (1999)

  • Puig, P.: Characterizing additively closed discrete models by a property of their maximum likelihood estimators, with an application to generalized Hermite distributions. J. Am. Stat. Assoc. 98, 687–692 (2003)

  • Puig, P., Barquinero, F.: An application of compound Poisson modelling to biological dosimetry. Proc. R. Soc. A 467, 897–910 (2011)

  • Puig, P., Valero, J.: Count data distributions: some characterizations with applications. J. Am. Stat. Assoc. 101, 332–340 (2006)

  • Puig, P., Valero, J.: Characterization of count data distributions involving additivity and binomial subsampling. Bernoulli 13, 544–555 (2007)

  • Ristić, M.M., Bakouch, H.S., Nastić, A.S.: A new geometric first-order integer-valued autoregressive (NGINAR(1)) process. J. Stat. Plan. Inference 139, 2218–2226 (2009)

  • Roy, D.: The discrete normal distribution. Commun. Stat. Theory Methods 32, 1871–1883 (2003)

  • Rudin, W.: Principles of Mathematical Analysis, 3rd edn. McGraw-Hill, New York (1976)

  • Sellers, K.F., Borle, S., Shmueli, G.: The COM-Poisson model for count data: a survey of methods and applications. Appl. Stoch. Models Bus. Ind. 28, 104–116 (2012)

  • Shmueli, G., Minka, T.P., Kadane, J.B., Borle, S., Boatwright, P.: A useful distribution for fitting discrete data: revival of the Conway-Maxwell-Poisson distribution. Appl. Stat. 54, 127–142 (2005)

  • Steutel, F.W., van Harn, K.: Discrete analogues of self-decomposability and stability. Ann. Probab. 7, 893–899 (1979)

  • Taylor, L.R., Taylor, R.A.J.: Aggregation, migration and population mechanics. Nature 265, 415–421 (1977)

  • Thedéen, T.: The inverses of thinned point processes. Research report 1986:1. Department of Statistics, University of Stockholm (1986)

  • Tweedie, M.C.K.: An index which distinguishes between some important exponential families. In: Ghosh, J.K., Roy, J. (eds.) Statistics: Applications and New Directions. Proceedings of the Indian Statistical Institute Golden Jubilee International Conference, pp. 579–604. Indian Statistical Institute, Calcutta (1984)

  • Weiß, C.H.: Thinning operations for modeling time series of counts—a survey. AStA Adv. Stat. Anal. 92, 319–341 (2008)

  • Willmot, G.E.: The Poisson-inverse Gaussian distribution as an alternative to the negative binomial. Scand. Actuar. J. 1987, 113–127 (1987)

  • Wimmer, G., Altmann, G.: Thesaurus of Univariate Discrete Probability Distributions. STAMM Verlag, Essen (1999)

  • Wiuf, C., Stumpf, M.P.H.: Binomial subsampling. Proc. R. Soc. A 462, 1181–1195 (2006)

Acknowledgments

We are grateful to Christian Weiß and two anonymous referees for useful comments on a previous version of the paper.

Author information

Corresponding author

Correspondence to Bent Jørgensen.

Appendices

Appendix A: Exponential dispersion models

In this appendix, we summarize some relevant facts about exponential dispersion models and Tweedie models. An exponential dispersion model \(\mathrm {ED}(\mu ,\gamma )\) with mean \(\mu \in \Omega \), dispersion parameter \(\gamma >0\) and unit variance function \(V(\mu )\) has a PDF of the form

$$\begin{aligned} f(y;\mu ,\gamma )=a(y;\gamma )\exp \left[ -\frac{1}{2\gamma }d(y;\mu )\right] \quad \text { for }\,y,\mu \in \Omega , \end{aligned}$$
(6.1)

where the unit deviance function \(d(y;\mu )\) is defined by

$$\begin{aligned} d(y;\mu )=2\int _{\mu }^{y}\frac{y-z}{V(z)}\,\mathrm{d}z\quad \text { for }\,y,\mu \in \Omega . \end{aligned}$$
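
As a quick numerical check of this definition, the following sketch (Python with NumPy and SciPy assumed available; the helper name unit_deviance and the test values are illustrative only) evaluates the integral by quadrature for the Poisson unit variance function \(V(\mu )=\mu \) and recovers the closed form \(d(y;\mu )=2\left[ y\log (y/\mu )-(y-\mu )\right] \).

    # Quadrature check of d(y; mu) = 2 * int_mu^y (y - z)/V(z) dz for V(mu) = mu,
    # whose closed form is the Poisson unit deviance 2*[y*log(y/mu) - (y - mu)].
    import numpy as np
    from scipy.integrate import quad

    def unit_deviance(y, mu, V):
        """Unit deviance computed by numerical quadrature."""
        val, _ = quad(lambda z: (y - z) / V(z), mu, y)
        return 2.0 * val

    V_poisson = lambda mu: mu                        # Poisson unit variance function
    y, mu = 3.0, 1.5
    print(unit_deviance(y, mu, V_poisson))           # quadrature value
    print(2.0 * (y * np.log(y / mu) - (y - mu)))     # closed form; the two agree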

The model (6.1) is, for each known value of \(\gamma \), a natural exponential family with variance function \(\gamma V(\mu )\). Hence, the function \(a(y;\gamma )\) may be determined by Fourier inversion from the CGF, which may in turn be obtained from \(V\). The model \(\mathrm {ED}(\mu ,\gamma )\) satisfies the following reproductive property:

$$\begin{aligned} \overline{Y}_{n}\sim \mathrm {ED}(\mu ,\gamma /n), \end{aligned}$$
(6.2)

where \(\overline{Y}_{n}\) is the average of \(Y_{1},\ldots ,Y_{n}\), which are i.i.d. from \(\mathrm {ED}(\mu ,\gamma )\).
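
To illustrate (6.2) by simulation, the sketch below (Python with NumPy assumed; the gamma case is chosen purely as an example) uses the Tweedie model with \(p=2\), where \(\mathrm {ED}(\mu ,\gamma )\) is a gamma distribution with shape \(1/\gamma \) and scale \(\gamma \mu \); the averages \(\overline{Y}_{n}\) should then have mean \(\mu \) and variance \(\gamma \mu ^{2}/n\).

    # Simulation sketch of the reproductive property (6.2) for the gamma
    # exponential dispersion model V(mu) = mu^2 (Tweedie p = 2):
    # the average of n i.i.d. ED(mu, gamma) draws behaves as ED(mu, gamma/n).
    import numpy as np

    rng = np.random.default_rng(0)
    mu, gamma, n, n_rep = 2.0, 0.5, 10, 200_000

    samples = rng.gamma(shape=1.0 / gamma, scale=gamma * mu, size=(n_rep, n))
    averages = samples.mean(axis=1)                  # one Y-bar_n per replicate

    print(averages.mean(), mu)                       # sample mean close to mu
    print(averages.var(), gamma / n * mu**2)         # sample variance close to (gamma/n)*mu^2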

The Tweedie exponential dispersion model \(\mathrm {Tw}_{p}(\mu ,\gamma )\) has mean \(\mu \) and unit variance function

$$\begin{aligned} V(\mu )=\mu ^{p}\quad \text {for }\, \mu \in \Omega _{p},\, \text {where }\, p\notin \left( 0,1\right) . \end{aligned}$$

The domain for \(\mu \) is either \(\Omega _{0}=\mathbb {R}\) or \(\Omega _{p}=\mathbb {R}_{+}\) for \( p\ne 0\). Tweedie models satisfy the scaling property

$$\begin{aligned} c\mathrm {Tw}_{p}(\mu ,\gamma )=\mathrm {Tw}_{p}(c\mu ,c^{2-p}\gamma )\quad \text {for }\,c>0. \end{aligned}$$
(6.3)
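
As a sanity check of (6.3), the following sketch (Python with NumPy assumed; the Poisson case is an illustrative choice) uses \(p=1\), where \(\mathrm {Tw}_{1}(\mu ,\gamma )\) can be realised as \(\gamma \) times a Poisson variable with mean \(\mu /\gamma \); multiplying by \(c\) should then reproduce the mean and variance of \(\mathrm {Tw}_{1}(c\mu ,c\gamma )\), since \(c^{2-p}=c\).

    # Moment check of the scaling property (6.3) in the Poisson case p = 1,
    # realising Tw_1(mu, gamma) as gamma * Poisson(mu/gamma),
    # with mean mu and variance gamma * mu.
    import numpy as np

    rng = np.random.default_rng(1)
    mu, gamma, c, n_rep = 3.0, 0.4, 2.5, 500_000

    y = gamma * rng.poisson(mu / gamma, size=n_rep)  # draws from Tw_1(mu, gamma)
    cy = c * y                                       # left-hand side of (6.3)

    print(cy.mean(), c * mu)                         # mean of Tw_1(c*mu, c*gamma)
    print(cy.var(), (c * gamma) * (c * mu))          # variance (c*gamma) * (c*mu)^1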

Conventional Tweedie asymptotics (Jørgensen et al. 1994) have the following form. If \(\mathrm {ED}(\mu ,\gamma )\) with unit variance function \(V(\mu )\) satisfies

$$\begin{aligned} V(\mu )\sim \mu ^{p} \quad \text { as }\,\mu \downarrow 0\text { or }\mu \rightarrow \infty \end{aligned}$$

then

$$\begin{aligned} c^{-1}\mathrm {ED}(c\mu ,c^{2-p}\gamma )\overset{D}{\rightarrow } \mathrm {Tw}_{p}(\mu ,\gamma ) \quad \text {as }\,c\downarrow 0 \text { or }c\rightarrow \infty , \end{aligned}$$
(6.4)

respectively. The proof is based on convergence of the variance function on the left-hand side of (6.4),

$$\begin{aligned} c^{-2}c^{2-p}\gamma V(c\mu )\rightarrow \gamma \mu ^{p}, \end{aligned}$$

applying Mora’s (1990) convergence theorem. The case \( c^{2-p}\rightarrow \infty \) requires the model \(\mathrm {ED}(\mu ,\gamma )\) to be infinitely divisible. This result implies a Tweedie approximation, by means of (6.3):

$$\begin{aligned} \mathrm {ED}(c\mu ,c^{2-p}\gamma )\overset{\cdot }{\sim }\mathrm {Tw}_{p}(c\mu ,c^{2-p}\gamma )\quad \text {as }\,c\downarrow 0 \text { or }c\rightarrow \infty . \end{aligned}$$
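
To see the variance-function limit behind (6.4) in a concrete case, the sketch below (plain Python; the negative binomial type unit variance function \(V(\mu )=\mu +\mu ^{2}\) is chosen purely as an example) verifies numerically that \(c^{-2}c^{2-p}\gamma V(c\mu )\rightarrow \gamma \mu ^{p}\), with \(p=1\) as \(c\downarrow 0\) and \(p=2\) as \(c\rightarrow \infty \), corresponding to Poisson and gamma limits.

    # Numerical check of the rescaled variance function c^{-2} c^{2-p} gamma V(c*mu)
    # for V(mu) = mu + mu^2, which behaves like mu near zero and like mu^2 at infinity.
    V = lambda mu: mu + mu**2
    mu, gamma = 1.5, 0.7

    for p, c_values in [(1, [1e-1, 1e-3, 1e-6]), (2, [1e1, 1e3, 1e6])]:
        for c in c_values:
            lhs = c**(-2) * c**(2 - p) * gamma * V(c * mu)
            print(p, c, lhs, gamma * mu**p)          # lhs approaches gamma * mu^p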

In some cases, we have a large-sample interpretation of Tweedie convergence. Let us consider the average \(\bar{Y}_{n}\) with distribution (6.2). Then for \(p\ne 2\) we obtain

$$\begin{aligned} n^{-1/(p-2)}\mathrm {ED}(n^{1/(p-2)}\mu ,\gamma /n)\overset{D}{\rightarrow } \mathrm {Tw}_{p}(\mu ,\gamma )\quad \text {as }\, n\rightarrow \infty . \end{aligned}$$

We interpret this result as saying that the scaled and exponentially tilted average \(\bar{Y}_{n}\) converges to a Tweedie distribution as \(n\rightarrow \infty \).

Appendix B: Proof of Theorem 4.1

Consider a sequence of factorial tilting families \(\mathrm {FT}_{n}(\mu )\) with local dispersion functions \(v_{n}\) having domain \(\Psi _{n}\) and FCGF \( C_{n}\) satisfying the conditions of Theorem 4.1. The idea of the proof is to obtain the FCGF derivative \(\dot{C}\) from the limiting dispersion function \(v\), and in turn use the uniform convergence to show convergence of the sequence \(C_{n}\).

We begin by considering the nonzero case, where \(v(\mu )\ne 0\) for \(\mu \in \Psi _{0}\). Let \(K\) be a given compact subinterval of \(\Psi _{0}\). By assumption \(\Psi _{0}\subseteq \mathrm {int}\left( \lim \Psi _{n}\right) \), so we may assume that \(K\subseteq \Psi _{n}\) from some \(n_{0}\) on. We only need to consider \(n>n_{0}\) from now on. Fix a \(\mu _{0}\in \mathrm {int}\,K\). Let \(\psi _{n}=\dot{C}_{n}^{-1}\) denote the inverse FCGF derivative defined by \(\dot{\psi }_{n}\left( \mu \right) =1/v_{n}(\mu )\) on \(\Psi _{n}\) and \( \psi _{n}\left( \mu _{0}\right) =0\). Let \(\dot{C}_{n},\, C_{n}\), etc., denote the quantities associated with this parametrization. Similarly, define \(\psi :\Psi _{0}\rightarrow \mathbb {R}\) by \(\dot{\psi }\left( \mu \right) =1/v(\mu ) \) on \(\Psi _{0}\) and \(\psi (\mu _{0})=0\). Then for \(\mu \in K\)

$$\begin{aligned} \left| \dot{\psi }_{n}\left( \mu \right) -\dot{\psi }\left( \mu \right) \right| =\frac{\left| v_{n}(\mu )-v(\mu )\right| }{v_{n}(\mu )v(\mu )}. \end{aligned}$$
(6.5)

By the uniform convergence of \(v_{n}(\mu )\) to \(v(\mu )\) on \(K\), it follows that \(\left\{ v_{n}(\mu )\right\} \) is uniformly bounded on \(K\). Since \(v(\mu )\) is bounded and bounded away from zero on the compact set \(K\), the denominator in (6.5) is bounded away from zero for \(n\) large enough, and it follows from (6.5) and from the uniform convergence of \(v_{n}\) that \(\dot{\psi }_{n}\left( \mu \right) \rightarrow \dot{\psi }\left( \mu \right) \) uniformly on \(K\). This and the fact that \(\psi _{n}\left( \mu _{0}\right) =\psi (\mu _{0})\) for all \(n\) imply, by a result from Rudin (1976, Theorem 7.17), that \(\psi _{n}\left( \mu \right) \rightarrow \psi \left( \mu \right) \) uniformly on \(K\). Since \(K\) was arbitrary, we have \(\psi _{n}\left( \mu \right) \rightarrow \psi \left( \mu \right) \) for all \(\mu \in \Psi _{0}\).

Let \(I_{n}=\psi _{n}\left( \Psi _{n}\right) \) and \(I_{0}=\psi (\Psi _{0})\subseteq \mathrm {int}\left( \lim I_{n}\right) \). Let \(J=\psi (K)\subseteq I_{0}\) and \(J_{n}=\psi _{n}(K)\subseteq I_{n}\). Define \(\dot{C} :I_{0}\rightarrow \Psi _{0}\) by \(\dot{C}(y)=\psi ^{-1}(y)\). Since \(\psi \) is strictly monotone and differentiable, the same is the case for \(\dot{C}\). Let \(\mu \in K\) be given and let \(y=\psi (\mu )\in J\) and \(y_{n}=\psi _{n}(\mu )\in J_{n}\). Since \(v_{n}(\mu )\) is uniformly bounded on \(K\), there exists an \(M>0\) such that \(\left| v_{n}(\mu )\right| \le M\) for all \(n\) and \(\mu \in K\). It follows that \(\left| \ddot{C}_{n}(y)\right| =\left| v_{n}\left( \dot{C}_{n}(y)\right) \right| \le M\) for all \( y\in J\) due to the fact that \(J\subseteq J_{n}\) for \(n\) large enough. Since \(\mu =\dot{C}(y)=\dot{C}_{n}(y_{n})\) we find, using the mean value theorem, that

$$\begin{aligned} \left| \dot{C}_{n}(y)-\dot{C}(y)\right| &= \left| \dot{C}_{n}(y)-\dot{C}_{n}(y_{n})\right| \\ &\le M\left| y-y_{n}\right| \\ &= M\left| \psi (\mu )-\psi _{n}(\mu )\right| . \end{aligned}$$

This implies that \(\dot{C}_{n}(y)\rightarrow \dot{C}(y)\) uniformly in \(y\in J\). Since \(C(0)=C_{n}(0)\) for all \(n\), it follows by arguments similar to those above that \(C_{n}(y)\rightarrow C(y)\) uniformly on \(J\). We conclude from the convergence of the sequence of MGFs \(\exp \left[ C_{n}\left( e^{s}-1\right) \right] \rightarrow \exp \left[ C\left( e^{s}-1\right) \right] \) for \(s\in \log \left( J+1\right) \) that the sequence of distributions \(\mathrm {FT}_{n}(\mu _{0})\) converges weakly to a probability measure \(P\) with FCGF \(C\). We let \(\mathrm {FT}(\mu )\) denote the factorial tilting family generated by \(P\) with local dispersion function \(v\) on \(\Psi _{0}\). We may now complete the proof in the nonzero case by proceeding as in the proof of Proposition 2.1.
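
The mechanics of this argument can also be illustrated numerically. The sketch below (Python with NumPy and SciPy assumed; the sequence \(v_{n}(\mu )=\mu +\mu ^{2}/n\) converging uniformly to \(v(\mu )=\mu \) on \(K=[0.5,2]\) is an arbitrary illustrative choice, not one of the families considered in the paper) constructs \(\psi _{n}\) by integrating \(1/v_{n}\) from \(\mu _{0}\) and recovers \(\dot{C}_{n}\) as its inverse; both sup-differences over \(K\) shrink as \(n\) grows, mirroring the two uniform-convergence steps above.

    # Numerical illustration of the nonzero case: uniform convergence of the
    # local dispersion functions v_n transfers first to psi_n (integration of
    # 1/v_n) and then to C-dot_n (inversion of psi_n).
    import numpy as np
    from scipy.integrate import quad
    from scipy.optimize import brentq

    K = np.linspace(0.5, 2.0, 50)
    mu0 = 1.0

    def psi(mu, v):
        """psi(mu) = int_{mu0}^{mu} dt / v(t), normalised by psi(mu0) = 0."""
        val, _ = quad(lambda t: 1.0 / v(t), mu0, mu)
        return val

    def C_dot(y, v):
        """Inverse of psi: solve psi(mu) = y for mu, bracketed on [1e-3, 10]."""
        return brentq(lambda mu: psi(mu, v) - y, 1e-3, 10.0)

    v_lim = lambda mu: mu
    for n in [10, 100, 1000]:
        v_n = lambda mu, n=n: mu + mu**2 / n         # illustrative sequence v_n -> v
        gap_psi = max(abs(psi(m, v_n) - psi(m, v_lim)) for m in K)
        ys = [psi(m, v_lim) for m in K]
        gap_Cdot = max(abs(C_dot(y, v_n) - C_dot(y, v_lim)) for y in ys)
        print(n, gap_psi, gap_Cdot)                  # both gaps shrink as n grows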

In the case where \(v(\mu )=0\) (the zero case), we cannot define the function \(\psi \) as above. Instead we take \(C(t)=t\mu _{0},\) such that \(\dot{C} (t)=\mu _{0}\) and \(\ddot{C}(t)=0\) for \(t\in \mathbb {R}\). For any \(\epsilon >0\), we may choose an \(n_{0}\) such that \(\left| v_{n}(\mu )\right| \le \epsilon \) for any \(n\ge n_{0}\) and \(\mu \in K\). For such \(n\) and \(\mu \), we hence obtain

$$\begin{aligned} \left| \psi _{n}(\mu )\right| =\left| \int _{\mu _{0}}^{\mu }\frac{1}{v_{n}(t)}\,\mathrm{d}t\right| \ge \frac{\left| \mu -\mu _{0}\right| }{\epsilon }, \end{aligned}$$

which can be made arbitrarily large by choosing \(\epsilon \) small. We hence conclude that \(J_{n}=\psi _{n}(K)\rightarrow \mathbb {R}\) as \(n\rightarrow \infty \).

Now we let \(J\) be a compact interval such that \(0\in \mathrm {int}J\), implying that \(J\subseteq \,J_{n}\) for \(n\) large enough. For such \(n\) we hence obtain that \(\left| \ddot{C}_{n}(t)\right| =\left| v_{n}( \dot{C}_{n}(t))\right| \le \epsilon \) for all \(t\in J\), because then \(\dot{C}_{n}(t)\in K\). Since \(\mu _{0}=\dot{C}(t)=\dot{C}_{n}(0)\) we find, again by the mean value theorem, that for \(t\in J\),

$$\begin{aligned} \left| \dot{C}_{n}(t)-\dot{C}(t)\right| =\left| \dot{C}_{n}(t)- \dot{C}_{n}(0)\right| \le \epsilon \left| t\right| . \end{aligned}$$

This implies that \(\dot{C}_{n}(t)\rightarrow \dot{C}(t)\) uniformly in \(t\in J\). By arguments similar to those above, we conclude that \(\mathrm {FT}_{n}(\mu _{0})\) converges weakly to a probability measure \(P\) with FCGF \(C(t)=t\mu _{0}\), which implies the desired conclusion in the zero case, completing the proof.

Cite this article

Jørgensen, B., Kokonendji, C.C. Discrete dispersion models and their Tweedie asymptotics. AStA Adv Stat Anal 100, 43–78 (2016). https://doi.org/10.1007/s10182-015-0250-z
