On a stochastic order induced by an extension of Panjer’s family of discrete distributions

We factorize probability mass functions of discrete distributions belonging to Panjer’s family and to its certain extensions to define a stochastic order on the space of distributions supported on \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbb {N}}_0$$\end{document}N0. Main properties of this order are presented. Comparison of some well-known distributions with respect to this order allows to generate new families of distributions that satisfy various recurrrence relations. The recursion formula for the probabilities of corresponding compound distributions for one such family is derived. Applications to various domains of reliability theory are provided.


Introduction and motivation
considered the family of discrete random variables X satisfying the relation p(n) = an + b n p(n − 1), n ≥ 1, where p(n) = P(X = n), and derived a recursive formula for calculating compound distributions for the case when the corresponding claim number distribution belongs to that family. Being not only a quite important but also a relatively convenient class of distributions to deal with for solving various problems in actuarial science and, in particular, risk theory, Panjer's family and its certain extensions have subsequently attracted the attention of many researchers. In particular, Sundt and Jewell (1981) have shown that the above family consists of the collection of all (nondegenerate) Binomial, Poisson and Negative Binomial distributions, Willmot (1988) characterized the distributions with p(0) = 0 that satisfy relation (1) for n ≥ 2 and Hess et al. (2002) identified the distributions with p(n) = 0 for n ≤ k − 1 that satisfy relation (1) for n ≥ k + 1 (the so called Panjer's class of order k). Also Panjer and Willmot (1982) considered the class of distributions satisfying the recursion for some k ∈ N, and derived recursions for the correponding compound distributions when k = 1 and k = 2. A recursive algorithm for the compound distribution for arbitrary k had been then derived by Hesselager (1994). Extesions of the Panjer's family to distributions that satisfy recurrence relation of higher orders have also been studied in the literature. In particular, Kitano et al. (2005) considered the generalized Charlier distribution, which satisfies second order recurrence relation p(n) = a + b n p(n − 1) + c + d n + e n − 1 p(n − 2), n > 1.
A quite extensive collection of other generalizations of Panjer's family as well as their applications can be found in Sundt and Vernic (2009).
In this work we first consider the family of discrete distributions supported on N 0 = N∪{0} that satisfy (2). For the sake of simplicity, we assume that the denominator is a monomial n m and show in the following section that the probability mass function p(n) can be decomposed into a finite product where the constant c > 0 does not depend on n and each p i is a probability mass function of one of four different types of distributions, including the Poisson and the Negative Binomial distributions. Parameters of those distributions are determined by the coefficients that appear in the right-hand side of (2). As the values of probability mass functions are always in between 0 and 1, the above factorization implies that p(n) is asymptotically smaller than each of p i (n). This observation leads to the Definition 2.1 of a discrete stochastic order over the set of all discrete random variables supported on N 0 and our main goal is to study the properties of this order. Stochastic orderings have a modest, yet somewhat punctuated, pedigree in the corpus of statistical literature. A quite extensive collection of such orderings can be found in Shaked and Shanthikumar (2007). Also in Artzner et al. (1999), Artzner et al introduced their now famous four coherent risk measures. However, as stated in Giovagnoli and Wynn (2010), most of the existing literature focuses on real (continuous) univariate or multivariate random variables, and there is less work on stochastic orderings for discrete random variables. In terms of ordering functions in a discrete setting, Kitaev (2005) considers partially ordered generalized patterns, and for discrete random variables, in Giovagnoli and Wynn (2010) use Möbius inversions in the theory as well as the notion of dual cones.
The paper is structured as follows: in Sect. 2 we derive the factorization formula (3) and describe the distributions that appear on the right-hand side of that formula. Based on that decomposition we then define the order on the whole space of discrete distributions that are supported on N 0 . To make the definition of this order more intuitive, in Sect. 3 we give two more definitions of that are equivalent to Definition 2.1. In the following proofs we mainly use the condition given in Proposition 3.1 (i) (though from the statistical point of view the condition given in Proposition 3.1 (ii) may seem to be more interesting). We then show that { , } is a partially ordered set and that is uncorrelated with other well-known stochastic orders on . One of the main properties of ≺ that distinguishes it from other stochastic orders is its invariance under the permutations of N 0 (Proposition 3.6). In particular, this means that [Y ]. Instead, as ≺ is determined by the tail behaviour of the distributions, the relation X ≺ Y implies lim n→+∞ R X (n)/R Y (n) = 0, where R X (n) and R Y (n) are survival functions of X and Y (Theorem 3.1). However, the converse is not true, that is, the condition lim n→+∞ R X (n)/R Y (n) = 0 is in general much weaker than X ≺ Y (Examples 3.1 and 3.2). Other characteristics of the partially ordered set { , }, such as the existence and the structure of minimal, maximal and covering elements, are also given in Sect. 3. The relation with respect to ≺ of compound distributions for which the corresponding counting distributions are from is described in Sect. 4. In particular, the conditions are presented under which a compound distribution S is ≺ than a compound distribution S if and only if either the claim size distribution of S is ≺ than the claim size distribution of S (Proposition 4.1) or the counting distribution of S is ≺ than the counting distribution of S (Proposition 4.2). In Sect. 5 we consider some common distributions that can be recursively generated and discuss their relation with respect to ≺. This comparison allows to obtain new families of distributions that satisfy various recurrence relations. We then provide a recursive formula for calculating the probabilities of compound distributions whose counting distributions belong to one of those families. Finally, in the last section we present some applications of the obtained results to reliability theory and to Poisson shock models and show that the relation ≺ is also applicable while conducting sampling with visibility bias.

Factorization of probability mass functions
Consider a family of discrete random variables X satisfying the relation where p(n) = P(X = n) and P k (x) = k i=0 a i x i is a polynomial of degree k ≥ 0. Note that when m = 1 and k ≤ 1 then (4) corresponds to the Panjer's family of discrete distributions given by (1). We are mainly interested in distributions of the above family that have infinite support N 0 , which implies that k ≤ m and P k (n) > 0 for all n ∈ N.
Let us first consider the case m = k. In this case we have by the ratio test that the series ∞ n=0 p(n) converges if |a k | < 1 and it diverges if |a k | > 1. Let us therefore assume that |a k | < 1. This also means that for each M ∈ N, the number of zeros of the polynomial P k (x) in the interval (M, M + 1) is even (possibly 0). Hence, we can rewrite the polynomial P k (x) in the form where a = a 1/k k , k 1 + 2k 2 + 2k 3 = k, the root of each linear function ax + b i 1 , i 1 = 1, .., k 1 , is less than 1, the (possibly equal) roots of each quadratic function a 2 x 2 + c i 2 x + d i 2 lie in the interval (M i 2 , M i 2 + 1) for some M i 2 ∈ N, i 2 = 1, .., k 2 , and the functions a 2 x 2 + e i 3 x + f i 3 , i 3 = 1, .., k 3 , have no real roots. We can then rewrite (4) in the form Let us now describe the distributions which are recursively generated by each of those three types of functions. It has been shown in Sundt and Jewell (1981) that the only distribution satisfying with 0 < a < 1 is the Negative Binomial distribution with parameters (a + b i 1 )/a and a. More precisely, in this case we have that where the last equality above follows from the definition of binomial coefficients with real arguments. Note that the condition that the root of ax + b i 1 is less than 1 together with a > 0 implies that a + b i 1 > 0.

Now consider the distribution satisfying
where the function a 2 x 2 + c i 2 x + d i 2 has two real roots, say x i 2 and y i 2 , both of which belong to the interval (M i 2 , M i 2 + 1) for some M i 2 ∈ N. We then have that where the last equality follows from ( and from the definition of binomial coefficients with real arguments.
Finally, let us suppose that where a 2 x 2 + e i 3 x + f i 3 has no real roots. Then a 2 x 2 + e i 3 x + f i 3 has 2 complex roots z i 3 and z i 3 for some z i 3 ∈ C\R. Using the properties z (z) = (z + 1) and (z) = (z) of the Gamma function (z) we get that Combining (5)-(12) for a given random variable X satisfying (4) with m = k we get where c is some positive constant that does not depend on n, p i 1 , p i 2 and p i 3 are probability mass functions of distributions given, respectively, by (8), (10) and (12).
In case m > k we can rewrite (4) as Noting that if p 0 (n) denotes the probability mass function of Poisson distribution with parameter 1 then we get that in this case there is a constant number c > 0 such that Let us now consider an example that illustrates the above factorization. Suppose X follows Poisson distribution with parameter 1/2. Then (1) and Z ∼ Negative Binomial(1, 1/2). Note that the probability mass functions of those variables satisfy recurrence relations p X , and the factor 1 2n that appears in the first recursion is a product of the corresponding factors appearing in the two other recursions. Clearly, the values of p X (n) are then (asymptotically) smaller than the values of p Y (n) and p Z (n). In general, as the values of probability mass functions are always in between 0 and 1, then given the decomposition (14) we have that p(n) is asymptotically smaller than each of the probability mass functions that appear on the right side of (14). Thus, the above decomposition leads to the following definition of a binary relation over the set of all discrete random variables supported on N 0 .
and we say that Note that given (15), X can also be considered as a weighted distribution (see, e.g., Patil and Rao (1978) for the definition and properties of weighted distributions). In the following section we show that { , } is a partially ordered set and derive its main properties.
Proposition 3.1 Given X , Y ∈ then X ≺ Y if and only if one of the following holds: which means that X and Y |(Y = Z ) are identically distributed.

Proposition 3.2 { , } is a partially ordered set.
Proof Obviously X X for any X ∈ . Now suppose that X Y and Y X , p Z (n) < ∞ which means that X ≺ Z and the transitivity is shown.
Let us now check that ≺ does not coincide with the restriction to of any well known stochastic order presented in Shaked and Shanthikumar (2007). First, as it is shown in Shaked and Shanthikumar (2007) (formulas 1.A.7, 5.A.5, theorems 1.B.1, 1.B.42, 1.C.1, 3.B.13, 5.C.1 and pp. 71, 255), for the usual stochastic order, hazard rate order, reversed hazard order, likelihood ratio order, convolution order, dispersive order, Laplace transform order, factorial moments order and the moments order, the fact that X is smaller than Y with respect to any of these orders implies that , then is not correlated with any of the above orders. Also, as it is shown in Shaked and Shanthikumar (2007) (formulas 2.B.7, 3.A.2, 3.A.4, 3.C.8 and p. 166) for the mean residual life order, harmonic mean residual life order, convex order and excess wealth order, the fact that X is smaller than Y with respect to any of these orders together with the condition [Y ]. As for the random variables X , Y ∈ with probablity mass functions defined as , then is not correlated with any of those orders as well. Finally, as for X ∼ Poisson(1) and Y ∼ N egativeBinomial(1, 1/3) we have that X ≺ Y but E[e s X ] > E[e sY ] at s = ln 2, then is also uncorrelated with the so called moment generating function order (recall that given two nonnegative random variables X and Y with E[e s 0 Y ] < ∞ for some s 0 > 0, we say that X is smaller than Y in the moment generating function order if E[e s X ] ≤ E[e sY ] for all s > 0). Thus, unlike the discrete stochastic orders presented above, does not guarantee any order-preserving property for the operations of taking expected values, variances or moment generating functions. The reason for this is that ≺ compares whether the tails of the random variables are sufficiently far from each other and is therefore not sensitive to the probabilities of smaller values. We, therefore, offer the following proposition: Let us recall some characteristics of partially ordered sets. Given a partially ordered set {P, ≤} an element x ∈ P is called minimal if there is no element y ∈ P such that y ≤ x, y = x. Similarly, an element x ∈ P is called maximal if there is no element y ∈ P such that x ≤ y, x = y. An element x ∈ P is said to be covered by an element y ∈ P if x ≤ y, x = y, and for any z ∈ P with x ≤ z ≤ y we have that either z = x or z = y. The next proposition deals with the above concepts for the set .

Proposition 3.4 (i) There is no minimal element in ; (ii) an element X ∈ is maximal if and only if
The following proposition presents the closure property of ≺ under the operations of summation and multiplication of random variables from .
Proposition 3.5 If X 1 , . . . , X k and Y 1 , . . . , Y k are two sets of independent random variables from with X i ≺ Y i , i = 1, . . . , k, then where the first inequality above holds as n j=0 p Y 1 ( j) p Y 2 (n− j) ≥ p Y 1 (i) p Y 2 (n− i) for each i = 0, 1, . . . n, and the last equality holds because for any i, j ∈ N 0 , the summand j) appears exactly once on each side of that equality. Thus, Note that if X ∈ then p X 2 (2) = 0 and if a = 1 is some real number, then the support of a X is the set aN 0 = N 0 . Hence, as the support of all elements of is N 0 , we get that neither X 2 nor a X belong to . However, ≺ is closed under the permutations of the support set of elements of . More precisely, let σ be any permutation of N 0 , that is, a one-to-one mapping from N 0 to N 0 . Then for X ∈ , let X σ be an element of with probability mass function p X σ (n) = p X (σ (n)), n ∈ N 0 . Proposition 3.6 Let σ be any permutation of N 0 . Then X ≺ Y if and only if X σ ≺ Y σ .
Proof As convergent series with positive terms converge unconditionally, then The survival function of a random variable X ∈ is given by As X ≺ Y indicates that the tail of Y is heavier than the tail of X , and the survival function gives the probability of the right tail (n, ∞), then X ≺ Y should imply that R X (n) is smaller than R Y (n). In fact we have the following: Proof As X , Y ∈ then p X (k) > 0 and p Y (k) > 0 for all k ∈ N 0 . This means that {R X (n)} n∈N and {R Y (n)} n∈N are both strictly decreasing sequences converging to 0. As X ≺ Y then ∞ n=0 p X (n)/ p Y (n) < ∞, and, therefore, Hence, by the Stolz-Cesàro theorem (see, e.g., Choudary and Niculescu (2014), Theorem 2.7.1) we get (16).
Note, however, that the converse is not necessarily true, that is, (16) does not necessarily imply X ≺ Y . In fact, as the following example shows, even the condition ∞ n=0 R X (n) R Y (n) < ∞, which is stronger than (16), does not necessarily imply that p X (n) is bounded from above (which is much weaker than X ≺ Y ).

Example 3.1 Consider
Then, as Note also that in general (16) is the best that we can achieve in Theorem 3.1. More precisely, as the following example shows, the condition X ≺ Y alone does not necessarily imply .
However, (17) holds under the additional assumption of the existence of a positive constant c > 0 such that for all 0 ≤ n < k. In particular, if the sequence { p X (n)/ p Y (n)} ∞ n=0 is eventually non-increasing then (18) is satisfied.

Compound distributions
Consider the compound distribution where the counting random variable N represents the number of claims and M 1 , M 2 , . . . represent the claim size. It is assumed that M 1 , M 2 , . . . are independent and identically distributed discrete random variables which do not depend on N and that S = 0 if N = 0. Denoting p(n) = P(N = n), g n = P(S = n) and h n = P(M = n) we have that where h * n k = P(M 1 + · · · + M n = k) is the n-fold convolution of h k , k ∈ N, The probability generating function of S is given by where G M (z) is the common probability generating function of M 1 , M 2 , . . . and G N (z) is the probability generating function of N .
As we want to compare compound distributions with respect to ≺, then, in order to guarantee that the resulting compound distributions are from , i.e., that their support set is N 0 , we will assume that M 1 , M 2 . . . are all from . In order to compare distributions with respect to ≺ we need to know the asymptotic behaviour of their probability mass functions. In general, estimating the asymptotic behaviour of compound distributions is not easy. However, under certain assumptions on claim number and claim size distributions various asymptotic estimates for compound distributions have been obtained (see, e.g., Embrechts et al. (1984) and its references). We first apply the following theorem from Chover et al. (1973).
Theorem 4.1 Let M be the common distribution of M 1 , M 2 , . . . and assume that the following conditions hold: ∞ and G N (z) is analytic in a region containing the range of G M (z) for |z| ≤ r.
Then lim n→+∞ g n h n = G N (d).
As described in Chover et al. (1973), the above conditions are satisfied by many examples. We then have the following proposition. Let us now consider mixed Poisson distributions. In this case where μ is Lebesgue or counting measure with generalized density f (x). It is shown in Willmot (1990) is a locally bounded function on (0, ∞) which varies slowly at infinity, β ≥ 0 and −∞ < α < ∞, then Clearly Poisson(λ 1 ) ≺ Poisson(λ 2 ) if and only if λ 1 < λ 2 in which case λ 1 λ 1 +β < λ 2 λ 2 +β for all β > 0. Thus, if the densities f 1 and f 2 in both mixtures share the same β > 0, then Poisson(λ 1 ) ≺ Poisson(λ 2 ) implies S 1 ≺ S 2 , where S 1 ∼ (λ 1 , f 1 ) and S 2 ∼ (λ 2 , f 2 ). However, if β = 0 (in which case we should have α < −1), then the condition Poisson(λ 1 ) ≺ Poisson(λ 2 ) alone does not imply S 1 ≺ S 2 . In order to have S 1 ≺ S 2 in this case, we should instead assume that f 1 and f 2 are such that Recall from the introductory section, that the recursive formulae and other properties of compound distributions are usually obtained under the assumption that the corresponding claim number distribution belongs to Panjer's family or to its certain extensions. As the order ≺ is obtained by an extension of Panjer's family of distributions, then, in some sense, ≺ is more naturally connected to the claim number distribution N rather than to the claim size distribution M. It is therefore interesting to see when the condition N ≺ N alone implies S ≺ S given that the claim size distributions are same: M = M := M. The following proposition shows that the desired implication holds for a large class of claim number distributions. where C 1 and C 2 are some positive constants, α 1 , α 2 ∈ R and θ 1 , θ 2 ∈ (0, 1). Assume also that the probability generating function G M (z) of M has radius of convergence exceeding both τ 1 and τ 2 , where G M (τ 1 ) = θ −1 1 and G M (τ 2 ) = θ −1 2 . Then S ≺ S if and only if N ≺ N .
Proof As it is shown in Willmot (1989), in this case we have that where g n = P(S = n) and g n = P(S = n). Note that N ≺ N if and only if either θ 1 < θ 2 or θ 1 = θ 2 and α 2 − α 1 > 1. As G M is increasing on R + , then in the first case we have that τ 1 > τ 2 and, therefore, S ≺ S . Clearly in the second case we again have that S ≺ S .
The above argument can also be applied to get the desired implication in the cases when one of the claim number distributions is asymptotically smaller (e.g. Poisson distribution) or larger (e.g. Waring distribution) than n α θ n .
To conclude this section, we note that in general the conditions M ≺ M and N ≺ N do not necessarily imply S ≺ S . To see this let p N (n) ∼ C 1 4 −n , p N (n) ∼ C 2 3 −n , M ∼ Poisson(ln 4) and M ∼ N B(1, 1/4). Then G M (2) = 4 and G M (3) = 3 and, therefore, g n ∼ C 1 2 −n and g n ∼ C 2 3 −n . Hence, not only does S ≺ S not hold but in fact S ≺ S .

Generation of new families of distributions
In this section we will consider the following distributions from : We will use the notations 1 X p , X hp , X nb , X sl , X w and X pb to denote random variables following Poisson, Hyper-Poisson, Negative Binomial, Shifted Logarithmic, Waring and Poisson-Beta distributions respectively.
As was mentioned in the first section, the probability mass functions of the Poisson and Negative Binomial distributions satisfy the recurrence relation It is also known (see Willmot and Panjer 1987), that the probability mass functions of the Hyper-Poisson, Shifted Logarithmic and Waring distributions satisfy the first order linear recursion and the probability mass function of the Poisson-Beta distribution, together with some other distributions, such as Poisson-Generalized Inverse Gaussian distribution, satisfies the second order quadratic recursion where the values of the coefficients are determined by the values of the parameters of the corresponding distributions. As for any α > 0 lim n→∞ (n + α) (n)n α = 1, we have the following: (ii) if ν < μ then X pb (ν, α, β) ≺ X hp (μ, η); (iii) for any values of parameters of corresponding distributions, X pb is ≺ than each of X nb , X sl and X w .
Now let X be an element of with probability mass function satisfying the first order recursion and let Y be an element of with probability mass function satisfying the second order recursion where f , g and h are some functions from N to R. Then, if Y ≺ X , then for the random variable Z with we have that the probability mass function of Z satisfies the following second order recursion: Rewriting the probability mass function of the Poisson-Beta distribution in terms of the confluent hypergeometric function M as and using asymptotic properties of M (see, e.g., NIST Digital Library of Mathematical Functions, Ch. 13) we get that We therefore have the following: (ν, α, β). Now assume that X is an element of with probability mass function satisfying (23), Y is an element of with probability mass function satisfying (24), and X ≺ Y . Then for the random variable Z with p X (n) = cp Y (n) p Z (n), n ∈ N 0 , we have that the probability mass function of Z satisfies the following second order inverse recursion: , n > 1. (26)

Recursion formula
Let X belong to the family of distributions satisfying (20) and let Y belong to the family of distributions satisfying (22). Then X follows either Poisson or Negative Binomial distribution. In the first case we have that in (20) a 0 = 0 and applying (25) we get that the probability mass function of Z satisfies the reccurence relation Dividing both sides of the above expression by n + 1 we can rewrite (27) as If X follows a Negative Binomial distribution, then in (20) a 0 = 0 and applying (25) we get As the coefficients of each of p(n), p(n − 1) and p(n − 2) in the above expression are polynomials on n of degree at most 2, then each of them can be written in the form d 1 (n + 1)(n + 2) + d 2 (n + 1) + d 3 , where for each case the values of the fixed terms d 1 , d 2 and d 3 are determined by the values of corresponding fixed terms appearing in (28). Using this decomposition and dividing both sides of (28) by (n + 1)(n + 2) (also using 1 (n+1)(n+2) = 1 n+1 − 1 n+2 ), we get that (28) can be rewritten as Thus, in both cases we get that the probability mass function of Z satisfies the recurrence relation Now assume that the counting random variable N with p(n) = P(N = n) satisfies (29) and consider the compound distribution where, as before, claim sizes M 1 , M 2 , . . . are independent and identically distributed discrete random variables which are independent of N and follow a common distri-bution M, h k = P(M = k) and with g 0 = G N (h 0 ). If k > 0, then multiplying both sides of (29) by h * n+2 k and summing over n ≥ 2 yields Denoting and applying formulae (29)-(31) from Kitano et al. (2005) together with as n → ∞.
The reverse hazard function of X ∈ is defined as (see Chechile 2011) and it gives the conditional probability of the failure at the n-th event given that failure occurs at k ≤ n.
For X ∈ , the odds for surviving age n is given by the odds function for survival (ii) Poisson shock model. Consider two devices that are subject to shocks occuring randomly in time as events in a Poisson process with constant intensity λ. Let the probabilities of failure of the devices on the n-th shock be respectively p X (n) and p Y (n). Then, if R X (n) = k≥n p X (k) and R Y (n) = k≥n p Y (k), the probability that the devices survive until time t are given respectively by the formulae (see, e.g., Esary et al. 1973) Below we show that if X ≺ Y then lim t→∞ H X (t)/H Y (t) = 0. By Theorem 3.1, lim n→∞ R X (n)/R Y (n) = 0, and, therefore, for any ε > 0 there exists N ∈ N such that R X (n)/R Y (n) < ε for n > N . We then have that n)(λt) n /n! < N n=0 R X (n)(λt) n /n! + ∞ n=N +1 ε R Y (n)(λt) n /n! N n=0 R Y (n)(λt) n /n! + ∞ n=N +1 R Y (n)(λt) n /n! < N n=0 R X (n)(λt) n /n! ∞ n=N +1 R Y (n)(λt) n /n! + ε.
As t → ∞, the first summand of the last expression goes to 0. Hence, there exists T > 0 such that H X (t)/H Y (t) < 2ε for t > T , which means that lim t→∞ H X (t)/H Y (t) = 0.
(iii) Sampling with visibility bias. Recall that formula (15), which defines the order ≺, was initially motivated by the factorization of rational functions that generate discrete distributions. In the following example we show how the same structural connection between discrete distributions can arise in practice in a completely different setting such as sampling with visibility bias to estimate wildlife abundance.
One of the most practical ways of estimating wildlife abundance are aerial surveys. However, when conducting such surveys, the visibility bias caused by various factors, such as snow cover, dense vegetation and observer fatigue, need to be taken into account. There are various techniques proposed in the literature that tackle the problem of population estimation in the face of visibility bias. For example, the quadrat sampling method proposed in Cook and Martin (1974) is based on the exploration of a collection of quadrats within the area of interest. It is then assumed that the number of groups within those quadrats and the size of each group are independently distributed and that each animal is observed with probability p. We say that the group of n animals is observed if all its members are observed. Then a group of size n is observed with probability p n . Now let the random variables X and Y represent, respectively, the established and the true number of animals in a group. Then we have that where c = ( ∞ k=1 p Y (k) p k ) −1 . Thus, the above formula is identical to (15) with Z following a Geometric distribution with parameter 1 − p.

Conclusion
We factorize rational functions that recursively generate probability mass functions to obtain a factorization of those probability mass functions. In case the denominator of the generating rational function is a monomial, we show that the probability mass function can be factorized into a product of 4 different types of mass functions, 2 of which correspond to the Poisson and the Negative Binomial distrbutions. This factorization leads to a definition of a stochastic order on the space of discrete distributions that are supported on N 0 . We show that { , } is a partially ordered set and present its main properties. Remarkably, ≺ is not only closed under the operations of addition and multiplication, but is also invariant under relabelings of discrete distributions. In many setups, ≺ is also closed under the operation of compounding distributions. The relation X ≺ Y also implies a strong relation lim n→+∞ R X (n)/R Y (n) = 0 between the survival functions of X and Y . Comparison of some common distributions with respect to ≺ allows to generate new families of distributions that satisfy various recurrence relations. For some of those families, when their members play the role of counting distributions, recursive formulae for the probabilities of corresponding compound distributions are derived. Obtained results have various applications in survival analysis and in aerial census with visibility bias.