Introduction

Combinations of random variables (e.g., sums, products, ratios) regularly occur in many scientific areas. Particularly useful is the ratio of two random variables. For example, plant scientists use the ratio of leaf area to total plant weight (leaf area ratio) in plant growth analysis (Poorter and Garnier 1996), and geneticists use the ratio of total genetic diversity distributed among populations to total genetic diversity in the pooled populations as a measure of population differentiation (Culley et al. 2002). The ratio of two fluorescent signals has several applications in fluorescence microscopy, e.g., estimating the DNA sequence copy number as a function of chromosomal location (Piper et al. 1995), and there are many (dimensionless) ratios employed in engineering (Mekic et al. 2012). In the case of categorical data (i.e., from a binomial or multinomial distribution), ratios also have numerous applications in consumer preference studies, election poll results, quality control, epidemiology, and so on.

Formally, a ratio distribution is a probability distribution constructed as the distribution of the ratio of two random variables, each having another (known) distribution. More particularly, given two random variables Y1 and Y2, the distribution of the random variable Z that is formed as the ratio Z=Y1/Y2 is a ratio distribution. When using ratio distributions for theoretical and practical purposes, it is helpful to know their mean and variance, preferably in a computationally efficient form. In the case that Y1 and Y2 are independent and follow normal distributions with zero means, \(\mu _{Y_{1}}=\mu _{Y_{2}}=0\), Z follows the Cauchy distribution (Geary 1930; Fieller 1932; Hinkley 1969; Korhonen and Narula 1989; Marsaglia 2006). Other authors have addressed ratios of binomial proportions (also known as relative risk) (Koopman 1984; Bonett and Price 2006; Price and Bonett 2008), ratios of uniform distributions (Sakamoto 1943), Student’s t distributions (Press 1969), Weibull and gamma distributions (Basu and Lochner 1971; Provost 1989; Nadarajah and Kotz 2006), beta distributions (Pham-Gia 2000), Laplace and Bessel distributions (Nadarajah 2005; Nadarajah and Kotz 2005) and others. General notes on the product and ratio of two (not necessarily normal) random variables can also be found in Frishman (1971) and Van Kempen and Van Vliet (2000).

In our paper, we consider a ratio involving two or more random variables that jointly have a multinomial distribution. This situation is similar to relative risk or risk ratio, which is the ratio of the probability of an event occurring (for example, developing a disease or being injured) in an exposed group to the probability of the event occurring in a comparison, non-exposed group. However, while the probabilities in the risk ratio are independent (in the sense that they describe two independent events in two independent groups), in our case, the probabilities are tied together through the covariance between multinomial categories. These ratios serve as a common framework for opinion polls, statistical quality control, and consumer preference studies. Confidence intervals for the odds ratio, which can be easily calculated if the standard deviation is known, are especially important for applications. Nelson (1972) presented estimates, confidence intervals, and hypothesis tests for the odds ratio in trinomial distributions. Piegorsch and Richwine (2001) examined some types of confidence intervals in the context of analysis of genetic mutant spectra. Quesenberry and Hurst (1964) and Goodman (1965) explored methods for obtaining a set of simultaneous confidence intervals for the probabilities of a multinomial distribution. A comparison of the performance of various confidence intervals also appeared in Alghamdi (2015) and Aho and Bowyer (2015). To the best of our knowledge, however, there has been no analytical treatment of the ratio of multinomial proportions that includes derivations of formulae for the mean and variance of such a ratio.

A ratio between two or more random variables that jointly have a multinomial distribution also arises in the trending field of non-invasive prenatal testing of common fetal aneuploidies such as trisomy of the 13th, 18th or 21st chromosome (Chiu et al. 2008; Sehnert et al. 2011; Lau et al. 2012; Minarik et al. 2015). We are currently working on implementing this model in laboratory practice, and this paper provides the mathematical background of our work. In this paper, we discuss two solutions to the problem of the mean and variance of the said ratio. More particularly, we derive asymptotic formulae for the mean and variance of the random variable Z=Y1/Y2, where \(Y_{1}=\sum _{k\in I} X_{k}\) and \(Y_{2}=\sum _{k\in J} X_{k}\), with I,J⊂{1,...,r} and I∩J=∅, are sums of random variables \(X_{1},...,X_{r}\) which together have a joint multinomial distribution.

Solution by Taylor series

There is a simple solution to the mean and variance of the ratio of multinomial proportions that can be derived by using the Taylor series. Formally, let a set of random variables \(X_{1},...,X_{r}\) have a probability function

$$pr\left(X_{1}=x_{1},..., X_{r}=x_{r}\right) = \frac{n!}{\prod_{i=1}^{r}{x_{i}!}}\prod_{i=1}^{r}{p_{i}^{x_{i}}}, $$

where \(x_{i}\) are non-negative integers such that \(\sum x_{i} = n\) and \(p_{i}\) are constants with \(p_{i}>0\) and \(\sum p_{i}=1\). The joint distribution of \(X_{1},...,X_{r}\) is known as the multinomial distribution. Let \(u,v\in \{0,1\}^{r}\) be two binary vectors such that \(\sum u_{i}>0\), \(\sum v_{i}>0\) and \(u_{i}v_{i}=0\) for all i. We define

$$Z_{0} = \frac{X \cdot u}{X \cdot v}, $$

where · represents a scalar product and \(X=(X_{1},...,X_{r})\). Without loss of generality, we will restrict our explorations to r=3 and Z0=X1/X2. This holds because the choice vectors u,v have no common \(X_{i}\); thus, the \(X_{i}\)s can be grouped into three disjoint sets: 1) \(X_{i}\)s selected by u, 2) \(X_{i}\)s selected by v, and 3) all others.

Also, the reader will note that the ratio Z0=X1/X2 can be viewed as a ratio of absolute quantities as well as a ratio of fractions or probabilities because Z0=(X1/n)/(X2/n).

Before we proceed any further, observe that because of the possible zero in the denominator of Z0, there is no analytical solution to the mean and variance of the ratio Z0. A workaround for this problem is to rewrite this ratio using a function that does not have a singularity. Let Z0=f(X1,X2)=X1/X2 be a function of two random variables. Then, with \(\mu =\left (\mu _{X_{1}}, \mu _{X_{2}}\right)\), we can use the Taylor series to approximate the function f as

$$\begin{array}{*{20}l} Z_{0}=f\left(X_{1},X_{2}\right) \approx&\ f(\mu) + \left(X_{1} - \mu_{X_{1}}\right)\frac{\partial f}{\partial X_{1}}(\mu) + \left(X_{2} - \mu_{X_{2}}\right)\frac{\partial f}{\partial X_{2}}(\mu) \\ &+ \frac{1}{2}\left(X_{1} - \mu_{X_{1}}\right)^{2}\frac{\partial^{2} f}{\partial X_{1}^{2}}(\mu) + \frac{1}{2}\left(X_{2} - \mu_{X_{2}}\right)^{2}\frac{\partial^{2} f}{\partial X_{2}^{2}}(\mu) \\ &+\left(X_{1} - \mu_{X_{1}}\right)\left(X_{2} - \mu_{X_{2}}\right)\frac{\partial^{2} f}{{\partial X_{1}}{\partial X_{2}}}(\mu), \end{array} $$

from which we have

$$ E(Z_{0}) \approx f(\mu) + \frac{1}{2}\frac{\partial^{2} f}{\partial X_{1}^{2}}(\mu)\sigma_{X_{1}}^{2} + \frac{1}{2}\frac{\partial^{2} f}{\partial X_{2}^{2}}(\mu)\sigma_{X_{2}}^{2} + \frac{\partial^{2} f}{{\partial X_{1}}{\partial X_{2}}}(\mu)\sigma_{X_{1},X_{2}}. $$
(1)

Since X1 and X2 are terms of a random vector X=(X1,X2,X3) drawn from the multinomial distribution given by (n,p1,p2,p3), we have \(\mu _{X_{i}} = np_{i}\) and \(\sigma _{X_{i}}^{2}=np_{i}(1-p_{i})\) for i=1,2, and \(\sigma _{X_{1},X_{2}} = -np_{1}p_{2}\). It follows easily that

$$ E(Z_{0}) \approx \frac{p_{1}}{p_{2}} + \frac{1}{n}\left(\frac{p_{1}(1-p_{2})}{p_{2}^{2}} + \frac{p_{1}}{p_{2}}\right) = \frac{p_{1}}{p_{2}}\left(1 + \frac{1}{np_{2}}\right). $$
(2)
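As a reading aid, the step from Eq. (1) to Eq. (2) can be written out explicitly. The following is a routine calculation for f(X1,X2)=X1/X2 evaluated at \(\mu =(np_{1},np_{2})\), not an additional result:

$$\frac{\partial^{2} f}{\partial X_{1}^{2}}(\mu) = 0, \qquad \frac{\partial^{2} f}{\partial X_{2}^{2}}(\mu) = \frac{2\mu_{X_{1}}}{\mu_{X_{2}}^{3}} = \frac{2p_{1}}{n^{2}p_{2}^{3}}, \qquad \frac{\partial^{2} f}{{\partial X_{1}}{\partial X_{2}}}(\mu) = -\frac{1}{\mu_{X_{2}}^{2}} = -\frac{1}{n^{2}p_{2}^{2}}, $$

so that the two non-vanishing correction terms in Eq. (1) are

$$\frac{1}{2}\cdot\frac{2p_{1}}{n^{2}p_{2}^{3}}\cdot np_{2}(1-p_{2}) = \frac{p_{1}(1-p_{2})}{np_{2}^{2}}, \qquad \left(-\frac{1}{n^{2}p_{2}^{2}}\right)\left(-np_{1}p_{2}\right) = \frac{p_{1}}{np_{2}}. $$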

For variance, we use a simpler approximation of f

$$f\left(X_{1},X_{2}\right) \approx f(\mu) + \left(X_{1} - \mu_{X_{1}}\right)\frac{\partial f}{\partial X_{1}}(\mu) + \left(X_{2} - \mu_{X_{2}}\right)\frac{\partial f}{\partial X_{2}}(\mu), $$

from which we have

$$ var(Z_{0}) \approx \frac{\partial f}{\partial X_{1}}(\mu)^{2}\sigma_{X_{1}}^{2} + \frac{\partial f}{\partial X_{2}}(\mu)^{2}\sigma_{X_{2}}^{2} + 2\frac{\partial f}{\partial X_{1}}(\mu)\frac{\partial f}{\partial X_{2}}(\mu)\sigma_{X_{1},X_{2}}, $$
(3)

and finally

$$ var(Z_{0}) \approx \frac{1}{n}\left(\frac{p_{1}(1-p_{1})}{p_{2}^{2}} + \frac{p_{1}^{2}(1-p_{2})}{p_{2}^{3}} + 2\frac{p_{1}^{2}}{p_{2}^{2}} \right) = \frac{1}{n}\left(\frac{p_{1}}{p_{2}}\right)^{2}\left(\frac{1}{p_{1}} + \frac{1}{p_{2}}\right). $$
(4)
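The closed forms in Eqs. (2) and (4) are easy to evaluate numerically. The following minimal Python sketch does so; the function names `taylor_mean` and `taylor_var` are illustrative choices and do not come from the paper or its supplementary script.

```python
def taylor_mean(n, p1, p2):
    """Second-order Taylor approximation of E(X1/X2), Eq. (2)."""
    return (p1 / p2) * (1.0 + 1.0 / (n * p2))


def taylor_var(n, p1, p2):
    """First-order Taylor approximation of var(X1/X2), Eq. (4)."""
    return (p1 / p2) ** 2 * (1.0 / p1 + 1.0 / p2) / n


if __name__ == "__main__":
    # Example: the distribution (n, 0.25, 0.5, 0.25) with n = 10.
    print(taylor_mean(10, 0.25, 0.5), taylor_var(10, 0.25, 0.5))
```

Note that p3 does not enter either formula; it only fixes the remaining probability mass.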

Solution by a modified ratio

3.1 Definition

Let the symbols X, u, and v have the same meaning as in Section 2. We define a new random variable Z1 as

$$ Z_{1} = \frac{X \cdot u}{X \cdot v + 1}. $$
(5)

The + 1 in the above definition serves to avoid zero in the denominator, and thus solves the problem with the singularity of Z0. For the same reasons as in Section 2, we will restrict our explorations to r=3 and Z1=X1/(X2+1).

3.2 Sample space

The sample space \(S_{Z_{1}}\subseteq \mathbb {Q}\) of the random variable Z1 is limited by the sample space \(S_{X}\) of the multinomially distributed random vector X=(X1,X2,X3). Therefore, if X assumes values from the multinomial distribution given by (n,p1,p2,p3), then Z1 cannot assume all rational values a/(b+1) for some \(a, b\in \mathbb {N}\), but only those that satisfy a+b≤n and a,b≥0. Furthermore, values such as 2/2 and 4/4 are considered identical; therefore, different outcomes of the random vector X may correspond to the same outcome of Z1. In other words, each instance (a,b,c) of X corresponds to exactly one instance a/(b+1) of Z1, while an instance of Z1 may correspond to multiple instances of X.

Naturally, the probability of a particular value of Z1 can be determined by summing the probabilities of all (multinomial) vectors that are associated with this value. From this, it follows that if the initial multinomial probability distribution function of random vector X is

$$pr\left(X_{1}=a,X_{2}=b,X_{3}=c\right) = \left(\begin{array}{c} n\\ a,b,c \end{array}\right)p_{1}^{a}~p_{2}^{b}~p_{3}^{c}, $$

then the probability distribution function of random variable Z1 is

$$pr\left(Z_{1} = d\right) = \sum_{\substack{{a,b,c\in\{0,...,n\}}\\a+b+c=n\\a/(b+1)=d}} \left(\begin{array}{c} n\\ a,b,c \end{array}\right)p_{1}^{a}~p_{2}^{b}~p_{3}^{c}, $$

which can be rewritten as

$$pr\left(Z_{1} = d\right) =\sum_{b=0}^{n}\sum_{\substack{a=0\\a/(b+1)=d}}^{n-b}\left({n \atop b}\right)\left({n-b \atop a}\right) p_{1}^{a}~ p_{2}^{b}~ (1-p_{1}-p_{2})^{n-a-b}. $$
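For small n, the distribution of Z1 described above can be tabulated by direct enumeration. The sketch below is one possible way to do this in Python; the function name `z1_pmf` is hypothetical, and exact rational keys (`fractions.Fraction`) are an implementation choice so that values such as 2/2 and 4/4 are merged automatically.

```python
from fractions import Fraction
from math import comb


def z1_pmf(n, p1, p2):
    """Distribution of Z1 = X1/(X2+1) as a dict {value: probability}."""
    p3 = 1.0 - p1 - p2
    pmf = {}
    for b in range(n + 1):            # value of X2
        for a in range(n - b + 1):    # value of X1, so that a + b <= n
            prob = comb(n, b) * comb(n - b, a) * p1**a * p2**b * p3 ** (n - a - b)
            d = Fraction(a, b + 1)    # reduced fraction: 2/2 and 4/4 share a key
            pmf[d] = pmf.get(d, 0.0) + prob
    return pmf


if __name__ == "__main__":
    for value, prob in sorted(z1_pmf(4, 0.25, 0.5).items()):
        print(value, prob)
```

The double loop visits every outcome (a,b,n−a−b) exactly once, so the cost grows quadratically in n.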

3.3 Mean and variance

Now we can state the mean and variance of Z1. The proofs of the statements can be found in the Appendix.

Theorem 1

Let X=(X1,X2,X3) be a random vector from the multinomial distribution given by (n,p1,p2,p3). The expected value of the random variable Z1, given by (5), is

$$E(Z_{1})=\frac{p_{1}}{p_{2}}\left(1 - (1-p_{2})^{n}\right). $$
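Theorem 1 can be checked numerically against the defining double sum. The following Python sketch is only a sanity check under the stated multinomial model; the function names are illustrative and not part of the paper.

```python
from math import comb


def z1_mean_exact(n, p1, p2):
    """E(Z1) by summing a/(b+1) over all outcomes (a, b, n-a-b)."""
    p3 = 1.0 - p1 - p2
    total = 0.0
    for b in range(n + 1):
        for a in range(n - b + 1):
            prob = comb(n, b) * comb(n - b, a) * p1**a * p2**b * p3 ** (n - a - b)
            total += prob * a / (b + 1)
    return total


def z1_mean_closed(n, p1, p2):
    """Closed form from Theorem 1."""
    return (p1 / p2) * (1.0 - (1.0 - p2) ** n)


if __name__ == "__main__":
    for n in (5, 10, 30):
        print(n, z1_mean_exact(n, 0.25, 0.05), z1_mean_closed(n, 0.25, 0.05))
```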

Theorem 2

Let X=(X1,X2,X3) be a random vector from the multinomial distribution given by (n,p1,p2,p3), where

$$n>\frac{1-p_{2}}{p_{2}}N + \frac{1-2p_{2}}{p_{2}} $$

for some natural non-zero N. The variance of the random variable Z1, given by (5), is

$$\begin{array}{*{20}l} var(Z_{1}) =&\ \left[\frac{p_{1}}{p_{2}(1-p_{2})}\right]^{2}\frac{\frac{1-p_{2}}{p_{1}}-2}{n+2} + \frac{p_{1}}{p_{2}(1-p_{2})}\frac{\frac{p_{1}}{1-p_{2}}-1}{n+1} \\ &+ \sum_{k=1}^{N} \frac{\left[\frac{p_{1}}{p_{2}(1-p_{2})}\right]^{2}}{\left({n+k+1 \atop k}\right)p_{2}^{k}}\left[1-\frac{k+2 - \frac{1-p_{2}}{p_{1}}}{n+k+2}\right] + O\left(\frac{1}{n^{N+1}}\right). \end{array} $$

Corollary 1

For N=1 we have for the variance from Theorem 2

$$var(Z_{1}) = \frac{1}{n}\left(\frac{p_{1}}{p_{2}}\right)^{2} \left(\frac{1}{p_{1}} + \frac{1}{p_{2}}\right) + O\left(\frac{1}{n^{2}}\right). $$

Observe that the formula for the variance is asymptotic in nature, and thus it may not work well for small n and certain configurations of p1, p2 and p3. See Section 5 for more details.
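To get a feeling for how quickly the asymptotic formula of Theorem 2 approaches the true variance, one can compare it with the variance obtained by exhaustive enumeration. The Python sketch below does this; the function names are hypothetical, the O-term is simply dropped, and the enumeration is only practical for moderate n.

```python
from math import comb


def z1_var_theorem2(n, p1, p2, N=5):
    """Truncated variance of Z1 from Theorem 2 (the O-term is dropped)."""
    if n <= (1 - p2) / p2 * N + (1 - 2 * p2) / p2:
        raise ValueError("n is too small for this N (condition of Theorem 2)")
    c = p1 / (p2 * (1 - p2))
    q = (1 - p2) / p1
    var = c**2 * (q - 2) / (n + 2) + c * (p1 / (1 - p2) - 1) / (n + 1)
    for k in range(1, N + 1):
        var += c**2 / (comb(n + k + 1, k) * p2**k) * (1 - (k + 2 - q) / (n + k + 2))
    return var


def z1_var_exact(n, p1, p2):
    """var(Z1) by enumerating all outcomes (a, b, n-a-b)."""
    p3 = 1.0 - p1 - p2
    m1 = m2 = 0.0
    for b in range(n + 1):
        for a in range(n - b + 1):
            prob = comb(n, b) * comb(n - b, a) * p1**a * p2**b * p3 ** (n - a - b)
            z = a / (b + 1)
            m1 += prob * z
            m2 += prob * z * z
    return m2 - m1 * m1


if __name__ == "__main__":
    for N in (1, 3, 5):
        print(N, z1_var_theorem2(100, 0.25, 0.5, N), z1_var_exact(100, 0.25, 0.5))
```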

Approximate error of solution by a modified ratio

Let

$$Err = g(X_{1},X_{2}) = \frac{X_{1}}{X_{2}} - \frac{X_{1}}{X_{2}+1} = \frac{X_{1}}{X_{2}(X_{2}+1)} $$

be a function of two random variables expressing the difference between Z0 and Z1. Analogous to Eqs. (1)–(4) from Section 2, with g(X1,X2)=X1/[X2(X2+1)] in place of f, we have for the mean and variance of Err

$$\begin{array}{*{20}l} E(Err) &\approx \frac{p_{1}}{p_{2}(1+np_{2})} + \frac{p_{1}(1-p_{2})\left(1 + 3np_{2} + 3n^{2}p_{2}^{2}\right)}{np_{2}^{2}(1 + np_{2})^{3}} + \frac{p_{1}(1+2np_{2})}{np_{2}(1+np_{2})^{2}} \\ &= \frac{p_{1}\left[1 + 4np_{2} + (5-p_{2})n^{2}p_{2}^{2} + n^{3}p_{2}^{3}\right]}{np_{2}^{2}(1 + np_{2})^{3}}, \end{array} $$
(6)
$$\begin{array}{*{20}l} var(Err) &\approx \frac{(1 - p_{1}) p_{1}}{n p_{2}^{2} (1 + n p_{2})^{2}} + \frac{(1 - p_{2}) (p_{1} + 2 n p_{1} p_{2})^{2}}{n p_{2}^{3} (1 + n p_{2})^{4}} + \frac{2 p_{1}^{2} (1 + 2 n p_{2})}{n p_{2}^{2} (1 + n p_{2})^{3}} \\ &= \frac{p_{1} \left[p_{2} (1 + n p_{2})^{2} + p_{1} \left\{1 + 4 n p_{2} + (4 - p_{2}) n^{2} p_{2}^{2}\right\}\right]}{n p_{2}^{3} (1 + n p_{2})^{4}}. \end{array} $$
(7)

It follows from Eqs. (6) and (7) that Z1 is an asymptotically (n→∞) unbiased estimator of the ratio of multinomial proportions Z0. Moreover, Eqs. (6) and (7) can be used to correct the mean and variance of the modified ratio Z1 to better reflect the mean and variance of the original ratio Z0. Let \(Z_{1}^{cor} = Z_{1} + Err\) be a new random variable. Since the expected value is linear, we have directly

$$\begin{array}{*{20}l} E\left(Z_{1}^{cor}\right) &= E(Z_{1}) + E(Err) \approx \\ &\approx \frac{p_{1}}{p_{2}}\left(1 - (1-p_{2})^{n}\right) + \frac{p_{1}\left[1 + 4np_{2} + (5-p_{2})n^{2}p_{2}^{2} + n^{3}p_{2}^{3}\right]}{np_{2}^{2}(1 + np_{2})^{3}}. \end{array} $$
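The corrected mean is straightforward to evaluate. The following Python sketch combines Theorem 1 with the simplified form of Eq. (6); the function names are illustrative only, and the result is an approximation of E(Z0) in the same sense as Eq. (6) itself.

```python
def z1_mean(n, p1, p2):
    """E(Z1) from Theorem 1."""
    return (p1 / p2) * (1.0 - (1.0 - p2) ** n)


def err_mean(n, p1, p2):
    """Approximate E(Err), simplified form of Eq. (6)."""
    t = n * p2
    numerator = p1 * (1 + 4 * t + (5 - p2) * t**2 + t**3)
    return numerator / (n * p2**2 * (1 + t) ** 3)


def z1_cor_mean(n, p1, p2):
    """Corrected mean E(Z1) + E(Err)."""
    return z1_mean(n, p1, p2) + err_mean(n, p1, p2)


if __name__ == "__main__":
    print(z1_cor_mean(50, 0.25, 0.5))
```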

For the variance, we have

$$var(Z_{1}^{cor}) = var(Z_{1}) + var(Err) + 2cov(Z_{1},Err), $$

where

$$cov(Z_{1},Err) = E(Z_{1}\cdot Err) - E(Z_{1}) \cdot E(Err). $$

To approximate the value of E(Z1·Err), where Z1·Err=\(X_{1}^{2}/[X_{2}(X_{2}+1)^{2}]\), we use the Taylor series again, particularly Eq. (1). After some rearrangement, we get

$$\begin{array}{*{20}l} {}E\left(\frac{X_{1}^{2}}{X_{2}(X_{2}+1)^{2}}\right) \approx& \frac{n p_{1}^{2}}{p_{2} (1 + n p_{2})^{2}} + \frac{(1 - p_{1}) p_{1}}{p_{2} (1 + n p_{2})^{2}} \\ &+ \frac{p_{1}^{2} (1 - p_{2}) \left(1 + 4 n p_{2} + 6 n^{2} p_{2}^{2}\right)}{p_{2}^{2} (1 + n p_{2})^{4}} + \frac{2 p_{1}^{2} (1 + 3 n p_{2})}{p_{2} (1 + n p_{2})^{3}} \\ =& \frac{p_{1} \left[p_{2} (1 + n p_{2})^{2} + p_{1} \left\{1 + (5 + 2 p_{2}) n p_{2} + (8 - p_{2}) n^{2} p_{2}^{2} + n^{3} p_{2}^{3}\right\}\right]}{p_{2}^{2} (1 + n p_{2})^{4}} \end{array} $$

Thus, we can now easily calculate the value of \(var\left (Z_{1}^{cor}\right)\) (equation omitted due to its length). In the next section, we shall discuss numerical simulations and performance of the presented formulae.
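Although the closed form of the corrected variance is long, it can be assembled numerically from its parts. The Python sketch below combines the truncated formula of Theorem 2, Eq. (7), Theorem 1, Eq. (6), and the approximation of E(Z1·Err) given above; all function names are hypothetical, and this is one possible arrangement of the calculation rather than the code of the supplementary script.

```python
from math import comb


def z1_mean(n, p1, p2):
    """E(Z1), Theorem 1."""
    return (p1 / p2) * (1 - (1 - p2) ** n)


def z1_var(n, p1, p2, N=5):
    """var(Z1), Theorem 2 with the O-term dropped."""
    c, q = p1 / (p2 * (1 - p2)), (1 - p2) / p1
    var = c**2 * (q - 2) / (n + 2) + c * (p1 / (1 - p2) - 1) / (n + 1)
    for k in range(1, N + 1):
        var += c**2 / (comb(n + k + 1, k) * p2**k) * (1 - (k + 2 - q) / (n + k + 2))
    return var


def err_mean(n, p1, p2):
    """Approximate E(Err), Eq. (6)."""
    t = n * p2
    return p1 * (1 + 4 * t + (5 - p2) * t**2 + t**3) / (n * p2**2 * (1 + t) ** 3)


def err_var(n, p1, p2):
    """Approximate var(Err), Eq. (7)."""
    t = n * p2
    inner = p2 * (1 + t) ** 2 + p1 * (1 + 4 * t + (4 - p2) * t**2)
    return p1 * inner / (n * p2**3 * (1 + t) ** 4)


def z1_err_product_mean(n, p1, p2):
    """Approximate E(Z1 * Err) = E(X1^2 / (X2 (X2+1)^2))."""
    t = n * p2
    inner = p2 * (1 + t) ** 2 + p1 * (1 + (5 + 2 * p2) * t + (8 - p2) * t**2 + t**3)
    return p1 * inner / (p2**2 * (1 + t) ** 4)


def z1_cor_var(n, p1, p2, N=5):
    """var(Z1) + var(Err) + 2 cov(Z1, Err)."""
    cov = z1_err_product_mean(n, p1, p2) - z1_mean(n, p1, p2) * err_mean(n, p1, p2)
    return z1_var(n, p1, p2, N) + err_var(n, p1, p2) + 2 * cov


if __name__ == "__main__":
    print(z1_cor_var(50, 0.25, 0.5))
```

The function `z1_var` assumes that n satisfies the condition of Theorem 2 for the chosen N; no check is made here.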

Numerical simulations

Numerical simulations were performed in the following way. We selected several multinomial distributions given by (n,p1,p2,p3) and for each such distribution, we sampled \(10^{5}\) random vectors (X1,X2,X3). Vectors with X2=0 were counted (variable zeros) and omitted from further calculations; that is, they were not replaced by new random vectors. For the vectors with X2≠0, we calculated the ratios Z0=X1/X2, while the ratios Z1=X1/(X2+1) were calculated from all \(10^{5}\) sampled vectors. Thus, we obtained \(10^{5}\)−zeros values of Z0 and \(10^{5}\) values of Z1. From both sets we calculated the mean and variance of the sampled data. We compared these values with the predictions as described below.

For the mean, we compared the means of the two data sets with the Taylor-series solution given by Eq. (2), and with the modified ratio (MR) solution given by Theorem 1 with and without the correction given by the Eq. (6).

For the variance, we compared the variances of the two data sets with the Taylor-series solution given by Eq. (4), and with the modified ratio solution given by Theorem 2 with and without the correction (the final formula for the corrected variance of the modified ratio was omitted due to its length, but see Section 4 for calculation details). Note that for the variance given by Theorem 2, we considered the case N=5 so that its error \(O(1/n^{6})\) would not interfere with the correction.
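The sampling procedure described above can be sketched in a few lines of Python. This is not the script from the supplementary material; it is a minimal standard-library illustration in which the function name `simulate`, the seed handling, and the use of `random.choices` to draw the n categorical trials are all assumptions made for this example.

```python
import random
from statistics import fmean, variance


def simulate(n, p, n_samples=10**5, seed=0):
    """Sample multinomial vectors; summarise Z0 = X1/X2 and Z1 = X1/(X2+1)."""
    rng = random.Random(seed)
    z0, z1, zeros = [], [], 0
    for _ in range(n_samples):
        # One multinomial vector = counts of n independent categorical trials.
        draws = rng.choices((0, 1, 2), weights=p, k=n)
        x1, x2 = draws.count(0), draws.count(1)
        z1.append(x1 / (x2 + 1))      # always defined
        if x2 == 0:
            zeros += 1                # counted, omitted from Z0, not resampled
        else:
            z0.append(x1 / x2)
    return {
        "zeros": zeros,
        "z0_mean": fmean(z0), "z0_var": variance(z0),
        "z1_mean": fmean(z1), "z1_var": variance(z1),
    }


if __name__ == "__main__":
    print(simulate(20, (0.25, 0.5, 0.25), n_samples=20000))
```

The sample statistics can then be set against Eqs. (2) and (4) and Theorems 1 and 2 for the same (n,p1,p2,p3).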

Figure 1 shows the simulation results for the multinomial distribution given by (n=10,…,50,p1=0.25,p2=0.5,p3=0.25). The corrected modified ratio gives the best model of the mean and variance of Z0. Observe also that the uncorrected modified ratio is a very precise model of Z1.

Fig. 1

The simulation results based on the multinomial distribution given by (n,0.25,0.5,0.25), where n ranges from 10 to 50. The mean and variance of the original ratios Z0 (squares) as well as modified ratios Z1 (red circles) are compared with models: the Taylor-series model (solid line), the modified ratio model (dashed line), and the corrected modified ratio model (dash-dot line). Additionally, in the upper plot, there is also information about the variable zeros on the secondary right axis (dots). In this case, the modified ratio model outperforms the Taylor series model for Z0 data. Additionally, the uncorrected modified ratio model describes the Z1 data very well

In Fig. 2, when p2 and n are small, the discrepancy between the models and the data gets larger, although the corrected modified ratio still outperforms the Taylor-series approach. The uncorrected modified ratio is also a very good model of Z1.

Fig. 2

The simulation results based on the multinomial distribution given by (n,0.25,0.05,0.7), where n ranges from 120 to 300. For the use of symbols see Fig. 1. Again, the modified ratio model outperforms the Taylor-series model for Z0 data in this case, although the fit is not so close as in Fig. 1. The uncorrected modified ratio model still describes the Z1 data very well

Figures 3 and 4 further explore the limits of the presented models. In Fig. 3, we compared the performance of the variance models in three multinomial distributions (with decreasing value of p2) for various values of N from Theorem 2. Note that the minimal value of n for which Theorem 2 holds grows with N; therefore, the variance models start from different values of n. All models have difficulty describing the initial part of the variance curve of the simulated data. However, one should keep in mind that the formula in Theorem 2 is only asymptotic.

Fig. 3

The simulation results based on three multinomial distributions and various values of N from Theorem 2. Displayed are the results for variance. The simulation data for original ratios Z0 (squares) are compared with models: the Taylor-series model (solid line) and the corrected modified ratio models with N=1 (dashed line), N=3 (dots), N=5 (dash-dot line). Observe that because of the condition on n in Theorem 2, the modified ratio models do not start at the same value of n for different N

Fig. 4

The results for mean for the data from Fig. 3. The simulation data for original ratios Z0 (squares) are compared with models: the Taylor-series model (solid line) and corrected modified ratio model (dashed line). Observe the uncorrected modified ratio model (dash-dot line) which exactly models the modified ratios Z1 (red circles) in all cases

In Fig. 4, we compared the models for mean on the same data as in Fig. 3. Again, for small values of n, the models fail to capture the real trend of the data. On a side note, the data for Z1 are very well described by the uncorrected modified ratio model from Theorem 1.

The supplemental material contains a script (Additional file 1) to generate similar plots for a user-specified multinomial distribution (n,p1,p2,p3) and a range of n. Given the results from the simulation data, we encourage the reader to use this script and check whether the formulae presented in the paper provide a good approximation of Z0 for their particular multinomial distribution.

Appendix

Proof of Theorem 1

Lemma 1

Let \(n\in \mathbb {N}\) and \(R\in \mathbb {R}\). Then it holds

$$\sum_{k=0}^{n}{\left({n \atop k}\right)R^{k}k} = nR\left(1+R\right)^{n-1}. $$

Proof

From \(\left ({n \atop k}\right)=\frac {n}{k}\left ({n-1 \atop k-1}\right)\) it directly follows that

$$\sum_{k=0}^{n}{\left({n \atop k}\right)R^{k}k} =nR\sum_{k=0}^{n-1}{\left({n-1 \atop k}\right) R^{k}} =nR(1+R)^{n-1}. $$

Proof of Theorem 1

From the definition of the expected value we have

$$E(Z_{1}) = \sum_{d\in S_{Z_{1}}} pr(Z_{1} = d) \cdot d, $$

where \(S_{Z_{1}}\) is a sample space of Z1. By using

$$pr(Z_{1} = d) = \sum_{b=0}^{n}\sum_{\substack{a=0\\a/(b+1)=d}}^{n-b}\left({n \atop b}\right)\left({n-b \atop a}\right)p_{1}^{a}~ p_{2}^{b}~ (1-p_{1}-p_{2})^{n-a-b} $$

from Section 3.2, we can write

$$E(Z_{1}) = \sum_{d \in S_{Z_{1}}} \left(\sum_{b=0}^{n}\sum_{\substack{a=0\\a/(b+1)=d}}^{n-b}\left({n \atop b}\right)\left({n-b \atop a}\right)p_{1}^{a} p_{2}^{b} (1-p_{1}-p_{2})^{n-a-b}\right) d. $$

Furthermore, because \(\sum _{b=0}^{n}\sum _{a=0}^{n-b}\) enumerates all possible values of a random vector (X1,X2,X3)=(a,b,n−a−b) for the given n, it also enumerates all values of Z1 including their multiplicities (see Section 3.2). Thus, we can simplify the expression of E(Z1) into

$$E(Z_{1}) = \sum_{b=0}^{n}\sum_{a=0}^{n-b}\left({n \atop b}\right)\left({n-b \atop a}\right)p_{1}^{a} p_{2}^{b} (1-p_{1}-p_{2})^{n-a-b}\frac{a}{b+1}. $$

We rewrite this expression to separate the sums, thus obtaining

$$\begin{array}{*{20}l} E(Z_{1})=(1-p_{1}-p_{2})^{n} &\sum_{b=0}^{n}{\left({n \atop b}\right)\left(\frac{p_{2}}{1-p_{1}-p_{2}}\right)^{b}\frac{1}{b+1}}\cdot \\ \cdot&\sum_{a=0}^{n-b}{\left({n-b \atop a}\right)\left(\frac{p_{1}}{1-p_{1}-p_{2}}\right)^{a} a}. \end{array} $$
(8)

Using Lemma 1, we have for (8)

$$\begin{array}{*{20}l} \sum_{a=0}^{n-b}{\left({n-b \atop a}\right)\left(\frac{p_{1}}{1-p_{1}-p_{2}}\right)^{a} a} =(n-b)\frac{p_{1}}{1-p_{1}-p_{2}}\left(\frac{1-p_{2}}{1-p_{1}-p_{2}}\right)^{n-b-1}. \end{array} $$

By putting this back to E(Z1) and after some rearrangement of the terms, we get

$$\begin{array}{*{20}l} E(Z_{1}) = (1-p_{2})^{n} \left(\frac{p_{1}}{1-p_{2}}\right) \sum_{b=0}^{n}\left({n \atop b}\right)\left(\frac{p_{2}}{1-p_{2}}\right)^{b}\frac{n-b}{b+1}. \end{array} $$
(9)

We continue by splitting the following fraction into two terms

$$\frac{n-b}{b+1} = \frac{n+1}{b+1} - 1. $$

By this, the sum in (9) splits into two parts

$$E(Z_{1}) = A + B, $$

where

$$\begin{array}{*{20}l} A&=(1-p_{2})^{n} \left(\frac{p_{1}}{1-p_{2}}\right) \sum_{b=0}^{n}\left({n \atop b}\right)\left(\frac{p_{2}}{1-p_{2}}\right)^{b}\frac{n+1}{b+1},\\ B&=(1-p_{2})^{n} \left(\frac{p_{1}}{1-p_{2}}\right) \sum_{b=0}^{n}\left({n \atop b}\right)\left(\frac{p_{2}}{1-p_{2}}\right)^{b} (-1). \end{array} $$

With \(\left ({n \atop b}\right)\frac {n+1}{b+1}=\left ({n+1 \atop b+1}\right)\) and some rearrangement of the terms, we obtain

$$A=\frac{p_{1}}{p_{2}}\left(\frac{1}{1-p_{2}} - (1-p_{2})^{n}\right), $$

and a straightforward calculation of B yields

$$B=-\frac{p_{1}}{1-p_{2}}. $$

Finally, after putting A and B together, we get

$$E(Z_{1}) = A+B = \frac{p_{1}}{p_{2}} - \frac{p_{1}}{p_{2}}(1-p_{2})^{n} = \frac{p_{1}}{p_{2}}\left(1 - (1-p_{2})^{n}\right). $$
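For readers who wish to follow the rearrangement behind A and B, the two sums can be evaluated with the binomial theorem. Writing \(R=p_{2}/(1-p_{2})\), so that \(1+R=1/(1-p_{2})\), a short worked version of these steps is

$$\sum_{b=0}^{n}\left({n \atop b}\right)R^{b}\frac{n+1}{b+1} = \sum_{b=0}^{n}\left({n+1 \atop b+1}\right)R^{b} = \frac{1}{R}\sum_{j=1}^{n+1}\left({n+1 \atop j}\right)R^{j} = \frac{(1+R)^{n+1}-1}{R}, $$

and therefore

$$A = (1-p_{2})^{n}\,\frac{p_{1}}{1-p_{2}}\cdot\frac{1-p_{2}}{p_{2}}\left[(1-p_{2})^{-(n+1)} - 1\right] = \frac{p_{1}}{p_{2}}\left(\frac{1}{1-p_{2}} - (1-p_{2})^{n}\right), $$

while for B the sum is simply \((1+R)^{n}=(1-p_{2})^{-n}\), which gives

$$B = -(1-p_{2})^{n}\,\frac{p_{1}}{1-p_{2}}\,(1-p_{2})^{-n} = -\frac{p_{1}}{1-p_{2}}. $$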

Proof of Theorem 2

The proof of Theorem 2 relies on a series of lemmas and corollaries. For easier navigation through the proof, see Fig. 5 for the proof scheme.

Fig. 5

Scheme of the proof of Theorem 2

Lemma 2

Let \(n\in \mathbb {N}\) and \(R\in \mathbb {R}\). Then it holds

$$\sum_{k=0}^{n}{\left({n \atop k}\right)R^{k}k^{2}} = n(n-1)R^{2}(1+R)^{n-2} + nR(1+R)^{n-1}. $$

Proof

From \(\left ({n \atop k}\right)=\frac {n}{k}\left ({n-1 \atop k-1}\right)\) and Lemma 1 it follows that

$$\begin{array}{*{20}l} \sum_{k=0}^{n}{\left({n \atop k}\right)R^{k}k^{2}} &= nR\sum_{k=0}^{n-1}{\left({n-1 \atop k}\right)R^{k}(k+1)} \\ &= nR\sum_{k=0}^{n-1}{\left({n-1 \atop k}\right)R^{k}k} + nR\sum_{k=0}^{n-1}{\left({n-1 \atop k}\right)R^{k}}\\ &= n(n-1)R^{2}(1+R)^{n-2} + nR(1+R)^{n-1}. \end{array} $$
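Lemmas 1 and 2 are finite identities, so they can be checked exactly for small n with rational arithmetic. The following Python sketch does this; the function names are illustrative, and the check is restricted to n≥2 so that no negative exponents appear.

```python
from fractions import Fraction
from math import comb


def weighted_sum(n, R, power):
    """Left-hand side: sum over k of C(n,k) * R^k * k^power."""
    return sum(comb(n, k) * R**k * k**power for k in range(n + 1))


def lemma1_rhs(n, R):
    """Right-hand side of Lemma 1."""
    return n * R * (1 + R) ** (n - 1)


def lemma2_rhs(n, R):
    """Right-hand side of Lemma 2."""
    return n * (n - 1) * R**2 * (1 + R) ** (n - 2) + n * R * (1 + R) ** (n - 1)


if __name__ == "__main__":
    R = Fraction(1, 3)
    for n in range(2, 6):
        print(n, weighted_sum(n, R, 1) == lemma1_rhs(n, R),
              weighted_sum(n, R, 2) == lemma2_rhs(n, R))
```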

Lemma 3

Let \(n\in \mathbb {N}\) and \(R\in \mathbb {R}\backslash \{0\}\). Then, for any \(N\in \mathbb {N}\) it holds

$$\sum_{b=1}^{n} \left({n \atop b}\right)\frac{R^{b}}{b} = \sum_{k=0}^{N} \left(A_{2k} - B_{2k}\right) + A_{2N+1}, $$

where

$$\begin{array}{*{20}l} A_{2k} &= \left(\prod_{i=1}^{k+1}\frac{1}{n+i}\right) \frac{k!} {R^{k+1}}\left(1+R\right)^{n+k+1},\\ B_{2k} &= \left(\prod_{i=1}^{k+1}\frac{1}{n+i}\right) \frac{k!} {R^{k+1}}\sum_{b=0}^{k+1}\left({n+k+1 \atop b}\right)R^{b},\\ A_{2k+1} &= \left(\prod_{i=1}^{k+1}\frac{1}{n+i}\right) \frac{(k+1)!}{R^{k+1}}\sum_{b=k+2}^{n+k+1}\left({n+k+1 \atop b}\right)\frac{R^{b}}{b - (k+1)}. \end{array} $$

Proof

By induction on N. Let N=0. Then, it follows

$$\sum_{b=1}^{n} \left({n \atop b}\right)\frac{R^{b}}{b} = \sum_{b=1}^{n} \left({n \atop b}\right)\frac{R^{b}}{b+1}\left(1 + \frac{1}{b}\right) = \sum_{b=1}^{n} \left({n \atop b}\right)\frac{R^{b}}{b+1} + \sum_{b=1}^{n} \left({n \atop b}\right)\frac{R^{b}}{b(b+1)}. $$

By using \(\frac {n+1}{k+1}\left ({n \atop k}\right)=\left ({n+1 \atop k+1}\right)\) and the binomial theorem, we can write

$$\begin{array}{*{20}l} \sum_{b=1}^{n} \left({n \atop b}\right)\frac{R^{b}}{b} &= \frac{1}{n+1}\frac{1}{R}\sum_{b=1}^{n} \left({n+1 \atop b+1}\right)R^{b+1} + \frac{1}{n+1}\frac{1}{R}\sum_{b=1}^{n} \left({n+1 \atop b+1}\right)\frac{R^{b+1}}{(b+1) - 1} \\ &=\frac{1}{n+1}\frac{1}{R}\sum_{b=2}^{n+1} \left({n+1 \atop b}\right)R^{b} + \frac{1}{n+1}\frac{1}{R}\sum_{b=2}^{n+1} \left({n+1 \atop b}\right)\frac{R^{b}}{b - 1} \\ &=A_{0} - B_{0} + A_{1}. \end{array} $$

The base of the induction holds. Assume that the lemma holds up to some natural N. We prove that it holds for N+1 as well. Consider the term A2N+1. We have

$$\begin{array}{*{20}l} A_{2N+1} &= \left(\prod_{i=1}^{N+1}\frac{1}{n+i}\right) \frac{(N+1)!}{R^{N+1}}\sum_{b=N+2}^{n+N+1}\left({n+N+1\atop b}\right)\frac{R^{b}}{b+1}\left(1 + \frac{N+2}{b-(N+1)}\right) \\ &= X_{1} + X_{2}, \end{array} $$

where

$$\begin{array}{*{20}l} X_{1} &= \left(\prod_{i=1}^{N+1}\frac{1}{n+i}\right) \frac{(N+1)!}{R^{N+1}}\sum_{b=N+2}^{n+N+1}\left({n+N+1 \atop b}\right)\frac{R^{b}}{b+1},\\ X_{2} &= \left(\prod_{i=1}^{N+1}\frac{1}{n+i}\right) \frac{(N+1)!}{R^{N+1}}\sum_{b=N+2}^{n+N+1}\left({n+N+1 \atop b}\right)\frac{R^{b}}{b+1}\frac{N+2}{b-(N+1)}. \end{array} $$

Furthermore, by the same trick with the binomial coefficient as above, we rewrite the terms X1 and X2 as

$$\begin{array}{*{20}l} {}X_{1} &= \left(\prod_{i=1}^{N+1}\frac{1}{n+i}\right) \frac{(N+1)!}{R^{N+1}} \frac{1}{n+N+2}\frac{1}{R} \sum_{b=N+2}^{n+N+1}\left({n+N+2 \atop b+1}\right)R^{b+1},\\ {}X_{2} &= \left(\prod_{i=1}^{N+1}\frac{1}{n+i}\right) \frac{(N+1)!}{R^{N+1}} \frac{1}{n+N+2}\frac{1}{R} \sum_{b=N+2}^{n+N+1}\left({n+N+2 \atop b+1}\right)\frac{R^{b+1}(N+2)}{(b+1) - 1 -(N+1)}. \end{array} $$

After some rearrangement, we finally get (again using the binomial theorem)

$$\begin{array}{*{20}l} X_{1} &= \left(\prod_{i=1}^{N+2}\frac{1}{n+i}\right) \frac{(N+1)!}{R^{N+2}} \sum_{b=N+3}^{n+N+2}\left({n+N+2 \atop b}\right)R^{b} = A_{2(N+1)} - B_{2(N+1)},\\ X_{2} &= \left(\prod_{i=1}^{N+2}\frac{1}{n+i}\right) \frac{(N+2)!}{R^{N+2}} \sum_{b=N+3}^{n+N+2}\left({n+N+2 \atop b}\right)\frac{R^{b}}{b -(N+2)} = A_{2(N+1) + 1}. \end{array} $$
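Because Lemma 3 is an exact identity for every N, it can be verified for small cases with rational arithmetic. The Python sketch below implements the terms \(A_{2k}\), \(B_{2k}\) and \(A_{2k+1}\) as defined in the lemma; the function names are hypothetical and R is expected to be a `Fraction`.

```python
from fractions import Fraction
from math import comb, factorial


def prefactor(n, k):
    """Product of 1/(n+i) for i = 1..k+1."""
    out = Fraction(1)
    for i in range(1, k + 2):
        out /= n + i
    return out


def a_even(n, k, R):
    """A_{2k} from Lemma 3."""
    return prefactor(n, k) * factorial(k) / R ** (k + 1) * (1 + R) ** (n + k + 1)


def b_even(n, k, R):
    """B_{2k} from Lemma 3."""
    s = sum(comb(n + k + 1, b) * R**b for b in range(k + 2))
    return prefactor(n, k) * factorial(k) / R ** (k + 1) * s


def a_odd(n, k, R):
    """A_{2k+1} from Lemma 3."""
    s = sum(comb(n + k + 1, b) * R**b / (b - (k + 1))
            for b in range(k + 2, n + k + 2))
    return prefactor(n, k) * factorial(k + 1) / R ** (k + 1) * s


def lemma3_lhs(n, R):
    return sum(comb(n, b) * R**b / b for b in range(1, n + 1))


def lemma3_rhs(n, N, R):
    even_part = sum(a_even(n, k, R) - b_even(n, k, R) for k in range(N + 1))
    return even_part + a_odd(n, N, R)


if __name__ == "__main__":
    R = Fraction(1, 2)
    print(lemma3_lhs(5, R), lemma3_rhs(5, 2, R))
```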

Remark 1

We will often use Lemma 3 with n+1 instead of n. Therefore, we restate Lemma 3 with this change. Let \(n\in \mathbb {N}\) and \(R\in \mathbb {R}\backslash \{0\}\). Then, for any \(N\in \mathbb {N}\) it holds

$$\sum_{b=1}^{n+1} \left({n+1 \atop b}\right)\frac{R^{b}}{b} = \sum_{k=0}^{N}\left(A_{2k} - B_{2k}\right) + A_{2N+1}, $$

where

$$\begin{array}{*{20}l} A_{2k} &= \left(\prod_{i=2}^{k+2}\frac{1}{n+i}\right) \frac{k!} {R^{k+1}}\left(1 + R\right)^{n+k+2},\\ B_{2k} &= \left(\prod_{i=2}^{k+2}\frac{1}{n+i}\right) \frac{k!} {R^{k+1}}\sum_{b=0}^{k+1}\left({n+k+2 \atop b}\right)R^{b},\\ A_{2k+1} &= \left(\prod_{i=2}^{k+2}\frac{1}{n+i}\right) \frac{(k+1)!}{R^{k+1}}\sum_{b=k+2}^{n+k+2}\left({n+k+2 \atop b}\right)\frac{R^{b}}{b - (k+1)}. \end{array} $$

Lemma 4

Let p1,p2∈(0,1) be some real constants. Let k,n be some non-zero natural numbers. Let A2k+1 be the term from Remark 1. Furthermore, let R=p2/(1−p2), and let

$$\begin{array}{*{20}l} A &= (n+1)n\left(\frac{p_{1}}{1-p_{2}}\right)^{2} + (n+1)\frac{p_{1}}{1-p_{2}},\\ D &= \frac{(1-p_{2})^{n}}{n+1}\frac{1-p_{2}}{p_{2}}. \end{array} $$

Then, for α∈[1,k+2], it holds

$$ADA_{2k+1} \leq \alpha\frac{n}{(k+2)\left({n+k+3 \atop k+2}\right)}\frac{p_{1}}{p_{2}^{k+3}(1-p_{2})}\left(\frac{p_{1}}{1-p_{2}} + \frac{1}{n}\right) = O\left(\frac{1}{n^{k+1}}\right). $$

Proof

First of all, for α∈[1,k+2] we have

$$A_{2k+1} = \alpha \left(\prod_{i=2}^{k+3}\frac{1}{n+i}\right) \frac{(k+1)!}{R^{k+2}} \sum_{b=k+3}^{n+k+3}\left({n+k+3 \atop b}\right)R^{b}. $$

This follows easily by applying the inequality

$$\frac{k+2}{b+1} \geq \frac{1}{b - (k+1)} \geq \frac{1}{b+1} $$

to the term A2k+1 from Remark 1, which holds for any natural b,k except for pairs b=k+1 (in our case b>k+1). We can see this by solving the inequality

$$\frac{1+x}{b+1}\geq\frac{1}{b - (k+1)} $$

for x. By this, we get an upper and lower bound on the term A2k+1, which differ by a multiplicative constant k+2. Finally, the lemma follows by extending the summation through index b in the term A2k+1 to a full range from 0 to n+k+3, by applying the binomial theorem and some simple rearrangement of the terms. The O bound follows from the fact that \(\left ({n \atop k}\right)\geq \left (\frac {n}{k}\right)^{k}\). □

Lemma 5

Let p1,p2∈(0,1) be some real constants. Let k,n be some non-zero natural numbers. Let A2k be the term from Remark 1. Furthermore, let R=p2/(1−p2), and let

$$\begin{array}{*{20}l} A &= (n+1)n\left(\frac{p_{1}}{1-p_{2}}\right)^{2} + (n+1)\frac{p_{1}}{1-p_{2}},\\ D &= \frac{(1-p_{2})^{n}}{n+1}\frac{1-p_{2}}{p_{2}}. \end{array} $$

Then, it holds

$$ADA_{2k} = \frac{\left(\frac{p_{1}}{p_{2}(1-p_{2})}\right)^{2}}{\left({n+k+1 \atop k}\right) p_{2}^{k}}\left(1 - \frac{k+2 - \frac{1-p_{2}}{p_{1}}}{n+k+2}\right). $$

Proof

The lemma follows easily by a straightforward multiplication of the terms A, D and A2k, and some rearrangement of the terms. □
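The multiplication in Lemma 5 can also be confirmed with exact rational arithmetic. In the Python sketch below, `ada_direct` multiplies A, D and the term \(A_{2k}\) from Remark 1, and `ada_closed` evaluates the closed form stated in the lemma; both names are illustrative and the probabilities are expected to be `Fraction` objects.

```python
from fractions import Fraction
from math import comb, factorial


def ada_direct(n, k, p1, p2):
    """Product A * D * A_{2k}, with A_{2k} taken from Remark 1."""
    R = p2 / (1 - p2)
    q = p1 / (1 - p2)
    A = (n + 1) * n * q**2 + (n + 1) * q
    D = (1 - p2) ** n / (n + 1) * (1 - p2) / p2
    pref = Fraction(1)
    for i in range(2, k + 3):         # product of 1/(n+i), i = 2..k+2
        pref /= n + i
    a2k = pref * factorial(k) / R ** (k + 1) * (1 + R) ** (n + k + 2)
    return A * D * a2k


def ada_closed(n, k, p1, p2):
    """Closed form from Lemma 5."""
    c = p1 / (p2 * (1 - p2))
    bracket = 1 - (k + 2 - (1 - p2) / p1) / (n + k + 2)
    return c**2 / (comb(n + k + 1, k) * p2**k) * bracket


if __name__ == "__main__":
    p1, p2 = Fraction(1, 4), Fraction(1, 2)
    print(ada_direct(7, 2, p1, p2), ada_closed(7, 2, p1, p2))
```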

The following lemma is an extension of one borrowed from Graham et al. (1994).

Lemma 6

Let 0<α<R/(1+R) for some real R>0. Then, it holds

$$\sum_{k\leq\alpha n} \left({n \atop k}\right) R^{k} = R^{m} 2^{nH(\alpha) - \frac{1}{2}\lg n + O(1)}, $$

where m=⌊αn⌋ and

$$H(\alpha) = \alpha\lg\frac{1}{\alpha} + (1-\alpha)\lg\frac{1}{1-\alpha}. $$

Proof

First of all, we have

$$\frac{\left({n \atop k-1}\right)R^{k-1}}{\left({n \atop k}\right)R^{k}} = \frac{k}{n-k+1}\frac{1}{R} \leq \frac{\alpha n}{n-\alpha n + 1}\frac{1}{R}< \frac{\alpha}{1-\alpha}\frac{1}{R}. $$

Let m=⌊αn⌋=αn−ε, where 0≤ε<1. It holds

$$\begin{array}{*{20}l} \left({n \atop m}\right)R^{m} < \sum_{k\leq m}\left({n \atop k}\right)R^{k} &< \left({n \atop m}\right)R^{m}\left(1 + \frac{\alpha}{1 - \alpha}\frac{1}{R} + \left(\frac{\alpha}{1-\alpha}\frac{1}{R}\right)^{2} + \ldots\right) \\ &= \left({n \atop m}\right)R^{m} \frac{(1-\alpha)R}{(1-\alpha)R - \alpha} \end{array} $$

because

$$\frac{\alpha}{1-\alpha}\frac{1}{R}<1, $$

which follows from α<R/(1+R). Thus,

$$\sum_{k\leq m}\left({n \atop k}\right)R^{k} = \left({n \atop m}\right)R^{m} O(1). $$

By Stirling’s approximation, we have

$$\begin{array}{*{20}l} {}\log \left({n \atop m}\right) &= -\frac{1}{2}\log n - (\alpha n - \epsilon)\log\left(\alpha - \frac{\epsilon}{n}\right) - \left((1-\alpha)n + \epsilon\right)\log\left(1-\alpha + \frac{\epsilon}{n}\right) + O(1) \\ &= -\frac{1}{2}\log n - n \alpha \log \alpha - n(1-\alpha)\log(1- \alpha) + O(1), \end{array} $$

and the lemma follows. □
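The two-sided bound on the partial sum used in the proof of Lemma 6 can be illustrated numerically. The following Python sketch evaluates the largest term, the partial sum, and the geometric-series upper bound exactly; the function name is hypothetical, and α and R are assumed to be `Fraction` objects with α<R/(1+R) and αn≥1.

```python
from fractions import Fraction
from math import comb, floor


def partial_sum_bounds(n, alpha, R):
    """Return (lower, partial_sum, upper) from the proof of Lemma 6."""
    if not alpha < R / (1 + R):
        raise ValueError("alpha must be smaller than R/(1+R)")
    m = floor(alpha * n)
    last = comb(n, m) * R**m                         # term with k = m
    total = sum(comb(n, k) * R**k for k in range(m + 1))
    upper = last * (1 - alpha) * R / ((1 - alpha) * R - alpha)
    return last, total, upper


if __name__ == "__main__":
    lower, total, upper = partial_sum_bounds(100, Fraction(1, 5), Fraction(1))
    print(float(total / lower), float(upper / lower))
```

The printed ratios show how tightly the partial sum is pinned to its last term, which is the O(1) factor in the statement of the lemma.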

Lemma 7

Let p1,p2∈(0,1) be some real constants. Let k,n be some non-zero natural numbers such that

$$n>\frac{1-p_{2}}{p_{2}}k + \frac{1-2p_{2}}{p_{2}}. $$

Let B2k be the term from Remark 1. Furthermore, let R=p2/(1−p2), and let

$$\begin{array}{*{20}l} A &= (n+1)n\left(\frac{p_{1}}{1-p_{2}}\right)^{2} + (n+1)\frac{p_{1}}{1-p_{2}},\\ D &= \frac{(1-p_{2})^{n}}{n+1}\frac{1-p_{2}}{p_{2}}. \end{array} $$

Then, it holds

$${}ADB_{2k} = n(1-p_{2})^{n} \frac{p_{1}}{p_{2}(1-p_{2})}\left(p_{1} + \frac{1-p_{2}}{n}\right)\frac{2^{k+1}O(1)}{(k+1)(n+k+2)^{\frac{1}{2}}} = O\left(n^{\frac{1}{2}}(1-p_{2})^{n}\right). $$

Proof

Let α=(k+1)/(n+k+2). One can easily verify that α<R/(1+R)=p2 because of the choice of n. Thus, we can apply Lemma 6 to the sum from the term B2k. From this, it follows that

$$ {}\sum_{b=0}^{k+1}\left({n+k+2 \atop b}\right)\left(\frac{p_{2}}{1-p_{2}}\right)^{b} = \left(\frac{p_{2}}{1-p_{2}}\right)^{k+1}2^{(n+k+2)H(\alpha) - \frac{1}{2}\lg (n+k+2) + O(1)}, $$
(10)

where

$$H(\alpha) = \alpha\lg\frac{1}{\alpha} + (1-\alpha)\lg\frac{1}{1-\alpha}. $$

Moreover, for H(α) we have

$$H(\alpha) = \frac{k+1}{n+k+2}\lg \left(\frac{2(n+k+2)}{k+1}\right) - O\left(\frac{1}{n^{2}}\right), $$

which follows from

$$\lg (1-\alpha) = -\sum_{i=1}^{\infty} \frac{\alpha^{i}}{i}. $$

Plugging this into (10), we get

$$\sum_{b=0}^{k+1}\left({n+k+2 \atop b}\right)\left(\frac{p_{2}}{1-p_{2}}\right)^{b} = \left(\frac{p_{2}}{1-p_{2}}\right)^{k+1} \frac{\left(\frac{2(n+k+2)}{k+1}\right)^{k+1}O(1)}{(n+k+2)^{\frac{1}{2}}}. $$

With this, we can write for the whole B2k term from Remark 1

$$ {}B_{2k} = \frac{\left(\frac{2(n+k+2)}{k+1}\right)^{k+1}O(1)}{\left({n+k+1 \atop k}\right)(n+k+2)^{\frac{3}{2}}} \leq \frac{2^{k+1}\left({n+k+2 \atop k+1}\right)O(1)}{\left({n+k+1 \atop k}\right)(n+k+2)^{\frac{3}{2}}} = \frac{2^{k+1}O(1)}{(k+1)(n+k+2)^{\frac{1}{2}}} $$
(11)

because \(\left (\frac {n}{k}\right)^{k}\leq \left ({n \atop k}\right)\). Similarly, with \(\left ({n \atop k}\right)<\left (\frac {ne}{k}\right)^{k}\), we have for B2k

$$B_{2k} \geq \frac{\left(\frac{2}{e}\right)^{k+1}O(1)}{(k+1)(n+k+2)^{\frac{1}{2}}}, $$

if we use

$$\left({n+k+1 \atop k}\right)(n+k+2)^{\frac{3}{2}}=\left({n+k+2 \atop k+1}\right)(k+1)(n+k+2)^{\frac{1}{2}}. $$

Thus, we have

$$B_{2k} = \frac{2^{k+1}O(1)}{(k+1)(n+k+2)^{\frac{1}{2}}}, $$

and the lemma easily follows by multiplying B2k with the term AD. □
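Two elementary facts used in Lemma 7 can be verified with exact arithmetic: the condition on n is equivalent to α=(k+1)/(n+k+2)<p2, and the product AD has the closed form that appears as the prefactor in the statement. The Python sketch below does this for one parameter choice; all function names and the sample values of p1, p2 and k are hypothetical.

```python
from fractions import Fraction
from math import floor

def min_admissible_n(k, p2):
    """Smallest integer n >= 1 with n > (1-p2)/p2 * k + (1-2*p2)/p2."""
    bound = (1 - p2) / p2 * k + (1 - 2 * p2) / p2
    return max(1, floor(bound) + 1)

def alpha(n, k):
    """alpha = (k+1)/(n+k+2) as chosen in the proof of Lemma 7."""
    return Fraction(k + 1, n + k + 2)

def ad_product(n, p1, p2):
    """A*D computed directly from the definitions of A and D."""
    c = p1 / (1 - p2)
    A = (n + 1) * n * c**2 + (n + 1) * c
    D = (1 - p2)**n / (n + 1) * (1 - p2) / p2
    return A * D

def ad_closed_form(n, p1, p2):
    """n (1-p2)^n p1/(p2(1-p2)) (p1 + (1-p2)/n), the prefactor in Lemma 7."""
    return n * (1 - p2)**n * p1 / (p2 * (1 - p2)) * (p1 + (1 - p2) / n)

# Illustrative values only.
p1, p2, k = Fraction(1, 5), Fraction(3, 10), 3
n0 = min_admissible_n(k, p2)
print(n0, alpha(n0, k) < p2, alpha(n0 - 1, k) < p2)
print(ad_product(n0, p1, p2) == ad_closed_form(n0, p1, p2))
```

At the smallest admissible n the inequality α<p2 holds, and it fails one step below, so the condition on n is sharp for applying Lemma 6.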

Corollary 2

Let p1,p2∈(0,1) be some real constants. Let n,N be some non-zero natural numbers such that

$$n>\frac{1-p_{2}}{p_{2}}N + \frac{1-2p_{2}}{p_{2}}. $$

Let A2k,B2k, k=0,...,N, and A2N+1 be terms from Remark 1. Furthermore, let R=p2/(1−p2), and let

$$\begin{array}{*{20}l} A &= (n+1)n\left(\frac{p_{1}}{1-p_{2}}\right)^{2} + (n+1)\frac{p_{1}}{1-p_{2}},\\ D &= \frac{(1-p_{2})^{n}}{n+1}\frac{1-p_{2}}{p_{2}}. \end{array} $$

Then, it holds

$${}AD\sum_{b=1}^{n+1}\left({n+1 \atop b}\right)\left(\frac{p_{2}}{1-p_{2}}\right)^{b}\frac{1}{b} = \left(\frac{p_{1}}{p_{2}(1-p_{2})}\right)^{2}\sum_{k=0}^{N} \frac{1-\frac{k+2 - \frac{1-p_{2}}{p_{1}}}{n+k+2}}{\left({n+k+1 \atop k}\right)p_{2}^{k}} + O\left(\frac{1}{n^{N+1}}\right). $$

Proof

Follows from Lemmas 4, 5 and 7. □
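The approximation in Corollary 2 can be illustrated numerically by comparing the exact left-hand side with the truncated sum on the right-hand side for increasing N. The Python sketch below is only an illustration under assumed parameter values; the function names `s1_exact` and `s1_truncated` are hypothetical, and a single parameter choice does not replace the proof.

```python
from fractions import Fraction
from math import comb

def s1_exact(n, p1, p2):
    """Left-hand side of Corollary 2: A*D*sum_{b=1}^{n+1} C(n+1,b) R^b / b."""
    c = p1 / (1 - p2)
    R = p2 / (1 - p2)
    A = (n + 1) * n * c**2 + (n + 1) * c
    D = (1 - p2)**n / (n + 1) * (1 - p2) / p2
    return A * D * sum(comb(n + 1, b) * R**b / b for b in range(1, n + 2))

def s1_truncated(n, p1, p2, N):
    """Right-hand side of Corollary 2 without the O(1/n^(N+1)) remainder."""
    lead = (p1 / (p2 * (1 - p2)))**2
    terms = ((1 - (k + 2 - (1 - p2) / p1) / (n + k + 2))
             / (comb(n + k + 1, k) * p2**k) for k in range(N + 1))
    return lead * sum(terms)

# Illustrative values; n satisfies the condition of Corollary 2 for N <= 2.
p1, p2, n = Fraction(3, 10), Fraction(1, 2), 200
exact = s1_exact(n, p1, p2)
for N in (0, 1, 2):
    err = abs(exact - s1_truncated(n, p1, p2, N))
    print(N, float(err))
```

The printed absolute errors are expected to shrink as N grows, in line with the remainder term of the corollary.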

Lemma 8

Let p1,p2∈(0,1) be some real constants and n some non-zero natural number. Let

$$\begin{array}{*{20}l} B &= (2n+1)\left(\frac{p_{1}}{1-p_{2}}\right)^{2} + \frac{p_{1}}{1-p_{2}},\\ D &= \frac{(1-p_{2})^{n}}{n+1}\frac{1-p_{2}}{p_{2}}. \end{array} $$

Then, it holds

$$\begin{array}{*{20}l} {}BD\sum_{b=1}^{n+1}\left({n+1 \atop b}\right)\left(\frac{p_{2}}{1-p_{2}}\right)^{b} &= 2\left(\frac{p_{1}}{1-p_{2}}\right)^{2}\frac{1}{p_{2}} + \frac{1}{n+1}\frac{p_{1}}{p_{2}(1-p_{2})}\left(1 - \frac{p_{1}}{1-p_{2}}\right) \\ &+ O\left((1-p_{2})^{n}\right). \end{array} $$

Proof

Straightforward by binomial theorem. □
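For readers who want the intermediate steps, the omitted algebra can be sketched as follows, using only the definitions of B and D above and the abbreviation R=p2/(1−p2) from Lemma 7. Since 1+R=1/(1−p2), the binomial theorem yields

$$\sum_{b=1}^{n+1}\left({n+1 \atop b}\right)R^{b} = (1+R)^{n+1} - 1 = \frac{1}{(1-p_{2})^{n+1}} - 1, \qquad D\sum_{b=1}^{n+1}\left({n+1 \atop b}\right)R^{b} = \frac{1}{(n+1)p_{2}} - D. $$

Writing 2n+1=2(n+1)−1, we then get

$$\frac{B}{(n+1)p_{2}} = 2\left(\frac{p_{1}}{1-p_{2}}\right)^{2}\frac{1}{p_{2}} + \frac{1}{n+1}\frac{p_{1}}{p_{2}(1-p_{2})}\left(1 - \frac{p_{1}}{1-p_{2}}\right), $$

and the remaining term −BD is \(O\left((1-p_{2})^{n}\right)\) because B=O(n) and \(D=O\left((1-p_{2})^{n}/n\right)\).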

Lemma 9

Let p1,p2∈(0,1) be some real constants and n some non-zero natural number. Let

$$\begin{array}{*{20}l} C &= \left(\frac{p_{1}}{1-p_{2}}\right)^{2},\\ D &= \frac{(1-p_{2})^{n}}{n+1}\frac{1-p_{2}}{p_{2}}. \end{array} $$

Then, it holds

$$CD\sum_{b=1}^{n+1}\left({n+1 \atop b}\right)\left(\frac{p_{2}}{1-p_{2}}\right)^{b} b= \left(\frac{p_{1}}{1-p_{2}}\right)^{2}. $$

Proof

Straightforward by Lemma 1 and binomial theorem. □
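Because the identity in Lemma 9 is exact, it can be confirmed with rational arithmetic for any fixed parameters. The following Python sketch does so; the function name `lemma9_lhs` and the sample values of p1, p2 and n are hypothetical choices for illustration.

```python
from fractions import Fraction
from math import comb

def lemma9_lhs(n, p1, p2):
    """C*D*sum_{b=1}^{n+1} C(n+1,b) R^b b, with R = p2/(1-p2)."""
    R = p2 / (1 - p2)
    C = (p1 / (1 - p2))**2
    D = (1 - p2)**n / (n + 1) * (1 - p2) / p2
    return C * D * sum(comb(n + 1, b) * R**b * b for b in range(1, n + 2))

# Illustrative values only.
p1, p2 = Fraction(1, 5), Fraction(3, 10)
for n in (1, 5, 30):
    # Each line should report that both sides agree exactly.
    print(n, lemma9_lhs(n, p1, p2) == (p1 / (1 - p2))**2)
```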

Proof of Theorem 2

The variance of the random variable Z1 can be calculated as

$$var(Z_{1})=E(Z_{1}^{2})-E^{2}(Z_{1}). $$

By Theorem 1, we have

$$E(Z_{1})=\frac{p_{1}}{p_{2}}\left(1-(1-p_{2})^{n}\right). $$

So, we only need to determine the value of \(E\left (Z_{1}^{2}\right)\). From the definition of the expected value, we have

$${}E\left(Z_{1}^{2}\right) = \sum_{b=0}^{n}\sum_{a=0}^{n-b}\left({n \atop b}\right)\left({n-b \atop a}\right)p_{1}^{a} p_{2}^{b} (1-p_{1}-p_{2})^{n-a-b}\left(\frac{a}{b+1}\right)^{2} =(1-p_{1}-p_{2})^{n} V_{1} V_{2}, $$

where

$$\begin{array}{*{20}l} V_{1} &= \sum_{b=0}^{n}{\left({n \atop b}\right)\left(\frac{p_{2}}{1-p_{1}-p_{2}}\right)^{b}\left(\frac{1}{b+1}\right)^{2}},\\ V_{2} &= \sum_{a=0}^{n-b}{\left({n-b \atop a}\right)\left(\frac{p_{1}}{1-p_{1}-p_{2}}\right)^{a} a^{2}}. \end{array} $$

By application of Lemma 2 to V2, we obtain

$$\begin{array}{*{20}l} E(Z_{1}^{2}) &= (1-p_{2})^{n} \sum_{b=0}^{n}{\left({n \atop b}\right)\left(\frac{p_{2}}{1-p_{2}}\right)^{b}\left(\frac{1}{b+1}\right)^{2}} W,\\ W&=(n-b)(n-b-1)\left(\frac{p_{1}}{1-p_{2}}\right)^{2} + (n-b)\frac{p_{1}}{1-p_{2}}. \end{array} $$

By using the equality

$$\left({n \atop b}\right)\left(\frac{1}{b+1}\right)^{2} = \left({n+1 \atop b+1}\right)\frac{1}{n+1}\frac{1}{b+1} $$

and adjustment of the summation borders, we get

$$\begin{array}{*{20}l} E\left(Z_{1}^{2}\right) &= \frac{(1-p_{2})^{n}}{n+1}\cdot\frac{1-p_{2}}{p_{2}} \cdot\sum_{b=1}^{n+1} {\left({n+1 \atop b}\right)\left(\frac{p_{2}}{1-p_{2}}\right)^{b}\frac{1}{b}} W,\\ W&=(n-b+1)(n-b)\left(\frac{p_{1}}{1-p_{2}}\right)^{2} + (n-b+1)\frac{p_{1}}{1-p_{2}}. \end{array} $$

Next, we split the term W according to powers of b, thus obtaining

$$W = A - Bb + Cb^{2}, $$

where

$$\begin{array}{*{20}l} A&= (n+1)n\left(\frac{p_{1}}{1-p_{2}}\right)^{2} + (n+1)\frac{p_{1}}{1-p_{2}},\\ B&= (2n+1)\left(\frac{p_{1}}{1-p_{2}}\right)^{2} + \frac{p_{1}}{1-p_{2}},\\ C&= \left(\frac{p_{1}}{1-p_{2}}\right)^{2}. \end{array} $$

If we set

$$D=\frac{(1-p_{2})^{n}}{n+1}\cdot\frac{1-p_{2}}{p_{2}}, $$

then we can write

$$E\left(Z_{1}^{2}\right) = D\sum_{b=1}^{n+1} \left({n+1 \atop b}\right) \left(\frac{p_{2}}{1-p_{2}}\right)^{b}\left(\frac{A}{b} - B + Cb\right) = S_{1} + S_{2} + S_{3}, $$

where

$$\begin{array}{*{20}l} S_{1} &= AD\sum_{b=1}^{n+1} \left({n+1 \atop b}\right) \left(\frac{p_{2}}{1-p_{2}}\right)^{b}\frac{1}{b},\\ S_{2} &= -BD\sum_{b=1}^{n+1} \left({n+1 \atop b}\right) \left(\frac{p_{2}}{1-p_{2}}\right)^{b},\\ S_{3} &= CD\sum_{b=1}^{n+1} \left({n+1 \atop b}\right) \left(\frac{p_{2}}{1-p_{2}}\right)^{b} b, \end{array} $$

and by Corollary 2 (S1) and Lemmas 8 (S2) and 9 (S3) we get

$$\begin{array}{*{20}l} E(Z_{1}^{2}) =&\ \sum_{k=0}^{N} \left(\frac{p_{1}}{p_{2}(1-p_{2})}\right)^{2} \frac{1}{\left({n+k+1 \atop k}\right)p_{2}^{k}}\left(1-\frac{k+2 - \frac{1-p_{2}}{p_{1}}}{n+k+2}\right) -\\ &- 2\left(\frac{p_{1}}{1-p_{2}}\right)^{2}\frac{1}{p_{2}} - \frac{1}{n+1}\frac{p_{1}}{p_{2}(1-p_{2})}\left(1 - \frac{p_{1}}{1-p_{2}}\right) \\ &+ \left(\frac{p_{1}}{1-p_{2}}\right)^{2} + O\left(\frac{1}{n^{N+1}}\right). \end{array} $$

The rest of the proof follows from adding the term \(-E^{2}(Z_{1})\) to the derived expression for \(E\left (Z_{1}^{2}\right)\), separating the term for k=0 from the rest of the sum, and simple rearrangement of the resulting terms. □
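The exact part of this derivation, that is, the reduction of the double sum for \(E\left (Z_{1}^{2}\right)\) to a single sum with the terms A, B, C and D, can be cross-checked by brute force for small n. The Python sketch below assumes Z1=a/(b+1) with (a,b) following the trinomial distribution from the definition of the expected value above; the function names and parameter values are hypothetical.

```python
from fractions import Fraction
from math import comb

def moment_bruteforce(n, p1, p2, power):
    """E(Z1^power) for Z1 = a/(b+1), summed over the trinomial distribution."""
    p0 = 1 - p1 - p2
    return sum(comb(n, b) * comb(n - b, a)
               * p1**a * p2**b * p0**(n - a - b)
               * Fraction(a, b + 1)**power
               for b in range(n + 1) for a in range(n - b + 1))

def second_moment_single_sum(n, p1, p2):
    """D * sum_{b=1}^{n+1} C(n+1,b) R^b (A/b - B + C*b), as in the proof."""
    c = p1 / (1 - p2)
    R = p2 / (1 - p2)
    A = (n + 1) * n * c**2 + (n + 1) * c
    B = (2 * n + 1) * c**2 + c
    C = c**2
    D = (1 - p2)**n / (n + 1) * (1 - p2) / p2
    return D * sum(comb(n + 1, b) * R**b * (A / b - B + C * b)
                   for b in range(1, n + 2))

def mean_theorem1(n, p1, p2):
    """E(Z1) = p1/p2 * (1 - (1-p2)^n), as quoted from Theorem 1."""
    return p1 / p2 * (1 - (1 - p2)**n)

# Illustrative values only.
p1, p2, n = Fraction(1, 5), Fraction(3, 10), 8
m2 = second_moment_single_sum(n, p1, p2)
print(m2 == moment_bruteforce(n, p1, p2, 2))
print(mean_theorem1(n, p1, p2) == moment_bruteforce(n, p1, p2, 1))
print(float(m2 - mean_theorem1(n, p1, p2)**2))  # exact variance of Z1
```

Such an exact value of var(Z1) can serve as a reference when judging the accuracy of the truncated expansion for a given N.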