A modification of Chao’s lower bound estimator in the case of one-inflation

For zero-truncated count data, as they typically arise in capture-recapture modelling, the nonparametric lower bound estimator of Chao is a frequently used estimator of population size. It is a simple estimator involving only counts of one and counts of two. The estimator is asymptotically unbiased if the count distribution is a member of the power series family, and it provides a lower bound if the distribution is a mixture of members of the power series family. However, if there is one-inflation, Chao’s estimator can severely overestimate, as we show here. This is also illustrated by routinely collected country-wide data on family violence in the Netherlands. A new lower bound estimator is developed which involves only counts of twos and threes, thus avoiding the overestimation caused by one-inflation. We show that the new estimator is asymptotically unbiased for a power series distribution with and without one-inflation and provides a lower bound under a mixture of power series distributions with and without one-inflation. For all estimators, bias-adjusted versions are developed that reduce the bias considerably when the sample size is small. A simulation study compares the modified Chao estimator with the conventional estimator as well as with an estimator suggested more recently by Chiu and Chao.

Keywords Capture-recapture · Behavioral response · Power series distribution · Nonparametric estimator of population size · Mixture model · Bias reduction

Introduction

The size $N$ of a target population needs to be determined. For this purpose a trapping experiment or study is done in which members of the target population are identified at $T$ occasions, where $T$ may or may not be known. For each member $i$ the count of identifications $X_i$ is returned, where $X_i$ takes values in $\{0, 1, 2, \ldots, T\}$ for $i = 1, \ldots, N$. However, zero-identifications are not observed; they remain hidden in the experiment. Hence, a zero-truncated sample $X_1, \ldots, X_n$ is observed, where we have assumed without loss of generality that $X_{n+1} = \cdots = X_N = 0$ (for a general introduction to the topic see Borchers et al. 2004, Bunge and Fitzpatrick 1993, Bunge, Willis, and Walsh 2014). One way to undertake capture-recapture modelling is on the basis of the zero-truncated frequency distribution $f_1, f_2, \ldots, f_T$, where $f_x$ is the frequency of count $x$, $T$ is the largest observed count and $n = f_1 + \cdots + f_T$ is the observed sample size. The frequency of zero-counts (of hidden members of the target population) remains unobserved and needs to be estimated. For this purpose Chao's (1987) conventional estimator $f_1^2/(2f_2)$ for the unobserved frequency $f_0$ of zero-counts is frequently used. Chao's estimator $n + f_1^2/(2f_2)$ of the population size $N$ is asymptotically unbiased if the count $X$ follows a Poisson distribution and represents a lower bound if $X$ follows a mixture of Poisson distributions. In fact, it is pointed out in Chao and Colwell (2017) that asymptotic unbiasedness of Chao's estimator holds under a weaker condition: only the rare counts, that is the counts of ones and twos (the singletons and doubletons) and the unseen units, need to follow a Poisson distribution.
The purpose of this note is to present a modification of the Chao estimator for the case of one-inflation, where the original estimator can severely overestimate. This is in considerable contrast to the expectation of users, who expect the estimator to provide a meaningful lower bound, i.e. a lower bound that is relatively close to the true population size. One-inflation can occur when the population under study has a subpopulation that can no longer be captured after the first capture. Below we discuss an example of police data on perpetrators of domestic violence. Here it is realistic to assume that some individuals in the population refrain from domestic violence after their first contact with the police; in other words, their probability of another capture is zero. A second example is hospital admissions of drug users: the first hospital admission may lead to a change in drug use. In animal studies the idea may be relevant for trap avoidance, where an animal avoids the trap after being captured for the first time. Recently, the problem of one-inflation has received some attention in the literature. Chiu and Chao (2016) consider estimating microbial diversity in the presence of sequencing errors. Bunge et al. (2012) consider estimating population diversity with unreliable low frequency counts (see also Bunge et al. 2014, Willis 2016). All have in common that the frequency $f_1$ of observed singletons is inflated. Whereas Bunge et al. (2012) suggest several approaches to deal with inflated singletons, including a mixture model and left-censoring, Chiu and Chao (2016) and Willis (2016) suggest a kind of double estimation procedure. First, the observed frequency $f_1$ is re-estimated (Willis 2016) or bias-adjusted (Chiu and Chao 2016), and then it is incorporated in the ratio estimator of Willis and Bunge (2015) or the Chao estimator (Chiu and Chao 2016).
In addition, Puig and Kokonendji (2018) suggest several lower bound estimators for count distributions with log-convex probability generating functions, including compound and mixed Poisson distributions. These, however, do not cover the case of one-inflation. Here, we will develop a lower bound estimator generalizing the original Chao (1987) estimator without dealing with the frequency $f_1$ of singletons measured with error. To lay out the most general setting we consider discrete distributions of the power series family with density
$$p_x(\theta) = \frac{a_x \theta^x}{\eta(\theta)}, \qquad (1)$$
where $a_x$ is a known, nonnegative coefficient, $\theta$ a positive parameter and $x = 0, 1, \ldots$ ranges over the set of nonnegative integers; $\eta(\theta) = \sum_{x=0}^{\infty} a_x \theta^x$ is the normalizing constant. The power series distributional family contains the Poisson, the binomial, the geometric, the negative-binomial with known shape parameter, the log-series and others. The coefficient $a_x$ defines the specific member of the power series family: for example, $a_x = 1/x!$ defines the Poisson, $a_x = \binom{T}{x}$ for $x = 0, \ldots, T$ with positive integer $T$ defines the binomial ($a_x = 0$ for $x > T$), and $a_x = 1$ gives the geometric. Assume further that the target population of interest is not homogeneous, so that a more adequate modelling is achieved with the general mixture model for the power series family
$$m_x = \int p_x(\theta) f(\theta)\, d\theta. \qquad (2)$$
Whereas the modelling capacity of a single power series distribution is limited, mixtures of power series distributions offer enhanced flexibility in model fitting. The mixture (2) has two parts, the mixture kernel $p_x(\theta)$ and the mixing distribution $f(\theta)$. If we leave the mixing distribution unspecified, the nonparametric estimate is discrete (Lindsay 1995) and connects to clustering.
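The power series density can be made concrete with a short numerical sketch. The function name `power_series_pmf` and the truncation point `x_max` are illustrative assumptions, not part of the paper; the normalizing constant is approximated by a finite sum, which is adequate only when the tail beyond `x_max` is negligible.

```python
import math

def power_series_pmf(coef, theta, x_max):
    """Return [p_0, ..., p_x_max] with p_x = a_x * theta**x / eta(theta).

    coef(x) returns the coefficient a_x. eta(theta) is approximated by the
    finite sum over x = 0..x_max (an assumption of this sketch).
    """
    weights = [coef(x) * theta ** x for x in range(x_max + 1)]
    eta = sum(weights)
    return [w / eta for w in weights]

# Coefficients for three members of the family named in the text.
def poisson_coef(x):
    return 1.0 / math.factorial(x)

def geometric_coef(x):
    return 1.0

def binomial_coef(T):
    return lambda x: float(math.comb(T, x)) if x <= T else 0.0

poisson = power_series_pmf(poisson_coef, 1.5, 60)         # Poisson with mean 1.5
geometric = power_series_pmf(geometric_coef, 0.4, 200)    # p_x = 0.6 * 0.4**x
binomial = power_series_pmf(binomial_coef(5), 0.25, 5)    # success prob. 0.25/1.25 = 0.2
```

In this parametrization the binomial success probability is $\theta/(1+\theta)$, which is why $\theta = 0.25$ corresponds to a success probability of 0.2.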
However, when mixed power series distributions are used to model the zero-truncated distribution, problems may arise due to the lack of identifiability of the mixing distribution (see Link 2003); in addition, boundary problems in maximum likelihood estimation may occur for finite mixture models, as outlined by Wang and Lindsay (2005). Hence a renewed interest in lower bound estimation has emerged (Mao 2006; Mao and Lindsay 2007). The original idea of Chao (1987, 1989) was to keep the mixing distribution unspecified and to apply nonparametric inference based on the Cauchy-Schwarz inequality in the context of zero-truncated count mixture modelling, which arises naturally in capture-recapture experiments or studies. Here we take up this idea again and develop it further for one-inflated count distributions. The associated zero-truncated densities will be denoted as $p_x^+$ and $m_x^+$ for the zero-truncated power series and the zero-truncated mixture of power series distributions, respectively.

Mixtures of Power Series Distributions and the Monotonicity of the Probability Ratio
The power series (1) has an important property. If we consider ratios of neighboring probabilities multiplied by the inverse ratios of their coefficients, then
$$r_x = \frac{a_x}{a_{x+1}} \frac{p_{x+1}(\theta)}{p_x(\theta)} = \theta;$$
in other words, the ratio $r_x$ is constant over the range of $x$ with value equal to the unknown parameter $\theta$. Note that $r_x$ is identical for the zero-truncated distribution, since the truncation factor cancels in the ratio $p_{x+1}^+/p_x^+$. A nonparametric estimate of $r_x$ is readily available with
$$\hat r_x = \frac{a_x}{a_{x+1}} \frac{f_{x+1}}{f_x},$$
where $f_x$ is the frequency of observations with count value $x$. The graph $x \to \hat r_x$ is called the ratio plot and was developed in Böhning et al. (2013) as a diagnostic device providing evidence for the aptness of a distribution. The coefficient $a_x$ determines the type of ratio plot. For example, if $a_x = 1/x!$ we investigate a Poisson distribution and call the associated ratio plot the Poisson ratio plot, and if $a_x = 1$ we call it the geometric ratio plot. The ratio plot might be used as guidance for choosing the component density in the mixture. We follow the paradigm that the more horizontal the ratio plot, the more homogeneous the population is with respect to the component density, and this would indicate a preference for the distribution with the more horizontal pattern in the associated ratio plot.

Example 1
We apply the ratio plot to family violence data for the Netherlands in the year 2009 provided by van der Heijden et al. (2014). Here the perpetrator study is reported, with the data given in Table 1. There were 15,169 perpetrators identified as being involved in a domestic violence incident exactly once, 1,957 exactly twice, and so forth. In total, there were 17,662 different perpetrators identified in the Netherlands for 2009. The data represent the Netherlands except the police region for The Hague. It is known that domestic violence is largely a hidden activity and many incidents remain unreported (Summers and Hoffman 2002).

Table 1 Frequency distribution of the family violence data

Year    $f_1$     $f_2$    $f_3$   $f_4$   $f_5$   $f_6$   $n$
2009    15,169    1,957    393     99      28      16      17,662

In Figure 1, we see the geometric ratio plot $\hat r_x = f_{x+1}/f_x$ against $x$ for the family violence data in the Netherlands. Clearly, the ratio plot shows a monotone increasing trend. We will see in the following that this monotone pattern can be associated with some form of population heterogeneity. In addition, it is apparent that the first ratio $f_2/f_1$ is too small to be in agreement with the line pattern we see in the ratio plot. This indicates an inflation of ones, or singletons, in the data. In conclusion, we observe two aspects in Figure 1: the occurrence of heterogeneity and of one-inflation.

We return to the question of how unobserved heterogeneity is associated with the ratio plot, or in other words, how unobserved heterogeneity can be identified in the ratio plot. As described by (2), the occurrence of unobserved heterogeneity leads to a mixture of power series distributions. We can likewise consider the ratio plot for mixtures, where we use the coefficients $a_x$ associated with the mixture kernel, for example $a_x = 1/x!$ in the case of a Poisson kernel or $a_x = 1$ in the case of a geometric kernel. The estimate of $r_x$ will not change; however, the interpretation of the observed pattern in the ratio plot will.
This is mainly due to the following result (Chao 1987, and more generally Böhning and Del Rio Vilas 2008):

Theorem 1 Let $m_x = \int p_x(\theta) f(\theta)\, d\theta$, where $p_x(\theta)$ is a member of the power series family and $f(\theta)$ an arbitrary density. Then, for $r_x = \frac{a_x}{a_{x+1}} \frac{m_{x+1}}{m_x}$ we have the following monotonicity:
$$r_x \le r_{x+1} \quad \text{for all } x = 0, 1, \ldots$$

This result says that in the case of a mixture of power series distributions the ratio plot will no longer show a horizontal line pattern but will be monotonically increasing. Hence, if a monotone pattern occurs in the ratio plot, this may be taken as an indication of the presence of heterogeneity, which can be captured by a nonparametric mixture (2). The estimator of Chao was developed for this general form of population heterogeneity. If one-inflation occurs on top of this general heterogeneity, Chao's estimator needs modification, which we will discuss in the next section.
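The geometric ratio plot of Example 1 can be reproduced numerically. The sketch below uses the frequencies quoted in Example 1; treating the last value, 16, as $f_6$ (rather than a pooled count) and the helper name `ratio_estimates` are assumptions made for illustration.

```python
def ratio_estimates(freqs, coef):
    """Return r_hat_x = (a_x / a_{x+1}) * f_{x+1} / f_x for x = 1, 2, ...

    freqs[0] holds f_1, freqs[1] holds f_2, and so on; coef(x) returns a_x.
    """
    return [
        coef(x) / coef(x + 1) * freqs[x] / freqs[x - 1]
        for x in range(1, len(freqs))
    ]

# Family violence data, Netherlands 2009 (f_1, ..., f_6 as quoted in Example 1).
violence = [15169, 1957, 393, 99, 28, 16]

# Geometric kernel: a_x = 1 for all x, so r_hat_x = f_{x+1} / f_x.
geometric_ratios = ratio_estimates(violence, lambda x: 1.0)
for x, r in enumerate(geometric_ratios, start=1):
    print(f"x = {x}: r_hat = {r:.4f}")
```

The printed ratios increase with $x$, which is the monotone pattern attributed to heterogeneity, and the first ratio is noticeably the smallest, in line with the one-inflation reading of Figure 1.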

Modified Chao estimation
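As a consequence of the result in Theorem 1 we have that
$$\frac{a_0}{a_1}\frac{m_1}{m_0} \le \frac{a_1}{a_2}\frac{m_2}{m_1}, \quad \text{or} \quad m_0 \ge \frac{a_0 a_2}{a_1^2}\frac{m_1^2}{m_2}. \qquad (5)$$
Replacing the theoretical quantities $m_x$ by their sample estimates $f_x/N$ leads to Chao's estimate for $f_0$ (Chao 1987, 1989):
$$\hat f_0 = \frac{a_0 a_2}{a_1^2}\frac{f_1^2}{f_2}. \qquad (6)$$
By comparing (5) with (6) it can be seen that (6) provides a lower bound for the part of the population that is missed. The estimate (6) is very popular and frequently used in capture-recapture estimation, in particular in connection with the Poisson density ($a_x = 1/x!$) in the mixture (2). However, it should be noted that other bounds are possible as well using the monotonicity result in Theorem 1. Note that also
$$\frac{a_1}{a_2}\frac{m_2}{m_1} \le \frac{a_2}{a_3}\frac{m_3}{m_2} \qquad (7)$$
holds, or equivalently
$$m_1 \ge \frac{a_1 a_3}{a_2^2}\frac{m_2^2}{m_3}. \qquad (8)$$
This bound has never been used nor elaborated on, as it seems pointless: we have observed counts of one, and no bounds seem to be required. If we replace $m_1$ in (5) with the bound given in (8) we obtain
$$m_0 \ge \frac{a_0 a_2}{a_1^2}\,\frac{1}{m_2}\left(\frac{a_1 a_3}{a_2^2}\frac{m_2^2}{m_3}\right)^2. \qquad (9)$$
The bound can be simplified to
$$m_0 \ge \frac{a_0 a_3^2}{a_2^3}\frac{m_2^3}{m_3^2}. \qquad (10)$$
Plugging in relative frequencies leads to
$$\hat f_0^{\text{new}} = \frac{a_0 a_3^2}{a_2^3}\frac{f_2^3}{f_3^2}. \qquad (11)$$
Note that we can expect $\hat f_0^{\text{new}}$ to be smaller than $\hat f_0$ in the mean, as
$$E(\hat f_0^{\text{new}}) \approx N\frac{a_0 a_3^2}{a_2^3}\frac{m_2^3}{m_3^2} \le N\frac{a_0 a_2}{a_1^2}\frac{m_1^2}{m_2} \approx E(\hat f_0). \qquad (12)$$
Specific forms of the modified Chao estimator arise for mixtures of particular power series members. We have
$$\hat f_0^{\text{new}} = \begin{cases} \dfrac{2}{9}\dfrac{f_2^3}{f_3^2} & \text{if } m_x \text{ is a Poisson mixture,} \\[2ex] \dfrac{2}{9}\dfrac{(T-2)^2}{T(T-1)}\dfrac{f_2^3}{f_3^2} & \text{if } m_x \text{ is a binomial mixture,} \\[2ex] \dfrac{f_2^3}{f_3^2} & \text{if } m_x \text{ is a geometric mixture.} \end{cases}$$
Note that for $T$ becoming large the lower bounds for the Poisson mixture and the binomial mixture will agree. Furthermore, if the mixture reduces to a power series distribution (i.e. there is no mixing involved), both estimators, $\hat f_0^{\text{new}}$ and $\hat f_0$, are asymptotically unbiased. Note that, similar to the original Chao estimator (Chao and Colwell 2017), for asymptotic unbiasedness the assumption of a power series distribution can be relaxed to hold only for the rare counts, that is the doubletons and tripletons (counts of twos and counts of threes), and the unseen units. The question arises why the bound $\hat f_0^{\text{new}}$ could be of interest since, according to (12), it will typically provide an even lower bound than the conventional Chao lower bound estimator $\hat f_0$. This question is the topic of the next section.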
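The leading constants of the two bounds, $a_0 a_2/a_1^2$ for the conventional estimator and $a_0 a_3^2/a_2^3$ for the modified one, can be evaluated exactly for the three kernels mentioned. This is a minimal sketch; the function name `chao_constants` is an illustrative assumption, and the binomial case needs $T \ge 3$ so that $a_2$ and $a_3$ are positive.

```python
from fractions import Fraction
from math import comb, factorial

def chao_constants(a0, a1, a2, a3):
    """Return (a0*a2/a1^2, a0*a3^2/a2^3): conventional and modified constants."""
    conventional = a0 * a2 / a1 ** 2
    modified = a0 * a3 ** 2 / a2 ** 3
    return conventional, modified

# Poisson kernel: a_x = 1/x!
poisson_consts = chao_constants(*[Fraction(1, factorial(x)) for x in range(4)])

# Geometric kernel: a_x = 1
geometric_consts = chao_constants(*[Fraction(1)] * 4)

# Binomial kernel with T occasions: a_x = C(T, x)
def binomial_consts(T):
    return chao_constants(*[Fraction(comb(T, x)) for x in range(4)])

print(poisson_consts)          # (Fraction(1, 2), Fraction(2, 9))
print(geometric_consts)        # (Fraction(1, 1), Fraction(1, 1))
print(binomial_consts(10))     # (Fraction(9, 20), Fraction(64, 405))
print(float(binomial_consts(10 ** 6)[1]), 2 / 9)  # binomial approaches Poisson
```

Using exact fractions makes the limiting statement visible without rounding noise: as $T$ grows, the binomial constant approaches the Poisson value $2/9$.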

One-inflation
In practice, counts of one, the singletons, often occur more frequently than is compatible with a nonparametric mixture model. For example, in the family violence study a portion of the perpetrators having contact with the police for the first time might take this as a serious motivation for a change in behavior, so that it will never happen again. As Figure 1 indicates, there appear to be two processes going on. The first process can be viewed as a mixture of geometric distributions (as the linear trend in the ratios of frequencies for counts larger than one indicates). The second process is an inflation of ones (as the much lower ratio $f_2/f_1$ supports). In these instances, it is more appropriate to allocate extra mass at counts of one. Hence, we assume that the following one-inflation model holds:
$$m'_x = \begin{cases} (1-\pi) + \pi m_1 & \text{if } x = 1, \\ \pi m_x & \text{if } x \ne 1, \end{cases} \qquad (13)$$
where $m_x$ is the mixture of a power series member. Note that (13) can be written as $m'_x = (1-\pi)\delta_1(x) + \pi m_x$ for $x = 0, 1, 2, \ldots$, where $\delta_y(x) = 1$ for $x = y$ and zero otherwise. For a one-inflation model, more singletons will occur than is compatible with the nonparametric mixture model, as the one-inflation model is outside the class of nonparametric mixtures. Hence Chao's estimator is no longer a lower bound estimator, as Theorem 1 no longer holds. In fact, Chao's estimator can experience serious overestimation, as also becomes clear when considering its form, which involves $f_1^2$. Note that one-inflation models behave differently from zero-inflation models, as every zero-inflated power series distribution can be written as the mixture $(1-\pi)p_x(0) + \pi p_x(\theta)$, which is within the class of nonparametric mixtures of power series distributions.
This is where the new lower bound estimator shows its advantage.
Theorem 2 Assume a one-inflation model $m'_x$ as given in (13). Then
$$m'_0 \ge \frac{a_0 a_3^2}{a_2^3}\frac{(m'_2)^3}{(m'_3)^2}.$$

We provide a short proof of the result in the appendix. As a consequence of this theorem we can expect $\hat f_0^{\text{new}}$ to be a lower bound estimator in the mean under heterogeneity of the parameter of the power series distribution and under one-inflation. Consider the case of a power series distribution with one-inflation, in other words $m'_x = (1-\pi)\delta_1(x) + \pi p_x$. Then the conventional Chao estimator has an asymptotic bias, whereas the newly suggested estimator is asymptotically unbiased, even if the power series distribution is one-inflated.
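The contrast between the two estimators under one-inflation can be checked at the population level by plugging expected frequencies $N m'_x$ into both formulas. The sketch below assumes a one-inflated Poisson with illustrative values $\theta = 1$, $\pi = 0.5$ and $N = 1000$; these settings and the function name are assumptions chosen for illustration.

```python
import math

def one_inflated_poisson(theta, pi, x):
    """m'_x = (1 - pi) * delta_1(x) + pi * p_x with a Poisson kernel p_x."""
    p_x = math.exp(-theta) * theta ** x / math.factorial(x)
    return (1 - pi) * (1 if x == 1 else 0) + pi * p_x

N, theta, pi = 1000, 1.0, 0.5
m = [one_inflated_poisson(theta, pi, x) for x in range(4)]

true_f0 = N * m[0]                                   # expected unseen frequency
conventional = N * m[1] ** 2 / (2 * m[2])            # Chao: f1^2 / (2 f2)
modified = N * 2 * m[2] ** 3 / (9 * m[3] ** 2)       # new: 2 f2^3 / (9 f3^2)

print(f"true f0      = {true_f0:.1f}")
print(f"conventional = {conventional:.1f}")
print(f"modified     = {modified:.1f}")
```

With expected frequencies, the modified estimator returns the true unseen frequency exactly, because the factor $\pi$ cancels in $m_2'^3/m_3'^2$ up to one remaining power, whereas the conventional estimator is many times too large because the inflated $m'_1$ enters squared.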

Example 2
To illustrate the potential for large bias with the conventional Chao estimator, consider the following synthetic example. 500 counts were simulated from a Poisson distribution with parameter 1 and merged with 500 extra ones, so that in total $N = 1{,}000$ is the population size. The frequency distribution is as follows: $f_0 = 186$, $f_1 = 680$, $f_2 = 95$, $f_3 = 32$, $f_{4+} = 7$, so that the observed sample size is $n = 814$. The associated ratio plot is presented in Figure 2 and shows clear evidence of one-inflation. In this case, ignoring the fact that $f_0$ is known, $\hat f_0^{\text{new}} = 186$, corresponding exactly to the observed $f_0$, which compares to the conventional Chao estimator $\hat f_0 = 2{,}434$, the latter giving a serious overestimate of the true $f_0 = 186$.

Figure 3 shows the associated geometric ratio plot based upon the first five frequencies (we restrict the plotting to the larger frequencies), ignoring the zero-counts. The geometric ratio plot shows evidence for a geometric distribution, except for $x = 1$, where the ratio is lower than the other ratios, indicating one-inflation. This becomes even clearer if we use the concept of the geometric ratio plot under the null, a diagnostic tool developed in Böhning and Punyapornwithaya (2018). The idea is to plot the logarithm of $\hat r_x = \frac{a_x}{a_{x+1}}\frac{f_{x+1}}{f_x}$ against $x$ as before, but also to include a pointwise 95% confidence band which is computed on the basis of the power series distribution that is assumed to be valid. If the distribution is valid, then the band should contain all empirical log-ratios. Figure 4 shows the geometric ratio plot under the null for the H5N1 data set. Clearly, the first point is below the confidence band, indicating one-inflation. Again, we assume an arbitrary mixture of geometric distributions with one-inflation, as the analysis of the ratio plots suggests. We find $\hat f_0^{\text{new}} = 551$ and $\hat f_0 = 1{,}044$. We note that the conventional Chao estimator is about twice as large as the modified Chao estimator, an effect we would expect if there is one-inflation. We conclude that we estimate at least 550 subdistricts of the 6,587 subdistricts to be affected by the outbreak.
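The two estimates quoted for the synthetic Poisson example can be recomputed in a few lines. In this sketch the singleton count is taken as the value implied by the reported sample size, $f_1 = n - f_2 - f_3 - f_{4+}$; that derivation is an assumption made here so that the frequencies add up to $n = 814$.

```python
# Synthetic example: Poisson(1) counts merged with extra ones, N = 1000.
n, f2, f3, f4_plus = 814, 95, 32, 7
f1 = n - f2 - f3 - f4_plus          # singleton count implied by n

# Poisson kernel: a_x = 1/x!
conventional_f0 = f1 ** 2 / (2 * f2)          # Chao (1987): f1^2 / (2 f2)
modified_f0 = 2 * f2 ** 3 / (9 * f3 ** 2)     # modified:    2 f2^3 / (9 f3^2)

print(f"f1 implied by n : {f1}")
print(f"conventional f0 : {conventional_f0:.1f}")
print(f"modified f0     : {modified_f0:.1f}")
```

Rounded to integers, the two values agree with the 2,434 and 186 reported in the example, the latter matching the known $f_0 = 186$.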

Example 1 (revisited)
We return to Example 1 of the domestic violence study of section 2. A likelihood ratio test, testing a simple geometric against a one-inflated geometric, leads to a value of 98.9, which is highly significant given that the null distribution is the $\chi^2$-mixture $0.5\chi^2_0 + 0.5\chi^2_1$. We also include the geometric ratio plot under the null for the domestic violence data in Figure 5. There is clear evidence that the first ratio is outside the confidence band, indicating one-inflation. To be more general, we assume an arbitrary mixture of geometric distributions with one-inflation, as the analysis of the ratio plots suggests (even though the remaining points are inside the confidence band there is

Bias reduction
The Chao estimators can have severe bias when the sample size is small. To understand the occurrence of bias we go back to the original Chao estimator as developed in (5). As the arguments used in bias reduction are not readily available in the published literature, we outline them here. We try to estimate $N m_1^2/m_2 = E(f_1)^2/E(f_2)$ using $f_1^2/f_2$. However, the latter estimates $E(f_1^2/f_2)$, which is not necessarily close to $E(f_1)^2/E(f_2)$ unless $f_1/N$ and $f_2/N$ are close to $m_1$ and $m_2$, respectively. Hence the idea of bias reduction is to express $E(f_1)^2$, which we cannot estimate directly as $f_1^2$, by means of $E(f_1)$ and $E(f_1^2)$, which we can estimate directly as $f_1$ and $f_1^2$. Indeed, under a Poisson assumption for $f_1$ we use that $\mathrm{Var}(f_1) = E(f_1^2) - E(f_1)^2 = E(f_1)$, so that
$$E(f_1)^2 = E(f_1^2) - E(f_1),$$
which can be estimated as $f_1^2 - f_1$, leading to the numerator of the bias-corrected Chao estimator. Turning to the denominator, we note that our interest is in $1/\lambda$ with $\lambda = E(f_2)$, but using $1/f_2$ will estimate $E(1/f_2)$, if the latter exists. Alternatively, $1/(1 + f_2)$ will estimate $E[1/(1 + f_2)]$, which can be evaluated using the Poisson assumption for $f_2$ as
$$E\left[\frac{1}{1+f_2}\right] = \sum_{x=0}^{\infty}\frac{1}{1+x}\,\frac{e^{-\lambda}\lambda^x}{x!} = \frac{1}{\lambda}\sum_{x=0}^{\infty}\frac{e^{-\lambda}\lambda^{x+1}}{(x+1)!} = \frac{1-e^{-\lambda}}{\lambda} \approx \frac{1}{\lambda},$$
with the approximation error less than 0.001 for $\lambda > 5$. This leads to the bias-corrected Chao estimator
$$\hat N = n + \frac{a_0 a_2}{a_1^2}\,\frac{f_1(f_1-1)}{f_2+1}. \qquad (15)$$
In a similar way, we derive the bias correction for the modified Chao estimator, leading to
$$\hat N = n + \frac{a_0 a_3^2}{a_2^3}\,\frac{f_2(f_2-1)(f_2-2)}{(f_3+1)(f_3+2)}, \qquad (16)$$
but leave the details for Appendix 2.
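The two bias-corrected estimators can be written as small functions. This is a sketch under assumptions: the formulas are the ones that follow from the derivation here and in Appendix 2 (numerator $f_2^3 - 3f_2^2 + 2f_2$, denominator $(f_3+1)(f_3+2)$), the function names are illustrative, and the default constants are those of the Poisson kernel.

```python
def chao_bias_corrected(n, f1, f2, const=0.5):
    """Bias-corrected conventional Chao estimate of N.

    const = a0*a2/a1^2 (0.5 for a Poisson kernel, 1 for a geometric kernel).
    """
    return n + const * f1 * (f1 - 1) / (f2 + 1)

def modified_chao_bias_corrected(n, f2, f3, const=2 / 9):
    """Bias-corrected modified Chao estimate of N.

    const = a0*a3^2/a2^3 (2/9 for a Poisson kernel, 1 for a geometric kernel).
    """
    return n + const * f2 * (f2 - 1) * (f2 - 2) / ((f3 + 1) * (f3 + 2))

# Synthetic Poisson example: n = 814, f2 = 95, f3 = 32 (true N = 1000).
print(modified_chao_bias_corrected(814, 95, 32))

# Both corrected estimators stay defined when the denominator count is zero.
print(chao_bias_corrected(10, 6, 0))
print(modified_chao_bias_corrected(10, 3, 0))
```

A practical side effect of the correction, visible in the last two calls, is that the estimators remain finite when $f_2 = 0$ or $f_3 = 0$, where the uncorrected ratios would be undefined.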

Variance estimation
It is useful to put the proposed estimator into a likelihood framework. Evidently, the estimator (11) uses only counts of twos and threes. Hence it seems reasonable to consider the binomially truncated likelihood $p^{f_2}(1-p)^{f_3}$, where $p = P(X = 2 \mid X = 2 \text{ or } X = 3) = a_2/(a_2 + a_3\theta)$. This likelihood is maximized at $\hat p = f_2/(f_2 + f_3)$, that is at $\hat\theta = a_2 f_3/(a_3 f_2)$, which corresponds to the proposed estimator (11).
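The correspondence between the truncated-likelihood estimate of $\theta$ and the estimator (11) can be verified numerically. The sketch assumes the binomial likelihood $p^{f_2}(1-p)^{f_3}$ with $p = a_2/(a_2 + a_3\theta)$, whose maximizer is $\hat\theta = a_2 f_3/(a_3 f_2)$, and uses $T(\theta) = a_0/(a_2\theta^2 + a_3\theta^3)$; function names and the example frequencies are illustrative.

```python
import math

def theta_mle(f2, f3, a2, a3):
    """Maximizer of p^f2 * (1-p)^f3 with p = a2 / (a2 + a3*theta)."""
    return a2 * f3 / (a3 * f2)

def log_likelihood(theta, f2, f3, a2, a3):
    p = a2 / (a2 + a3 * theta)
    return f2 * math.log(p) + f3 * math.log(1 - p)

def T(theta, a0, a2, a3):
    """T(theta) = a0 / (a2*theta^2 + a3*theta^3)."""
    return a0 / (a2 * theta ** 2 + a3 * theta ** 3)

# Poisson kernel coefficients a0, a2, a3 and the synthetic example counts.
a0, a2, a3 = 1.0, 1 / 2, 1 / 6
f2, f3 = 95, 32

theta_hat = theta_mle(f2, f3, a2, a3)
via_likelihood = T(theta_hat, a0, a2, a3) * (f2 + f3)
via_formula = a0 * a3 ** 2 / a2 ** 3 * f2 ** 3 / f3 ** 2   # estimator (11)

print(theta_hat, via_likelihood, via_formula)
```

Both routes give the same value of about 186, so the estimator can be read as a plug-in of the conditional maximum likelihood estimate of $\theta$.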
To continue developing a variance estimate we write (11) as $T(\hat\theta)(f_2 + f_3)$ with $T(\theta) = \frac{a_0}{a_2\theta^2 + a_3\theta^3}$. We will use the fact that $\mathrm{Var}(X) = E[\mathrm{Var}(X \mid Y)] + \mathrm{Var}[E(X \mid Y)]$ for any two random variables $X$ and $Y$. This conditioning technique is helpful in the capture-recapture context (Böhning 2008; van der Heijden et al. 2003). We apply it here by using $X = T(\hat\theta)(f_2 + f_3)$ and $Y = f_2 + f_3$. The first term, $E[\mathrm{Var}(X \mid Y)]$, and the second term, $\mathrm{Var}[E(X \mid Y)]$, are approximated separately, the second under the conventional Poisson assumption for $f_2 + f_3$, whose variance is then estimated by the moment estimate $f_2 + f_3$. Combining both terms yields the variance approximation (18), which can be written in a simple form in terms of $\hat f_0$ as given by (11). As we have seen in the previous section, it is necessary to stabilize the estimator (11); it is likewise necessary to use a bias-corrected version of the variance estimator. We therefore suggest the variance estimator (20) for $\hat f_0$, in which $\hat f_0 = \frac{a_0 a_3^2}{a_2^3}\frac{f_2(f_2-1)(f_2-2)}{(f_3+1)(f_3+2)}$ is the bias-corrected estimator of $f_0$ developed in the previous section in (16).
To investigate the performance of our variance estimator (20) we provide a small simulation study comparing the estimated standard error according to (20) with the true standard error estimated from the simulation. The results are provided in Table 3. It can be seen that the approximation is excellent for the larger population size $N = 1000$ and reasonable for the small population size $N = 50$, where it provides a conservative estimate. A more detailed investigation of the proposed variance estimator is given in Kaskasamkul (2018). We are now able to give a more realistic estimation of the hidden frequency $f_0$ for our examples. This is done in Table 4. All estimates appear to be realistic. In the synthetic examples the standard error is relatively large, likely due to the small frequencies in the upper counts.

Simulation
In the first part, we concentrate on the comparison of the bias-adjusted conventional Chao estimator (15) and the bias-adjusted modified Chao estimator (16). In the second part, we compare the bias-adjusted modified Chao estimator (16) with an estimator previously suggested by Chiu and Chao (2016).

Comparison of the modified Chao estimator with the conventional Chao estimator
In the following we will focus on the bias-adjusted conventional Chao estimator (15) and the bias-adjusted modified Chao estimator (16). Bias will occur for any member of the power series family as sampling distribution for $X$. However, the bias reduction has been developed under a Poisson assumption for the frequency $f_x$. To demonstrate how well the bias reduction works (outside Poisson sampling for $X$) we consider the geometric as the basic sampling distribution. The latter, being a mixture of Poisson distributions with an exponential mixing distribution, is attractive as it incorporates a basic form of heterogeneity. We look at two population sizes, $N = 50$ and $N = 1{,}000$, and consider five different scenes with different parameter constellations for each of them.
1. Scene 1 is the homogeneous geometric distribution with four parameters $\theta = 0.1, 0.2, 0.3, 0.4$, denoted as populations 1 to 4.
2. Scene 2 is as scene 1 but with 20% one-inflation. More precisely, this means that with probability $\pi = 0.8$ the count is taken from a homogeneous geometric and with probability $1 - \pi = 0.2$ it is taken as a count of one.
3. Scene 3 is as scene 1 but with 50% one-inflation.
4. Scene 4 allows heterogeneity in the parameter of the geometric in addition to 20% one-inflation. The count is taken with probability $\pi = 0.8$ from an equally weighted mixture of two geometric distributions. The following six two-component mixture populations were considered: $\theta_2 = 0.2, 0.3, 0.4$ with $\theta_1 = 0.1$; $\theta_2 = 0.3, 0.4$ with $\theta_1 = 0.2$; and $\theta_2 = 0.4$ with $\theta_1 = 0.3$, denoted as populations 1 to 6. Here $\theta_1$ is the parameter of the geometric from the first component and $\theta_2$ is the parameter of the geometric from the second component.
5. Scene 5 is as scene 4 but with 50% one-inflation.

The results of the simulation study are presented in Figure 6. For a generic estimator $\hat N$ of population size we define relative bias as $\left(E(\hat N) - N\right)/N$ and relative standard deviation as $\sqrt{\mathrm{Var}(\hat N)}/N$ to allow for comparisons across populations of different sizes. It is clear that the modified Chao estimator $\hat N_{\text{Chao-N}}$ with bias reduction avoids the overestimation bias of the conventional Chao estimator $\hat N_{\text{Chao-C}}$, which clearly occurs for all populations with one-inflation, as the left panels in Figure 6 indicate. It also becomes apparent that the larger the one-inflation, the higher the overestimation bias of $\hat N_{\text{Chao-C}}$. Furthermore, and somewhat surprisingly, the relative standard deviation is also smaller for $\hat N_{\text{Chao-N}}$ than for $\hat N_{\text{Chao-C}}$, most markedly for the one-inflation scenes, as the right panels in Figure 6 show. In Figure 7 we provide a comparison of the modified Chao estimator $n + f_2^3/f_3^2$ with its bias-corrected version $n + \frac{f_2(f_2-1)(f_2-2)}{(f_3+1)(f_3+2)}$ (given in (16)) on the basis of a geometric distribution. Clearly, the bias-corrected version performs well.
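A one-inflated geometric scene of this kind can be sketched with the standard library alone. Everything below is an illustrative assumption rather than the paper's simulation code: the geometric is parametrized as $p_x = (1-\theta)\theta^x$ (geometric kernel $a_x = 1$, so both leading constants equal 1), and the values $\theta = 0.5$, $\pi = 0.8$, $N = 1000$, the number of replications and the seed are chosen only for demonstration.

```python
import math
import random

def draw_count(theta, pi, rng):
    """One count from a one-inflated geometric with p_x = (1-theta)*theta**x."""
    if rng.random() >= pi:            # with probability 1 - pi: an extra one
        return 1
    u = 1.0 - rng.random()            # uniform on (0, 1]
    return int(math.log(u) / math.log(theta))   # inverse-transform geometric

def simulate_relative_bias(N, theta, pi, reps, seed=1):
    """Return the relative bias (E(N_hat) - N)/N of both bias-corrected estimators."""
    rng = random.Random(seed)
    conventional, modified = [], []
    for _ in range(reps):
        f = [0, 0, 0, 0]              # frequencies of counts 0, 1, 2, 3
        for _ in range(N):
            x = draw_count(theta, pi, rng)
            if x <= 3:
                f[x] += 1
        n = N - f[0]                  # observed (zero-truncated) sample size
        conventional.append(n + f[1] * (f[1] - 1) / (f[2] + 1))
        modified.append(
            n + f[2] * (f[2] - 1) * (f[2] - 2) / ((f[3] + 1) * (f[3] + 2))
        )
    rel = lambda est: (sum(est) / len(est) - N) / N
    return rel(conventional), rel(modified)

bias_conventional, bias_modified = simulate_relative_bias(1000, 0.5, 0.8, reps=100)
print(f"relative bias, conventional: {bias_conventional:+.3f}")
print(f"relative bias, modified    : {bias_modified:+.3f}")
```

Under these settings the conventional estimator overestimates $N$ substantially while the modified estimator stays close to zero relative bias, which mirrors the qualitative pattern described for the one-inflation scenes.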
In our general power series framework, the bias-corrected Chiu-Chao estimator is denoted by $\hat N_{CC}$. Chiu and Chao also suggested a different bias correction in their eq. (5), which we did not consider as it is undefined if $f_3$ or $f_4$ is zero. Also, they suggest a population size estimator which replaces $n$ by $n - f_1 + \hat f_1$, which we did not consider here, mainly to achieve a fair comparison. In our context, we consider the singletons as true counts of ones. There are just more of them than is compatible with any power series mixture, which is the source of a potentially severe bias. We will take up this point again in the discussion. In this context it is important to see the difference between one-inflation models and zero-inflation models. Whereas the latter is also a power series mixture, and hence Chao's conventional estimator is also a lower bound for zero-inflation models, one-inflation models are not in the family of power series mixtures, and hence Chao's estimator is no longer a lower bound, as we have seen in the examples.
We expect that $\hat N_{\text{Chao-N}}$ and $\hat N_{CC}$ behave quite similarly. Indeed, there are only small differences in their values for all examples (see column 2 in Table 4). Nevertheless, we compared $\hat N_{\text{Chao-N}}$ and $\hat N_{CC}$ in a simulation study for a variety of scenarios. We look here at the setting of geometrically distributed counts with and without 20% one-inflation. The results are presented in Figure 8. Both estimators behave very similarly and are identical for larger population sizes above 1,000. For the smaller population sizes $\hat N_{\text{Chao-N}}$ seems to show benefits, in particular with respect to relative standard error. The graphs for Poisson counts with and without one-inflation look similar and are not presented here.

Discussion
We have focussed here on one-inflation as this appears to be the most relevant case in practice. Often in applications the occurrence of one-inflation can be well explained and interpreted. For example, in the case of family violence in the Netherlands, one-inflation might occur because many perpetrators might change their behavior after their first identification by the police. However, in principle, it is also possible to extend the approach to higher inflated counts such as two-inflation. To demonstrate this, it follows from Theorem 1 that
$$\frac{a_0}{a_1}\frac{m_1}{m_0} \le \frac{a_3}{a_4}\frac{m_4}{m_3}, \quad \text{or} \quad m_0 \ge \frac{a_0 a_4}{a_1 a_3}\frac{m_1 m_3}{m_4}.$$
Replacing the theoretical probabilities by their associated frequencies gives the lower bound. Also, a bound can be developed for the situation where there is inflation for both ones and twos. The ratio plot may be helpful again to gain insight into the form of inflation. However, the most practical case occurs with the inflation of counts of ones. In addition, these zero-truncated count distributions, as they arise in capture-recapture settings, often have very little information in the upper tail, which places a natural restriction on the types of higher inflated counts that can be considered.
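The two-inflation bound translates into an estimator that skips $f_2$ altogether. The sketch below is only an illustration of the plug-in step: the function name is hypothetical, the constant is $a_0 a_4/(a_1 a_3)$ (1 for a geometric kernel, 1/4 for a Poisson kernel), and the frequencies are those quoted in Example 1, used here purely as input numbers.

```python
def two_inflation_f0(f1, f3, f4, const=1.0):
    """Plug-in lower bound for f0 that avoids f2: const * f1 * f3 / f4.

    const = a0*a4/(a1*a3): 1 for a geometric kernel, 1/4 for a Poisson kernel.
    """
    return const * f1 * f3 / f4

# Frequencies f1, f3, f4 as quoted for the family violence data.
estimate = two_inflation_f0(15169, 393, 99, const=1.0)
print(f"{estimate:.1f}")
```

Because the bound depends on $f_4$, which is typically small in capture-recapture data, it inherits the instability of the upper tail mentioned in the text.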
One-inflation can occur in several ways. Here, we view the occurrence of ones as true ones, whether they arise from the power series mixture or as extra ones. For example, we imagine in the case of family violence that some of the perpetrators change their behavior after they have been identified by the police for the very first time, and then never re-occur in the police database. This might lead to extra ones in the sample. In any case, there is no doubt about the observed sample size $n$. Another scenario is the case where we think of the singletons as being misclassified, so that some of these might truly be doubletons or tripletons etc. In this case, the observed sample size of different units is overestimated and needs to be corrected, for example using $n - f_1 + \hat f_1$ as suggested in Chiu and Chao (2016). Which estimator to use will depend on the application at hand.

Multiplying both sides of (10) with $\pi$ gives
$$\pi m_0 \ge \frac{a_0 a_3^2}{a_2^3}\frac{(\pi m_2)^3}{(\pi m_3)^2},$$
which is the result of Theorem 2, as the one-inflation model (13) satisfies $m'_x = \pi m_x$ for $x \ne 1$.

Appendix 2
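Here we give some details on the bias reduction for the modified Chao estimator. We note that the target quantity is $E(f_2)^3/E(f_3)^2$. Using a Poisson assumption for $f_2$ with mean $\lambda_2 = E(f_2)$, we have $E(f_2^2) = \lambda_2^2 + \lambda_2$ and $E(f_2^3) = \lambda_2^3 + 3\lambda_2^2 + \lambda_2$. Using the Poisson assumption once more, we have that
$$E(f_2)^3 = E(f_2^3) + 2E(f_2) - 3E(f_2^2),$$
which can be validly estimated by $f_2^3 - 3f_2^2 + 2f_2 = f_2(f_2-1)(f_2-2)$. For the denominator we note that $E\left[\frac{1}{(f_3+1)(f_3+2)}\right]$ can be evaluated using the Poisson assumption as (with the abbreviations $f = f_3$ and $\lambda = E(f)$)
$$E\left[\frac{1}{(f+1)(f+2)}\right] = \sum_{x=0}^{\infty}\frac{1}{(x+1)(x+2)}\frac{e^{-\lambda}\lambda^x}{x!} = \frac{1}{\lambda^2}\sum_{x=0}^{\infty}\frac{e^{-\lambda}\lambda^{x+2}}{(x+2)!} = \frac{1 - e^{-\lambda} - \lambda e^{-\lambda}}{\lambda^2},$$
which is an excellent approximation of $\frac{1}{\lambda^2}$ if $\lambda \ge 5$ (see also Figure 9).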
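The Poisson identities used in the bias reduction can be checked by direct summation. The helper `poisson_expectation` and the truncation point `x_max` are assumptions of this sketch; the series is cut off where the remaining Poisson mass is negligible for the values of $\lambda$ tried.

```python
import math

def poisson_expectation(g, lam, x_max=200):
    """Approximate E[g(F)] for F ~ Poisson(lam) by summing x = 0..x_max."""
    term = math.exp(-lam)             # P(F = 0)
    total = 0.0
    for x in range(x_max + 1):
        total += g(x) * term
        term *= lam / (x + 1)         # P(F = x+1) from P(F = x)
    return total

lam = 5.0

# Numerator identity: E[F(F-1)(F-2)] = lam^3, so f(f-1)(f-2) is unbiased for E(f)^3.
third_factorial_moment = poisson_expectation(lambda x: x * (x - 1) * (x - 2), lam)

# Denominator identities used for (15) and (16).
inv_one = poisson_expectation(lambda x: 1 / (x + 1), lam)
inv_two = poisson_expectation(lambda x: 1 / ((x + 1) * (x + 2)), lam)

print(third_factorial_moment, lam ** 3)
print(inv_one, (1 - math.exp(-lam)) / lam)
print(inv_two, (1 - math.exp(-lam) - lam * math.exp(-lam)) / lam ** 2)
```

Each printed pair agrees up to floating-point error, which supports the closed forms; how close they are to $1/\lambda$ and $1/\lambda^2$ can be explored by varying `lam`.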