1 Introduction

The sterile insect technique (SIT) and the closely related incompatible insect technique (IIT) are methods used for biological control of pest insect populations. Conceived by Knipling (1955, 1959), SIT was first successfully applied in the eradication of screwworm fly from parts of North America in the 1950s–1970s. Since then, it has also been used to eradicate screwworm in many regions of the Americas and Libya (Vargas-Teran et al. 2005); melon fly in Okinawa (Ito et al. 2003); tsetse fly in Zanzibar (Vreysen et al. 2000); and the Mediterranean fruit fly in Mexico and Guatemala (Hendrichs et al. 1982). SIT is also seen as a potential means of reducing the incidence of mosquito-borne diseases such as malaria, dengue fever, Zika fever and yellow fever.

The standard mode of SIT control is the release of sterile males that have been mass-reared and sterilised. Advances in biotechnology have provided new methods, beyond the more traditional approach of ionising radiation, by which insects can be sterilised for mass release. For example, the use of interfering RNAs (Whyard et al. 2015) and the use of transgenic insects (Catteruccia et al. 2009) are newer technologies potentially available for SIT control programs. In recent years, endosymbiotic bacteria, such as Wolbachia (Zhang et al. 2015), have been used for the biological control of mosquito populations. Infecting male mosquitoes with Wolbachia is not sterilisation per se, since mating is still possible with Wolbachia-infected females, but where wildtype females do not contain Wolbachia, matings do not produce viable offspring because of a biological mechanism known as cytoplasmic incompatibility (Stouthamer et al. 1999). The use of incompatible (rather than sterile) males is referred to as IIT rather than SIT and is seeing use for the biological control of mosquitoes (Atyame et al. 2011; O’Connor et al. 2012). For simplicity herein, we shall adopt the terms incompatible males and incompatible matings as a blanket reference to sterile (SIT) or incompatible (IIT) males or their matings. We use the term wildtype to refer to males and females that are compatible.

Fundamental to the use of SIT/IIT control strategies are mathematical models that are used to predict the population dynamics of the target species and make decisions about the release numbers, times and locations of incompatible insects into the wild population. Models of SIT control began with the simple model of Knipling (1955, 1959) that built on a discrete-time model of geometric population growth. Barclay (2005) provides a thorough overview of the mathematical basis of a range of SIT control strategies. In more recent years, models of SIT control have moved beyond ordinary-differential (continuous-time) and difference (discrete-time) equations in a number of directions, including: spatio-temporal advection-diffusion-reaction models with multiple life stages (Dufourd and Dumont 2013), agent-based simulation models (de Almeida et al. 2010; Magori et al. 2009) and stochastic spatial models (Otero et al. 2008).

All such models, regardless of their complexity, rely on parameters that govern insect behaviour, life stages, and birth and death rates. One parameter that plays an important role in models ranging from that of Knipling (1955, 1959) to the more complex model of Dufourd and Dumont (2013) is the mating competitiveness of incompatible relative to wildtype male insects. In many cases, the act of sterilisation (e.g. radiation) or inducing incompatibility (e.g. Wolbachia-infection) renders a male insect less competitive for a conspecific female than a wildtype male (e.g. Helinski et al. 2009; Champion de Crespigny and Wedell 2006). The mating competitiveness parameter is therefore fundamentally important and must be quantified in order to determine the number of incompatible insects that must be reared and released to achieve a successful biocontrol program. The importance of estimating this parameter accurately and precisely was highlighted early on in the work of Berryman (1967). Berryman showed that under the original model of Knipling (1955, 1959) the number of incompatible males released into a wild population must exceed a threshold if the pest population is to be eradicated. Failure to at least meet this threshold will see the pest population continuing to grow geometrically. Assuming a 1:1 sex ratio in the natural population, this threshold is equal to \(S_t = \frac{M_t(\lambda - 1)}{c}\), where \(S_t\) is the number of incompatible males that must be released at time t to prevent the insect population from growing, \(M_t\) is the number of male insects at time t, \(\lambda \) is the number of offspring produced by a female with each time-step and c (lowercase) is the mating competitiveness coefficient of the incompatible males, which quantifies the relative success of an incompatible male at mating a female over a wildtype male. A value of \(c = 1\) corresponds to incompatible and wildtype males having equal mating competitiveness; \(c < 1\) to incompatible males being less competitive for mates; and \(c > 1\) to incompatible males being more competitive than wildtypes.
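As a rough numerical illustration of Berryman's threshold, the Python sketch below evaluates \(S_t\) for hypothetical inputs. The function names and the one-step update are assumptions made for illustration: the update is a Knipling-type recursion in which only the fraction \(M_t/(M_t + cS_t)\) of matings is fertile, which is consistent with the threshold above but is not spelled out in the text.

```python
def release_threshold(wild_males, growth_rate, competitiveness):
    """Threshold S_t = M_t * (lambda - 1) / c on incompatible male releases."""
    if competitiveness <= 0:
        raise ValueError("competitiveness must be positive")
    return wild_males * (growth_rate - 1.0) / competitiveness


def next_generation_males(wild_males, released, growth_rate, competitiveness):
    """One step of an assumed Knipling-type model (illustrative only).

    Geometric growth is scaled by the expected fraction of fertile matings,
    M / (M + c * S).
    """
    fertile_fraction = wild_males / (wild_males + competitiveness * released)
    return growth_rate * wild_males * fertile_fraction


# Hypothetical inputs: 1000 wild males, lambda = 5, c = 0.5.
threshold = release_threshold(1000, 5.0, 0.5)
print(threshold)
print(next_generation_males(1000, threshold, 5.0, 0.5))
```

Under these hypothetical inputs, halving c doubles the number of incompatible males that must be released, which is why imprecise estimates of c translate directly into uncertainty about rearing requirements.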

Typically, experimentalists quantify the mating competitiveness of incompatible males through Fried’s Competitiveness Index (C), introduced by Fried (1971). Some recent examples of its use include Maïga et al. (2014), McInnis et al. (2011), Oliva et al. (2012) and Segoli et al. (2014), and it is presented as the standard method for assessing competitiveness in Vreysen (2005). Ordinarily, Fried’s Index would be estimated by conducting three experiments involving: (i) wildtype females and males (compatible matings); (ii) wildtype females and incompatible males (incompatible matings); and (iii) a mixture of wildtype females, wildtype males and incompatible males (mixed matings). Each experiment allows calculation of the percentage of hatched eggs, and these are used in the formula for Fried’s Index to quantify mating competitiveness. One popular experimental procedure, such as that used by Segoli et al. (2014), is to recover individual females from mating cages and isolate them so that the number of eggs laid and hatched is known for each mosquito. This particular experimental procedure is the focus of this work, but we note that in some experimental contexts, the isolation of females prior to egg-laying may not be feasible (e.g. Maïga et al. 2014).

The purpose of this study is to demonstrate that Fried’s Index can be more precisely estimated using only mixed mating cages, that is, (iii) of the three experiments described above. Information from compatible mating cages and incompatible mating cages can be used to supplement the analysis if desired, but is not necessary. Firstly, we show that Fried’s Index (uppercase C) is mathematically equivalent to the mating competitiveness coefficient (lowercase c) that features in the SIT models of Knipling (1955, 1959) and others. Secondly, using this insight, we outline a new approach for estimating this parameter from cage experiments and provide R code to enable researchers to easily adopt the method. The benefits of this new approach to estimating C (equivalently c) are numerous. The main benefit is that cage experiments need only employ mixtures of wildtype females, wildtype males and incompatible males, whereas the traditional approach also requires cages containing only compatible matings and cages containing only incompatible matings. Using the proposed method with only mixed cages leads to improved precision and accuracy of the estimator for C. A second important benefit is that this new analysis can be performed with data from a single experiment involving a mixed population, making it particularly useful where either: the number of cages (e.g. with large semi-field cages of the type employed in Segoli et al. 2014) is an experimental constraint, or competitiveness is to be assessed solely through a field release. Thirdly, the method allows for easy computation of Bayesian credible intervals to assess the precision of the estimate of C (equivalently c). Finally, the method also provides estimates and Bayesian credible intervals on the proportions of eggs hatching from wildtype and incompatible matings.

2 Equivalence of Fried’s Competitiveness Index and Knipling’s Mating Competitiveness Coefficient

Fried’s Competitiveness Index (Fried 1971), also known as Haisch’s Competitiveness Index (Haisch 1970), is a simple measure for quantifying the mating competitiveness of incompatible males compared to wildtype males. It is often reported in mating experiments involving wildtype females and mixtures of incompatible and wildtype males. Fried’s original calculations were devised as a simple method for determining the ratio of incompatible to wildtype males that would be required to obtain a given egg hatch rate. He noted that this formula could also be rescaled to provide an index of competitiveness that was independent of the incompatible male to wildtype male ratio used in an experiment, therefore allowing comparison between experiments designed using different ratios.

In his original 1971 paper, Fried derived the index by starting with the assumption that the expected proportion of eggs that hatch, \(H_m\), from a mixture of S incompatible and W wildtype insects is given by

$$\begin{aligned} H_m = \frac{WH_w}{W + S} + \frac{S H_s}{W + S}, \end{aligned}$$
(1)

where the quantities \(H_w\) and \(H_s\) are the expected proportions of eggs to hatch when exposing wildtype females to wildtype males and incompatible males, respectively. Underpinning Eq. (1) is the assumption of equal mating competitiveness of wildtype and incompatible males. In practice, \(H_s\) is often not simply equal to zero, because sterilisation/incompatibility may not be complete. For our purposes, it is more natural to start from a position of unequal competitiveness, and we rewrite the previous equation as \(H_m = \frac{WH_w}{W + cS} + \frac{cS H_s}{W + cS}\), where c is the mating competitiveness coefficient. In a probabilistic setting, the parameter c can be thought of as the odds, \(p_s/(1-p_s)\), of a female being mated by an incompatible male rather than a wildtype male when exposed to wildtype and incompatible males in a 1:1 mixture (i.e. \(W = S\)), where \(p_s\) is the probability that a mated female was mated by an incompatible male rather than a wildtype male.

To see this, note that \(H_m\) is a weighted average of the hatch rates \(H_w\) and \(H_s\), with weights \(W/(W + cS)\) and \(cS/(W + cS)\), respectively. We now see that c simply weights the number of incompatible males in terms of their wildtype equivalent (i.e. for every wildtype mating, we expect c incompatible matings). In other words, when \(W=S\), the ratio of wildtype to incompatible matings is expected to be 1:c, with the expected proportion of wildtype matings being \((1 + c)^{-1}\) and the expected proportion of incompatible matings being \(c(1 + c)^{-1}\). Consequently, the mating odds \(p_s/(1-p_s)\) in a 1:1 mixture are equal to c.

Rearranging the above equation to obtain it in terms of c yields

$$\begin{aligned} c = \frac{W(H_w - H_m)}{S(H_m - H_s)}. \end{aligned}$$
(2)

If sterilisation or the cause of incompatibility (e.g. presence of Wolbachia) reduces the mating competitiveness of males, then we expect \(c < 1\), whereas if sterilisation/incompatibility has no effect, we expect \(c=1\). If, for example, in a 1:1 mixture of incompatible and wildtype males, the proportion of eggs hatching was reduced by 10% relative to wildtype matings (i.e. \(H_m = 0.9H_w\)), and incompatibility is complete (i.e. \(H_s = 0\)), then we would estimate \(c = 0.1H_w/(0.9H_w) = 1/9\).
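The worked example can be reproduced with a short Python sketch of Eq. (2) and of the unequal-competitiveness form of Eq. (1). The function names are ours, and the wildtype hatch rate of 0.8 is a hypothetical value chosen only to make the arithmetic concrete.

```python
def mixed_hatch_rate(W, S, H_w, H_s, c):
    """Expected hatch proportion in a mixed cage: weighted average of H_w and H_s."""
    weight_wildtype = W / (W + c * S)
    return weight_wildtype * H_w + (1.0 - weight_wildtype) * H_s


def fried_index(W, S, H_w, H_s, H_m):
    """Fried's Index, Eq. (2): c = W (H_w - H_m) / {S (H_m - H_s)}."""
    if H_m == H_s:
        raise ValueError("H_m must differ from H_s")
    return W * (H_w - H_m) / (S * (H_m - H_s))


# 1:1 mixture, complete incompatibility, hatch rate reduced by 10% (0.8 -> 0.72).
c_hat = fried_index(W=15, S=15, H_w=0.8, H_s=0.0, H_m=0.72)
print(c_hat)                                      # approximately 1/9
print(mixed_hatch_rate(15, 15, 0.8, 0.0, c_hat))  # recovers approximately 0.72
```

Feeding the estimate back through the weighted average recovers the mixed hatch rate, which confirms that Eq. (2) is simply the inverse of the hatch-rate relationship.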

Equation (2) is exactly the formula for Fried’s Index as given in Fried (1971), albeit derived through a slightly different lens. What is important is the interpretation of the index, which we can now see is equal to the odds of a female being mated by an incompatible male over a wildtype male when exposed to both in equal numbers. This shows that Fried’s Index has a physical (or biological) interpretation identical to that of the mating competitiveness coefficients in the SIT models of Berryman (1967), Bogyo et al. (1971), Ito (1977) and Barclay (1982), as summarised in Box 4 of Barclay (2005). Herein, we will adopt the lowercase c and the reader should interpret this as corresponding to both Fried’s Index and the mating competitiveness coefficient. As we shall see, this physical interpretation of Fried’s Index underpins our new approach of estimation using only mixed mating cages.

3 Mating Competitiveness Experiments

Mating competitiveness experiments are routinely used to calculate Fried’s Index when assessing the effectiveness of incompatible insects for use in SIT control programs. The basic protocol is to have cages containing wildtype females subjected to three treatments (usually with replicates of each). For the first treatment, wildtype males are introduced to the cage; for the second, incompatible males are introduced; and for the third, a mixture of incompatible and wildtype males is introduced. Once adequate time for mating has transpired, eggs from each of the cages are collected and counted. These are then left to hatch, and the number of hatched eggs is counted. The proportions of hatched eggs from cages of the first, second and third treatments correspond to \(H_w\), \(H_s\) and \(H_m\), respectively, in the formula for Fried’s Index (i.e. Eq. (2)).

Upon computing Fried’s Index, the variance of the estimator can be approximated using the Delta method, similar to the approach of Iwahashi et al. (1983). For cage experiments, there are two points of departure from the assumptions of Iwahashi et al. (1983): (i) the ratio W : S is assumed to be known; and (ii) \(H_s\) is not assumed equal to zero because sterilisation/incompatibility may not be perfectly achieved. Under these two modifications, the Delta method approximation of the variance is

$$\begin{aligned} \text {Var}(c)&= \left( \frac{\partial c}{\partial H_w} \right) ^2 \text {Var}(H_w) + \left( \frac{\partial c}{\partial H_s} \right) ^2 \text {Var}(H_s) + \left( \frac{\partial c}{\partial H_m} \right) ^2 \text {Var}(H_m) \\&\quad + 2\left\{ \frac{\partial c}{\partial H_w} \frac{\partial c}{\partial H_s} \text {Cov}(H_w, H_s) + \frac{\partial c}{\partial H_w} \frac{\partial c}{\partial H_m} \text {Cov}(H_w, H_m)\right. \\&\quad \left. +\, \frac{\partial c}{\partial H_s} \frac{\partial c}{\partial H_m} \text {Cov}(H_s, H_m) \right\} . \end{aligned}$$

The covariance terms in the above are all equal to zero, since \(H_w\), \(H_s\) and \(H_m\) are statistically independent on the grounds that they are calculated using separate cages under the traditional experimental approach. The partial derivatives for the calculation above are

$$\begin{aligned} \frac{\partial c}{\partial H_w}&= \frac{W}{S(H_m - H_s)}, \\ \frac{\partial c}{\partial H_m}&= \frac{W(H_s - H_w)}{S(H_m - H_s)^2},\quad \text {and} \\ \frac{\partial c}{\partial H_s}&= \frac{W(H_w - H_m)}{S(H_m - H_s)^2}. \end{aligned}$$
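The partial derivatives can be checked against finite differences of Eq. (2), and then combined into a first-order Delta method variance. The Python sketch below is illustrative only: the function names are ours, the covariances are taken as zero, and the partial derivatives are used exactly as written above (they already carry the factor \(W/S\), so the sketch applies no further scaling).

```python
def fried_index(W, S, H_w, H_s, H_m):
    """Fried's Index, Eq. (2)."""
    return W * (H_w - H_m) / (S * (H_m - H_s))


def fried_index_gradient(W, S, H_w, H_s, H_m):
    """Partial derivatives of c with respect to (H_w, H_s, H_m)."""
    gap = H_m - H_s
    d_w = W / (S * gap)
    d_s = W * (H_w - H_m) / (S * gap ** 2)
    d_m = W * (H_s - H_w) / (S * gap ** 2)
    return d_w, d_s, d_m


def delta_variance(W, S, H_w, H_s, H_m, var_w, var_s, var_m):
    """First-order Delta method variance of c, assuming zero covariances."""
    d_w, d_s, d_m = fried_index_gradient(W, S, H_w, H_s, H_m)
    return d_w ** 2 * var_w + d_s ** 2 * var_s + d_m ** 2 * var_m


# Hypothetical hatch rates and variances, for illustration only.
print(fried_index_gradient(15, 15, 0.8, 0.05, 0.5))
print(delta_variance(15, 15, 0.8, 0.05, 0.5, 0.001, 0.001, 0.002))
```

Because \(H_m - H_s\) appears squared in two of the denominators, the variance grows rapidly when the mixed and incompatible hatch rates are close, which is one reason the traditional estimator can be imprecise.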

What remains is to compute the variances of \(H_w\), \(H_s\) and \(H_m\). Cage experiments differ substantially from the field-study setting considered by Iwahashi et al. (1983). In the cage experiments, mosquitoes are blocked by cages, with each cage introducing a random effect. We use generalised linear mixed models to estimate \(\mu _x = \text {logit}(H_x)\) and \(\text {Var}(\mu _x)\), where \(x \in \{ w, s, m \}\) indicates the type of mating, and \(\text {logit}(H_x) = \log \{H_x/(1 - H_x)\}\). The generalised linear mixed model used to estimate \(\mu _x\) and \(\text {Var}(\mu _x)\) is simply an intercept model with a random effect for each cage, where the observed data for each mosquito are modelled as following a Binomial distribution (eggs hatched from eggs laid) with the canonical logit link-function:

$$\begin{aligned} Y_{x,i,j}&\sim \text {Binomial}(n_{x,i,j}, p_{x,i}), \\ \text {logit}(p_{x,i})&= \mu _x + \epsilon _i, \quad \epsilon _i \sim N(0, \sigma ^2_{\epsilon _x}). \end{aligned}$$

Here, \(Y_{x,i,j}\) is the number of eggs hatched from \(n_{x,i,j}\) laid for the jth mosquito in the ith cage of mating type x; \(\mu _x\) is the intercept which corresponds to the logit-transformed hatch rate \(H_x\); and \(\epsilon _i\) is the random effect of the ith cage. The model assumes that the probability of an egg hatching does not vary as a function of the number of eggs laid.

We construct three such models (indicated by the subscript x), each of which is built using only cages involving compatible matings, incompatible matings or mixed mating cages. Inference was performed using the lme4 R package (Bates et al. 2015) and yielded estimates of \(\mu _x\) and \(\text {Var}(\mu _x)\). The parameters of interest are recovered as \(H_x = \text {logit}^{-1}(\mu _x)\) and, using the Delta method again with \(\partial H_x/\partial \mu _x = H_x(1 - H_x)\):

$$\begin{aligned} \text {Var}(H_x) = \sigma ^2_x \{H_x(1 - H_x)\}^2, \end{aligned}$$

where \(\sigma _x\) is the standard error of \(\mu _x\) and

$$\begin{aligned} \text {logit}^{-1}(\mu _x) = \frac{e^{\mu _x}}{1 + e^{\mu _x}}. \end{aligned}$$
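The back-transformation from the logit scale can be sketched in a few lines of Python. This is an illustration rather than the authors' R workflow: the function names are ours, the inputs are hypothetical, and the variance uses the first-order Delta method, in which the standard error of \(\mu _x\) is scaled by the derivative \(H_x(1 - H_x)\) of the inverse logit.

```python
import math


def inv_logit(mu):
    """Inverse logit, e^mu / (1 + e^mu), written in an equivalent form."""
    return 1.0 / (1.0 + math.exp(-mu))


def hatch_rate_and_variance(mu, se_mu):
    """Return H = logit^{-1}(mu) and its first-order Delta method variance."""
    hatch_rate = inv_logit(mu)
    slope = hatch_rate * (1.0 - hatch_rate)  # derivative of inv_logit at mu
    return hatch_rate, (slope * se_mu) ** 2


# Hypothetical intercept and standard error from a fitted mixed model.
print(hatch_rate_and_variance(1.386, 0.2))
```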

4 Estimating Fried’s Index from a Single, Mixed Population Cage Experiment

In the previous section, we presented a statistical procedure for estimating the variance (and standard error) of the traditionally used estimator for c that is given in Eq. (2). In this section, we outline an alternative method for estimating c using a two-component binomial mixture model. This new approach is built on the connection between Fried’s Index and the mating competitiveness coefficient in population models, which was outlined in Sect. 2.

Let \(S_e\) and \(W_e\) be the number of incompatible and wildtype males used in the experiment, respectively. In addition, let \(\pi _s\) be the probability that a female is mated by an incompatible male and \((1 - \pi _s)\) the probability that a female is wildtype-mated. Under the assumption that matings are proportional to the numbers of males of each type, we have

$$\begin{aligned} \pi _s = (cS_e)/(cS_e + W_e), \end{aligned}$$
(3)

and rearranging for c yields

$$\begin{aligned} c = (\pi _s W_e)/\{(1 - \pi _s)S_e\}. \end{aligned}$$
(4)
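Equations (3) and (4) are inverses of one another, which a short Python sketch makes concrete. The function names are ours and the inputs are hypothetical.

```python
def incompatible_mating_probability(c, S_e, W_e):
    """Eq. (3): probability that a female is mated by an incompatible male."""
    return c * S_e / (c * S_e + W_e)


def competitiveness_from_probability(pi_s, S_e, W_e):
    """Eq. (4): competitiveness coefficient implied by pi_s."""
    if not 0.0 <= pi_s < 1.0:
        raise ValueError("pi_s must lie in [0, 1)")
    return pi_s * W_e / ((1.0 - pi_s) * S_e)


# With c = 0.5 and equal numbers of males, one third of matings are incompatible.
pi_s = incompatible_mating_probability(0.5, S_e=15, W_e=15)
print(pi_s)
print(competitiveness_from_probability(pi_s, S_e=15, W_e=15))
```

Since Eq. (4) is a monotone increasing function of \(\pi _s\), any quantile of \(\pi _s\) maps directly to the same quantile of c.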

Ultimately, this shows that if we had an inferential framework through which \(\pi _s\) could be estimated, the mating competitiveness coefficient (Fried’s Index) could, in theory, be estimated from a single mating experiment involving only a mixture of incompatible and wildtype males, rather than by conducting separate mating experiments to estimate \(H_w\), \(H_s\) and \(H_m\) and then using these in Eq. (2). Of course, replication of an experiment is likely to yield more defensible results that are less likely to be affected by extraneous variation (e.g. random effects acting at the cage level).

In practice, insects can be collected from the mixed population prior to egg-laying, allowing the number of eggs laid and numbers hatched to be counted for each female. This is the experimental approach adopted by Segoli et al. (2014) for example. However, whether each female was mated by a wildtype or incompatible male is unknown to the experimentalist, making the task of estimating \(\pi _s\) a challenge. When sterilisation is extremely effective, these latent variables (i.e. whether each female was wildtype or incompatible mated) might be assumed to be obvious based on the hatch rate for each female. However, making this assumption can be criticised for: (i) ignoring that low hatch rates are also possible (although less likely) for wildtype matings; and (ii) subjectivity, particularly when treatments (e.g. sterilisation methods) result in partial incompatibility and hatch rates only marginally lower than from wildtype matings. Finite mixture models (see McLachlan and Peel 2000) provide an effective means by which one might estimate \(\pi _s\) from a single experiment despite not knowing which type of mating each female underwent.

Let \((Y_1, N_1, Z_1), \ldots , (Y_n, N_n, Z_n)\) be the data obtained from n female mosquitoes, where \(N_i > 0\) is the number of eggs laid by the ith female and \(Y_i \le N_i\) is the number of hatched eggs from the ith female. \(Z_i\) is a latent (unobserved) indicator variable that takes the value 1 if the ith female was mated by an incompatible male and 0 otherwise. For each female, we can model the number of hatched eggs, given the number of eggs and its latent variable for the mating type as

$$\begin{aligned} Y_i | Z_i, N_i = n_i \sim {\left\{ \begin{array}{ll} \text {Binomial}(n_i, \gamma _w) \quad (Z_i = 0), \\ \text {Binomial}(n_i, \gamma _s) \qquad (Z_i = 1). \end{array}\right. } \end{aligned}$$

Here, \(\gamma _w\) and \(\gamma _s\) are the probabilities of eggs hatching for females mated by wildtype and incompatible males, respectively. This model assumes that the probability of an egg hatching does not vary as a function of the number of eggs laid. In addition, we model the latent variables as \(Z_i \sim \text {Bernoulli}(\pi _s)\). Integrating out the latent variable \(Z_i\) yields a two-component binomial mixture model (BMM) for \(Y_i | N_i\) (the number of eggs hatched given the number of eggs) with probability mass function

$$\begin{aligned} f_{Y_i | N_i}(y_i | n_i) = \pi _s \left( {\begin{array}{c}n_i\\ y_i\end{array}}\right) \gamma _s^{y_i}(1 - \gamma _s)^{n_i - y_i} + (1 - \pi _s)\left( {\begin{array}{c}n_i\\ y_i\end{array}}\right) \gamma _w^{y_i}(1 - \gamma _w)^{n_i - y_i}. \end{aligned}$$
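The mixture probability mass function and the resulting log-likelihood can be written directly in Python. This is a minimal sketch with function names of our choosing; it is not the Stan model used for inference, and it makes no attempt to guard against numerical underflow for very large egg counts.

```python
import math


def binomial_pmf(y, n, p):
    """Binomial probability of y hatched eggs out of n laid."""
    return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)


def mixture_pmf(y, n, pi_s, gamma_s, gamma_w):
    """Two-component binomial mixture for one female."""
    return (pi_s * binomial_pmf(y, n, gamma_s)
            + (1.0 - pi_s) * binomial_pmf(y, n, gamma_w))


def log_likelihood(data, pi_s, gamma_s, gamma_w):
    """Sum of log mixture probabilities over (hatched, laid) pairs."""
    return sum(math.log(mixture_pmf(y, n, pi_s, gamma_s, gamma_w))
               for y, n in data)


# Hypothetical data: two high-hatch females and one low-hatch female.
data = [(24, 30), (26, 31), (1, 29)]
print(log_likelihood(data, pi_s=0.3, gamma_s=0.05, gamma_w=0.8))
```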

Importantly, the above model is applicable to both monogamous and polygamous female insects, since the numbers of eggs hatching from an individual female can be a mixture of eggs fertilised by incompatible and wildtype males. In Aedes mosquito populations, females are generally considered monogamous, but multiple matings have been observed (Williams and Berger 1980; Young and Downe 1982; Ritchie et al. 2013; Oliva et al. 2014). Fitting this model to experimental data is facilitated by adopting a Bayesian statistical framework, and we refer the unfamiliar reader to Bolstad and Curran (2017) for further details of the Bayesian statistical approach. In a Bayesian statistical framework, we place prior distributions on all parameters of our model to capture our prior knowledge of these quantities. Initially, we might proceed by placing uninformative, uniform prior distributions on the parameters of our model (recall these parameters all correspond to probabilities and are therefore in the range [0, 1]) as follows:

$$\begin{aligned} \gamma _w&\sim \text {Uniform}(0, 1), \\ \gamma _s&\sim \text {Uniform}(0,1), \quad \text {and}\\ \pi _s&\sim \text {Uniform}(0,1). \end{aligned}$$

If data from control cages (i.e. cages that do not contain a mixture of incompatible and wildtype males) were available, these could be used to construct more informative prior distributions on the hatch probabilities \(\gamma _w\) and \(\gamma _s\); otherwise, uninformative uniform distributions such as those above are appropriate. Such a model can be easily fitted using Stan (Stan Development Team 2016a; Carpenter et al. 2017), a modelling language for building Bayesian models and undertaking statistical inference. Stan uses a sampling algorithm called the No-U-Turn sampler (Hoffman and Gelman 2014) to obtain samples from the posterior distributions for the model parameters above. The posterior distributions are the probability distributions for the model parameters after having observed the experimental data and can be used to create Bayesian credible intervals that quantify our uncertainty (or certainty) about the model parameters and quantities derived from these parameters, such as c.

A fully functional R package for undertaking this Bayesian statistical analysis of mating experiments is provided by the authors (see the link provided in the Discussion). The package is intended to greatly simplify the analysis of mating experiments using the RStan package (Stan Development Team 2016b). Substituting the samples of \(\pi _s\) from the posterior distribution into Eq. (4) provides samples from the posterior distribution for c. This procedure allows direct estimation of the posterior distribution of c and the precision with which it has been estimated.
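To convey the idea without Stan, the Python sketch below approximates the marginal posterior of \(\pi _s\) on a grid under the uniform priors and then maps its quantiles to c through Eq. (4). It is a stand-in for, not a reproduction of, the authors' R package or the No-U-Turn sampler. Two assumptions are ours: the constraint \(\gamma _s < \gamma _w\), imposed so that the two mixture components keep their labels, and the example counts, which are hypothetical.

```python
import math


def grid_posterior_pi(data, grid_size=40):
    """Grid approximation to the marginal posterior of pi_s.

    Uses flat priors on a midpoint grid and sums over all
    (gamma_s, gamma_w) pairs with gamma_s < gamma_w (our labelling assumption).
    """
    grid = [(k + 0.5) / grid_size for k in range(grid_size)]
    # Binomial likelihood of every female's data at each candidate hatch probability.
    pmf = [[math.comb(n, y) * g ** y * (1.0 - g) ** (n - y) for y, n in data]
           for g in grid]
    weights = [0.0] * grid_size
    for s in range(grid_size):
        for w in range(s + 1, grid_size):
            for k, pi_s in enumerate(grid):
                log_lik = sum(math.log(pi_s * a + (1.0 - pi_s) * b)
                              for a, b in zip(pmf[s], pmf[w]))
                weights[k] += math.exp(log_lik)
    total = sum(weights)
    return grid, [wt / total for wt in weights]


def posterior_quantile_c(grid, probs, q, S_e, W_e):
    """Quantile of c: Eq. (4) is monotone in pi_s, so quantiles carry over."""
    cumulative = 0.0
    for pi_s, p in zip(grid, probs):
        cumulative += p
        if cumulative >= q:
            return pi_s * W_e / ((1.0 - pi_s) * S_e)
    return grid[-1] * W_e / ((1.0 - grid[-1]) * S_e)


# Hypothetical (hatched, laid) counts for 12 females from one mixed cage.
example = [(24, 30), (25, 31), (23, 29), (26, 30), (24, 28), (22, 30),
           (25, 32), (23, 30), (2, 30), (1, 28), (3, 31), (0, 29)]
grid, probs = grid_posterior_pi(example, grid_size=20)
median_c = posterior_quantile_c(grid, probs, 0.5, S_e=15, W_e=15)
interval_c = (posterior_quantile_c(grid, probs, 0.025, S_e=15, W_e=15),
              posterior_quantile_c(grid, probs, 0.975, S_e=15, W_e=15))
print(median_c, interval_c)
```

A grid is only practical here because the model has three parameters; the resolution of the resulting interval is limited by the grid spacing, which is one reason a sampler such as Stan's is preferable in practice.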

5 Replication of the Mixed Population Cage Experiment

Experimental replication is good practice and helps to account for the influence of extraneous variation (random effects) on the response variable being measured. Whilst the method we have outlined above can be applied to a single cage experiment, the practitioner may wish to replicate the experiment using several cages to improve confidence in the results. To allow for multiple replicates of the experiment, we introduce an additional tier in our hierarchical model and introduce \(\pi _{s,j}\) as the probability of a sterile mating in the jth replicate, with

$$\begin{aligned} \pi _{s,j}&\sim \text {Beta} \left\{ \alpha , \frac{\alpha (1 - \pi _s)}{\pi _s} \right\} , \\ \pi _s&\sim \text {Uniform}(0, 1), \quad \text {and} \\ \alpha&\sim \text {Exponential}(0.001). \end{aligned}$$

In the above hierarchy, \(\pi _{s,j}\) is modelled to follow the beta distribution with shape parameters \(\alpha > 0\) and \(\alpha (1 - \pi _s)/\pi _s > 0\). With these two shape parameters, the mean of the beta distribution is \(\pi _s\). The prior placed on \(\alpha \) is an exponential distribution. Specifically, we assigned a value of 0.001 to the rate parameter of the exponential prior such that the prior variance is equal to \(10^6\). In such a setting, the prior is diffuse and has little subjective impact on parameter inferences. The additional hierarchy introduced for \(\pi _{s,j}\) allows for some additional variability in the sterile mating probabilities between cages as a result of random effects beyond the control of the experimentalist.
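The parameterisation of the beta distribution can be verified numerically. In the Python sketch below, the function names, the value of \(\alpha \) and the seed are illustrative assumptions; the sketch confirms that the mean of the cage-level distribution is \(\pi _s\) and that an exponential prior with rate 0.001 has variance \(10^6\).

```python
import random


def beta_shapes(pi_s, alpha):
    """Shape parameters (alpha, alpha * (1 - pi_s) / pi_s) of the cage-level beta."""
    return alpha, alpha * (1.0 - pi_s) / pi_s


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def exponential_variance(rate):
    """Variance of an exponential distribution with the given rate."""
    return 1.0 / rate ** 2


def draw_cage_probabilities(pi_s, alpha, n_cages, seed=0):
    """Simulate cage-level incompatible mating probabilities pi_{s,j}."""
    rng = random.Random(seed)
    a, b = beta_shapes(pi_s, alpha)
    return [rng.betavariate(a, b) for _ in range(n_cages)]


a, b = beta_shapes(pi_s=0.3, alpha=5.0)
print(beta_mean(a, b))              # equals pi_s up to rounding
print(exponential_variance(0.001))  # approximately 1e6
print(draw_cage_probabilities(0.3, 5.0, n_cages=9))
```

Larger values of \(\alpha \) concentrate the cage-level probabilities around \(\pi _s\), so \(\alpha \) acts as a precision parameter for between-cage variation.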

6 Simulation Studies

We demonstrate that using the BMM yields greater precision in estimating Fried’s Index than the traditional approach by analysing synthetic data from a series of simulation experiments. For each of the simulation experiments we used: (i) nine cages containing a mixture of wildtype and sterilised males with wildtype females for analysis by the BMM method (mixed mating); and (ii) three cages of compatible mating, three cages of incompatible mating and three cages of mixed mating for analysis by the traditional method. Each simulated cage experiment assumed an experimental setup similar to Segoli et al. (2014), whereby each cage contained 20 females and 30 males. In mixed mating cages, we assumed 15 sterile and 15 wildtype males.

In these experiments, we allowed c to vary between 0.25 and 2.0 and the hatch rate of incompatible mated females (\(\gamma _s\)) to vary between 50 and 1% of the wildtype hatch probability (assumed to be \(\gamma _w = 0.8\)). We also assumed that there is no death of individual mosquitoes and that all females were mated. In cages with incompatible or compatible mating only, the male mate type is known with certainty, whilst in the mixed mating cages, the probability of an incompatible mating is calculated using Eq. (3) with \(S_e = 15\) and \(W_e = 15\), and the mate type is randomly generated for each female. Each mated female is then modelled as laying a random number of eggs, drawn from a Poisson distribution with a mean of 30. Whether each egg then hatched was a random event where the probability that it hatched was \(\gamma _s\) for matings with incompatible males and \(\gamma _w\) for matings with wildtype males. For each simulated experiment, the synthetic experimental data consisted of the number of eggs laid and number of eggs hatched for each female in each cage type used, and these data were used to estimate Fried’s Index under the BMM and the traditional approach. For the BMM, the estimate of Fried’s Index was taken to be the mean of the posterior distribution, whilst for the traditional approach, the estimate was calculated by plugging the proportions of eggs hatching in each of the three cage types (compatible, incompatible and mixed) into Eq. (2).
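The data-generating process described above can be sketched in Python as follows. This is our reading of the simulation design rather than the authors' code: the function names, seeds and the hand-written Poisson sampler (Knuth's multiplication method, used because the standard library has no Poisson generator) are assumptions made for illustration.

```python
import math
import random


def poisson(rng, mean):
    """Knuth's multiplication method; adequate for modest means such as 30."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_cage(pi_s, gamma_s, gamma_w=0.8, n_females=20, mean_eggs=30, seed=0):
    """Return (hatched, laid) per female; pi_s is 0, 1 or Eq. (3) by cage type."""
    rng = random.Random(seed)
    records = []
    for _ in range(n_females):
        incompatible = rng.random() < pi_s  # latent mate type Z_i
        hatch_prob = gamma_s if incompatible else gamma_w
        laid = poisson(rng, mean_eggs)
        hatched = sum(rng.random() < hatch_prob for _ in range(laid))
        records.append((hatched, laid))
    return records


def pooled_hatch_rate(records):
    """Proportion of all eggs in a cage that hatched."""
    return sum(y for y, _ in records) / sum(n for _, n in records)


c, S_e, W_e = 0.5, 15, 15
pi_mixed = c * S_e / (c * S_e + W_e)  # Eq. (3)
mixed = simulate_cage(pi_mixed, gamma_s=0.08, seed=1)
print(mixed[:3], pooled_hatch_rate(mixed))
```

Setting `pi_s` to 0 or 1 reproduces the compatible and incompatible control cages, so the same function can generate all three cage types needed for the traditional plug-in estimate.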

Regardless of whether the traditional design and analysis, or a design consisting of only mixed cages with analysis via the BMM was used, all simulated experiments required identical experimental resources: 9 cages, 180 wildtype female mosquitoes, 135 incompatible males and 135 wildtype males. As seen in Fig. 1, the designs consisting of nine mixed cages with analysis via the BMM exhibit a marked improvement in precision when estimating Fried’s Index. It is clear that over a wide range of Fried’s Index and incompatible hatch rates, the BMM approach offers a notable gain in information about c.

Fig. 1

Distributions of estimates of Fried’s Index (c) using 1000 simulated experiments for different combinations of c and the effectiveness of incompatibility (through \(\gamma _s\)). The distribution of estimates obtained using only mixed cages with data analysed by the BMM approach is shown in green, and that for the traditional design and analysis is shown in blue. The red vertical line shows the true value of Fried’s Index used to generate the data in each case (Color figure online).

7 Application: Aedes aegypti Cage Experiments

To demonstrate our method and its usefulness with real data, we apply it to the analysis of mating competitiveness experiments of Wolbachia-infected Ae. aegypti performed by Segoli et al. (2014). The data we use consist of 29 mating experiments conducted in tents, of which 18 tents contained a mixture of wildtype males, incompatible males and wildtype females, five tents contained wildtype males and females (compatible matings), and six tents contained incompatible males and wildtype females (incompatible matings). We use these data to carry out two sets of analyses. Firstly, we apply both the BMM and the traditional method to the full data to obtain estimates of Fried’s Index and then compare the obtained estimates. Secondly, we use subsets of the tents to examine the reliability of our method and the traditional method when there are fewer replicates. In particular, we examine three datasets, each consisting of nine mating tents randomly selected from the full dataset. Where the BMM is used, each experiment uses nine randomly selected mixed mating tents, and where the traditional approach is used, each experiment uses a sample of three mixed mating tents, three compatible mating tents and three incompatible mating tents.

To first demonstrate that our method has merit with a real dataset, it was important to check that our new approach gives an estimate similar to that obtained from the traditional method for computing Fried’s Index. Figure 2 shows the estimates obtained from an analysis of the full dataset using both approaches. We note that the median of the posterior distribution under the BMM is nearly identical to the estimate of Fried’s Index obtained using the traditional approach. We are also able to easily quantify the uncertainty surrounding our estimate, as shown by the blue rectangles in this figure that span the interquartile range (dark blue) and the 2.5th to 97.5th percentiles (95% credible interval) for Fried’s Index from these data. We note that the 95% credible interval spans a much narrower range than the 95% confidence interval constructed from the quantities detailed in Sect. 3 as \(({\hat{c}} - 1.96 \sqrt{\text {Var}({\hat{c}})}, {\hat{c}} + 1.96 \sqrt{\text {Var}({\hat{c}})})\). It is also apparent from Fig. 2 that, unlike the confidence interval, the credible interval is asymmetric, which is perhaps a more reliable interval estimate, since zero is a lower bound for Fried’s Index. Clearly, the lower bound of the confidence interval under the traditional approach does not make physical sense.

Fig. 2

Estimates of Fried’s Index from experiments involving all 29 tents. “Trad.” corresponds to the traditional estimate of Fried’s Index, with the estimate shown as a black circle (grey horizontal line for ease of comparison between approaches). The 95% confidence interval for the traditional method is spanned by the light blue rectangle; “BMM” corresponds to the binomial mixture model estimate with median of the posterior distribution shown as a circle, the dark blue spans the 25th to 75th percentiles, and light blue extends to the 2.5th and 97.5th percentiles (Color figure online).

One of the strengths of the BMM for estimating Fried’s Index is that it does not rely on control cages of compatible and incompatible matings. Figure 3 demonstrates this using three datasets constructed from subsets of the full dataset. When Fried’s Index is computed from nine mixed population cages under the BMM, the 95% credible interval spans the traditional estimate from the full data in all three examples, and the medians of the posterior distributions lie consistently close to the corresponding grey line, indicating an accurate estimator. When a study uses three mixed cages, three incompatible cages and three compatible cages and applies the traditional approach, the estimates are reasonably close to the estimate from the full dataset, but the confidence intervals are very wide. In only one case does the 95% confidence interval include the grey line, which marks the estimate obtained from the complete dataset. With only mixed mating cages, the BMM therefore provides a more accurate point estimate and a more reliable interval estimate when only a small number of mating cages is used.

Fig. 3

Estimates of Fried’s Index from three smaller experiments constructed from subsets of the complete Segoli et al. (2014) dataset, one experiment per graph. As described in the text, each experiment uses nine mixed mating tents for the BMM and three mixed, three compatible and three incompatible mating tents for the traditional approach. “Trad.” corresponds to the traditional approach for computing Fried’s Index, with the estimate shown as a circle and the 95% confidence interval spanned by the blue rectangle. “BMM” corresponds to the binomial mixture model, with the median of the posterior distribution shown as a circle; dark blue spans the 25th to 75th percentiles, and light blue extends to the 2.5th and 97.5th percentiles. The grey line shows the value of Fried’s Index computed from the full dataset (all 29 tents) as per Fig. 2 (Color figure online).

As in Fig. 2, Fig. 3 highlights the realistic asymmetry of the credible intervals, which is typical when estimating strictly positive quantities. In all panels of Fig. 3, the confidence intervals extend into negative values. This again highlights a major deficiency of relying upon Eq. (2): it is possible, through random chance, to obtain estimates for which \(H_w < H_m\) and \(H_m > H_s\), which yields a negative value of Fried’s Index. A negative value does not make biological or physical sense and cannot be obtained using the BMM approach, which is further evidence that this new method provides an honest assessment of the uncertainty associated with estimating Fried’s Index from experimental data.
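A small numerical sketch shows how such a negative value can arise. It assumes that Eq. (2) has the usual form of Fried’s Index, \(c = (W/S)\,(H_w - H_m)/(H_m - H_s)\), where \(W\) and \(S\) denote the numbers of wildtype and incompatible males released into the mixed cage (these two symbols and the function name are our notation for this example). The hatch rates below are invented for illustration and are not taken from the Segoli et al. (2014) data.

```python
def fried_index(h_w, h_m, h_s, n_wild=1, n_incompat=1):
    """Traditional estimate of Fried's Index from three observed hatch rates.

    h_w: hatch rate in compatible (wildtype-only) control cages
    h_m: hatch rate in mixed cages
    h_s: hatch rate in incompatible control cages
    n_wild, n_incompat: numbers of wildtype and incompatible males in the
        mixed cage (assumed notation; a 1:1 ratio by default)
    """
    return (n_wild / n_incompat) * (h_w - h_m) / (h_m - h_s)


# Mixed-cage hatch rate lies between the two controls: a positive index.
print(fried_index(h_w=0.90, h_m=0.50, h_s=0.05))

# By chance the compatible controls hatch slightly worse than the mixed
# cages (H_w < H_m while H_m > H_s): the index is negative.
print(fried_index(h_w=0.80, h_m=0.85, h_s=0.05))
```

With only a few control cages, sampling variation of this size in \(H_w\) is plausible, which is why the traditional point estimate and its confidence limits are not guaranteed to be positive.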

An additional benefit of using the BMM is that it also yields posterior distributions for the probabilities of eggs hatching under sterile (or incompatible) and wildtype (compatible) matings. The columns of Fig. 4 show the distributions obtained from each of the three small datasets reported on in Fig. 3. In general, the posterior distributions for these probabilities are narrow, indicating that the data are highly informative about these quantities. The posterior distributions for the hatch probability in incompatible matings are concentrated at low values but have a mode that is not at zero, which is consistent with the raw data (i.e. small numbers of eggs hatched from incompatible matings in the control experiments).
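To make concrete why the hatch probabilities are estimated alongside Fried’s Index, the following is a minimal sketch of a two-component binomial mixture log-likelihood. It is a simplification under stated assumptions and not the implementation in the friedsIndex package: it assumes per-female counts of hatched and laid eggs, monogamous mating, and that a female mates with an incompatible male with probability \(cS/(W + cS)\), the form commonly used in population models of sterile releases. All function and argument names are hypothetical.

```python
import math


def log_binom_pmf(k, n, p):
    """Log binomial probability of k hatched eggs out of n laid (0 < p < 1)."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coeff + k * math.log(p) + (n - k) * math.log(1.0 - p)


def mixture_log_likelihood(batches, c, p_w, p_s, n_wild, n_incompat):
    """Log-likelihood of (hatched, laid) egg counts from one mixed cage.

    Assumed model: each female mates once, with an incompatible male with
    probability q = c*S / (W + c*S); her eggs then hatch with probability
    p_s (incompatible mating) or p_w (compatible mating). Requires c > 0.
    """
    q = c * n_incompat / (n_wild + c * n_incompat)
    total = 0.0
    for hatched, laid in batches:
        log_w = math.log(1.0 - q) + log_binom_pmf(hatched, laid, p_w)
        log_s = math.log(q) + log_binom_pmf(hatched, laid, p_s)
        # Log-sum-exp of the two components for numerical stability.
        m = max(log_w, log_s)
        total += m + math.log(math.exp(log_w - m) + math.exp(log_s - m))
    return total


# Invented egg batches: two look compatible, two look incompatible.
batches = [(45, 50), (2, 50), (47, 50), (1, 50)]
print(mixture_log_likelihood(batches, c=1.0, p_w=0.9, p_s=0.03,
                             n_wild=10, n_incompat=10))
```

In a Bayesian analysis, this likelihood would be combined with priors on `c`, `p_w` and `p_s`, so the posterior describes all three quantities at once and no control cages are needed to pin down the hatch probabilities.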

Fig. 4

Posterior distributions for the hatch probabilities under wildtype/compatible (green; bottom row) and incompatible (blue; top row) matings for the three datasets involving only nine randomly selected mixed cages. Labels (a)–(c) correspond to the same experiment labels in the panels of Fig. 3 (Color figure online).

8 Discussion

It is almost 50 years since Fried (1971) proposed his index of mating competitiveness, and over this period technology has advanced substantially. The statistical methods at our disposal today are vastly more powerful than those of the 1970s, and this enables us to compute Fried’s Index via new and more efficient means. We have presented a two-component BMM that can be used to obtain estimates of Fried’s Index using only mixed mating cages of incompatible males, wildtype males and wildtype females. Estimators constructed around this new approach show evidence of being more precise and accurate than traditional approaches to estimating Fried’s Index.

When applied to a large dataset from a mating experiment, the BMM results in an almost identical estimate to that obtained under the traditional approach. On smaller datasets created by randomly sampling a smaller number of replicate cages from the complete data, the BMM remains accurate and precise, whilst estimates from the traditional approach can have unacceptably wide confidence intervals and are not constrained to be positive.

In simulation experiments, our new method has been shown to offer more precise and accurate estimates of Fried’s Index of competitiveness. This is advantageous because precision equivalent to that of the traditional approach can be achieved with fewer cages, which may be particularly useful when conducting experiments in large semi-field cages, of which a research facility may have only a small number. Indeed, using our new approach, Fried’s Index can, if necessary, be estimated with only a single cage containing a mixture of incompatible and wildtype males with wildtype females. Our proposed approach also drastically improves the practitioner’s ability to quantify the uncertainty surrounding estimates of Fried’s Index and hatch probabilities (hatch rates) through the use of Bayesian credible intervals. A further advantage of the method is that it is applicable to mating experiments involving both monogamous and polygamous mating.

We have outlined that Fried’s Index is equivalent to the mating competitiveness coefficient, c, that is routinely employed in most models of population dynamics under biological control programs employing the sterile insect technique. Uncertainty analyses and sensitivity analyses of these models often require some assessment of the error in key parameters such as c, and our approach should aid experimentalists in quantifying this error for such purposes. To facilitate the use of these methods by practitioners and experimentalists, we have made available a fully functional R package for performing these analyses at https://github.com/dpagendam/friedsIndex.