Zero-inflated-censored Weibull and gamma regression models to estimate wild boar population dispersal distance

The dynamics of the wild boar population has become a pressing issue not only for ecological purposes, but also for agricultural and livestock production. The data related to the wild boar dispersal distance can have a complex structure, including excess of zeros and right-censored observations, thus being challenging for modeling. In this sense, we propose two different zero-inflated-right-censored regression models, assuming Weibull and gamma distributions. First, we present the construction of the likelihood function, and then, we apply both models to simulated datasets, demonstrating that both regression models behave well. The simulation results point to the consistency and asymptotic unbiasedness of the developed methods. Afterwards, we adjusted both models to a simulated dataset of wild boar dispersal, including excess of zeros, right-censored observations, and two covariates: age and sex. We showed that the models were useful to extract inferences about the wild boar dispersal, correctly describing the data mimicking a situation where males disperse more than females, and age has a positive effect on the dispersal of the wild boars. These results are useful to overcome some limitations regarding inferences in zero-inflated-right-censored datasets, especially concerning the wild boar’s population. Users will be provided with an R function to run the proposed models.


Introduction
The dynamics of wild animal populations is an important issue not only for ecological purposes, but also for agricultural and livestock production (Cumming et al., 2012). The wild boar (Sus scrofa) is considered to have one of largest geographical ranges of all terrestrial mammals (Lewis et al., 2017). It causes several types of losses related to the environment, agricultural activities, and animal production/ trade by spreading diseases (Meng et al., 2009;Sánchez-Vizcaíno et al., 2019). With the re-emergence of African Swine Fever (ASF) as a threat to the global pig industry (Sánchez-Cordón et al., 2018), there is an ongoing need for better quantitative descriptions of the movement and land-occupation behaviors of wild boars (McClure et al., 2015;Morelle et al., 2014).
Animal dispersal occurs when individuals leave their social group or home range, redistributing the population (Breed & Moore, 2016). According to Casas-Díaz et al. (2013), dispersal is an important characteristic of the wild boar's ecology and should be taken into account in the design of disease monitoring programs as these animals can affect the spread of diseases and the probability of new outbreaks. Several techniques are used to measure animal dispersal [e.g., radio telemetry, GPS, ground surveys, or hunting statistics (Morelle et al., 2014)], and datasets produced by recapture with hunting bags are important sources of information for veterinary epidemiology (Keuling et al., 2018;Scillitani et al., 2010;Vicente et al., 2018).
Animal dispersal data may have a complex structure: it may be right-skewed given the short-distance dispersal; have excessive zeros, because many animals do not disperse at all or they disperse distances that are inside the error margin of telemetry (Keuling et al., 2008). Also, censored observations may occur, when an animal's track is lost (e.g., collars or ear tags are lost, or animals move away from the hunting perimeter) (Podgórski et al., 2014;Prévot & Licoppe, 2013;Truvé & Lemel, 2003). This data structure is challenging from an inferential perspective, creating issues for researchers attempting to formulate hypotheses or build models using dispersal distance information (Bowman et al., 2002). Thus, many authors discard the censored data and also avoid fitting generalized linear models. As a consequence, the datasets are not correctly used in making inferences and testing hypotheses about dispersal distances (Prévot & Licoppe, 2013;Truvé & Lemel, 2003). Evidently, improvements in quantitative analytical techniques on dispersal distance are still needed (Whitmee & Orme, 2013).
According to Lambert (1992), the zero values can have the following sources: they can come from a population where the value would always be zero, called "structural zeros"; alternatively, they can be "sampling zeros" from a population whose observations belong to some probability distribution. To analyze a zeroinflated non-negative outcome (semi-continuous outcome) data, Manning et al. (1981) proposed a two-part model (2PM), separating the zero and positive values explicitly in two submodels (parts). Liu et al. (2010) proposed an extension of the 2PM, assuming a generalized gamma distribution of the positive values. Lee et al. (2010) proposed a two-part multilevel modeling, in which the zero proportion is modeled by logistic regression and the continuous values by gamma regression. Gebregziabher et al. (2017) presented a family of models for zero-heavy continuous outcomes, with Weibull mixture regression as a special case, and with a marginal inference. Louzada et al. (2018) proposed a zero-inflated non-default rate regression, assuming that the positive values follow a Weibull distribution.
To consider the whole data structure of wild boar dispersal distances, we propose an extension of the zero-inflated gamma model (Lee et al., 2010), incorporating the right-censored observations in the continuous values through the gamma regression. Furthermore, we consider the zero-inflated, right-censored Weibull regression (Louzada et al., 2018), discarding the cure rate of the model. In the models proposed in this paper, both techniques allow the incorporation of the whole data (without excluding zeros and the right-censored observations) to make inferences. This includes covariates in the model, as the survival models can also be used to assess distances, as seen in Reader (2000) and Chatwin et al. (2013). The gamma is a flexible distribution; it describes a different type of survival pattern according to the hazard rate: increasing, decreasing, or constant; it fits a variety of lifetime data adequately and has the exponential distribution as a particular case (Lawless, 2011). The Weibull distribution is also flexible and has a monotonically increasing, decreasing, or constant hazard rate (Klein & Moeschberger, 2003); is probably the most used parametric lifetime model (Wienke, 2011); it has been widely used to model right-skewed data and has also motivated proposals of various types of generalizations (Liao et al., 2020;Ramos et al., 2018;Shinohara et al., 2020).
Given the lack of a dataset fully describing zeros, dispersal distances, and censored data, we apply both techniques to an artificial dataset that mimics the wild boar population's dispersal. The simulated animal population includes three subpopulations: animals for which no dispersal was registered (zeros); animals that showed some measurable distance, and animals whose track has been lost (censored). Such data mimic situations of capture-recapture using hunting bags in which a proportion of animals moved away from the hunting perimeter. The manuscript is structured as follows.
First, we present the definitions and the construction of the zero-inflated-censored Weibull (ZIWeibull) and the zero-inflated-censored gamma (ZIGamma) models. Next, we show the properties of the models' estimators and compare the two models. Finally, we demonstrate their application to the artificial wild boar dispersal dataset. An R function was created to enable users to run the models with their own data. Instructions regarding this function and package installation details are available in Appendix (instructions for installing the R functions and run the models).

Formulation of the models
Let T denote the random variable of interest, and it is the observable time to event, called lifetime or failure time. Finally, consider x x x 1 , x x x 2 , x x x 3 three vectors of the covariates, where x x x 1 relates to the zero-inflation probability, x x x 2 relates to the scale, and x x x 3 relates to the shape of the lifetime distribution.
We propose an alteration of the zero-inflated non-default regression model (Louzada et al., 2018), using the zero-inflated and regression parts while removing the cure rate part of the model. Thus, the probability density function (PDF) and the cumulative density function (CDF) of the observable lifetime time T of the zero-inflated model are, respectively, given by where p 0 ∈ (0, 1) is the zero-inflation probability, f(t) is the probability density function, and F(t) is the cumulative density function of the observable lifetime time.
The observed data are denoted by D obs = {(t i , i ), i = 1, … , n} , where i = 1 if the event of interest occurs and i = 0 if it is right-censored, i = 1, … , n . Assuming independent and non-informative censoring, the likelihood function L( ;D obs ) is given by where = (p 0i , ) , denotes the parameter vector associated to the probability density function f(t) and, consequently, to the survivor function S(t); p 0i is the zeroinflate probability of the ith subject.
Considering the relation f (t) = h(t)S(t) and S(t) = exp{−H(t)} in (2), where h(t) is the hazard function and H(t) is the cumulative hazard function, we obtain the following likelihood function: Therefore, the log-likelihood function for = (p 0i , ) , corresponding to the observed data and the likelihood function as in (3), is given by To obtain the maximum-likelihood estimates for the vector = (p 0i , ) , we must find the score functions U( ) = log{L( ;D obs )} . To solve the non-linear system of equations U( ) = log{L( ;D obs )} = 0 , we chose the function "optim" in the R software (R Core Team, 2019) for numerical maximization, with the method "BFGS", according (1) to Louzada et al. (2018). Details on non-linear optimization can be found in Press et al. (2007). The standard errors (SE) of the estimators are obtained through the diagonal of the inverted Fisher information matrix. The Fisher information matrix can be generically defined by where K pp denotes second-order partial derivatives of with respect to the covariates vector associated with p 0 , K denotes second-order partial derivatives of with respect to the vector .
The approach presented thus far can be used for any choice of hazard function h(t) and, consequently, any cumulative hazard function H(t). In the following section, we present the equations when the Weibull and gamma distributions are chosen to adjust the survival times.

Zero-inflated-censored Weibull model
If we consider the Weibull distribution to model the non-negative random variable T|T > 0 , in other words T|T > 0 ∼ Weibull( w , w ) , which here denotes the observable survival time, we obtain the following probability density function: where w > 0 is the shape parameter and w > 0 is the scale parameter of the distribution.
Replacing the hazard function h(t) and the cumulative hazard function H(t) from the log-likelihood function (4) with the hazard function and the cumulative hazard function from the Weibull distribution (6), we obtain the following log-likelihood function: The inclusion of covariates determines the effect of covariates on the observable survival time and the probability of zeros. The parameters (p 0i , w , w ) are related to the proportion of zeros, the shape, and the scale parameters of the Weibull distribution, respectively. The systematic components of the regression version of the zeroinflated-censored Weibull model are given by 2i 2 , and 2i = x x x T 3i 3 are the linear predictors and; 's are the unknown regression coefficient vectors. The link functions G 1 (.) , G 2 (.) , and G 3 (.) provide the connection between the linear predictor and the parameters of the probability density function. According to the fit by Louzada et al. (2018), G 1 (.) is set as . Since the scale and shape parameters are defined on the positive real line, the G 2 (.) and G 3 (.) link functions are chosen as G 2 ( wi ) = log( wi ) and G 3 ( wi ) = log( wi ) . Thus, wi = e x x x T 2i 2 and wi = e x x x T 3i 3 . Considering the observable time T|T > 0 ∼ Weibull( w , w ) and the probability density function f 0 (t) (see (1)), the expected value of T is defined by where (.) is the gamma function.
Replacing the parameters in (9) by the relations in (8), we have the following expected value given covariates:

Zero-inflated-censored gamma model
The zero-inflated-censored gamma model is an extension of Lee's model (Lee et al., 2010). It is based on the concept of two-part or "hurdle" models, in which zeros and non-zeros are considered two independent processes (Mullahy, 1986;Nobre et al., 2017), and the right-censored data are considered in the continuous values. Here, the zeros will be modeled as "success events" using a logistic regression. The positive observable survival time is modeled by a gamma regression survival model.
Let Z be a binary variable, where Z = 1 when the observable time was zero in the database, and Z = 0 otherwise. Therefore, Z has Bernoulli distribution with p 0 parameter, Z ∼ Ber(p 0 ) . Since the zero probability, P(Z = 1) = p 0 , is independent of the time distribution, the likelihood function (2) can be rewritten as follows: where = (p 0i , ) and denotes the parameter vector associated with the probability density function f(t). Factoring the likelihood allows for the estimation of p 0i to be made independently of the fit for the time distribution. To model the zero probability, p 0i , we used a logistic regression with G 4 (p 0i ) = log Considering the observable time T|T > 0 ∼ Gamma( g , g ) , with a shape parameter g and a scale parameter g , we obtain the following probability density function: where g > 0 is the shape parameter and g > 0 is the scale parameter of the distribution.
Considering the likelihood function (11), the relations f (t) = h(t)S(t) and S(t) = exp{−H(t)} , assuming gamma distribution to the observable survival time, produce the following log-likelihood function: We obtain the maximum-likelihood estimates for = (p 0i , g , g ) using iterative techniques from the "glm" function for estimate p 0i and from the package "flexsurv" (Jackson, 2016) to estimate g and g . Both functions are implemented in the software R (R Core Team, 2019).
The inclusion of the covariates associated with the observable time was achieved through the scale and shape parameters of the gamma distribution. The systematic component of the regression version of the zero-inflated-censored gamma model is given by 2i 2 , and 2i = x x x T 3i 3 are the linear predictors, and 's are the unknown regression coefficient vectors. The link functions G 4 (.) , G 5 (.) , and G 6 (.) provide the connection between the linear predictor and the parameters of the probability density function. G 4 (.) is a set of logistic regression G 4 (p 0i ) = log . According to Jackson (2016) G 5 (.) and G 6 (.) , link functions are G 5 ( gi ) = log( gi ) and G 6 ( gi ) = log( gi ) . Thus, gi = e x x x T 2i 2 and gi = e x x x T 3i 3 . Assuming the failure time T|T > 0 ∼ Gamma( g , g ) and the probability density function (1), the expected value of T is defined by (T) = (1 − p 0 ) g g . Considering the inclusion of the covariates, we obtain the following expected value given covariates:

Simulation studies
In this section, we present a simulation study to evaluate the models proposed in the previous section. The ZIWeibull model was fitted in a dataset generated from a Weibull distribution, and the ZIGamma model was fitted in a dataset generated from a gamma distribution. The models' parameters were estimated as described in Sect. 2. Computer codes are available online (see "Availability of data and codes" section for more information). To check the performance of the estimators, we examine the coverage probabilities of the 95% confidence intervals, the bias and root-mean-square errors, as well as the estimator's asymptotic consistency. We consider five sample sizes n = (200, 400, 600, 800, 1000) , and for each scenario, we generate 1000 datasets. In the process of generating each data set, we considered failure times to follow Weibull and gamma distributions with the following regression structure: i = e 1 +x 3i 2 , i = e 3 +x 2i 4 , and p 0i = e 5 +x 1i 6 1+e 5 +x 1i 6 , where the covariate vectors are specified by x i ≡ x i1 = x i2 = x i3 ∼ Normal(0, 1) and the regression coefficients by 1 = 0.5, 2 = 0.5, 3 = 1.5, 4 = 2, 5 = −3, 6 = 1.
To simulate random samples of size n, we suppose that the time until the occurrence of the event of interest has the cumulative distribution function F(t) given by F 0 (t) = p 0i + (1 − p 0i )F(t), t ≥ 0 . The algorithmic steps to generate the random samples are as follows: 1. Generate x i ∼ Normal(0, 1) and calculate p 0i , i , i ; 2. Generate u i from a uniform distribution U(0, 1); 3. Generate v i from a uniform distribution U(p 0i , 1); 4. If u i ≤ p 0i , set t i = 0; 5. If u i > p 0i , set t i equals to the inverse function F −1 0 (v i ); 6. Generate the censoring times c i from a gamma distribution G(0.25, 128.2), specified to control the censoring percentage; 7. If t i < c i , set i = 1 ; otherwise, set i = 0; 8. If t i = 0 ; set Z i = 1 , otherwise, set Z i = 0.
For the zero-inflated-censored model, since failure times have a Weibull distribution, the average time until the event of interest is calculated by For the zero-inflated-censored model, since failure times have a gamma distribution, the average time until the event of interest is calculated by For both models proposed, considering the 1000 replications of each scenario, Fig. 1 shows that as the sample size increases from 200 to 1000, the means of the six estimated parameters ( ̂1 , … ,̂6 ) converge on the true values of the parameters (0.5, 0.5, 1.5, 2, −3, 1). Figure 2 shows that for both models, the increase of the sample size from 200 to 1000 reduces asymptotically the variance, pointing to the minimal variance of the parameters estimated in these models. In Appendix (parameter estimates in the simulation studies), several summary statistics for the parameters are provided, such as mean, standard deviation (SD), confidence interval, bias, mean squared error, and coverage probability. Notably, for all sample size scenarios, the means of the standard errors (SE) are similar to the SD, indicating the good performance of the models. The coverage probability is remarkably close to the nominal value of 95%.
(17) (T|x 1 , x 2 , x 3 ) = 1 − e 5 +x 1 6 1 + e 5 +x 1 6 e 1 +x 3 2 ⋅ e 3 +x 2 4 . For both ZIGamma and ZIWeibull models, the expected values of the proportion of zeros were similar, for all sample sizes, converging asymptotically to the true value (Table 1). The average times-to-event estimated by the ZIWeibull and ZIGamma models was close to the true value, regardless the scenario of gamma or Weibull distribution for both datasets generation (Table 1).

Wild boar dispersal application
In this section, we adjust the ZIWeibull and ZIGamma models to the wild boar natal dispersal study based on recapture by hunting bag (Truvé & Lemel, 2003). Dispersal occurs when wild boars leave their territory for new territories or home ranges (Breed & Moore, 2016). Dispersal accounts for three distinct subpopulations of animals: a segment that will not disperse, i.e., have zero distance; a segment that will disperse measurably (event), and animals whose dispersal is unknown (right-censored).
Examples of such a data structure in wild boar dispersal are available in Keuling et al. (2010), Prévot and Licoppe (2013), and Jerina et al. (2014). These authors report a dataset structure compatible with that of Koenig et al. (1996) and Paradis et al. (1998) who reported that recaptures often occur in the same location at which the animal was marked, creating zeros, or animals moving more than the limit set for recapture in generating right-censored data. Unfortunately, the full datasets for these papers are not publicly available. To address this limitation, we Fig. 2 Variance for the ̂ estimated by the ZIWeibull model (when dataset was generated by the ZIWeibull model) and ZIGamma model (when dataset was generated by the ZIGamma model) using different sample sizes created a dataset that mimics a real situation. Details about the data generation are available in Appendix (Information used to generate the applied dataset). The computer code used to generate the data and the dataset spreadsheet are available as Electronic supplementary material.
Applied to wild boars, our interest is to adjust the dispersal distance, denoted by the random variable D, representing the observable distance. Therefore, the likelihood contribution of a dispersal distance D i of an animal, or i, can assume two different values: p 0i , if animal i has a dispersal equal to zero, and (1 − p 0i )f 0 (d i ) , if animal i has a dispersal larger than zero.
The models' parameters were tested under null hypothesis of equality to zero using a Wald test. For both ZIWeibull and ZIGamma models, the regression coefficients 2 (age), 3 (sex) are accounting for the shape parameter ( ), and 5 (age), Table 1 The simulation results of estimating the probability of zero and for the expected time, given these two values for the covariates x = −1 and x = 1 and gamma and Weibull scenarios for data generation ℙ(p 0i |⋅) denotes the probability of zero; (T|⋅) denotes the expected time where w = e 1 + 2 age+ 3 sex , w = e 4 + 5 age+ 6 sex , and p 0 = e 7 + 8 age+ 9 sex 1+(e 7+ 8 age+ 9 sex ) obtained through the ZIWeibull model adjust, and g = e 1 + 2 age+ 3 sex , g = e 4 + 5 age+ 6 sex , and p 0 = e 7 + 8 age+ 9 sex 1+(e 7+ 8 age+ 9 sex ) obtained through the ZIGamma model. In the ZIGamma model, the proportion of zeros in the simulated wild boar dataset is not significantly dependent on age ( 8 = −0.016 P value = 0.163), or sex ( 9 = 0.423 P value = 0.135), as shown in Table 2. On the other hand, age ( 5 = 0.04 Pvalue < 0.001) and sex ( 6 = −0.64 P value = 0.001) are associated with the scale parameter of the curve. For the shape of the curve, there is association of age ( 2 = −0.01 P value = 0.028) but not sex ( 3 = 0.027 P value = 0.98). For the ZIWeibull model, the proportion of zeros in the simulated wild boar dataset is also not dependent on age (P value = 0.162) or on sex (P value = 0.134), see 8 (age) and 9 (sex) in Table 2. However, age and sex are associated with the scale of the curve ( 5 = 0.032 and 6 = −0.644 Pvalue < 0.001 for both); and age was associated with the shape ( 2 = −0.007 P value = 0.042) ( Table 2).
Because the generated dispersal distance is affected by sex and age, we have chosen to calculate the mean distance using the average age for both sexes. For males, the average age was 15 months, so the estimated mean distance was 13.9 km and 13.66 km with the ZIWeibull and ZIGamma models, respectively. For females, the mean age was 12 months, at which age the estimated mean distance was 6.1 km and 6.08 km with the ZIWeibull and ZIGamma models, respectively.
For the regression models proposed here, increase in the mean distance is associated with increased age and males dispersing more than females, as shown in Fig. 3. On average, males are known to disperse more than females. For both sexes, increased age increases the distances. These findings are expected, as we intentionally generated an applied dataset while attempting to account for the matrilineal, territorial societal structure (females and their offspring having very low dispersion), as well as for the solitary males that disperse more to seek territory and mating opportunities (Keuling et al., 2010).
To assess the models' goodness-of-fit, we evaluated the following information criteria: Akaike and Bayes information criteria, (AIC and BIC). These were calculated according to qk − 2 log(L) , where k is the number of parameters, L is the likelihood, and q = 2 for AIC, or q = ln(n) , n = sample size for BIC (Brewer et al., 2016). As the ZIGamma model is a hurdle model, the full log-likelihood was calculated adding the log-likelihood of the logistic regression and the log-likelihood of the gamma regression (McDowell, 2003).
AIC are similar for both proposed models, but larger difference is observed in BIC which is smaller for the ZIGamma model (see Table 2). However, our aim here is not strictly to select the best model from between ZIGamma and ZIWeibull, but to present both as alternative methods in this applied context, allowing for the incorporation of all the information contained in the dataset to calculate the likelihood function.

Discussion
Although the lack of a real dataset hampers the attempt to draw a direct conclusion about an actual wild boar population, our aim was to highlight that both ZIWeibull and ZIGamma can be useful tools for statistical inferences in wild boar dispersal. According to Whitmee and Orme (2013), dispersal is one of the principal mechanisms underlying wild animals' land occupation, and there is an urgent need to obtain quantitative empirical data and methods to analyze dispersal distance. The models presented here are designed to accommodate the proportions of individuals that will not disperse at all and zeros created by measurement error (accounting for the excess of zeros), the observable dispersal distance, and the cases in which the dispersal is greater than our capacity to follow it, or incomplete followings (rightcensoring). In this sense, researchers could count with an inferential tool avoiding data exclusion using the full dataset in the same likelihood function. For instance, Truvé and Lemel (2003) excluded censored data and split the dataset to fit a nonparametric model to each covariate. From an applied perspective, estimating the characteristics of wild boar populations [e.g., average dispersal across sexes and ages (Keuling et al., 2010)] is important in enabling animal health managers and decision-makers to make evidencebased decisions (EFSA, 2014;Morelle et al., 2014). For instance, risk assessment models for the introduction/spread of African Swine Fever are heavily dependent on parameters regarding the land occupation (De la Torre et al., 2015). Surveillance programs for commercial swine herds often consider the dynamics of the wild boar population in preventing diseases (Casas-Díaz et al., 2013). As such, the models proposed here are useful for generating inferences about dispersal distance and testing if covariates are associated with the distance, thus helping to overcome current gaps in our understanding of wild boar behavior (Guinat et al., 2016;EFSA, 2019).
For the zero-inflated-censored Weibull (ZIWeibull) model, the estimators were consistent and efficient, and both the standard deviation values and the estimated standard errors values are quite close, indicating a good adjustment. Likewise, for the ZIWeibull model, the coverage probabilities for the ZIGamma model were close to the nominal level, and both the standard deviation values and the standard error values are remarkably close. The large sample theory for the models proposed in this paper is still an open question. However, our simulation studies show that the asymptotic validity seems to be satisfied. Finally, we note that the zero-inflated gamma model proposed in this paper can be extended to include the cure rate in the model. Furthermore, the zero-inflated-censored Weibull and gamma regression models can be extended to interval-censored survival data.
1. Install and load the package devtools: install.packages ("devtools") library(devtools) OBS: If the installation of "devtools" do not not occur, then the user will not succeed the "ZIdispersal" installation.

Information used to generate the applied dataset
Dispersal distance was generated based on Truvé and Lemel (2003). The distances are dependent on sex and age (month). For females, the distance increases between seven and nine months, and the general average dispersal for females is 6 km. For males, the distance increases from 10 to 13 months, and the general dispersal distance for males is approximately 10 km. The proportion of zeros was 7% for males and 12% for females; 5% of censored observations were generated for females and 16% for males. To generate data as similar as possible to the real data found in Truvé and Lemel (2003), we consider a sample size n = 400 for animal dispersal consisting of 50% males and 50% females. We consider exponential distribution for age, with scale parameters of 12 for females and 15 for males. We also consider the zeros proportion of 15% and right-censoring when the distance is greater than 30 km.
The general average distance in the generated dataset was 8.68 km, 6.17 km for females and 11.11 km for males. The proportions of zeros were 12.5 % and 18.5 % for males and females, respectively. The censoring proportions were 15 % and 4.5 % for males and females, respectively. More details can be seen in Table 3.
For both sexes, the observed dispersal distance distribution is right-skewed, with more zeros for females and a larger tail on the right, and more right-censored data for males (Fig. 4).

Parameter estimates in the simulation studies
See Tables 4 and 5.   Table 5 Results of the simulation study for the zero-inflated-censored gamma model adjustment considering different sample sizes SD is the standard deviation calculated from 1000 parameter estimates, SE is the mean of the 1000 estimated standard errors, CI is the mean of the confidence intervals, MSE is the mean of the 1000 estimated mean squared errors, and CP is the empirical coverage probability of the estimated 95% confidence intervals