The abundance of organisms is of central interest in ecology (Ehrlich and Roughgarden 1987). However, abundance measurements are almost always affected by imperfect detection; that is, abundance is underestimated when detection probability is less than 1. Detection probability may vary by species, observer, survey method and environment (Royle and Dorazio 2008; Kéry and Schaub 2012; Kéry and Royle 2016). The consequences of imperfect detection can vary widely, and can prevail in the analysis of abundance from local habitat to regional-scales (Lahoz-Monfort et al. 2014; Higa et al. 2015). For example, local population densities can be underestimated, while extinction and colonization rates of populations may be overestimated (Moilanen 2002; Kéry et al. 2013). Some have argued that imperfect detection need not always be considered provided that a study employs a standardized sampling design (Johnson 2008; Banks-Leite et al. 2014). However, if absolute abundance needs to be estimated and/or if detection probability depends on covariates that also affect abundance, then detection probability must be accounted for in any modeling framework for estimating abundance (Kéry 2008; Kéry et al. 2010; Yamaura 2013).

During the last several decades, a vast number of statistical methods of inference about distribution and abundance have been developed that accommodate imperfect detection, especially those developed during the last 15 years (Buckland et al. 2001; Borchers et al. 2002; Williams et al. 2002; Buckland et al. 2004; MacKenzie et al. 2006; Royle and Dorazio 2008; Link and Barker 2009; King et al. 2010; Kéry and Schaub 2012; Royle et al. 2014). Detection probability is typically estimated using a ‘single-species approach’, i.e., probability is estimated for every species individually (Alldredge et al. 2007). However, because the analysis of detection probability needs adequate sample sizes (Buckland et al. 2001; MacKenzie and Royle 2005), rare species are usually difficult to analyze independently. In this situation, researchers have repeatedly suggested overcoming the problem of small sample size for rare species by lumping (or pooling) their data with data for more common species that may be expected to have similar detection probabilities and respond to covariates in a similar fashion (MacKenzie et al. 2005; Buckland et al. 2008). As an alternative, one might analyze multiple or all species together in an analysis that stratifies by species, thereby accounting for the effects of species identity on parameters of abundance and detection (Alldredge et al. 2007), perhaps treating them as random effects, so that some information is shared among them (Kéry and Royle 2008; Zipkin et al. 2009).

Dorazio and Royle (2005) and Dorazio et al. (2006) proposed an approach of modeling a community as an ensemble of elemental species-level models from which community-level variables such as species richness or site similarity can naturally be derived (for a similar approach see also Gelfand et al. 2005; Ovaskainen and Soininen 2011). Using a series of binary detection/non-detection data of all species detected in a community, their community occupancy model estimates binary occupancy (presence/absence) of individual species at each site while correcting for imperfect detection. Species in the same community share hyper-parameters, and parameters of rare species (e.g., their detection probability) can be estimated by combining their own information with that coming from all the other species in the community, i.e., thereby borrowing strength from the ensemble (Link and Sauer 1996; Sauer and Link 2002; Kéry 2010). Furthermore, community occupancy models allow us to estimate the number of species that were not observed in a survey and their unobserved occupancy status while using data augmentation (Royle et al. 2007). Hence, community occupancy models produce a species richness estimator that accounts (1) for species that occurred in at least one sampling site but were missed in any of the other sites (i.e., were detected at least once), (2) for species that occurred in at least one sampling site but were missed in all other sampling sites (i.e., were never detected), and (3) for species that did occur in the meta-community (i.e., in the wider region that is sampled near the studied sites), but did not occur (and therefore were not detected) at any of the sampling sites (Kéry and Royle 2009; Kéry 2011; Iknayan et al. 2014). This modeling framework has recently been extended for abundance as a state variable based on count data (Yamaura et al. 2012; Chandler et al. 2013; Barnagaud et al. 2014; Dorazio and Connor 2014). 
In these models, which estimate the abundance of species (hereafter, community abundance or community N-mixture models), the occurrence (or occupancy) of a species is naturally a function of its local abundance (i.e., a species occurs if its local abundance is greater than zero), and community-level species richness and total abundance are obtained as derived parameters. We can assume that a studied community is composed of multiple (functional) groups in which species may have similar parameters, which are summarized by group-level hyper-parameters (Sauer and Link 2002; Ruiz-Gutiérrez et al. 2010; Yamaura et al. 2012; Chen et al. 2013; Barnagaud et al. 2014; Pacifici et al. 2014).

A large number of other models and methods have been developed over the years to study biological communities that are subject to imperfect detection. These nonparametric methods estimate the number of unobserved species and compare community composition (Gotelli and Colwell 2001; Williams et al. 2002; Chao et al. 2005, 2009). Compared with these methods, community abundance models have several desirable properties (Dorazio et al. 2011). First, abundance and the detection process are treated separately; therefore, common species with low detection probability and rare species with high detection probability are treated differently. This separation yields less-biased estimates of diversity measures (Broms et al. 2015). Second, community abundance models are able to estimate local (site-specific) species richness (α diversity) as well as overall species richness (γ diversity) and the size of the regional species pool; γ diversity can be much smaller than the size of the regional species pool, depending on the total area of the sampling plots used. Turnover of community composition among sites (β diversity) can be calculated by subtracting the mean α diversity from γ diversity using the framework of additive partitioning (Veech et al. 2002; Crist et al. 2003) or by computing indices such as the Jaccard index that are based on pairwise comparisons of species occurrence (Dorazio and Royle 2005; Dorazio et al. 2010; Kéry and Royle 2016).

In contrast to community occupancy models, community abundance models are still in their infancy, they have not been widely applied (Iknayan et al. 2014; Dénes et al. 2015), and their performance is essentially unknown. Occupancy models, which are the building blocks of all community occupancy models, are now commonly used for modeling presence/absence of individual species (Guillera-Arroita et al. 2015). Their estimation performance as well as appropriate sampling design that is used to maximize performance have both been actively examined and discussed for single species models (e.g., MacKenzie and Royle 2005; Guillera-Arroita et al. 2010, 2014; Rota et al. 2011; Guillera-Arroita and Lahoz-Monfort 2012; Wintle et al. 2012; Welsh et al. 2013). It would be natural to assume that what was learned for single-species models could be carried over to community models, and this was shown by Sanderlin et al. (2014) for the community occupancy model. In contrast, N-mixture models (Royle 2004), which are the building blocks of community abundance models and estimate abundance of a single species from repeated count data, have received much less attention. Several studies have compared estimates between N-mixture and other models that accommodate imperfect detection using field data (Kéry et al. 2005; Hunt et al. 2012; Couturier et al. 2013). In addition, some simulation studies related to their performance have also been conducted (Kéry 2008; McIntyre et al. 2012; Yamaura 2013; Dennis et al. 2015). However, estimation performance of community N-mixture abundance models has not been examined to date.

The objective of this study was to examine the bias and precision of community abundance models under various conditions of abundance and detectability, and under different combinations of the number of survey visits and sampling sites. We also compared the performance of community abundance models as a species richness estimator to that of community occupancy models under conditions where abundance is low and where count data converge towards binary detection/non-detection records. This comparison included the computation time of both models. Finally, we divided γ diversity of species richness and the Shannon index into α and β diversity in the estimation procedure of community abundance models, and examined whether community abundance models produced unbiased estimates of the diversity measures. Because the true values of species richness and all parameters of individual species are known in a simulation, we can gauge the differences between true and estimated values, i.e., the bias and precision of all estimates. Following Yamaura (2013), we focused on small-sample situations with a limited number of sampling sites as a worst-case scenario. If we know the minimal conditions under which the model performs adequately, then we can be assured that it performs even better with larger samples.


A brief outline of community abundance models

Submodel of the ecological process

In community abundance models, we assume that a community is assembled as an ensemble of independent Poisson processes for each individual species. The abundance of species i at site j, \(N_{ij}\), is a Poisson random variable:

$$N_{ij} \sim {\text{ Poisson}}(\lambda_{ij} )$$

where \(\lambda_{ij}\) is the expected (or mean) abundance (Royle et al. 2005), and \(\lambda_{ij}\) can be expressed as a function of site-level covariates (\(\varvec{x}_{j}^{\prime}\)), typically using a log-link:

$$\log \left( {\lambda_{ij} } \right) = \beta_{0i} + \varvec{x}_{j}^{{\prime }}\varvec{\beta}_{i}$$

Here, the intercepts \(\beta_{0i}\) and covariate coefficients (\(\varvec{\beta}_{i}\)) are assumed to follow separate normal distributions, e.g., \(\beta_{0i} \sim {\text{Normal}}(\mu_{{\beta_{0} }} ,\sigma_{{\beta_{0} }}^{2} )\). Under this model the community-level hyper-parameters, i.e., the mean and the standard deviation of these normal distributions, are shared by all species in the community; they describe the average of the community and the among-species heterogeneity, respectively. We can also use separate normal distributions for individual species (functional) groups, and examine group-specific responses to covariates (Yamaura et al. 2012; Chen et al. 2013; Barnagaud et al. 2014). Thanks to this sharing of hyper-parameters among species, we can obtain better estimates of the parameters of rare species and even those of unobserved species by “borrowing strength” (i.e., sharing information) among similar but more common species (Zipkin et al. 2009; Ovaskainen and Soininen 2011).

Submodel of the detection process

Counts of detected individuals are one of the most convenient data types to collect, and the N-mixture (or binomial mixture) model is a natural model for abundance of a single species based on count data (Royle 2004; Kéry et al. 2005). The main idea is that if every individual at a given point in space and time (i.e., every member of \(N_{ij}\)) has the same detection probability and is detected independently, then the number of individuals detected will be a binomial random variable. When individual sites are visited multiple times and repeated counts can be obtained within a short period in which abundance does not change (this is called the “closure assumption”), we can describe the detection process as a binomial process with a probability of success (or here, detection) of \(p_{i}\):

$$y_{ijk} \sim {\text{ Binomial}}\left( {N_{ij} ,p_{i} } \right)$$

Here, \(y_{ijk}\) is the number of detected individuals (i.e., the count) of species i at site j on visit k. In a single-species situation, parameters can easily be estimated by maximum likelihood (Royle 2004; Kéry et al. 2005) or using Bayesian inference (Kéry 2010; Kéry and Schaub 2012). However, in the much more complex multispecies situation we typically have to resort to a Bayesian implementation of the model (Royle and Dorazio 2008; Kéry and Royle 2016). As in the abundance submodel, the logit-transformed detection probability of species i is assumed to be drawn from a normal distribution with community-level hyper-parameters, i.e., \({\text{logit}}\left( {p_{i} } \right) = q_{i} ,\quad {\text{with }}q_{i} \sim {\text{Normal}}(\mu_{q} ,\sigma_{q}^{2} )\). Although \(p_{i}\) is here assumed to be constant across sites and replicate surveys, we can easily relax this assumption, for example, to model covariate effects (Kéry 2008; Yamaura 2013).
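To make the binomial observation process concrete, the following minimal Python sketch simulates repeated counts of a single species at one closed site. The function name `simulate_counts` and the example values of abundance and detection probability are illustrative assumptions, not part of the original analysis.

```python
import random

def simulate_counts(n_true, p, n_visit, rng):
    """Draw n_visit counts y_k ~ Binomial(n_true, p) for a closed population."""
    counts = []
    for _ in range(n_visit):
        # Each of the n_true individuals is detected independently with probability p.
        detected = sum(1 for _ in range(n_true) if rng.random() < p)
        counts.append(detected)
    return counts

rng = random.Random(1)  # fixed seed so the sketch is reproducible
y = simulate_counts(n_true=10, p=0.2, n_visit=3, rng=rng)
# A count can never exceed the true abundance, so raw counts tend to underestimate N.
print(y, max(y) <= 10)
```

With p well below 1, every count falls short of the true abundance on average (the expected count is N times p), which is the underestimation that the N-mixture model is designed to correct.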

Data augmentation to account for unobserved species

Royle et al. (2007) and Dorazio et al. (2006) used data augmentation to estimate the number of unobserved species in a survey, and community composition at each site by accounting for unobserved species. In traditional community analyses such as in ordination methods, we analyze the available detection histories (e.g., series of counts or detection/non-detection over replicate surveys) only for the observed species. In the community occupancy model of Dorazio and Royle, we add (augment) all-zero detection histories for an arbitrary number of hypothetical, unobserved species to the detection histories of observed species and analyze the augmented data set (Dorazio et al. 2006; Royle et al. 2007). We call the resulting community, which is composed of observed and augmented species, a ‘super-community.’ The augmented data set of detection histories is analyzed to estimate the number of unobserved species, where a species can go unobserved either because it does not happen to occur in the sampled areas by chance (but does occur in the wider sampled area: \(\sum\nolimits_{j = 1}^{{n_{site} }} {N_{ij} = 0}\)) or because it does occur \((\sum\nolimits_{j = 1}^{{n_{site} }} {N_{ij} > 0}\)) but went undetected by chance. Under data augmentation the size of the super-community (S) is prescribed, and it should be chosen to be (much) larger than R (which is the unknown species pool size to be estimated). This can easily be achieved in practice by trial-and-error and making sure that the posterior mass of R is concentrated away from the chosen value of S (see below).

To re-formulate the community model using data augmentation, we introduce a binary, partially observed indicator variable, w i , which takes the value of 1 if a species in the super-community is a member of the community of R species that are exposed to sampling and 0 otherwise. This “community membership indicator variable” is known to be 1 for all species that are observed at least once, but its value must be estimated for the augmented species. We assume that w i are mutually independent Bernoulli random variables with an inclusion parameter Ω, i.e., w i  ~ Bernoulli(Ω). We then estimate R as a function of the inclusion variables as \(\sum\nolimits_{i = 1}^{S} {w_{i} }\). Estimating the data augmentation parameter Ω is therefore functionally equivalent to estimating R in the sense that under the data augmentation scheme R is a binomial random variable with sample size S and success parameter Ω. In the analyses, we use a sufficiently large value of S such that all the mass of the posterior distribution of Ω is well away from 1 (Royle and Dorazio 2008; Kéry and Royle 2016). However, larger values of S require a longer computation time and hence, the selected S should not be too high for purely practical reasons.

Detection histories of all augmented species contain only zeroes, but the species that are exposed to sampling could in principle have produced non-zero counts depending on the covariates and their detection probabilities, while the observations for non-exposed species are structural zeroes. Following previous studies with community abundance models (Yamaura et al. 2011, 2012), we formulate this zero-inflation in y ijk by modifying Eq. 3 (but see also the “Discussion”):

$$y_{ijk} \sim {\text{ Binomial}}\left( {N_{ij} w_{i} ,p_{i} } \right).$$

That is, for species that are not exposed to sampling (with \(w_{i} = 0\)), the observations are binomial with a sample size of zero and therefore counts are necessarily equal to zero.
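The bookkeeping behind data augmentation can be sketched in a few lines of Python. The helper names `augment_counts` and `pool_size`, the nested-list layout `y[i][j][k]`, and the toy values are assumptions made for illustration; in a real analysis the indicators are latent and sampled by MCMC rather than supplied by hand.

```python
def augment_counts(y, super_size):
    """Pad a species x site x visit count array with all-zero histories.

    y[i][j][k] holds the counts of the observed species; the result contains
    super_size species in total (observed plus augmented).
    """
    n_obs = len(y)
    if super_size < n_obs:
        raise ValueError("super_size must be at least the number of observed species")
    n_site = len(y[0])
    n_visit = len(y[0][0])
    zeros = [[[0] * n_visit for _ in range(n_site)]
             for _ in range(super_size - n_obs)]
    # Copy the observed histories so the input is left untouched.
    return [[list(site) for site in sp] for sp in y] + zeros

def pool_size(w):
    """R is the sum of the membership indicators w_i over the super-community."""
    return sum(w)

# Two observed species, two sites, two visits, augmented to S = 5.
y_obs = [[[1, 0], [0, 0]], [[2, 1], [0, 1]]]
y_aug = augment_counts(y_obs, super_size=5)
print(len(y_aug), pool_size([1, 1, 0, 1, 0]))
```

For an augmented species with an indicator of 0, the binomial sample size is zero, so its all-zero history is a structural zero rather than a detection failure.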

We can estimate site-specific species richness and total abundance of communities as derived parameters using the posterior samples of the latent variables (\(N_{ij}\) and \(w_{i}\); see also the “Discussion”). We can similarly estimate the number of species occurring at any of the sampled sites (γ diversity) by tallying up species with at least one individual at one or more sites. We note that this γ diversity of species richness is different from the community size R (O’Hara 2005; Iknayan et al. 2014), and can be much smaller than R when the total area of sampling sites is small and therefore many species in the wider region may simply not occur in the sampled sites (see estimation of species-accumulation curve in Dorazio et al. 2006 and Kéry and Royle 2016).

Site-specific species richness (\(R_{j}\)) can also be obtained in another way. Using a property of the Poisson distribution, we can formulate the probability that at least one individual occurs (Royle and Dorazio 2008): Pr[N ≥ 1] = 1 − exp(−λ). By using species-level parameters of community abundance models (\(\beta_{0i}\) and \(\varvec{\beta}_{i}\) in Eq. 2), we can obtain point estimates of \(R_{j}\) by summing this probability over the r observed species:

$$E[\hat{R}_{j} ] = \sum\limits_{i = 1}^{r} {\left[ {1 - \exp \left( { - \exp \left( {\hat{\beta }_{0i} + \varvec{x}_{j}^{{\prime }} \hat{\varvec{\beta }}_{i} } \right)} \right)} \right]}$$

If data augmentation is used to account for the existence of unobserved species, we can use the following quantity with the aid of the indicator variable \(w_{i}\):

$$E[\hat{R}_{j} ] = \sum\limits_{i = 1}^{S}\left\{w_{i}\times {\left[ {1 - \exp \left( { - \exp \left( {\hat{\beta }_{0i} + \varvec{x}_{j}^{{\prime }} \hat{\varvec{\beta }}_{i} } \right)} \right)} \right]} \right\}$$
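The expected site-level richness can be evaluated directly from species-level estimates. The Python sketch below assumes a single covariate (so that the coefficient vector reduces to one slope per species); the function name `expected_richness` and the example values are hypothetical.

```python
import math

def expected_richness(beta0, beta1, w, x_j):
    """Sum over species of w_i * Pr[N_ij >= 1], with Pr[N >= 1] = 1 - exp(-lambda)."""
    total = 0.0
    for b0, b1, w_i in zip(beta0, beta1, w):
        lam = math.exp(b0 + b1 * x_j)          # log-link for expected abundance
        total += w_i * (1.0 - math.exp(-lam))  # occurrence probability under a Poisson
    return total

# Three species in the super-community; the third is not a community member (w = 0).
print(expected_richness(beta0=[0.0, -1.0, 0.5],
                        beta1=[0.5, 0.0, -0.5],
                        w=[1, 1, 0],
                        x_j=0.0))
```

Setting every indicator to 1 and restricting the lists to the observed species gives the version without data augmentation.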

Simulation experiments

We conducted three simulation experiments, and first assessed bias and precision of community abundance models under various conditions that all characterize situations in which there is only little information about the model parameters. We next compared community abundance and occupancy models as species richness estimators. We finally examined the performance of community abundance models when quantifying β diversity.

Simulation experiment 1: assessing bias and precision of community abundance models

Parameter settings, data generation, and estimation procedure

Kéry (2008) and Yamaura (2013) showed that N-mixture models can remove the bias in inferred abundances that arises when detection probability depends on covariates that also affect abundance. Here, for simplicity, we assumed a constant detection probability and only varied the following four factors related to the sampling design and species biology (Yamaura 2013):

  • Number of sampling sites (\(n_{site}\)): 10, 20, 40.

  • Number of visits (\(n_{visit}\)): 2, 3, 4.

  • Mean detection probability (logit-transformed and denoted by \(\mu_{q}\)): −2.2, −1.4, 0.0, corresponding to an average p of 0.1, 0.2, 0.5.

  • Mean abundance (log-transformed and denoted by \(\mu_{{\beta_{0} }}\)): −2.3, −0.69, 0.0, 1.61, corresponding to an average λ of 0.1, 0.5, 1, 5 at the average value of the covariate modeled.

Ranges of these four factors represent different aspects of a “small sample size;” in addition, the estimation performance of N-mixture model can greatly change within these ranges (Yamaura 2013). That is, our study examined the performance of the community abundance models in a worst-case scenario; with larger sample sizes, performance typically improves. Mean abundance and mean detection probability are not “settings” of a study design but characteristics of the analyzed communities (but see also “Discussion”). These are mean values of community-level hyper-parameters, and we fixed the other hyper-parameter, the standard deviations, at 1 (i.e., σ q  = \(\sigma_{{\beta_{0} }}\) = 1). We assumed that every sampling site received the same number of visits, and that the sampling area of individual sites was constant across sites (we can relax this assumption by modifying Eq. 1: Royle et al. 2005). We considered the situation in which only a single site-specific covariate x j was ecologically important; an example might be a continuous measurement of habitat quality. Minimum and maximum values of x j were −1 and 1, respectively, the intermediate values were equally spaced, and their intervals depended on the number of sampling sites. We chose a mean value of the coefficient of this covariate \((\mu_{{\beta_{1} }}\)) of 0 and for its associated standard deviation \((\sigma_{{\beta_{1} }}\)) a value of 1. Furthermore, we set the number of possible occurring species (i.e., species richness: community size or species pool size) at 40 throughout the simulation.
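The factorial design and the covariate grid described above can be written out as a short Python sketch. The variable and function names (`design`, `covariate_values`, `logit`) are illustrative assumptions; only the factor levels and the range of the covariate come from the text.

```python
import itertools
import math

n_site_levels = [10, 20, 40]
n_visit_levels = [2, 3, 4]
mean_p_levels = [0.1, 0.2, 0.5]             # back-transformed mean detection
mean_lambda_levels = [0.1, 0.5, 1.0, 5.0]   # back-transformed mean abundance

def logit(p):
    return math.log(p / (1.0 - p))

def covariate_values(n_site):
    """n_site equally spaced values running from -1 to 1 inclusive."""
    step = 2.0 / (n_site - 1)
    return [-1.0 + step * j for j in range(n_site)]

# One dictionary per combination of the four factors.
design = [
    {"n_site": s, "n_visit": v, "mu_q": logit(p), "mu_beta0": math.log(lam)}
    for s, v, p, lam in itertools.product(
        n_site_levels, n_visit_levels, mean_p_levels, mean_lambda_levels)
]
print(len(design))  # number of factor combinations
```

The transformations reproduce the listed hyper-parameter values after rounding, and the spacing of the covariate shrinks as the number of sites grows.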

To simulate a data set, we randomly drew 40 species-level parameters (β 0i , β 1i , q i  = logit[p i ]) from their normal distributions with the given hyper-parameters. Next, we computed the site- and species-specific expected abundance using Eq. 2 (i.e., λ ij  = exp[β 0i  + β 1i  × x j ]), and then drew the realized abundances from Poisson distributions with those expectations (i.e., N ij  ~ Poisson[λ ij ]). Finally, we simulated replicated surveys of each site, and obtained count histories specified by detected species, sites, and visits (y ijk ), which followed a binomial process described by Eq. 3. Count histories and the number of detected species (the observed total species richness r) were of course different among replications because of sampling variability and differences in the combinations of the four factors, but r was always less than or equal to 40. We replicated each combination of four factors (n site , n visit , mean detection μ q , and mean abundance \(\mu_{{\beta_{0} }}\)) ten times, meaning that our experiments had a balanced design with 1080 replicate data sets representing 3 × 3 × 3 × 4 = 108 parameter combinations. See also Chapter 11 in Kéry and Royle (2016) for more information and an R function that can be used to simulate community abundance data.
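The data-generating steps in this paragraph can be sketched as follows in plain Python. This is not the R function from Kéry and Royle (2016); the names `poisson_draw` and `simulate_community`, the seed, and the use of Knuth's multiplication method for Poisson draws are assumptions chosen to keep the example free of external dependencies.

```python
import math
import random

def poisson_draw(lam, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small means)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate_community(n_species, n_site, n_visit, mu_beta0, mu_q, rng,
                       sigma=1.0, mu_beta1=0.0):
    """Return true abundances N[i][j] and counts y[i][j][k] for one data set."""
    x = [-1.0 + 2.0 * j / (n_site - 1) for j in range(n_site)]
    N, y = [], []
    for _ in range(n_species):
        beta0 = rng.gauss(mu_beta0, sigma)                     # species intercept
        beta1 = rng.gauss(mu_beta1, sigma)                     # species slope
        p = 1.0 / (1.0 + math.exp(-rng.gauss(mu_q, sigma)))    # inverse logit of q_i
        N_i = [poisson_draw(math.exp(beta0 + beta1 * x_j), rng) for x_j in x]
        # Binomial detection: each individual is detected with probability p.
        y_i = [[sum(1 for _ in range(n) if rng.random() < p) for _ in range(n_visit)]
               for n in N_i]
        N.append(N_i)
        y.append(y_i)
    return N, y

rng = random.Random(2016)
N, y = simulate_community(40, 10, 3, math.log(0.5), 0.0, rng)
# Observed richness r: species detected at least once at any site on any visit.
r = sum(1 for y_i in y if any(c > 0 for site in y_i for c in site))
print(r)
```

In this sketch the count array keeps a row for every simulated species; dropping the all-zero rows yields the histories of the detected species only.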

We analyzed all simulated data sets using the above-described community abundance model. To estimate the number of unobserved species (R − r), we added all-zero count histories for 80 − r ‘potential’ species ([80 − r] × \(n_{site}\) × \(n_{visit}\)) to those of the detected species, and the resulting augmented data sets of 80 count histories were then analyzed. This means that we used a constant super-community size of S = 80. To minimize computation time, we used a constant of 40 augmented species in the case of λ = 5 (where the simulated survey data detected almost all 40 species, and the number of 40 augmented species was sufficiently large to estimate R [=40]). We fitted the model using Markov chain Monte Carlo (MCMC), and adopted conventional vague priors (e.g., \(\mu_{{\beta_{0} }}\) ~ Normal [0, \(100^{2}\)], \(\sigma_{{\beta_{0} }}\) ~ Uniform [0,10], Ω ~ Uniform [0,1]). We fitted the community abundance model using R 3.0.2 (R Core Team 2014) and JAGS 3.4.0 (Plummer 2013) via the package R2jags 0.03-11 (Su and Yajima 2013). We discarded as a burn-in the first 10,000 iterations of three chains with different initial values, and ran an additional 100,000 iterations to accumulate a posterior sample to be used for inference. We assumed chain convergence was achieved when the Gelman-Rubin statistic of all parameters (\(\mu_{{\beta_{0} }}\), \(\mu_{{\beta_{1} }}\), \(\mu_{q}\), \(\sigma_{{\beta_{0} }}\), \(\sigma_{{\beta_{1} }}\), \(\sigma_{q}\), Ω, N, \(p_{i}\), \(\beta_{0i}\), \(\beta_{1i}\)) was <1.1; otherwise, we ran additional sets of 100,000 iterations until we achieved chain convergence, using the function autojags in R package R2jags.
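The convergence criterion can be illustrated with the textbook form of the Gelman-Rubin statistic for a single parameter. This Python sketch is an assumption about the general idea only: the diagnostic reported by R2jags may apply additional corrections, and the function name `gelman_rubin` and the toy chains are hypothetical.

```python
from statistics import mean, variance

def gelman_rubin(chains):
    """Potential scale reduction factor for one parameter.

    chains is a list of equal-length lists of posterior draws.
    """
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: mean within-chain variance
    between = n * variance(chain_means)            # B: between-chain variance
    pooled = (n - 1) / n * within + between / n    # pooled variance estimate
    return (pooled / within) ** 0.5

mixed = [[0.1, 0.3, 0.2, 0.4], [0.2, 0.4, 0.1, 0.3], [0.3, 0.1, 0.4, 0.2]]
stuck = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]
print(gelman_rubin(mixed) < 1.1, gelman_rubin(stuck) < 1.1)
```

Chains that explore the same region give a value near or below 1, whereas chains stuck in different regions give a value far above the 1.1 threshold.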

Bias and precision of estimators under the community abundance models

To assess the performance of community abundance models, we focused on the bias and precision of the estimates of the following parameters: species-level intercepts (\(\beta_{0i}\)) and slopes (\(\beta_{1i}\)) in the abundance model and overall species richness (R). For \(\beta_{0i}\) and \(\beta_{1i}\), we first calculated the absolute differences between the estimates (posterior mean of the parameter) and true values for each species, e.g., \(\left| {\hat{\beta }_{0i} - \beta_{0i} } \right|\), and then averaged them across the observed species, ignoring unobserved species in this calculation. We then averaged this average over the ten replicate data sets to quantify the bias of this estimate in a given simulation scenario. As a measure of precision, we averaged the standard deviations (SDs) of estimates across the observed species, and again averaged over the ten replicate data sets. For species richness R, we simply calculated absolute differences between estimates (posterior means) and the true value (40), and averaged this value across the ten replicate data sets. We also averaged SDs of estimated R over the ten replicates, and treated this as a measure of imprecision (analogous to a standard error). We plotted these averaged values as a function of the number of sampling sites (\(n_{site}\)) and visits (\(n_{visit}\)) for each combination of \(\mu_{q}\) and \(\mu_{{\beta_{0} }}\).
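The two summary measures for species-level parameters can be sketched in Python as below. The data layout (one dictionary per replicate with keys `est`, `true`, and `sd` for the observed species) and the function names are assumptions made for illustration, as is the reading of the SDs as posterior standard deviations.

```python
def mean_abs_error(estimates, truths):
    """Average absolute difference between estimates and true values."""
    return sum(abs(e - t) for e, t in zip(estimates, truths)) / len(estimates)

def summarize_scenario(replicates):
    """Return (bias, imprecision) averaged over the replicate data sets.

    Each replicate is a dict with lists 'est', 'true', and 'sd' that cover
    the observed species only.
    """
    n_rep = len(replicates)
    bias = sum(mean_abs_error(r["est"], r["true"]) for r in replicates) / n_rep
    imprecision = sum(sum(r["sd"]) / len(r["sd"]) for r in replicates) / n_rep
    return bias, imprecision

# Two toy replicates with different numbers of observed species.
reps = [
    {"est": [1.0, 2.0], "true": [0.5, 2.5], "sd": [0.2, 0.4]},
    {"est": [0.0], "true": [1.0], "sd": [0.6]},
]
print(summarize_scenario(reps))
```

Averaging within each replicate first, and only then across replicates, gives every data set equal weight regardless of how many species it detected.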

In these calculations, we took the absolute differences between the true and estimated values of the intercepts and slopes for each species, because otherwise negative and positive errors would cancel out. In contrast, for species richness R, the posterior means were usually larger than the true value of 40 and there was only a very slight difference in the results between absolute and raw differences. To summarize the relative effects of experimental factors on the bias and imprecision, we conducted an analysis of variance (ANOVA) on the bias or the imprecision of each replication and viewed the four factors in our simulation design as treatments (\(n_{site}\), \(n_{visit}\), \(\mu_{q}\), \(\mu_{{\beta_{0} }}\)), with all interactions included. Following White et al. (2014), we evaluated sums of squares (SS) and mean SS rather than the statistical significance of the ANOVA to assess the importance of the effects (Tyler and Hargrove 1997; Fahrig 2001; Fletcher 2006).

Simulation experiment 2: comparing community abundance and occupancy models as species richness estimators

When a Poisson distribution has a low expected value (\(\bar{\lambda }\)), both the realized abundance and the observed count data will consist mostly of zeroes and ones and hence will be approximately equal to the binary presence/absence and detection/non-detection, respectively. Therefore, we might expect the community occupancy model to achieve convergence faster and perform in a similar fashion to the community abundance model, which may need more MCMC iterations to achieve convergence. To test this expectation, we compared the estimates of species richness (\(\hat{R}\)) between community abundance and community occupancy models when \(\bar{\lambda }\) (\(= \exp [\mu_{{\beta_{0} }} ]\)) was low (0.1 or 0.5). We simulated communities and obtained count histories using the same methods described in the previous section. We then fitted both models to the same data sets; for the community occupancy model, we reduced the count histories to binary detection/non-detection histories. Because of the differences in model structure, we could no longer directly compare species-level parameters (e.g., \(\beta_{0i}\)) between community occupancy and abundance models, but we could still compare the estimates of community properties such as \(\hat{R}\). We fitted the community occupancy model with a single covariate (\(x_{j}\)), and drew species-level parameters (\(\beta_{0i}\), \(\beta_{1i}\), \(q_{i}\)) from separate independent normal distributions. We used the same settings (e.g., conditions, replications, priors) as were chosen in the first experiment (e.g., a super-community size S = 80).
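The reduction from counts to detection/non-detection data, and the reason it loses little information at low abundance, can be sketched as follows. The function names are hypothetical; the probability uses the Poisson mass function, Pr[N ≤ 1] = exp(−λ)(1 + λ).

```python
import math

def to_detection_history(y):
    """Map counts y[i][j][k] to 1 if at least one individual was detected, else 0."""
    return [[[1 if c > 0 else 0 for c in site] for site in sp] for sp in y]

def prob_at_most_one(lam):
    """Pr[N <= 1] for a Poisson random variable with mean lam."""
    return math.exp(-lam) * (1.0 + lam)

counts = [[[0, 2], [1, 0]], [[0, 0], [3, 1]]]
print(to_detection_history(counts))
for lam in (0.1, 0.5):
    print(lam, round(prob_at_most_one(lam), 3))
```

At a mean abundance of 0.1 nearly all site-level abundances are 0 or 1, and at 0.5 about nine in ten are, so the binary histories retain most of the information in the counts.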

Simulation experiment 3: partitioning γ diversity into α and β diversity subject to imperfect detection

We used additive partitioning (Veech et al. 2002; Crist et al. 2003) in the community abundance models for those scenarios of the sampling design where community abundance models achieved good performance (see “Results”): \(n_{site}\) = 40, \(n_{visit}\) = 4, and mean abundance (\(\bar{\lambda } = \exp [\mu_{{\beta_{0} }} ]\)) = 0.5. We only varied the mean detection probability (\(\bar{p}\)), which will affect the observed site-level species richness, at the three levels of 0.1, 0.2, and 0.5. We conducted these simulations in a manner analogous to the above experiments, and treated the number of detected species throughout the survey (in any visits at any sites) as the observed (i.e., detection-naïve) γ diversity of species richness. Because the number of detected species throughout the survey (r) was always ≥35, we used 40 augmented species in the analysis. The mean site-level detection-naïve species richness is the observed α diversity; hence, we obtained the detection-naïve β diversity estimate by subtracting the mean of the α diversity from γ diversity, i.e., β = γ − mean (α) (Veech et al. 2002; Crist et al. 2003). We calculated the site-level Shannon index (α diversity of Shannon index) using the maximum count (over the four visits) of each species and used the overall Shannon index as a measure of γ diversity by summing these maximum counts for each species across the sites. We then obtained the β diversity of the Shannon index by subtraction (Crist et al. 2003). As a benchmark, we obtained the true values of these measures in an analogous way from the known true values of abundance in the simulation. We conducted these calculations using the function adipart in R package vegan 2.2-1 (Oksanen et al. 2015). In community abundance models, we obtained estimates of α, β, and γ diversity of species richness and the Shannon index using species- and site-specific abundance estimates. We also obtained estimates of the expected species richness at each site (Eq. 6), and the corresponding estimates of mean α and β diversity of species richness. We used an alternative location of the data augmentation indicator variable to minimize computation time (see “Discussion”). We conducted the analysis using the function autojags in R package jagsUI 1.3.1 (Kellner 2015), discarded as a burn-in the first 10,000 iterations, and then ran additional sets of 100,000 iterations until convergence was achieved (with parallel computing). For each level of the mean detection probability (\(\bar{p}\)) we generated and analyzed ten data sets, and obtained mean values of species diversity indices through the replications.
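Additive partitioning, β = γ − mean(α), can be sketched for both species richness and the Shannon index as follows. This Python example is not the vegan implementation; it assumes unweighted means across sites and natural logarithms, and the function names and the toy abundance matrix are hypothetical.

```python
import math

def shannon(counts):
    """Shannon index of a vector of abundances (natural logarithm)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def additive_partition(abund):
    """abund[i][j] is the abundance of species i at site j.

    Returns (alpha, beta, gamma) tuples for richness and for the Shannon index.
    """
    n_site = len(abund[0])
    site_cols = [[sp[j] for sp in abund] for j in range(n_site)]
    alpha_rich = sum(sum(1 for c in col if c > 0) for col in site_cols) / n_site
    gamma_rich = sum(1 for sp in abund if sum(sp) > 0)
    alpha_sh = sum(shannon(col) for col in site_cols) / n_site
    gamma_sh = shannon([sum(sp) for sp in abund])   # pooled over all sites
    return {
        "richness": (alpha_rich, gamma_rich - alpha_rich, gamma_rich),
        "shannon": (alpha_sh, gamma_sh - alpha_sh, gamma_sh),
    }

# Two species that never co-occur: complete turnover between the two sites.
print(additive_partition([[1, 0], [0, 1]]))
```

Passing maximum counts gives the detection-naïve measures, while passing true or estimated abundances gives the benchmark or model-based measures.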


Experiment 1: bias and precision of community abundance models

Bias and precision showed similar responses to the four factors (Figs. 1, 2, 3; and S1–3). The effects of mean abundance (\(\bar{\lambda }\)), detection probability (\(\bar{p} = 1/[1 + \exp ( - \mu_{q} )]\)), number of visits (\(n_{visit}\)), and number of sites (\(n_{site}\)) were as expected; that is, bias and imprecision decreased with increasing values of all these variables (Figs. 1, 2, 3; and S1–3). The results of the ANOVA suggested that variation in \(\bar{p}\) was the most important of the four experimental factors for explaining variation in bias and imprecision of mean abundance (i.e., the intercepts \(\beta_{0i}\)); its sums of squares (SS) and/or mean SS were the largest (Table 1). In contrast, variation in abundance (\(\bar{\lambda }\)) was the most important factor for explaining variation in bias and imprecision of the slopes (\(\beta_{1i}\)) and species richness (R). When mean abundance \(\bar{\lambda }\) was larger than or equal to 0.5 (i.e., the average species had mean abundance greater than or equal to 0.5), the bias and imprecision in estimates of \(\beta_{1i}\) and R decreased greatly (Figs. 2, 3; and S2–3). For all parameters, the number of surveyed sites (\(n_{site}\)) was more important than the number of visits (\(n_{visit}\)), and this also held when we removed the data with \(n_{site}\) = 40 (Table S1). We found evidence for interacting effects for all three parameters (\(\beta_{0i}\), \(\beta_{1i}\), and R), and specific interaction effects were different between the two species-level parameters (\(\beta_{0i}\), \(\beta_{1i}\)). Interaction terms important to the community-level property R included those important to species-level parameters (Table 1). Interaction terms between \(\bar{\lambda }\) and \(n_{site}\), and between \(\bar{\lambda }\) and \(\bar{p}\), were the most important, indicating that larger \(n_{site}\) (≥20) and \(\bar{p}\) (≥0.2) and large values of \(\bar{\lambda }\) (≥0.5) increased the effectiveness of community abundance models in accurately estimating species richness. These conditions also yielded relatively good performance for the intercept and slope estimators (Figs. 1, 2; and S1–2).

Fig. 1

Bias of intercept estimates as a function of the number of sampling sites (s on the right axis = 10, 20, 40) and the number of visits (v on the left axis = 2, 3, 4). One plot is produced for each combination of \(\bar{\lambda}\) (\(= \exp[\mu_{\beta_{0}}]\)) and \(\bar{p}\) (\(= 1/[1 + \exp(-\mu_{r})]\)). Each bar indicates the averaged value of ten replicate data sets for each combination of s and v. Fig. S1 shows the corresponding results for the imprecision measure

Fig. 2

Bias of the slope estimates as a function of the number of sampling sites (s on the right axis = 10, 20, 40) and the number of visits (v on the left axis = 2, 3, 4). See Fig. 1 for details. Fig. S2 shows the corresponding results for the imprecision measure

Fig. 3

Bias of species richness estimates as a function of the number of sampling sites (s on the right axis = 10, 20, 40) and the number of visits (v on the left axis = 2, 3, 4). See Fig. 1 for details. Fig. S3 shows the corresponding results for the imprecision measure

Table 1 ANOVA tables for the contributions to variations in bias and precision by four factors and their interactions

Experiment 2: community abundance and occupancy models as species richness estimators

Because the convergence of MCMC chains took a very long time for the combination of \(\bar{\lambda} = 0.1\) and \(\bar{p} = 0.1\) (e.g., more than 2–3 days for a single data set), we did not include the analyses for this combination in our simulation results. For the other combinations of \(\bar{\lambda}\) and \(\bar{p}\), there were only modest differences in bias and imprecision of the estimates under the two versions of the community models (Figs. 4, S4). However, the bias and imprecision of estimators under the community abundance model were slightly smaller than those of estimators under the community occupancy model for the scenario producing the highest expected counts (\(\bar{\lambda} = \bar{p} = 0.5\)). The time to achieve convergence in the community occupancy model was less than half of that for the community abundance model (Table 2).

Fig. 4

Bias of species richness estimates derived from the community abundance and community occupancy models. Community abundance models (CAMs) and community occupancy models (COMs) have different structures; therefore, only the estimates of species richness (R) are directly comparable. Fig. S4 shows the analogous results for the imprecision measure

Table 2 Time required for convergence by the community abundance model (CAM) and the community occupancy model (COM) for five combinations of mean abundance and mean detection probability

Experiment 3: partitioning of γ diversity into α and β diversity

In the scenario with the lowest detection probability (\(\bar{p} = 0.1\)), community abundance models slightly overestimated the γ and α diversity of species richness (Fig. 5). On the other hand, the observed (i.e., detection-naïve) values of α diversity were much lower than the true values and those of γ diversity were slightly lower, resulting in the overestimation of β diversity. Community abundance models produced unbiased estimates of the α, β, and γ diversity based on the Shannon index, while the observed values again underestimated α diversity and overestimated β diversity. The bias of estimates from community abundance models and also of the observed values decreased for larger \(\bar{p}\), and was negligible even for the observed values once \(\bar{p}\) reached 0.5. The expected α diversity of species richness (obtained from Eq. 6) was almost the same as the posterior mean of the number of species with at least one individual at each site. This equivalence suggests that the two metrics can be used interchangeably as estimates of local species richness.

Fig. 5

Additive partitioning of species richness and Shannon indices under a sampling design with 40 sites and four visits and different mean detection probabilities. For this analysis, we subtracted the mean site-level α diversity from the overall γ diversity, and thus obtained β diversity. We fixed mean abundance (\(\bar{\lambda}\)), the number of sampling sites, and the number of visits, and only varied the mean detection probability (\(\bar{p}\)). For each of three values of \(\bar{p}\), we repeated the generation of communities and their analyses ten times, and here show mean diversity indices. The leftmost bars show the true values; the bars labeled CAMs show the estimates from community abundance models, and the rightmost bars show the naïve values. For species richness, we also obtained the expected α diversity from Eq. 6
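The additive partitioning used in Experiment 3 (β = γ − mean α) can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the simulation code of this study: all species share one hypothetical mean abundance (`lam`), the community has a hypothetical 50 species, Shannon γ is computed from counts pooled over sites, and the naïve abundance is taken as the maximum count over visits. The function names are our own.

```python
import numpy as np

def shannon(counts):
    """Shannon index H = -sum(p * ln p) for one vector of counts."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    if total == 0:
        return 0.0
    p = counts[counts > 0] / total
    return float(-(p * np.log(p)).sum())

def additive_partition(abundance):
    """Return (alpha, beta, gamma) for richness and Shannon diversity.

    abundance: array of shape (n_species, n_sites).
    beta is obtained additively as gamma minus mean site-level alpha.
    """
    abundance = np.asarray(abundance, dtype=float)
    n_sites = abundance.shape[1]
    alpha_rich = float((abundance > 0).sum(axis=0).mean())
    gamma_rich = float((abundance.sum(axis=1) > 0).sum())
    alpha_h = float(np.mean([shannon(abundance[:, j]) for j in range(n_sites)]))
    gamma_h = shannon(abundance.sum(axis=1))  # assumption: pooled counts
    return {
        "richness": (alpha_rich, gamma_rich - alpha_rich, gamma_rich),
        "shannon": (alpha_h, gamma_h - alpha_h, gamma_h),
    }

rng = np.random.default_rng(1)
n_species, n_sites, n_visits = 50, 40, 4  # 50 species is hypothetical
lam, p = 0.5, 0.1                         # hypothetical mean abundance; lowest p-bar

# True abundances, then binomial thinning on each visit (imperfect detection).
true_n = rng.poisson(lam, size=(n_species, n_sites))
counts = rng.binomial(true_n[:, :, None], p, size=(n_species, n_sites, n_visits))
naive_n = counts.max(axis=2)  # detection-naive "abundance"

true_part = additive_partition(true_n)
naive_part = additive_partition(naive_n)
for name in ("richness", "shannon"):
    print(name, "true (a, b, g):", true_part[name], "naive:", naive_part[name])
```

Because every naïve count is bounded above by the true abundance, naïve α and γ richness can never exceed the true values; how the shortfall divides between α and γ determines the apparent β diversity.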


Performance of community abundance models in small-sample situations

In this first published performance assessment of the community abundance models of Yamaura et al. (2012), we found that different variables were most influential in explaining the magnitude of the bias and imprecision of estimates of species-level intercepts (\(\beta_{0i}\)), slopes (\(\beta_{1i}\)), and species richness (\(R\)). The average detection probability (\(\bar{p}\)) had the most influence on the estimates of \(\beta_{0i}\). In contrast, average abundance (\(\bar{\lambda}\)) was most influential on the estimates of \(\beta_{1i}\) and \(R\). It has previously been shown that the precision and accuracy of intercepts in the abundance model (i.e., the estimated abundance) are greatly affected by detection probability in N-mixture models (McIntyre et al. 2012; Yamaura 2013), which is consistent with the results of our present study. Both results emphasize the difficulty of achieving accurate estimates of community abundance when the detection probability is very low (which of course is a manifestation of “the first law of capture-recapture”: p. 246 in Kéry and Royle 2016). Modifying the sampling methods to increase \(\bar{p}\) would be important in such cases, for example, by counting for ≥15 min instead of 5 min (Drapeau et al. 1999), or by broadcasting mobbing calls (Senzaki et al. 2015) when counting birds.

In contrast, \(\bar{\lambda}\) was the most important factor explaining the magnitude of bias and imprecision of \(\beta_{1i}\) and \(R\). This highlights the difficulty in quantifying species richness and changes in community composition when abundances of many species are low. In such cases, an increase in the sampling area would be useful to increase the number of individuals exposed to sampling. For instance, Bibby et al. (2000) recommended adopting line transects instead of point counts when bird densities are low. Indeed, relatively accurate estimates would be expected with \(\bar{\lambda} \ge 1\) even for low values of \(\bar{p}\) (Figs. 2, 3; and S2–3). Of course, this would also entail an increase in effort.

For all three parameters that were of primary interest in our simulation (\(\beta_{0i}\), \(\beta_{1i}\), and \(R\)), the number of sampling sites (\(n_{\text{site}}\)) was more important than the number of visits (\(n_{\text{visit}}\)). This suggests that when we model abundances of multiple species at sites with different environments, at least in the small-sample situations considered here, it would be more beneficial to increase the number of sampling sites than to increase the number of replicate surveys. We note that \(n_{\text{site}}\) was more important than \(n_{\text{visit}}\) even after removing the replications where \(n_{\text{site}} = 40\). The trade-off between the number of sites and the number of visits per site has been addressed several times in single-species models. In those studies, an increase in the number of visits (e.g., from 2 to 3 or 4) was typically found to be more important than an increase in the number of sampling sites for improving the estimates under single-species occupancy models (Guillera-Arroita and Lahoz-Monfort 2012) and N-mixture models (McIntyre et al. 2012), and also under community occupancy models (Sanderlin et al. 2014). The results of previous studies therefore do not concur with ours. In single-species occupancy models, the importance of \(n_{\text{visit}}\) increased when detection probability was smaller (MacKenzie and Royle 2005; Guillera-Arroita et al. 2010; Guillera-Arroita and Lahoz-Monfort 2012). The differences between previous studies and this study suggest that the importance of \(n_{\text{site}}\) relative to \(n_{\text{visit}}\) may increase when multiple species are modeled whose abundance varies among sites. In other words, when sampling sites are heterogeneous with respect to densities of individual species, we need to sample more sites to improve the estimates of community abundance models.

Community abundance and occupancy models as species richness estimators

Our simulation experiments suggest that increases in mean detection probability \(\bar{p}\) (≥0.2) and especially in the number of sampling sites \(n_{\text{site}}\) (≥20) are useful options at large values of mean abundance \(\bar{\lambda}\) (≥0.5) for improving the performance of community abundance models as a species richness estimator. Community abundance models provide an abundance-based species richness estimator, which is unique in that the abundance and detection probability of each species are treated separately, and species richness is estimated via an ensemble of species-level elemental Poisson models. Estimates of site-specific species richness (\(R_{j}\)) are obtained from the posterior distributions of the number of species with \(N_{ij} w_{ij} > 0\) (but see below).

Our experiments also suggest that, as expected, when mean species-level abundance is small (\(\bar{\lambda} \le 0.5\)), estimates of species richness by community abundance and community occupancy models are fairly similar. Differences between the two models were evident only when the expected counts were the largest (\(\bar{\lambda} = \bar{p} = 0.5\)). In that case, bias and imprecision were smaller under the community abundance models than under the corresponding community occupancy models. In N-mixture models for a single species, Yamaura (2013) found that the expected abundance has to be 1 or more to achieve accurate estimates in small-sample situations. This is probably why community abundance models attained better estimation results than did community occupancy models. In these cases, we would be able to make the best use of count data to increase the accuracy of species richness estimates.
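One way to see why counts add little information at low expected counts is the following back-of-the-envelope sketch. It assumes Poisson abundance with mean λ and independent detection of individuals with probability p, so that a single-visit count is Poisson with mean λp; under these assumptions the probability that a site is occupied is 1 − exp(−λ). The function names and the parameter combinations in the loop are illustrative only and are not taken from Table 2.

```python
import math

def occupancy_prob(lam):
    """P(N > 0) when N ~ Poisson(lam)."""
    return 1.0 - math.exp(-lam)

def frac_counts_above_one(lam, p):
    """P(y >= 2 | y >= 1) for a single-visit count y ~ Poisson(lam * p).

    This is the share of detections whose count carries information
    beyond a plain detection/non-detection record.
    """
    mu = lam * p
    p_detect = 1.0 - math.exp(-mu)               # P(y >= 1)
    p_two_plus = p_detect - mu * math.exp(-mu)   # P(y >= 2)
    return p_two_plus / p_detect

# Illustrative combinations of mean abundance and detection probability.
for lam, p in [(0.1, 0.5), (0.5, 0.1), (0.5, 0.5)]:
    print(
        f"lambda={lam}, p={p}: "
        f"occupancy={occupancy_prob(lam):.3f}, "
        f"expected count={lam * p:.3f}, "
        f"P(count>=2 | detected)={frac_counts_above_one(lam, p):.3f}"
    )
```

When the expected count λp is small, nearly every detection is a count of one, so the count data are almost equivalent to detection/non-detection data and the two model types have little to distinguish them.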

Location of an indicator variable in the hierarchical model formulation

In our data augmentation scheme, we originally inserted the data augmentation variable (\(w_{i}\)) into the detection model (Eq. 4 in this study; Yamaura et al. 2011). However, in community occupancy models, this variable is usually inserted into the ecological model (Royle et al. 2007; Royle and Dorazio 2008). If we also do this in community abundance models, the indicator variable is inserted into Eq. 1, and the series of equations defining the likelihood of the hierarchical model becomes the following:

$$w_{ij} \sim \text{Bernoulli}(\Omega), \tag{7}$$
$$N_{ij} \sim \text{Poisson}(\lambda_{ij} w_{ij}), \tag{8}$$
$$y_{ijk} \sim \text{Binomial}\left(N_{ij}, p_{i}\right). \tag{9}$$

This change in the location of the indicator variable zeroes out the abundance of species that are not part of the sampled community (and therefore have structural zeros in the observed data), and the subsequent estimation appears to be more efficient. Indeed, for the same detection history data under \(\bar{\lambda} = 0.5\), \(\bar{p} = 0.2\), \(n_{\text{site}} = 20\), \(n_{\text{visit}} = 3\), this change reduced computation time to less than 30% of the original (from 232 to 64 min), and the two sets of estimates appeared to be identical up to MC simulation error. We then conducted additional simulations of the first experiment for \(\bar{\lambda} = 0.1\), \(\bar{p} = 0.5\), \(n_{\text{site}} = 10\), \(n_{\text{visit}} = 2\) with ten simulated data sets and fit both parameterizations of the community abundance model. The computation time for the model with the changed location of the indicator variable was again much shorter (31 ± 1 min vs. 61 ± 33 min for the original parameterization). We note that these computation times did not include the initial 110,000 iterations including burn-in (ca. 36 min in this case), during which chain convergence had not yet been achieved. Long computation times are a challenge in the application of these parameter-rich models; therefore, we recommend the parameterization in Eqs. 7–9. We note that the same algorithmic equivalence in a community abundance model with data augmentation has independently been discovered by Tobler et al. (2015).
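To make the structure of this parameterization concrete, the following sketch simulates data from a model with the indicator placed in the ecological (abundance) layer. It is a data-generating illustration only, not the MCMC code used in this study. Several simplifications are assumptions of ours: the indicator is treated as species-level (one value per species in the augmented list), λ and p are held constant across species and sites, and the size of the augmented list (`n_aug`) and the inclusion probability (`omega`) are hypothetical values.

```python
import numpy as np

def simulate_cam(n_aug, n_sites, n_visits, omega, lam, p, rng):
    """Simulate from a community abundance model with data augmentation.

    The indicator w enters the abundance model, so species that are not
    part of the community have structurally zero abundance everywhere.
    """
    # w = 1 if the augmented species is part of the community (assumed species-level)
    w = rng.binomial(1, omega, size=n_aug)
    # N ~ Poisson(lambda * w): a Poisson mean of 0 always yields N = 0
    n = rng.poisson(lam * w[:, None], size=(n_aug, n_sites))
    # y ~ Binomial(N, p): each individual is detected independently on each visit
    y = rng.binomial(n[:, :, None], p, size=(n_aug, n_sites, n_visits))
    return w, n, y

rng = np.random.default_rng(42)
w, n, y = simulate_cam(
    n_aug=60, n_sites=20, n_visits=3,  # n_aug is hypothetical
    omega=0.7,                         # hypothetical inclusion probability
    lam=0.5, p=0.2, rng=rng,
)

# Site-specific richness: number of species with at least one individual.
site_richness = (n > 0).sum(axis=0)
# Naive richness: number of species detected at least once at each site.
naive_richness = (y.sum(axis=2) > 0).sum(axis=0)
print("species in community:", int(w.sum()))
print("mean true site richness:", site_richness.mean())
print("mean naive site richness:", naive_richness.mean())
```

Because abundance is fixed at zero whenever the indicator is zero, no latent abundances need to be updated for species outside the community, which is one plausible reason for the shorter computation times reported above.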

Partitioning γ diversity into α and β diversity subject to imperfect detection

Quantifying β diversity is an important component of community analysis (Anderson et al. 2011; Legendre 2014), and our results using additive partitioning showed that imperfect detection can be confounded with the turnover of species composition among sites. Community abundance models successfully resolved this confounding of imperfect detection and species turnover, and yielded more accurate estimates of α, β, and γ diversity. The confounding of imperfect detection with ecological parameters has been observed in single-species situations for population extinction and colonization rates (Moilanen 2002; Kéry 2004; Kéry et al. 2006; 2013). Beck et al. (2013) also showed that most metrics measuring community differences are susceptible to incomplete sampling, and called for the development of robust metrics. In our simulation experiments, we assumed that the detection probability of individual species was constant among the sampling sites, but this assumption cannot always be accepted in field surveys. For example, open habitats can have higher detection probability than closed habitats (e.g., Ruiz-Gutiérrez et al. 2010), and detection-naïve diversity measures can be confounded with covariates (McNew and Handel 2015). In such cases, our community abundance model can easily be extended to relax this assumption (Kéry 2008; Yamaura 2013). Hierarchical community abundance and occupancy models are flexible and powerful methods for dealing with these sampling issues, and open up a new avenue for studying biological communities in varied situations.