Environmental and Ecological Statistics, Volume 21, Issue 4, pp 611–625

Partitioning of \(\alpha \) and \(\beta \) diversity using hierarchical Bayesian modeling of species distribution and abundance

Authors

  • J. Zhang
    • Department of Statistics, Miami University
  • Thomas O. Crist
    • Department of Biology, Institute for the Environment and Sustainability, Miami University
  • Peijie Hou
    • Department of Statistics, University of South Carolina

DOI: 10.1007/s10651-013-0271-2

Cite this article as:
Zhang, J., Crist, T.O. & Hou, P. Environ Ecol Stat (2014) 21: 611. doi:10.1007/s10651-013-0271-2

Abstract

Diversity partitioning is becoming widely used to decompose the total number of species recorded in an area or region \((\gamma )\) into the average number of species within samples \((\alpha )\) and the average difference in species composition \((\beta )\) among samples. Single-value metrics of \(\alpha \) and \(\beta \) diversity are popular because they may be applied at multiple scales and because of their ease in computation and interpretation. Studies thus far, however, have emphasized observed diversity components or comparisons to randomized, null distributions. In addition, prediction of \(\alpha \) and \(\beta \) components using environmental or spatial variables has been limited to more extensive data sets because multiple samples are required to estimate single \(\alpha \) and \(\beta \) components. Lastly, observed diversity components do not incorporate variation in detection probabilities among species or samples. In this study, we used hierarchical Bayesian models of species abundances to provide predictions of \(\alpha \) and \(\beta \) components in species richness and composition using environmental and spatial variables. We illustrate our approach using butterfly data collected from 26 grassland remnants to predict spatially nested patterns of \(\alpha \) and \(\beta \) based on the predicted counts of butterflies. Diversity partitioning using a Bayesian hierarchical model incorporated variation in detection probabilities by butterfly species and habitat patches, and provided prediction intervals for \(\alpha \) and \(\beta \) components using environmental and spatial variables.

Keywords

Bayesian hierarchical modeling · Butterflies · Diversity partitioning · Multiple scales · Markov chain Monte Carlo (MCMC) · Zero-inflated Poisson distribution

1 Introduction

The partitioning of the total species richness \((\gamma )\) into the average number of species within samples \((\alpha )\) and the shift in species composition among samples \((\beta )\) is now widely used to quantify species diversity across multiple spatial and temporal scales (Lande 1996; Wagner et al. 2000; Crist et al. 2003; Crist and Veech 2006; Tuomisto 2010; Anderson et al. 2011). An intuitively appealing aspect of diversity partitioning is that the \(\beta \) component can be expressed as the number of species (the species richness) in the same manner as \(\alpha \) (additive partitioning) or as scaled units of \(\alpha \) (multiplicative partitioning), providing simpler expressions of the turnover in species composition than their multivariate counterparts (Anderson et al. 2011) that can be communicated more widely to non-specialists. Unlike multivariate ordination, however, additive or multiplicative \(\beta \) components cannot be related to continuous environmental variables unless there are replicate landscapes or regions because single-value metrics of \(\beta \) diversity are derived from the relationship between the mean \((\alpha )\) and pooled \((\gamma )\) species richness of the samples rather than a matrix of pairwise dissimilarities among samples. Thus, local and landscape studies using diversity components have relied on comparisons of observed \(\alpha \) and \(\beta \) components with those expected from null hypothesis tests conditioned on sample abundances among habitat types or hierarchical sampling scales (Crist et al. 2003; Anderson et al. 2011). Studies have used environmental variables to model \(\alpha \) and \(\beta \) components of species richness only when there are multiple estimates of diversity components from different landscapes or regions (Roschewitz et al. 2005; Veech and Crist 2007; Hofer et al. 2008; Kraft et al. 2011; Stegen et al. 2013). 
Moreover, although interval estimates from null hypothesis tests can be used to compare observed diversity components, they do not provide point predictions or prediction intervals that depend on environmental variables, or the variation in detection probabilities by species and samples.

Here, we report on a new approach to partitioning diversity components using Bayesian Hierarchical Models with Poisson or zero-inflated Poisson (ZIP) distributions of species abundances across sampling locations. This approach allows investigators to use environmental, design, or spatial variables to predict variation in species abundances among samples and to estimate \(\alpha \) and \(\beta \) components from spatially dependent samples. Our approach is equally applicable to additive or multiplicative \(\beta \) components, recognizing that each has different properties that may be useful to investigators, depending on their ecological questions (Anderson et al. 2011). We illustrate the use of the proposed models to estimate both additive and multiplicative \(\beta \) components using a butterfly data set from 26 small grassland remnants in southeastern Ohio, USA (see Crist and Veech 2006).

2 Motivating data and diversity measures

This section introduces the definitions of \(\alpha \) and \(\beta \) diversity with an illustrative example. The motivating data set contains the abundance of butterflies collected from 26 isolated grassland remnants surrounded by a forest matrix at the Edge-of-Appalachia Preserve, Ohio, USA. Grassland patches were small (0.1–2.5 ha) and clumped into six clusters of 3–9 patches on soils derived from calcareous rock outcrops. Patches within clusters were separated by an average distance of 0.26 km; patches among clusters were separated by an average of 2.93 km. This created two natural hierarchical levels of sampling. Butterfly counts of each species were recorded along Pollard transects in five surveys of each patch conducted during summer 2004. The \(\gamma \) component is the total species richness found by pooling together all 26 patches. Here there are two \(\alpha \)-components, representing the mean species richness at the level of the patch and cluster. The \(\alpha _{\textit{patch}}\) is the mean number of species per patch, and the \(\alpha _{\textit{cluster}}\) is the mean number of species per cluster obtained by pooling together all of the species present in those patches sampled within each cluster. There were 32, 26, 28, 28, 36 and 40 distinct butterfly species observed and recorded in the six clusters of the motivating data; hence the \(\alpha \)-component at the cluster level, \(\alpha _{\textit{cluster}}\), is the average of these species richness counts, i.e., \(\alpha _{\textit{cluster}}=31.7\). Likewise, the \(\beta \)-component of species richness can be determined at two scales: the turnover in species composition among patches \((\beta _{\textit{patch}})\), and the turnover in composition among clusters of patches \((\beta _{\textit{cluster}})\). In additive partitions \((\beta ^A=\gamma -\alpha )\), the \(\beta \) components are the mean number of species that are absent from a randomly chosen patch or cluster (Crist et al. 2003), whereas in multiplicative partitions \((\beta ^M=\gamma /\alpha )\) the \(\beta \) components are the effective number of communities at the scale of the patch or cluster (Veech et al. 2002).

For the Ohio butterfly data, a total of \(\gamma =49\) species and 1,334 individuals were recorded in the 26 patches. The Chao estimate of species richness was 51, so virtually all of the species present were likely sampled in this survey. Additive partitions of \(\beta \) showed that \(\alpha _{\textit{patch}}=16.7\) and \(\alpha _{\textit{cluster}}=31.7\) species, so that \(\beta _{\textit{patch}}^A=15.0\) and \(\beta _{\textit{cluster}}^A=17.3\). The corresponding multiplicative partitions of \(\beta \) were \(\beta _{\textit{patch}}^M=1.90\) and \(\beta _{\textit{cluster}}^M=1.55\). Several environmental variables were also measured for each patch and are denoted as follows: \(X_1=\) the natural log of the area of the habitat patch, \(X_2=\) connectivity, measured as the inverse of the area-weighted isolation between habitat patches, \(X_3=\) the natural log of the number of inflorescences along transects, and \(X_4=\) the number of potential larval host plant species in each patch based on the sampled pool of butterfly species.
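The observed components quoted above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the variable names are ours, and the nested forms (patch-level \(\beta \) computed against \(\alpha _{\textit{cluster}}\), cluster-level \(\beta \) computed against \(\gamma \)) are an assumption inferred from the reported values rather than a formula stated explicitly in the text.

```python
# Observed values quoted in the text.
cluster_richness = [32, 26, 28, 28, 36, 40]  # species recorded per cluster
gamma = 49                                   # species pooled over all 26 patches
alpha_patch = 16.7                           # mean species per patch, as reported

alpha_cluster = sum(cluster_richness) / len(cluster_richness)

# Additive partition across nested levels (assumed form):
#   gamma = alpha_patch + beta_patch + beta_cluster
beta_patch_add = alpha_cluster - alpha_patch
beta_cluster_add = gamma - alpha_cluster

# Multiplicative partition across nested levels (assumed form):
#   gamma = alpha_patch * beta_patch * beta_cluster
beta_patch_mult = alpha_cluster / alpha_patch
beta_cluster_mult = gamma / alpha_cluster

print(round(alpha_cluster, 1), round(beta_patch_add, 1), round(beta_cluster_add, 1))
print(round(beta_patch_mult, 2), round(beta_cluster_mult, 2))
```

Rounded to the precision used in the text, these expressions return 31.7, 15.0 and 17.3 for the additive components and 1.90 and 1.55 for the multiplicative components.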

3 Bayesian hierarchical models

Instead of computing the diversity measurements based on observed average and total richness or diversity (shown in previous section), we show that a Bayesian hierarchical approach can be used based on species distribution and abundance data to provide point estimates and prediction intervals for \(\alpha \) and \(\beta \) components of species richness or diversity. Because it starts from abundance data, one or more predictor or spatial variables may also be used in the estimation of \(\alpha \) and \(\beta \) components. For the remainder of the paper, we focus on the estimation of components of species richness, but the same approach could be used to estimate effective species diversity derived from Shannon entropy or Simpson index (Jost 2007; Tuomisto 2010).

Modeling species abundances has become an important research topic for statisticians. Bayesian hierarchical models have been widely used to analyze the distribution of plants and animals (Gelfand et al. 2006). The Poisson distribution has been widely used to model the abundance of species (for examples, see Caughley and Grice 1982; Sandland and Cormack 1984). One limitation of approximating the abundance count with a Poisson random variable is that the variance of a Poisson random variable is equal to its mean. One challenge with species abundance data, however, is an excess number of zeros, which might cause overdispersion (variance greater than the mean), and hence increase the proportion of zeros in predictions. In the motivating data set, 39 out of 49 species were not observed in 10 or more of the 26 patches and were therefore recorded as zeros there; around \(66\,\%\) of the 49 \(\times \) 26 = 1,274 recorded species abundances were zeros. Zeros in the sampled data may arise because species are not present (“true zeros”) or because they were not detected (“structural zeros”). Estimates of diversity components will be affected if the difference between these two sources of zeros is not incorporated into the data analysis. Different modeling options have been proposed to address the zero-inflation in count data. Researchers have broadly studied the zero-inflated Poisson (ZIP) model, which has been widely used in industry (e.g., manufacturing defects, Lambert 1992), toxicology (e.g., to accommodate individual exposure, Lee et al. 2001), psychometric assessment (e.g., to model both propensity and level perspectives, Wang 2010) and many other fields. Later, ZIP and zero-inflated binomial (ZIB) regression models with random effects were discussed in Hall and Zhang (2004). 
Excellent comprehensive reviews of these modeling options are given in Ridout and Hinde (1998) and Potts and Elith (2006), and Ridout et al. (2001) developed a formal score test to help practitioners choose between a ZIP model and a zero-inflated negative binomial (ZINB) alternative. The zero-inflation characteristic of species distribution data has been noted, and models assuming different zero-inflated distributions have been applied to the analysis of such data. For example, to model species richness using “presence/absence” data, Dorazio et al. (2006) used a ZIB model to describe the detection probabilities in repeated surveys of birds and butterflies. In the present study, we focused on the analysis and prediction of species abundances rather than the “presence/absence” of species. The diversity measures were then computed from the species abundances. Therefore, we proposed using a ZIP model to analyze overdispersed data with excess zeros, a description that fits our butterfly data and species abundance data in general.

It is of interest to compare the ZIP model with the traditional Poisson regression model applied in such analyses. It is also of interest to explore the possibility of incorporating environmental covariates into the analysis when they are available. We therefore developed a series of Bayesian hierarchical models to analyze the Ohio butterfly data in the present study, including a Poisson regression using sampling information (species abundances at different sampling scales) only, a ZIP model using sampling information only, and a ZIP model incorporating the available environmental variables.

3.1 Poisson regression using sampling information only

We begin by building a Bayesian alternative to the traditional Poisson regression model, i.e., modeling the species abundance with a Poisson random variable to account for the variation from different levels or scales (Clark 2007), such as grassland patches and clusters for the butterfly data. The use of prior probability distributions allows incorporation of knowledge from previous studies, and facilitates control for confounding factors.

A Bayesian Poisson Regression model is fitted to model the number of butterflies within each patch and cluster as follows:
$$\begin{aligned} Y_{ijk}&\sim \textit{independent\ Poisson} (\lambda _{ijk}), \end{aligned}$$
(1)
$$\begin{aligned} \textit{log}(\lambda _{ijk})&= \mu +\tau _i+\psi _j+\theta _k, \end{aligned}$$
(2)
where \(i=1,2,\ldots ,26,\,j=1,2,\ldots ,6\) and \(k=1,2,\ldots ,49\). Here \(Y_{ijk}\) denotes the count of butterflies of the kth species in the ith patch nested in the jth cluster, and \(\lambda _{ijk}\) denotes the mean of the count. The count of butterflies is assumed to follow a Poisson distribution with mean \(\lambda _{ijk}\). It is common to assume that the logarithm of the Poisson mean is a linear function of the factors affecting the distribution of species, as shown in the log-linear regression model in Eq. (2). Here \(\mu \) denotes the overall intercept, and \({\varvec{\tau }} =(\tau _1, \tau _2,\ldots , \tau _{26}),\,{\varvec{\psi }}=(\psi _1, \psi _2,\ldots , \psi _6),\,{\varvec{\theta }}=(\theta _1, \theta _2,\ldots , \theta _{49})\) are the fixed effects of patch, cluster and species, respectively.
Following Bayes' theorem, the prior distributions of the parameters \(\mu ,\,{\varvec{\tau }},\,{\varvec{\psi }}\) and \({\varvec{\theta }}\), along with the likelihood function, determine the joint posterior distribution:
$$\begin{aligned} p(\mu , {\varvec{\tau }}, {\varvec{\psi }}, {\varvec{\theta }}\mid \mathbf {Y}) \propto P(\mathbf {Y}\mid \mu , {\varvec{\tau }}, {\varvec{\psi }}, {\varvec{\theta }}) \pi (\mu ) \pi ({\varvec{\tau }}) \pi ({\varvec{\psi }}) \pi ({\varvec{\theta }}), \end{aligned}$$
(3)
where \(\mathbf {Y}\) denotes the vector combining all the observed responses \(Y_{ijk}\)’s and \(\pi (\cdot )\) denotes a prior distribution. Normal prior distributions of the parameters are used in the proposed model, which are defined as follows:
$$\begin{aligned} \mu&\sim N(0,\sigma _\mu ^2), \end{aligned}$$
(4)
$$\begin{aligned} \tau _i&\sim N(0,\sigma _\tau ^2), \quad i=1,2,\ldots ,26, \end{aligned}$$
(5)
$$\begin{aligned} \psi _j&\sim N(0,\sigma _\psi ^2), \quad j=1,2,\ldots ,6, \end{aligned}$$
(6)
$$\begin{aligned} \theta _k&\sim N(0,\sigma _\theta ^2), \quad k=1,2,\ldots ,49. \end{aligned}$$
(7)
In a hierarchical setting, the uncertainty in the parameters is accounted for with higher level priors of the variance parameters, i.e., \(\sigma _\mu ^2,\,\sigma _\tau ^2,\,\sigma _\psi ^2\) and \(\sigma _\theta ^2\). Following the recommendation of Gelman (2006), rather than assigning prior distributions for the variances, we used uniform priors for the standard deviation parameters,
$$\begin{aligned} \sigma _\mu&\sim \textit{Uniform}(0,5), \end{aligned}$$
(8)
$$\begin{aligned} \sigma _\tau&\sim \textit{Uniform}(0,5), \end{aligned}$$
(9)
$$\begin{aligned} \sigma _\psi&\sim \textit{Uniform}(0,5), \end{aligned}$$
(10)
$$\begin{aligned} \sigma _\theta&\sim \textit{Uniform}(0,5). \end{aligned}$$
(11)
Combining the likelihood function specified in Eqs. (1), (2) and the prior distributions specified in Eqs. (4)–(11), the joint posterior distribution of model parameters can then be derived based on Eq. (3). Samples of the model parameters were drawn from the posterior distribution using Markov chain Monte Carlo (MCMC). The R statistical computing software (version 2.14.1) together with the “R2WinBUGS” package (Sturtz et al. 2005) was used to implement the posterior sample simulation. This package enables R users to implement a Bayesian model in WinBUGS and save the simulations in arrays for easy access in R.
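To make the structure of Eq. (3) concrete, the following sketch evaluates the unnormalized log posterior implied by Eqs. (1), (2) and (4)–(11) for given parameter values. It is not the WinBUGS code used in the study; the function name, the argument layout, and the `cluster_of_patch` index array (mapping each patch to its cluster) are hypothetical choices made for illustration.

```python
import math
import numpy as np

def log_posterior(y, cluster_of_patch, mu, tau, psi, theta, sigmas, upper=5.0):
    """Unnormalized log posterior for the Poisson model, Eqs. (1)-(11).

    y: (n_patch, n_species) array of counts.
    cluster_of_patch: integer array, cluster index j of each patch i (assumed layout).
    sigmas: dict of standard deviations with keys 'mu', 'tau', 'psi', 'theta'.
    """
    # Uniform(0, upper) hyperpriors are constant inside the support.
    if any(not (0.0 < s < upper) for s in sigmas.values()):
        return -math.inf

    y = np.asarray(y, dtype=float)
    tau, psi, theta = (np.asarray(a, dtype=float) for a in (tau, psi, theta))
    cluster_of_patch = np.asarray(cluster_of_patch)

    def normal_logpdf(x, sd):
        x = np.asarray(x, dtype=float)
        return float(np.sum(-0.5 * np.log(2 * np.pi * sd**2) - x**2 / (2 * sd**2)))

    # Eq. (2): log(lambda_ijk) = mu + tau_i + psi_j + theta_k
    log_lam = mu + tau[:, None] + psi[cluster_of_patch][:, None] + theta[None, :]

    # Eq. (1): Poisson log-likelihood summed over patches and species
    log_factorial = np.vectorize(math.lgamma)(y + 1.0)
    log_lik = float(np.sum(y * log_lam - np.exp(log_lam) - log_factorial))

    # Eqs. (4)-(7): zero-mean normal priors
    log_prior = (normal_logpdf(mu, sigmas["mu"]) + normal_logpdf(tau, sigmas["tau"])
                 + normal_logpdf(psi, sigmas["psi"]) + normal_logpdf(theta, sigmas["theta"]))
    return log_lik + log_prior
```

An MCMC sampler only needs this quantity up to an additive constant, which is why the normalizing term of Eq. (3) is omitted.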

In the implementation of the Poisson regression model, three separate MCMC chains of model parameters were simulated from the posterior distributions, with 40,000 total iterations each (the first 10,000 iterations were burn-in iterations and discarded from the samples to ensure the convergence of posterior samples). Multiple simulated MCMC chains allowed the computation of the potential scale reduction factor (Gelman and Rubin 1992; Brooks and Gelman 1998), which can be used in the diagnosis of the convergence of the MCMC chains. The potential scale reduction factor (R-hat) was computed for each parameter, and approximate convergence is diagnosed when R-hat is close to 1. In addition to visual inspection of the mixing of the MCMC chains via trace plots, the computed R-hat values indicated that our chains had run long enough. The posterior samples from all three chains after convergence were then combined to conduct the posterior inference.
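As an illustration of the convergence diagnostic, the sketch below computes a basic potential scale reduction factor for one scalar parameter from several chains, using the within-chain and between-chain variances. It is a simplified version of the Gelman and Rubin (1992) statistic without the later refinements, and the function name is our own; WinBUGS and R2WinBUGS report their own R-hat values, which may differ slightly.

```python
import numpy as np

def potential_scale_reduction(chains):
    """Basic R-hat for one parameter; chains has shape (m_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    within = chains.var(axis=1, ddof=1).mean()      # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)   # B: between-chain variance
    var_hat = (n - 1) / n * within + between / n    # pooled estimate of the variance
    return float(np.sqrt(var_hat / within))

# Three well-mixed chains of 30,000 retained draws (synthetic data for illustration).
rng = np.random.default_rng(0)
mixed = rng.normal(size=(3, 30000))
stuck = mixed + np.array([[0.0], [5.0], [10.0]])    # chains centred at different values

print(potential_scale_reduction(mixed))   # close to 1
print(potential_scale_reduction(stuck))   # well above 1
```

Values near 1 indicate that the chains agree with one another; chains that settle in different regions inflate the between-chain variance and push the statistic upward.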

The predicted abundance, \(\alpha \) and \(\beta \) components of species richness were simulated from the posterior predictive distribution. After convergence was achieved, the simulated model parameter values were used to predict abundance counts for each species within each patch and cluster, and \(\alpha \) and \(\beta \) components of species richness were computed using the predicted abundances. The point estimates and interval estimates of the predicted abundances and species richness were then obtained based on the posterior predictive samples.
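The step from predicted abundances to diversity components can be sketched as follows. The code assumes each posterior predictive draw is a patch-by-species matrix of counts, that a species is counted as present when its predicted count is positive, and that the nested additive and multiplicative forms match those of Sect. 2; the function names and the `cluster_of_patch` array are hypothetical.

```python
import numpy as np

def richness_components(counts, cluster_of_patch):
    """Diversity components for one predicted (n_patch, n_species) count matrix.

    Assumes at least one species is predicted present in every draw.
    """
    present = np.asarray(counts) > 0
    cluster_of_patch = np.asarray(cluster_of_patch)
    alpha_patch = present.sum(axis=1).mean()
    # Pool the patches of each cluster before counting species.
    alpha_cluster = np.mean([
        present[cluster_of_patch == c].any(axis=0).sum()
        for c in np.unique(cluster_of_patch)
    ])
    gamma = present.any(axis=0).sum()
    return {
        "alpha_patch": float(alpha_patch),
        "alpha_cluster": float(alpha_cluster),
        "gamma": float(gamma),
        "beta_patch_add": float(alpha_cluster - alpha_patch),
        "beta_cluster_add": float(gamma - alpha_cluster),
        "beta_patch_mult": float(alpha_cluster / alpha_patch),
        "beta_cluster_mult": float(gamma / alpha_cluster),
    }

def summarize(predicted_draws, cluster_of_patch, level=0.95):
    """Posterior mean, median and equal-tail interval of each component."""
    rows = [richness_components(d, cluster_of_patch) for d in predicted_draws]
    tail = 100 * (1 - level) / 2
    summary = {}
    for key in rows[0]:
        values = np.array([r[key] for r in rows])
        summary[key] = {
            "mean": float(values.mean()),
            "median": float(np.median(values)),
            "interval": (float(np.percentile(values, tail)),
                         float(np.percentile(values, 100 - tail))),
        }
    return summary
```

Applying `summarize` to the full set of posterior predictive draws yields the kind of mean, median and interval summary reported for each diversity component.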

Results of the Poisson regression model showed several limitations of this model (Table 1, supplementary materials). In the original data, approximately \(66\,\%\) of the 1,274 counts were zeros, and these zeros came from either the true absence of the butterflies or those that were present but not detected. The sample variance of these counts was 6.84, while the sample mean was as low as 1.05. Therefore, excess zeros occurred as well as over-dispersion, limiting the applicability of the Poisson regression model to these data. These effects resulted in higher predicted values for \(\alpha _{\textit{cluster}}\) and \(\alpha _{\textit{patch}}\) than the observed values. Therefore, the prediction intervals (PI) of several components of species richness did not cover the observed richness (Table 1, supplementary materials).

3.2 ZIP model using sampling information only

To accommodate the excess zeros in the butterfly abundance data, a ZIP model was applied as a two-component mixture model. One component is a Poisson distribution for count responses (including positive responses and “structural zeros”); the other component is a degenerate point mass at 0 for the “true zeros.” Specifically, we applied the ZIP model to the count of the butterflies within each patch and cluster as shown below:
$$\begin{aligned} Y_{ijk}=\left\{ \begin{array}{ll} 0,&{}\textit{with}\ \textit{prob}\ \eta _{ijk},\\ \textit{Poisson}(\lambda _{ijk})&{}\textit{with}\ \textit{prob}\ 1-\eta _{ijk}. \end{array} \right. \end{aligned}$$
(12)
Or equivalently,
$$\begin{aligned} p(Y_{ijk}\mid \eta _{ijk},\lambda _{ijk})\!=\![\eta _{ijk}\!+\!(1\!-\!\eta _{ijk})e^{-\lambda _{ijk}}]^{I_{Y_{ijk}=0}}\left[ (1-\eta _{ijk})\frac{e^{-\lambda _{ijk}}\lambda _{ijk}^{Y_{ijk}}}{Y_{ijk}!}\right] ^{I_{Y_{ijk}>0}},\nonumber \\ \end{aligned}$$
(13)
where \(\eta _{ijk}\) is a mixture proportion ranging from 0 to 1 that measures the probability that the kth species is absent (“true zeros”) in the ith patch nested in the jth cluster. The mean and the variance of the response are then functions of the Poisson mean and mixture proportion: \(E (Y_{ijk}) =\lambda _{ijk} (1 - \eta _{ijk}) ;\,Var (Y_{ijk}) = \lambda _{ijk}(1 - \eta _{ijk}) (1+\eta _{ijk}\lambda _{ijk})\).
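The probability function in Eq. (13) and the moment formulas above can be checked numerically. The sketch below is a minimal illustration with hypothetical function names and arbitrary parameter values (\(\lambda =2\), \(\eta =0.5\)); it sums the probability function over a truncated support, which is adequate for small \(\lambda \).

```python
import math

def zip_pmf(y, lam, eta):
    """Zero-inflated Poisson probability of count y, following Eq. (13)."""
    # Poisson term computed on the log scale to avoid overflow for large y.
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    point_mass = eta if y == 0 else 0.0
    return point_mass + (1.0 - eta) * poisson

def zip_moments(lam, eta, y_max=100):
    """Mean and variance by direct summation over y = 0, ..., y_max."""
    probs = [zip_pmf(y, lam, eta) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    second = sum(y * y * p for y, p in enumerate(probs))
    return mean, second - mean**2

lam, eta = 2.0, 0.5                       # illustrative values, not estimates
mean, var = zip_moments(lam, eta)
print(mean, lam * (1 - eta))                    # both equal 1.0 up to rounding
print(var, lam * (1 - eta) * (1 + eta * lam))   # both equal 2.0 up to rounding
```

With these values the variance is twice the mean, which shows how the mixture proportion alone produces overdispersion relative to a Poisson distribution.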
In the ZIP model, the Poisson mean \(\lambda _{ijk}\) was defined with the same log-linear function as specified in Eq. (2) and the priors specified in Eqs. (4)–(11) were used for the associated parameters. For the parameter \(\eta _{ijk}\), we considered two different options:
  • Option 1. \(\eta _{ijk}=0.5\), for all \(i,\,j\) and \(k\). This option indicated a strong assumption, i.e., the probability that a species is absent did not vary among different species or spatial locations, and it is equally possible for a species to be present or absent.

  • Option 2. \(\eta _{ijk}=\eta _k\ \sim \ \textit{Beta} (1,1),\,i =1,2,\ldots ,26,\,j = 1,2,\ldots ,6,\,k =1,2,\ldots ,49\). This prior indicated that the probability that a species is absent was assumed to be species-dependent, but not spatially varying. A \(\textit{Beta}(1, 1)\) distribution is equivalent to a \(\textit{Uniform}(0, 1)\) distribution, and is usually used as a non-informative prior for proportions.

The two options indicate how strong the prior belief is about model parameters. We introduced the two possible choices given above to illustrate the flexibility of prior specification in Bayesian hierarchical models. Other choices, such as patch-specific or cluster-specific mixture proportions, are also possible modeling options. To decide which option was favored by these data, the deviance information criterion (DIC) was computed for the ZIP models under each of the two options. DIC (Linde 2005; Celeux et al. 2006) has been introduced as a Bayesian measure of model complexity and fit. It can be easily calculated from the samples generated by an MCMC chain and hence has become a popular measure of model assessment. Models with smaller DIC values are preferred to models with larger DIC. For each of the two ZIP models using different options for \(\eta _{ijk}\), three separate MCMC chains of model parameters were simulated from the posterior distributions, with 40,000 total iterations each (the first 10,000 iterations were burn-in iterations). The two ZIP models with different options for \(\eta _{ijk}\) produced similar estimated diversity measurements; however, the ZIP model assuming a common mixture proportion (option 1) produced a smaller DIC (Table 1) and hence fitted the data better than the one with species-varying mixture proportions (option 2) (Table 2, supplementary materials). For the ZIP model with either a common mixture proportion or species-specific mixture proportions, all of the \(95\,\%\) PIs of the species richness measures covered the observed values (Table 1). The ZIP model provided better predictions of the observed species abundance and richness components than the Poisson regression model because it distinguished between zeros resulting from species that were present but not detected and zeros from species that were absent.
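For readers unfamiliar with DIC, the sketch below shows how it can be assembled from MCMC output under one common definition: the posterior mean deviance plus an effective number of parameters \(p_D\), where \(p_D\) is the mean deviance minus the deviance evaluated at the posterior mean of the parameters. This definition and the function name are assumptions for illustration; the text does not state which variant was reported, and the numbers in the usage line are made up.

```python
def dic(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, p_D), assuming DIC = mean deviance + p_D."""
    mean_deviance = sum(deviance_draws) / len(deviance_draws)
    # Effective number of parameters: penalty for model complexity.
    p_d = mean_deviance - deviance_at_posterior_mean
    return mean_deviance + p_d, p_d

# Toy usage with made-up deviance values (not results from the butterfly data).
value, p_d = dic([10.0, 12.0, 14.0], 11.0)
print(value, p_d)
```

Because the penalty grows with the effective number of parameters, a model with 49 species-specific mixture proportions must improve the fit enough to offset its added complexity before it is preferred.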
Table 1

ZIP model using sampling information only (common mixture proportions assumed) output: summary statistics of the posterior samples of diversity measurements

| Parameter | Mean | Median | \(95\,\%\hbox { CI}\) | Observed | Violation | R-hat |
| --- | --- | --- | --- | --- | --- | --- |
| \(\alpha _{\textit{cluster}}\) | 32.08 | 32.17 | [29.67, 34.50] | 31.67 | 0 | 1.0 |
| \(\alpha _{\textit{patch}}\) | 16.29 | 16.27 | [15.27, 17.35] | 16.69 | 0 | 1.0 |
| \(\gamma \) | 47.39 | 48.00 | [45.00, 49.00] | 49.00 | 0 | 1.0 |
| \(\beta _{\textit{patch}}^A\) | 15.79 | 15.79 | [14.06, 17.54] | 14.93 | 0 | 1.0 |
| \(\beta _{\textit{cluster}}^A\) | 15.31 | 15.33 | [12.67, 17.83] | 17.33 | 0 | 1.0 |
| \(\beta _{\textit{patch}}^M\) | 1.97 | 1.97 | [1.87, 2.07] | 1.90 | 0 | 1.0 |
| \(\beta _{\textit{cluster}}^M\) | 1.48 | 1.48 | [1.38, 1.59] | 1.55 | 0 | 1.0 |

DIC = 2,356.5

The posterior means and medians are both considered as the estimated diversity measurements, and the \(95\,\%\) prediction intervals \((95\,\%\hbox { CI})\) give ranges of the diversity measurements associated with 0.95 inclusion probability. The diversity measurements computed from the observed counts of different species of butterflies in different locations are also listed for comparison with the model-estimated diversity measurements. The indicator variable “violation” shows whether the prediction interval covers the observed value: 1 means that the interval failed to cover the observed value and 0 means that the observed value is covered. The potential scale reduction factor, R-hat, measures the convergence of the posterior samples. R-hat \(=\) 1.0 indicates that convergence is achieved and the resulting Bayesian estimates are reliable. \(\beta ^A\) is the additive model of species richness and \(\beta ^M\) is the multiplicative model

In the analysis of the butterfly data, the ZIP model with a common mixture proportion produced a smaller DIC; however, more generally it will be worthwhile to assume species-dependent or location-dependent mixture proportions in the estimation of diversity partition components, since the probability of detection will often vary among species and locations. The posterior distributions of the mixture proportions under the species-specific option are plotted in Fig. 1. Based on the \(95\,\%\) CIs of the mixture proportions, most species did not differ significantly in their mixture proportions, although some did (e.g., the \(95\,\%\hbox { CI}\)s of species 24 and 25 did not overlap). Therefore, it may be of interest in future work to classify species into several groups according to the distribution of their mixture proportions.
https://static-content.springer.com/image/art%3A10.1007%2Fs10651-013-0271-2/MediaObjects/10651_2013_271_Fig1_HTML.gif
Fig. 1

Posterior distributions of the species-specific mixture proportions. Dots are the posterior medians and the bars are the equal-tail \(95\,\%\) credible intervals of the mixture proportions

These results suggested that the Bayesian ZIP regression model provided good predictions of species abundances and species richness components. Instead of computing diversity partition measures based on “snapshot” data collected at a particular time point, it would now be possible to build up a posterior distribution to describe the “population” of such measures (Clark 2007). The components of species richness were no longer single numbers based on the observations only, but were instead based on a set of distributions from posterior predictive samples, which also provided a reasonable prediction interval. Variation was estimated among the species, patches, and clusters, and future sampling would enable the inclusion of historical information into the prior distribution of the model analysis.

3.3 ZIP model with environmental covariates

In this section we explore the modeling option of incorporating the four environmental variables introduced earlier into the analysis: the area of the habitat patch \((X_1)\), connectivity \((X_2)\), the number of inflorescences along Pollard transects \((X_3)\), and the number of potential host plant species available to butterflies in each patch \((X_4)\). These environmental variables described the characteristics of the sampling locations (patch and cluster); therefore, an alternative Bayesian ZIP regression model using all the environmental variables while ignoring the cluster effect \(\varvec{\psi }\) and the patch effect \(\varvec{\tau }\) was fitted to the data. The same likelihood function specified in Eqs. (12), (13) was used, while the regression equation specified in Eq. (2) was replaced with the following:
$$\begin{aligned} log(\lambda _{ijk}) = \mu +\theta _k+b_1X_{1ij}+b_2X_{2ij}+b_3X_{3ij}+b_4X_{4ij}, \end{aligned}$$
(14)
where \(i=1,2,\ldots ,26,\,j=1,2,\ldots ,6\) and \(k=1,2,\ldots ,49\).
Diffuse priors were used for the regression coefficients, i.e., \(b_r \sim N(0, \sigma _b^2)\), where \(r=1,2,3,4\). Following the recommendation of Gelman (2006), a uniform prior distribution was assigned to the standard deviation \(\sigma _b\), i.e., \(\sigma _b \sim \textit{Uniform}(0, 5)\). The same prior distribution of \(\varvec{\theta }\) defined in Eq. (7) was used here, while a different diffuse prior was used for \(\mu \) to facilitate the posterior sample simulation, \(\mu \sim N(0,100)\). For the specification of the mixture proportions, we tried both options described in the previous section. For each of the two ZIP models using different options for \(\eta _{ijk}\), three separate MCMC chains of model parameters were simulated from the posterior distributions, with 80,000 total iterations each (the first 30,000 iterations were burn-in iterations). Using environmental variables with common mixture proportions resulted in estimated diversity components that were quite similar to those produced by the ZIP models without the environmental predictors (Tables 2 and 3), which is perhaps not surprising since the patch and cluster effects in Eq. (2) were partially explained by the environmental predictors in Eq. (14). The covariate \(X_3\) (the number of inflorescences) did not significantly affect the species abundances of butterflies, while the other environmental predictors were all significant. The observed \(\alpha \) and \(\beta \) components fell within the predicted \(95\,\%\hbox { CI}\). When the species-specific mixture proportions were assumed (Tables 3 and 4, supplementary materials), similar estimates were obtained for the diversity components and regression coefficients, except that a larger DIC value (3,067.4) was obtained compared to the case using common mixture proportions (2,448.4). This comparison suggested that the simpler model with a common mixture proportion was favored by the data analysis results here. 
Overall, the Bayesian ZIP model with environmental variables provided a good fit of predicted values to observed components of richness. The coefficients of \(X_1,\,X_2\) and \(X_4\) were all positive: butterfly species abundances increased with habitat area, connectivity, and the species richness of larval host plants.
Table 2

ZIP model with environmental covariates (common mixture proportions assumed) output: summary statistics of the posterior samples of diversity measurements

| Parameter | Mean | Median | \(95\,\%\hbox { CI}\) | Observed | Violation | R-hat |
| --- | --- | --- | --- | --- | --- | --- |
| \(\alpha _{\textit{cluster}}\) | 32.28 | 32.33 | [30.00, 34.67] | 31.67 | 0 | 1.0 |
| \(\alpha _{\textit{patch}}\) | 16.47 | 16.46 | [15.46, 17.50] | 16.69 | 0 | 1.0 |
| \(\gamma \) | 47.34 | 47.00 | [45.00, 49.00] | 49.00 | 0 | 1.0 |
| \(\beta _{\textit{patch}}^A\) | 15.81 | 15.81 | [14.07, 17.58] | 14.93 | 0 | 1.0 |
| \(\beta _{\textit{cluster}}^A\) | 15.06 | 15.00 | [12.50, 17.67] | 17.33 | 0 | 1.0 |
| \(\beta _{\textit{patch}}^M\) | 1.96 | 1.96 | [1.86, 2.06] | 1.90 | 0 | 1.0 |
| \(\beta _{\textit{cluster}}^M\) | 1.47 | 1.46 | [1.37, 1.57] | 1.55 | 0 | 1.0 |

DIC = 2,448.4

The posterior means and medians are both considered as the estimated diversity measurements, and the \(95\,\%\) prediction intervals \((95\,\%\hbox { CI})\) give ranges of the diversity measurements associated with 0.95 inclusion probability. The diversity measurements computed from the observed counts of different species of butterflies in different locations are also listed for comparison with the model-estimated diversity measurements. The indicator variable “violation” shows whether the prediction interval covers the observed value: 1 means that the interval failed to cover the observed value and 0 means that the observed value is covered. The potential scale reduction factor, R-hat, measures the convergence of the posterior samples. R-hat = 1.0 indicates that convergence is achieved and the resulting Bayesian estimates are reliable. \(\beta ^A\) is the additive model of species richness and \(\beta ^M\) is the multiplicative model

Table 3

ZIP model with environmental covariates (common mixture proportions assumed) output: summary statistics of the posterior samples of regression coefficients, including posterior mean, median, standard deviations and \(95\,\%\hbox { CI}\)

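| Parameter | Mean | Median | sd | \(95\,\%\hbox { CI}\) |
| --- | --- | --- | --- | --- |
| \(b_1\) | 0.16 | 0.16 | 0.04 | [0.08, 0.25] |
| \(b_2\) | 0.16 | 0.16 | 0.07 | [0.01, 0.30] |
| \(b_3\) | \(-\)0.06 | \(-\)0.06 | 0.08 | [\(-\)0.22, 0.07] |
| \(b_4\) | 0.02 | 0.02 | 0.01 | [0.01, 0.04] |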

4 Discussion

The partitioning of species richness and diversity into single values of \(\alpha \) and \(\beta \) at each sampling scale has thus far relied on observed components and comparisons to null distributions expected from the sampling design and sample size (Crist et al. 2003; Anderson et al. 2011; Kraft et al. 2011). Here we provided a new approach using Bayesian hierarchical models to give prediction intervals for \(\alpha ,\,\beta \), and \(\gamma \) components of species richness, and a framework for relating single-value components to environmental variables based on variation in predicted species abundances among samples. Application to the Ohio butterfly data demonstrated that three environmental variables, namely patch area, connectivity (the proximity to adjacent grassland patches), and host plant diversity, were important predictors of the number and composition of butterfly species among patches and clusters of patches. About one-third of the total butterfly richness occurred within patches \((\alpha _{\textit{patch}})\), one-third was due to variation in species composition among patches \((\beta _{\textit{patch}})\), and the remaining third was due to variation in composition among clusters \((\beta _{\textit{cluster}})\). Multiplicative \(\beta \) components suggest that complete turnover in species composition occurs among patches within clusters (1.96), and about \(50\,\%\) turnover in composition occurs at the scale of the cluster (1.47). A common mixture model for the probability of species absence was best supported for butterflies, but generally we would expect the probability of detection to vary among species and locations depending on the empirical patterns of species abundance and spatial distribution.
More broadly, our results for species richness, composition, and detection emerge from a single modeling framework, whereas most ecological analyses of diversity involve three separate approaches using general linear models (\(\alpha \)-diversity), multivariate ordination (\(\beta \)-diversity), and species accumulation curves or sight–resight estimators (detection probability).
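
The arithmetic behind the "one-third" shares and the multiplicative components can be checked directly from the posterior means in Table 2. The sketch below assumes the usual hierarchical definitions (additive \(\beta \) as the richness gained between adjacent levels, multiplicative \(\beta \) as the ratio of richness at adjacent levels), which the tabulated means satisfy to rounding. The variable names are illustrative, and the calculation uses point estimates only rather than the full posterior samples.

```python
# Posterior means copied from Table 2 (ZIP model with environmental covariates).
alpha_patch = 16.47
alpha_cluster = 32.28
gamma = 47.34

# Additive partition: each beta is the richness gained when moving up one level.
beta_patch_add = alpha_cluster - alpha_patch   # among patches within clusters
beta_cluster_add = gamma - alpha_cluster       # among clusters

# Multiplicative partition: ratio of richness at adjacent levels.
beta_patch_mult = alpha_cluster / alpha_patch
beta_cluster_mult = gamma / alpha_cluster

# Share of total richness attributed to each additive component.
shares = {
    "alpha_patch": alpha_patch / gamma,
    "beta_patch": beta_patch_add / gamma,
    "beta_cluster": beta_cluster_add / gamma,
}

for name, share in shares.items():
    print(f"{name}: {share:.3f}")
print(f"beta_patch^M = {beta_patch_mult:.2f}, beta_cluster^M = {beta_cluster_mult:.2f}")
```

Each share is close to one-third, and the two ratios reproduce the tabulated multiplicative components after rounding to two decimals.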

Among all the approaches to modeling species abundance data, the standard Poisson is the most straightforward to apply. However, the standard Poisson cannot deal with either zero-inflation or over-dispersion, and the estimates of individual species abundances might be biased. The ZIP distribution addresses the zero-inflation and the potential over-dispersion resulting from it. The ZIP model treats the distribution of the abundance of each species in a given habitat as a mixture of a point mass at zero and a Poisson distribution. Hence the chance of observing a zero species abundance includes two parts: the probability that a species is absent from a habitat, and the probability that a species is present but undetected. Consequently, our abundance-based approach can estimate the probability of detection without requiring repeat presence-absence surveys under the assumption of no community change (e.g. Dorazio et al. 2006). Our study goal was also to estimate diversity components from predicted abundances of individual species, whereas the Dorazio et al. (2006) study was aimed at estimating the true species richness from repeat surveys and total community abundance.
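
The two-part zero probability described above can be illustrated with a minimal sketch of the ZIP probability mass function. It assumes that the mixture proportion `eta` is the weight on the point mass at zero (species absent) and that `lam` is the Poisson mean; the function names are hypothetical and this is not the model-fitting code used in the study.

```python
import math


def zip_pmf(y, lam, eta):
    """P(Y = y) for a zero-inflated Poisson.

    Assumption: eta is the weight of the point mass at zero (absence),
    and lam is the mean of the Poisson component.
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        return eta + (1.0 - eta) * poisson
    return (1.0 - eta) * poisson


def zero_probability_parts(lam, eta):
    """Split P(Y = 0) into 'absent' and 'present but undetected' parts."""
    absent = eta
    present_undetected = (1.0 - eta) * math.exp(-lam)
    return absent, present_undetected


absent, undetected = zero_probability_parts(lam=2.0, eta=0.3)
print(absent, undetected, zip_pmf(0, lam=2.0, eta=0.3))
```

The two returned parts always add up to the ZIP probability of a zero count, which is why a zero observation alone cannot distinguish absence from non-detection.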

The direct modeling of species abundances in a Bayesian hierarchical approach provides, for the first time, prediction intervals for \(\alpha \) and \(\beta \) components that stem from variation in species abundances among samples. A Bayesian hierarchical modeling approach will also facilitate greater prediction and explanation in diversity partitioning studies because \(\alpha \) and \(\beta \) components can be linked together with environmental and spatial variables in the same modeling framework.

The different modeling options were compared using DIC, which is easy to apply to a wide range of Bayesian hierarchical models. However, it has been pointed out that DIC cannot compete with traditional Bayesian model comparison and selection methods based on Bayes factors or posterior predictive distributions, and that it sometimes does not distinguish between alternative fits. As a quick and simple posterior check, we also examined the posterior predictive intervals of the diversity measures and compared them with the observed values. It is possible to implement a more formal posterior predictive check, such as simulating multiple posterior predictive samples and computing posterior predictive p values.
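
Both checks mentioned here reduce to simple operations on posterior predictive draws of a diversity measure. The sketch below is illustrative only: the function names, the empirical-quantile rule, and the one-sided definition of the p value are assumptions, not the procedure used to produce Table 2.

```python
import math


def central_interval(draws, level=0.95):
    """Empirical central interval from a list of posterior predictive draws."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(math.floor(tail * (n - 1)))]
    upper = ordered[int(math.ceil((1.0 - tail) * (n - 1)))]
    return lower, upper


def violation(draws, observed, level=0.95):
    """Return 0 if the interval covers the observed value, 1 otherwise."""
    lower, upper = central_interval(draws, level)
    return 0 if lower <= observed <= upper else 1


def posterior_predictive_p(replicated, observed):
    """Fraction of replicated statistics at least as large as the observed one."""
    return sum(1 for t in replicated if t >= observed) / len(replicated)
```

A p value near 0 or 1 would indicate that the observed diversity measure sits in the tail of what the fitted model predicts, which is the same information the "violation" indicator summarizes as a binary flag.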

Besides evaluating the model with the prediction intervals of the diversity measures, model validation can be done using cross-validation methods, i.e., repeatedly splitting the data into a training set (model building set) and a validation set (holdout set) and then comparing the holdout observations with predictions from the model fitted to the training set only. However, cross-validation is challenging to apply to multi-level data because of the hierarchical structure of the data. Traditional leave-one-out cross-validation can be implemented here at different levels of the data: for each species, we could remove a single data point (the species abundance in a patch) and check the prediction from the model fitted to the rest of the data, or we could remove a single cluster and perform the same procedure. Simulation studies in Wang and Gelman (2013) also revealed that the sample size and structure of the data significantly affected cross-validation-based model selection results. Therefore, a combination of multiple model assessment tools, including posterior predictive checking, cross-validation and DIC, would be an important part of model validation and comparison.
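
The two leave-one-out schemes described above differ only in what counts as the held-out unit. The following sketch generates the training and holdout sets for each scheme. The record layout (dictionaries with `species`, `cluster`, `patch`, and `count` keys) and the function name are hypothetical, and the model-fitting step is deliberately left out.

```python
def leave_one_out_splits(records, level):
    """Yield (training, holdout) pairs for hierarchical abundance data.

    level="observation": hold out one species-by-patch abundance at a time.
    level="cluster":     hold out every record from one cluster at a time.
    """
    if level == "observation":
        def key_of(r):
            return (r["species"], r["cluster"], r["patch"])
    elif level == "cluster":
        def key_of(r):
            return r["cluster"]
    else:
        raise ValueError("level must be 'observation' or 'cluster'")

    # Preserve first-seen order of the holdout units.
    keys = list(dict.fromkeys(key_of(r) for r in records))
    for key in keys:
        holdout = [r for r in records if key_of(r) == key]
        training = [r for r in records if key_of(r) != key]
        yield training, holdout
```

Holding out whole clusters tests prediction for unsampled locations, whereas holding out single observations tests prediction within already-sampled patches, so the two schemes answer different questions.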

As in most studies using diversity partitioning, we did not use a spatially explicit representation of patch location; instead, the spatial location of a particular abundance observation was considered implicitly through the categorical variables indicating the cluster and patch from which the observation came. Now that the groundwork is laid for using continuous predictor variables to model diversity components, however, space can be represented by explicit rather than categorical variables. For example, we might model the mixture proportion parameter, \(\eta _{ijk}\), as a function of environmental or spatial covariates rather than using species-specific mixture proportions. Thus continuous spatial and environmental variables may be used in a Bayesian hierarchical framework to model single-value diversity components in a fashion analogous to multivariate ordination (Wagner 2004; Peres-Neto et al. 6).

It is also possible to introduce spatial dependence in the modeling of both \(\lambda _{ijk}\) and \(\eta _{ijk}\), assuming that the parameter values for nearby sampling locations tend to be similar. This would require the incorporation of a spatial random effect in the Bayesian hierarchical model, i.e., replacing Eq. (2) with
$$\begin{aligned} log(\lambda _{ijk}) = \mu +\theta _k+\phi _{ij}. \end{aligned}$$
(15)
When the environmental covariates are used, we could replace Eq. (14) with
$$\begin{aligned} log(\lambda _{ijk}) = \mu +\theta _k+b_1X_{1ij}+b_2X_{2ij}+b_3X_{3ij}+b_4X_{4ij}+\phi _{ij}. \end{aligned}$$
(16)
Here \(\phi _{ij}\) in Eqs. (15) and (16) represents the spatial random effect, and can be modeled with different spatial dependence structures (Cressie 1993). The ZIP models assuming a common mixture proportion with this spatial random effect (assuming an exponential covariogram structure) were implemented with and without the environmental covariates. The DIC for the ZIP model with a common mixture proportion and no environmental covariates did not improve when the spatial random effect was used instead of the original model specification in Sect. 3.2 (DIC = 2,474.0). The ZIP model with a common mixture proportion and the environmental covariates also showed a poorer fit (DIC = 3,578.1) when the spatial random effect was included as in Eq. (16). Therefore, the incorporation of spatial random effects was not considered further in the present study. However, exploring suitable geostatistical methods for the prediction of diversity measures is an important potential extension of this work.
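
To make the exponential covariogram concrete, the sketch below builds the covariance matrix of the spatial random effects from patch coordinates. It assumes one common parameterization, \(C(d)=\sigma ^2\exp (-d/\rho )\); the source does not state which parameterization was used, and the names `sigma2` and `rho` and the example coordinates are hypothetical.

```python
import math


def exponential_covariance(coords, sigma2, rho):
    """Covariance matrix C[a][b] = sigma2 * exp(-d_ab / rho).

    coords : list of (x, y) patch locations (hypothetical units)
    sigma2 : variance of the spatial random effect (assumed name)
    rho    : range parameter controlling how fast correlation decays (assumed name)
    """
    n = len(coords)
    cov = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            distance = math.dist(coords[a], coords[b])
            cov[a][b] = sigma2 * math.exp(-distance / rho)
    return cov


patches = [(0.0, 0.0), (3.0, 4.0), (0.0, 10.0)]
for row in exponential_covariance(patches, sigma2=2.0, rho=5.0):
    print([round(value, 3) for value in row])
```

Under this structure, nearby patches receive strongly correlated random effects and the correlation decays smoothly toward zero with distance, which matches the stated assumption that parameter values at nearby sampling locations tend to be similar.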

The present study focused on predicting species richness and computing the diversity measures by modeling species abundances rather than by modeling species richness directly. Many other modeling options could potentially improve the analysis of species abundances. For example, assuming a negative binomial distribution (Bliss and Fisher 1953; Wulu et al. 2002) or a generalized Poisson distribution (Wang and Famoye 1997; Wulu et al. 2002; Famoye 1993) for the species abundance would account for the over-dispersion in such data. A zero-inflated generalized Poisson distribution (Felix and Singh 2006) would address both the zero-inflation and the over-dispersion. A more extensive exploration of these modeling options and the corresponding model comparison would be important future work.

Supplementary material

10651_2013_271_MOESM1_ESM.docx (26 kb)
Supplementary material 1 (docx 25 KB)

Copyright information

© Springer Science+Business Media New York 2013