Imperfect observations in ecological studies
Abstract
Every ecological data set is the result of sampling the biota at sampling locations. Such samples are rarely a census of the biota at the sampling locations and so will inherently contain biases. It is crucial to account for the bias induced by sampling if valid inference on biodiversity quantities is to be drawn from the observed data. The literature on accounting for sampling effects is large, but most of it is dedicated to the specific type of inference required, the type of analysis performed and the type of survey undertaken. There is no general and systematic approach to sampling. Here, we explore the unification of modelling approaches to account for sampling. We focus on individuals in ecological communities as the fundamental sampling element, and show that methods for accounting for sampling at the species level can be equated to individual sampling effects. Particular emphasis is given to the case where the probability of observing an individual, when it is present at the site sampled, is less than one. We call these situations ‘imperfect observations’. The proposed framework is easily implemented in standard software packages. We highlight some practical benefits of this formal framework: the ability to predict the true number of individuals using an expectation that conditions on the observed data, and the ability to design appropriate survey plans that account for uncertainty due to sampling. The principles and methods are illustrated with marine survey data from tropical northern Australia.
Keywords
Compound distributions, Detection probability, Ecological modelling, Marine surveys, Sampling, Species distribution models (SDMs)
1 Introduction
One of the long-standing challenges in quantitative ecology is to model biodiversity quantities, such as a species’ presence/absence, abundance or biomass. In the last decade or so there have been rapid methodological developments and a wide range of applications (Guisan and Zimmermann 2000; Ferrier and Guisan 2006; Pitcher et al. 2007; Elith and Leathwick 2009; Gattone and Battista 2009; Lozier et al. 2009; Franklin 2010; Bax 2011). The core of the modelling challenge is typically a regression-type problem: how to relate biodiversity to a set of descriptors (covariates) such as environmental or anthropogenic variables. It involves partitioning the variability in the data into a part that is common to all data (the signal) and a part that remains unexplained (the noise). There are two types of variance that can adversely affect the model: one is due to sampling biodiversity (considered in this article), and the other is due to sampling/predicting the covariates (e.g. Foster et al. 2012; Stoklosa et al. 2015).
It is tempting, when modelling biodiversity attributes, to ignore any variance in the data due to sampling (including imperfect observations). This assumes that the manner in which data were collected is unimportant, or that it will simply add to the random part of the model and not the signal about biodiversity. However, this requires assumptions that are unlikely to be met. The unfortunate implication is that sampling issues can affect inferences. Accounting for sampling effects within a model requires careful consideration as it tends to vary from one survey to the next.
One sampling effect, which is often ignored, arises when the data are a sample (not a census) of the biological material at a sample location. We refer to these data as imperfect observations, and they are the central topic of the work presented here. An example of where imperfect observations occur is in marine surveys, where a large amount of biotic material is obtained (too much for scientific processing of all the material). The practical method to quantify all of the biotic material is to sample the catch (a process called sub-sampling). Sub-sampling can take a number of forms: sampling from all the biotic material as a single group, or sampling within different broad taxonomic or size strata. This process adds another layer of variability into the data.
The methods presented here have broader application. The effect of imperfect observations is explored in this work and simple methods for adjusting statistical models for analysis of these types of data are presented. The model unifies many of the disparate research areas that consider imperfect observations, which has not been done before. One key point of distinction in previous approaches for accounting for imperfect observations is whether the focus is on sampling an individual organism (individual detectability), or a species (species detectability). For example, species abundance and biomass are related to individual detectability, which has been studied as ‘ascertainment’ (Fisher 1934; Fisher et al. 1943; Rao 1965), ‘detection probability’ (Borchers et al. 2002; Buckland et al. 2004), or ‘attenuation’ (Shimadzu and Darnell 2015). In contrast, species presence/absence and richness are more closely related to species detectability, which has been studied as ‘rarefaction’ (Sanders 1968; Hurlbert 1971; Simberloff 1972; Heck et al. 1975) and ‘occupancy’ (MacKenzie et al. 2002). However, these approaches concentrate on species-level data and fail to exploit the fact that, for a species to be detected, at least one of its individuals needs to be sampled. So individual detection must play a pivotal role in understanding species detection.
Our approach utilises a compound distribution of the possible number of individuals at a site. It highlights the precise data needed to disentangle the number of individuals and the probability of sampling. We present a modelling framework, which is trivially implemented in software packages, to handle imperfect observations (including detectability issues and sub-sampling). The formal modelling framework has some practical benefits as well—predictions of the true number of individuals at any sampling site can be made through predictive distributions and the effect of imprecise observations can be incorporated easily when designing surveys. The principles and methods are illustrated throughout the manuscript with two marine data sets from tropical northern Australia. Both of these examples are for a particular case of imperfect observations, namely subsampling. However, we note that the methods presented are for a wider class of applications—any situations where the probability of observation is less than one.
2 Conceptual framework for imperfect observation
Every ecological data set is the result of sampling from a population of interest and every ecological datum can likewise be thought of as a sample (sometimes a census) of the population at a site. Here and elsewhere we use the statistical term ‘population’ to mean the individuals at a sampling site, as this is our prime interest. We note however, that much of the suggested framework could extend to a broader definition of population. For notational convenience, we omit possible site subscripts. Formally, the principle of sampling is the random partition of a population \({\mathcal {P}}=\{\omega _i\}_{i=1}^{M_0}\), of size \(M_0=|{\mathcal {P}}|\), into two disjoint categories: the sample \({\mathcal {S}}\), and the remainder of the population \({\mathcal {S}}^c = {\mathcal {P}}\setminus {\mathcal {S}}\). These two sets are disjoint, \({\mathcal {S}}^c \cap {\mathcal {S}} = \emptyset \). Each element of the population, \(\omega _i\), is typically an individual organism but it may also be a colony or a family in certain situations (e.g. corals and sponges). We shall use the ‘individual’ nomenclature to describe all possibilities. Note that we treat \(M_0\) as random throughout the paper, in order to delineate the extent to which the expected abundance, E[\({M_0}\)], responds to different environmental conditions; more details will be discussed in the later sections.
- Abundance Species abundance in a sample, \(M_1 = |{\mathcal {S}}|\), is given as the compound form $$\begin{aligned} M_1 = {\left\{ \begin{array}{ll} \displaystyle \sum _{i=1}^{M_0} Z_i, &{}\quad (M_0 > 0);\\ 0, &{}\quad (M_0 = 0), \end{array}\right. } \end{aligned} \tag{1}$$ where \(Z_i = I(\omega _i \in {\mathcal {S}})\) indicates whether the i-th individual is included in the sample.
- Species presence/absence Species presence/absence in a sample, \(Y_1\), can be defined by using an indicator function \(I(\cdot )\) as $$\begin{aligned} Y_1 = I(M_1 > 0). \end{aligned} \tag{2}$$
- Biomass Species biomass \(V_1\) in a sample can also be defined in a compound form as an extension of abundance. Let \(W_i\) be the weight of the i-th organism; then biomass becomes $$\begin{aligned} V_1 = \sum _{i=1}^{M_0} W_i Z_i. \end{aligned}$$ Note that \(V_1=0\) when \(M_0=0\), as in Eq. (1). If the individual weights \(W_i\) are independent gamma random variables then biomass, \(V_1\), follows a distribution called the Poisson–gamma distribution. This formulation has been exploited previously (Foster and Bravington 2013) as a special case of the Tweedie distribution (Jørgensen 1997; Dunn and Smyth 2005).
Another useful construct is the species detection probability, which is defined as one minus the probability that none of the individuals of the species is observed: \(1-\prod _{i=1}^{M_0}\left\{ 1-\Pr (Z_i = 1)\right\} \); this obviously assumes individual independence. The species detection probability is a function of individual detection probability, \(\Pr (Z_i = 1)\), and species abundance, \(M_0\).
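The definitions above can be explored numerically. The following is a minimal sketch, not the authors' implementation: it assumes a common individual sampling probability `r` and independent individuals, and the function names, gamma weight parameters and replicate count are illustrative choices only.

```python
import random


def sample_site(weights, r, rng):
    """Draw one imperfect observation of a site.

    weights: weights W_i of the M_0 individuals present (may be empty)
    r: assumed common probability that an individual is sampled, Pr(Z_i = 1)
    Returns (M_1, Y_1, V_1): sampled abundance, presence/absence and biomass.
    """
    z = [1 if rng.random() < r else 0 for _ in weights]  # indicators Z_i
    m1 = sum(z)                                          # Eq. (1)
    y1 = 1 if m1 > 0 else 0                              # Eq. (2)
    v1 = sum(w * zi for w, zi in zip(weights, z))        # sampled biomass
    return m1, y1, v1


def species_detection_probability(m0, r):
    """1 - prod_i {1 - Pr(Z_i = 1)} when every individual has probability r."""
    return 1.0 - (1.0 - r) ** m0


rng = random.Random(1)
# Hypothetical site with M_0 = 8 individuals and gamma-distributed weights.
weights = [rng.gammavariate(2.0, 1.5) for _ in range(8)]
n_rep = 20000
detected = sum(sample_site(weights, 0.2, rng)[1] for _ in range(n_rep))
print(detected / n_rep, species_detection_probability(8, 0.2))
```

The simulated frequency of detection should be close to the closed-form species detection probability, which illustrates how species-level detection follows from individual-level sampling.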
3 Models for imperfect observation
3.1 Compound distributions
3.2 Sampling mechanisms
- Simple random sampling Under the simple random sampling scheme, the probability of observing a sample \({\mathcal {S}}\) consisting of \(m_1\) individuals from the population \({\mathcal {P}}\) of \(m_0\) individuals is the product of the probabilities of each individual being sampled, \(\Pr (\omega _i \in {\mathcal {S}})\), or not being sampled, \(1-\Pr (\omega _i \in {\mathcal {S}})\). If it is assumed that the probability of being sampled is common among individuals, \(\Pr (\omega _i \in {\mathcal {S}})=r\) say, then the probability of sampling \(m_1\) individuals from a population of size \(m_0\) is a binomial distribution \(\mathsf{Bi}(m_{1}; m_{0}, r)\), $$\begin{aligned} f(m_1 | m_0) = \binom{m_0}{m_1} r^{m_1} (1-r)^{m_0 - m_1}. \end{aligned}$$
- Stratified sampling Let \({\mathcal {U}}_j\) be the j-th stratum in stratified sampling. The population \({\mathcal {P}}\) then consists of k strata, \({\mathcal {P}} = \cup _{j=1}^k {\mathcal {U}}_j\), \({\mathcal {U}}_j \cap {\mathcal {U}}_{j'} = \emptyset \ (j \ne j')\), and the individuals in the population are partitioned into the strata. The individuals (\(m_0\) of them, say) are randomly partitioned as \(\varvec{m}_0 = (m_{01}, m_{02}, \ldots , m_{0k}), m_{0j}=|{\mathcal {U}}_j|\). This partitioning mechanism can be described by a multinomial distribution, \(\mathsf{Mn}(\varvec{m}_0; m_0, \varvec{p})\), with parameters \(\varvec{p}=(p_1, p_2, \ldots , p_k)\) giving the probability of belonging to each multinomial class. The samples are then randomly drawn from each stratum j with sampling fraction \(r_j\). This draw is independent between the strata, so each stratum’s sampling process can be described by a binomial distribution with simple random sampling, as above. In terms of the imperfect observation model (3), the probability mass function of the stratified sample is $$\begin{aligned} f(\varvec{m}_1 | m_0)= & {} f(\varvec{m}_1 | \varvec{m}_0) f(\varvec{m}_0 | m_0)\\= & {} \left\{ \prod _{j=1}^k \binom{m_{0j}}{m_{1j}} r_j^{m_{1j}} (1-r_j)^{m_{0j}-m_{1j}} \right\} \binom{m_0}{m_{01}, \ldots , m_{0k}} \prod _{j=1}^k p_j^{m_{0j}}, \end{aligned}$$ where \(\sum _{j=1}^k p_j=1\).
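The two sampling mechanisms can be written down directly as probability mass functions. The sketch below is illustrative only: the function names are hypothetical, and the stratified function evaluates the product exactly as displayed above, for one given allocation \(\varvec{m}_0\) of individuals to strata.

```python
from math import comb, factorial, prod


def binomial_pmf(m1, m0, r):
    """Bi(m1; m0, r): probability of sampling m1 of m0 individuals."""
    if not 0 <= m1 <= m0:
        return 0.0
    return comb(m0, m1) * r**m1 * (1 - r) ** (m0 - m1)


def multinomial_pmf(counts, p):
    """Mn(counts; sum(counts), p): random partition of individuals into strata."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # exact integer division at every step
    return coef * prod(pj**c for pj, c in zip(p, counts))


def stratified_pmf(m1_vec, m0_vec, r_vec, p):
    """Binomial sampling within each stratum times the multinomial partition."""
    within = prod(binomial_pmf(a, b, r) for a, b, r in zip(m1_vec, m0_vec, r_vec))
    return within * multinomial_pmf(m0_vec, p)


# With a single stratum (k = 1, p = 1) the stratified form reduces to the
# simple random sampling case.
print(stratified_pmf([2], [5], [0.3], [1.0]), binomial_pmf(2, 5, 0.3))
```

Keeping the within-stratum and partition components as separate functions mirrors the factorisation in the displayed equation.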
3.3 Marginal distributions
In general, conventional ecological modelling can be regarded as a mix of design-based and model-based approaches. The design-based aspect, as we have discussed, leads to a general model (Eq. 5) that plays a key role in species abundance and presence/absence modelling, as we will see in Sect. 3.4. The other aspect, the model-based one, can also be vital because the assumption that the individual detection probability, \(\alpha \), is fixed by design may be unreasonable in some cases. It treats the detection probability as an unknown function, \(\alpha (t)\), of another variable t, which needs to be estimated, for example by maximum likelihood. The estimation component is thus model-based. Although this formulation allows more flexibility to cope with heterogeneity induced by different types of sampling, such as observer error and species rarity, it requires extra care: with the Poisson model the individual detection probability, \(\alpha \), cannot be disentangled from the abundance expectation using the data alone. Sprott (1965) studied the conditions on the probability generating function and identified this model, amongst other compound distributions, as being inestimable. Further information about the sampling mechanism or the population’s rate is required.
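The identifiability problem can be seen in a few lines of code. This sketch assumes that the marginal model for the sampled abundance is Poisson with mean \(\alpha \lambda \), as the expectations used in Sect. 3.4 indicate; the counts and parameter values are made up for illustration.

```python
from math import lgamma, log


def poisson_logpmf(m, mean):
    """Log of the Poisson probability mass function."""
    return m * log(mean) - mean - lgamma(m + 1)


def loglik(counts, alpha, lam):
    """Log-likelihood of sampled counts, assuming M_1 ~ Poisson(alpha * lam)."""
    return sum(poisson_logpmf(m, alpha * lam) for m in counts)


counts = [0, 2, 1, 4, 3, 0, 1]               # hypothetical sampled abundances
a = loglik(counts, alpha=0.5, lam=4.0)
b = loglik(counts, alpha=0.25, lam=8.0)      # same product alpha * lam
c = loglik(counts, alpha=0.5, lam=6.0)       # different product
print(a, b, c)
```

Any pair of values with the same product gives the same likelihood, so the data alone cannot separate the two parameters; fixing \(\alpha \) by design (for example a recorded sub-sampling ratio) removes the ambiguity.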
As noted, the probability mass function \(f(m_0|\varvec{x})\) in Eq. (3) can take any plausible form, and a negative binomial distribution could also be used. When a negative binomial distribution \(\mathsf{NB}(m_0; s, t)\) is, instead of a Poisson distribution, coupled with a binomial sampling distribution, we still obtain an equivalent result. The marginal distribution is a compound negative binomial distribution and its form is explicitly written as \(\mathsf{NB}(m_1; s, t/\{1-(1-\alpha )(1-t)\})\) (see Appendix for the detailed derivation).
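The closed form can be checked numerically against a brute-force summation over \(m_0\). This is a sketch under an assumed parametrisation, \(\mathsf{NB}(m; s, t) = \binom{m+s-1}{m} t^s (1-t)^m\) with integer \(s\); the parametrisation used in the Appendix may differ, and the function names and the summation limit are illustrative.

```python
from math import comb


def nb_pmf(m, s, t):
    """Assumed form NB(m; s, t) = C(m + s - 1, m) t^s (1 - t)^m, integer s."""
    return comb(m + s - 1, m) * t**s * (1 - t) ** m


def binomial_pmf(m1, m0, alpha):
    """Bi(m1; m0, alpha): binomial sampling of m1 out of m0 individuals."""
    return comb(m0, m1) * alpha**m1 * (1 - alpha) ** (m0 - m1)


def thinned_nb_pmf(m1, s, t, alpha, upper=400):
    """Brute-force marginal: sum over m0 of Bi(m1; m0, alpha) NB(m0; s, t)."""
    return sum(
        binomial_pmf(m1, m0, alpha) * nb_pmf(m0, s, t)
        for m0 in range(m1, upper + 1)
    )


def closed_form_pmf(m1, s, t, alpha):
    """NB(m1; s, t / {1 - (1 - alpha)(1 - t)}), the form stated in the text."""
    return nb_pmf(m1, s, t / (1 - (1 - alpha) * (1 - t)))


for m1 in range(4):
    print(m1, thinned_nb_pmf(m1, 3, 0.4, 0.3), closed_form_pmf(m1, 3, 0.4, 0.3))
```

Under the assumed parametrisation the two columns agree to numerical precision.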
There is a close link between Eq. (5) and a model class, namely N-mixture models, by Royle (2004). When the sampling replication is one, Eq. (5) gives the exact analytical expression of the N-mixture model, although Royle (2004) suggested a numerical approximation, calculating instead a finite summation over \(m_0\) up to a reasonably large number. We note that a recent study (Dennis et al. 2015) has pointed out that the choice of the arbitrary large number in the numerical calculation can result in underestimation of abundance.
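The role of the summation limit can be illustrated for a single replicate. The sketch below assumes a Poisson population model with binomial sampling, so that the exact marginal is Poisson with mean \(\alpha \lambda \); the parameter values and the names `truncated_marginal` and `exact_marginal` are illustrative and not taken from Royle (2004) or Dennis et al. (2015).

```python
from math import comb, exp, lgamma, log


def poisson_pmf(m, mean):
    """Poisson probability mass function, computed on the log scale."""
    return exp(m * log(mean) - mean - lgamma(m + 1))


def truncated_marginal(m1, lam, alpha, upper):
    """Finite sum over m0 = m1..upper of Bi(m1; m0, alpha) Po(m0; lam)."""
    total = 0.0
    for m0 in range(m1, upper + 1):
        binom = comb(m0, m1) * alpha**m1 * (1 - alpha) ** (m0 - m1)
        total += binom * poisson_pmf(m0, lam)
    return total


def exact_marginal(m1, lam, alpha):
    """Analytical marginal for one replicate: Po(m1; alpha * lam)."""
    return poisson_pmf(m1, alpha * lam)


lam, alpha, m1 = 30.0, 0.2, 5
for upper in (20, 40, 80):
    print(upper, truncated_marginal(m1, lam, alpha, upper),
          exact_marginal(m1, lam, alpha))
```

When the upper limit is small relative to the population mean, the finite sum falls well short of the analytical value; it approaches the analytical value as the limit grows.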
3.4 Modelling biological responses
- Species abundance The sampled abundance \(m_1\) has the expected value, from Eq. (5), $$\begin{aligned} \mu = \hbox {E}[{M_1}] = \alpha \lambda (\varvec{x}). \end{aligned}$$ This form suggests that the effect of sampling can be treated as an offset using the log link function, $$\begin{aligned} \log (\mu ) = \log (\alpha ) + \eta (\varvec{x}), \end{aligned}$$ where \(\eta (\varvec{x})\) is the (non-)linear predictor of environment variables. So, to convert a Poisson model for the sampled abundance into a model for the population abundance, all one has to do is to include \(\log (\alpha )\) as an offset.
- Species presence/absence From Eq. (2), the distribution of sampled presence/absence, \(Y_1\), is the binarisation of a Poisson random variable: $$\begin{aligned} f(y_1 | \varvec{x})= & {} \left( 1-e^{-\alpha \lambda (\varvec{x})}\right) ^{y_1} \left( e^{-\alpha \lambda (\varvec{x})}\right) ^{(1-y_1)}. \end{aligned} \tag{7}$$ It has the expected value $$\begin{aligned} \mu = \hbox {E}[{Y_1}] = 1-e^{-\alpha \lambda (\varvec{x})}, \end{aligned}$$ which is equal to the probability that the Poisson random variable \(M_1\) takes a non-zero value. In the GLM and GAM contexts, a Bernoulli model for the presence/absence variable is also easily implemented, using the complementary log-log link function, viz $$\begin{aligned} \log (-\log (1-\mu )) = \log (\alpha )+ \eta (\varvec{x}). \end{aligned} \tag{8}$$ The sampling effect, \(\alpha \), again enters as an offset term. The early idea of the complementary log-log link can be found in Fisher (1922) for a dilution assay study and it is more formally stated by McCullagh and Nelder (1989).
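The offset formulation can be illustrated with a small hand-written fit. This is a sketch only: it assumes a single covariate and a linear predictor \(\eta (x) = b_0 + b_1 x\), uses a plain Newton-Raphson iteration rather than any particular software package, and the data are noise-free expected counts generated from known coefficients so that the recovered values can be compared with the truth.

```python
from math import exp, log


def fit_poisson_offset(x, y, alpha, n_iter=50):
    """Fit log(mu_i) = log(alpha_i) + b0 + b1 * x_i by Newton-Raphson."""
    b0 = log(sum(y) / sum(alpha))  # intercept-only solution as starting value
    b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, ai in zip(x, y, alpha):
            mu = ai * exp(b0 + b1 * xi)   # offset enters as a multiplier
            g0 += yi - mu                 # score for b0
            g1 += (yi - mu) * xi          # score for b1
            h00 += mu                     # information matrix entries
            h01 += mu * xi
            h11 += mu * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det  # Newton step: solve H d = g
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return b0, b1


def presence_probability(eta, alpha):
    """Inverse complementary log-log link with offset log(alpha), Eq. (8)."""
    return 1.0 - exp(-exp(log(alpha) + eta))


# Hypothetical sites: covariate x, sampling fraction alpha, expected counts y.
x = [i / 10 - 1 for i in range(21)]
alpha = [(0.25, 0.5, 1.0)[i % 3] for i in range(21)]
y = [a * exp(1.0 + 0.5 * xi) for a, xi in zip(alpha, x)]

b0, b1 = fit_poisson_offset(x, y, alpha)             # accounts for sampling
n0, n1 = fit_poisson_offset(x, y, [1.0] * len(x))    # ignores sampling
print((b0, b1), (n0, n1))
```

With the offset included the coefficients used to generate the data are recovered, whereas setting every sampling fraction to one lowers the fitted intercept, in line with the underestimation of abundance described in the text.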
3.5 The effect of binarisation
3.6 Predictions of biological responses
The estimated intensity function \(\hat{\lambda }(\varvec{x})\) reveals how the biological responses are related to the environment factors \(\varvec{x}\). It allows us to make predictions of unobserved biological responses: species population abundance, \(M_0\), and presence/absence, \(Y_0\). We describe here two types of predictions. The first is marginal prediction which is the unconditional expectation of the observation (\(\hbox {E}[{M_0}]\) or \(\hbox {E}[{Y_0}]\)) and is directly derived from the distribution \(f(m_0; \hat{\lambda }(\varvec{x}))\), Eq. (4). The other is conditional prediction (\(\hbox {E}[{M_0 | M_1 = m_1}]\) or \(\hbox {E}[{Y_0 | Y_1 = y_1}]\)) calculated from the distribution, \(f(m_0 | m_1;\hat{\lambda }(\varvec{x}))\). Note that \(\hat{\lambda }{(\varvec{x})}\) is used as a plug-in estimate. In a Bayesian analysis, one would incorporate uncertainty in this estimate into the predictive distribution. Due to the dependence on sample data, the conditional distribution (and predictions) are only available at previously sampled locations. However, at other locations one could define the conditional prediction to coincide with the marginal prediction—note though that there is no extra data to inform the process.
- Species abundance The predictors are derived respectively from Eqs. (4) and (10) as $$\begin{aligned} \hbox {E}[{M_0}]= & {} \lambda (\varvec{x}), \quad \hbox {and}\\ \hbox {E}[{M_0 | M_1=m_1}]= & {} \sum _{m_0=0}^\infty m_0 f(m_0 | m_1)\\= & {} (1-\alpha )\lambda (\varvec{x}) + m_1\\= & {} \hbox {E}[{M_0}] - \hbox {E}[{M_1}] + m_1. \end{aligned}$$ The conditional predictor is the observed abundance plus the difference between expected true and expected observed abundance. In essence, it takes the observation and adjusts it for what is likely to be missed through sampling.
- Presence/absence Viewing presence/absence data as the binarisation of a Poisson variable, the predictors are $$\begin{aligned} \hbox {E}[{Y_0}]= & {} 1-e^{-\lambda (\varvec{x})}, \quad \hbox {and}\\ \hbox {E}[{Y_0 | Y_1=y_1}]= & {} \sum _{y_0=0}^1 y_0 f(y_0 | y_1)\\= & {} \sum _{y_0=0}^1 y_0 \left\{ \left( 1-e^{-(1-\alpha ) \lambda (\varvec{x})}\right) ^{y_0} \left( e^{-(1-\alpha ) \lambda (\varvec{x})}\right) ^{(1-y_0)} I(y_1=0) \right. \\&\quad \left. + \,y_0 I(y_1=1)\right\} \\= & {} \left( 1-e^{-(1-\alpha ) \lambda (\varvec{x})}\right) I(y_1=0) + I(y_1=1)\\= & {} \frac{\hbox {E}[{Y_0}]-\hbox {E}[{Y_1}]}{1-\hbox {E}[{Y_1}]} I(y_1=0) + I(y_1=1). \end{aligned}$$ The conditional expectation, \(\hbox {E}[{Y_0 | Y_1=y_1}]\), is 1 if the species is observed and a non-zero probability if the species is not observed. The non-zero probability reflects the difference in expectation between the true and the observed presence/absence record.
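The marginal and conditional predictors translate directly into code. The sketch below assumes the Poisson model with a plug-in intensity `lam` and a known sampling probability `alpha`; the function names and the numerical values are illustrative.

```python
from math import exp


def predict_abundance(lam, alpha, m1=None):
    """E[M_0] if m1 is None, otherwise E[M_0 | M_1 = m1]."""
    if m1 is None:
        return lam
    return (1 - alpha) * lam + m1


def predict_presence(lam, alpha, y1=None):
    """E[Y_0] if y1 is None, otherwise E[Y_0 | Y_1 = y1]."""
    if y1 is None:
        return 1 - exp(-lam)
    if y1 == 1:
        return 1.0                        # observed, so certainly present
    return 1 - exp(-(1 - alpha) * lam)    # unobserved but possibly present


lam, alpha = 10.0, 0.25
print(predict_abundance(lam, alpha), predict_abundance(lam, alpha, m1=3))
print(predict_presence(2.0, 0.5), predict_presence(2.0, 0.5, y1=0))
```

At unsampled locations only the marginal form (`m1=None` or `y1=None`) is available, matching the remark in the text that the conditional prediction then coincides with the marginal one.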
4 Data analysis
4.1 Carnarvon Shelf data
4.1.1 Data and sampling method
Data were collected in a seabed mapping survey of the Carnarvon Shelf, offshore from central Western Australia (Fig. 1b; Brooke et al. 2009). The aim of the survey was to acquire physical and biological data to enable a range of environmental parameters to be tested as surrogates of benthic biodiversity patterns. A Smith-McIntyre grab was deployed at 142 sites. For each grab, a sediment sample (\(\sim \)50 ml) was retained for analysis of textural characteristics while the remaining sediments were processed for infauna. The infauna samples were separated by washing sediments through a 500 \(\upmu \)m sieve and then a sub-sample was taken if necessary. The proportion sampled was recorded as the sampling ratio r. The samples were classified into food guild groups and species.
For illustration, we present the results for two food guild groups: non-selective and selective feeders. Both guilds ingest sediment and derive nutrients from the microorganisms living on the particles, but selective feeders often have a physical structure which enables them to select optimally-sized particles for ingestion (e.g. a tentacle with a ciliated groove). Such a physical difference may give the two guilds different preferences for ambient sediment conditions. In our modelling, we assume that a species’ preference in sediment conditions is common over all survey sites. That is, there is no interaction between preference and spatial location.
4.1.2 Modelling
We develop a model to describe how the presence/absence of each food guild group responds to the seabed grain size. As each species obviously has different abundances, we fit a model to each species separately and then combine the models for one food guild group.
4.1.3 Result
Different inferences are obtained when the sub-sampling effect is taken into account or ignored (equivalent to assuming that \(r_s=1\)). Figure 2 illustrates the probability of presence of each food guild group (Eq. 11). The model that accounts for the sub-sampling effect (the left panel, Fig. 2) suggests that the probability of presence of the non-selective group species decreases as seabed sediments become coarser, but that the selective group species are little influenced by sediment size. In contrast, the model that ignores the effect (the right panel, Fig. 2) shows that both groups respond to the sediment size and that the probability of presence decreases as the sediment size increases. These contradictory results highlight the risk of misinterpretation when the sub-sampling effect is mis-specified in the model. To us, it seems more plausible that the selective feeders should not have much dependence on grain size, as this is their evolutionary advantage.
4.2 Great Barrier Reef data
4.2.1 Data and sampling method
The biological samples were collected by a scientific trawl net towed behind a survey vessel. After each tow, the samples were processed entirely or sub-sampled for enumeration, weighing and identification. On the deck, the samples were sorted into rough phylogenetic groups (strata \({\mathcal {U}}_j, j=1, 2, \ldots , k\)) and then a sub-sample was taken from each stratum (group) if necessary. The proportion sub-sampled was recorded as the sub-sampling ratio \(r_j\) for the j-th stratum. On-board taxonomic stratification is a difficult task, and some mis-specification is inevitable: some species were sometimes observed in an unexpected stratum. The taxonomic sorting suggested that some heterogeneity (\(p_j \ne 1/k\)) was induced, and its mis-specification meant \(p_j \ne 1\) for the j-th stratum, \({\mathcal {U}}_j\), given an organism belonging to the j-th group (stratum). This required extra consideration for \(p_j, j=1, 2, \ldots , k\), when modelling. We consider here four environment variables: depth, % carbonates, % gravel and % sand. Note that these percentages do not sum to 100 %.
4.2.2 Modelling
4.2.3 Result
The model is fitted to the abundance data of a squid species (Photololigo chinensis). Each panel in Fig. 3 represents the response of its abundance to one environment variable. The solid and two-dashed lines represent the predictor, \(\eta (\varvec{x}_s)\), of the model with and without the sub-sampling effect taken into account, respectively. The sub-sampling effect is now easily observed as a constant shift in the linear predictor for all covariates except depth. This does not translate to a constant difference on the response scale though. The model that ignores the sub-sampling effect underestimates the abundance. The figure also shows that the confidence interval of the model that accounts for the effect tends to be wider than that of the model ignoring the effect. These observations concur with the theoretical results in Sect. 5.
4.2.4 A technical matter
5 Implications for survey design
6 Summary
We have discussed how the imperfect observation effect due to sampling should be treated in ecological modelling, and presented how a general framework, the compound distribution, can accommodate individual detectability. The model is general and can handle many different types of sampling, including the two examples used in this paper: the commonly used simple random sampling and stratified random sampling. The method of implementing the sampling effect is straightforward; the sampling effect enters the regression-type model as an offset term by using an appropriate link function. Other types of sampling mechanisms, such as cluster sampling, will require slightly more complex models that allow for between-individual correlation.
Our examples are typical of a sampling technique called ‘sub-sampling’, which is widely used in marine surveys. This is an entirely anthropogenic effect induced during the survey process, and it should be taken into account when modelling. To the authors’ knowledge, anthropogenic sub-sampling is understudied; only two articles can be found in the literature (Heales et al. 2000, 2003). Another sampling effect in marine surveys is the issue of catchability (also called detectability), which describes how likely the individuals are to be caught given the sampling gear employed. We have not discussed this as our data consist of a single observation at each site, so the probability of catching an individual, given presence, is completely confounded with the probability of presence. If a site were visited multiple times then this could be incorporated into the compound distribution framework, and the catchability could be estimated. Commonly, this has been done using species occupancy models (MacKenzie et al. 2002). These types of models are an approximation to our framework, see Eq. (9).
Fisher (1934) clearly emphasises the importance of understanding the data collection procedure employed as a statistical commonplace. Accordingly, Rao (1965) generalises Fisher’s idea and proposes a general modelling framework that is able to accommodate a wide class of sampling mechanisms, such as non-observability of events by dealing with individual detectability. The compound model presented (Sect. 2 and Eq. 3) exhibits strong similarities with one of the models described in Rao (1965) and also in Patil and Rao (1978). Ecological studies will always have a limited number of observations from the population of interest, and so the statistical challenge has historically been centred around how to make effective inferences dealing properly with the sampling effect. This challenge will remain into the future. The compound distribution seems to have received little attention in ecological modelling and we show that it naturally underpins an effective modelling framework to account for imperfect observations.
Notes
Acknowledgments
Many thanks go to Carolyn Huston and Scott Nichol for their constructive comments and suggestions on an early version of this manuscript. HS acknowledges the support by the European Research Council (Project BioTIME 250189). SDF was supported by the Marine Biodiversity Hub, a collaborative partnership supported through funding from the Australian Government’s National Environmental Research Program (NERP). NERP Marine Biodiversity Hub partners include the Institute for Marine and Antarctic Studies, University of Tasmania; CSIRO Wealth from Oceans National Flagship, Geoscience Australia, Australian Institute of Marine Science, Museum Victoria, Charles Darwin University and the University of Western Australia.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
References
- Bax NJ (ed) (2011) Marine Biodiversity Hub, Commonwealth Environment Research Facilities. Final report 2007–2010. Report to Department of Sustainability, Environment, Water, Population and Communities. Canberra, Australia
- Borchers DL, Buckland ST, Zucchini W (2002) Estimating animal abundance. Springer, London
- Brooke B, Nichol S, Hughes M, McArthur M, Anderson T, Przeslawski R, Siwabessy J, Heyward A, Battershill C, Colquhoun J, Doherty P (2009) Carnarvon Shelf survey post-survey report. Record 2009/02, Geoscience Australia
- Buckland ST, Anderson DR, Burnham KP, Laake JL, Borchers DL, Thomas L (eds) (2004) Advanced distance sampling. Oxford University Press, Oxford
- Cochran WG (1977) Sampling techniques, 3rd edn. Wiley, New York
- Dennis EB, Morgan BJ, Ridout MS (2015) Computational aspects of N-mixture models. Biometrics 71:237–246
- Dunn PK, Smyth GK (2005) Series evaluation of Tweedie exponential dispersion model densities. Stat Comput 15(4):267–280
- Elith J, Leathwick JR (2009) Species distribution models: ecological explanation and prediction across space and time. Annu Rev Ecol Evol Syst 40:677–697
- Feller W (1968) An introduction to probability theory and its applications, 3rd edn. Wiley, New York
- Ferrier S, Guisan A (2006) Spatial modelling of biodiversity at the community level. J Appl Ecol 43:393–404
- Fisher R (1922) On the mathematical foundations of theoretical statistics. Philos Trans R Soc Lond Ser A 222:309–368
- Fisher RA (1934) The effect of methods of ascertainment upon the estimation of frequencies. Ann Eugen 6:13–25
- Fisher RA, Corbet AS, Williams CB (1943) The relation between the number of species and the number of individuals in a random sample of animal population. J Anim Ecol 12(1):42–58
- Foster SD, Bravington MV (2013) A Poisson-gamma model for analysis of ecological non-negative continuous data. Environ Ecol Stat 20:533–552
- Foster SD, Shimadzu H, Darnell R (2012) Uncertainty in spatially predicted covariates: Is it ignorable? J R Stat Soc Ser C 61(4):637–652
- Franklin J (2010) Mapping species distributions. Cambridge University Press, Cambridge
- Gattone SA, Battista TD (2009) A functional approach to diversity profiles. J R Stat Soc Ser C 58(2):267–284
- Guisan A, Zimmermann NE (2000) Predictive habitat distribution models in ecology. Ecol Model 135(2–3):147–186
- Gurland J (1957) Some interrelations among compound and generalized distributions. Biometrika 44:265–268
- Hastie TJ, Tibshirani RJ (1990) Generalized additive models. Chapman & Hall, Florida
- Heales DS, Brewer DT, Wang YG (2000) Subsampling multi-species trawl catches from tropical northern Australia: Does it matter which part of the catch is sampled? Fish Res 48:117–126
- Heales DS, Brewer DT, Wang YG, Jones PN (2003) Does the size of subsamples taken from multispecies trawl catches affect estimates of catch composition and abundance? Fish Bull 101:790–799
- Heck KLJ, van Belle G, Simberloff D (1975) Explicit calculation of the rarefaction diversity measurement and the determination of sufficient sample size. Ecology 56(6):1459–1461
- Howard C, Stephens PA, Pearce-Higgins JW, Gregory RD, Willis SG (2014) Improving species distribution models: the value of data on abundance. Methods Ecol Evol 5(6):506–513
- Hurlbert SH (1971) The nonconcept of species diversity: a critique and alternative parameters. Ecology 52(4):577–586
- Jørgensen B (1997) The theory of dispersion models. Chapman and Hall, London
- Johnson NL, Kotz S, Kemp AW (1992) Univariate discrete distributions, 2nd edn. Wiley, New Jersey
- Lozier JD, Aniello P, Hickerson MJ (2009) Predicting the distribution of Sasquatch in western North America: anything goes with ecological niche modelling. J Biogeogr 36:1623–1627
- MacKenzie DI, Nichols JD, Lachman GB, Droege S, Royle JA, Langtimm CA (2002) Estimating site occupancy rates when detection probabilities are less than one. Ecology 83(8):2248–2255
- MacKenzie DI, Nichols JD, Royle JA, Pollock KH, Bailey LL, Hines JE (2006) Occupancy estimation and modeling. Academic Press, Cambridge
- McCullagh P, Nelder J (1989) Generalized linear models, 2nd edn. Chapman and Hall, Florida
- Patil G, Rao C (1978) Weighted distributions and size-biased sampling with applications to wildlife populations and human families. Biometrics 34:179–189
- Pitcher C, Doherty P, Arnold P, Hooper J, Gribble N, Bartlett C, Browne M, Campbell N, Cannard T, Cappo M, Carini G, Chalmers S, Cheers S, Chetwynd D, Colefax A, Coles R, Cook S, Davie P, De’ath G, Devereux D, Done B, Donovan T, Ehrke B, Ellis N, Ericson G, Fellegara I, Forcey K, Furey M, Gledhill D, Good N, Gordon S, Haywood M, Jacobsen I, Johnson J, Jones M, Kinninmoth S, Kistle S, Last P, Leite A, Marks S, McLeod I, Oczkowicz S, Rose C, Seabright D, Sheils J, Sherlock M, Skelton P, Smith D, Smith G, Speare P, Stowar M, Strickland C, Sutcliffe P, Van der Geest C, Venables W, Walsh C, Wassenberg T, Welna A, Yearsley G (2007) Seabed biodiversity on the continental shelf of the Great Barrier Reef world heritage area. Technical report of CSIRO marine and atmospheric research
- Rao C (1965) On discrete distributions arising out of methods of ascertainment. Sankhya 27:311–324
- Royle JA (2004) \(N\)-mixture models for estimating population size from spatially replicated counts. Biometrics 60:108–115
- Royle JA, Nichols JD (2003) Estimating abundance from repeated presence-absence data or point counts. Ecology 84(3):777–790
- Sanders HL (1968) Marine benthic diversity: a comparative study. Am Nat 102(925):243–282
- Shimadzu H, Darnell R (2015) Attenuation of species abundance distributions by sampling. R Soc Open Sci 2:140219. doi:10.1098/rsos.140219
- Simberloff D (1972) Properties of the rarefaction diversity measurement. Am Nat 106(949):414–418
- Sprott D (1965) Some comments on the question of identifiability of parameters raised by Rao. Sankhya Indian J Stat Ser A 27(2/4):365–368
- Stoklosa J, Daly C, Foster SD, Ashcroft MB, Warton DI (2015) A climate of uncertainty: accounting for error in climate variables for species distribution models. Methods Ecol Evol 6(4):412–423
- Wood S (2006) Generalized additive models: an introduction with R. Chapman and Hall, Florida
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.