Abstract
Habitat selection in radio-tagged animals is analysed by comparing the portion of use with the portion of availability observed for each habitat type. Since these data are linearly dependent, with singular variance-covariance matrices, standard multivariate statistical tests cannot be applied. To bypass the problem, compositional data analysis is customarily performed via a log-ratio transform of the sample observations. This paper criticizes that procedure and highlights several drawbacks that may arise from the use of compositional analysis. An alternative nonparametric solution is proposed in the framework of multiple testing. Habitat use is assessed separately for each habitat type by means of the sign test performed on the original observations. The resulting p values are combined into an overall test statistic whose significance is determined by permuting the sample observations. The theoretical findings are checked by simulation studies, and applications to case studies previously considered in the literature are discussed.
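The testing scheme summarized in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the Fisher-type combining function, the within-animal swap of use and availability as the permutation step, and all function names are choices made here for the example.

```python
import math
import random


def sign_test_p(diffs):
    """Exact two-sided sign test p value; zero differences are dropped."""
    pos = sum(1 for d in diffs if d > 0)
    neg = sum(1 for d in diffs if d < 0)
    n = pos + neg
    if n == 0:
        return 1.0
    k = min(pos, neg)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def combined_statistic(diffs):
    """Fisher-type combination (an assumption) of the per-habitat p values."""
    n_habitats = len(diffs[0])
    pvals = [sign_test_p([row[j] for row in diffs]) for j in range(n_habitats)]
    return -2.0 * sum(math.log(p) for p in pvals)


def permutation_p(use, avail, n_perm=2000, seed=0):
    """Overall p value obtained by permuting observations within animals."""
    rng = random.Random(seed)
    # One row of use-minus-availability differences per animal.
    diffs = [[u - a for u, a in zip(u_row, a_row)]
             for u_row, a_row in zip(use, avail)]
    observed = combined_statistic(diffs)
    hits = 0
    for _ in range(n_perm):
        permuted = []
        for row in diffs:
            # Swapping use and availability for one animal flips the sign
            # of its whole row, which preserves dependence among habitats.
            sign = rng.choice((-1.0, 1.0))
            permuted.append([sign * d for d in row])
        if combined_statistic(permuted) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Each row of `use` and `avail` holds the portions for one animal, with habitats in the same order. Flipping a whole row at once, rather than each habitat separately, is what lets the permutation distribution account for the dependence among the habitat-wise tests.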
Abbreviations
- RHU: Proportional or random habitat use
- PAT: Portion of animal trajectory
- PAHR: Portion of animal home range
- CODA: Compositional data analysis
References
Aebischer NJ, Robertson PA, Kenward RE (1993) Compositional analysis of habitat use from animal radio-tracking data. Ecology 74:1315–1325
Aitchison J (1986) The statistical analysis of compositional data. Chapman and Hall, London
Aitchison J (1994) Principles of compositional data analysis. In: Anderson TW, Fang KT, Olkin I (eds) Multivariate analysis and its applications. Institute of Mathematical Statistics, Hayward, pp 73–81
Calenge C (2006) The package “adehabitat” for the R software: a tool for the analysis of space and habitat use by animals. Ecol Model 197:516–519
Fang KT, Kotz S, Ng KW (1990) Symmetric multivariate distributions. Chapman and Hall, London
Johnson DH (1980) The comparison of usage and availability measurements for evaluating resource preference. Ecology 61:65–71
Johnson NL, Kotz S, Balakrishnan N (1995) Continuous univariate distributions, vol 2. Wiley, New York
Johnson DS, Thomas DL, Ver Hoef JM, Christ A (2008) A general framework for the analysis of animal resource selection from telemetry data. Biometrics 64:968–976
Kneib T, Knauer F, Küchenhoff H (2011) A general approach to the analysis of habitat selection. Environ Ecol Stat 18:1–25
Koper N, Manseau M (2009) Generalized estimating equations and generalized linear mixed-effects models for modelling resource selection. J Appl Ecol 46:590–599
Manly BFJ, McDonald LL, Thomas DL, McDonald TL, Erickson WP (2002) Resource selection by animals. Kluwer, Dordrecht
Pesarin F (1992) A resampling procedure for nonparametric combination of several dependent tests. J Italian Stat Soc 1:87–101
Pesarin F (2001) Multivariate permutation tests: with applications in biostatistics. Wiley, New York
Randles RH, Wolfe DA (1979) Introduction to the theory of nonparametric statistics. Wiley, New York
Strickland MD, McDonald LL (2006) Introduction to the special section on resource selection. J Wildl Manag 70:321–323
Westfall PH, Young SS (1993) Resampling-based multiple testing. Wiley, New York
Worton BJ (1989) Kernel methods for estimating the utilization distribution in home-range studies. Ecology 70:164–168
Acknowledgments
The authors thank Luca Pratelli for his helpful suggestions on the theoretical aspects of the work.
Additional information
Handling Editor: Ashis SenGupta.
Appendices
Appendix 1: Different expressions for the hypothesis of random habitat use
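The RHU hypothesis (2) is in fact a multivariate hypothesis, which can be rewritten as
\[\text{ H }_{X0}: \bigcap _{j=1}^K \left\{ \text{ E }(X_{Uj} -X_{Aj})=0\right\} \qquad (12)\]
where \(\text{ E }(X_{Uj} -X_{Aj})=0\) is the univariate hypothesis that the expected use of habitat \(j\) coincides with its expected (or constant) availability. The sense of (12) is that \(\text{ H }_{X0}\) is true if all the univariate hypotheses are true. In turn, given a reference habitat \(h\), (12) is equivalent to
\[\bigcap _{j\ne h} \left\{ \frac{\text{ E }(X_{Uj})}{\text{ E }(X_{Uh})}=\frac{\text{ E }(X_{Aj})}{\text{ E }(X_{Ah})}\right\} \qquad (13)\]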
Indeed, if (2) is true, then for any habitat \(j\) it follows from (12) that \(\text{ E }(X_{Uj})=\text{ E }(X_{Aj})\), from which \(\text{ E }(X_{Uj})/\text{ E }(X_{Aj})=1\). Accordingly, for the reference habitat \(h\) and for each \(j\ne h\), it follows that \(\text{ E }(X_{Uj})/\text{ E }(X_{Aj})=\text{ E }(X_{Uh})/\text{ E }(X_{Ah})\), from which \(\text{ E }(X_{Uj})/\text{ E }(X_{Uh})=\text{ E }(X_{Aj})/\text{ E }(X_{Ah})\). Conversely, if (13) is true, then for the reference habitat \(h\) and for each \(j\ne h\) it holds that \(\text{ E }(X_{Uj})/\text{ E }(X_{Uh})=\text{ E }(X_{Aj})/\text{ E }(X_{Ah})\) or, equivalently, \(\text{ E }(X_{Uj})/\text{ E }(X_{Aj})=\text{ E }(X_{Uh})/\text{ E }(X_{Ah})\). In other words, there is a constant \(c\) such that, for each \(j=1, \ldots , K\), \(\text{ E }(X_{Uj})/\text{ E }(X_{Aj})=c\) or, equivalently, \(\text{ E }(X_{Uj})=c\text{ E }(X_{Aj})\).
But since \(\sum _{j=1}^K {\text{ E }(X_{Uj})} =\sum _{j=1}^K {\text{ E }(X_{Aj})} =1\), then \(c=1\), which obviously implies (2).
In a similar way, chosen a reference habitat \(h\), (3) constitutes a multivariate hypothesis which is equivalent to
or, more explicitly, to
From (13) and (14), it is at once apparent that (3) is equivalent to (2) if
for each \(j\ne h=1,\ldots , K\). Since \(\text{ E }\left\{ \ln (X)\right\} \) generally differs from \(\ln \text{ E }(X)\), relation (15) does not generally hold.
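The gap between \(\text{ E }\left\{ \ln (X)\right\} \) and \(\ln \text{ E }(X)\) can be seen in a small worked case. The uniform variable used here is an assumption chosen only for ease of computation and does not come from the paper. For \(X\) uniform on \((0,1)\),
\[\text{ E }\left\{ \ln (X)\right\} =\int _0^1 \ln (x)\,dx=-1, \qquad \ln \text{ E }(X)=\ln \frac{1}{2}\approx -0.693.\]
More generally, since the logarithm is concave, Jensen's inequality gives \(\text{ E }\left\{ \ln (X)\right\} \le \ln \text{ E }(X)\) for any positive random variable \(X\) with finite mean, with equality only when \(X\) is almost surely constant.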
Appendix 2: Dirichlet distributions and log-ratio transforms
The Dirichlet distribution is probably the most familiar model adopted for positive random vectors \(\mathbf{X}=\left[ X_1, \ldots , X_K\right] ^{\mathrm{T}}\) subject to the constraint \(\mathbf{1}^{\mathrm{T}}\mathbf{X}=1\). A \(K\)-variate random vector \(\mathbf{X}\) is said to have a Dirichlet distribution with parameters \(\delta >0\) and \({\varvec{\uptheta }}=\left[ \theta _1, \ldots , \theta _K\right] ^{\mathrm{T}}\) with \(\theta _{j}>0\) for each \(j=1,\ldots , K\) if the joint probability density function at \(\mathbf{x}=\left[ x_1, \ldots , x_K \right] ^{\mathrm{T}}\) with \(\mathbf{1}^{\mathrm{T}}\mathbf{x}=1\) is given by
\[f(\mathbf{x})=\frac{\Gamma (\delta \theta )}{\prod _{j=1}^K \Gamma (\delta \theta _j)}\prod _{j=1}^K x_j^{\delta \theta _j -1}\]
where \(\theta =\mathbf{1}^{\mathrm{T}}{\varvec{\uptheta }}\). As is well known (e.g. Fang et al. 1990), each marginal variable \(X_j\) has a beta distribution on [0,1] with shape parameters \(\delta \theta _j\) and \(\delta (\theta -\theta _j)\) in such a way that
\[\text{ E }(X_j)=\frac{\theta _j}{\theta }\]
and
\[\text{ V }(X_j)=\frac{\theta _j (\theta -\theta _j)}{\theta ^2(\delta \theta +1)}\]
Accordingly, marginal expectations do not depend on \(\delta \) and marginal variances increase as \(\delta \) decreases. In habitat selection analysis, \(\delta \) accounts for the variability of the portions of animal trajectories or home ranges within habitat types. However, when these quantities are estimated in the field from the animals' radio locations, \(\delta \) also reflects the number of radio locations used in the study, since marginal variances decrease as the \(r_i\text{ s }\) increase and the estimates approach the true values.
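The role of \(\delta \) can be checked numerically from the beta marginals stated above. The sketch below is a minimal illustration: the function name and the parameter values are assumptions made for the example, and the moments are the standard mean and variance of a beta variable with shape parameters \(\delta \theta _j\) and \(\delta (\theta -\theta _j)\).

```python
def dirichlet_marginal_moments(delta, theta):
    """Marginal means and variances of a Dirichlet vector whose j-th
    component is beta with shapes delta*theta_j and delta*(total - theta_j)."""
    total = sum(theta)
    means = [t / total for t in theta]
    variances = [t * (total - t) / (total ** 2 * (delta * total + 1.0))
                 for t in theta]
    return means, variances


if __name__ == "__main__":
    theta = [0.2, 0.3, 0.5]  # hypothetical habitat portions
    for delta in (1.0, 10.0, 100.0):
        means, variances = dirichlet_marginal_moments(delta, theta)
        # Means stay fixed while variances shrink as delta grows.
        print(delta, means, variances)
```

The means are the same for every \(\delta \), while each variance is divided by \(\delta \theta +1\), which is the sense in which \(\delta \) acts as a precision parameter.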
If \(\mathbf{X}\) has a Dirichlet distribution with parameters \(\delta \) and \({\varvec{\uptheta }}\), the log-ratio transform \(\mathbf{Y}=lr_h (\mathbf{X})\) is a random vector on \(R^{K-1}\) whose \(j\)-th marginal random variable \(Y_j =\ln (X_j/X_h)\) has a generalized logistic distribution of type IV with expectation
\[\text{ E }(Y_j)=\varphi (\delta \theta _j)-\varphi (\delta \theta _h) \qquad (16)\]
where \(\varphi (x)=\partial \ln \Gamma (x)/\partial x\) denotes the digamma function (e.g. Johnson et al. 1995, p. 142, Fang et al. 1990, Problem 1.5).
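To see how far the expected log-ratio can drift from the log of the ratio of expectations, the sketch below evaluates \(\varphi (\delta \theta _j)-\varphi (\delta \theta _h)\), the standard expression for the expected log-ratio of two Dirichlet components, and subtracts \(\ln (\theta _j/\theta _h)\). The digamma routine is a hand-rolled approximation (recurrence followed by an asymptotic series) written for this example so that only the standard library is needed; the function names and parameter values are assumptions.

```python
import math


def digamma(x):
    """Approximate digamma for x > 0: shift x upward with the recurrence
    psi(x) = psi(x + 1) - 1/x, then apply the asymptotic expansion."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    result += (math.log(x) - 0.5 * inv
               - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0)))
    return result


def expected_log_ratio(delta, theta, j, h):
    """E{ln(X_j / X_h)} for Dirichlet components with shapes delta*theta."""
    return digamma(delta * theta[j]) - digamma(delta * theta[h])


def log_ratio_bias(delta, theta, j, h):
    """Expected log-ratio minus the log of the ratio of expectations."""
    return expected_log_ratio(delta, theta, j, h) - math.log(theta[j] / theta[h])


if __name__ == "__main__":
    theta = [0.1, 0.9]  # hypothetical portions: one rare, one common habitat
    for delta in (5.0, 50.0, 500.0):
        print(delta, log_ratio_bias(delta, theta, 0, 1))
```

With equal portions the bias is zero. With a rare habitat compared against a common one the bias is negative and shrinks toward zero as \(\delta \) grows.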
In the case of Johnson’s second order selection, denote by \(\mathbf{a}\) the vector of portions of habitat types in the study area and suppose that \(\mathbf{X}_U\) has a Dirichlet distribution with parameters \(\delta _U\) and \(\mathbf{a}\), in such a way that \(\text{ H }_{X0}\) is true. Thus, in accordance with (16), the squared value of the unreliability measure of the CODA-based procedure turns out to be
In a similar way, in the case of Johnson’s third order selection, suppose that \(\mathbf{X}_U\) and \(\mathbf{X}_A\) have Dirichlet distributions with the same parameter \(\mathbf{a}\) and variability parameters \(\delta _U\) and \(\delta _A\), respectively, in such a way that \(\text{ H }_{X0}\) is true. From (16), the squared value of the unreliability measure is
Appendix 3: Generating dependent compositional data
It is worth noting that \(\mathbf{X}_U\) and \(\mathbf{X}_A\) arise from the same animal and should therefore realistically be presumed dependent. However, the general problem of constructing dependent random vectors \(\mathbf{X}_1 =\left[ X_{11}, \ldots , X_{1K}\right] ^{\mathrm{T}}\) and \(\mathbf{X}_2 =\left[ X_{21}, \ldots , X_{2K}\right] ^{\mathrm{T}}\) subject to the constraint \(\mathbf{1}^{\mathrm{T}}\mathbf{X}_1 =\mathbf{1}^{\mathrm{T}}\mathbf{X}_2 =1\) is difficult to solve within the Dirichlet model, since any pair of subvectors \(\mathbf{X}_1,\,\mathbf{X}_2\) partitioning a vector \(\mathbf{X}\) with a Dirichlet distribution turn out to be independent with marginal Dirichlet distributions (see Fang et al. 1990, Theorem 1.4).
For this purpose, it is convenient to take one vector, say \(\mathbf{X}_1\), distributed as a Dirichlet random vector with parameters \(\delta >0\) and \({\varvec{\uptheta }}\) in such a way that \(\mathbf{1}^{\mathrm{T}}\mathbf{X}_1 =1\), and then to obtain \(\mathbf{X}_2\) as \(\mathbf{X}_1 +\mathbf{U}\), where \(\mathbf{U}\) is a random vector in which \(K-1\) components, say \(U_1, \ldots , U_{K-1}\), are random variables in the range \((-W,W)\) with
\[W=\min \left( X_{11}, \ldots , X_{1,K-1}, \frac{X_{1K}}{K-1}\right) \]
and \(U_K =-(U_1 +\cdots +U_{K-1})\). Indeed, after straightforward algebra it can be proven that \(0<X_{2j} <1\) for each \(j=1,\ldots , K\), while \(\mathbf{1}^{\mathrm{T}}\mathbf{X}_2 =1\) by construction. Obviously \(\text{ E }(X_{2j})=\text{ E }(X_{1j})+\text{ E }(U_j)\), while \(\text{ V }(X_{2j})=\text{ V }(X_{1j})+\text{ V }(U_j)\), provided that \(\mathbf{X}_1\) and \(\mathbf{U}\) are independent. If \(\text{ E }(\mathbf{U})=\mathbf 0 \), then \(\mathbf{X}_1\) and \(\mathbf{X}_2\) are dependent with the same mean vector. Moreover, if the \(U_j\text{ s }\) are symmetrically distributed around 0, then \(\text{ Pr }(X_{2j}>X_{1j})=0.5\) for each \(j=1,\ldots , K\). These last two features can be readily achieved if the \(U_j\text{ s }\) are independent beta variables on \((-W,W)\) with shape parameters both equal to \(\beta >0\), so that they are symmetric around 0, with variance
Accordingly the \(U_j\text{ s }\) inflate the variances of the \(X_{1j}\) by a term which increases as \(\beta \) approaches 0.
If \(\mathbf{X}_1\) coincides with the vector of constants \(\mathbf{a}\), \(\text{ E }(\mathbf{U})=\mathbf 0 \) and the \(U_j\text{ s }\) are symmetrically distributed around 0, then \(\text{ E }(\mathbf{X}_2)=\mathbf{a},\,\text{ Pr }(X_{2j} >a_j)=0.5\) and \(\text{ V }(X_{2j})=\text{ V }(U_j)\) for each \(j=1,\ldots , K\). In this case the \(U_j\text{ s }\) vary on \((-w,w)\) with \(w=\min (a_1, \ldots , a_{K-1}, \frac{a_K}{K-1})\).
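The construction of this appendix can be sketched as follows. This is a minimal illustration rather than the authors' simulation code: the function names are hypothetical, the Dirichlet draw uses the usual normalized-gamma representation, and the half-width for a random \(\mathbf{X}_1\) is assumed to have the same form as the bound \(w\) given above for the constant vector \(\mathbf{a}\).

```python
import random


def sample_dirichlet(delta, theta, rng):
    """Dirichlet draw with shapes delta*theta_j via normalized gamma variates."""
    gammas = [rng.gammavariate(delta * t, 1.0) for t in theta]
    total = sum(gammas)
    return [g / total for g in gammas]


def symmetric_beta(beta, half_width, rng):
    """Beta(beta, beta) variable rescaled from (0, 1) to (-half_width, half_width)."""
    return half_width * (2.0 * rng.betavariate(beta, beta) - 1.0)


def perturb(x1, beta, rng):
    """Return x2 = x1 + u, where u sums to zero and keeps x2 inside (0, 1)."""
    k = len(x1)
    # Assumed half-width: min(x_1, ..., x_{K-1}, x_K / (K - 1)).
    half_width = min(min(x1[:-1]), x1[-1] / (k - 1))
    u = [symmetric_beta(beta, half_width, rng) for _ in range(k - 1)]
    u.append(-sum(u))  # last component restores the unit-sum constraint
    return [a + b for a, b in zip(x1, u)]


if __name__ == "__main__":
    rng = random.Random(1)
    x1 = sample_dirichlet(10.0, [0.2, 0.3, 0.5], rng)
    x2 = perturb(x1, beta=2.0, rng=rng)
    print(x1, x2, sum(x2))
```

Passing a constant vector such as `[0.2, 0.3, 0.5]` directly to `perturb` gives the case in which \(\mathbf{X}_1\) coincides with \(\mathbf{a}\). Smaller values of `beta` push the perturbations toward the ends of the interval and so produce larger departures from `x1`.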
Cite this article
Fattorini, L., Pisani, C., Riga, F. et al. A permutation-based combination of sign tests for assessing habitat selection. Environ Ecol Stat 21, 161–187 (2014). https://doi.org/10.1007/s10651-013-0250-7
Keywords
- Compositional data analysis
- Johnson’s second order selection
- Johnson’s third order selection
- Monte Carlo studies
- Multiple testing
- Random habitat use