Abstract
Joint species distribution modeling is attracting increasing attention, in recognition that modeling species individually fails to account for the expected dependence/interaction between species. These joint models capture species dependence through an associated correlation matrix arising from a set of latent multivariate normal variables. However, these associations offer limited insight into realized dependence behavior between species at sites. We focus on joint species modeling of presence/absence data which, in addition, incorporates spatial dependence between sites. For pairs of species selected from a collection, we emphasize the induced odds ratios (along with the joint occurrence probabilities); they provide a better appreciation of the practical dependence between species that is implicit in these joint species distribution modeling specifications. For any pair of species, the spatial structure enables a spatial odds ratio surface to illuminate how dependence varies over the region of interest. We illustrate with a dataset from the Cape Floristic Region of South Africa consisting of more than 600 species at more than 600 sites. We present the spatial distribution of odds ratios for pairs of species that are positively correlated and pairs that are negatively correlated under the joint species distribution model.
Notes
Under the dimension reduction, we can include at most \(r \ll S\) decay parameters, where r is, say, 3 to 5. The effect of adopting a common decay parameter for the latent GPs is expected to be negligible.
References
Agresti A (2012) Categorical data analysis, 3rd edn. Wiley, New York
Banerjee S, Carlin BP, Gelfand AE (2014) Hierarchical modeling and analysis for spatial data, 2nd edn. Chapman and Hall, New York
Calabrese JM, Certain G, Kraan C, Dormann CF (2014) Stacking species distribution models and adjusting bias by linking them to macroecological models. Glob Ecol Biogeogr 23:99–112
Chib S (1998) Analysis of multivariate probit models. Biometrika 85:347–361
Clark JS, Nemergut D, Seyednasrollah B, Turner PJ, Zhang S (2017) Generalized joint attribute modeling for biodiversity analysis: median-zero, multivariate, multifarious data. Ecol Monogr 87:34–56
Cressie N, Wikle CK (2011) Statistics for spatio-temporal data. Wiley, New York
De Oliveira V (2000) Bayesian prediction of clipped Gaussian random fields. Comput Stat Data Anal 34:299–314
Doornik J (2007) Ox: object oriented matrix programming. Timberlake Consultants Press, New York
Gelfand AE, Schmidt AM, Wu S, Silander JA Jr, Latimer A, Rebelo AG (2005) Modelling species diversity through species level hierarchical modelling. J R Stat Soc Ser C 54:1–20
Gelfand AE, Shirota S (2019) Preferential sampling for presence/absence data and for fusion of presence/absence data with presence-only data. Ecol Monogr 89:e01372
Graham CH, Hijmans RJ (2006) A comparison of methods for mapping species ranges and species richness. Glob Ecol Biogeogr 15:578–587
Guisan A, Rahbek C (2011) SESAM - a new framework integrating macroecological and species distribution models for predicting spatio-temporal patterns of species assemblages. J Biogeogr 38:1433–1444
Gupta SS (1963) Probability integrals of multivariate normal and multivariate t. Ann Math Stat 34:792–828
Lane PW, Lindenmayer DB, Barton PS, Blanchard W, Westgate MJ (2014) Visualization of species pairwise association: a case study of surrogacy in bird assemblages. Ecol Evol 4:3279–3289
Latimer A, Wu S, Gelfand AE, Silander JA Jr (2006) Building statistical models to analyze species distributions. Ecol Appl 16:33–50
Mueller-Dombois D, Ellenberg H (1974) Aims and methods of vegetation ecology. Wiley, New York
Ovaskainen O, Abrego N, Halme P, Dunson D (2016) Using latent variable models to identify large networks of species-to-species associations at different spatial scales. Methods Ecol Evol 7:549–555
Ovaskainen O, Hottola J, Siitonen J (2010) Modeling species co-occurrence by multivariate logistic regression generates new hypotheses on fungal interactions. Ecology 91:2514–2521
Pineda E, Lobo JM (2009) Assessing the accuracy of species distribution models to predict amphibian species richness patterns. J Anim Ecol 78:182–190
Pollock LJ, Tingley R, Morris WK, Golding N, O’Hara RB, Parris KM, Vesk PA, McCarthy MA (2014) Understanding co-occurrence by modelling species simultaneously with a Joint Species Distribution Model (JSDM). Methods Ecol Evol 5:397–406
Rebelo T (2001) SASOL proteas: a field guide to the proteas of South Africa, 2nd edn. Fernwood Press, Halifax
Rota CT, Ferreira MAR, Kays RW, Forrester TD, Kalies EL, McShea WJ, Parsons AW, Millspaugh JJ (2016) A multispecies occupancy model for two or more interacting species. Methods Ecol Evol 7:1164–1173
Shirota S, Gelfand AE, Banerjee S (2019) Spatial joint species distribution modeling using Dirichlet processes. Stat Sin 29:1127–1154
Slepian D (1962) The one-sided barrier problem for Gaussian noise. Bell Syst Techn J 41:463–501
Takhtajan A (1986) Floristic regions of the world. University of California Press, California
Taylor-Rodríguez D, Kaufeld K, Schliep EM, Clark JS, Gelfand AE (2017) Joint species distribution modeling: dimension reduction using Dirichlet processes. Bayesian Anal 12:939–967
Thorson JT, Scheuerell MD, Shelton AO, See KE, Skaug HJ, Kristensen K (2015) Spatial factor analysis: a new tool for estimating joint species distributions and correlations in species range. Methods Ecol Evol 6:627–637
Wilkinson DP, Golding N, Guillera-Arroita G, Tingley R, McCarthy MA (2018) A comparison of joint species distribution models for presence–absence data. Methods Ecol Evol 10:198–211
Zobel DB, Anton JA (1997) A decade of recovery of understory vegetation buried by volcanic tephra from Mount St. Helens. Ecol Monogr 67:317–344
Acknowledgements
The computational results were obtained using Ox version 6.21 (Doornik 2007).
Additional information
Handling Editor: Luiz Duczmal.
Appendix
I: We consider, in detail, the connection between the correlation arising under the latent multivariate normal model for species pairs and the associated odds ratio. We draw on some older work relating bivariate normal probabilities to the associated bivariate correlation. There is a substantial literature, with multivariate extensions, and we only note two papers here: Gupta (1963) and Slepian (1962).
The basic result we need is the following:
Theorem: Suppose \(\left( \begin{array}{c} Z_{1} \\ Z_{2} \end{array} \right) \sim \text{BivN}\left( \left( \begin{array}{c} 0 \\ 0 \end{array} \right) , \left( \begin{array}{cc} 1 & \rho \\ \rho & 1 \end{array} \right) \right) \). Then, for any fixed \(c_1\) and \(c_2\), \(P(Z_{1} \le c_1, Z_{2} \le c_2)\) is non-decreasing in \(\rho \).
We apply this result to (1). For fixed \(\mu _{i}^{(j)}\) and \(\mu _{i}^{(j')}\), the marginal presence and absence probabilities for each species do not depend on \(\rho ^{(j,j')}\). Hence, by simple probability calculations, \(p_{i,00}^{(j,j')}\) and \(p_{i,11}^{(j,j')}\) are non-decreasing in \(\rho ^{(j,j')}\), while \(p_{i,01}^{(j,j')}\) and \(p_{i,10}^{(j,j')}\) are non-increasing in \(\rho ^{(j,j')}\). As a result, the numerator in (1) is non-decreasing in \(\rho ^{(j,j')}\) while the denominator in (1) is non-increasing in \(\rho ^{(j,j')}\). So, altogether, \(\theta _{i}^{(j,j')}\) is non-decreasing in \(\rho ^{(j,j')}\) for all i and all \((j,j')\) pairs.
As a corollary, since \(\theta _{i}^{(j,j')} =1\) when \(\rho ^{(j,j')} = 0\), we must have \(\theta _{i}^{(j,j')} \ge 1\) when \(\rho ^{(j,j')} > 0\) and \(\theta _{i}^{(j,j')} \le 1\) when \(\rho ^{(j,j')} < 0\).
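The monotonicity result and its corollary can be checked numerically. The following is a minimal sketch, not the authors' implementation (their computations used Ox). It assumes that (1) is the usual odds ratio \(p_{11}p_{00}/(p_{10}p_{01})\), that the latent variables have unit variance, and that presence corresponds to a latent variable exceeding 0; none of these conventions is stated in this appendix. The function names `bvn_cdf`, `cell_probs` and `odds_ratio` are hypothetical. The bivariate normal probability is evaluated through the standard identity \(\Phi _2(h,k;\rho ) = \Phi (h)\Phi (k) + \int _0^{\rho } \phi _2(h,k;r)\,dr\), whose non-negative integrand is one way to see why the theorem holds.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def bvn_cdf(h, k, rho, n=200):
    """P(Z1 <= h, Z2 <= k) for a standard bivariate normal, |rho| < 1.

    Integrates the bivariate normal density in the correlation argument
    from 0 to rho with composite Simpson's rule (n must be even).
    """
    def dens(r):
        s = 1.0 - r * r
        q = h * h - 2.0 * r * h * k + k * k
        return math.exp(-q / (2.0 * s)) / (2.0 * math.pi * math.sqrt(s))

    base = STD.cdf(h) * STD.cdf(k)  # value under independence
    if rho == 0.0:
        return base
    step = rho / n  # negative when rho < 0, which gives the correct sign
    total = dens(0.0) + dens(rho)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * dens(i * step)
    return base + total * step / 3.0


def cell_probs(mu1, mu2, rho):
    """Return (p00, p01, p10, p11), assuming Y = 1 iff Z > 0 with Z ~ N(mu, 1)."""
    a = STD.cdf(-mu1)  # P(Y1 = 0), free of rho
    b = STD.cdf(-mu2)  # P(Y2 = 0), free of rho
    p00 = bvn_cdf(-mu1, -mu2, rho)
    return p00, a - p00, b - p00, 1.0 - a - b + p00


def odds_ratio(mu1, mu2, rho):
    """Odds ratio p11 * p00 / (p10 * p01) (assumed form of (1))."""
    p00, p01, p10, p11 = cell_probs(mu1, mu2, rho)
    return p11 * p00 / (p10 * p01)


if __name__ == "__main__":
    # Illustrative means only; they are not taken from the paper.
    for rho in (-0.8, -0.4, 0.0, 0.4, 0.8):
        print(rho, odds_ratio(0.3, -0.5, rho))
```

With both means equal to 0 and \(\rho = 0.5\), the cell probabilities are \(p_{00}=p_{11}=1/3\) and \(p_{01}=p_{10}=1/6\), so the odds ratio should be close to 4; this gives a convenient check on the quadrature.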
II: We offer some brief words regarding how the calculation of these probabilities is implemented. Under Markov chain Monte Carlo model fitting, we obtain posterior samples, say \(\mu _{i,b}^{(j)}, \mu _{i,b}^{(j')}, \rho _{b}^{(j,j')}, b=1,2,\ldots ,B\). Each term in (1) is a double integral which is a function of \((\mu _{i}^{(j)}, \mu _{i}^{(j')}, \rho ^{(j,j')})\). So, each sample, \(\mu _{i,b}^{(j)}, \mu _{i,b}^{(j')}, \rho _{b}^{(j,j')}\), produces a posterior realization of each of the four terms on the right side of (1), e.g., \(p_{i,00,b}^{(j,j')}\), and hence a posterior realization, \(\theta _{i,b}^{(j,j')}\). Across \(b=1,2,\ldots ,B\), we obtain a posterior sample of size B for each of the terms on the right side of (1) as well as for the induced odds ratio.
The double integrals that are needed can be computed using approximations or numerical integration. In fact, if we work with the functional JSDM and specification (ii) above, then \(Z_{ij}\) and \(Z_{ij'}\) are conditionally independent given \(L^{F}_{ij}\), \(L^{F}_{ij'}\), \(L^{R}_{ij}\) and \(L^{R}_{ij'}\). So, we only need to calculate univariate cumulative normal distribution functions.
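The mapping from posterior draws to posterior realizations of the odds ratio can be sketched as follows. This is an illustration under stated assumptions, not the authors' Ox code: presence is taken to mean a unit-variance latent variable exceeding 0, (1) is taken to be \(p_{11}p_{00}/(p_{10}p_{01})\), and the draws are synthetic stand-ins generated inside the script, not MCMC output from the paper. The bivariate probability is computed by conditioning on the first latent variable, so that only univariate normal densities and distribution functions appear; this is a generic device and not the functional JSDM of specification (ii). All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, quantiles

STD = NormalDist()  # standard normal distribution


def bvn_cdf(h, k, rho, n=400, lower=-8.0):
    """P(Z1 <= h, Z2 <= k), |rho| < 1, by conditioning on Z1.

    Integrates phi(x) * Phi((k - rho * x) / sqrt(1 - rho^2)) over
    [lower, h] with composite Simpson's rule (n must be even).
    """
    if h <= lower:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)

    def f(x):
        return STD.pdf(x) * STD.cdf((k - rho * x) / s)

    step = (h - lower) / n
    total = f(lower) + f(h)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lower + i * step)
    return total * step / 3.0


def odds_ratio(mu1, mu2, rho):
    """p11 * p00 / (p10 * p01), assuming Y = 1 iff Z > 0 with Z ~ N(mu, 1)."""
    a, b = STD.cdf(-mu1), STD.cdf(-mu2)  # marginal absence probabilities
    p00 = bvn_cdf(-mu1, -mu2, rho)
    return (1.0 - a - b + p00) * p00 / ((a - p00) * (b - p00))


def posterior_odds_ratios(draws):
    """One odds ratio realization per posterior draw (mu1_b, mu2_b, rho_b)."""
    return [odds_ratio(m1, m2, r) for (m1, m2, r) in draws]


def summarize(values):
    """Posterior mean and equal-tailed 95% interval from the realizations."""
    cuts = quantiles(values, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    return {"mean": mean(values), "lower": cuts[0], "upper": cuts[-1]}


# Synthetic stand-ins for B = 200 posterior draws (NOT results from the paper).
rng = random.Random(1)
draws = [
    (rng.gauss(0.2, 0.1), rng.gauss(-0.4, 0.1),
     min(0.9, max(-0.9, rng.gauss(0.5, 0.05))))
    for _ in range(200)
]
theta = posterior_odds_ratios(draws)
summary = summarize(theta)

if __name__ == "__main__":
    print(summary)
```

Repeating this for every site i yields the posterior summaries from which an odds ratio surface for a given species pair could be mapped.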
A fully Monte Carlo alternative is to generate many \((Z_{ij}, Z_{ij'})\) pairs, hence many \((Y_{ij}, Y_{ij'})\) pairs, for each \(\mu _{i,b}^{(j)}, \mu _{i,b}^{(j')}, \rho _{b}^{(j,j')}\). The collection can be placed into a \(2 \times 2\) table from which we can obtain a posterior realization of each of the probabilities on the right side of (1) as well as the odds ratio.
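For a single posterior draw, the fully Monte Carlo route might look like the sketch below. It again assumes unit-variance latent variables with presence defined as the latent variable exceeding 0, and an odds ratio of the form \(n_{11}n_{00}/(n_{10}n_{01})\) computed from the table counts; the function names and the number of simulated pairs are illustrative choices, not values from the paper.

```python
import math
import random


def simulate_table(mu1, mu2, rho, m=100_000, seed=0):
    """Simulate m latent pairs and tabulate the implied (Y1, Y2) outcomes.

    Returns counts[y1][y2], assuming Y = 1 iff Z > 0 and
    (Z1, Z2) bivariate normal with means (mu1, mu2), unit variances
    and correlation rho.
    """
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    counts = [[0, 0], [0, 0]]
    for _ in range(m):
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        z1 = mu1 + e1
        z2 = mu2 + rho * e1 + s * e2  # induces correlation rho with z1
        counts[int(z1 > 0)][int(z2 > 0)] += 1
    return counts


def table_probs(counts):
    """Relative frequencies (p00, p01, p10, p11) from the 2 x 2 table."""
    total = sum(sum(row) for row in counts)
    return tuple(counts[a][b] / total for a in (0, 1) for b in (0, 1))


def table_odds_ratio(counts):
    """Sample odds ratio n11 * n00 / (n10 * n01); NaN if it is undefined."""
    (n00, n01), (n10, n11) = counts
    if n01 == 0 or n10 == 0:
        return float("nan")
    return (n11 * n00) / (n10 * n01)


if __name__ == "__main__":
    table = simulate_table(0.0, 0.0, 0.5)
    print(table_probs(table), table_odds_ratio(table))
```

Looping this over the B posterior draws gives the posterior sample of odds ratios. With zero means and \(\rho = 0.5\) the exact odds ratio is 4, so the simulated value should fall near 4, with Monte Carlo error shrinking as the number of pairs grows. Empty off-diagonal cells, which can occur for rare species or strong correlation, leave the sample odds ratio undefined and are flagged here with NaN.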
Cite this article
Gelfand, A.E., Shirota, S. The role of odds ratios in joint species distribution modeling. Environ Ecol Stat 28, 287–302 (2021). https://doi.org/10.1007/s10651-021-00486-4
DOI: https://doi.org/10.1007/s10651-021-00486-4