The role of odds ratios in joint species distribution modeling

Environmental and Ecological Statistics

Abstract

Joint species distribution modeling is attracting increasing attention because modeling species individually fails to account for expected dependence or interaction between species. These joint models capture species dependence through a correlation matrix arising from a set of latent multivariate normal variables. However, these correlations offer limited insight into realized dependence between species at sites. We focus on presence/absence data under joint species modeling that also incorporates spatial dependence between sites. For pairs of species selected from a collection, we emphasize the induced odds ratios (along with the joint occurrence probabilities); they provide a better appreciation of the practical dependence between species that is implicit in these joint species distribution model specifications. For any pair of species, the spatial structure yields a spatial odds ratio surface that illuminates how dependence varies over the region of interest. We illustrate with a dataset from the Cape Floristic Region of South Africa consisting of more than 600 species at more than 600 sites. We present the spatial distribution of odds ratios for pairs of species that are positively correlated and pairs that are negatively correlated under the joint species distribution model.

Notes

  1. Under the dimension reduction, we can include at most \(r \ll S\) decay parameters, where \(r\) is, say, 3 to 5. The effect of adopting a common decay parameter for the latent GPs is expected to be negligible.

References

  • Agresti A (2012) Categorical data analysis, 3rd edn. Wiley, New York

  • Banerjee S, Carlin BP, Gelfand AE (2014) Hierarchical modeling and analysis for spatial data, 2nd edn. Chapman and Hall, New York

  • Calabrese JM, Certain G, Kraan C, Dormann CF (2014) Stacking species distribution models and adjusting bias by linking them to macroecological models. Glob Ecol Biogeogr 23:99–112

  • Chib S (1998) Analysis of multivariate probit models. Biometrika 85:347–361

  • Clark JS, Nemergut D, Seyednasrollah B, Turner PJ, Zhang S (2017) Generalized joint attribute modeling for biodiversity analysis: median-zero, multivariate, multifarious data. Ecol Monogr 87:34–56

  • Cressie N, Wikle CK (2011) Statistics for spatio-temporal data. Wiley, New York

  • De Oliveira V (2000) Bayesian prediction of clipped Gaussian random fields. Comput Stat Data Anal 34:299–314

  • Doornik J (2007) Ox: object oriented matrix programming. Timberlake Consultants Press, New York

  • Gelfand AE, Schmidt AM, Wu S, Silander JA Jr, Latimer A, Rebelo AG (2005) Modelling species diversity through species level hierarchical modelling. J R Stat Soc Ser C 54:1–20

  • Gelfand AE, Shirota S (2019) Preferential sampling for presence/absence data and for fusion of presence/absence data with presence-only data. Ecol Monogr 89:e01372

  • Graham CH, Hijmans RJ (2006) A comparison of methods for mapping species ranges and species richness. Glob Ecol Biogeogr 15:578–587

  • Guisan A, Rahbek C (2011) SESAM - a new framework integrating macroecological and species distribution models for predicting spatio-temporal patterns of species assemblages. J Biogeogr 38:1433–1444

  • Gupta SS (1963) Probability integrals of multivariate normal and multivariate t. Ann Math Stat 34:792–828

  • Lane PW, Lindenmayer DB, Barton PS, Blanchard W, Westgate MJ (2014) Visualization of species pairwise association: a case study of surrogacy in bird assemblages. Ecol Evol 4:3279–3289

  • Latimer A, Wu S, Gelfand AE, Silander JA Jr (2006) Building statistical models to analyze species distributions. Ecol Appl 16:33–50

  • Mueller-Dombois D, Ellenberg H (1974) Aims and methods of vegetation ecology. Wiley, New York

  • Ovaskainen O, Abrego N, Halme P, Dunson D (2016) Using latent variable models to identify large networks of species-to-species associations at different spatial scales. Methods Ecol Evol 7:549–555

  • Ovaskainen O, Hottola J, Siitonen J (2010) Modeling species co-occurrence by multivariate logistic regression generates new hypotheses on fungal interactions. Ecology 91:2514–2521

  • Pineda E, Lobo JM (2009) Assessing the accuracy of species distribution models to predict amphibian species richness patterns. J Anim Ecol 78:182–190

  • Pollock LJ, Tingley R, Morris WK, Golding N, O’Hara RB, Parris KM, Vesk PA, McCarthy MA (2014) Understanding co-occurrence by modelling species simultaneously with a Joint Species Distribution Model (JSDM). Methods Ecol Evol 5:397–406

  • Rebelo T (2001) SASOL proteas: a field guide to the proteas of South Africa, 2nd edn. Fernwood Press, Halifax

  • Rota CT, Ferreira MAR, Kays RW, Forrester TD, Kalies EL, McShea WJ, Parsons AW, Millspaugh JJ (2016) A multispecies occupancy model for two or more interacting species. Methods Ecol Evol 7:1164–1173

  • Shirota S, Gelfand AE, Banerjee S (2019) Spatial joint species distribution modeling using Dirichlet processes. Stat Sin 29:1127–1154

  • Slepian D (1962) The one-sided barrier problem for Gaussian noise. Bell Syst Tech J 41:463–501

  • Takhtajan A (1986) Floristic regions of the world. University of California Press, California

  • Taylor-Rodríguez D, Kaufeld K, Schliep EM, Clark JS, Gelfand AE (2017) Joint species distribution modeling: dimension reduction using Dirichlet processes. Bayesian Anal 12:939–967

  • Thorson JT, Scheuerell MD, Shelton AO, See KE, Skaug HJ, Kristensen K (2015) Spatial factor analysis: a new tool for estimating joint species distributions and correlations in species range. Methods Ecol Evol 6:627–637

  • Wilkinson DP, Golding N, Guillera-Arroita G, Tingley R, McCarthy MA (2018) A comparison of joint species distribution models for presence–absence data. Methods Ecol Evol 10:198–211

  • Zobel DB, Anton JA (1997) A decade of recovery of understory vegetation buried by volcanic tephra from Mount St. Helens. Ecol Monogr 67:317–344

Acknowledgements

The computational results were obtained using Ox version 6.21 (Doornik 2007).

Author information

Corresponding author

Correspondence to Shinichiro Shirota.

Additional information

Handling Editor: Luiz Duczmal.

Appendix

I: We consider, in detail, the connection between the correlation arising under the latent multivariate normal model for species pairs and the associated odds ratio. We draw on some older work relating bivariate normal probabilities to the associated bivariate correlation. There is a substantial literature, with multivariate extensions, and we only note two papers here: Gupta (1963) and Slepian (1962).

The basic result we need is the following:

Theorem: Suppose \(\left( \begin{array}{c} Z_{1} \\ Z_{2} \end{array} \right) \sim \text{BivN}\left( \left( \begin{array}{c} 0 \\ 0 \end{array} \right), \left( \begin{array}{cc} 1 & \rho \\ \rho & 1 \end{array} \right) \right)\). Then, \(P(Z_{1} \le c_1, Z_{2} \le c_2)\) is non-decreasing in \(\rho \).

We apply this result to (1). For fixed \(\mu _{i}^{(j)}\) and \(\mu _{i}^{(j')}\), simple probability calculations show that \(p_{i,00}^{(j,j')}\) and \(p_{i,11}^{(j,j')}\) are non-decreasing in \(\rho ^{(j,j')}\), while \(p_{i,01}^{(j,j')}\) and \(p_{i,10}^{(j,j')}\) are non-increasing in \(\rho ^{(j,j')}\). As a result, the numerator in (1) is non-decreasing in \(\rho ^{(j,j')}\) while the denominator in (1) is non-increasing in \(\rho ^{(j,j')}\). So, altogether, \(\theta _{i}^{(j,j')}\) is non-decreasing in \(\rho ^{(j,j')}\) for all i and all \((j,j')\) pairs.

As a corollary, since \(\theta _{i}^{(j,j')} =1\) when \(\rho ^{(j,j')} = 0\), we must have \(\theta _{i}^{(j,j')} \ge 1\) when \(\rho ^{(j,j')} > 0\) and \(\theta _{i}^{(j,j')} \le 1\) when \(\rho ^{(j,j')} < 0\).
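
The "simple probability calculations" invoked in part I can be spelled out as follows. This is a sketch under assumptions that this excerpt does not state: presence is taken as \(Y = 1\) when the latent \(Z > 0\), the latent variables have unit variance, equation (1) is read as the usual cross-product ratio, and the first subscript of \(p_{ab}\) refers to species \(j\). The symbols \(c_1\), \(c_2\), \(\Phi\) and \(\Phi_2\) are notation introduced here for illustration.

```latex
% Assumed setup: c_1 = -\mu_i^{(j)}, c_2 = -\mu_i^{(j')};
% \Phi is the standard normal cdf and \Phi_2(c_1, c_2; \rho) is the
% standard bivariate normal cdf with correlation \rho.
\begin{aligned}
p_{00} &= \Phi_2(c_1, c_2; \rho), \\
p_{01} &= \Phi(c_1) - \Phi_2(c_1, c_2; \rho), \\
p_{10} &= \Phi(c_2) - \Phi_2(c_1, c_2; \rho), \\
p_{11} &= 1 - \Phi(c_1) - \Phi(c_2) + \Phi_2(c_1, c_2; \rho), \\
\theta &= \frac{p_{00}\, p_{11}}{p_{01}\, p_{10}}.
\end{aligned}
% The marginal terms \Phi(c_1), \Phi(c_2) do not involve \rho, so the
% monotonicity of \Phi_2 in \rho transfers to each cell with the sign shown.
% At \rho = 0, \Phi_2 = \Phi(c_1)\Phi(c_2), hence
% p_{00} p_{11} = \Phi(c_1)\Phi(c_2)\,[1-\Phi(c_1)]\,[1-\Phi(c_2)] = p_{01} p_{10},
% which gives \theta = 1.
```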

II: We offer some brief words regarding how calculation of probabilities is implemented. Under Markov chain Monte Carlo model fitting, we obtain posterior samples, say \(\mu _{i,b}^{(j)}, \mu _{i,b}^{(j')}, \rho _{b}^{(j,j')}, b=1,2,\ldots ,B\). Each term in (1) is a double integral which is a function of \((\mu _{i}^{(j)}, \mu _{i}^{(j')}, \rho ^{(j,j')})\). So, each sample, \(\mu _{i,b}^{(j)}, \mu _{i,b}^{(j')}, \rho _{b}^{(j,j')}\) produces a posterior realization of each of the four terms on the right side of (1), e.g., \(p_{i,00,b}^{(j,j')}\), hence a posterior realization, \(\theta _{i,b}^{(j,j')}\). Across \(b=1,2,\ldots ,B\), we obtain a posterior sample of size B for each of the terms on the right side of (1) as well as the induced odds ratio. The double integrals that are needed can be computed using approximations or numerically. In fact, if we work with the functional JSDM and specification (ii) above, then \(Z_{ij}\) and \(Z_{ij'}\) are conditionally independent given \(L^{F}_{ij}\), \(L^{F}_{ij'}\), \(L^{R}_{ij}\) and \(L^{R}_{ij'}\). So, we only need to calculate univariate cumulative normal distribution functions.
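
The mapping from posterior draws to odds ratio draws described above can be sketched as follows. This is a minimal illustration, not the authors' Ox implementation: it assumes presence corresponds to a latent value above zero with unit latent variance, reads (1) as the cross-product ratio, and evaluates the double integral by one-dimensional Simpson quadrature. The function names and the three posterior draws are hypothetical placeholders, not values from the paper.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bvn_cdf(c1, c2, rho, n=2000, lower=-8.0):
    """P(Z1 <= c1, Z2 <= c2) for a standard bivariate normal, |rho| < 1.

    Uses P = integral over z <= c1 of phi(z) * Phi((c2 - rho*z)/sqrt(1-rho^2)),
    approximated by composite Simpson's rule on [lower, c1] (n must be even).
    """
    if c1 <= lower:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)
    h = (c1 - lower) / n
    total = 0.0
    for k in range(n + 1):
        z = lower + k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * norm_pdf(z) * norm_cdf((c2 - rho * z) / s)
    return total * h / 3.0

def odds_ratio(mu1, mu2, rho):
    # Assumed convention: Y = 1 when the latent variable mu + error exceeds 0.
    p00 = bvn_cdf(-mu1, -mu2, rho)
    p01 = norm_cdf(-mu1) - p00      # species j absent, species j' present
    p10 = norm_cdf(-mu2) - p00      # species j present, species j' absent
    p11 = 1.0 - p00 - p01 - p10
    return (p00 * p11) / (p01 * p10)

# Hypothetical posterior draws (mu_j, mu_j', rho) at one site; placeholders only.
draws = [(0.30, -0.20, 0.45), (0.25, -0.10, 0.50), (0.35, -0.25, 0.40)]

# One odds ratio realization per posterior draw, then a simple summary.
thetas = [odds_ratio(m1, m2, r) for (m1, m2, r) in draws]
posterior_mean = sum(thetas) / len(thetas)
print(thetas, posterior_mean)
```

In practice the list of draws would hold all B MCMC samples for a given site and species pair, and quantiles of the resulting odds ratios would give a credible interval.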

A fully Monte Carlo alternative is to generate many \((Z_{ij}, Z_{ij'})\) pairs, hence many \((Y_{ij}, Y_{ij'})\) pairs for each \(\mu _{i,b}^{(j)}, \mu _{i,b}^{(j')}, \rho _{b}^{(j,j')}\). The collection can be placed into a \(2 \times 2\) table from which we can obtain a posterior realization of each of the probabilities on the right side of (1) as well as the odds ratio.
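
The Monte Carlo route can be sketched in the same spirit for a single posterior draw. The sketch assumes unit-variance latent variables, presence when the latent value exceeds zero, and the cross-product form of the odds ratio. The function name `mc_odds_ratio`, the number of pairs and the seed are illustrative choices, not taken from the paper.

```python
import math
import random

def mc_odds_ratio(mu1, mu2, rho, n_pairs=100_000, seed=0):
    """Simulate latent pairs, tabulate (Y1, Y2) in a 2x2 table, return cell
    probabilities and the empirical odds ratio for one posterior draw."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    counts = [[0, 0], [0, 0]]          # counts[y1][y2]
    for _ in range(n_pairs):
        e1 = rng.gauss(0.0, 1.0)
        e2 = rho * e1 + s * rng.gauss(0.0, 1.0)   # corr(e1, e2) = rho
        y1 = 1 if mu1 + e1 > 0 else 0  # assumed convention: present if Z > 0
        y2 = 1 if mu2 + e2 > 0 else 0
        counts[y1][y2] += 1
    probs = [[c / n_pairs for c in row] for row in counts]
    # Cross-product ratio; undefined if an off-diagonal cell is empty.
    theta = (probs[0][0] * probs[1][1]) / (probs[0][1] * probs[1][0])
    return probs, theta

probs, theta = mc_odds_ratio(0.0, 0.0, 0.5)
print(probs, theta)
```

With both means at zero and a correlation of 0.5, the exact cell probabilities are 1/3 on the diagonal and 1/6 off the diagonal, so the exact odds ratio is 4. The simulated value should be close to that, with Monte Carlo error shrinking as the number of pairs grows.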

About this article

Cite this article

Gelfand, A.E., Shirota, S. The role of odds ratios in joint species distribution modeling. Environ Ecol Stat 28, 287–302 (2021). https://doi.org/10.1007/s10651-021-00486-4

  • DOI: https://doi.org/10.1007/s10651-021-00486-4