Model-based thinking for community ecology

Abstract

In this paper, we make a case for model-based approaches to the analysis of community data, in which a statistical model is specified directly for the observed multivariate data. Recent advances in statistical modelling mean that it is now possible to build models that are appropriate for the data and that address key ecological questions in a statistically coherent manner. Key advantages of this approach include interpretability, flexibility, and efficiency, which we explain in detail and illustrate by example. We outline the steps in a model-based analysis, with an emphasis on features that arise in a multivariate context. A key distinction of the model-based approach is its emphasis on diagnostic checking to ensure that the model provides reasonable agreement with the observed data. Two examples illustrate how the model-based approach can provide insights into ecological problems that were not previously available. In the first example, we test for a treatment effect in a study where different sites had different sampling intensities, which we handle by adding an offset term to the model. In the second example, we incorporate trait information into a model for ordinal responses in order to identify the main reasons why species differ in their environmental response.
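To make the offset idea in the first example concrete, the following is a minimal sketch, not the authors' analysis. It assumes a single species with Poisson counts, a binary treatment indicator, and a known sampling effort per site; the data, the variable names, and the function `fit_poisson_offset` are hypothetical and chosen only for illustration. The offset `log(effort)` enters the linear predictor with its coefficient fixed at 1, so the treatment coefficient describes a change in abundance per unit of sampling effort rather than a change in raw counts.

```python
import math

def fit_poisson_offset(y, treat, effort, n_iter=25):
    """Newton-Raphson fit of log(mu_i) = log(effort_i) + b0 + b1 * treat_i."""
    # The offset is a known term in the linear predictor (coefficient fixed at 1).
    offset = [math.log(e) for e in effort]
    b0 = math.log(sum(y) / sum(effort))  # start at the pooled rate per unit effort
    b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(o + b0 + b1 * t) for o, t in zip(offset, treat)]
        # Score vector of the Poisson log-likelihood
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(t * (yi - mi) for yi, mi, t in zip(y, mu, treat))
        # 2 x 2 Fisher information matrix
        h00 = sum(mu)
        h01 = sum(t * mi for mi, t in zip(mu, treat))
        h11 = sum(t * t * mi for mi, t in zip(mu, treat))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system (information) * step = score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: counts at six sites, three control and three treated,
# sampled with unequal intensity (e.g. number of quadrats per site).
counts = [12, 30, 9, 4, 16, 5]
treat = [0, 0, 0, 1, 1, 1]
effort = [2.0, 5.0, 1.0, 1.0, 4.0, 2.0]

b0, b1 = fit_poisson_offset(counts, treat, effort)

# For comparison: the treatment effect on raw counts, ignoring sampling effort.
treated_total = sum(c for c, t in zip(counts, treat) if t == 1)
control_total = sum(c for c, t in zip(counts, treat) if t == 0)
naive_b1 = math.log(treated_total / control_total)  # equal group sizes cancel

print(f"treatment effect with offset:    {b1:.3f}")
print(f"treatment effect without offset: {naive_b1:.3f}")
```

With a single binary predictor, the fitted rates reduce to total count divided by total effort within each group, which gives a simple check on the iterative fit. The sketch covers estimation only; a test of the treatment effect across many species would additionally need to account for correlation between species.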

Acknowledgments

DIW is supported by the Australian Research Council Future Fellow scheme (project number FT120100501). SDF, GD and PKD were supported by the Marine Biodiversity Hub, a collaborative partnership supported through funding from the Australian Government’s National Environmental Research Program (NERP). NERP Marine Biodiversity Hub partners include the Institute for Marine and Antarctic Studies, University of Tasmania; CSIRO Wealth from Oceans National Flagship; Geoscience Australia; Australian Institute of Marine Science; Museum Victoria; Charles Darwin University; and the University of Western Australia.

Author information

Corresponding author

Correspondence to David I. Warton.

Additional information

Communicated by P. R. Minchin and J. Oksanen.

About this article

Cite this article

Warton, D.I., Foster, S.D., De’ath, G. et al. Model-based thinking for community ecology. Plant Ecol 216, 669–682 (2015). https://doi.org/10.1007/s11258-014-0366-3

  • DOI: https://doi.org/10.1007/s11258-014-0366-3

Keywords

  • Community-level modelling
  • Fourth-corner problem
  • Model checking
  • Multivariate analysis
  • Ordination
  • Species distribution models