Plant Ecology, Volume 216, Issue 5, pp 669–682

Model-based thinking for community ecology

  • David I. Warton
  • Scott D. Foster
  • Glenn De’ath
  • Jakub Stoklosa
  • Piers K. Dunstan


In this paper, a case is made for the use of model-based approaches for the analysis of community data. This involves the direct specification of a statistical model for the observed multivariate data. Recent advances in statistical modelling mean that it is now possible to build models that are appropriate for the data and that address key ecological questions in a statistically coherent manner. Key advantages of this approach include interpretability, flexibility, and efficiency, which we explain in detail and illustrate by example. The steps in a model-based approach to analysis are outlined, with an emphasis on key features arising in a multivariate context. A key distinction of the model-based approach is its emphasis on diagnostic checking to ensure that the model provides reasonable agreement with the observed data. Two examples are presented that illustrate how the model-based approach can provide insights into ecological problems that were not previously available. In the first example, we test for a treatment effect in a study where different sites had different sampling intensities, a difference that was handled by adding an offset term to the model. In the second example, we incorporate trait information into a model for an ordinal response in order to identify the main reasons why species differ in their environmental response.
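The role of an offset term in the first example can be illustrated with a minimal sketch. This is not the authors' analysis: the counts, treatment labels and sampling efforts below are hypothetical, the function name `fit_poisson_offset` is made up for illustration, and a single-species Poisson log-linear model fitted by Newton-Raphson is assumed purely to show the idea. The offset log(effort) enters the linear predictor with its coefficient fixed at 1, so the treatment coefficient compares rates per unit of sampling effort rather than raw counts.

```python
import math


def fit_poisson_offset(counts, treatment, effort, n_iter=50, tol=1e-10):
    """Fit log(mu_i) = log(effort_i) + b0 + b1 * treatment_i by Newton-Raphson.

    Hypothetical helper for illustration only. The offset log(effort_i)
    has no coefficient to estimate; it is simply added to the linear predictor.
    """
    offset = [math.log(e) for e in effort]
    # Start from the pooled log-rate so the first Newton step is well behaved.
    b0 = math.log(sum(counts) / sum(effort))
    b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(o + b0 + b1 * x) for o, x in zip(offset, treatment)]
        # Score vector: gradient of the Poisson log-likelihood.
        g0 = sum(y - m for y, m in zip(counts, mu))
        g1 = sum(x * (y - m) for x, y, m in zip(treatment, counts, mu))
        # Fisher information matrix (2 x 2, symmetric).
        i00 = sum(mu)
        i01 = sum(x * m for x, m in zip(treatment, mu))
        i11 = sum(x * x * m for x, m in zip(treatment, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2 x 2 system information * step = score.
        d0 = (i11 * g0 - i01 * g1) / det
        d1 = (i00 * g1 - i01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical data: treated sites were sampled more intensively.
counts = [4, 9, 2, 12, 6, 3]
treatment = [0, 0, 0, 1, 1, 1]
effort = [1, 2, 1, 4, 2, 1]

b0, b1 = fit_poisson_offset(counts, treatment, effort)
print("log rate ratio (treated vs control):", b1)
```

In these made-up data the treated sites have the higher mean count (7 versus 5), but also the higher sampling effort. Once effort is accounted for, the estimated treatment effect is negative: for a two-group model the fit reproduces the closed-form log rate ratio log((21/7)/(15/4)) = log(0.8), about -0.22.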


Keywords: Community-level modelling; Fourth-corner problem; Model checking; Multivariate analysis; Ordination; Species distribution models



DIW is supported by the Australian Research Council Future Fellow scheme (project number FT120100501). SDF, GD and PKD were supported by the Marine Biodiversity Hub, a collaborative partnership supported through funding from the Australian Government’s National Environmental Research Program (NERP). NERP Marine Biodiversity Hub partners include the Institute for Marine and Antarctic Studies, University of Tasmania; CSIRO Wealth from Oceans National Flagship; Geoscience Australia; Australian Institute of Marine Science; Museum Victoria; Charles Darwin University; and the University of Western Australia.



Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  • David I. Warton (1, corresponding author)
  • Scott D. Foster (2, 3)
  • Glenn De’ath (4)
  • Jakub Stoklosa (1)
  • Piers K. Dunstan (2)
  1. School of Mathematics and Statistics and Evolution & Ecology Research Centre, The University of New South Wales, Sydney, Australia
  2. CSIRO’s Wealth from Oceans Flagship, Hobart, Australia
  3. CSIRO’s Division of Computational Informatics, Hobart, Australia
  4. Australian Institute of Marine Science, Cape Ferguson, Australia
