Finite Mixture of Regression Modeling for High-Dimensional Count and Biomass Data in Ecology
Understanding how species distributions respond to environmental gradients is a key question in ecology, and one that will benefit from a multi-species approach. Multi-species data are often high dimensional, in that the number of species sampled is large relative to the number of sites, and are commonly recorded as presence–absence, counts of individuals, or biomass of each species. In this paper, we propose a novel approach to the analysis of multi-species data when the goal is to understand how each species responds to its environment. We use a finite mixture of regression models, grouping species into “Archetypes” according to their environmental response, thereby substantially reducing the dimension of the regression model. Previous research introduced such Species Archetype Models (SAMs), but only for binary assemblage data. Here, we extend this basic framework with three key innovations: (1) the method is expanded to handle count and biomass data, (2) we propose grouping on the slope coefficients only, whilst the intercept terms and nuisance parameters remain species-specific, and (3) we develop model diagnostic tools for SAMs. By grouping on environmental responses only, the model allows for inter-species variation in overall prevalence and abundance. The application of our expanded SAM framework is illustrated on marine survey data and through simulation.
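To make the grouping idea concrete, the following is a minimal sketch of the log-likelihood of a finite mixture of regressions in which slopes are shared within an archetype and intercepts are species-specific. It is an illustration only, not the authors' implementation: it assumes a Poisson response with a log link (the paper also covers biomass data, which this sketch does not), and the function name `sam_log_likelihood` and all argument names are hypothetical.

```python
import math
import numpy as np

def sam_log_likelihood(Y, X, alpha, beta, pi):
    """Mixture log-likelihood, assuming Poisson counts and a log link.

    Y     : (n_sites, n_species) counts
    X     : (n_sites, n_cov) covariates, without an intercept column
    alpha : (n_species,) species-specific intercepts
    beta  : (n_arch, n_cov) slopes shared by all species in an archetype
    pi    : (n_arch,) mixing proportions, summing to one
    Returns the log-likelihood and the (n_species, n_arch) posterior
    archetype membership probabilities.
    """
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    pi = np.asarray(pi, dtype=float)

    # Linear predictor eta[i, j, k] = alpha_j + x_i' beta_k
    eta = alpha[None, :, None] + (X @ beta.T)[:, None, :]

    # Poisson log-density for every site, species and archetype
    log_fact = np.vectorize(math.lgamma)(Y + 1.0)
    logp = Y[:, :, None] * eta - np.exp(eta) - log_fact[:, :, None]

    # Sum over sites, then add the log mixing proportions
    per_species = logp.sum(axis=0) + np.log(pi)[None, :]

    # Log-sum-exp over archetypes for numerical stability
    m = per_species.max(axis=1, keepdims=True)
    log_mix = m[:, 0] + np.log(np.exp(per_species - m).sum(axis=1))

    # Posterior probability that each species belongs to each archetype
    tau = np.exp(per_species - log_mix[:, None])
    return log_mix.sum(), tau
```

Under these assumptions, each species contributes one mixture term over archetypes, so the number of slope parameters grows with the number of archetypes rather than with the number of species.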
Supplementary materials accompanying this paper appear on-line.
Key Words: Community-level model; Mixture model; Multi-species; Species archetype model; Species distribution model; Tweedie