Generalized Linear Latent Variable Models for Multivariate Count and Biomass Data in Ecology

  • Jenni Niku
  • David I. Warton
  • Francis K. C. Hui
  • Sara Taskinen


In this paper we consider generalized linear latent variable models that can handle overdispersed counts and continuous but non-negative data. Such data are common in ecological studies when modelling multivariate abundances or biomass. By extending the standard generalized linear modelling framework to include latent variables, we can capture any covariation between species that the predictors do not explain, notably species interactions and correlations driven by missing covariates. We show how estimation and inference for the considered models can be performed efficiently using the Laplace approximation method, and we use simulations to study the finite-sample properties of the resulting estimates. In the overdispersed count data case, the Laplace-approximated estimates perform similarly to the estimates based on the variational approximation method, another approach that provides a closed-form approximation of the likelihood. In the biomass data case, we show that ignoring the correlation between taxa affects the regression estimates unfavourably. To illustrate how our methods can be used in unconstrained ordination and in making inference on environmental variables, we apply them to two ecological datasets: abundances of bacterial species in three arctic locations in Europe and abundances of coral reef species in Indonesia.
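As a rough illustration of the Laplace approximation mentioned in the abstract, the sketch below approximates the marginal log-likelihood of one site under a deliberately simplified setting: Poisson responses with a log link and a single standard normal latent variable. This is not the authors' implementation. The function names, the parameter values and the one-dimensional Newton search are all assumptions made for the example, and the overdispersed count and biomass families treated in the paper are not covered.

```python
import math


def log_joint(u, y, beta, lam):
    """log p(y | u) + log p(u) for one site: Poisson responses, log link,
    standard normal latent variable u (an assumed, simplified setting)."""
    total = -0.5 * u * u - 0.5 * math.log(2.0 * math.pi)
    for y_j, b_j, l_j in zip(y, beta, lam):
        eta = b_j + l_j * u  # linear predictor for species j
        total += y_j * eta - math.exp(eta) - math.lgamma(y_j + 1.0)
    return total


def curvature(u, beta, lam):
    """Second derivative of log_joint with respect to u (always negative)."""
    return -sum(l * l * math.exp(b + l * u) for b, l in zip(beta, lam)) - 1.0


def laplace_log_marginal(y, beta, lam, iters=50, tol=1e-12):
    """Laplace approximation of log integral exp(log_joint(u)) du.
    Returns the approximation and the mode of the latent variable."""
    u = 0.0
    for _ in range(iters):  # Newton iterations towards the mode of log_joint
        mu = [math.exp(b + l * u) for b, l in zip(beta, lam)]
        grad = sum(l * (y_j - m) for y_j, l, m in zip(y, lam, mu)) - u
        step = grad / curvature(u, beta, lam)
        u -= step
        if abs(step) < tol:
            break
    hess = curvature(u, beta, lam)
    approx = log_joint(u, y, beta, lam) + 0.5 * math.log(2.0 * math.pi) \
        - 0.5 * math.log(-hess)
    return approx, u


def quadrature_log_marginal(y, beta, lam, lo=-8.0, hi=8.0, n=4000):
    """Brute-force reference value from a simple Riemann sum on a fine grid."""
    step = (hi - lo) / n
    total = sum(math.exp(log_joint(lo + i * step, y, beta, lam))
                for i in range(n + 1))
    return math.log(total * step)


# Hypothetical counts, intercepts and loadings for three species at one site.
y = [3, 0, 5]
beta = [1.0, -0.5, 1.2]
lam = [0.5, -0.3, 0.8]

approx, u_hat = laplace_log_marginal(y, beta, lam)
reference = quadrature_log_marginal(y, beta, lam)
print(approx, reference, u_hat)
```

With a single latent variable the integral can still be checked by brute force, as done here. With several latent variables per site that check becomes expensive, which is where a closed-form approximation of the likelihood is useful.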

Supplementary materials accompanying this paper appear on-line.


Keywords: Biomass; Laplace approximation; Ordination; Overdispersed count; Species interactions



We thank the Associate Editor and the referees for their helpful comments. We also thank Dr Manoj Kumar and Dr Riitta Nissinen for providing us with the plant-microbial diversity data. JN and ST were supported by the Academy of Finland grants 251965 and 283323.



Copyright information

© International Biometric Society 2017

Authors and Affiliations

  1. Department of Mathematics and Statistics, University of Jyväskylä, Jyväskylä, Finland
  2. School of Mathematics and Statistics and Evolution and Ecology Research Centre, The University of New South Wales, Sydney, Australia
  3. School of Mathematics and Statistics, The University of New South Wales, Sydney, Australia
  4. Mathematical Sciences Institute, The Australian National University, Canberra, Australia
