Generalized Linear Latent Variable Models for Multivariate Count and Biomass Data in Ecology

Abstract

We consider generalized linear latent variable models (GLLVMs) that can handle overdispersed counts and continuous but non-negative data. Such data are common in ecological studies that model multivariate abundances or biomass. By extending the standard generalized linear modelling framework to include latent variables, we can account for covariation between species that is not explained by the predictors, notably species interactions and correlations driven by missing covariates. We show how estimation and inference for these models can be performed efficiently using the Laplace approximation, and we use simulations to study the finite-sample properties of the resulting estimates. For overdispersed count data, the Laplace-approximated estimates perform similarly to estimates based on the variational approximation, another method that provides a closed-form approximation of the likelihood. For biomass data, we show that ignoring the correlation between taxa adversely affects the regression estimates. To illustrate how our methods can be used in unconstrained ordination and in making inference on environmental variables, we apply them to two ecological datasets: abundances of bacterial species at three Arctic locations in Europe and abundances of coral reef species in Indonesia.

Supplementary materials accompanying this paper appear online.

References

  1. Araújo, M. B. and Luoto, M. (2007). The importance of biotic interactions for modelling species distributions under climate change. Global Ecology and Biogeography, 16:743–753.

  2. Bartholomew, D. J., Knott, M., and Moustaki, I. (2011). Latent variable models and factor analysis: A unified approach. Wiley: New York.

  3. Bianconcini, S. and Cagnone, S. (2012). Estimation of generalized linear latent variable models via fully exponential Laplace approximation. Journal of Multivariate Analysis, 112:183–193.

  4. Blanchet, F. (2014). HMSC: Hierarchical modelling of species community. R package version 0.6-2.

  5. Brown, A. M., Warton, D. I., Andrew, N. R., Binns, M., Cassis, G., and Gibb, H. (2014). The fourth-corner solution - using predictive models to understand how species traits interact with the environment. Methods in Ecology and Evolution, 5:344–352.

  6. Burnham, K. and Anderson, D. (2002). Model selection and multimodel inference: A practical information-theoretic approach. Springer.

  7. Chu, H., Fierer, N., Lauber, C. L., Caporaso, J. G., Knight, R., and Grogan, P. (2010). Soil bacterial diversity in the arctic is not fundamentally different from that found in other biomes. Environmental Microbiology, 12:2998–3006.

  8. Cressie, N., Calder, C. A., Clark, J. S., Hoef, J. M. V., and Wikle, C. K. (2009). Accounting for uncertainty in ecological analysis: the strengths and limitations of hierarchical statistical modeling. Ecological Applications, 19(3):553–570.

  9. Dunn, P. K. and Smyth, G. K. (1996). Randomized quantile residuals. Journal of Computational and Graphical Statistics, 5:236–244.

  10. ——. (2005). Series evaluation of Tweedie exponential dispersion model densities. Statistics and Computing, 15:267–280.

  11. Dunstan, P. K., Foster, S. D., Hui, F., and Warton, D. I. (2013). Finite mixture of regression modeling for high-dimensional count and biomass data in ecology. Journal of Agricultural, Biological, and Environmental Statistics, 18:357–375.

  12. Foster, S. D. and Bravington, M. V. (2013). A Poisson–Gamma model for analysis of ecological non-negative continuous data. Environmental and Ecological Statistics, 20:533–552.

  13. Hall, P., Ormerod, J. T., and Wand, M. (2011a). Theory of Gaussian variational approximation for a Poisson mixed model. Statistica Sinica, 21:369–389.

  14. Hall, P., Pham, T., Wand, M. P., Wang, S. S., et al. (2011b). Asymptotic normality and valid inference for Gaussian variational approximation. The Annals of Statistics, 39:2502–2532.

  15. Huber, P. and Ronchetti, E. (2009). Robust Statistics. Wiley: New York.

  16. Huber, P., Ronchetti, E., and Victoria-Feser, M. (2004). Estimation of generalized linear latent variable models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 66:893–908.

  17. Hui, F. K. C. (2016). boral–Bayesian Ordination and Regression Analysis of Multivariate Abundance Data in R. Methods in Ecology and Evolution, 7:744–750.

  18. Hui, F. K. C., Taskinen, S., Pledger, S., Foster, S. D., and Warton, D. I. (2015). Model-Based Approaches to Unconstrained Ordination. Methods in Ecology and Evolution, 6:399–411.

  19. Hui, F. K. C., Warton, D., Ormerod, J., Haapaniemi, V., and Taskinen, S. (2016). Variational Approximations for Generalized Linear Latent Variable Models. Journal of Computational and Graphical Statistics. In press.

  20. Joe, H. (2008). Accuracy of Laplace approximation for discrete response mixed models. Computational Statistics & Data Analysis, 52:5066–5074.

  21. Jorgensen, B. (1997). The Theory of Dispersion Models. Chapman & Hall.

  22. Kendal, W. S. (2004). Taylor’s ecological power law as a consequence of scale invariant exponential dispersion models. Ecological Complexity, 1(3):193–209.

  23. Kristensen, K., Nielsen, A., Berg, C., Skaug, H., and Bell, B. (2016). TMB: Automatic differentiation and Laplace approximation. Journal of Statistical Software, 70(5):1–21.

  24. Letten, A. D., Keith, D. A., Tozer, M. G., and Hui, F. K. (2015). Fine-scale hydrological niche differentiation through the lens of multi-species co-occurrence models. Journal of Ecology, 103:1264–1275.

  25. Männistö, M. K., Tiirola, M., and Häggblom, M. M. (2007). Bacterial communities in Arctic fjelds of Finnish Lapland are stable but highly pH-dependent. FEMS Microbiology Ecology, 59:452–465.

  26. Martin, T. G., Wintle, B. A., Rhodes, J. R., Kuhnert, P. M., Field, S. A., Low-Choy, S. J., Tyre, A. J., and Possingham, H. P. (2005). Zero tolerance ecology: improving ecological inference by modelling the source of zero observations. Ecology Letters, 8:1235–1246.

  27. Morales-Castilla, I., Matias, M. G., Gravel, D., and Araújo, M. B. (2015). Inferring biotic interactions from proxies. Trends in Ecology & Evolution, 30(6):347–356.

  28. Moustaki, I. (1996). A latent trait and a latent class model for mixed observed variables. British Journal of Mathematical and Statistical Psychology, 49:313–334.

  29. Moustaki, I. and Knott, M. (2000). Generalized latent trait models. Psychometrika, 65:391–411.

  30. Nakagawa, S. and Schielzeth, H. (2013). A general and simple method for obtaining \(R^2\) from generalized linear mixed-effects models. Methods in Ecology and Evolution, 4:133–142.

  31. Nissinen, R., Männistö, M., and van Elsas, J. (2012). Endophytic bacterial communities in three arctic plants from low arctic fell tundra are cold-adapted and host-plant specific. FEMS Microbiology Ecology, 82:510–522.

  32. Ovaskainen, O., Abrego, N., Halme, P., and Dunson, D. (2016a). Using latent variable models to identify large networks of species-to-species associations at different spatial scales. Methods in Ecology and Evolution, 7:549–555.

  33. Ovaskainen, O., de Knegt, H. J., and Delgado Sanchez, M. d. M. (2016b). Quantitative Ecology and Evolutionary Biology: Integrating Models with Data. Oxford: Oxford University Press.

  34. Rabe-Hesketh, S., Skrondal, A., and Pickles, A. (2002). Reliable estimation of generalized linear mixed models using adaptive quadrature. Stata Journal, 2:1–21.

  35. Rodrigues-Motta, M., Pinheiro, H. P., Martins, E. G., Araujo, M. S., and dos Reis, S. F. (2013). Multivariate models for correlated count data. Journal of Applied Statistics, 40:1586–1596.

  36. Sammel, M. D., Ryan, L. M., and Legler, J. M. (1997). Latent variable models for mixed discrete and continuous outcomes. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 59:667–678.

  37. Skrondal, A. and Rabe-Hesketh, S. (2004). Generalized Latent Variable Modeling: Multilevel, Longitudinal and Structural Equation Models. Chapman & Hall, Boca Raton.

  38. Taylor, L. R. (1961). Aggregation, variance and the mean. Nature, 189:732–735.

  39. Warton, D. I. (2005). Many zeros does not mean zero inflation: comparing the goodness-of-fit of parametric models to multivariate abundance data. Environmetrics, 16:275–289.

  40. Warton, D. I., Blanchet, F. G., O’Hara, R., Ovaskainen, O., Taskinen, S., Walker, S. C., and Hui, F. K. (2016). Extending Joint Models in Community Ecology: A Response to Beissinger et al. Trends in Ecology & Evolution, 31:737–738.

  41. Warton, D. I., Blanchet, F. G., O’Hara, R., Ovaskainen, O., Taskinen, S., Walker, S. C., and Hui, F. K. C. (2015). So many variables: Joint modeling in community ecology. Trends in Ecology and Evolution, 30:766–779.

  42. Warwick, R., Clarke, K., and Suharsono (1990). A statistical analysis of coral community responses to the 1982–83 El Niño in the Thousand Islands, Indonesia. Coral Reefs, 8:171–179.

  43. Welsh, A. H., Cunningham, R. B., Donnelly, C., and Lindenmayer, D. B. (1996). Modelling the abundance of rare species: statistical models for counts with extra zeros. Ecological Modelling, 88:297–308.

  44. Yu, D. W., Ji, Y., Emerson, B. C., Wang, X., Ye, C., Yang, C., and Ding, Z. (2012). Biodiversity soup: metabarcoding of arthropods for rapid biodiversity assessment and biomonitoring. Methods in Ecology and Evolution, 3:613–623.

Acknowledgements

We thank the Associate Editor and the referees for their helpful comments. We also thank Dr Manoj Kumar and Dr Riitta Nissinen for providing us with the plant-microbial diversity data. JN and ST were supported by the Academy of Finland grants 251965 and 283323.

Corresponding author

Correspondence to Jenni Niku.

Electronic supplementary material

Supplementary material 1 (pdf 67 KB)

Appendices

A Proofs

A.1 Laplace Approximations for the General Exponential Family

Assume that the responses \(y_{ij}\) come from the exponential family of distributions with mean \(\mu _{ij}=E(y_{ij})\), and write \(f(y_{ij}|\varvec{u}_i,\varvec{\Psi }) = \exp \left\{ y_{ij}a_j(\mu _{ij})-b_j(\mu _{ij}) + c_j(y_{ij})\right\} \), where \(a_j(\cdot )\), \(b_j(\cdot )\) and \(c_j(\cdot )\) are known functions, and \(\varvec{\Psi }\) includes all model parameters. The log-likelihood function (5) for parameter vector \(\varvec{\Psi }\) now equals

$$\begin{aligned} l(\varvec{ \Psi }) =&\sum \limits _{i=1}^n\log \displaystyle \int \left[ \prod \limits _{j=1}^m \exp \Big \{y_{ij}\, a_j(\mu _{ij}) - b_j(\mu _{ij}) + c_j(y_{ij})\Big \} \right] \times (2\pi )^{-\frac{d}{2}}\exp \left( -\frac{1}{2}\varvec{u}_i'\varvec{u}_i\right) \,d\varvec{u}_i, \end{aligned}$$

and the Laplace approximation of the log-likelihood function is

$$\begin{aligned} \tilde{l}(\varvec{ \Psi },\varvec{\hat{u}}_{i})&= \sum \limits _{i=1}^n\Bigg (-\frac{1}{2}\log \det \left\{ \varvec{\Gamma }(\varvec{\Psi },\varvec{\hat{u}}_i)\right\} + \sum \limits _{j=1}^m\left\{ y_{ij}\, a_j(\mu _{ij}) - b_j(\mu _{ij}) + c_j(y_{ij})\right\} - \frac{\varvec{\hat{u}}_i'\varvec{\hat{u}}_i}{2}\Bigg ), \end{aligned}$$

where

$$\begin{aligned} \varvec{\Gamma }(\varvec{\Psi },\varvec{\hat{u}}_i) = \sum \limits _{j=1}^m\frac{\partial ^2 \left\{ -y_{ij}\, a_j(\mu _{ij}) + b_j(\mu _{ij})\right\} }{\partial \varvec{u}_i'\partial \varvec{u}_i}\Bigg |_{\varvec{u}_i=\varvec{\hat{u}}_i} + \varvec{I}_d, \end{aligned}$$

and \(\varvec{\hat{u}}_i\) is the maximizer of \(Q(\varvec{\Psi },\varvec{u}_{i}) = (1/m)\left( \sum \limits _{j=1}^m\log f(y_{ij}|\varvec{u}_i;\varvec{\Psi }) - \varvec{u}_i'\varvec{u}_i/2\right) \) with respect to \(\varvec{u}_i\); in \(\tilde{l}\), each \(\mu _{ij}\) is evaluated at \(\varvec{u}_i=\varvec{\hat{u}}_i\). The result is proved in Huber et al. (2004).
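A brief sketch of why the approximation takes this form may be helpful. It is an informal second-order expansion that assumes \(\varvec{\Gamma }(\varvec{\Psi },\varvec{\hat{u}}_i)\) is positive definite, and it is not a substitute for the proof in Huber et al. (2004). For site \(i\) the integrand equals \((2\pi )^{-d/2}\exp \{mQ(\varvec{\Psi },\varvec{u}_i)\}\), and the gradient of \(Q\) vanishes at \(\varvec{\hat{u}}_i\), so that

$$\begin{aligned} mQ(\varvec{\Psi },\varvec{u}_i)&\approx mQ(\varvec{\Psi },\varvec{\hat{u}}_i) - \frac{1}{2}(\varvec{u}_i-\varvec{\hat{u}}_i)'\,\varvec{\Gamma }(\varvec{\Psi },\varvec{\hat{u}}_i)\,(\varvec{u}_i-\varvec{\hat{u}}_i), \\ \int (2\pi )^{-\frac{d}{2}}\exp \{mQ(\varvec{\Psi },\varvec{u}_i)\}\,d\varvec{u}_i&\approx \exp \{mQ(\varvec{\Psi },\varvec{\hat{u}}_i)\}\,\det \left\{ \varvec{\Gamma }(\varvec{\Psi },\varvec{\hat{u}}_i)\right\} ^{-\frac{1}{2}}. \end{aligned}$$

The factor \((2\pi )^{d/2}\) from the Gaussian integral cancels the normalizing constant of the latent variable density. Taking logarithms, using \(mQ(\varvec{\Psi },\varvec{\hat{u}}_i)=\sum _{j}\log f(y_{ij}|\varvec{\hat{u}}_i;\varvec{\Psi })-\varvec{\hat{u}}_i'\varvec{\hat{u}}_i/2\), and summing over \(i\) gives \(\tilde{l}\).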

A.2 Poisson Responses

Species counts can be modelled as Poisson distributed responses, \(y_{ij}\sim Poisson(\mu _{ij})\), with a log link function. Then \(a_j(\mu _{ij}) = \log (\mu _{ij})\), \(b_j(\mu _{ij})=\mu _{ij}\), and \(c_j(y_{ij})=-\log (y_{ij}!)\), and the following Laplace approximation \(\tilde{l}\) for the log-likelihood function is obtained

$$\begin{aligned} \tilde{l} (\varvec{ \Psi },\varvec{\hat{u}}_{i})&= \sum \limits _{i=1}^n\Bigg (-\frac{1}{2}\log \,\det \left( \varvec{\Gamma }(\varvec{\Psi },\varvec{\hat{u}}_i) \right) + \sum \limits _{j=1}^m\big [ y_{ij} \hat{\eta }_{ij} - \exp (\hat{\eta }_{ij}) - \log (y_{ij}!) \big ] - \frac{\varvec{\hat{u}}_i'\varvec{\hat{u}}_i}{2}\Bigg ), \end{aligned}$$

where \(\varvec{\Gamma }(\varvec{\Psi },\varvec{\hat{u}}_i)= \sum \nolimits _{j=1}^m \exp (\hat{\eta }_{ij})\varvec{\gamma }_j\varvec{\gamma }_j' + \varvec{I}_d\), with \(\hat{\eta }_{ij}=\alpha _i + \beta _{0j} + \varvec{x}'_i \varvec{\beta }_j + \hat{\varvec{u}_i}'\varvec{\gamma }_j\), and \(\varvec{\hat{u}}_i\) is the maximum of

$$\begin{aligned} Q(\varvec{\Psi },\varvec{u}_{i}) =&\frac{1}{m}\Bigg [\sum \limits _{j=1}^m\big [ y_{ij}\eta _{ij} - \exp (\eta _{ij}) - \log (y_{ij}!) \big ] - \frac{\varvec{u}_i'\varvec{u}_i}{2} - \frac{d}{2}\log (2\pi )\Bigg ]. \end{aligned}$$
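To make the Poisson formulas concrete, the following is a minimal sketch for a single site with one latent variable (\(d=1\)). It is not the authors' implementation: the function names, the plain Newton iteration used to find \(\hat{u}_i\), the trapezoidal quadrature used as a reference, and the toy numbers are all assumptions made for illustration. Here `eta0` stands for the fixed part \(\alpha _i + \beta _{0j} + \varvec{x}'_i \varvec{\beta }_j\) of the linear predictor.

```python
import math


def linear_predictors(u, eta0, gamma):
    """eta_ij = eta0_ij + u_i * gamma_j for a single latent variable (d = 1)."""
    return [e0 + g * u for e0, g in zip(eta0, gamma)]


def poisson_log_density_sum(y, eta):
    """sum_j [ y_ij * eta_ij - exp(eta_ij) - log(y_ij!) ]."""
    return sum(yj * ej - math.exp(ej) - math.lgamma(yj + 1)
               for yj, ej in zip(y, eta))


def find_u_hat(y, eta0, gamma, tol=1e-12, max_iter=100):
    """Undamped Newton iterations for the maximizer of Q in u_i (illustrative only)."""
    u = 0.0
    for _ in range(max_iter):
        eta = linear_predictors(u, eta0, gamma)
        grad = sum((yj - math.exp(ej)) * gj
                   for yj, ej, gj in zip(y, eta, gamma)) - u
        hess = -sum(math.exp(ej) * gj * gj for ej, gj in zip(eta, gamma)) - 1.0
        step = grad / hess
        u -= step
        if abs(step) < tol:
            break
    return u


def laplace_loglik_site(y, eta0, gamma):
    """Contribution of one site to the Laplace-approximated log-likelihood."""
    u_hat = find_u_hat(y, eta0, gamma)
    eta_hat = linear_predictors(u_hat, eta0, gamma)
    # For d = 1, Gamma reduces to the scalar sum_j exp(eta_hat_ij) * gamma_j^2 + 1.
    big_gamma = sum(math.exp(ej) * gj * gj for ej, gj in zip(eta_hat, gamma)) + 1.0
    return (-0.5 * math.log(big_gamma)
            + poisson_log_density_sum(y, eta_hat)
            - 0.5 * u_hat ** 2)


def quadrature_loglik_site(y, eta0, gamma, lo=-8.0, hi=8.0, n=4001):
    """Reference value: trapezoidal rule applied to the marginal likelihood integral."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        u = lo + k * h
        log_f = poisson_log_density_sum(y, linear_predictors(u, eta0, gamma))
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * math.exp(log_f - 0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return math.log(total * h)


# Toy data for m = 5 species at one site (hypothetical values).
y = [2, 0, 3, 1, 4]
eta0 = [0.5, -0.3, 1.0, 0.0, 1.2]
gamma = [0.8, -0.5, 0.3, 0.6, -0.4]
approx = laplace_loglik_site(y, eta0, gamma)
exact = quadrature_loglik_site(y, eta0, gamma)
print(approx, exact)
```

The two printed values should be close but not identical, since the Laplace approximation discards higher-order terms. A practical implementation would also need to safeguard the Newton step and handle \(d>1\), where \(\varvec{\Gamma }\) is a matrix.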

A.3 Proof of Theorem 2

Assume that the responses \(y_{ij}\) come from the zero-inflated Poisson distribution with mean \(E(y_{ij})=(1-p_j)\mu _{ij}\) and density of the form (3). The log-likelihood function (5) then equals

$$\begin{aligned} l(\varvec{ \Psi })= & {} \sum \limits _{i=1}^n\log \bigg (\int \prod \limits _{j=1}^m \exp \left( \log \left[ p_j + (1-p_j)\exp \{-\exp (\eta _{ij})\}\right] I_{(y_{ij} = 0)} \right. \\&+ \left. \left\{ \log (1-p_j) - \exp (\eta _{ij})+y_{ij}\eta _{ij} -\log (y_{ij}!)\right\} I_{(y_{ij} > 0)} \right) \\&\times (2\pi )^{-\frac{d}{2}}\exp \left( -\frac{1}{2}\varvec{u}_i'\varvec{u}_i\right) \,d\varvec{u}_i\bigg ). \end{aligned}$$

Hence, the Laplace approximation of the log-likelihood function is

$$\begin{aligned} \tilde{l}(\varvec{ \Psi },\varvec{\hat{u}}_{i})= & {} \sum \limits _{i=1}^n\Bigg (-\frac{1}{2}\log \det \left\{ \varvec{\Gamma }(\varvec{\Psi },\varvec{\hat{u}}_i)\right\} + \sum \limits _{j=1}^m \log f(y_{ij}|\varvec{\hat{u}}_i;\varvec{\Psi }) - \frac{\varvec{\hat{u}}_i'\varvec{\hat{u}}_i}{2}\Bigg ) \\= & {} \sum \limits _{i=1}^n\Bigg (-\frac{1}{2}\log \det \left\{ \varvec{\Gamma }(\varvec{\Psi },\varvec{\hat{u}}_i)\right\} + \sum \limits _{j=1}^m\Big (\log \big (p_j + (1-p_j) \hat{A}_{ij}\big )I_{(y_{ij} = 0)} \\&+ \left\{ \log (1-p_j) - \exp (\hat{\eta }_{ij})+y_{ij}\hat{\eta }_{ij} -\log (y_{ij}!)\right\} I_{(y_{ij} > 0)}\Big ) - \frac{\varvec{\hat{u}}_i'\varvec{\hat{u}}_i}{2}\Bigg ), \end{aligned}$$

where

$$\begin{aligned} \varvec{\Gamma }(\varvec{\Psi },\varvec{\hat{u}}_i)= & {} \frac{\partial ^2}{\partial \varvec{u}_i'\partial \varvec{u}_i} \Bigg [ - \sum \limits _{j=1}^m\log f(y_{ij}|\varvec{u}_i;\varvec{\Psi }) + \frac{\varvec{u}_i'\varvec{u}_i}{2}\Bigg ] \Bigg |_{\varvec{u}_i=\varvec{\hat{u}}_i}\\= & {} \sum \limits _{j=1}^m \frac{\partial ^2 \left\{ \exp (\eta _{ij})I_{(y_{ij}> 0)} - \log (p_j + (1-p_j)A_{ij})I_{(y_{ij} = 0)}\right\} }{\partial \varvec{u}_i'\partial \varvec{u}_i}\Bigg |_{\varvec{u}_i=\varvec{\hat{u}}_i}+ \varvec{I}_d \\= & {} \sum \limits _{j=1}^m\Bigg [\exp (\hat{\eta }_{ij})I_{(y_{ij} > 0)} - \Bigg (\frac{(1-p_j)\hat{A}_{ij}\exp (\hat{\eta }_{ij})(\exp (\hat{\eta }_{ij})-1)}{p_j + (1-p_j)\hat{A}_{ij}} \\&- \frac{(1-p_j)^2\hat{A}_{ij}^2\exp (2\hat{\eta }_{ij})}{(p_j + (1-p_j)\hat{A}_{ij})^2}\Bigg )I_{(y_{ij} = 0)}\Bigg ]\varvec{\gamma }_j\varvec{\gamma }_j' + \varvec{I}_d, \end{aligned}$$

with \(\hat{\eta }_{ij}=\alpha _i + \beta _{0j} + \varvec{x}'_i \varvec{\beta }_j + \hat{\varvec{u}_i}'\varvec{\gamma }_j\) and \(\hat{A}_{ij}=\exp \{-\exp (\hat{\eta }_{ij})\}\), and \(\varvec{\hat{u}}_i\) is the maximum of \(Q(\varvec{\Psi },\varvec{u}_{i}) = (1/m)\left( \sum \nolimits _{j=1}^m\log f(y_{ij}|\varvec{u}_i;\varvec{\Psi }) - \varvec{u}_i'\varvec{u}_i/2\right) \).
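The scalar multiplying \(\varvec{\gamma }_j\varvec{\gamma }_j'\) in \(\varvec{\Gamma }\) is the second derivative of \(-\log f(y_{ij}|\varvec{u}_i;\varvec{\Psi })\) with respect to \(\eta _{ij}\). The sketch below checks the closed-form expression for zero observations against a central finite difference. The function names and test values are hypothetical and serve only to illustrate the formula.

```python
import math


def neg_log_f_zero(eta, p):
    """-log f(0 | eta) for the zero-inflated Poisson with log link."""
    return -math.log(p + (1.0 - p) * math.exp(-math.exp(eta)))


def zip_gamma_weight(y, eta, p):
    """Scalar weight of gamma_j gamma_j' in Gamma for one observation."""
    e = math.exp(eta)
    if y > 0:
        return e
    a = math.exp(-e)                 # A_ij = exp{-exp(eta_ij)}
    denom = p + (1.0 - p) * a
    first = (1.0 - p) * a * e * (e - 1.0) / denom
    second = ((1.0 - p) * a * e) ** 2 / denom ** 2
    return -(first - second)


def second_difference(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2


for p in (0.1, 0.3, 0.6):
    for eta in (-1.0, 0.0, 0.5, 1.0):
        closed = zip_gamma_weight(0, eta, p)
        numeric = second_difference(lambda t: neg_log_f_zero(t, p), eta)
        print(p, eta, closed, numeric)
```

In this check the weight for a zero observation turns out to be negative for some inputs (for example \(\eta _{ij}=1\), \(p_j=0.3\)). Unlike in the Poisson case, the zero-observation terms therefore do not by themselves guarantee a positive definite \(\varvec{\Gamma }\).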

A.4 Proof of Theorem 3

Assume that the responses \(y_{ij}\) come from the Tweedie distribution with mean \(E(y_{ij})=\mu _{ij}\) and density of the form (4). The log-likelihood function (5) then equals

$$\begin{aligned} l(\varvec{ \Psi })= & {} \sum \limits _{i=1}^n\log \Bigg (\int \prod \limits _{j=1}^m \Bigg [ \exp \left( - \frac{\mu _{ij}^{2-\nu }}{\phi _j(2-\nu )}\right) I_{(y_{ij} = 0)} \\&+\, \frac{1}{y_{ij}}\tilde{W}(y_{ij},\phi _j,\nu )\exp \left\{ \frac{1}{\phi _j}\left( \frac{y_{ij}\mu _{ij}^{1-\nu }}{1-\nu } - \frac{\mu _{ij}^{2-\nu }}{2-\nu }\right) \right\} I_{(y_{ij} > 0)} \Bigg ]\\&\times (2\pi )^{-\frac{d}{2}}\exp \left( -\frac{1}{2}\varvec{u}_i'\varvec{u}_i\right) \,d\varvec{u}_i \Bigg ). \end{aligned}$$

Hence, the Laplace approximation of the log-likelihood function is

$$\begin{aligned} \tilde{l}(\varvec{ \Psi },\varvec{\hat{u}}_{i})= & {} \sum \limits _{i=1}^n\Bigg (-\frac{1}{2}\log \det \left\{ \varvec{\Gamma }(\varvec{\Psi },\varvec{\hat{u}}_i)\right\} + \sum \limits _{j=1}^m \log f(y_{ij}|\varvec{\hat{u}}_i;\varvec{\Psi }) - \frac{\varvec{\hat{u}}_i'\varvec{\hat{u}}_i}{2}\Bigg ) \\= & {} \sum \limits _{i=1}^n\Bigg (-\frac{1}{2}\log \det \left\{ \varvec{\Gamma }(\varvec{\Psi },\varvec{\hat{u}}_i) \right\} + \sum \limits _{j=1}^m\bigg [\left\{ \log \tilde{W}(y_{ij},\phi _j,\nu ) - \log (y_{ij})\right\} I_{(y_{ij}>0)} \\&+ \frac{1}{\phi _j}\left( \frac{y_{ij}\exp \{(1-\nu )\hat{\eta }_{ij}\}}{1-\nu } - \frac{\exp \{(2-\nu )\hat{\eta }_{ij}\}}{2-\nu }\right) \bigg ] - \frac{\varvec{\hat{u}}_i'\varvec{\hat{u}}_i}{2}\Bigg ), \end{aligned}$$

where

$$\begin{aligned} \varvec{\Gamma }(\varvec{\Psi },\varvec{\hat{u}}_i)= & {} \frac{\partial ^2}{\partial \varvec{u}_i'\partial \varvec{u}_i} \Bigg [ - \sum \limits _{j=1}^m\log f(y_{ij}|\varvec{u}_i;\varvec{\Psi }) + \frac{\varvec{u}_i'\varvec{u}_i}{2}\Bigg ] \Bigg |_{\varvec{u}_i=\varvec{\hat{u}}_i}\\= & {} \sum \limits _{j=1}^m \frac{\partial ^2}{\partial \varvec{u}_i'\partial \varvec{u}_i} \frac{1}{\phi _j}\left( - \frac{y_{ij}\exp \{(1-\nu )\eta _{ij}\}}{1-\nu } + \frac{\exp \{(2-\nu )\eta _{ij}\}}{2-\nu }\right) \Bigg |_{\varvec{u}_i=\varvec{\hat{u}}_i}+ \varvec{I}_d \\= & {} \sum \limits _{j=1}^m\frac{1}{\phi _j}\left[ (2-\nu )\exp \{(2-\nu )\hat{\eta }_{ij}\} - y_{ij}(1-\nu )\exp \{(1-\nu )\hat{\eta }_{ij}\} \right] \varvec{\gamma }_j\varvec{\gamma }_j' + \varvec{I}_d, \end{aligned}$$

with \(\hat{\eta }_{ij}=\alpha _i + \beta _{0j} + \varvec{x}'_i \varvec{\beta }_j + \hat{\varvec{u}_i}'\varvec{\gamma }_j\) and \(\mu _{ij}=\exp (\eta _{ij})\) under the log link, and \(\varvec{\hat{u}}_i\) is the maximum of \(Q(\varvec{\Psi },\varvec{u}_{i}) = (1/m)\left( \sum \nolimits _{j=1}^m\log f(y_{ij}|\varvec{u}_i;\varvec{\Psi }) - \varvec{u}_i'\varvec{u}_i/2\right) \).
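The Tweedie weight in \(\varvec{\Gamma }\) can be checked numerically in the same spirit. The sketch below compares the closed-form second derivative with a central finite difference of the part of \(-\log f\) that depends on \(\eta _{ij}\); the term involving \(\tilde{W}\) does not depend on \(\varvec{u}_i\) and is omitted. The function names and the values of \(\phi _j\) and \(\nu \) are assumptions chosen for illustration, with \(1<\nu <2\).

```python
import math


def tweedie_neg_log_kernel(eta, y, phi, nu):
    """eta-dependent part of -log f(y | eta) for the Tweedie model with log link."""
    return (-y * math.exp((1.0 - nu) * eta) / (1.0 - nu)
            + math.exp((2.0 - nu) * eta) / (2.0 - nu)) / phi


def tweedie_gamma_weight(eta, y, phi, nu):
    """Scalar weight of gamma_j gamma_j' in Gamma for one observation."""
    return ((2.0 - nu) * math.exp((2.0 - nu) * eta)
            - y * (1.0 - nu) * math.exp((1.0 - nu) * eta)) / phi


def second_difference(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2


phi, nu = 1.3, 1.5  # hypothetical dispersion and power parameters
for y in (0.0, 0.7, 3.2):
    for eta in (-1.0, 0.0, 1.5):
        closed = tweedie_gamma_weight(eta, y, phi, nu)
        numeric = second_difference(
            lambda t: tweedie_neg_log_kernel(t, y, phi, nu), eta)
        print(y, eta, closed, numeric)
```

For \(1<\nu <2\) and \(y_{ij}\ge 0\) both terms of the weight are non-negative, so each species contributes a positive semi-definite term to \(\varvec{\Gamma }\).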

B Additional Application Results

See Figs. 5, 6, 7, 8 and 9.

Fig. 5

Ordination of the \(n=56\) sites based on a generalized linear latent variable model without covariates, assuming negative binomial responses. Sites are coloured according to (a) soil organic matter (SOM) values and (b) phosphorus (P) values, and labelled according to the sampling site.

Fig. 6

Ranked point estimates with \(95\%\) confidence intervals for the three environmental variables based on the negative binomial GLLVM. Confidence intervals shown in grey include zero.

Fig. 7

Ordination of the \(n=56\) sites based on a generalized linear latent variable model with pH, soil organic matter and phosphorus as covariates, assuming negative binomial responses. Sites are coloured according to (a) pH values, (b) soil organic matter (SOM) values and (c) phosphorus (P) values, and labelled according to the sampling site. The effect of the environmental variables vanishes, but the ordination is still affected by sampling location, with a few Kilpisjärvi sites differing from the others in species composition.

Fig. 8

Dunn–Smyth residuals against linear predictors for the (a) Poisson, (b) zero-inflated Poisson and (c) negative binomial GLLVMs with pH, soil organic matter, phosphorus and categorical site as covariates. Lowess curves are included in the plots.

Fig. 9

Dunn–Smyth residuals against linear predictors for the Tweedie models (a) without site effect and (b) with site effect.
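The Dunn–Smyth residuals plotted in Figs. 8 and 9 are randomized quantile residuals (Dunn and Smyth 1996). For a discrete response, a uniform draw is taken between the fitted distribution function evaluated just below and at the observation and is then mapped through the standard normal quantile function. The sketch below illustrates the idea for a plain Poisson response with a known mean. The function names, the simple Poisson sampler and the toy settings are assumptions for illustration and do not reproduce the fitted models behind the figures.

```python
import math
import random
from statistics import NormalDist


def poisson_cdf(y, mu):
    """P(Y <= y) for Y ~ Poisson(mu); returns 0 for y < 0."""
    if y < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for k in range(1, y + 1):
        term *= mu / k
        total += term
    return min(total, 1.0)


def dunn_smyth_residual(y, mu, rng):
    """Randomized quantile residual for a Poisson observation."""
    lower = poisson_cdf(y - 1, mu)
    upper = poisson_cdf(y, mu)
    u = lower + (upper - lower) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf away from 0 and 1
    return NormalDist().inv_cdf(u)


def sample_poisson(mu, rng):
    """Knuth's multiplication method; adequate for small mu."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


rng = random.Random(1)
mu = 3.0
counts = [sample_poisson(mu, rng) for _ in range(2000)]
residuals = [dunn_smyth_residual(c, mu, rng) for c in counts]
mean = sum(residuals) / len(residuals)
var = sum((r - mean) ** 2 for r in residuals) / (len(residuals) - 1)
print(mean, var)
```

When the assumed model is correct, these residuals are approximately standard normal, which is why systematic trends in the plotted residuals indicate lack of fit.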

Cite this article

Niku, J., Warton, D.I., Hui, F.K.C. et al. Generalized Linear Latent Variable Models for Multivariate Count and Biomass Data in Ecology. JABES 22, 498–522 (2017). https://doi.org/10.1007/s13253-017-0304-7

Keywords

  • Biomass
  • Laplace approximation
  • Ordination
  • Overdispersed count
  • Species interactions