Fourthcorner correlation is a score test statistic in a loglinear trait–environment model that is useful in permutation testing
Abstract
Ecologists wish to understand the role of traits of species in determining where each species occurs in the environment. For this, they wish to detect associations between species traits and environmental variables from three data tables, species count data from sites with associated environmental data and species trait data from data bases. These three tables leave a missing part, the fourthcorner. The fourthcorner correlations between quantitative traits and environmental variables, heuristically proposed 20 years ago, fill this corner. Generalized linear (mixed) models have been proposed more recently as a modelbased alternative. This paper shows that the squared fourthcorner correlation times the total count is precisely the score test statistic for testing the linearbylinear interaction in a Poisson loglinear model that also contains species and sites as main effects. For multiple traits and environmental variables, the score test statistic is proportional to the total inertia of a doubly constrained correspondence analysis. When the count data are overdispersed compared to the Poisson or when there are other deviations from the model such as unobserved traits or environmental variables that interact with the observed ones, the score test statistic does not have the usual chisquare distribution. For these types of deviations, row and columnbased permutation methods (and their sequential combination) are proposed to control the type I error without undue loss of power (unless no deviation is present), as illustrated in a small simulation study. The issues for valid statistical testing are illustrated using the wellknown Dutch Dune Meadow data set.
Keywords
Community ecology, Correspondence analysis, Fourthcorner, Permutation test, Score test statistic, Trait–environment association
1 Introduction
Ecological and evolutionary theory predicts that species have adapted to the environments they occupy (Southwood 1977; Townsend and Hildrew 1994). In recent years, understanding how the units of evolution (species) and their associated traits relate to the environment they inhabit has become a central focus in community ecology (McGill et al. 2006). A central question in this quest has been to establish the functionality of species traits, i.e. determine which traits allow species to survive and prosper where they do. Ultimately, by examining variation among attributes of species (traits) and among attributes of sites (environment), we can describe some of the important rules by which species assemblages emerge. As Legendre et al. (1997) stated: “Testing such hypotheses would require (1) a way to detecting associations between species and habitat characteristics, and (2) a way of testing the significance of these associations.” Traits and environmental (or habitat) variables cannot be correlated directly as they are measured on different units, namely species and sites, respectively, but can be connected via a nonnegative link table with rows for sites and columns for species (e.g. presence–absence matrices, abundance or biomass information on species).
Legendre et al. (1997) developed a heuristic method referred to as the fourthcorner approach to trait–environment association. In that approach, three matrices containing information on the distributions of multiple species, species traits and the environmental attributes of species assemblages are combined to estimate a fourth matrix (the fourthcorner) containing correlations between traits and environment. Using ideas from multivariate analysis, Dolédec et al. (1996) developed a threetable ordination method, called RLQ, to establish the links between species traits and environmental variables. RLQ can consider either principal component analysis (PCA) or correspondence analysis (CA) as the central ordination method. For a single trait and a single environment variable, the version based on CA reduces to the fourthcorner method. The univariate version is by far the most used approach to link environmental and trait variation.
The fourthcorner solution (Dray and Legendre 2008; Legendre et al. 1997) is to determine the fourth matrix X in the simplest way, namely by the matrix product \(\mathbf{X}=\mathbf{E}^{T}\mathbf{Y}\mathbf{T}\), where Y is the link table and E and T contain the environmental variables of the sites and the traits of the species, respectively. For a nominal trait, a nominal environmental variable (expanded to indicator matrices E and T) and a presence–absence data table Y, the fourth data table X is simply a contingency table containing frequencies. A natural test statistic for significance testing is thus the usual chisquare statistic, which will not necessarily follow a chisquare distribution because of the obvious dependencies between the entries. Legendre et al. (1997) investigated this issue and proposed permutation testing strategies as potential solutions. However, none of their strategies worked satisfactorily under all models they considered (Dray and Legendre 2008). Eventually, ter Braak et al. (2012) derived a strategy based on the sequential rejection principle (Goeman and Solari 2010) that controlled the type I error in data generated from any of the models considered by Dray et al. (2014). This sequential strategy involves both row and column permutation (see Sect. 3).
For quantitative E and T, the same equation \(\mathbf{X}=\mathbf{E}^{T}\mathbf{Y}\mathbf{T}\) can be used, except that the expansion to indicator matrices must now be replaced by normalization of each column of E and T to a weighted mean of zero and a weighted variance of 1, with site and species weights for E and T being the row and column sums of Y, respectively. This then yields a matrix X consisting of fourthcorner correlations.
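As an illustration only, the weighted standardization and the resulting fourthcorner correlation for a single trait and a single environmental variable can be sketched in Python. The function names and the toy data are hypothetical, and the sketch assumes that the cross-product is divided by the table total \(y_{++}\) so that the result lies on the correlation scale.

```python
import math

def weighted_standardize(values, weights):
    """Scale values to weighted mean 0 and weighted variance 1."""
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return [(v - mean) / math.sqrt(var) for v in values]

def fourth_corner_correlation(Y, e, t):
    """Correlation between site variable e and species trait t.

    Y is a sites-by-species table of non-negative abundances (list of rows).
    Site weights are the row sums of Y, species weights the column sums.
    """
    row_sums = [sum(row) for row in Y]
    col_sums = [sum(col) for col in zip(*Y)]
    y_total = sum(row_sums)
    e_std = weighted_standardize(e, row_sums)
    t_std = weighted_standardize(t, col_sums)
    cross = sum(Y[i][j] * e_std[i] * t_std[j]
                for i in range(len(e)) for j in range(len(t)))
    return cross / y_total  # assumption: divide by y_++ to get a correlation

# Hypothetical toy data: 3 sites (rows) by 3 species (columns).
Y = [[4, 1, 0],
     [2, 3, 1],
     [0, 2, 5]]
e = [1.0, 2.0, 3.0]   # environmental value per site
t = [0.5, 1.5, 4.0]   # trait value per species
r = fourth_corner_correlation(Y, e, t)
print(round(r, 3))
```

Because both variables are standardized with abundance-based weights, the value does not change when e or t is shifted or rescaled by a positive constant.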
The motivation for the weighting came from considering an “inflated data table” (Legendre et al. 1997), in which Y is vectorized, the zeroes removed, and each nonzero speciessite combination is associated with the corresponding rows of E and T. The fourthcorner correlation between a trait and an environmental variable is then the Pearson correlation between the corresponding column of the trait in the inflated T and the corresponding column of the environmental variable in the inflated E. For more general nonnegative data tables, this generalizes to a weighted Pearson correlation with abundances as weights and absences carrying zero weight (Dray and Legendre 2008).
The inflation process has an intuitive rationale when the abundance data are counts of individuals. The inflated data table simply lists all individuals (rows) and has, for \(p =q = 1\), two variables (columns), namely the single trait and the single environmental variable. Each row has the trait value of the species it belongs to and the environmental value of the site which it inhabits. The fourthcorner correlation is then simply the unweighted Pearson correlation between the two variables of this table. This is a natural method to use when individuals are sampled with measurements of their traits and the environmental variables where they live. In such sampling, there will be intraspecific (and intrasite) variation, which is ignored in the original formulation of Legendre et al. (1997), but could certainly be accounted for. Significance testing of the correlation by permutation procedures proceeds similarly to the nominal variables case (Dray et al. 2014).
The rationale followed by Dolédec et al. (1996) to arrive at an equivalent solution is completely different; it is based on ways to constrain row and column scores in statistical triplets (Cailliez and Pagès 1976; Tenenhaus and Young 1985) defining a correspondence analysis. The link with (doubly) constrained correspondence analysis (Lavorel et al. 1998) returns at several places in this paper.
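The inflated-table view of the fourthcorner correlation can be checked numerically. The following Python sketch, with hypothetical function names and toy counts, builds one row per counted individual and compares the unweighted Pearson correlation of that table with the abundance-weighted Pearson correlation computed directly from Y.

```python
import math

def pearson(x, y):
    """Ordinary (unweighted) Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def inflate(Y, e, t):
    """One row per individual: environment of its site, trait of its species."""
    env, trait = [], []
    for i, row in enumerate(Y):
        for j, count in enumerate(row):
            env.extend([e[i]] * count)     # zero counts add nothing
            trait.extend([t[j]] * count)
    return env, trait

def weighted_pearson(Y, e, t):
    """Pearson correlation over all cells with the abundances as weights."""
    cells = [(Y[i][j], e[i], t[j])
             for i in range(len(e)) for j in range(len(t))]
    w = sum(y for y, _, _ in cells)
    me = sum(y * a for y, a, _ in cells) / w
    mt = sum(y * b for y, _, b in cells) / w
    s_et = sum(y * (a - me) * (b - mt) for y, a, b in cells)
    s_ee = sum(y * (a - me) ** 2 for y, a, _ in cells)
    s_tt = sum(y * (b - mt) ** 2 for y, _, b in cells)
    return s_et / math.sqrt(s_ee * s_tt)

# Hypothetical integer counts: 3 sites by 3 species.
Y = [[4, 1, 0],
     [2, 3, 1],
     [0, 2, 5]]
e = [1.0, 2.0, 3.0]
t = [0.5, 1.5, 4.0]

env, trait = inflate(Y, e, t)
r_inflated = pearson(env, trait)
r_weighted = weighted_pearson(Y, e, t)
print(len(env), abs(r_inflated - r_weighted) < 1e-12)
```

The inflation step only makes sense for integer counts; for other non-negative abundances the weighted form is the one that generalizes.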
More recently, Pollock et al. (2012), Jamil et al. (2013) and Brown et al. (2014) independently proposed modelbased approaches that generalize the fourthcorner problem to multiple traits and environmental variables using generalized linear (mixed) models (GL(M)M) for vectorized Y, as in the bilinear regression approach of Gabriel (1998). Modelbased approaches have great appeal (e.g. for nature conservation purposes) as they can improve the prediction of species abundances not only based on their environmental characteristics, but also on their traits and the interactions between traits and environment. These GL(M)M approaches allow for simultaneous modeling of the abundances of m species in terms of one or more traits and environmental variables. Mainstream methods of variable selection are used to build parsimonious models. In these approaches, X becomes a matrix of (partial) regression coefficients estimating the direction and strength of the interaction between standardized traits and standardized environmental variables as well as their main effects on species distributions (Brown et al. 2014).
This paper seeks to establish connections between the earlier heuristic fourthcorner correlation and the more recent modelbased approaches. One such, almost trivial, connection has been presented in the Appendix of Brown et al. (2014). There, for a nominal trait and a nominal environmental variable, the fourthcorner X is a contingency table obtained by merging columns and rows that belong to the same category of the trait and of the environmental variable, respectively, and the likelihood ratio test on interaction in a contingency table using a Poisson loglinear model is asymptotically equivalent to the usual chisquare test. No such relationships have been established for quantitative variables. The importance of such links is that they allow the generalization and unification of a simple and widely used heuristic method based on correlations (fourthcorner) to the GLM (fixed or mixed) regression machinery to link trait and environmental variation. This paper establishes that the squared fourthcorner correlation times the sum of the elements of the link table Y (i.e. \(y_{++}\)) is precisely the score test statistic for testing the linearbylinear interaction in a Poisson loglinear model with row and column main effects. Moreover, for multiple traits and environmental variables, the score test statistic is precisely \(y_{++} \) times the total inertia of a doubly constrained correspondence analysis (Kleyer 2012; Lavorel et al. 1999, 1998), which is the natural generalization of a singly constrained correspondence analysis, known as canonical correspondence analysis (Takane 2013; ter Braak 1986, 2014). It is also the natural generalization of RLQ (Dolédec et al. 1996; Dray et al. 2014) for correlated traits and environmental variables.
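In symbols, the single-trait, single-variable result stated above can be written as follows. The notation is an assumption made here for illustration: \(\tilde{e}_i\) and \(\tilde{t}_j\) denote the environmental and trait values after standardization to weighted mean zero and weighted variance one (with weights \(y_{i+}\) and \(y_{+j}\)), r denotes the fourthcorner correlation and S the score test statistic, for n sites and m species.

\[
r = \frac{1}{y_{++}} \sum_{i=1}^{n} \sum_{j=1}^{m} y_{ij}\, \tilde{e}_i\, \tilde{t}_j ,
\qquad
S = y_{++}\, r^{2} .
\]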
In ecological applications, however, the assumptions of the Poisson loglinear model are unlikely to hold true for a number of reasons. First, counts are typically overdispersed compared to the Poisson and therefore modeled by, for example, a negative binomial distribution (Warton 2005). Second, observations from the same site are likely to be dependent and residual correlation among species is to be expected. This dependence has typically been addressed by resampling methods that resample entire sites instead of single individual observations (Oksanen et al. 2013; Wang et al. 2012). Third, observations on the same species are dependent when the observed environment interacts with unobserved (latent) traits, giving residual correlation among sites. This dependence is accounted for in generalized linear mixed models for trait–environment interaction (Jamil et al. 2013; Pollock et al. 2012) by using a random slopes model. Warton et al. (2015a) extended such a random slopes logistic mixed model to a model with factoranalytic terms so as to account for both dependencies and analyzed it in the Bayesian framework using Gibbs sampling.
As this brief literature review shows, there are resamplingbased (permutation and bootstrap) and modelbased approaches to apply when model assumptions are unlikely to hold true. In the former the attempt is to overcome the shortcomings of the overly simple model by resampling; in the latter, the simple model is extended until a ‘correct’ model has been found, defined as passing a number of diagnostics, so that one can then likely trust parametric (asymptotic) statistical inference. As an example, with modelbased methods it is possible to build multitrait multienvironment models in which the assumption of (conditional) independence is perhaps defensible. It is outside the scope of this paper to discuss the pros and cons of modelbased versus resamplingbased strategies, and how they might be combined. This paper takes the resampling approach using the simple Poisson model with interaction and shows by simulation that different deviations from the assumptions require different resampling methods to rescue the validity of the statistical test on trait–environment interaction. The different deviations also serve to explain why communitybased and speciesbased inference (Shipley et al. 2007) may statistically yield different results (Ackerly et al. 2002; PeresNeto et al. 2016) and why statistical tests based on communitybased resampling as in Warton et al. (2015b) may have inflated type I error when the GLM does not hold true.
The paper is structured as follows. In Sect. 2 the score test statistic on interaction is derived, extended to the multitrait multienvironmental variable case and also specialized to a number of common simple cases. In all cases, the score test statistic can be expressed in terms of the total inertia of a (doubly or singly) constrained correspondence analysis. In Sect. 3 the distribution of the test statistic is examined in the Poisson model from which it was derived and for five extended models and under four permutation schemes. Depending on the model, the permutation distribution obtained in a particular scheme does or does not correspond with the simulated distribution (i.e. the true distribution with sampling error) with only one scheme that controls the type I error in all models. This ‘max’ scheme, developed by ter Braak et al. (2012) from the sequential rejection principle, takes the maximum p value of the communitybased permutation test and the speciesbased permutation test. Section 4 gives a real data example where, as in the simulations, communitybased and speciesbased inference lead to different results, which can then be combined in the max scheme. Section 5 discusses the advantages, limitations and extensions of the approach taken in this paper and formulates the paradox that abundance is a weight in the fourthcorner correlation and a response in the loglinear model and that, nevertheless, these methods are closely related. The paradox is reconciled via a formula, well known in the literature on correspondence analysis, which expresses correspondence analysis as an approximation to a particular loglinear model and by noting that the fourthcorner correlation is the squareroot of the only nontrivial eigenvalue of a doubly constrained correspondence analysis.
2 Theory
Unless otherwise noted, the response is assumed to be count data.
2.1 Likelihood and sufficient statistics
2.2 Score test statistic
2.3 Score test statistic for the interaction parameter
2.4 Score test for multiple traits and environmental variables
This score test statistic, when divided by \(y_{++} \), is equal to the total inertia of a doubly constrained correspondence analysis (Kleyer 2012; Lavorel et al. 1999, 1998), which is the natural generalization of a singly constrained correspondence analysis, known as canonical correspondence analysis (Takane 2013; ter Braak 1986, 2014). It is also the natural generalization of RLQ (Dolédec et al. 1996; Dray et al. 2014) for trait and environmental data that are not R- and C-orthogonal.
2.5 Special cases
There are important special cases of these results.
Trait and environment variables are factors or the identity matrix
If trait and environment variables are factors so that \(\mathbf{T}\) and \(\mathbf{E}\) are indicator matrices, the score test statistic is simply the usual chisquare statistic calculated from the contingency table \(\mathbf{Y}^{te}\), say, containing the total abundance in each class of the crossclassification of the factor classes (see “Appendix”). This result was derived with saturated main effects (having free row and column parameters \(r_i \) and \(c_j\)). Brown et al. (2014) obtained a similar result from the loglinear model with T and E as main effects.
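The factor case can be illustrated with a small Python sketch, given here as an assumption-laden example rather than as the paper's own code. The helper names, the class labels and the counts are hypothetical; the aggregated table is formed as \(\mathbf{E}^{T}\mathbf{Y}\mathbf{T}\) with indicator matrices, and the usual Pearson chisquare statistic is then computed from it.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def chi_square(table):
    """Usual Pearson chi-square statistic of a two-way contingency table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    stat = 0.0
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            expected = r * c / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat

# Hypothetical data: 4 sites by 4 species.
Y = [[3, 0, 2, 1],
     [1, 1, 0, 0],
     [0, 4, 1, 2],
     [2, 2, 0, 3]]
# Indicator matrices: sites in classes (dry, wet), species in (grass, herb).
E = [[1, 0], [1, 0], [0, 1], [0, 1]]
T = [[1, 0], [0, 1], [1, 0], [0, 1]]

Y_te = matmul(matmul(transpose(E), Y), T)   # environment classes by trait classes
stat = chi_square(Y_te)
print(Y_te, round(stat, 3))
```

As the text notes, the statistic computed this way need not follow a chisquare distribution, so its significance would still be assessed by permutation.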
With \(\mathbf{T}\) and \(\mathbf{E}\) as diagonal matrices of size \(m \times m\) and \(n \times n\), respectively, the score test statistic becomes the usual chisquare statistic for contingency table \(\mathbf{Y}\). This particular case is an analysis of \(\mathbf{Y}\) without external constraining information and has no value for trait–environment analysis.
With \(\mathbf{T}\) a diagonal matrix of size \(m \times m\) and \(\mathbf{E}\) an \(n \times p\) matrix, the score test statistic becomes \(y_{++} \) times the total inertia of a canonical correspondence analysis (ter Braak 1986). The (communitybased) permutation test on the effects of environmental variables on species abundance using canonical correspondence analysis can thus be viewed as a test on the speciesbyenvironment interaction in a loglinear model using a score test statistic.
Single trait and multiple environment variables
Note that the coefficient of determination of the weighted regression of \(\mathbf{t}^{*}\) on the environmental variables divides the regression sum of squares by the sum of squares based on \(\mathbf{t}^{*}\) (instead of on \(\mathbf{t})\) and is thus a factor \(\textit{var}_R \left( \mathbf{t} \right) /\textit{var}_R \left( {\mathbf{t}^{*}} \right) \) higher, as \(\textit{var}_R \left( {\mathbf{t}^{*}} \right) \le \textit{var}_R \left( \mathbf{t} \right) \). This can give spuriously high coefficients of determination when there is in fact no relation at all. The reason is that \(\textit{var}_R \left( {\mathbf{t}^{*}} \right) \) is close to zero when there is no association between \(\mathbf{t}\) and \(\mathbf{Y}\). Similarly, the simple and multiple correlation coefficients between \(\mathbf{t}^{*}\) and the environmental variable(s) are bad test statistics (PeresNeto et al. 2016). The score test statistics derived in this paper do not have this shortcoming.
Multiple traits and single environment variable
The case of a single quantitative environment variable with multiple traits works analogously to the previous subsection with \(\mathbf{Y}\) transposed. In this case, weighted averages of the environmental variable are calculated for each species, resulting in an m-vector containing what are called species niche centroids. The vector can then be regressed on traits (Kleyer 2012; Šmilauer and Lepš 2014), analogously to the approach based on CWM.
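The two kinds of weighted averages mentioned in this and the previous subsection can be sketched in Python as follows. The function names and data are hypothetical, and CWM is read here as the community weighted mean of a trait per site, which is an assumption about the abbreviation.

```python
def community_weighted_means(Y, t):
    """Per site: abundance-weighted mean of the species trait values."""
    return [sum(y * tj for y, tj in zip(row, t)) / sum(row) for row in Y]

def species_niche_centroids(Y, e):
    """Per species: abundance-weighted mean of the site environmental values."""
    return [sum(y * ei for y, ei in zip(col, e)) / sum(col)
            for col in zip(*Y)]

# Hypothetical data: 3 sites (rows) by 3 species (columns).
Y = [[4, 1, 0],
     [2, 3, 1],
     [0, 2, 5]]
e = [1.0, 2.0, 3.0]   # environmental value per site
t = [0.5, 1.5, 4.0]   # trait value per species

cwm = community_weighted_means(Y, t)   # one value per site (n-vector)
snc = species_niche_centroids(Y, e)    # one value per species (m-vector)
print(cwm, snc)
```

Regressing these vectors on the environmental variables or the traits gives convenient summaries, but, as noted above, the resulting coefficients of determination and correlations are poor test statistics compared with the score test statistic.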
3 Distribution of the score test statistic in permutation tests and extended models
This section examines by simulation the distribution of the score test statistic developed in the previous section when the assumptions of the Poisson loglinear model hold true and shows that resampling methods that preserve the row and column totals of Y yield a distribution of the score test statistic that is within sampling variation of both the asymptotic chisquare distribution and the simulated distribution. This section also shows that particular deviations from the assumptions of the Poisson model require different resampling methods to rescue the validity of the statistical test on trait–environment interaction. The deviations lead to models that serve to explain why communitybased and speciesbased inference may statistically yield different results. In this paper the focus is on permutational methods of resampling.
The (asymptotic) distribution of the score test statistic is known to be chisquare with pq degrees of freedom (Cox and Hinkley 1974) when the statistical null model holds true, which is in our case the Poisson loglinear model (1) with \(b=0\). Appendix S1 provides code in the R language (R Core Team 2015) and results of simulations illustrating this. In these simulations, the analytical equations for the score test statistic using (7), (10), (15) and (43) give numerically the same value as the score test statistic calculated using the R package mdscore (da SilvaJunior et al. 2015); the difference between the likelihood ratio and the modified score statistic is small.
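For a single trait and a single environmental variable (\(p=q=1\)), the parametric p value can be sketched in a few lines of Python. This is an illustration with hypothetical function names, not the code of Appendix S1; it uses the standard identity that the upper tail probability of a chisquare distribution with 1 degree of freedom at x equals erfc(sqrt(x/2)).

```python
import math

def chi2_sf_1df(x):
    """Upper tail probability of a chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

def parametric_score_test(r, y_total):
    """Score statistic y_++ * r^2 and its asymptotic p value (p = q = 1)."""
    stat = y_total * r ** 2
    return stat, chi2_sf_1df(stat)

# Hypothetical input: fourthcorner correlation 0.2 from a table with 100 counts.
stat, p_value = parametric_score_test(0.2, 100)
print(round(stat, 3), round(p_value, 4))
```

This p value is only appropriate when the Poisson loglinear null model holds; for the extended models considered below it would be replaced by a permutation p value.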
Figure 1 compares the true exceedance probability of the score test statistic as estimated on the basis of 10,000 simulated data sets (vertical axis) with the exceedance probability estimated by the chisquare distribution (parametric) and as obtained from four different permutation schemes using 999 permutations each (horizontal axis) across six different datagenerating models (the \(2\times 3\) panels). Note that only the sixth model contains a true nonzero interaction between the observed trait t and the observed environmental variable e. Appendix S2 provides R code for the simulations.
The sixth model is the only nonnull (alternative) model, namely the base model extended with the terms \(b_{te} t_j e_i +b_{ze} z_j e_i \) with \(b_{te} =0.2\).
1. (rc). Randomly permute both the rows and the columns of Y with respect to each other (Dolédec et al. 1996). This scheme was first proposed by Welch (1990) for permutation testing of interaction in balanced fixedeffects twoway analysis of variance. The row and column permutations destroy any relationship between Y and E, and between Y and T, respectively.
2. (row). Randomly permute only the rows of Y. This is model 2 of Dray and Legendre (2008), destroying any relationship between Y and E only.
3. (col). Randomly permute only the columns of Y. This is model 4 of Dray and Legendre (2008), destroying any relationship between Y and T only.
4. (max). Perform a sequential test (Goeman and Solari 2010) with first the rowpermutation test using scheme 2, and, if this test is significant, then the columnpermutation test using scheme 3, or vice versa (ter Braak et al. 2012). In our case, both tests use the same score test statistic, Eq. (7), so that the sequential test (when both tests are carried out) is then equivalent to the test in which the final p value is the maximum of the two p values. Scheme 4 improves model 5 of Dray and Legendre (2008) and PeresNeto et al. (2012) in the way the final p value is calculated.
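The row, col and max schemes listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the R code of Appendix S2: the function names, the toy data, the seed and the number of permutations are hypothetical, and the statistic is taken to be \(y_{++}\) times the squared fourthcorner correlation for a single trait and a single environmental variable.

```python
import math
import random

def score_statistic(Y, e, t):
    """y_++ times the squared fourthcorner correlation of e and t given Y."""
    rows = [sum(r) for r in Y]
    cols = [sum(c) for c in zip(*Y)]
    total = sum(rows)

    def standardize(values, weights):
        mean = sum(w * v for w, v in zip(weights, values)) / total
        var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
        return [(v - mean) / math.sqrt(var) for v in values]

    es, ts = standardize(e, rows), standardize(t, cols)
    r = sum(Y[i][j] * es[i] * ts[j]
            for i in range(len(rows)) for j in range(len(cols))) / total
    return total * r ** 2

def permute(Y, scheme, rng):
    """Copy of Y with rows ('row'), columns ('col') or both ('rc') shuffled."""
    row_idx, col_idx = list(range(len(Y))), list(range(len(Y[0])))
    if scheme in ("row", "rc"):
        rng.shuffle(row_idx)
    if scheme in ("col", "rc"):
        rng.shuffle(col_idx)
    return [[Y[i][j] for j in col_idx] for i in row_idx]

def permutation_p(Y, e, t, scheme, n_perm, rng):
    observed = score_statistic(Y, e, t)
    hits = 1  # the observed table counts as one of the permutations
    for _ in range(n_perm):
        if score_statistic(permute(Y, scheme, rng), e, t) >= observed - 1e-12:
            hits += 1
    return hits / (n_perm + 1)

def max_test(Y, e, t, n_perm=199, seed=1):
    """Row and column permutation p values and their maximum."""
    rng = random.Random(seed)
    p_row = permutation_p(Y, e, t, "row", n_perm, rng)
    p_col = permutation_p(Y, e, t, "col", n_perm, rng)
    return p_row, p_col, max(p_row, p_col)

# Hypothetical counts: 6 sites by 5 species.
Y = [[5, 2, 0, 0, 1], [3, 4, 1, 0, 0], [1, 3, 3, 1, 0],
     [0, 1, 4, 2, 1], [0, 0, 2, 5, 3], [1, 0, 0, 3, 6]]
e = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]
t = [1.0, 1.7, 2.5, 3.9, 4.4]
p_row, p_col, p_max = max_test(Y, e, t)
print(p_row, p_col, p_max)
```

Shuffling whole rows or columns keeps each site's and each species' counts together, so row and column totals travel with the permuted units; only their pairing with e or t is broken.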
On the basis of a suggestion of a reviewer, two more permutation methods, which permute the trait values (or the values of the environmental variable) in inflated tables, have been evaluated in Appendix S3.
Except in the bottomright panel in Fig. 1 (i.e. the nonnull (alternative) model where \(b_{te} \ne 0\)), the ideal test in terms of type I error rates follows the 1:1 line. Lines above this line indicate liberal tests that have an elevated type I error rate (too many rejections at a specified nominal level, e.g. the horizontal dashed line at 0.05) and lines below the 1:1 line indicate conservative tests that have too few rejections at a specified nominal level. A test is said to control the type I error if its type I error rate is at most the nominal level (Goeman and Solari 2010), that is, if its lines in Fig. 1 are all at or below the 1:1 line.
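The reading of Fig. 1 described above amounts to comparing an empirical rejection rate with the nominal level. A small Python sketch, with hypothetical function names and a tolerance argument that is an assumption added to allow for simulation error, could look like this.

```python
def rejection_rate(p_values, alpha):
    """Fraction of simulated null data sets rejected at nominal level alpha."""
    return sum(p <= alpha for p in p_values) / len(p_values)

def classify(rate, alpha, tolerance=0.0):
    """Label a test as liberal, conservative or nominal at level alpha."""
    if rate > alpha + tolerance:
        return "liberal"        # above the 1:1 line
    if rate < alpha - tolerance:
        return "conservative"   # below the 1:1 line
    return "nominal"

def controls_type_i_error(rate, alpha):
    """A test controls the type I error if its rate is at most alpha."""
    return rate <= alpha

# Hypothetical p values from four null simulations.
example_rate = rejection_rate([0.01, 0.04, 0.20, 0.60], alpha=0.05)
print(example_rate, classify(0.17, 0.05), classify(0.03, 0.05))
```

In practice the p values would come from many simulated null data sets, and the tolerance would reflect the Monte Carlo error of the estimated rate.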
The exceedance probability based on the chisquare distribution with 1 degree of freedom is at the 1:1 line only for the Poisson model and is far above this line for the other models. In the toprow panels of Fig. 1, the rc, row and col schemes closely follow the 1:1 line, but the max scheme is slightly below this line, and thus conservative with an observed rejection rate in the 10,000 simulations of about 3% at the nominal 5% level of the test. Note, however, that the test is still reliable in the sense that it does not reject the null hypothesis more often than the nominal level.
In the first two panels of the bottomrow in Fig. 1, the rc scheme nearly coincides with the row scheme and the column scheme, respectively. The schemes are above the 1:1 line and thus liberal with a rejection rate of about 17% at the nominal 5% level of the test. In these panels, the max scheme nearly coincides with the column scheme and the row scheme, respectively. These schemes are about at the 1:1 line and thus have a rejection rate of about 5% at the nominal 5% level of the test.
The data generating model in the bottomright panel is the only one containing a true nonzero interaction between the observed trait t and observed environmental variable e. In this case, the ideal line is \({\Gamma } \)shaped, indicating a high rejection rate at each nominal level of the test. At the nominal level of 5%, the rejection rate is \(\sim \)0.90 for the col and max schemes and \(\sim \)0.97 for the row and rc schemes. If \(b_{ze} \) is decreased from 0.2 to 0, these rejection rates are all >0.98 in this case, and the plot is \({\Gamma } \)shaped, also for the chisquare based probability.
The conclusion from these simulations is that the use of the chisquare based probability gives highly inflated type I errors if the Poisson model does not strictly hold true. Of the investigated permutation schemes (including the two methods of Appendix S3), the max scheme is the only scheme that controls the type I error in the five investigated models with latent variables, while providing a strong statistical power when \(b_{te} \ne 0\).
4 Real data example
Different permutation schemes can also lead to different results in real data. This is illustrated here with the Dune Meadow data set (Jongman et al. 1995) consisting of abundances of 28 plants in 20 sites with five environmental variables and, from Jamil et al. (2013), five plant traits. The abundance is on a semiquantitative rank scale with integer numbers from 0 (absent) to 9 (present everywhere). For illustration purposes only, abundances are treated as counts in this example and, alternatively, converted to presence/absence. Suppose for a moment that the only available environmental variable is moisture, which is the major axis of variation of these data (Jongman et al. 1995), and one wishes to know whether it interacts with the plant trait SLA (specific leaf area). Using the fourthcorner score test statistic, the p values for the abundance data (with the p values obtained for presence/absence in this section between parentheses) for the permutation schemes rc, row and col are 0.008 (0.006), 0.028 (0.024), 0.218 (0.185), respectively (using 999 permutations). The first two schemes thus provide evidence for an interaction, whereas the col scheme does not. The simulations in Fig. 1 indicate that one possible reason for such a difference between the row and col schemes is that the environmental variable (moisture) interacts with a latent trait, even if that variable is independent of the trait of interest (SLA). There is indeed another trait in the Dune Meadow data set, namely seed mass, that has almost zero correlation with SLA (r = −0.047) and that interacts with moisture [p values of 0.0001 (0.0001) and 0.012 (0.0185) for the row and col schemes, respectively]. The p value of 0.028 (0.024) in the row scheme for testing the interaction between SLA and moisture is thus likely caused by the interaction between seed mass and moisture. There is thus no evidence in these data that SLA and moisture have a real interaction.
This example illustrates that the evidence for a trait–environment interaction is weak unless both the row and col schemes result in low p values. This line of reasoning leads naturally to the max scheme; the formal argument hinges on the theory of sequential testing (Goeman and Solari 2010) as given in ter Braak et al. (2012).
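Applied to the row and col p values reported above for the SLA by moisture interaction, the max scheme reduces to taking the larger of the two. The short Python sketch below only restates those reported numbers; the variable names are hypothetical.

```python
# Reported p values for SLA x moisture (row and col schemes, 999 permutations).
p_values = {
    "abundance": {"row": 0.028, "col": 0.218},
    "presence_absence": {"row": 0.024, "col": 0.185},
}

# Max scheme: the final p value is the maximum of the row and col p values.
p_max = {data: max(schemes.values()) for data, schemes in p_values.items()}

for data, p in p_max.items():
    verdict = "evidence" if p <= 0.05 else "no evidence"
    print(f"{data}: p_max = {p} ({verdict} of interaction at the 5% level)")
```

Both maxima exceed 0.05, in line with the conclusion that these data give no evidence of a real interaction between SLA and moisture.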
5 Discussion
This paper shows that the fourthcorner correlation, heuristically developed by Legendre et al. (1997) for examining trait–environment associations, has a close relationship with the Poisson loglinear model with interactions, which has recently been proposed as a model for trait–environment relationships (Brown et al. 2014; Warton et al. 2015b). The squared fourthcorner correlation is proportional to the score test statistic for testing the linearbylinear interaction in the Poisson loglinear model with row and column main effects. This result gives a mathematical underpinning of a conjecture that PeresNeto et al. (2016) examined by simulation, namely that the fourthcorner correlation focuses on the interaction of a Poisson loglinear model and is not sensitive to main effects. Moreover, a score test is asymptotically equivalent with the likelihood ratio test, but much quicker to compute as it does not require fitting of the alternative model. This applies particularly to the test based on the fourthcorner correlation in comparison with the test based on the Poisson deviance difference between the main effects only model and the main effects with interaction model. In our R implementation, the test using the fourthcorner correlation is 140 times quicker to compute than the GLMbased test. Note that computing time easily becomes an issue with resampling for statistical inference, particularly, in large data sets.
Ecological data are likely overdispersed. Two popular models for such data are the quasiPoisson model and the negative binomial model. The quasiPoisson model, with its variance proportional to the mean, allows a quasilikelihood approach that leads to the Poisson deviance to be minimized and thus to the same estimates as the Poisson model. In this case, the squared fourthcorner correlation is safe to use in resamplingbased (permutation or bootstrap) significance tests. That is not the case for the negative binomial model (with variance function \(\mu _{ij} +\phi \mu _{ij}^2 \) and scale parameter \(\phi \)). Then, the minimal sufficient statistics are the full data, instead of the three statistics below Eq. (3), and the score test statistic differs from the one in the Poisson model. Resampling based on the squared fourthcorner correlation or the Poisson likelihood ratio (LR) is therefore no longer optimal and power may be lost. In a small simulation study as in the sixth panel of Fig. 1 (100 data sets per scenario and 99 permutations per data set), the power of the row, col and max schemes based on the negative binomial LR was 0.96, 0.94 and 0.93, respectively. By comparison, the power of the fourthcorner test on the same data sets was estimated as 0.97, 0.88 and 0.88, respectively, confirming some loss of power compared to using the negative binomial LR. The negative binomial LR is costly computationally and potentially numerically unstable; for example, in our implementation using the R package mvabund (Wang et al. 2012), I tried to obtain results for 1000 simulations with 999 permutations, but failed due to crashes of R. Note that the negative binomial GLM requires resampling for statistical inference as parametric inference is not very trustworthy, even in simple balanced design experiments for small to moderate data set sizes (Szöcs and Schäfer 2015). It would be of interest to develop a score test in the context of the negative binomial distribution.
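The difference between the two variance functions mentioned above can be made concrete with a short Python sketch. The function names and the parameter values are hypothetical; the quasiPoisson dispersion is also written as phi here purely for convenience.

```python
def quasi_poisson_variance(mu, phi):
    """Variance proportional to the mean."""
    return phi * mu

def negative_binomial_variance(mu, phi):
    """Variance function mu + phi * mu^2 with scale parameter phi."""
    return mu + phi * mu ** 2

# Variance-to-mean ratio: constant for quasiPoisson, growing with the mean
# (as 1 + phi * mu) for the negative binomial.
for mu in (1.0, 10.0, 100.0):
    qp_ratio = quasi_poisson_variance(mu, 2.0) / mu
    nb_ratio = negative_binomial_variance(mu, 0.5) / mu
    print(mu, qp_ratio, nb_ratio)
```

The constant ratio is what leaves the Poisson estimating equations unchanged in the quasiPoisson case, whereas the mean-dependent ratio of the negative binomial changes the weighting of the observations.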
Statistical tests in this paper have used resampling, based on restricted permutation of the counts. The restrictions ensured that the row and column totals were preserved. Without restrictions, permuting residuals would have been required to preserve these totals. Moreover, unrestricted resampling would treat the data or residuals as if they were exchangeable, whereas this is unlikely due to unobserved variation between species and/or sites.
Brown et al. (2014) advocate communitybased resampling as being designbased. However, ecologists typically search for trait–environment association in observational studies. Therefore there exists no real designbased inference; the values at the sites or for species are in no way randomized by design. But it may still be hypothesized that values of traits, values of environmental variables or residuals from models are exchangeable. This viewpoint supports both communitybased and speciesbased resampling, although not necessarily completely random resampling when there is spatial or temporal autocorrelation or phylogenetic correlation.

The simulation results can be summarized as follows:

the rc scheme is not able to control the type I error rate when there is additional unobserved random variation among sites or among species that interacts with either the observed environment or the observed trait (as in the terms \(b_{ze} z_j e_i \) and \(b_{tx} t_j x_i \) in the simulation models, respectively).

the row scheme is not able to control the type I error rate when there is additional species-based random variation that interacts with the observed environment (as in the term \(b_{ze} z_j e_i\)). In this scenario, the species respond differentially to the environment, but the differential response cannot be explained by the measured trait [see Eq. (16)]. By contrast, with additional site-based random variation, the row scheme controls the type I error rate, even if it interacts with the observed trait (as in the term \(b_{tx} t_j x_i\)).

vice versa, the col scheme is not able to control the type I error rate when there is additional site-based random variation that interacts with the observed trait (as in the term \(b_{tx} t_j x_i\)). In this scenario, the species respond differentially to the trait, but the differential response cannot be explained by the measured environment. By contrast, with additional species-based random variation, the col scheme controls the type I error rate, even if it interacts with the observed environment (as in the term \(b_{ze} z_j e_i\)).

the max scheme, in which the row- and column-based tests are combined, controlled the type I error rate in scenarios with either type of random variation.
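The row, col and max schemes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code behind Fig. 1: it assumes that the row scheme permutes whole sites (rows) of the count table, that the col scheme permutes whole species (columns), that the test statistic is the squared fourth-corner correlation, and that the max scheme reports the larger of the two p values. The function names `squared_fc` and `perm_p_value` and the toy data are hypothetical.

```python
import numpy as np


def squared_fc(Y, e, t):
    """Squared weighted correlation between e (sites) and t (species), weights Y."""
    P = Y / Y.sum()
    r_w, c_w = P.sum(axis=1), P.sum(axis=0)   # site and species weights
    e_c = e - r_w @ e                          # weighted centring
    t_c = t - c_w @ t
    cov = e_c @ P @ t_c
    return cov**2 / ((r_w @ e_c**2) * (c_w @ t_c**2))


def perm_p_value(Y, e, t, scheme, n_perm=99, seed=0):
    """Permutation p value; 'row' permutes sites, 'col' permutes species."""
    rng = np.random.default_rng(seed)
    observed = squared_fc(Y, e, t)
    n_extreme = 1                              # the observed table counts once
    for _ in range(n_perm):
        if scheme == "row":
            Yp = Y[rng.permutation(Y.shape[0]), :]
        elif scheme == "col":
            Yp = Y[:, rng.permutation(Y.shape[1])]
        else:
            raise ValueError("scheme must be 'row' or 'col'")
        if squared_fc(Yp, e, t) >= observed - 1e-12:
            n_extreme += 1
    return n_extreme / (n_perm + 1)


# Toy data (illustrative only): 12 sites, 8 species, no true association.
rng = np.random.default_rng(1)
Y = rng.poisson(2.0, size=(12, 8)).astype(float)
e = rng.normal(size=12)
t = rng.normal(size=8)

p_row = perm_p_value(Y, e, t, "row", seed=2)
p_col = perm_p_value(Y, e, t, "col", seed=3)
p_max = max(p_row, p_col)                      # max scheme: both tests must reject
print(p_row, p_col, p_max)
```

Because whole rows or whole columns are moved, each species keeps its abundance profile across sites (col scheme) or each site keeps its community composition (row scheme), which is what makes the two schemes robust to different kinds of unobserved variation.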
Our simulation confirmed the remark of Brown et al. (2014) that community-based resampling “enables valid inferences that are robust to correlation between species, even when such correlation has not been incorporated into the fitted model”: the simulation in Fig. 1 (middle panel in second row) had correlations among species due to a latent environmental variable x that was uncorrelated with the observed variable e. The row scheme gave a correct type I error rate, but the col scheme did not, as species were correlated. Conversely, when there were dependencies among sites due to a latent trait z, row-based resampling gave an inflated type I error rate, but column-based resampling did not (left panel in second row of Fig. 1). When either one or the other situation could be present, the max scheme is a solution to valid inference. When both situations are likely present, the max scheme also shows moderate type I error rate inflation. One way out might be some form of p value adjustment estimated via simulation (to undo the possible type I error rate inflation noted in the previous paragraph); the other, more elaborate, option is explicit modeling of the correlations in a GLMM. Both options are outside the scope of this paper. Of course, for observational data, any estimated correlation or association does not imply causation.
A reviewer raised serious objections against any permutation method that is based on permuting species, arguing that “species (columns) are not the sampling units, they are out of the control of the experimenter and are generally assumed to be correlated due to species interactions and missing predictors”. I add phylogenetic relationships to this (see below). Therefore “Resampling species makes no sense from a design perspective, irrespective of the presence or absence of species-by-environment interaction effects”. The danger is that a statistical test using species-based resampling may have an inflated type I error rate (i.e. is too liberal). Let me put this into the context of the max test. If the species-based resampling test is not performed, the final p value is the one from site-based resampling. The p value of the species part of the max test is then effectively nil (under the true null hypothesis, the null hypothesis is always rejected), which corresponds to the maximum type I error rate inflation possible. One is thus better off applying the species-based test than not applying it, even if the above-mentioned danger of some type I error rate inflation is real.
Note that, as yet, no valid GLM-based statistical test of species-by-environment interaction has been proposed. For example, the site-based residual bootstrapping approach of Warton et al. (2015b) suffers from the same type I error rate inflation as the simple site-based permutation scheme in the scenario of Fig. 1 that includes the \(b_{ze} z_j e_i \) term (ter Braak et al. 2016). This inflation can also be counteracted by adding species-based resampling as in the max approach (ter Braak et al. 2016). Note also that missing predictors (either as main effects or interactions) are no problem as long as they do not interact with the observed trait and the observed environment. An example of this is the random interaction scenario in Fig. 1.
Completely random permutations of species and/or of sites were used in this paper. This needs further adaptation, as sites may be structured in space (spatial autocorrelation) and time (temporal autocorrelation) and species form a phylogeny (phylogenetic autocorrelation), so that neither sites nor species are really completely independent or exchangeable units. The net effect will be that the effective number of units is actually smaller than the number observed in the data (i.e. loss of degrees of freedom through autocorrelation), likely generating a liberal test when random permutations are used. Possible alternatives for random permutations are restricted permutations (Lapointe and Garland 2014) or data simulation that keeps the original spatial or phylogenetic structure in the data (Wagner and Dray 2015). In this kind of hypothesis testing, phylogeny is treated as a nuisance: a trait–environment association is only judged valid when the association contributes beyond contributions due to phylogenetic relatedness. For prediction, such a strong requirement is not needed. Prediction of the abundance of a new species is expected to be better (with and without taking its trait value into account) the closer it is in the phylogeny to the species present in the data set.
The score test statistic for testing the slope parameter in a simple regression is the sample size multiplied by the squared Pearson correlation (Bera and Bilias 2001). This result aligns nicely with the fourth-corner correlation defined as the Pearson correlation on inflated trait and environment data, but does not help to understand the link with the Poisson loglinear model. For this, the link between the fourth-corner correlation and correspondence analysis is more helpful as indicated in Sect. 2.3 and in more detail in the next paragraph.
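The equivalence between the fourth-corner correlation and the Pearson correlation on inflated data can be checked numerically. The sketch below is a minimal Python illustration with made-up data; the function names `weighted_fc` and `inflated_fc` and the toy table are assumptions for this example only. It computes the correlation once as a count-weighted correlation and once by repeating each (environment, trait) pair as many times as the corresponding count, and then forms the total count times the squared correlation.

```python
import numpy as np


def weighted_fc(Y, e, t):
    """Fourth-corner correlation as a count-weighted Pearson correlation."""
    P = Y / Y.sum()
    r_w, c_w = P.sum(axis=1), P.sum(axis=0)   # site and species weights
    e_c = e - r_w @ e
    t_c = t - c_w @ t
    return (e_c @ P @ t_c) / np.sqrt((r_w @ e_c**2) * (c_w @ t_c**2))


def inflated_fc(Y, e, t):
    """Pearson correlation after inflating e and t to one row per individual."""
    i, j = np.nonzero(Y)
    reps = Y[i, j]                             # number of individuals per cell
    return np.corrcoef(np.repeat(e[i], reps), np.repeat(t[j], reps))[0, 1]


# Toy data (illustrative only): 4 sites by 3 species.
Y = np.array([[3, 0, 1],
              [1, 2, 0],
              [0, 1, 4],
              [2, 2, 2]])
e = np.array([0.1, 0.5, 0.9, 0.4])             # environmental variable per site
t = np.array([1.0, 2.0, 3.5])                  # trait value per species

r_weighted = weighted_fc(Y, e, t)
r_inflated = inflated_fc(Y, e, t)
score = Y.sum() * r_weighted**2                # total count times squared correlation
print(r_weighted, r_inflated, score)
```

The two correlations agree up to floating-point error because the Pearson correlation does not depend on whether variances are divided by the total count or by the total count minus one.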
Acknowledgements
I would like to thank Pedro Peres-Neto and Stéphane Dray for discussions and suggestions and two anonymous reviewers for comments that improved the text.
References
 Ackerly D, Knight C, Weiss S, Barton K, Starmer K (2002) Leaf size, specific leaf area and microhabitat distribution of chaparral woody plants: contrasting patterns in species level and community level analyses. Oecologia 130:449–457. doi: 10.1007/s004420100805
 Bera AK, Bilias Y (2001) Rao’s score, Neyman’s C(\(\upalpha \)) and Silvey’s LM tests: an essay on historical developments and some new results. J Stat Plan Inference 97:9–44. doi: 10.1016/S03783758(00)003438
 Brookes M (2011) The matrix reference manual. http://www.ee.ic.ac.uk/hp/staff/dmb/matrix/identity.html#InvLemma (online)
 Brown AM, Warton DI, Andrew NR, Binns M, Cassis G, Gibb H (2014) The fourth-corner solution–using predictive models to understand how species traits interact with the environment. Methods Ecol Evol 5:344–352. doi: 10.1111/2041210x.12163
 Cailliez F, Pagès JP (1976) Introduction à l’Analyse des Données. Société de Mathématiques Appliquées et de Sciences Humaines, Paris
 Cox DR, Hinkley DV (1974) Theoretical statistics. Chapman and Hall, London
 da Silva-Junior AHM, da Silva DN, Ferrari SLP (2015) mdscore: an R package to compute improved score tests in generalized linear models. J Stat Softw 61:1–16. doi: 10.18637/jss.v061.c02
 Dolédec S, Chessel D, ter Braak CJF, Champely S (1996) Matching species traits to environmental variables: a new three-table ordination method. Environ Ecol Stat 3:143–166
 Dray S, Dufour AB (2007) The ade4 package: implementing the duality diagram for ecologists. J Stat Softw 22:1–20. doi: 10.18637/jss.v022.i04
 Dray S, Legendre P (2008) Testing the species traits-environment relationships: the fourth-corner problem revisited. Ecology 89:3400–3412. doi: 10.1890/080349.1
 Dray S, Choler P, Dolédec S, Peres-Neto PR, Thuiller W, Pavoine S, ter Braak CJF (2014) Combining the fourth-corner and the RLQ methods for assessing trait responses to environmental variation. Ecology 95:14–21. doi: 10.1890/130196.1
 Gabriel KR (1998) Generalised bilinear regression. Biometrika 85:689–700. doi: 10.1093/biomet/85.3.689
 Goeman JJ, Solari A (2010) The sequential rejection principle of familywise error control. Ann Stat 38:3782–3810. doi: 10.1214/10AOS829
 Goodman LA (1979) Simple models for the analysis of association in cross-classifications having ordered categories. J Am Stat Assoc 74:537–552. doi: 10.2307/2286971
 Goodman LA (1981) Association models and canonical correlation in the analysis of cross-classifications having ordered categories. J Am Stat Assoc 76:320–334. doi: 10.2307/2287833
 Greenacre MJ (1984) Theory and applications of correspondence analysis. Academic Press, London
 Greenacre M (2007) Correspondence analysis in practice, 2nd edn. Chapman and Hall/CRC, London
 Jamil T, Ozinga WA, Kleyer M, ter Braak CJF (2013) Selecting traits that explain species-environment relationships: a generalized linear mixed model approach. J Veg Sci 24:988–1000. doi: 10.1111/j.16541103.2012.12036.x
 Jongman RHG, ter Braak CJF, van Tongeren OFR (1995) Data analysis in community and landscape ecology. Cambridge University Press, Cambridge. doi: 10.2277/0521475740
 Kleyer M et al (2012) Assessing species and community functional responses to environmental gradients: which multivariate methods? J Veg Sci 23:805–821. doi: 10.1111/j.16541103.2012.01402.x
 Lapointe FJ, Garland T Jr (2014) A generalized permutation model for the analysis of cross-species data. J Classif 18:109–127. doi: 10.1007/s0035700100070
 Lavorel S, Touzard B, Lebreton JD, Clément B (1998) Identifying functional groups for response to disturbance in an abandoned pasture. Acta Oecol 19:227–240. doi: 10.1016/S1146609X(98)800271
 Lavorel S, Rochette C, Lebreton JD (1999) Functional groups for response to disturbance in Mediterranean old fields. Oikos 84:480–498. doi: 10.2307/3546427
 Lavorel S et al (2008) Assessing functional diversity in the field–methodology matters! Funct Ecol 22:134–147. doi: 10.1111/j.13652435.2007.01339.x
 Legendre P, Galzin RG, Harmelin-Vivien ML (1997) Relating behavior to habitat: solutions to the fourth-corner problem. Ecology 78:547–562. doi: 10.2307/2266029
 Manly BFJ (2006) Randomization, bootstrap and Monte Carlo methods in biology, 3rd edn. Chapman and Hall, London
 McGill BJ, Enquist BJ, Weiher E, Westoby M (2006) Rebuilding community ecology from functional traits. Trends Ecol Evol 21:178–185. doi: 10.1016/j.tree.2006.02.002
 Oksanen J et al (2013) vegan: community ecology package. R package version 2.0-9. http://www.cran.r-project.org/
 Peres-Neto PR, Leibold MA, Dray S (2012) Assessing the effects of spatial contingency and environmental filtering on metacommunity phylogenetics. Ecology 93:S14–S30. doi: 10.1890/110494.1
 Peres-Neto PR, Dray S, ter Braak CJF (2016) Linking trait variation to the environment: critical issues with community-weighted mean correlation resolved by the fourth-corner approach. Ecography. doi: 10.1111/ecog.02302
 Pollock LJ, Morris WK, Vesk PA (2012) The role of functional traits in species distributions revealed through a hierarchical model. Ecography 35:716–725. doi: 10.1111/j.16000587.2011.07085.x
 R Core Team (2015) R: a language and environment for statistical computing, version 3.0. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org
 Rao CR (1973) Linear statistical inference and its application, 2nd edn. Wiley, New York
 Shipley B, Vile D, Garnier É (2007) Response to comments on “From plant traits to plant communities: a statistical mechanistic approach to biodiversity”. Science 316:1425–1425. doi: 10.1126/science.1140372
 Šmilauer P, Lepš J (2014) Multivariate analysis of ecological data using CANOCO 5, 2nd edn. Cambridge University Press, Cambridge
 Southwood TRE (1977) Habitat, the templet for ecological strategies? J Anim Ecol 46:337–365. doi: 10.2307/3817
 Szöcs E, Schäfer R (2015) Ecotoxicology is not normal. Environ Sci Pollut Res. doi: 10.1007/s1135601545793
 Takane Y (2013) Constrained principal component analysis and related techniques. Chapman and Hall/CRC, London
 Tenenhaus M, Young FW (1985) An analysis and synthesis of multiple correspondence analysis, optimal scaling, dual scaling, homogeneity analysis and other methods for quantifying categorical multivariate data. Psychometrika 50:91–119. doi: 10.1007/bf02294151
 ter Braak CJF (1985) Correspondence analysis of incidence and abundance data: properties in terms of a unimodal response model. Biometrics 41:859–873. doi: 10.2307/2530959
 ter Braak CJF (1986) Canonical correspondence analysis: a new eigenvector technique for multivariate direct gradient analysis. Ecology 67:1167–1179. doi: 10.2307/1938672
 ter Braak CJF (1988) Partial canonical correspondence analysis. In: Bock HH (ed) Classification and related methods of data analysis. Elsevier Science Publishers B.V. (NorthHolland), Amsterdam, pp 551–558. http://edepot.wur.nl/241165
 ter Braak CJF (2014) History of canonical correspondence analysis. In: Blasius J, Greenacre M (eds) Visualization and verbalization of Data. Chapman and Hall/CRC, London, pp 61–75
 ter Braak CJF, Šmilauer P (2012) Canoco reference manual and user’s guide: software for ordination, version 5.0. Microcomputer Power, Ithaca
 ter Braak CJF, Cormont A, Dray S (2012) Improved testing of species traits-environment relationships in the fourth-corner problem. Ecology 93:1525–1526. doi: 10.1890/120126.1
 ter Braak CJF, Peres-Neto PR, Dray S (2016) A critical issue in model-based inference for studying trait-based community assembly and a solution. PeerJ 5:e2885. doi: 10.7717/peerj.2885
 Townsend CR, Hildrew AG (1994) Species traits in relation to a habitat templet for river systems. Freshw Biol 31:265–275. doi: 10.1111/j.13652427.1994.tb01740.x
 Wagner HH, Dray S (2015) Generating spatially constrained null models for irregularly spaced data using Moran spectral randomization methods. Methods Ecol Evol 6:1169–1178. doi: 10.1111/2041210x.12407
 Wang Y, Naumann U, Wright ST, Warton DI (2012) mvabund–an R package for model-based analysis of multivariate abundance data. Methods Ecol Evol 3:471–474. doi: 10.1111/j.2041210X.2012.00190.x
 Warton DI (2005) Many zeros does not mean zero inflation: comparing the goodness-of-fit of parametric models to multivariate abundance data. Environmetrics 16:275–289. doi: 10.1002/env.702
 Warton DI, Blanchet FG, O’Hara RB, Ovaskainen O, Taskinen S, Walker SC, Hui FKC (2015a) So many variables: joint modeling in community ecology. Trends Ecol Evol 30:766–779. doi: 10.1016/j.tree.2015.09.007
 Warton DI, Shipley B, Hastie T (2015b) CATS regression–a model-based approach to studying trait-based community assembly. Methods Ecol Evol 6:389–398. doi: 10.1111/2041210x.12280
 Welch WJ (1990) Construction of permutation tests. J Am Stat Assoc 85:693–698. doi: 10.2307/2290004
 Yee TW (2015) Vector generalized linear and additive models with an implementation in R. Appendix A, Springer, New York
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.