Algorithms and biplots for double constrained correspondence analysis
Abstract
Correspondence analysis with linear external constraints on both the rows and the columns has been mentioned in the ecological literature, but lacks full mathematical treatment and easily available algorithms and software. This paper fills this gap by defining the method as maximizing the fourth-corner correlation between linear combinations, by providing novel algorithms, which demonstrate relationships with related methods, and by making a detailed study of possible biplots and associated approximations. The method is illustrated using ecological data on the abundances of species in sites, where the species are characterized by traits and the sites by environmental variables. The trait data and environment data form the external constraints and the question is which traits and environmental variables are associated, how these associations drive species abundances and how they can be displayed in biplots. With microbiome data becoming widely available, these and related multivariate methods deserve more study as they might be routinely used in the future.
Keywords
Biplot · Canonical correlation analysis · Canonical correspondence analysis · Community ecology · Fourth-corner correlation · Multivariate analysis · Trait-environment relations
1 Introduction
Double constrained correspondence analysis (dcCA) was developed by Jean-Dominique Lebreton, Robert Sabatier and co-workers as a natural extension of canonical correspondence analysis (Lebreton et al. 1988a, b; ter Braak 1986, 1987) and applied in studies relating species attributes (traits) and environmental variables via the central sites-by-species abundance table (Lavorel et al. 1999, 1998). Canonical correspondence analysis (CCA) constrains the row scores of a correspondence analysis (CA) of the central table by linear combinations of environmental variables. In addition to a single constraint on the rows, dcCA also uses a constraint on the columns. In applications, the column (species) scores are constrained by linear combinations of species traits. Lavorel et al. (1999) therefore nicknamed the method double CCA, and Kleyer et al. (2012) also use this name. Here the method is abbreviated to dcCA. Its inputs are three data tables: the central site-by-species table Y containing the abundances (\(\ge 0\)) of the species in the sites (for short, the community table), the site-by-environment table E, and the species-by-traits table T. To give another application domain, the central table may contain preference scores of persons on different products and the tables E and T person and product characteristics, respectively.
Curiously, dcCA has never been explicitly defined mathematically in a scientific paper, presumably because its mathematics was crystal clear to its inventors. An exception is perhaps Böckenholt and Böckenholt (1990) but they defined the constraints in a different way, namely by the null space method (Takane 2013). As some applications of this method did appear, its novelty was gone and no thorough statistical description of the method was published. For Takane (2013), dcCA is only a special case of constrained principal component analysis, but there is more to say than that, as Hill (1974) said for CA or Tenenhaus and Young (1985) for multiple CA. Meanwhile, Kleyer et al. (2012) evaluated several methods for analysing trait-environment relationships, including dcCA, and did not detect any advantage of dcCA over rival methods such as RLQ analysis (Dolédec et al. 1996), another three-table ordination method, and redundancy analysis on community weighted means (CWM-RDA), which consists in combining tables T and Y to build a new table of community weighted trait means (CWMs) that is related to E, in a second step, by a two-table method. All these methods are closely linked but differ notably in whether they take account of correlations among traits and/or among environmental variables by built-in multiple regressions. RLQ is based on PLS-like regressions and thus does not take account of any of these correlations, CWM-RDA takes account of the correlations among environmental variables, and dcCA takes account of both. As a consequence, RLQ is more robust and can analyse any number of traits and environmental variables, whereas in CWM-RDA the number of environmental variables must not be large compared to the number of sites, and in dcCA also the number of trait variables must not be large compared to the number of species. But provided care has been taken of these limitations, for example by prior dimension reduction (Lavorel et al. 1999) or variable selection (ter Braak and Verdonschot 1995), dcCA can reveal trait and environment associations that cannot be identified in RLQ as we demonstrate in Sect. 4.
Reasons for the renewed interest in dcCA are its relation with the fourth-corner approach (ter Braak 2017), the fact that its inertia is a Rao score test statistic for testing trait-environment association in a log-linear model (ter Braak 2017), and also the desirability of regression-based methods for trait-environment relations (Peres-Neto et al. 2017; Warton et al. 2015) where abundances or community weighted means are considered as the response variables. In this paper the focus is on algorithms for dcCA. The first one is based on a singular value decomposition (SVD), the second one is an iteration algorithm based on the transition formulae, and the third is based on a combination of CCA and redundancy analysis (RDA), which are widely available together with associated methods for statistical testing and selection of variables (Dray and Dufour 2007; Oksanen et al. 2013; ter Braak and Šmilauer 2012).
These novel algorithms come in addition to three existing algorithms. The first is based on the observation of Lavorel et al. (1999) that dcCA can be seen as a canonical correlation analysis of traits and environment weighted by the central table. The input data would be inflated trait and environment data tables as exemplified in Dray and Legendre (2008); see also ter Braak (2017). Although canonical correlation analysis is widely available in statistical packages, a weighted version does not exist to our knowledge; the unweighted version could be used only for integer abundance data after ‘super’ inflation of the data so that every individual has a separate row.
In practice, Lavorel et al. (1999) used another algorithm developed by Bacou et al. (1989). This second algorithm consisted of two CCAs: a traditional CCA of the abundance table with respect to the environment data from which a table of fitted abundances is computed by the reconstitution formula known from CA, and then another CCA of the transposed fitted table with respect to the trait data. A problem with this approach is that the fitted table may contain negative values and most programs for CCA do not allow negative values in the abundance table. Lavorel et al. (1999) do not mention this problem or how they dealt with it. Presumably their software allowed negative abundance values in a CCA.
The third algorithm is in the Appendices S2 and S4 of Kleyer et al. (2012). It starts from the results of a correspondence analysis of the central table, in line with how a canonical correspondence analysis is computed in the R package ade4 (Dray and Dufour 2007; R Core Team 2015). In this algorithm, the CA is foremost used to create row and column weights and a transformed abundance table that is projected on the environmental variables. The result is projected onto the traits, the result of which is finally rotated to principal axes. The projections are obtained by weighted regression. This approach, implemented in R function doublerda in Kleyer et al. (2012), is generic in that it can also be used for a double constrained principal component analysis (dcPCA) when starting from a principal component analysis of the central table. See Takane (2013) for a related generic framework. The second and third algorithms share the feature that they work on fitted site-by-species tables.
Our new algorithms do not create such tables. The iteration algorithm works directly on the original data tables Y, E and T, the SVD on the matrix product of Y and orthogonalized E and T (say \(\mathbf{E}^{*}\) and \(\mathbf{T}^{*}\)), and the combination of CCA and RDA can be seen as a variant of CWM-RDA in which the community-weighted trait means (CWMs), usually obtained by combining Y and T, are computed with respect to \(\mathbf{T}^{*}\) instead of to T. Conversely, we also describe an alternative SNC-RDA in which the species-niche centroids (SNCs), usually computed by combining Y and E (see Peres-Neto et al. 2017 for more details), are computed with respect to \(\mathbf{E}^{*}\) instead of to E. The CWMs and SNCs are of smaller dimensions than the original sites-by-species table Y, when the number of environmental variables is smaller than the number of sites and the number of traits is smaller than the number of species, respectively.
The paper is structured as follows. Section 2 defines dcCA, presents the new algorithms, and defines some additional quantities that are useful for summarizing and plotting the result in biplots. Section 3 discusses the available biplots in detail and Sect. 4 presents a data example and a simulated example comparing dcCA with RLQ. Most derivations are given in the Appendices; all equations are illustrated on real data in an R script in Supplementary “Appendix S1”.
2 Theory and method
2.1 Notation
In the paper, bold lower case is used for column vectors, e.g. \(\mathbf{x}\) is a vector with elements \(\left\{ {x_{i} } \right\} \), i = \(1,\ldots , n\). Bold upper case is used for matrices. Elements of Y will be denoted by \(y_{ij} \); subscript i denotes a site (row of Y or E) and subscript j denotes a species (column of Y or row of T). A symbol “+” replacing an index means the sum over the index, e.g. \(y_{i+} =\mathop \sum \nolimits _j y_{ij} \). Further, \(\mathbf{R}\) and \(\mathbf{K}\) are diagonal matrices with the weights \(\left\{ {y_{i+} } \right\} \) and \(\{ {y_{+j} } \}\), respectively, on their diagonal.
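As a small illustration of this notation, the following sketch builds the marginal totals and the diagonal weight matrices from a community table. The toy table `Y` is hypothetical and serves only to make the symbols concrete.

```python
import numpy as np

# Hypothetical toy community table: 4 sites (rows) by 3 species (columns).
Y = np.array([[3.0, 0.0, 1.0],
              [1.0, 2.0, 0.0],
              [0.0, 4.0, 2.0],
              [2.0, 1.0, 1.0]])

row_totals = Y.sum(axis=1)   # y_{i+}: total abundance per site
col_totals = Y.sum(axis=0)   # y_{+j}: total abundance per species
grand_total = Y.sum()        # y_{++}: overall total

R = np.diag(row_totals)      # diagonal matrix of site weights
K = np.diag(col_totals)      # diagonal matrix of species weights
```

In the later sketches the weights are kept as vectors rather than diagonal matrices, which avoids forming \(n \times n\) and \(m \times m\) matrices.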
2.2 Definition and corresponding equations
Definition
dcCA is a method that finds linear combinations of the traits and of the environmental variables such that the fourth-corner correlation between these linear combinations is maximized.
The definition leads to an eigenequation of which the first (non-trivial) eigenvector is the solution. Later axes are then subsequent eigenvectors which have the interpretation that they also maximize the fourth-corner correlation but subject to the additional constraint of their orthogonality (in a particular metric) to the previous axes. The definition is in line with CA as a method of optimal scaling of the row and column categories of a contingency table (Gifi 1990; Hill 1974).
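To make the quantity being maximized concrete, the sketch below computes the fourth-corner correlation between a vector of site scores and a vector of species scores, that is, their correlation when every individual counted in \(\mathbf{Y}\) is treated as one observation. The function name and its arguments are illustrative assumptions, not taken from the paper or its supplementary script.

```python
import numpy as np

def fourth_corner_correlation(Y, x, u):
    """Correlation between site scores x and species scores u, weighted by Y."""
    Y = np.asarray(Y, dtype=float)
    r = Y.sum(axis=1)                 # site totals y_{i+}
    k = Y.sum(axis=0)                 # species totals y_{+j}
    total = Y.sum()
    xc = x - (r @ x) / total          # centre x with site weights
    uc = u - (k @ u) / total          # centre u with species weights
    cov = xc @ Y @ uc                 # weighted cross-product
    return cov / np.sqrt((r * xc**2).sum() * (k * uc**2).sum())
```

In dcCA, `x` would be a linear combination of the environmental variables and `u` a linear combination of the traits; the method searches for the combinations that make this value as large as possible.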

The WA species scores (\(u_k^{*}\)) are (proportional to) weighted averages of LC site scores (i.e. the constrained site scores, \(\mathbf{x}\)).

Canonical weights for the traits (\(\mathbf{c}\)) are coefficients of the regression of the WA species scores (\(u_k^{*}\)) on the traits (\(\mathbf{T}\)) with the species total abundances (\(y_{+k}\)) as weights.

LC species scores (i.e. the constrained species scores, \(\mathbf{u}\)) are a linear combination of the traits.

WA site scores (\(x_i^{*}\)) are (proportional to) weighted averages of LC species scores (\(\mathbf{u}\)).

Canonical weights for the environmental variables (\(\mathbf{b}\)) are coefficients of the regression of the WA site scores (\(x_i^{*}\)) on the environmental variables (\(\mathbf{E}\)) with the site total abundances (\(y_{i+}\)) as weights.

LC site scores (constrained site scores, \(\mathbf{x}\)) are a linear combination of the environmental variables (\(\mathbf{E}\)).
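The six statements above can be written compactly in matrix form. The display below is a reconstruction from the verbal description only, not a copy of the paper's numbered equations. It assumes that \(\mathbf{E}\) and \(\mathbf{T}\) are centred with weights \(\mathbf{R}\) and \(\mathbf{K}\), respectively, and it leaves the axis scaling open by using \(\propto\) in one step.

```latex
\begin{align*}
\mathbf{u}^{*} &= \mathbf{K}^{-1}\mathbf{Y}^{T}\mathbf{x}, &
\mathbf{c} &= (\mathbf{T}^{T}\mathbf{K}\mathbf{T})^{-1}\mathbf{T}^{T}\mathbf{K}\mathbf{u}^{*}, &
\mathbf{u} &= \mathbf{T}\mathbf{c},\\
\mathbf{x}^{*} &= \mathbf{R}^{-1}\mathbf{Y}\mathbf{u}, &
\mathbf{b} &\propto (\mathbf{E}^{T}\mathbf{R}\mathbf{E})^{-1}\mathbf{E}^{T}\mathbf{R}\mathbf{x}^{*}, &
\mathbf{x} &= \mathbf{E}\mathbf{b}.
\end{align*}
```

Substituting each line into the next shows that one full cycle maps \(\mathbf{b}\) to \((\mathbf{E}^{T}\mathbf{R}\mathbf{E})^{-1}\mathbf{E}^{T}\mathbf{Y}\mathbf{T}(\mathbf{T}^{T}\mathbf{K}\mathbf{T})^{-1}\mathbf{T}^{T}\mathbf{Y}^{T}\mathbf{E}\,\mathbf{b}\), so a solution satisfies an eigenequation in \(\mathbf{b}\) whose eigenvalue \(\lambda\) is the squared fourth-corner correlation of the axis.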
Special cases of dcCA are, of course, CCA, if \(\mathbf{T}=\mathbf{I}_m \) or \(q \ge m\) so that there are effectively no constraints on the species, and CA, if also \(\mathbf{E}=\mathbf{I}_n \) or \(p \ge n\).
2.3 Algorithm based on a SVD
This approach immediately solves for all dcCA eigenvectors. The LC site scores and LC species scores of all dimensions, \(\mathbf{X}=\mathbf{EB}\) and \(\mathbf{U}=\mathbf{TC}\), are \(\mathbf{R}\)- and \(\mathbf{K}\)-orthogonal, respectively, and the scaling factor \({\varvec{\Delta }}^{\alpha }\) in Eq. (17) ensures that \(\mathbf{X}^{T}\mathbf{R}\mathbf{X}={\varvec{\Lambda }}^{\alpha }\) and \(\mathbf{U}^{T}\mathbf{K}\mathbf{U}={\varvec{\Lambda }}^{1-\alpha }\), where \({\varvec{\Lambda }}={\varvec{\Delta }}^{2}\), as in the transition formulae.
Note that \(tr\left( {\mathbf{D}^{T}\mathbf{D}} \right) \) is equal to the sum of all eigenvalues satisfying Eq. (8), also known as the total inertia of the dcCA. ter Braak (2017) showed that \(y_{++} tr\left( {\mathbf{D}^{T}\mathbf{D}} \right) \) is the Rao score test statistic for testing the trait-environment interaction in a Poisson log-linear model with saturated main effects. The first eigenvalue is the one-dimensional replacement thereof that could also be useful as test statistic in permutation tests if the alternative hypothesis is likely to be one-dimensional.
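A minimal NumPy sketch of an SVD-based computation is given below. It assumes that \(\mathbf{D}=\mathbf{E}^{*T}\mathbf{Y}\mathbf{T}^{*}\) with \(\mathbf{E}^{*}\) and \(\mathbf{T}^{*}\) the \(\mathbf{R}\)- and \(\mathbf{K}\)-orthonormalized versions of the centred \(\mathbf{E}\) and \(\mathbf{T}\), that both tables have full column rank, and that all row and column totals of \(\mathbf{Y}\) are positive. The orthonormalization by QR decomposition and the function names are choices made for this illustration, not necessarily those of the paper.

```python
import numpy as np

def weighted_orthonormalize(Z, w):
    """Centre Z with weights w and return Z* with Z*^T diag(w) Z* = I."""
    Zc = Z - (w @ Z) / w.sum()
    Q, _ = np.linalg.qr(np.sqrt(w)[:, None] * Zc)
    return Q / np.sqrt(w)[:, None]

def dcca_svd(Y, E, T, alpha=1.0):
    """All dcCA axes at once from the SVD of D = E*^T Y T*."""
    r, k = Y.sum(axis=1), Y.sum(axis=0)       # site and species totals
    E_star = weighted_orthonormalize(E, r)
    T_star = weighted_orthonormalize(T, k)
    D = E_star.T @ Y @ T_star
    P, delta, Qt = np.linalg.svd(D, full_matrices=False)
    X = E_star @ P * delta**alpha             # LC site scores
    U = T_star @ Qt.T * delta**(1.0 - alpha)  # LC species scores
    return X, U, delta
```

The singular values in `delta` are the maximized fourth-corner correlations per axis, `delta**2` are the eigenvalues, and their sum is the total inertia \(tr(\mathbf{D}^{T}\mathbf{D})\).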
2.4 Algorithm based on the transition formulae
If only the first or only a few eigenvectors need to be calculated, iterative methods can be used that repeatedly cycle through the transition formulae starting from arbitrary, non-constant starting values for either \(\mathbf{b}\) or \(\mathbf{c}\), when using Eqs. (6) and (7), and for either \(\mathbf{b}\), \(\mathbf{c}\), \(\mathbf{x}^{*}\) or \(\mathbf{u}^{*}\), when using Eqs. (9)–(14). The latter is a constrained reciprocal averaging algorithm (Hill 1973). Such an algorithm for CCA is described in the “Appendix” of ter Braak and Prentice (1988) and the extension to dcCA is trivial. This iterative algorithm is related to the well-known power algorithm for solving eigenproblems (Good 1969; Gourlay and Watson 1973). Power algorithms tend to be slow, but acceleration methods make them practical.
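The cycle can be sketched as a plain power iteration for the first axis. This is an illustration under stated assumptions, not the paper's implementation: \(\mathbf{E}\) and \(\mathbf{T}\) are centred inside the function, the starting vector is an arbitrary choice, no acceleration is applied, and the function name and stopping rule are hypothetical.

```python
import numpy as np

def dcca_first_axis(Y, E, T, n_iter=500, tol=1e-12):
    """First dcCA axis by cycling through weighted averaging and regression."""
    r, k = Y.sum(axis=1), Y.sum(axis=0)
    E = E - (r @ E) / r.sum()              # centre with site weights
    T = T - (k @ T) / k.sum()              # centre with species weights
    b = np.ones(E.shape[1])                # arbitrary starting weights
    lam = 0.0
    for _ in range(n_iter):
        x = E @ b                          # LC site scores
        x /= np.sqrt(r @ x**2)             # normalise so that x^T R x = 1
        u_star = (Y.T @ x) / k             # WA species scores
        c = np.linalg.solve(T.T @ (k[:, None] * T), T.T @ (k * u_star))
        u = T @ c                          # LC species scores
        x_star = (Y @ u) / r               # WA site scores
        b_new = np.linalg.solve(E.T @ (r[:, None] * E), E.T @ (r * x_star))
        lam_new = np.sqrt(r @ (E @ b_new)**2)   # growth factor per cycle
        converged = abs(lam_new - lam) < tol
        b, lam = b_new, lam_new
        if converged:
            break
    return lam, x, u
```

At convergence `lam` approximates the first eigenvalue, so `np.sqrt(lam)` is the maximized fourth-corner correlation. The returned scores satisfy \(\mathbf{x}^{T}\mathbf{R}\mathbf{x}=1\) and \(\mathbf{u}^{T}\mathbf{K}\mathbf{u}\approx \lambda\), which corresponds to the scaling \(\alpha =0\).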
2.5 Algorithm based on combining CCA and a weighted RDA
This subsection presents an algorithm based on a combination of CCA and RDA. The algorithm also gives insight into the relation between dcCA and CWM-RDA.
1. Transforming the traits to \(\mathbf{K}\)-orthonormal ones using species weights \(\mathbf{K}\),
2. Computing community weighted means of the orthonormal traits \(\mathbf{T}^{*}\) and then
3. Applying a weighted RDA of these community weighted means with respect to the environmental data \(\mathbf{E}\) using site weights \(\mathbf{R}\).
The above Steps 1 and 2 can be combined in a single CCA of the transposed abundance table with respect to the traits, i.e. CCA(\(\mathbf{Y}^{T}\sim \mathbf{T}\)). The row scores of this CCA are linear combinations of the traits which are \(\mathbf{K}\)-orthogonal and, depending on the scaling of the axes, also \(\mathbf{K}\)-normalized (the scaling is column-metric preserving) (ter Braak 1986, 2014). The column scores of this analysis (response variable scores, in this case representing rows of \(\mathbf{Y}\)) are weighted averages of the row scores and thus community weighted means of \(\mathbf{K}\)-orthonormal traits [cf. Eq. (9)]. This way of making the traits orthonormal has an advantage for trait data that are (near) singular: the CCA ranks the dimensions in order of their importance for \(\mathbf{Y}^{T}\) so that it is unlikely that an important dimension is dropped due to an unlucky cut-off in the decision for rank-deficiency.
1. Perform CCA(\(\mathbf{Y}^{T}\sim \mathbf{T}\)): a CCA of the transposed community table onto the traits.
2. Obtain the column scores (\(\mathbf{M}^{*}\), say) from this analysis in site-metric preserving scaling, an \(n \times q^{*}\) table of scores with \(q^{*}\) the rank of the trait data, which are community weighted means of orthonormalized traits.
3. Perform a weighted \(\hbox {RDA}_{\mathbf{R}}(\mathbf{M}^{*}\sim \mathbf{E})\): an RDA of \(\mathbf{M}^{*}\) on the environmental variables using row weights \(\mathbf{R}\).

1. Perform CCA(\(\mathbf{Y}\sim \mathbf{E}\)): a CCA of the community table onto the environmental variables.
2. Obtain the column scores (\(\mathbf{S}^{*}\), say) from this analysis in species-metric preserving scaling, an \(m \times p^{*}\) table of scores with \(p^{*}\) the rank of the environmental data, which are species niche centroids (Peres-Neto et al. 2017) of orthonormalized environmental variables.
3. Perform a weighted \(\hbox {RDA}_{\mathbf{K}}(\mathbf{S}^{*}\sim \mathbf{T})\): an RDA of \(\mathbf{S}^{*}\) on the traits using row weights \(\mathbf{K}\).
The computer program Canoco 5.10 (ter Braak and Šmilauer 2012) implements dcCA via these combinations of CCA and weighted RDA where the first combination is used for selection and significance testing of environmental variables and the second for selection and significance testing of traits.
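The first of these combinations can be sketched in NumPy as follows. The sketch is not Canoco's implementation: a QR decomposition stands in for the CCA(\(\mathbf{Y}^{T}\sim \mathbf{T}\)) step, so it assumes trait data of full column rank and does not rank the trait dimensions by importance, and the weighted RDA is written out as a weighted regression followed by an eigen-decomposition of the fitted values. The function name is hypothetical.

```python
import numpy as np

def dcca_via_cwm_rda(Y, E, T):
    """dcCA eigenvalues from CWMs of K-orthonormal traits and a weighted RDA."""
    r, k = Y.sum(axis=1), Y.sum(axis=0)
    # Step 1: centre the traits and make them K-orthonormal (QR as a stand-in).
    Tc = T - (k @ T) / k.sum()
    Q, _ = np.linalg.qr(np.sqrt(k)[:, None] * Tc)
    T_star = Q / np.sqrt(k)[:, None]
    # Step 2: community weighted means of the orthonormal traits.
    M_star = (Y @ T_star) / r[:, None]
    # Step 3: weighted regression of M* on E (weights R), then PCA of the fit.
    Ec = E - (r @ E) / r.sum()
    coef = np.linalg.solve(Ec.T @ (r[:, None] * Ec), Ec.T @ (r[:, None] * M_star))
    fitted = Ec @ coef
    eigvals = np.linalg.eigvalsh(fitted.T @ (r[:, None] * fitted))[::-1]
    return M_star, eigvals
```

The leading values in `eigvals` equal the squared singular values of \(\mathbf{E}^{*T}\mathbf{Y}\mathbf{T}^{*}\), which is how the relation between this two-step route and the SVD route can be checked numerically. The second combination follows by exchanging the roles of sites and species.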
2.6 Derived scores

Intra-set correlations of the traits with the constrained dcCA axes, \(cor_\mathbf{K} \left( {\mathbf{T},\mathbf{U}} \right) \), and similarly for the environmental variables, \(cor_\mathbf{R} \left( {\mathbf{E},\mathbf{X}} \right) \).

Inter-set correlations of the traits with the WA dcCA axes, \(cor_\mathbf{K} \left( {\mathbf{T},\mathbf{U}^{*}} \right) \), and similarly for the environmental variables, \(cor_\mathbf{R} \left( {\mathbf{E},\mathbf{X}^{*}} \right) \). These are inter-set correlations in the setting of CCA as, for example, \(\mathbf{X}^{*}\) is a linear combination of \(\mathbf{Y}\) and not of \(\mathbf{E}\).

Fourth-corner correlations of the traits with the constrained dcCA axes, \(cor_\mathbf{Y} \left( {\mathbf{T},\mathbf{X}} \right) \), and similarly for the environmental variables, \(cor_\mathbf{Y} \left( {\mathbf{E},\mathbf{U}} \right) \). When dcCA is interpreted as a canonical correlation analysis on super-inflated data, \(\mathbf{Y}\) seemingly disappears and these fourth-corner correlations are then in fact the inter-set correlations of the canonical correlation analysis. Unless noted explicitly otherwise, inter-set correlations of dcCA refer to their definition in the setting of CCA and RDA, that is, to \(cor_\mathbf{R} \left( {\mathbf{E},\mathbf{X}^{*}} \right) \) and \(cor_\mathbf{K} \left( {\mathbf{T},\mathbf{U}^{*}} \right) \).

Centroids of scores for categories of nominal traits and environmental variables. Such centroids can be viewed as scores for ‘super species’ and ‘super sites’ as they average scores of species or of sites belonging to the same category. The centroids are all weighted averages using the weights \(\left\{ {y_{i+} } \right\} \) for sites and \(\{y_{+k} \}\) for species.
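Two of these derived quantities reduce to short weighted computations, sketched below with hypothetical helper names. `weighted_cor` gives intra-set or inter-set correlations when called with the species totals (for traits) or the site totals (for environmental variables) as weights, and `weighted_centroids` gives the centroid of the scores for each category of a nominal variable.

```python
import numpy as np

def weighted_cor(A, B, w):
    """Correlations between the columns of A and of B using case weights w."""
    w = w / w.sum()
    Ac = A - w @ A                        # weighted centring
    Bc = B - w @ B
    cov = Ac.T @ (w[:, None] * Bc)        # weighted covariances
    sd_a = np.sqrt(w @ Ac**2)             # weighted standard deviations
    sd_b = np.sqrt(w @ Bc**2)
    return cov / np.outer(sd_a, sd_b)

def weighted_centroids(scores, labels, w):
    """Weighted mean score for each category of a nominal variable."""
    centroids = {}
    for lab in sorted(set(labels.tolist())):
        mask = labels == lab
        centroids[lab] = (w[mask] @ scores[mask]) / w[mask].sum()
    return centroids
```

For example, `weighted_cor(T, U, Y.sum(axis=0))` would give the intra-set correlations of the traits, assuming `T` and `U` are two-dimensional arrays with one row per species.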
3 Biplots
Biplots serve to visualize the main pattern in the analyzed data, by plotting the scores on, typically, the first two axes of the analysis, so that their inner product approximates a matrix, typically a data table or table with statistics such as correlation or regression coefficients (Gabriel 1982). With three kinds of items in the plot (sites, species, environmental variables) such plots are often called triplots, in which pairs of items have an inner product (biplot) interpretation. In dcCA there is a fourth kind of item: traits. This section proposes quadriplots with all four kinds of items in which almost all pairs of items have a biplot interpretation.
As in CCA, RDA and canonical correlation analysis (ter Braak 1990) there is a choice to visualize in the biplots regression coefficients or correlations. Biplots visualizing regression coefficients are treated in “Appendix A5”. For example, a biplot of both sets of canonical weights (\(\mathbf{B}\) and \(\mathbf{C})\) approximates the regression coefficients associated with the bilinear interaction between (i.e. products of) environmental variables and traits. The other biplots in “Appendix A5” essentially follow from considering dcCA as a canonical correlation analysis on inflated trait and environment data. In this section, the focus is on biplots of fourthcorner correlations between traits and environmental variables.
“Appendix A4” shows that when plotting \(\mathbf{B}_f \), \(\mathbf{C}_f \), \(\mathbf{X}\) and \(\mathbf{U}\) together the pairs \(\mathbf{B}_f \)–\(\mathbf{U}\) and \(\mathbf{C}_f \)–\(\mathbf{X}\) form a weighted least-squares biplot of the species niche centroids \(\left( {\mathbf{K}^{-1}\mathbf{Y}^{T}\mathbf{E}} \right) \) and CWMs \(\left( {\mathbf{R}^{-1}\mathbf{YT}} \right) \), respectively. Also, \(\mathbf{X}\) and \(\mathbf{U}\) form a weighted least-squares biplot of the fitted contingency ratios, which are the contingency ratios projected on both \(\mathbf{E}\) and \(\mathbf{T}\), analogously to the situation in CCA (ter Braak 2014).
The above biplot options are least-squares for all values of \(\alpha \). For \(\alpha =1\), the plot is row-metric preserving, and thus approximates the chi-square distance between sites based on fitted values, and \(\mathbf{C}_f \) contains the intra-set correlations for the traits and the biplot thus weakly approximates the correlations among the traits. For \(\alpha =0\), the plot is column-metric preserving, and thus approximates the chi-square distance between species based on fitted values, and \(\mathbf{B}_f \) contains the intra-set correlations for the environmental variables and the biplot thus weakly approximates the correlations among the environmental variables. In conclusion, when plotting \(\mathbf{B}_f \), \(\mathbf{C}_f \), \(\mathbf{X}\) and \(\mathbf{U}\) together with \(\alpha =0\) or 1, five of the six pairs of items have a biplot interpretation (the pairs \(\mathbf{B}_f \)–\(\mathbf{X}\) and \(\mathbf{C}_f \)–\(\mathbf{U}\) have no known useful biplot interpretation for \(\alpha =1\) and 0, respectively).
4 Real data and simulation example
Figure 1 shows the row-metric preserving dcCA biplot \(\left( {\alpha =1} \right) \) with arrows for traits and environmental variables (\(\mathbf{C}_f \) and \(\mathbf{B}_f \)), which display by their inner product their fourth-corner correlation, together with points for species and sites. The strongest correlations are those between Moisture and Seed mass (\(-0.32\)) and Manure and SLA (0.23) as indicated by projecting the arrows for Seed mass and SLA on the arrows for Moisture and Manure; alternatively consider their obtuse and sharp angles, respectively, and the lengths of the arrows. The maximized fourth-corner correlations along the first (horizontal) and second (vertical) axes are 0.43 and 0.15, respectively. Because \(\alpha =1\), the configuration of site points and environmental arrows shows the importance of the first axis compared to the second; the environmental arrows are fourth-corner correlations and the trait arrows intra-set correlations.
Beyond the pair \(\mathbf{B}_f \)–\(\mathbf{C}_f \), four other pairs of items have a biplot interpretation. Example interpretations are as follows:
Biplot of \(\mathbf{B}_f \)–\(\mathbf{U}\): the species points on the left in Fig. 1 have low SNC of Moisture (i.e. occur more at drier conditions) and the species on the right have high SNC with respect to Moisture (i.e. occur more at wetter conditions).
Biplot of \(\mathbf{C}_f \)–\(\mathbf{X}\): the site points when projected onto the arrow for SLA represent the CWM of SLA (their mean SLA), so that, for example, site 1 is inferred to have the highest mean SLA and sites 14 and 15 the lowest mean SLA.
Biplot of \(\mathbf{U}\)–\(\mathbf{X}\): the species and site points form a biplot of the contingency ratios, for example, the share of species Alo gen is high in site 1 compared to sites 14 and 15.
Biplot of \(\mathbf{C}_f \)–\(\mathbf{U}\): the species points when projected onto Seed mass represent their Seed mass, so that the species Vic lat and Lol per are inferred to have high Seed mass and Jun art and Agr sto low Seed mass.
In this simple example, there is little difference between dcCA and RLQ: although dcCA maximized the fourth-corner correlation and RLQ maximized the fourth-corner covariance (subject to standardized axes), the resulting fourth-corner correlations are almost identical. That is not always the case as will be shown in a simulation example. The R code for the simulation is in Supplementary “Appendix S2”.
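Table 1 Percentiles of the fourth-corner correlations of the first two axes and their squared ratio for dcCA and RLQ in 10,000 simulated data sets with \(n = m = 100\)

| Percentile | \(\rho _1\) dcCA | \(\rho _1\) RLQ | \(\rho _2\) dcCA | \(\rho _2\) RLQ | \(\lambda _1 /\lambda _2\) dcCA | \(\lambda _1 /\lambda _2\) RLQ |
| --- | --- | --- | --- | --- | --- | --- |
| 2.5% | 0.15 | 0.03 | 0.04 | 0.02 | 8.0 | 0.2 |
| 50% | 0.21 | 0.08 | 0.05 | 0.03 | 18.2 | 7.4 |
| 97.5% | 0.28 | 0.15 | 0.06 | 0.08 | 39.2 | 45.6 |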
Table 1 shows results of the simulation. The maximized fourth-corner correlation of dcCA of the first axis was much higher than the RLQ fourth-corner correlation of the first axis, with the 2.5% percentile of dcCA being even bigger than the 97.5% percentile of RLQ. The fourth-corner correlations of the second axes (which were zero in the model) were comparable. The ratio of the first over the second eigenvalue (squared correlation), which is a measure of the dominance of the first axis over the second in each simulated data set, is on average much higher in dcCA than in RLQ (last two columns in Table 1). Compared to RLQ, dcCA thus much better indicates that only one dimension is important for describing the association between the observed trait and environmental variables.
5 Discussion
This paper fills a gap by giving a mathematical description of double constrained correspondence analysis (dcCA) starting from the idea that it maximizes a correlation, in particular the fourth-corner correlation between linear combinations of traits and of environmental variables. It was known from the start that dcCA is identical with canonical correlation analysis of super-inflated trait and environment data. But dcCA deserves special treatment as the units of sampling are not the individuals that are counted but the sites with individuals belonging to different species. Our mathematical development shows the precise role of community (site) weighted trait means (CWM) and, its reverse, species niche centroids (species-weighted mean environment, SNC) in dcCA. In “Appendix A6” it is reiterated why CWMs and SNCs are key statistics in trait-environment studies and that the within-site trait variance and the within-species environmental variance (niche breadth) may deserve separate study in relation to the environment and traits, respectively. The novel algorithm that combines a singly constrained correspondence analysis (i.e. CCA) with a weighted singly constrained principal component analysis (i.e. redundancy analysis) shows the relation with CWM-RDA, an ad hoc method that is commonly used to relate traits to environment. CWM-RDA uses regression onto the environmental variables, whereas dcCA also uses regression onto the traits. By contrast, RLQ, one of the oldest multivariate methods for trait-environment analysis, is based on covariance without using regression at all. Our small simulation example demonstrated that, by combining correlated observed variables, dcCA can detect trait and environment relationships that remain hidden in RLQ.
RLQ is based on co-inertia analysis (Dray et al. 2003) while dcCA is based on CCA. Therefore the comparison between co-inertia analysis and CCA by Dray et al. (2003) is of interest. They showed that CCA deteriorates in detecting the hidden gradients when many highly correlated environmental variables that have no real effect are included in the analysis. In such an extreme situation co-inertia analysis performs better than CCA and the same is expected for RLQ and dcCA. However, with moderate correlations or when multicollinearity problems are taken care of, for example by variable selection or by changing regression to ridge regression, we expect dcCA to outperform RLQ.
Aitchison’s log-ratio analysis is essentially the analysis of double-centred log-transformed \(\mathbf{Y}\) (see the discussion by Dawid of Aitchison (1982), which leads to the centred log-ratio transformation). Microbiome data are sometimes analyzed by Aitchison’s log-ratio PCA (Gloor et al. 2016), despite the fact that they contain many zeros. Using weighted (double) (constrained) log-ratio analysis with row and column sums of \(\mathbf{Y}\) as weights (Greenacre and Lewi 2009) will decrease the adverse effect of rows and columns with many zeros, at least if the number of zeros is reflected in the weights (alternatively the row-wise and column-wise numbers of non-zeros could be used as weights). A natural alternative is in our view (double) (constrained) CA, which does not need tricks for handling zeros in the data.
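For reference, the weighted double centring of log-transformed data can be sketched as follows. This is a minimal illustration that assumes strictly positive entries and uses the row and column sums of \(\mathbf{Y}\) as weights; how zeros are replaced or otherwise handled is deliberately left out, and the function name is hypothetical.

```python
import numpy as np

def weighted_double_centred_log(Y):
    """Double-centre log(Y) using row and column sums of Y as weights."""
    Y = np.asarray(Y, dtype=float)
    if np.any(Y <= 0):
        raise ValueError("log-ratio analysis needs strictly positive values")
    r = Y.sum(axis=1) / Y.sum()          # row weights, summing to one
    c = Y.sum(axis=0) / Y.sum()          # column weights, summing to one
    L = np.log(Y)
    row_mean = L @ c                     # weighted mean of each row
    col_mean = r @ L                     # weighted mean of each column
    grand_mean = r @ L @ c               # weighted overall mean
    return L - row_mean[:, None] - col_mean[None, :] + grand_mean
```

A table with exactly proportional rows is mapped to a matrix of zeros, so only departures from proportionality (the interaction) remain for the subsequent analysis.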
The computer program Canoco 5.10 (ter Braak and Šmilauer 2012) implements dcCA as a combination of CCA and a weighted RDA. Weighted dcPCA is implemented by replacing the initial CCA(\(\mathbf{Y}^{T}\sim \mathbf{T}\)) by a weighted \(\hbox {RDA}_{\mathbf{K}}(\mathbf{Y}^{T}\sim \mathbf{T})\). For this purpose, both RDAs must be centred by rows and by columns in the metrics \(\mathbf{R}\) and \(\mathbf{K}\), respectively, which is in agreement with the idea that the trait-environment association is an interaction and should not involve main effects. By prior log-transformation, a (weighted or unweighted) double constrained log-ratio analysis is obtained. For statistical inference about the trait-environment relation in dcCA see ter Braak (2017) and ter Braak et al. (2017) and, in the log-ratio context, Cormont et al. (2011). These methods can quickly provide an overview of which variables appear important. We believe that they deserve more consideration, evaluation and use.
Notes
Acknowledgements
We thank John Birks and Richard Furnas for inspiration and comments.
References
 Aitchison J (1982) The statistical analysis of compositional data. J R Stat Soc B 44:139–177. http://www.jstor.org/stable/2345821
 Bacou AM, Sabatier R, Lespinasse P (1989) Analyses des correspondence avec une ou deux contraintes avec le logiciel Biomeco, manuel de l’utilisateur. CEFE, CNRS, MontpellierGoogle Scholar
 Böckenholt U, Böckenholt I (1990) Canonical analysis of contingency tables with linear constraints. Psychometrika 55:633–639. https://doi.org/10.1007/bf02294612
 Brown AM, Warton DI, Andrew NR, Binns M, Cassis G, Gibb H (2014) The fourthcorner solution—using predictive models to understand how species traits interact with the environment. Methods Ecol Evol 5:344–352. https://doi.org/10.1111/2041210x.12163 CrossRefGoogle Scholar
 Cormont A, Vos CC, van Turnhout CAM, Foppen RPB, ter Braak CJF (2011) Using lifehistory traits to explain bird population responses to changing weather variability. Clim Res 49:59–71. https://doi.org/10.3354/cr01007 CrossRefGoogle Scholar
 Dolédec S, Chessel D, ter Braak CJF, Champely S (1996) Matching species traits to environmental variables: a new threetable ordination method. Environ Ecol Stat 3:143–166. https://doi.org/10.1007/BF02427859 CrossRefGoogle Scholar
 Douglas Carroll J, Pruzansky S, Kruskal JB (1980) Candelinc: a general approach to multidimensional analysis of manyway arrays with linear constraints on parameters. Psychometrika 45:3–24. https://doi.org/10.1007/bf02293596 CrossRefGoogle Scholar
 Dray S, Chessel D, Thioulouse J (2003) Coinertia analysis and the linking of ecological tables. Ecology 84:3078–3089. http://www.jstor.org/stable/3449976
 Dray S, Dufour AB (2007) The ade4 package: implementing the duality diagram for ecologists. J Stat Softw 22:1–20. https://doi.org/10.18637/jss.v022.i04 CrossRefGoogle Scholar
 Dray S, Legendre P (2008) Testing the species traits environment relationships: the fourthcorner problem revisited. Ecology 89:3400–3412. https://doi.org/10.1890/080349.1 CrossRefPubMedGoogle Scholar
 Gabriel KR (1982) Biplot. In: Kotz S, Johnson NL (eds) Encyclopedia of statistical sciences, vol 1. Wiley, New York, pp 263–271
 Gabriel KR (1998) Generalised bilinear regression. Biometrika 85:689–700. https://doi.org/10.1093/biomet/85.3.689
 Gifi A (1990) Nonlinear multivariate analysis. Wiley, New York. ISBN 9780471926207
 Gittins R (1985) Canonical analysis. A review with applications in ecology. Springer, Berlin. ISBN 9783642698781
 Gloor GB, Wu JR, Pawlowsky-Glahn V, Egozcue JJ (2016) It’s all relative: analyzing microbiome data as compositions. Ann Epidemiol 26:322–329. https://doi.org/10.1016/j.annepidem.2016.03.003
 Golub GH, Reinsch C (1970) Singular value decomposition and least squares solutions. Numerische Mathematik 14:403–420. https://doi.org/10.1007/BF02163027
 Golub GH, van Loan CF (1989) Matrix computations. The Johns Hopkins University Press, Baltimore. ISBN 9780801854149
 Good IJ (1969) Some applications of singular decomposition of a matrix. Technometrics 11:823–831. https://doi.org/10.2307/1266902
 Goodman LA (1981) Association models and canonical correlation in the analysis of cross-classifications having ordered categories. J Am Stat Assoc 76:320–334. https://doi.org/10.2307/2287833
 Gourlay AR, Watson GA (1973) Computational methods for matrix eigenproblems. Wiley, New York. ISBN 0471319155
 Gower JC, Hand DJ (1996) Biplots. Chapman, London. ISBN 9780412716300
 Greenacre M, Lewi P (2009) Distributional equivalence and subcompositional coherence in the analysis of compositional data, contingency tables and ratio-scale measurements. J Classif 26:29–54. https://doi.org/10.1007/s00357-009-9027-y
 Greenacre MJ (1984) Theory and applications of correspondence analysis. Academic Press, London. ISBN 9780122990502
 Hill MO (1973) Reciprocal averaging: an eigenvector method of ordination. J Ecol 61:237–249. https://doi.org/10.2307/2258931
 Hill MO (1974) Correspondence analysis: a neglected multivariate method. Appl Stat 23:340–354. https://doi.org/10.2307/2347127
 Ihm P, van Groenewoud H (1975) A multivariate ordering of vegetation data based on Gaussian type gradient response curves. J Ecol 63:767–777. https://doi.org/10.2307/2258600
 Ihm P, van Groenewoud H (1984) Correspondence analysis and Gaussian ordination. Compstat Lect 3:5–60
 Jamil T, Ozinga WA, Kleyer M, ter Braak CJF (2013) Selecting traits that explain species-environment relationships: a generalized linear mixed model approach. J Veg Sci 24:988–1000. https://doi.org/10.1111/j.1654-1103.2012.12036.x
 Kleyer M et al (2012) Assessing species and community functional responses to environmental gradients: which multivariate methods? J Veg Sci 23:805–821. https://doi.org/10.1111/j.1654-1103.2012.01402.x
 Lavorel S, Rochette C, Lebreton JD (1999) Functional groups for response to disturbance in Mediterranean old fields. Oikos 84:480–498. https://doi.org/10.2307/3546427
 Lavorel S, Touzard B, Lebreton JD, Clément B (1998) Identifying functional groups for response to disturbance in an abandoned pasture. Acta Oecologica 19:227–240. https://doi.org/10.1016/S1146-609X(98)80027-1
 Lebreton JD, Chessel D, Prodon R, Yoccoz N (1988a) L’analyse des relations espèces-milieu par l’analyse canonique des correspondances. I. Variables de milieu quantitatives. Acta Oecologia Generalis 9:53–67
 Lebreton JD, Chessel D, Richardot-Coulet M, Yoccoz N (1988b) L’analyse des relations espèces-milieu par l’analyse canonique des correspondances. II. Variables de milieu qualitatives. Acta Oecologia Generalis 9:137–151
 Legendre L, Legendre P (2012) Numerical ecology. Elsevier, Amsterdam. ISBN 9780444538680
 Legendre P, Galzin RG, Harmelin-Vivien ML (1997) Relating behavior to habitat: solutions to the fourth-corner problem. Ecology 78:547–562. https://doi.org/10.2307/2266029
 Magnus JR, Neudecker H (1988) Matrix differential calculus with applications in statistics and econometrics. Wiley, New York. ISBN 9780471986331
 Mardia KV, Kent JT, Bibby JM (1980) Multivariate analysis. Academic Press, London. ISBN 9780124712522
 McCune B (2015) The front door to the fourth corner: variations on the sample unit \(\times \) trait matrix in community ecology. Commun Ecol 16:267–271. https://doi.org/10.1556/168.2015.16.2.14
 Oksanen J et al (2013) vegan: Community Ecology Package. R Package. version 209. https://CRAN.R-project.org/package=vegan
 Peres-Neto PR, Dray S, ter Braak CJF (2017) Linking trait variation to the environment: critical issues with community-weighted mean correlation resolved by the fourth-corner approach. Ecography 40:806–816. https://doi.org/10.1111/ecog.02302
 R Core Team (2015) R: a language and environment for statistical computing, version 3.0. R Foundation for Statistical Computing, Vienna, Austria. www.R-project.org
 Rui Alves M, Beatriz Oliveira M (2004) Predictive and interpolative biplots applied to canonical variate analysis in the discrimination of vegetable oils by their fatty acid composition. J Chemom 18:393–401. https://doi.org/10.1002/cem.884
 Takane Y (2013) Constrained principal component analysis and related techniques. Chapman and Hall/CRC, London. ISBN 9781466556669
 Tenenhaus M, Young FW (1985) An analysis and synthesis of multiple correspondence analysis, optimal scaling, dual scaling, homogeneity analysis and other methods for quantifying categorical multivariate data. Psychometrika 50:91–119. https://doi.org/10.1007/bf02294151
 ter Braak CJF (1985) Correspondence analysis of incidence and abundance data: properties in terms of a unimodal response model. Biometrics 41:859–873. https://doi.org/10.2307/2530959
 ter Braak CJF (1986) Canonical correspondence analysis: a new eigenvector technique for multivariate direct gradient analysis. Ecology 67:1167–1179. https://doi.org/10.2307/1938672
 ter Braak CJF (1987) The analysis of vegetation-environment relationships by canonical correspondence analysis. Vegetatio 69:69–77. https://doi.org/10.1007/BF00038688
 ter Braak CJF (1988) Partial canonical correspondence analysis. In: Bock HH (ed) Classification and related methods of data analysis. Elsevier Science Publishers B.V. (North-Holland), Amsterdam, pp 551–558. http://edepot.wur.nl/241165
 ter Braak CJF (1990) Interpreting canonical correlation analysis through biplots of structural correlations and weights. Psychometrika 55:519–531. https://doi.org/10.1007/BF02294765
 ter Braak CJF (2014) History of canonical correspondence analysis. In: Blasius J, Greenacre M (eds) Visualization and verbalization of data. Chapman and Hall, London, pp 61–75. http://edepot.wur.nl/302963
 ter Braak CJF (2017) Fourth-corner correlation is a score test statistic in a log-linear trait-environment model that is useful in permutation testing. Environ Ecol Stat 24:219–242. https://doi.org/10.1007/s10651-017-0368-0
 ter Braak CJF, Peres-Neto P, Dray S (2017) A critical issue in model-based inference for studying trait-based community assembly and a solution. PeerJ 5:e2885. https://doi.org/10.7717/peerj.2885
 ter Braak CJF, Prentice IC (1988) A theory of gradient analysis. Adv Ecol Res 18:271–317. https://doi.org/10.1016/S0065-2504(08)60183-X
 ter Braak CJF, Šmilauer P (2012) Canoco reference manual and user’s guide: software for ordination, version 5.0. Microcomputer Power, Ithaca, USA
 ter Braak CJF, Verdonschot PFM (1995) Canonical correspondence analysis and related multivariate methods in aquatic ecology. Aquat Sci 57:255–289. https://doi.org/10.1007/BF00877430
 Tso MKS (1981) Reduced-rank regression and canonical analysis. J R Statist Soc B 43:183–189. http://www.jstor.org/stable/2984847
 Warton DI, Shipley B, Hastie T (2015) CATS regression: a model-based approach to studying trait-based community assembly. Methods Ecol Evol 6:389–398. https://doi.org/10.1111/2041-210X.12280
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.