To the Editor,

We read with great interest the paper entitled “Traits and stress: keys to identify community effects of low levels of toxicants in test systems” by Matthias Liess and Mikhail Beketov (Liess and Beketov 2011). The paper presents a new way to analyse data from microcosms and mesocosms, which the authors claim to be more sensitive than the commonly used principal response curves (PRC) method (Van den Brink and Ter Braak 1999). Since PRC was developed more than a decade ago, new developments in the field of analysing community responses to stress are very welcome. However, after reading the paper, we have concluded that the new method, SPEARmesocosm, may not offer the level of improvement suggested, and in this letter to the editor we briefly explain why.

Both PRC and SPEARmesocosm display the time-dependent treatment effects of a toxicant. The fundamental difference between the two methods is that PRC is a multivariate statistical method and SPEARmesocosm is a univariate method. For SPEARmesocosm to work, a single index must be constructed on the basis of predictions derived from a priori knowledge of intrinsic sensitivity and life-cycle characteristics (voltinism). By contrast, PRC does not need this a priori knowledge and can work with multiple indices. In the microcosm and mesocosm studies for which it was developed, it is usually applied to the original taxon composition data. PRC is thus a purely statistical method for analysing empirical data derived from mesocosm and other community-level experiments. PRC partitions the observed variance in the data into time, treatment (which includes the interaction with time) and residual variance (which corresponds to the differences between replicates), and summarises the variance explained by treatment and time by showing the time-dependent treatment effects in sequential (first, second, etc.) PRC diagrams. These PRC diagrams show the contrasting responses of different (groups of) taxa, very much like the contrasting response of the sensitive univoltine species and the other taxon groups as displayed in Fig. 2 of Liess and Beketov (2011). The agreement of Fig. 2 with PRC would have been even greater if the percentage change had been plotted on a logarithmic scale.
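The variance partitioning described above can be illustrated with a minimal sketch. It covers only the partitioning step (time, treatment within time, and replicate-level residual) and not the ordination that produces the PRC diagrams. The function name `partition_variance`, the synthetic Poisson counts, and the log(x + 1) transformation are assumptions made for illustration; none of them come from Liess and Beketov (2011).

```python
import numpy as np

def partition_variance(y, times, treatments):
    """Split the total sum of squares of y (samples x taxa) into
    time, treatment-within-time and residual (replicate) parts."""
    y = np.asarray(y, dtype=float)
    times = np.asarray(times)
    treatments = np.asarray(treatments)
    grand_mean = y.mean(axis=0)
    time_fit = np.empty_like(y)   # mean of each sampling date
    cell_fit = np.empty_like(y)   # mean of each treatment x date cell
    for t in np.unique(times):
        in_time = times == t
        time_fit[in_time] = y[in_time].mean(axis=0)
        for d in np.unique(treatments[in_time]):
            in_cell = in_time & (treatments == d)
            cell_fit[in_cell] = y[in_cell].mean(axis=0)

    def ss(a):
        return float((a ** 2).sum())

    return {
        "total": ss(y - grand_mean),
        "time": ss(time_fit - grand_mean),
        "treatment": ss(cell_fit - time_fit),  # includes interaction with time
        "residual": ss(y - cell_fit),          # differences between replicates
    }

# Hypothetical design: 3 dates x 2 treatments x 2 replicates, 4 taxa.
rng = np.random.default_rng(0)
times = np.repeat([0, 1, 2], 4)
treatments = np.tile(["control", "control", "dosed", "dosed"], 3)
counts = rng.poisson(20, size=(12, 4))          # synthetic abundances
parts = partition_variance(np.log(counts + 1), times, treatments)
for name, value in parts.items():
    print(f"{name}: {value:.3f}")
```

The three components add up to the total sum of squares. In a PRC analysis it is the treatment component that is subsequently summarised along one or more axes.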

When comparing statistical methods for analysing the data from such experiments, it is clearly critical that the same endpoints are compared (using the same input data); otherwise the differences seen in outcomes cannot be reliably attributed to the statistical methods. We feel that the comparison made in Fig. 3 is inappropriate: it is one of apples with oranges. While the PRC diagram shows the dominant response present in the whole invertebrate community, SPEARmesocosm only takes (presumed) sensitive species into account. For a proper comparison, Liess and Beketov could have performed a PRC analysis using only the sensitive univoltine taxa, which would almost certainly have yielded a diagram comparable to Fig. 3b. The sub-dominant responses of the sensitive univoltine taxa would probably be represented by the second PRC of the original analysis. We are, unfortunately, not in a position to evaluate this, since no access to the data was provided within this short time-frame, despite a request to the authors. How the second PRC is extracted and tested for significance is explained in the original PRC publication (Van den Brink and Ter Braak 1999), while Van den Brink and Ter Braak (1998), Van den Brink et al. (2003) and Maccherini et al. (2007) present examples of the use of the second PRC. We acknowledge that testing the second PRC for significance, and presenting it when it is significant, is not common practice. This example indicates that such an approach should be evaluated more often than it is at present. In passing we note that instead of applying PRC to the original taxon data, PRC could also have been applied to the data after aggregation into taxon clusters. This would have directly singled out the different response of the sensitive univoltine taxa to the toxicant in comparison with the other groups.
Another way of ensuring that all responses present in the data set are highlighted is to perform univariate tests at the taxon level and present the responses of all taxa for which consistent significant treatment effects are indicated. This approach is common practice and is required in the evaluation of most microcosm and mesocosm studies performed for the registration of pesticides in Europe (SANCO 2002; De Jong et al. 2005). Such analyses would no doubt also highlight the sensitive responses at the population level that are presented by the SPEAR method.

Since the data of the mesocosm experiment are not presented by Liess and Beketov (2011) in a format which allows these queries to be addressed, it is difficult to gain an understanding of the dominance or rarity of the different taxa. All abundances provided are expressed relative to the control, so it is unclear what the actual recorded abundances of the taxa were. We would expect that these may have been somewhat low for individual taxa in the samples (total abundance around 100 individuals/sample), since the overall abundance is approximately 1,000 individuals/m² (Fig. 1) while 0.09 m² (4 quadrants of 15 × 15 cm) was sampled at each sampling time. Thus the reader cannot ascertain whether Fig. 2d is based on high or low abundance values (even single individuals), which is of crucial importance in any robust evaluation of the effects on sensitive univoltine species as compared to the whole community.
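The arithmetic behind this estimate can be written out as a short sketch. The density and sampled area are the approximate figures quoted above; the taxon counts in the final loop are purely hypothetical, since the number of taxa per sample is not reported, and an even spread across taxa is assumed only to show the order of magnitude.

```python
quadrat_side_m = 0.15        # 15 cm side length
quadrats_per_sample = 4
density_per_m2 = 1000        # approximate overall abundance read from Fig. 1

sampled_area_m2 = quadrats_per_sample * quadrat_side_m ** 2
expected_per_sample = density_per_m2 * sampled_area_m2
print(f"sampled area: {sampled_area_m2:.2f} m2")
print(f"expected individuals per sample: {expected_per_sample:.0f}")

# Hypothetical taxon richness values (NOT from the paper), even spread assumed.
for n_taxa in (10, 20, 30):
    print(f"{n_taxa} taxa: about {expected_per_sample / n_taxa:.1f} individuals per taxon")
```

Roughly 90 individuals per sample is consistent with the figure of around 100 given above. Real communities are uneven, so rarer taxa could well be represented by only a few individuals.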

In order to use SPEARmesocosm for the described experiment, some species were ‘reclassified’ in terms of their sensitivity from the original SPEAR database values. Indeed, without this reclassification “differences between control and lowest concentration were […] not significant any more” (page 1335). The authors state that reclassification was only done for two species, without explaining the non-sensitive classification of Gammarus sp. (original Sorganic value of +0.04; Liess and Von der Ohe 2005) and the sensitive classification of Chironomidae (original Sorganic value of −0.39; Liess and Von der Ohe 2005), when a cut-off value of −0.36 is used. This suggests that one could require a different SPEARmesocosm for each new compound to be tested in future microcosm or mesocosm experiments, and thus the generality of the proposed method is at best rather questionable. The use of a single indicator of sensitivity neglects the fact that pesticides with different modes of action can have very different toxicity profiles, as shown, for instance, by Vaal et al. (2000), Escher and Hermens (2002) and Rubach et al. (2010). Since the original SPEAR sensitivity ranking is based on the AQUIRE database (Von der Ohe and Liess 2004), this ranking is probably dominated by organophosphate compounds, as they normally dominate EC50 data sets (Rubach et al. 2010). Consequently, it can be expected that the sensitivity ranking in SPEAR will not perform as expected for compounds which are selective for different taxonomic groups than organophosphates (see Rubach et al. 2010 for a ranking). For instance, none of the 10 invertebrate taxa that showed the largest response to a carbendazim treatment in microcosms (Cuppen et al. 2000) would be classified as sensitive by SPEARmesocosm, while laboratory toxicity tests performed with the same or closely related species explain the response of seven of them in at least the highest concentration tested (Van Wijngaarden et al. 1998).
Since species are a priori classified as sensitive or insensitive and as univoltine or multivoltine, the SPEARmesocosm indicator does not allow for unforeseen sensitivities or life-cycle characteristics of taxa, a significant shortcoming for micro- and mesocosm experiments, where the majority of taxa present will normally have been untested. This is, for instance, shown by Figs. 2a, c of Liess and Beketov (2011), which indicate direct effects on taxa that are classified as insensitive. Moreover, since the index focuses on sensitivity and voltinism, it also ignores indirect effects, which are a key consideration for performing microcosm and mesocosm tests (Giddings et al. 2002).
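The classification issue raised for Gammarus sp. and Chironomidae can be made concrete with a small sketch. It assumes that a taxon counts as sensitive when its Sorganic value lies above the cut-off, which is the direction implied by the two examples given above. The function name `is_sensitive` and the dictionary layout are hypothetical; the Sorganic values and the classifications used in the paper are those quoted in this letter.

```python
CUTOFF = -0.36  # cut-off value quoted in this letter

def is_sensitive(s_organic, cutoff=CUTOFF):
    """Assumed rule: values above the cut-off are classed as sensitive."""
    return s_organic > cutoff

# Original database values quoted from Liess and Von der Ohe (2005).
original_s_organic = {"Gammarus sp.": 0.04, "Chironomidae": -0.39}
# Classification as used in the paper, according to this letter.
used_as_sensitive = {"Gammarus sp.": False, "Chironomidae": True}

mismatches = []
for taxon, value in original_s_organic.items():
    by_database = is_sensitive(value)
    if by_database != used_as_sensitive[taxon]:
        mismatches.append(taxon)
    print(f"{taxon}: database says sensitive={by_database}, "
          f"paper uses sensitive={used_as_sensitive[taxon]}")
```

Under this assumed rule both taxa end up in the opposite class to the one their database value would give, which is the discrepancy that remains unexplained.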

Thus, while we fully support the use of traits in ecotoxicology and chemical stress ecology (Van den Brink et al. 2011), we do not agree that the approach used in Liess and Beketov (2011) is entirely appropriate for the evaluation of mesocosm studies. However, we sincerely look forward to working with Matthias Liess, Mikhail Beketov and others to improve the ecological foundation of our science through the implementation of traits-based approaches in future research.