Introduction

Ecologists investigate biodiversity’s characteristics, causes, and consequences. Despite longstanding efforts, vast gaps remain in our understanding that can be aggregated into larger classes of issues. We have not yet identified most taxa (Linnean shortfall, [13]), their geographic ranges (Wallacean shortfall, [65]), their phylogenetic relationships (Darwinian shortfall, [25]), or their functional traits (Raunkiæran shortfall, [42]). Given the speed of anthropogenic changes to the environment, the Hutchinsonian shortfall [42], i.e., the lack of knowledge about the tolerance of species to abiotic conditions, might be most relevant. These issues converge in freshwaters: many shortfalls are most pronounced there [31] and they are arguably the most threatened by human actions [2, 27, 95]. Agricultural land use contributes to many physical and chemical stressors that negatively affect freshwater ecosystems [104], such as nutrient enrichment [49, 113], increased sediment load [49, 125], and exposure to pesticides [115]. While the role of pesticides in broad-scale biodiversity trends remains poorly studied [114], studies indicate that they impact ecological communities at environmentally relevant concentrations [21, 64, 104, 110].

In Europe, active substances must pass a prospective environmental risk assessment (ERA) before being released into the market to prevent unacceptable effects on the environment. For each active substance, this assessment establishes a presumably safe concentration (Predicted No Effect Concentration, PNEC) and a concentration predicted to occur in the environment given the suggested application procedure (Predicted Exposure Concentration, PEC). If a compound’s PEC is lower than its PNEC, it is considered safe [11]. PNECs are derived in a tiered approach, starting with a mandatory first tier, which involves standard toxicity tests under laboratory conditions using single species. The determined effect concentrations are divided by an assessment factor to obtain the PNEC. The assessment factor is meant to account for the uncertainty in extrapolating from laboratory conditions to the field. Higher-tier tests may be conducted if the PEC exceeds the tier-one PNEC. They involve increasingly complex scenarios, such as multi-species and semi-field test systems, and lower assessment factors [28].
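The first-tier decision rule described above can be written down in a few lines. This is a minimal sketch, not the regulatory procedure itself: the function names and all numbers (an effect concentration of 50 µg/L and an assessment factor of 100) are hypothetical and chosen only for illustration.

```python
def pnec(effect_concentration: float, assessment_factor: float) -> float:
    """Divide a laboratory effect concentration by an assessment factor."""
    if assessment_factor <= 0:
        raise ValueError("assessment factor must be positive")
    return effect_concentration / assessment_factor


def is_considered_safe(pec: float, pnec_value: float) -> bool:
    """A compound passes the check when its PEC is lower than its PNEC."""
    return pec < pnec_value


# Hypothetical values in µg/L, for illustration only.
tier_one_pnec = pnec(effect_concentration=50.0, assessment_factor=100.0)
print(tier_one_pnec)                           # PNEC in µg/L
print(is_considered_safe(0.2, tier_one_pnec))  # PEC below the PNEC
print(is_considered_safe(0.8, tier_one_pnec))  # PEC above the PNEC: higher-tier tests may follow
```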

Many water quality regulations employ type- or site-specific thresholds for nutrients and other physicochemical parameters (e.g., [92]). In contrast, ERA assumes that a single concentration threshold (the PNEC) can achieve a similar protection level across different ecosystems. The assessment factor could account for potential sensitivity variation between ecosystem types, but this kind of variation was not considered in the derivation of assessment factors [72]. It remains an open question whether systematic taxonomic changes in assemblage composition among different types of ecosystems result in systematic differences in assemblage sensitivity to chemicals. This question is relevant in an applied context. If assemblage sensitivity varies systematically among river types, using a single PNEC would likely be inefficient and potentially ineffective. To date, few studies have investigated broad-scale spatial patterns in sensitivity, partly because the sensitivities of most species to most pesticides remain unknown (the Hutchinsonian shortfall).

Van den Berg et al. [119] predicted the relative sensitivity of macroinvertebrates toward pesticides with models using information on functional traits and taxonomic relationships. They found considerable differences in the percentage of sensitive macroinvertebrate taxa between European ecoregions and UK river types. However, the magnitude of differences depended on the pesticide’s mode of action. The data for Europe consisted only of species lists for ecoregions [44] and not of observed assemblages. Further, the study relied on a dichotomization of a relative sensitivity metric (mode-specific sensitivity, [97]), where all taxa that had a higher-than-average sensitivity were classified as sensitive. This metric is impacted by the included taxa and their taxonomic resolution. Liang et al. [63] used a hierarchical Species Sensitivity Distribution model (hSSD, [20, 55]) to predict the sensitivity of untested macroinvertebrate taxa toward 18 chemicals. They found spatial patterns in predicted macroinvertebrate assemblage sensitivity across England, which were more pronounced for specifically acting than for non-specifically acting chemicals, and which differed between chemicals. They also found statistically significant differences in the distribution of the least and most sensitive assemblages across river types. Similarly, field studies found significant variability between water body types within regions [7] but negligible variation in assemblage sensitivity between central and northern European streams [98, 102]. Together, these studies point toward broad-scale spatial patterns in sensitivity, but patterns in field-sampled assemblages at the European scale have not been evaluated.

In this paper, we investigated whether macroinvertebrate assemblage sensitivity toward copper and imidacloprid differs systematically among broad river types across Europe. The chemicals we evaluate represent heavy metals and insecticides and are relatively well-tested. We focused on the sensitivity of macroinvertebrate assemblages as they are among the groups facing the highest risk from exposure [71, 130]. To predict the sensitivities of untested taxa, we used an hSSD model that integrates chemical properties and taxonomic relatedness.

We chose to analyze patterns in aquatic invertebrates as they are among the organism groups facing the highest risk from chemical exposure ([80, 107]; Wolfram et al., 2023). Simultaneously, large-scale data on the occurrences of aquatic invertebrates, a prerequisite for our study, are available through national monitoring programs. Similarly, the selection of chemicals was motivated by relevance and data availability.

Copper and imidacloprid can impact aquatic invertebrate communities at environmentally relevant concentrations. Copper naturally occurs in freshwater ecosystems due to weathering, erosion, atmospheric deposition, and groundwater influx [19]. Though natural variation in concentrations is large [132], harmful levels are typically reached through human interventions, particularly in areas with intensive agricultural activities. Agricultural copper inputs to freshwaters account for close to half the total load and are largely due to erosion of soil and surface water runoff after the application of copper-based pesticides [19]. For aquatic invertebrates, high levels of copper can affect osmoregulation [12] and have been shown to change the community structure significantly [48]. Imidacloprid is a widely used neonicotinoid insecticide commonly found in freshwater ecosystems [12, 29]. It is highly toxic to many aquatic invertebrates, particularly insects, and can cause mortality and behavioral changes [81], potentially altering the composition and functioning of these communities [121].

Methods

Data collection and harmonization

We compiled a database of river macroinvertebrate assemblages throughout Europe from openly available and unpublished national monitoring datasets (see supplementary materials). Each assemblage corresponds to an actual field sample. All samples were collected in or after 2005 and with proportional multi-habitat sampling equal to or similar to the AQEM-STAR method [4]. To ensure comparability, we harmonized taxonomy across datasets with the taxonomic backbone of the Global Biodiversity Information Facility (www.gbif.org), only used samples collected between May and September, and restricted the data to phyla that occurred in all datasets (Annelida, Mollusca, and Arthropoda). If multiple samples were taken from a site, we used only the most recent one.
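The temporal filters described above (samples from 2005 onward, May to September only, most recent sample per site) can be sketched as follows. This is an illustrative sketch, not the authors' code: the tuple layout, the site identifiers, and the order in which the filters are applied are assumptions.

```python
from datetime import date


def select_samples(samples):
    """Keep the most recent eligible sample per site.

    samples: iterable of (site_id, sampling_date) tuples (hypothetical layout).
    Eligible samples were taken in or after 2005 and between May and September.
    """
    latest = {}
    for site_id, sampled_on in samples:
        if sampled_on.year < 2005 or not 5 <= sampled_on.month <= 9:
            continue  # outside the temporal window
        if site_id not in latest or sampled_on > latest[site_id]:
            latest[site_id] = sampled_on
    return latest


# Hypothetical example records.
example = [
    ("A", date(2004, 6, 1)),    # too early
    ("A", date(2010, 7, 15)),
    ("A", date(2012, 8, 2)),    # most recent eligible sample for site A
    ("B", date(2015, 11, 3)),   # outside May to September
    ("C", date(2007, 5, 20)),
]
selected = select_samples(example)
print(selected)
```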

We classified all catchments as disturbed or least disturbed (sensu [112]) based on a European stressor database [62]. This database includes catchment-level data on seven indicators of anthropogenic stress: mixture toxic pressure, extent of urban and agricultural land use in the riparian zone, alteration of mean annual flow and base flow, and total phosphorus and nitrogen load. We classified catchments as disturbed if the value for at least one of the seven stressors exceeded its 24th percentile. In an earlier study, the 24% threshold was found to maximize the ratio of least disturbed catchments with high or good ecological quality to least disturbed catchments with moderate, poor, or bad ecological quality [50]. Subsequently, all assemblages from sampling sites within a catchment were assigned the same disturbance state as their catchment.
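The classification rule (a catchment is disturbed if at least one stressor exceeds its 24th percentile) can be illustrated with a short sketch. The catchment IDs, the two stressor names, and their values are hypothetical, and the percentile interpolation method used here is an assumption; the study itself used seven stressors.

```python
from statistics import quantiles


def stressor_thresholds(catchments, percentile=24):
    """Return the given percentile of each stressor across all catchments."""
    names = next(iter(catchments.values())).keys()
    return {
        name: quantiles(
            [values[name] for values in catchments.values()],
            n=100,
            method="inclusive",
        )[percentile - 1]
        for name in names
    }


def classify(catchments, percentile=24):
    """Label a catchment 'disturbed' if any stressor exceeds its threshold."""
    limits = stressor_thresholds(catchments, percentile)
    return {
        cid: "disturbed"
        if any(values[name] > limits[name] for name in limits)
        else "least disturbed"
        for cid, values in catchments.items()
    }


# Hypothetical stressor values for five catchments.
catchments = {
    "c1": {"total_phosphorus": 1.0, "agriculture": 0.1},
    "c2": {"total_phosphorus": 2.0, "agriculture": 0.2},
    "c3": {"total_phosphorus": 3.0, "agriculture": 0.3},
    "c4": {"total_phosphorus": 4.0, "agriculture": 0.4},
    "c5": {"total_phosphorus": 5.0, "agriculture": 0.5},
}
status = classify(catchments)
print(status)
```

Because a single exceedance is enough, the share of least disturbed catchments shrinks as more stressors are considered.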

We conducted all analyses twice, once with only assemblages from the least disturbed catchments and once with the complete dataset. When we include assemblages from disturbed catchments in the analysis, stressor-induced taxonomic homogenization [74, 89] can reduce taxonomic turnover among river types. Notwithstanding, most catchments in our database were categorized as disturbed, and removing such catchments reduced our statistical power and spatial coverage. The dataset comprised 13713 assemblages from distinct sampling sites, of which 3703 (27%) were least disturbed (Fig. 1). As results differed little between the two datasets, we present the results for assemblages from least disturbed catchments. The results for all assemblages are provided in the supplementary materials.

Fig. 1 Spatial distribution of 10010 disturbed and 3703 least disturbed macroinvertebrate sampling sites across Europe

National river typology systems are available in all European states but differ strongly between countries. Therefore, we assigned each assemblage to one of twelve broad river types (Table 1), which are an aggregation of national Water Framework Directive typology types and currently the only pan-European river typology system that classifies river segments rather than regions [69]. The taxonomic composition of biotic assemblages in least disturbed catchments varies more strongly among types than within them, which is a crucial assumption for any typology system [50, 51]. While these differences in community compositions are only marginal, superior alternatives are currently lacking.

Table 1 IDs and names of the 12 broad river types developed by Lyche Solheim et al. [69]

To each assemblage, we assigned the broad river type of the spatially closest river segment in the digital river network provided by Globevnik [33], which includes the segments’ broad river types. Assigning sites to river segments is error-prone. The sampled segments might be missing from the digital river network, or the sites might be closer to other segments due to potential inaccuracies in the site coordinates or the spatial position of segments. To reduce the likelihood of such errors, we removed assemblages located > 300 m from the closest river segment. Further, we validated our assignment of assemblages to the river segments by visually comparing the sampling site and segment location against the CartoDB.Positron base map with the mapview R package [3].
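The nearest-segment assignment with the 300 m cut-off can be sketched as below. This is a simplified illustration that assumes planar coordinates in metres and straight two-point segments; the segment list and site coordinates are hypothetical, and a real river network would consist of multi-vertex lines handled by a GIS library.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def assign_river_type(site, segments, max_distance=300.0):
    """Return the type of the closest segment, or None if it is too far away."""
    best_type, best_dist = None, math.inf
    for river_type, start, end in segments:
        dist = point_segment_distance(site, start, end)
        if dist < best_dist:
            best_type, best_dist = river_type, dist
    return best_type if best_dist <= max_distance else None


# Hypothetical segments: (river type ID, start xy, end xy), coordinates in metres.
segments = [
    ("RT1", (0.0, 0.0), (1000.0, 0.0)),
    ("RT10", (0.0, 2000.0), (1000.0, 2000.0)),
]
print(assign_river_type((500.0, 120.0), segments))    # 120 m from the RT1 segment
print(assign_river_type((500.0, 1000.0), segments))   # 1000 m from both: removed
print(assign_river_type((1200.0, 2100.0), segments))  # about 224 m from the RT10 end point
```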

Predicting assemblage sensitivity with hSSDs

We derived the sensitivity of the 13713 assemblages (least disturbed and disturbed) toward a heavy metal (copper) and an insecticide (imidacloprid). If we lacked sensitivity data for a taxon, we predicted its sensitivity with an hSSD model. The hSSD model expands upon Species Sensitivity Distributions (SSD), which estimate the probability distribution of sensitivities (usually log (\(E{C}_{50}\))) which different taxa have toward one chemical [58, 93]. Since sensitivities are partly phylogenetically conserved [34, 35, 70], hSSD models use the relatedness among taxa to predict sensitivities [118]. We employed the hSSD model (version 122b) proposed by Craig [20] and described in Sinclair et al. [107]. This is a Bayesian model that uses a Markov chain Monte Carlo (MCMC) method to sample from a distribution representing uncertainty about the sensitivity of taxa in the total species pool, taking into account the available toxicity data and the taxonomic relatedness of species tested and to be predicted.

We trained the hSSD model on acute toxicity data from the US EPA ECOTOXicology Knowledgebase ([83], available at http://www.epa.gov/ecotox/). The toxicity data consisted of \(E{C}_{50}\) (immobility) or \(L{C}_{50}\) values for aqueous exposure with durations of 1–7 days. A total of 2197 unique taxa were included in the dataset, where sensitivity data were available for 59 and 33 taxa for copper and imidacloprid, respectively. These training taxa included insect, annelid, and mollusk species (see supplementary materials for a list of trained species). Using the parameter values estimated in the model training, we predicted the log \(E{C}_{50}\) for all untested taxa in our assemblages (Fig. 2). We estimated model parameters with a Metropolis within blocks Gibbs approach, an MCMC algorithm, and used the taxonomic levels genus, family, order, class, and phylum. The MCMC had a burn-in of 10000 (copper) or 20000 (imidacloprid) runs and the predicted log (\(E{C}_{50}\)) values were calculated from 30000 (copper) or 50000 (imidacloprid) samples drawn with a thinning of 15. We used more samples for imidacloprid as this increased the number of species with a stationary posterior distribution.

Fig. 2 Workflow of the analysis. We used hierarchical species sensitivity distribution models (hSSD) to predict the sensitivity of 2197 taxa toward copper and imidacloprid. After removing taxa for which no reliable prediction could be made because the posterior distribution of the log (EC50) did not reach a stationary state, we fit log-normal distributions to the predicted log (EC50) values of each observed macroinvertebrate assemblage. Given that the log-normal was a reasonable approximation of the empirical distribution of log (EC50) values, we determined the assemblage HC5 as the fifth percentile of the fitted distribution. We evaluated the accuracy of predictions using leave-one-out cross-validation. Iteratively, each training taxon was removed from the training data and an hSSD model trained on the remaining taxa was used to predict the omitted taxon’s LC50

We removed all taxa for which a Heidelberger–Welch test [39] indicated that the posterior was non-stationary (\(\alpha =0.05\)), as estimates for such taxa are unreliable. Removing those taxa reduced the total number of taxa from 2197 to 2192 and 1361 for copper and imidacloprid, respectively. There was no systematic difference between the \(log\left(E{C}_{50}\right)\) values of taxa with stationary and non-stationary posteriors for copper and a noticeable but ultimately inconsequential difference for imidacloprid (supplementary materials). The omitted taxa were unequally distributed across orders (supplementary materials). For each assemblage, we calculated the fraction of the remaining taxa. Assemblages where this fraction was lower than 50% were omitted from further analyses. This did not affect the number of assemblages for copper. For imidacloprid, this reduced the total number of assemblages from 13713 to 11590 and the number of least disturbed assemblages from 3703 to 3107.
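The 50% retention rule for assemblages can be sketched as follows. The taxon names and the set of taxa with reliable (stationary) predictions are hypothetical; the stationarity test itself is not reproduced here.

```python
def retained_fraction(assemblage_taxa, reliable_taxa):
    """Fraction of an assemblage's taxa that still have a usable prediction."""
    taxa = set(assemblage_taxa)
    if not taxa:
        raise ValueError("assemblage contains no taxa")
    return len(taxa & set(reliable_taxa)) / len(taxa)


def keep_assemblage(assemblage_taxa, reliable_taxa, min_fraction=0.5):
    """Assemblages with a fraction lower than min_fraction are omitted."""
    return retained_fraction(assemblage_taxa, reliable_taxa) >= min_fraction


# Hypothetical taxa with stationary posteriors.
reliable = {"Baetis", "Gammarus"}
site_a = ["Baetis", "Gammarus", "Chironomidae", "Hydropsyche"]  # 2 of 4 retained
site_b = ["Baetis", "Chironomidae", "Hydropsyche"]              # 1 of 3 retained
print(keep_assemblage(site_a, reliable))
print(keep_assemblage(site_b, reliable))
```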

We built assemblage SSDs by fitting log-normal distributions to the predicted log(\(E{C}_{50}\)) values of the taxa within observed assemblages. We followed the guidance in EFSA [28] and only fit SSDs to assemblages with at least eight taxa. Further, we checked the fit of the log-normal distribution with a Kolmogorov–Smirnov test [57, 108]. Assemblages with a statistically significant test (\(\alpha =0.05\)) are not well approximated by the log-normal distribution and were omitted from further analyses. This reduced the number of assemblages to 13127 (3606 least disturbed) for copper and 10815 (2935 least disturbed) for imidacloprid. The omitted sites were distributed relatively equally among river types (supplementary materials). Tables with the predicted \(E{C}_{50}\) values are available in the supplementary materials. Lastly, we predicted the concentration that would affect 5% of taxa from the assemblage (Hazard Concentration 5, \(H{C}_{5}\)) as the fifth percentile of the distribution fitted to its log(\(E{C}_{50}\)) values. The \(H{C}_{5}\) is a suitable summary statistic to express the potential effects of chemical exposure on assemblages [99]. To facilitate comparisons among chemicals, we scaled the \(H{C}_{5}\) values by dividing them by the median \(H{C}_{5}\) of the chemical and then taking the decadic logarithm of the quotient.
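The derivation of an assemblage \(H{C}_{5}\) and its scaling can be sketched with the Python standard library. This is a simplified illustration under stated assumptions: the normal distribution is fitted to log10(\(E{C}_{50}\)) values by sample mean and standard deviation, the Kolmogorov–Smirnov check is omitted, and all input values are hypothetical.

```python
import math
from statistics import NormalDist, median


def assemblage_hc5(log10_ec50, min_taxa=8):
    """HC5 as the 5th percentile of a normal distribution fitted to log10(EC50).

    Returns None for assemblages with fewer than min_taxa taxa.
    """
    if len(log10_ec50) < min_taxa:
        return None
    fitted = NormalDist.from_samples(log10_ec50)
    return 10 ** fitted.inv_cdf(0.05)


def scale_hc5(hc5_values):
    """log10 of each HC5 divided by the median HC5 of the chemical."""
    overall_median = median(hc5_values)
    return [math.log10(value / overall_median) for value in hc5_values]


# Hypothetical log10(EC50) values for one assemblage with eight taxa.
log_values = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
hc5 = assemblage_hc5(log_values)
print(hc5)

# Hypothetical HC5 values for three assemblages of one chemical.
print(scale_hc5([1.0, 10.0, 100.0]))  # zero marks the overall median
```

A scaled value of zero corresponds to the chemical's median \(H{C}_{5}\), and a value of 1 to an \(H{C}_{5}\) ten times the median.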

Detecting patterns in sensitivities

We used Cliff’s d to estimate whether sensitivities differed between river types. As the distributions of predicted \(H{C}_{5}\) values were strongly skewed and non-normal (Fig. 3), we used this non-parametric effect size estimate, which is robust toward non-normality and outliers [17] as it does not compare indicators of distribution location. Cliff’s \(d\) is the sample approximation of \(\delta\), which is the probability that a value (here \(H{C}_{5}\)) from one group (here each broad river type) is higher than a value from another group, minus the probability that it is lower (Eq. (1)).

Fig. 3 Density of assemblage hazard concentration 5 (\(H{C}_{5}\)) values for copper and imidacloprid. \(H{C}_{5}\) values outside the 95% highest density interval for the respective chemical are shaded black. Only the least-disturbed sites are included. The X-axis is log10-scaled, and the X-axis ranges vary across chemicals

$$\delta =Pr\left({x}_{i}>{x}_{j}\right)-Pr\left({x}_{i}<{x}_{j}\right)$$
(1)

This quantity is approximated from the samples as the proportion of between-group pairs in which the value from the first group is higher, minus the proportion in which it is lower (Eq. (2)).

$$d=\frac{\sum_{i=1}^{m}\sum_{j=1}^{n}\left(\left[{x}_{i}>{x}_{j}\right]-\left[{x}_{i}<{x}_{j}\right]\right)}{mn}$$
(2)

The [\(\cdot\)] are Iverson brackets, defined to take the value 1 if the contained statement is true and 0 otherwise. \(m\) and \(n\) are the respective group sizes. According to Romano et al. [96], |\(d\)| values above 0.47 strongly support group differences.
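Eq. (2) translates directly into code. The sketch below is a plain implementation of the pairwise definition, not the effsize routine used in the study, and the two groups of values are hypothetical.

```python
def cliffs_d(group_x, group_y):
    """Cliff's d: share of pairs with x > y minus share of pairs with x < y."""
    if not group_x or not group_y:
        raise ValueError("both groups must contain at least one value")
    greater = sum(1 for x in group_x for y in group_y if x > y)
    smaller = sum(1 for x in group_x for y in group_y if x < y)
    return (greater - smaller) / (len(group_x) * len(group_y))


# Hypothetical HC5 values for two river types.
type_a = [3.0, 4.0, 5.0, 6.0]
type_b = [1.0, 2.0, 3.0, 4.0]
d = cliffs_d(type_a, type_b)
print(d)              # 13 pairs higher, 1 pair lower, 16 pairs in total
print(abs(d) > 0.47)  # comparison with the heuristic threshold
```

Ties contribute to neither count, so identical groups yield a value of zero, and swapping the groups only flips the sign.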

Cliff’s \(d\) provided us with an estimate of whether \(H{C}_{5}\) values differ between groups but not with an estimate of the magnitude of these differences. To this end, we divided the median \(H{C}_{5}\) values for all pairwise combinations of river types, with the larger median in the numerator, to obtain an estimate of the factor of variation between types. To reduce the impact of the skewed distributions, we only used \(H{C}_{5}\) values within the 95% highest density interval (HDI), i.e., the smallest interval that contains 95% of the observations. ERA accommodates uncertainty in estimates, including the possibility of systematic differences between recipient ecosystems, through assessment factors. When determining regulatory acceptable concentrations with SSDs, the European Food Safety Authority (EFSA) recommends assessment factors of three to six for invertebrates [28]. Among the suggestions to choose a value within that range is to consider the quality of the toxicity data used to construct the SSD. As most of our toxicity data are predictions from the hSSD, we prefer to err on the side of caution and consider the higher assessment factor of six. Thus, differences between river types that exceed a factor of six would surpass the variation accounted for by current practices without considering other sources of variation, such as biotic interactions or the extrapolation from laboratory to field conditions.
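The HDI trimming and the factor of variation between river-type medians can be sketched as below. This is an illustrative sketch: the empirical HDI is computed by a simple sliding window and need not match the HDInterval package exactly, pooling all river types of one chemical before trimming is an assumption, and the \(H{C}_{5}\) values are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def hdi(values, mass=0.95):
    """Smallest interval containing at least `mass` of the observations."""
    ordered = sorted(values)
    n = len(ordered)
    k = max(1, math.ceil(mass * n - 1e-9))  # number of points inside the interval
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


def median_quotients(hc5_by_type, mass=0.95):
    """Quotient of the larger and smaller median HC5 for every pair of types."""
    pooled = [v for values in hc5_by_type.values() for v in values]
    low, high = hdi(pooled, mass)
    medians = {
        river_type: median([v for v in values if low <= v <= high])
        for river_type, values in hc5_by_type.items()
    }
    return {
        (a, b): max(medians[a], medians[b]) / min(medians[a], medians[b])
        for a, b in combinations(sorted(medians), 2)
    }


# Hypothetical HC5 values with one extreme outlier in RT10.
hc5_by_type = {
    "RT1": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9],
    "RT10": [3.0, 3.2, 3.4, 3.6, 3.8, 4.0, 4.2, 4.4, 4.6, 400.0],
}
quotients = median_quotients(hc5_by_type)
print(quotients)
print(all(q < 6 for q in quotients.values()))  # comparison with the factor of six
```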

Software

We conducted all analyses in R 4.3.0 [94]. For data wrangling, we used the packages tidyverse 2.0.0 [127], data.table 1.14.8 [26], and sf 1.0-12 [88]. For analyses, we used the packages vegan 2.6-4 [82], MASS 7.3-58.3 [123], effsize 0.8.1 [117] and HDInterval 0.2.4 [75]. We created visualizations with ggplot2 3.4.2 [126], tmap 3.3-3 [116], and cowplot 1.1.1 [128].

Results

Assemblage \(H{C}_{5}\) values varied by up to two and three orders of magnitude for copper and imidacloprid, respectively (Fig. 3). Within the 95% HDI, the predicted \(H{C}_{5}\) values only varied by one order of magnitude. Our cross-validation indicated a median relative error of 0.87 and 0.99 for copper and imidacloprid, respectively (supplementary materials).

The predicted assemblage \(H{C}_{5}\) values varied more strongly within than among broad river types (Fig. 4). The largest among-type differences are apparent for copper: the median scaled \(H{C}_{5}\) of very large rivers (RT1) is -0.35, i.e., approximately 45% of the overall median \(H{C}_{5}\) for copper, whereas that of highland rivers (RT10) is 0.13, i.e., approximately 1.3 times the overall median. An alternative version of Fig. 4 with log (\(H{C}_{5}\)) on the y-axis is available in the supplementary materials.
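Because the scaled values are decadic logarithms of the quotient with the overall median, they can be back-transformed by raising ten to their power. Using the rounded medians reported above as a rough check:

$$10^{-0.35}\approx 0.45,\qquad 10^{0.13}\approx 1.35,\qquad \frac{10^{0.13}}{10^{-0.35}}=10^{0.48}\approx 3.0$$

The last term approximates the factor between the two river-type medians; since it is computed from rounded values, it can deviate slightly from a factor derived from the unrounded medians.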

Fig. 4 Density distribution of scaled hazard concentration 5 (\(H{C}_{5}\)) values for both chemicals and all broad river types. Scaling was achieved by dividing \(H{C}_{5}\) values by the median \(H{C}_{5}\) for the chemical across broad river types and taking the decadic logarithm of this quotient. Values of zero thus imply that the value is equal to the chemical’s overall median, and values of 1 indicate that the value is one order of magnitude greater than the overall median. Horizontal lines within the density curves are medians. This plot shows the least-disturbed sites and values within the 95% highest density interval

The analysis of Cliff’s \(d\) confirmed this impression (Fig. 5). Differences between broad river types exceeded the heuristic threshold of 0.47 for both copper and imidacloprid. For copper, assemblages in lowland rivers (RT1-5), especially very large rivers (RT1), were more sensitive than those from mid-altitude (RT6-9), highland (RT10), and Mediterranean rivers (RT11, RT12). Across altitude levels, assemblages from calcareous rivers were more sensitive toward copper than those from siliceous rivers. For imidacloprid, all threshold exceedances included very large rivers. Their assemblages were notably less sensitive than those from mid-altitude (RT6 and RT8) and perennial Mediterranean (RT11) rivers.

Fig. 5 Differences between the assemblage hazard concentration 5 (\(H{C}_{5}\)) values of different broad river types expressed as the absolute value of Cliff’s d. X- and Y-axes give the broad river type ID (Table 1). Dark blue cells indicate the smallest differences and dark red cells mark the largest observed differences. An asterisk marks Cliff’s d values that exceed the threshold of 0.47. Values are based on the least disturbed sites only

We quantified the differences between river types by computing the quotients of river type-specific median \(H{C}_{5}\) values. All quotients were below six for both chemicals, i.e., median river type \(H{C}_{5}\) values differed by less than a factor of six. The highest quotient between median \(H{C}_{5}\) values was 3.1 (Fig. 6), which we observed for copper between very large rivers (RT1) and highland rivers (RT10).

Fig. 6 The factor of variation between median \(H{C}_{5}\) values of broad river types. The black dashed vertical lines mark factors of one and six, i.e., the lowest possible value and the upper limit for assessment factors suggested by EFSA for deriving regulatory thresholds with macroinvertebrate Species Sensitivity Distributions. Small vertical lines show individual quotients

Propagating the uncertainty we quantified in the cross-validation slightly increased the variation between river types (supplementary materials).

Discussion

We predicted the sensitivity of macroinvertebrate assemblages toward copper and imidacloprid at a European scale and compared these assemblage sensitivities among European broad river types. We found clear sensitivity differences among river types and observed the largest between-type difference in the median \(H{C}_{5}\), a factor of 3.1, for copper between very large rivers (highest sensitivity) and highland rivers (lowest sensitivity). This variation is lower than the assessment factors recommended by EFSA [28] and is thus implicitly accounted for in current practices. The assessment factors were derived by comparing \(H{C}_{5}\) values from SSDs to no or low observed effect concentrations from mesocosm studies [72]. They account for biotic interactions and the extrapolation from laboratory to field conditions, not for variation in assemblage composition. While the variation between river types did not by itself exceed the assessment factor, it adds to the already considered variation, and the total variation might surpass the assessment factor. The variation between river types we found could thus justify additional, albeit small, assessment factors. Further studies are needed to assess the need for such factors for other chemicals, primarily specifically acting ones [63], at the European scale.

Overall, sensitivity differed among broad river types but only weakly and in a chemical-dependent manner. Our results suggest that variation in macroinvertebrate assemblage sensitivity, solely due to taxonomic composition, exists at the European scale but is neither pronounced nor well captured by existing freshwater typology systems. Recently, Liang et al. [63] found pronounced spatial patterns in the sensitivity of macroinvertebrate assemblages toward different chemicals across England. While these results seem contradictory to ours, the apparent difference can be traced back to several distinctions between the studies. First, they analyzed different chemicals. Liang et al. [63] evaluated 18 compounds, of which only copper matched between their data set and ours. In addition, neonicotinoids were absent from their analysis. As spatial patterns differ between chemicals in both studies, we should be careful when extrapolating to untested chemicals. Second, they focused on the least and most sensitive assemblages instead of the median sensitivity. Thus, they aimed to answer a different question. Third, our study considers larger spatial scales. Scale dependence has been recorded for various ecological phenomena ([16]; e.g., [38]), and larger differences between broad river types may exist within regions of Europe. However, the low overall variation between observed \(H{C}_{5}\) values and the results of previous studies [98, 102] render this unlikely.

We limited our analysis to the least disturbed sites to focus the analyses on relatively unaltered biotic communities. However, we cannot exclude or gauge the potential impact of unmeasured or omnipresent stressors on these communities. The qualitative agreement between the results obtained for all sites (supplementary materials) and those obtained for the least disturbed sites could indicate a considerable discrepancy between least and minimally disturbed conditions [112] in our samples. Stressor-induced taxonomic homogenization, as has been reported for the omnipresent stressor of increasing temperature [32, 79], could have contributed to reduced differences in sensitivity. However, Liang et al. [63] analyzed sites of mixed and high water quality separately and found communities from high water quality samples to be more similar in taxonomic composition and less variable in sensitivity.

Sources of uncertainty and limitations

As is common with large-scale ecological studies, our results contain uncertainty [46, 106]. Here, it mainly stems from the limited availability of toxicity data. Our hSSD models were trained on 59 taxa for copper and 33 for imidacloprid. While internal model parameters benefit from more extensive training [20], our predicted LC50 values would have been more precise if more training data were available. We have quantified this uncertainty in cross-validation and found that it likely is of little consequence to our conclusions. Still, more toxicity data would have improved model fits and reduced the number of removed taxa.

Another source of uncertainty is our biological data. Although our dataset is one of the most comprehensive collections of European macroinvertebrate occurrences, the samples are unevenly distributed. As is common with macroinvertebrate data, the taxonomic resolution can be low, e.g., mostly at the family level for Chironomidae, potentially obscuring differences. Lastly, we combined datasets, which can introduce biases if datasets differ systematically. However, all included datasets followed the same sampling protocol (AQEM-STAR; [4]), except for one that employed a highly similar approach [56, 66]. Additionally, we considered occurrence data, which are less sensitive to variations in sampling methods than abundance data [14, 29, 45].

Comparing HC5 values among broad river types assumes that a discrete representation of space is suitable, and specifically, that the broad river types are a good representation of environmental gradients. We used the broad river types because they are the only pan-European river typology system. Alternative systems either classify regions instead of segments (e.g., [44, 76]) or extend beyond Europe [85] and thus have a lower resolution. Jupke et al. [50] and Jupke et al. [51] showed that the community composition differs nearly as much within the broad river types as among them. Larger sensitivity differences between river types are more likely if community composition differs strongly between river types [63]. While other typology systems could elicit stronger differences, additional analyses (not reported) do not support this. Overall, we have no reason to believe that any of the discussed factors introduced a systematic bias, impacting river-type comparison.

Further prospects of type-specific risk assessment

Our results lend limited support to the use of a type-specific ERA. Considering ecosystem types in ERA may still deliver more precise thresholds because bioavailability and stressor context can vary systematically among river types. The effects of a chemical on biota are determined by its bioavailable fraction, which can be considerably lower than the total load [67]. Bioavailability, i.e., the extent to which a contaminant is available for uptake by organisms, is determined by how strongly the chemical adsorbs to available surfaces, its speciation, and its degradation rate. All three factors are governed by water pH (e.g., [24, 53, 131]), temperature [54, 87], as well as size and organic carbon content of suspended solids [24, 36, 40]. Water hardness reduces the uptake of metals because the calcium cations compete for the same membrane transport proteins as the metals [43, 73, 109]. Temperature, pH, organic carbon content, and water hardness are affected by factors that are, or could easily be, implemented in river typology systems, such as altitude, bedrock geology, or dominant catchment soil type. The bioavailable fraction, and therefore the effect of a chemical, might differ between river types, even when the inherent sensitivity of the assemblages is similar.

Most aquatic ecosystems face exposure to multiple stressors at or above ecologically relevant thresholds [101, 124]. Hence, organisms are likely already in a stressed state before the exposure to the chemical(s). The simultaneous or antecedent occurrence of other, chemical or non-chemical, stressors can strongly impact a chemical’s physiological and ecological effects (e.g., [9]). The toxicity of pyrethroids increases with decreasing temperature [18, 37] and with increasing salinity [37]. Under hypoxic conditions, some metal cations occur in lower valence states (e.g., Cu+), which differ in toxicity from higher valence forms (e.g., [105]).
For example, the same levels of oxygen reduction and copper that were individually non-lethal led to a 50% mortality in the mayfly Ephoron virgo when combined [120]. These examples are by no means exhaustive (see Holmstrup et al. [41] and Steinberg [111] for reviews on these topics) but demonstrate the potential for stressor interactions. A meta-analysis of such interactions found synergistic interactions (i.e., the combined effect exceeds the sum of independent effects) in 62% of cases [59]. Conversely, models using only the dominant stressors best explained the observed effects on organisms in a study investigating the combined effects of climate change and additional stressors [78]. The prevalence and magnitude of many stressors differ between river types [8, 61, 101], and the same is true for responses of a taxon to the same stressor [1, 15, 23]. In a spatially explicit risk assessment, we may delineate likely river-type-specific combinations of stressors or chemicals. A key challenge for including stressor interactions in prospective risk assessment is the large number of possible combinations [68]. Both multiple stressor and mixture toxicity research are currently active, though poorly integrated, fields of science [84, 100]. One potential integration pathway could be identifying the most common type-specific stressor combinations. We might use available field data (e.g., [64, 103]) or a combination of high-resolution crop classification at the national [5, 10] or continental level [22, 90] and inventories of crop-specific active ingredients [47] to predict common mixtures of pesticides. Pistocchi et al. [91] took steps in this direction by predicting the concentrations and cumulative toxicities of 148 active substances throughout Europe. 
Field data or predictions on other stressors, such as nutrients [62], flow regime shifts [62], temperature [52], and salinity [60], are also available on broad spatial scales and could be used to identify common and type-specific combinations of non-chemical and chemical stressors. This approach cannot address second-order effects following the primary changes to the species composition or food web structure [30, 86, 129]. Such net effects of biotic interactions are context-dependent and currently defy accurate determination [6].

Conclusions

Current ERA practices fail to fully protect non-target organisms. One way to improve ERA might be to account for differences between recipient ecosystems in biotic and abiotic conditions. We found the differences in macroinvertebrate assemblage sensitivities to copper and imidacloprid among broad river types at a European scale to be within the uncertainty accounted for in ERA via assessment factors. Notably, spatial variation in assemblage composition was not considered in the derivation of assessment factors. Between-type variation might thus add to other sources of variation, which, in total, could exceed assessment factors. Therefore, our study provides some support for a river-type-specific risk assessment for the two chemicals studied. Additionally, our predictions build on the taxonomic composition of assemblages and do not consider potential differences in the bioavailability of toxic substances and multiple stressor contexts. Both might contribute to a higher variation in the ecological effects of chemicals between river types. Lastly, the finding of considerable differences in sensitivity rank order and magnitude of variation between chemicals indicates that the results should only be extrapolated to other chemicals after careful consideration.