Background

The European Water Framework Directive (WFD) (2000/60/EC) aims to restore all surface waters to ‘good’ status, a condition that is assessed using a range of biological, hydrological, and chemical metrics. One of the chemical metrics used to assess status is an Environmental Quality Standard (EQS). Chemicals are selected for EQS derivation that are deemed to present a Europe-wide risk, and these EQS are then applied across all European Union countries. EQS represent legally binding maximum surface water concentrations and therefore drive regulatory action to reduce the concentrations of chemicals in surface waters.

Diclofenac is a human and veterinary medicine that has been widely detected in European surface waters, and which primarily enters these waters in discharges from Wastewater Treatment Plants (WWTPs) [3]. Diclofenac behaves conservatively in conventional wastewater treatment processes, with relatively low levels of removal being achieved [26, 34]. This means that long-term exposure of aquatic organisms to diclofenac in European waters certainly occurs downstream of some WWTPs. However, the quantification of actual risks to aquatic receptor organisms requires that the hazards for such receptors are appropriately assessed. An Annual Average EQS should therefore be derived using chronic ecotoxicity data on aquatic organisms exposed to diclofenac. European Technical Guidance is available that sets out the approach to follow to deliver an EQS that is scientifically robust and consistent in terms of protection goals and outcomes of compliance with existing EQS for other substances [9]. Importantly, this guidance is intended to provide clarity for both the regulator and regulated communities with respect to the processes and approaches to be applied in the derivation of an EQS, and to ensure that the potential for inconsistency in implementation and assessment across Europe is minimised.

There have been several attempts to derive an EQS for diclofenac at a national level, most notably by the German Federal Environment Agency (UBA 2011; 2014; 2017; 2018). Some individual countries have also adopted these same EQS (e.g. [33]). Unfortunately, these EQS were primarily derived using data that are considered to be unreliable, or not relevant for use in EQS setting [24, 46]. These data focussed on possible histopathological responses to diclofenac in fish, particularly in kidneys, liver, and gills, rather than on accepted population-relevant ecotoxicity ‘apical’ endpoints such as mortality, growth, and reproduction [9]. Most of these studies were not undertaken or reported in a sufficiently reliable manner for EQS derivation, while the small number of studies that were considered to be reliable failed to link histopathological responses with adverse population-relevant effects [46].

In 2019, the European Commission tasked an expert group to derive a new EQS for diclofenac, this time for application and use across the whole of Europe. This EQS has now been drafted [10] and, in contrast to previous draft dossiers, histopathological responses in fish have now been discounted. Surprisingly, however, the extensive reliable and relevant database of long-term ecotoxicity data for diclofenac, covering a wide range of different freshwater and marine aquatic species, has also been discounted, in favour of an EQS based on the outcomes of a single freshwater mesocosm study [15].

In Europe, and in common with other regulatory regimes used to derive guidelines or water quality criteria (e.g. CCME [2, 44]), the process of EQS setting starts with the identification and compilation of chronic ecotoxicity data for a substance, which are then assessed against established criteria for reliability (e.g. [23]), and for their relevance to the protection goals of an EQS. If there are sufficient laboratory data, a probabilistic species sensitivity distribution (SSD) approach is used to derive a hazard concentration. An assessment factor is then applied to account for uncertainties, for example in extrapolating from the laboratory to the field. The European Guidance states that the choice of assessment factor for the probabilistic approach is determined by factors such as the quality of the chronic ecotoxicity database (the endpoints covered, inclusion of sensitive life stages, and taxonomic and ecological diversity and representativity), knowledge of mode of action in aquatic organisms subject to long-term exposure, the statistical uncertainty of the assessment, and comparisons with field and mesocosm studies, if available.

In this paper, we describe the available long-term aquatic toxicity dataset and discuss the EC [10] derivation of an Annual Average EQS to protect aquatic organisms. We then propose an alternative EQS which complies with EC [9] guidance and uses all relevant and reliable aquatic toxicity data.

Long-term ecotoxicity data for diclofenac

The reliable and relevant chronic laboratory dataset for diclofenac covers 20 species and eight higher taxonomic groups (Additional file 1).

Data are available for effects on 72-h population growth in both freshwater and marine unicellular algae (Desmodesmus subspicatus and Dunaliella tertiolecta, respectively), although this taxonomic group appears to be relatively insensitive to diclofenac (NOEC/EC10 values of approximately 15,000 to 50,000 µg L−1).

Data on the growth of aquatic higher plants exposed to diclofenac are more variable. A single study on Azolla filiculoides [37] demonstrated minimal effects on plant growth after a 10-day exposure to concentrations of diclofenac below 10,000 µg L−1. However, Kummerova et al. [16] and Markovic et al. [20] highlight considerable differences in the inhibition of growth in Lemna minor exposed to diclofenac. Kummerova et al. [16] derived a NOEC of 10 µg L−1 after a 10-day exposure, while Markovic et al. [20] generated an EC20 value of 6280 µg L−1 after a shorter exposure duration (7 days). It is unclear why such a difference in growth inhibition caused by diclofenac exposure is reported, although the longer exposure time in the Kummerova et al. [16] study may provide at least a partial explanation. However, large differences for the same species, endpoints, and similar exposure durations also occur elsewhere in the dataset (e.g. for Daphnia magna 21-day reproduction and Oryzias latipes 28–30-day growth).

A wide array of different invertebrate groups are included in the diclofenac dataset, covering rotifers (two species), crustaceans (five species comprising freshwater filter feeders, and both freshwater and marine benthic feeders), both bivalve and gastropod molluscs (a single species each), and echinoderms (a single species). As would be expected for such a range of different taxonomic groups, with endpoints comprising inhibition of reproduction, development and growth, EC10/NOEC values cover a substantial range (5 to > 72,000 µg L−1). The most sensitive EC10/NOEC value in the invertebrate dataset is 5 µg L−1 diclofenac for 48-h inhibition of larval growth in the sea urchin Paracentrotus lividus [29].

The chronic dataset for diclofenac for fish comprises six species, covering both salmonids (Salmo trutta and Oncorhynchus mykiss) and non-salmonids (Oryzias latipes, Gasterosteus aculeatus, Danio rerio, and Cyprinus carpio). Studied endpoints include mortality, reproduction, development, and growth. Again, a relatively wide range of NOEC/EC10 values is reported for the effects of diclofenac (4.6–5000 µg L−1), even within the same species and endpoint (i.e. NOEC/EC10 for growth in Danio rerio of 11.1–5000 µg L−1). NOEC/EC10 values of below 10 µg L−1 have been derived for Oryzias latipes (14-day inhibition of reproduction NOEC of 7.1 µg L−1 [39]; 90-day jaw malformation NOEC of 7.3 µg L−1 [40]), Gasterosteus aculeatus (21- to 28-day jaw malformation NOEC of 4.6 µg L−1 [25]), and Oncorhynchus mykiss (28-day eye malformation NOEC of 5 µg L−1 [1]), with the Naslund et al. [25] G. aculeatus study producing the most sensitive reported EC10/NOEC for the full reliable and relevant chronic diclofenac dataset.

European Commission EQS for aquatic organisms based on a stream mesocosm study

The EC [10] EQS derivation for diclofenac examined the data discussed above (Additional file 1) and proposed an EQS of 0.04 µg L−1, based on an estimated 10% effect level of 0.22 µg L−1 for stickleback (Gasterosteus aculeatus) from a mesocosm study [15], divided by an Assessment Factor of 5.

This approach appears to diverge from EC [9] guidance in two main areas:

  1. Despite the availability of an extensive and comprehensive dataset of reliable and relevant long-term ecotoxicity studies for diclofenac, a deterministic assessment has been applied.

    There are more than sufficient reliable and relevant chronic aquatic toxicity data to allow a probabilistic (SSD) approach to EQS derivation for diclofenac. The application of a deterministic approach (especially one based on data derived from a mesocosm study) when such an extensive chronic toxicity dataset exists for diclofenac should only be attempted when a more statistically robust approach cannot produce a reliable result. This requires that all the available options for an SSD assessment have been thoroughly explored—including the evaluation of all potentially viable statistical models for the SSD curve, and the assessment of the effect of removing insensitive data from the upper portion of the distribution if bimodality is suspected.

  2. The single mesocosm study using diclofenac [15] has been used to directly derive the EQS, despite apparently not meeting all the reliability criteria required by the guidance for employing such an approach.

Joachim et al. [15] report on a 5-month mesocosm study with diclofenac that included exposure of caged freshwater mussels (Dreissena polymorpha) and free-living stickleback (Gasterosteus aculeatus) to nominal concentrations of 0.1, 1 and 10 µg L−1 in triplicate (with calculated ‘average effective concentrations’, based on measured values, of 0.041, 0.44 and 3.82 µg L−1). The mortality of female fish and mussels after 5 months of exposure to diclofenac is stated by EC [10] to show a concentration-related response. However, there was both very high mortality in control replicates (up to 60% mortality for fish, and 41% for mussels) and significant variability between replicate mortalities across different mesocosms (both for controls and treatments) in both species. Except at the highest exposure concentration, the degree of variability in controls and treatments overlapped significantly, rendering the reported differences between responses observed in controls and in the two lowest exposure groups highly questionable.

The EQS guidance [9] makes several references to the use of mesocosm data, including specific reliability criteria that must be satisfied if such data are to be used to directly derive an EQS (Additional file 1). A number of these criteria do not appear to be sufficiently met to allow the Joachim et al. [15] mesocosm study to be considered reliable for the direct derivation of an EQS.

First, regarding measured exposure concentrations, the full analytical results were not included in the main paper or Additional file 1, and only time-weighted ‘Average Exposure Concentrations (AEC)’ were reported. An approach for calculating these time-weighted average concentrations (the van Wijngaarden et al. (1996) AEC approach) was used by Joachim et al. [15]. This approach is specifically designed for use in a pond mesocosm that has been over-sprayed with a chemical on a single occasion at the start of a study, mirroring the field use of plant protection products. It was not designed to be applied in a continuously dosed stream mesocosm. Nevertheless, Joachim et al. [15] report AECs of 0.041 ± 0.016, 0.44 ± 0.05, and 3.82 ± 0.47 µg L−1 in the mesocosms treated with three different diclofenac concentrations. These can be compared with simple mean concentrations, which are more appropriate for a continuously dosed stream system, of 0.05–0.06, 0.43–0.49, and 3.86–4.17 µg L−1 diclofenac, respectively. However, both the AEC and simple mean concentrations mask such extremely large variations in exposure concentrations that they are both probably meaningless. For example, the measured exposure concentrations of diclofenac to which mobile organisms such as stickleback were exposed at the highest nominal concentration of 10 µg L−1 ranged from 0.14 to 7.235 µg L−1 diclofenac (a factor of over 50) over the course of the study.

In addition, the analytical results highlight that measured concentrations were < 50% of nominal in all treatments at all measurement times, even at the inlet to the mesocosms. The analytical data also include a significant number of censored results (less than the limit of quantification (LoQ)) in the lowest two exposure concentrations at both 5 and 19 m along the mesocosms. In addition, there are several sampling occasions across treatments when the diclofenac concentration apparently increased along the length of the mesocosm, which is a very unusual finding, especially given the > 50% loss compared to nominal concentrations in solutions entering the mesocosms. This was particularly pronounced on the last sampling date when reported concentrations were higher in the lower reaches of all the mesocosms. Overall, this combination of issues with the analytical measurement of exposure concentrations would usually invalidate a study for use as key data in EQS derivation.
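The contrast drawn above between time-weighted and simple mean exposure concentrations can be illustrated with a minimal Python sketch. The sampling days and most of the concentrations below are hypothetical values chosen for illustration; only the extremes of 0.14 and 7.235 µg L−1 are taken from the text. The trapezoidal weighting is a generic time-weighted average and is not the van Wijngaarden et al. (1996) AEC calculation.

```python
def arithmetic_mean(concs):
    """Simple mean of the measured concentrations."""
    return sum(concs) / len(concs)


def time_weighted_average(days, concs):
    """Generic trapezoidal time-weighted average (not the AEC method)."""
    if len(days) != len(concs) or len(days) < 2:
        raise ValueError("need at least two matched (day, concentration) pairs")
    area = 0.0
    for i in range(1, len(days)):
        interval = days[i] - days[i - 1]
        area += interval * (concs[i] + concs[i - 1]) / 2.0
    return area / (days[-1] - days[0])


def max_min_ratio(concs):
    """Ratio between the highest and lowest measured concentration."""
    return max(concs) / min(concs)


# Hypothetical sampling schedule (days) and concentrations (ug/L) for one
# treatment; only 0.14 and 7.235 are values reported in the text.
days = [0, 14, 42, 70, 112, 150]
concs = [7.235, 4.1, 0.14, 3.9, 5.2, 2.6]

simple_mean = arithmetic_mean(concs)
twa = time_weighted_average(days, concs)
spread = max_min_ratio(concs)
```

Either summary value reduces the series to a single number, whereas `spread` shows that the underlying exposure varied by a factor of more than 50, which is the concern raised in the text.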

Secondly, the degree of control mortality for freshwater mussels and fish was high in the control mesocosms, with up to 41% mortality for (caged) mussels and up to 60% mortality for stickleback across control treatments by the end of the 5-month study. This level of control response would invalidate these data if they were obtained from a laboratory study, yet they are used by EC [10] for EQS derivation in the same way as laboratory data.

Finally, regarding the statistical analyses, Joachim et al. [15] did not report an EC10 for any of the measured endpoints from the mesocosm in their paper, but the EC [10] assessment estimates an EC10 for female stickleback mortality of 0.22 µg L−1 diclofenac with a 95% confidence interval ranging over two orders of magnitude (0.0385–1.30 µg L−1). This is then used in a deterministic approach to deriving an EQS. The reasons for the high uncertainty in the female stickleback EC10 value are clear: the data are highly variable across a relatively small number of treatments (one control and three diclofenac concentrations). However, there is no evidence of a statistically significant effect on female stickleback mortality below the highest test concentration.

The mesocosm for diclofenac seems to be sufficiently reliable to be used as ‘supporting’ data, i.e. to qualitatively assess the uncertainties of the EQS derivation, and select an assessment factor, as detailed in the guidance [9]. However, issues with data variability, high control effects, and uncertain exposure metrics mean that the outcomes of this study appear to be insufficiently reliable to be used as ‘critical’ or key data for direct derivation of an EQS.

In addition, the EC [10] assessment also derives a further deterministic EQS for ‘laboratory’ data using the freshwater mussel mortality data from the mesocosm, and uses this as ‘weight-of-evidence’ to support the mesocosm-based deterministic EQS. There is no precedent (nor mention in the guidance [9]) for including the results from a mesocosm in the derivation of a laboratory data-based EQS, using either deterministic or probabilistic derivation approaches. Indeed, a mortality endpoint with the degree of control response and within-treatment variability observed in the mussel data for diclofenac would be considered unreliable for use in EQS derivation if it came from a laboratory study.

Evidence-driven EQS for aquatic organisms based on long-term toxicity data

Using the reliable and relevant chronic laboratory ecotoxicity dataset for diclofenac described earlier (Additional file 1), we derived an Annual Average EQS for diclofenac according to the approaches and guidance provided [9]. For the application of a probabilistic SSD approach, EU guidance [9] stipulates that the available reliable and relevant ecotoxicity dataset comprises at least eight higher taxonomic groups and 10 different species. The chronic dataset for diclofenac meets both these criteria, with eight higher taxonomic groups and 20 species, and is therefore sufficiently extensive to allow a probabilistic approach to EQS derivation.

EU guidance [9] highlights that an SSD based on a log-normal distribution of the data should be constructed, at least initially, and used to derive the concentration predicted to affect 5% of exposed species (the 5% hazardous concentration or HC5). Figure 1 shows the log-normal SSD curve (n = 20) for the chronic ecotoxicity dataset produced using ETX software [43]. Here we have used the most sensitive EC10 or estimated 10% effect level for each species in the dataset (Additional file 1). The HC5 for this SSD is 0.63 µg L−1, with 95% confidence limits of 0.06–3.14 µg L−1.

Fig. 1

Log-normal SSD distribution for the full reliable and relevant chronic diclofenac data (most sensitive EC10 or estimated 10% effect value for each species) (n = 20)
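As an illustration of how a log-normal HC5 is obtained, the following minimal Python sketch fits a normal distribution to log10-transformed effect values and reads off the 5th percentile. It is a simple plug-in estimate under stated assumptions: the input values are hypothetical (the 20 values in Additional file 1 are not reproduced here), the function name `lognormal_hc5` is ours, and dedicated SSD software applies small-sample corrections and reports confidence limits that this sketch omits, so its output will not match the HC5 reported above.

```python
import math
from statistics import NormalDist, mean, stdev


def lognormal_hc5(effect_values, fraction=0.05):
    """Plug-in HCp from a log-normal SSD fitted to the effect values."""
    if len(effect_values) < 2:
        raise ValueError("need at least two effect values")
    logs = [math.log10(v) for v in effect_values]
    mu = mean(logs)
    sigma = stdev(logs)  # sample standard deviation of log10 values
    z = NormalDist().inv_cdf(fraction)  # about -1.645 for the 5th percentile
    return 10 ** (mu + z * sigma)


# Hypothetical 10% effect values (ug/L), one per species, for illustration.
example_values = [2.0, 4.5, 6.0, 8.0, 40.0, 120.0, 600.0, 2500.0, 9000.0, 25000.0]
example_hc5 = lognormal_hc5(example_values)
```

Because the estimate depends on both the mean and the spread of the log-transformed data, a wide or bimodal dataset pulls the HC5 downwards even when the most sensitive values are unchanged.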

EU guidance [9] requires assessment of the SSD curve for its statistical fit to a log-normal distribution model, using a series of goodness-of-fit tests. The SSD shown in Fig. 1 fails the goodness-of-fit tests implemented in ETX (Kolmogorov–Smirnov, Anderson–Darling, and Cramér–von Mises) and there are relatively wide 95% confidence limits around the HC5 value. In addition, there appears to be a degree of bimodality in the distribution, with a group of eight values comprising the lower (most sensitive) portion of the curve (EC10/estimated 10% effect threshold = 1.7 to 8.6 µg L−1), and a group of 10 values comprising the upper (least sensitive) portion of the curve (EC10/estimated 10% effect threshold = 590 to 25,000 µg L−1). However, it is debatable whether this SSD is truly bimodal since the 40 and 120 µg L−1 data points bridge the gap between these lower (sensitive) and upper (insensitive) portions of the SSD curve.
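A goodness-of-fit check of this kind can be sketched in a few lines of Python. The example computes only the Kolmogorov–Smirnov distance between the empirical distribution of log10 effect values and the fitted normal distribution. Both datasets are hypothetical, and critical values (which depend on the sample size and on the fact that the parameters are estimated from the data) are deliberately not reproduced, so the sketch shows the statistic rather than a pass/fail decision.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_distance_lognormal(effect_values):
    """Kolmogorov-Smirnov distance between the data and a fitted log-normal."""
    logs = sorted(math.log10(v) for v in effect_values)
    n = len(logs)
    fitted = NormalDist(mean(logs), stdev(logs))
    distance = 0.0
    for i, x in enumerate(logs, start=1):
        cdf = fitted.cdf(x)
        # Compare the fitted CDF with the empirical step just above and below x.
        distance = max(distance, i / n - cdf, cdf - (i - 1) / n)
    return distance


# Hypothetical datasets (ug/L): one with two well-separated groups and one
# spread evenly on the log scale.
bimodal = [2, 3, 4, 5, 6, 2000, 3000, 4000, 5000, 6000]
even = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]

d_bimodal = ks_distance_lognormal(bimodal)
d_even = ks_distance_lognormal(even)
```

In this illustration the two-group dataset gives the larger distance, which is the pattern expected when a single log-normal curve is forced through sensitive and insensitive clusters.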

Given these potential issues with application of a log-normal distribution to the data, EU guidance [9] suggests that there then should be an attempt to fit the data to other distribution models to see if an improved fit can be achieved. The Burrlioz SSD software developed by the Commonwealth Scientific and Industrial Research Organisation (CSIRO) in Australia allows an ecotoxicity dataset to be analysed for the best fitting SSD distribution. Application of the Burrlioz methodology to this dataset shows that an inverse Weibull model provides a better fit (Fig. 2; see Footnote 1).

Fig. 2

Inverse Weibull SSD distribution for the full reliable and relevant chronic diclofenac data (most sensitive EC10 or estimated 10% effect value for each species) (n = 20)

The HC5 for this SSD is 1.6 µg L−1 diclofenac with 95% confidence limits around the HC5 of 0.88–5.6 µg L−1.

The reliable and relevant chronic ecotoxicity dataset clearly demonstrates considerable variation in sensitivity to diclofenac among the different species that have been tested. However, the split between the sensitive (lower) and insensitive (upper) portions of the SSD curves does not relate to clear differences in sensitivity to diclofenac between taxonomic groups, with the same taxonomic groups represented at both ends of the distribution. Indeed, when the entire dataset is considered, it is clear that the range of NOEC/EC10 values for some individual species also spans the distribution (e.g. L. minor and D. magna) (Additional file 1).

The potential for bimodality in an SSD distribution is discussed in EU guidance, which suggests that if there is clear evidence of a ‘break’ in the distribution between sensitive species and other species, or there is poor model fit, then the left tail of the distribution should be analysed in more detail. Such analysis may include the derivation of an HC5 using the data only from the most sensitive group and comparison of this to the entire dataset. Depending on the outcomes of such a comparison, it may be better from a statistical and ecological perspective to derive the EQS using only the most sensitive data, providing that the SSD retains a minimum of 10 datapoints. This approach is underpinned by two fundamental principles of EQS derivation (also stated in EU guidance) which emphasise that: (i) not all data have equal influence on the derivation, with so-called ‘critical’ data strongly influencing the resultant EQS and (ii) the SSD EQS derivation method should always be applied when the conditions for its use are met. Figure 3 shows an inverse Weibull SSD distribution comprising the ‘sensitive’ and ‘intermediate’ portions of the dataset (10% effect values of 1.7–120 µg L−1; n = 10) (Additional file 1). The HC5 is 1.9 µg L−1 (95% confidence limits 1.3–3.8 µg L−1), which differs little from the full dataset (1.6 µg L−1 with 95% confidence limits of 0.88–5.6 µg L−1), although the confidence limits are tighter.

Fig. 3

Inverse Weibull SSD distribution for the most sensitive reliable and relevant chronic diclofenac data (most sensitive EC10 or estimated 10% effect value for each species) (n = 10)

The sensitive portion of the distribution therefore appears to drive the probabilistic outcomes for the overall dataset, and the insensitive portion of the overall distribution clearly has a relatively minor influence on the HC5 generated.

There is a factor of 3 difference in the HC5 values estimated from the distributions shown in Figs. 1, 2 and 3, with values ranging from 0.63 to 1.9 µg L−1.

EC [9] requires application of an assessment factor (AF) of between 1 and 5 to an HC5 to account for uncertainty in the estimated threshold and to ensure that the EQS is protective of all environmental receptors likely to be exposed to diclofenac in the environment. Application of the maximum AF of 5 to the lowest calculated HC5 of 0.63 µg L−1 for the full dataset produces an EQS value of 0.126 µg L−1. This threshold would protect all species in the extensive chronic dataset (n = 20), based on their reported or estimated 10% effect values, with a ‘margin of safety’ of > 10 between the lowest 10% effect value in the dataset (1.7 µg L−1) and the EQS. An EQS of 0.126 µg L−1 would also be lower than the EC10 for female stickleback estimated by EC [10].
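The arithmetic behind this EQS can be checked with a short Python sketch. All numerical inputs are the values reported in the text; the function and variable names are ours and purely illustrative.

```python
def eqs_from_hc5(hc5, assessment_factor):
    """Divide an HC5 by an assessment factor in the 1 to 5 range."""
    if not 1 <= assessment_factor <= 5:
        raise ValueError("assessment factor for an SSD-based EQS must be 1 to 5")
    return hc5 / assessment_factor


# HC5 values (ug/L) reported for the three SSDs (Figs. 1, 2 and 3).
hc5_values = {
    "log-normal, n = 20": 0.63,
    "inverse Weibull, n = 20": 1.6,
    "inverse Weibull, n = 10": 1.9,
}

lowest_hc5 = min(hc5_values.values())
eqs = eqs_from_hc5(lowest_hc5, 5)  # 0.63 / 5 = 0.126 ug/L

lowest_effect_value = 1.7  # lowest 10% effect value in the dataset (ug/L)
margin_of_safety = lowest_effect_value / eqs

ec_stickleback_ec10 = 0.22  # EC10 estimated by EC [10] (ug/L)
eqs_below_stickleback_ec10 = eqs < ec_stickleback_ec10
```

Choosing the lowest HC5 together with the largest permitted assessment factor is the most conservative combination available from the three SSDs.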

Indicative compliance assessment for diclofenac in European surface waters

While acknowledging that, at the scientific level, the environmental concentrations of a substance should not influence the derivation and acceptance of an EQS, the indicative compliance assessment presented here provides a degree of context for the two EQS values. The European monitoring dataset contains 26,737 individual measurements, although 80% of the data points are from France ([3], Loos et al. [19], EEA [11]). Data less than the limit of quantification were set as half the limit of quantification in accordance with European Commission Directive 2009/90/EC [8]. The appropriate monitoring metric to compare with an AA EQS is, unsurprisingly, an annual average for the site under consideration. However, here we have taken a worst-case approach and calculated upper percentile values for each country: the mean of the 90th percentile values from each individual country included in the dataset is 0.090 µg L−1, and the mean of the 95th percentile values from each country is 0.157 µg L−1. The 90th and 95th percentile values for the French data set are 0.060 and 0.110 µg L−1, respectively. Weighted averages were also calculated to take account of the relative sizes of the different country-specific datasets when combined to produce the overall dataset. The weighted and unweighted means of 90th percentile values are 0.090 and 0.141 µg L−1, respectively. There are four countries with noticeably higher 90th and 95th percentile values (Hungary, Belgium (Flanders), Germany, and Austria), and if these countries are omitted from the calculations the unweighted mean of 90th percentile values is 0.090 µg L−1. The mean of the unweighted 90th percentile concentrations is therefore used to define the concentration of diclofenac in water for this exposure assessment.
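The data handling steps described above (substitution of censored results, per-country percentiles, and weighted versus unweighted means) can be sketched as follows. This is a minimal illustration under assumptions: the sample records and country labels are hypothetical, and the linear-interpolation percentile definition is our choice because the text does not state which definition was used.

```python
def substitute_censored(samples):
    """Replace results below the LoQ (value is None) with half the LoQ."""
    return [loq / 2.0 if value is None else value for value, loq in samples]


def percentile(values, p):
    """Percentile by linear interpolation between ordered values (assumed)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values supplied")
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])


def unweighted_mean(per_country_value):
    """Mean across countries, each country counting once."""
    return sum(per_country_value.values()) / len(per_country_value)


def weighted_mean(per_country_value, per_country_n):
    """Mean across countries, weighted by each country's sample count."""
    total = sum(per_country_n[c] for c in per_country_value)
    return sum(per_country_value[c] * per_country_n[c] for c in per_country_value) / total


# Hypothetical records: (measured value in ug/L or None if < LoQ, LoQ in ug/L).
raw = {
    "Country A": [(None, 0.01), (0.02, 0.01), (0.05, 0.01), (0.08, 0.01)],
    "Country B": [(0.10, 0.02), (None, 0.02), (0.40, 0.02)],
}
cleaned = {c: substitute_censored(s) for c, s in raw.items()}
p90 = {c: percentile(v, 90) for c, v in cleaned.items()}
n = {c: len(v) for c, v in cleaned.items()}
mean_p90_unweighted = unweighted_mean(p90)
mean_p90_weighted = weighted_mean(p90, n)
```

When one country contributes most of the samples, as France does here, the weighted mean is pulled strongly towards that country's percentile, which is why the two means can differ appreciably.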

Sufficient data are available for an indicative compliance assessment to be undertaken for diclofenac exposure to European aquatic organisms. The overall risk characterisation ratio for Europe, based on an exposure concentration of 0.090 µg L−1, is 0.71 if the EQS is set at 0.126 µg L−1 and 2.25 if it is set at 0.04 µg L−1 (the EC assessment). However, this overall estimate of the risk does not consider the differences in exposure levels between different regions. The proportion of samples available from each different regulatory organisation that has reported concentrations of 0.13 µg L−1 or greater is summarised in Table 1 to provide an indication of the potential levels of compliance with an EQS for diclofenac across different regions.
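The risk characterisation ratios quoted above follow directly from dividing the exposure concentration by each candidate EQS, as the following short Python check shows. The numerical inputs are those given in the text; the function name is ours.

```python
def risk_characterisation_ratio(exposure, eqs):
    """Exposure concentration divided by the EQS; values above 1 indicate risk."""
    if eqs <= 0:
        raise ValueError("EQS must be positive")
    return exposure / eqs


exposure = 0.090  # mean of 90th percentile concentrations (ug/L)

rcr_this_study = risk_characterisation_ratio(exposure, 0.126)  # about 0.71
rcr_ec = risk_characterisation_ratio(exposure, 0.04)  # 2.25
```

The same exposure estimate therefore falls on opposite sides of the compliance threshold of 1 depending on which of the two EQS values is applied.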

Table 1 Summary of the percentage of freshwater monitoring samples exceeding the EQS for diclofenac in freshwater derived in this study

The extent of monitoring is highly variable between different countries, with France reporting a dataset of 21,472 samples, while Croatia, Cyprus, Denmark, Estonia, Iceland, Ireland, Luxembourg, and Wales each report results for fewer than 10 samples. Germany, Luxembourg, Flanders, Hungary, and Iceland all report more than 30% of samples with diclofenac concentrations > 0.13 µg L−1, although Luxembourg and Iceland both report very small datasets, whereas Flanders has a relatively extensive monitoring dataset including over 1000 samples.

The generic monitoring concentration used in this paper is a reasonable worst-case regional concentration for Europe because it has been calculated as the mean of the 90th percentile concentrations from several different European countries. However, higher concentrations could be encountered locally where there are specific emission sources, such as major hospitals. Country-specific assessments of potential compliance with the EQS derived in the present study suggest that levels of non-compliance could be relatively high in some regions, such as Germany and Flanders, whereas potential non-compliance in France, the country with the most extensive monitoring dataset for diclofenac, is < 5%.

The region-specific indicative compliance is based on a face value comparison of the concentrations of diclofenac reported in individual spot samples against the proposed EQS for diclofenac in the water column expressed as an annual average. Furthermore, the extent to which region-specific monitoring has been targeted at those sites most likely to be receiving diclofenac exposures, or has instead been aimed at providing an overall indication of country-wide exposures, is unknown and likely to vary between different regions. This means that making robust comparisons of the potential compliance situation between different regions is challenging.

The cumulative frequency distributions of the reported monitoring data, based on individual sample results, are shown in Fig. 4 for Flanders, France, and Germany. Our proposed EQS for diclofenac and that of the EC [10] of, respectively, 0.126 and 0.04 µg L−1 are also indicated for reference. This indicative compliance assessment is based on a face value assessment against individual samples rather than annual average concentrations calculated from regular samples collected over the course of a year or more (as the WFD stipulates). The dataset for Flanders includes 1025 samples covering 84 sites and collected over a 5-year period. The dataset for France includes 21,472 samples covering 1827 sites and collected over a 3-year period. The dataset for Germany includes 233 samples covering 24 sites and collected over a 2-year period. All samples from all three of these countries were reported as being from routine monitoring and were collected from receiving freshwaters. Regulatory monitoring programmes are routinely targeted towards the most potentially problematic sites, and this may be the situation with the datasets for Flanders and Germany, both of which have far fewer sampling sites, and higher diclofenac concentrations, than the dataset from France.

Fig. 4

Cumulative frequency distribution of the diclofenac monitoring data from Flanders, France and Germany compared to the potential EQSs
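A cumulative frequency comparison of this kind can be reproduced from individual sample results with a minimal Python sketch. The concentrations below are hypothetical; the two thresholds are the EQS values discussed in the text. Exceedance is counted here as strictly greater than the threshold, which is an assumption on our part.

```python
def cumulative_frequency(concs):
    """Return (concentration, cumulative fraction of samples) pairs."""
    ordered = sorted(concs)
    n = len(ordered)
    return [(c, (i + 1) / n) for i, c in enumerate(ordered)]


def percent_exceeding(concs, threshold):
    """Percentage of individual samples strictly above the threshold."""
    if not concs:
        raise ValueError("no samples supplied")
    return 100.0 * sum(1 for c in concs if c > threshold) / len(concs)


# Hypothetical spot-sample concentrations (ug/L) for one region.
samples = [0.01, 0.02, 0.03, 0.05, 0.06, 0.09, 0.15, 0.30]

curve = cumulative_frequency(samples)
exceed_this_study = percent_exceeding(samples, 0.126)
exceed_ec = percent_exceeding(samples, 0.04)
```

As in the assessment itself, this compares spot samples with a threshold expressed as an annual average, so the resulting percentages are indicative only.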

Data on concentrations of diclofenac in European surface waters suggest that there are potential risks to aquatic receptors. It would therefore be prudent to monitor diclofenac concentrations in those surface waters known to receive high concentrations of diclofenac from WWTPs, as well as at appropriate reference sites which are not directly impacted by major local wastewater discharges. Such data, combined with appropriate ecological monitoring, can provide field validation or benchmarking of the derived EQS (e.g. [27]).

Conclusions

The accepted guidance for European EQS setting [9] states that, provided there are sufficient reliable and relevant laboratory data, a probabilistic species sensitivity distribution approach should be used to derive the hazard concentration underpinning the EQS derivation. Reversion to a deterministic approach should only be undertaken when the SSD approach cannot produce a reliable outcome. The assessment of such reliability should include the evaluation of all potentially viable statistical models for the SSD curve and, if bimodality is suspected, of the effect of removing insensitive data on the HC5 value produced.

An extensive hazard dataset of reliable and relevant long-term ecotoxicity data for diclofenac is available covering 20 species and 8 higher taxonomic groups. This dataset is more than sufficient for the adoption of a probabilistic approach to the derivation of an EQS for diclofenac. We have reviewed this dataset for diclofenac and developed updated SSDs. A log-normal distribution for the full dataset (n = 20) (Fig. 1) does not meet all of the goodness-of-fit tests specified in the accepted guidance [9], indicating that the dataset may be a relatively poor fit to the log-normal model. Further assessment indicates that an inverse Weibull model fits the distribution of data relatively well (Fig. 2). Both SSDs also highlight marginal bimodality in the distribution, although a small number of intermediate datapoints apparently bridge the groups of sensitive and insensitive data. However, removal of the insensitive data from the SSD (Fig. 3) shows that the HC5 for the entire dataset is primarily driven by the most sensitive data (n = 10).

The HC5 values from the three SSD curves (Figs. 1, 2 and 3) range from 0.63 to 1.9 µg L−1. Application of the maximum AF of 5 to the lowest calculated HC5 of 0.63 µg L−1 for the full dataset produces an EQS value of 0.126 µg L−1. This threshold would protect all species in the extensive chronic dataset (n = 20), based on their reported or estimated 10% effect values, with a ‘margin of safety’ of >10 between the lowest 10% effect value in the dataset (1.7 µg L−1) and the EQS.

The European Commission’s EQS for diclofenac [10] rejected the SSD approach on the basis that the SSD for the chronic diclofenac dataset displayed a clear bimodal distribution. However, based on our assessment, this bimodality appears questionable. It is also clear that the insensitive chronic ecotoxicity data for diclofenac do not exert a significant influence on the HC5 value generated by the SSD approach.

The EC [10] assessment favours a deterministic EQS (0.04 µg L−1) based on a single mesocosm study using diclofenac [15]. Unfortunately, this mesocosm study does not appear to meet all of the reliability criteria required by the accepted guidance [9] for employing a mesocosm study in the direct derivation of an EQS.

Since all substance-specific EQS are legally binding and applied EU-wide it is critically important that they are both scientifically robust and consistent in terms of protection goals. The European Technical Guidance [9] sets out the approach to follow in deriving EQS and is specifically intended to ensure delivery on these objectives. Use of this Technical Guidance [9] is binding on the European Commission and Member States, and the introduction explicitly states that “[t]he Commission intends also to use this Technical Guidance to derive the EQSs for newly identified priority substances and to review the EQSs for existing substances”. While it is clearly good practice to periodically revisit this guidance to ensure it reflects the latest science with respect to water quality standard derivation, and to take account of potentially useful alternative approaches, it is just as clearly inappropriate to disregard the current guidance in favour of a bespoke approach within specific substance assessments. Of course, there may be circumstances in which the current guidance (or specific parts of it) cannot be applied for the derivation of a specific substance EQS because the available hazard data are insufficient, or do not allow the accepted approaches to be applied. In such cases, a bespoke method of derivation may be acceptable, provided that it can be clearly demonstrated that the accepted approaches are not applicable, and that the alternative approach is as scientifically robust as those provided by the guidance. However, in the case for diclofenac, sufficient data are available to allow a probabilistic derivation of an EQS, and the guidance is clear that this approach should take precedence when this is the case. While the guidance allows for the use of mesocosm studies to deterministically derive EQS in specific circumstances, it also requires that such studies adhere to strict reliability criteria if they are to be used in this way. 
Failure to adhere to the accepted guidance (either wholly or in part) for EQS derivation is likely to lead to inconsistency between different substance-specific EQS with respect to the protection goals of the WFD, undermining trust in the EQS derivation process, and ultimately risk shifting what should be strictly technical considerations into the political arena.