Leverett et al. [1] commented on the generation of the Environmental Quality Standard (EQS) dossier for diclofenac for the European Water Framework Directive (WFD) and criticized the derived EQS value in the draft dossier. This value was derived by a sub-group of experts chaired by the Joint Research Centre (JRC) and the German Environment Agency (UBA).

Leverett et al. raised valuable points about the difficulties of deriving an EQS value for diclofenac. However, in our view, the alternative EQS they propose does not resolve these difficulties in accordance with the Technical Guidance Document on EQS [2].

As stated in the conflict of interest statement, one author (Jim Ryan) works for GlaxoSmithKline plc (GSK), a company that produces diclofenac, sells products containing diclofenac, and submits environmental risk assessments for pharmaceuticals to regulatory authorities. In fact, GSK is a leading company in the diclofenac market.

Furthermore, we would like to note that, although the authors heavily criticize the draft diclofenac dossier, four of the five authors (D. Leverett, G. Merrington, M. Crane, J. Ryan) were themselves participants of the expert group and were therefore actively involved in generating this same draft dossier. The preparatory and drafting phases of the dossier included lengthy discussions with these authors and with other experts, covering all details, in numerous meetings over several months. In our view, the criticised information in the final dossier merely reflects the ‘diverging’ views of other experts (not associated with GSK); regrettably, this is not clearly indicated in the paper by Leverett et al.

As participants of that expert group with long experience in environmental risk assessment, including the generation of EQS dossiers within the context of the WFD, the authors should know the process and the individual steps of generating such a dossier. The draft diclofenac dossier is still “work in progress” and is currently (December 2021) in the EU internal review process. In line with the European Commission’s strategy of full transparency, the draft version preceding the internal review is available on the CIRCABC website [2]. Some of the details that Leverett et al. criticize might therefore still be modified during this internal review. In addition, publicly available data that are still “work in progress” can differ from the ‘final’ dossier, which is not yet available, especially because the dossier is still being ‘peer’ reviewed by a panel of independent scientists via the EU’s Scientific Committee on Health, Environmental and Emerging Risks (SCHEER).

Leverett et al. [5] comment on the use of the mesocosm data of Joachim et al. [4].

They question the validity of the mesocosm study, claiming that the reliability criteria were not fulfilled and that statistically significant effects were seen only at the highest concentration (stickleback data). They suggest using their Species Sensitivity Distribution (SSD) approach instead. However, this topic was discussed in detail during the sub-group meetings. The majority of the experts regarded the mesocosm study as reliable and useful. Regarding the lack of statistically significant effects at lower concentrations, it was pointed out that there was a statistically significant correlation between concentration and effect. In studies with such high variability (which is normal for mesocosm studies), the traditional “hypothesis testing” that Leverett et al. employed has low statistical power and is therefore of little informative value.
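
The difference in statistical power can be illustrated with a rough calculation. The sketch below is not the analysis performed in the dossier or by Joachim et al.; it assumes normally distributed responses with a known standard deviation (a z-test approximation without multiplicity correction), equally spaced hypothetical concentration levels, and made-up effect sizes. Under these assumptions it compares the power of testing one treatment against the control with the power of testing the slope of a linear concentration-response relationship across all treatments.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def two_sided_power(ncp, alpha=0.05):
    """Power of a two-sided z-test whose statistic is shifted by ncp under the alternative."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return (1 - STD_NORMAL.cdf(z_crit - ncp)) + STD_NORMAL.cdf(-z_crit - ncp)


def pairwise_power(delta, sigma, n_per_group, alpha=0.05):
    """Power to detect a mean difference delta between control and one treatment."""
    ncp = abs(delta) / (sigma * sqrt(2 / n_per_group))
    return two_sided_power(ncp, alpha)


def trend_power(slope, sigma, levels, n_per_level, alpha=0.05):
    """Power to detect a non-zero slope of the response across all concentration levels."""
    mean_level = sum(levels) / len(levels)
    sxx = n_per_level * sum((x - mean_level) ** 2 for x in levels)
    ncp = abs(slope) * sqrt(sxx) / sigma
    return two_sided_power(ncp, alpha)


# Hypothetical design: control plus four treatments, three replicates each,
# effect of 0.5 standard deviations per concentration step.
levels = [0, 1, 2, 3, 4]
slope, sigma, replicates = 0.5, 1.0, 3

power_lowest_vs_control = pairwise_power(slope * levels[1], sigma, replicates)
power_trend = trend_power(slope, sigma, levels, replicates)
print(f"lowest treatment vs control: {power_lowest_vs_control:.2f}")
print(f"trend across all levels:     {power_trend:.2f}")
```

Under these illustrative assumptions the trend test has considerably higher power than the single comparison, because it pools the information from all treatment levels.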

As laid out in detail in Chapter 6.3.1.2 and Annex 2 of the draft dossier, the SSD displays a significant bimodal distribution [2]. The technical guidance document for deriving environmental quality standards (TGD-EQS) [1] requires the data to follow a distinct distribution, usually a normal distribution, if the SSD is used to derive the EQS. If the whole data set does not follow such a distribution, the TGD recommends constructing an SSD for the more sensitive taxonomic groups only. If the data in this second SSD are normally distributed, the resulting HC5 can be used for EQS derivation (TGD, chapter 3.3.1.2, page 44):

“If the data do not fit any distribution, the left tail of the distribution (the lowest effect concentrations) should be analysed more carefully. If a subgroup of species is particularly sensitive and, if there are sufficient data, an SSD may be constructed using only this subgroup. However, this should be underpinned if possible by some mechanistic explanation e.g. high sensitivity of certain species to this particular chemical. The SSD method should not be used in cases where there is a poor data fit to all available distributions.”

In contrast to, e.g., substances with an estrogenic mode of action like estradiol and ethinylestradiol, no clear taxonomy-related differences are found in the SSD for diclofenac. This is evident from the data collated by the expert group and analysed in the draft EQS dossier. For example, two autotrophic species (Dunaliella tertiolecta and Desmodesmus subspicatus) are at the higher end of the distribution, while duckweed (Lemna minor) is the second most sensitive species. Moreover, fish toxicity data ranged from 3.5 µg/L for Salmo trutta up to 674 µg/L for Cyprinus carpio. Consequently, it was considered that there was no ecological or taxonomic reason to use only one part of the SSD and exclude other studies, i.e., no specific sensitive species group could be established.

These results suggest that the SSD approach may not be applicable in the case of diclofenac. No mechanistic explanation for a sensitive subgroup could be identified. This is the main and scientifically sound reason why, in line with the TGD [1], the expert group suggested not to use the SSD at all for setting the EQS.
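
To illustrate why the choice of data subset matters, the following sketch fits a log-normal SSD by the method of moments and reads off the HC5 as the 5th percentile of the fitted distribution. It is a simplified illustration and not the procedure of the TGD or of the draft dossier: it ignores any small-sample correction and any goodness-of-fit test, and all toxicity values are hypothetical numbers chosen to form two clusters, not the diclofenac data.

```python
import math
from statistics import NormalDist, mean, stdev


def hc5_lognormal(concentrations):
    """5th percentile of a log-normal SSD fitted to log10-transformed values."""
    logs = [math.log10(c) for c in concentrations]
    fitted = NormalDist(mean(logs), stdev(logs))
    return 10 ** fitted.inv_cdf(0.05)


# Hypothetical chronic effect concentrations in µg/L forming two clusters.
lower_cluster = [2.0, 4.0, 6.0, 10.0, 15.0]
upper_cluster = [800.0, 1500.0, 3000.0, 6000.0, 10000.0]

hc5_all_data = hc5_lognormal(lower_cluster + upper_cluster)
hc5_lower_only = hc5_lognormal(lower_cluster)
print(f"HC5, all data:           {hc5_all_data:.2f} µg/L")
print(f"HC5, lower cluster only: {hc5_lower_only:.2f} µg/L")
```

With these made-up inputs the two HC5 estimates differ markedly, which shows how strongly the result depends on a subset selection that, according to the TGD passage quoted above, should be underpinned by a mechanistic explanation.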

In contrast, Leverett et al. [5] wrote: “However, it is debatable whether this SSD is truly bimodal since the 40 and 120 μg L−1 data points bridge the gap between these lower (sensitive) and upper (insensitive) portions of the SSD curve.”

Later in the text, the authors suggest using only the sensitive part of the SSD, without any biological explanation, citing the TGD: “not all data have equal influence on the derivation, with so-called ‘critical’ data strongly influencing the resultant EQS as stated in EU guidance document” [1]. Here, the authors evidently omit part of the cited passage. In the same paragraph (chapter 2.6.3, p. 27–28), the TGD states: “If a species sensitivity modelling approach is adopted, a distinction between critical and supporting data does not apply. This is because all the data are used in the model extrapolation and so, all the data can be regarded as critical (as long as they are reliable and relevant).”

In the second part of their commentary, Leverett et al. [5] comment on the use and interpretation of the monitoring data generated and provided by the individual member states. Here, however, the authors make some crucial but scientifically incorrect simplifications:

  • They developed the indicative compliance assessment at the country level (mean of the 90th percentiles of individual countries) instead of at the level of monitoring sites, as stipulated in Directive 2008/105/EC.

  • In this paper, the 90th percentiles (as well as the other statistical parameters) are estimated only by a substitution approach, in which values below the limit of quantification are simply set to half of the limit of quantification. This is in contrast to Merrington et al. [6], where the same authors describe substitution as a bias-prone method that should not be used in risk assessment. In addition, the paper gives no confidence intervals for the derived statistics; the possible range of the statistical parameters is therefore unknown, which reduces the robustness of the results.

  • In the collected dataset for European surface waters, one of the participating countries is overrepresented, since it accounts for about 80% of all reported samples. Although this is mentioned in the text, the paper does not consider a data scenario “evaluation without the most data-rich country” to assess the impact of this country on the final results. Instead, four countries that showed many exceedances compared with the EQS were excluded and, in a second step, analysed separately.

  • This paper evaluates the risk mainly by considering weighted and unweighted means of the 90th percentiles of measured concentrations in participating countries. The rationale for this choice is not explained or commented upon. Nor do the authors explain why higher percentiles, for instance the 95th, are not taken into consideration. Indeed, the mean of the 95th percentiles of the reporting countries is 0.157 μg/L, which exceeds both tentative Annual Average EQS (AA-EQS) values: the 0.126 μg/L derived in the paper and the provisional 0.04 μg/L of the EC draft dossier. This result therefore indicates and confirms a risk from diclofenac in EU watersheds.
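
The difference between a site-level and a country-level assessment can be sketched as follows. This is a minimal illustration, not a reanalysis of the monitoring data: the country names, site names, sample values, and limit of quantification are all hypothetical, non-detects are handled by the same substitution approach criticised above, and the site-level check is assumed to compare the annual mean per monitoring site with the provisional AA-EQS of 0.04 μg/L.

```python
from statistics import mean, quantiles

EQS = 0.04  # µg/L, provisional AA-EQS quoted in the text
LOQ = 0.01  # µg/L, hypothetical limit of quantification

# Hypothetical samples in µg/L; None marks a result below the LOQ.
SAMPLES = {
    "A": {
        "A1": [None, 0.01, 0.02, None],
        "A2": [0.02, 0.03, None, 0.01],
        "A3": [0.05, 0.06, 0.05, 0.06],
    },
    "B": {
        "B1": [None, None, 0.01, None],
        "B2": [0.01, None, None, 0.01],
    },
}


def substitute(values, fraction=0.5):
    """Replace non-detects with fraction * LOQ (the substitution approach)."""
    return [fraction * LOQ if v is None else v for v in values]


def p90(values):
    """90th percentile with linear interpolation between order statistics."""
    return quantiles(values, n=10, method="inclusive")[-1]


def sites_exceeding(samples, fraction=0.5):
    """Site-level check: sites whose annual mean exceeds the EQS."""
    return sorted(
        site
        for sites in samples.values()
        for site, values in sites.items()
        if mean(substitute(values, fraction)) > EQS
    )


def mean_country_p90(samples, fraction=0.5, exclude=()):
    """Country-level statistic: unweighted mean of per-country 90th percentiles."""
    per_country = [
        p90([v for values in sites.values() for v in substitute(values, fraction)])
        for country, sites in samples.items()
        if country not in exclude
    ]
    return mean(per_country)


print("sites exceeding the EQS:", sites_exceeding(SAMPLES))
print("mean of country P90s:   ", round(mean_country_p90(SAMPLES), 4))
```

With these made-up numbers the country-level mean stays below the threshold although one site exceeds it on its own annual mean. Calling `mean_country_p90(SAMPLES, exclude=("A",))` gives a simple leave-one-country-out check, and varying `fraction` shows how the statistics respond to the substitution rule.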

Conclusion

We agree that regulatory decisions and processes should be challenged in scientific articles, but we completely disagree with using a scientific journal to voice such a disagreement while the dossier for an EQS derivation is still under review.