Introduction

Microplastic particles (MPs; < 5000 μm) are pervasive in the aquatic environment, including ocean surface waters [54], deep ocean trenches [1], wetlands [36], lakes [13, 14] and the Arctic [20]. Exposure to MPs has been associated with various types of biological responses, including disruption of feeding [61], decreases in growth [48], tissue inflammation [43], changes in gene expression [42, 66], and decreases in reproductive success [55]. The observed adverse effects have resulted in increasing concern regarding the environmental risks of MPs, thus requiring the development and application of appropriate risk assessment and management tools [21].

To appropriately inform MPs management decisions, there is a need to establish critical threshold values at which adverse effects are likely to occur (Coffin and Weisberg, [11]). However, the development and application of such a risk management framework is currently limited because MPs are a diverse class of pollutants with varying properties (i.e., different sizes, shapes, polymer types, and chemical additives) that can influence the toxicity outcome in aquatic organisms [37, 49]. For instance, several laboratory-based studies have documented chemical effects of MPs from additives or other sorbed contaminants [17, 31, 38, 70]. Others have reported physical effects of MPs as a result of food dilution, where the volume of ingested particles creates a false sense of satiation and reduces nutrient intake [12, 47, 53], or as a result of translocation of MPs across epithelial barriers, leading to tissue inflammation and oxidative stress [39, 64].

Further complicating the development of thresholds are the discrepancies between laboratory-based exposures and environmentally relevant exposure scenarios. Most toxicity studies use monodisperse MPs (e.g., single-sized virgin beads or fragments of one polymer type) [21, 32], while ambient exposures involve a mixture of particle sizes, shapes and chemistries [7], including a high proportion of microfibers [4], as well as other chemical contaminants from parallel exposure pathways. Moreover, accurate estimates of environmental concentrations for smaller MPs (< 300 μm) remain sparse [9], and particles < 1 μm can be difficult to detect in environmental samples [50]. However, recent studies have proposed an alignment strategy to extrapolate laboratory results obtained at varying levels of polydispersity and facilitate comparisons with the polydispersity of MPs in the environment [33, 34]. This represents a promising advancement to improve environmental risk assessments for MPs.

Here, we report the outcome of an expert workshop that aimed to develop and apply a risk management framework for MPs in aquatic ecosystems. The framework was populated using a meta-analysis of the peer-reviewed literature to estimate threshold values for two different effect mechanisms, food dilution and translocation. The workshop addresses the State of California legislative mandates for enhanced MPs management [63] and advances risk assessment analysis by introducing a tiered management framework that defines four thresholds based on varying levels of MPs in surface waters and corresponding to different levels of concern and management responses. Finally, the level of confidence in the framework and each of the threshold values was evaluated along with research needs to improve confidence in these thresholds in future iterations.

Proposed microplastics management framework

The proposed framework developed for water discharge management consists of five levels of risk management concern, separated by four hazard thresholds based on different levels of confidence that MPs can cause an adverse effect to aquatic life (Fig. 1). Threshold 1 defines the level below which managers should be confident in the absence of biological effects and is limited to decisions related to monitoring needs. In contrast, Threshold 4 corresponds to a level of concern where managers can be more confident that ecosystem-level effects manifest. Thus, observations of MPs in aquatic systems in exceedance of Threshold 4 would require the implementation of stringent and potentially expensive management actions, such as limiting uses of a waterbody (e.g., fishing restrictions) or source controls (e.g., regulation of local point sources). Workshop participants recommended that Threshold 1 be implemented to establish a “protective” level below which environmental risks are low, minimizing Type II errors (i.e., concluding that there is no effect when one exists). The more “predictive” Threshold 4 is based on high-quality and reliable toxicity data reporting significant effects of MPs exposures. This minimizes Type I errors (i.e., concluding that there is an effect when none exists) to ensure that the associated management actions are appropriately justified. Thresholds 2 and 3 represent a gradation in management decisions between Thresholds 1 and 4. Although the development of a risk management framework fills a critical gap to address MP pollution in aquatic environments, other precautionary measures to reduce sources of MPs (e.g., product bans) should be conducted in parallel for effective MPs management.

Fig. 1 Proposed tiered management framework to implement health-based thresholds for microplastics

Methods

Approach to derive health-based thresholds

The approach developed and applied here to calculate the four thresholds is based on the use of a species sensitivity distribution (SSD) [3, 51, 59]. This probabilistic approach, often used to develop water quality criteria, summarizes ecotoxicological data for various species and taxa in order to compare interspecies sensitivity to a specific contaminant [16, 44]. Because they integrate a large set of toxicity data for all species exposed to a contaminant of interest, SSDs and the derived hazard concentration (HC) levels protective of aquatic communities are considered a pragmatic approach: not without limitations, but the best available tool so far [16].
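To make the HC concept concrete, the sketch below fits a single log-normal SSD to one toxicity value per species and reads off the concentration at which a given fraction of species is affected. It is an illustration only and does not reproduce the R-based fitting used in this study; the function name, the moment-based fit, and the input values are assumptions introduced here.

```python
import math
from statistics import NormalDist, mean, stdev


def hazard_concentration(species_values, fraction=0.05):
    """Return the HCp of a log-normal SSD fitted to one value per species.

    Assumption: the log10-transformed values are normally distributed and
    the distribution is fitted by its sample mean and standard deviation.
    """
    logs = [math.log10(v) for v in species_values]
    mu = mean(logs)
    sigma = stdev(logs)
    z = NormalDist().inv_cdf(fraction)  # standard normal quantile
    return 10 ** (mu + z * sigma)


# Hypothetical chronic NOECs (particles/L), one value per species
noecs = [1.0, 10.0, 100.0, 1000.0, 10000.0]
hc5 = hazard_concentration(noecs, 0.05)
hc10 = hazard_concentration(noecs, 0.10)
hc50 = hazard_concentration(noecs, 0.50)
```

Under these assumptions the HC5 is always lower than the HC10, which is why a threshold based on the HC10 is less conservative than one based on the HC5.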

To establish threshold values reflective of the proposed “protective-to-predictive” tiered-management framework, workshop participants agreed on a suite of SSD parameters representing an increasing level of confidence in ecologically relevant effects (Table 1). Parameters include the data collapsing method, the percentage of species affected (HC), the model point estimate, and the level of biological organization of the endpoints considered. To estimate the lower threshold values (Thresholds 1 and 2), experts recommended the inclusion of endpoints at all levels of biological organization and use of the 1st quartile to summarize the data for each species. Threshold 1 (Investigative monitoring) is based on the lower 95% confidence limit of the HC5, while Threshold 2 (Discharge monitoring) is based on the median HC5. For the higher “predictive” thresholds (i.e., Thresholds 3 and 4), the recommended SSD parameters include the use of organismal- and population-level endpoints only, summarized using the median as the data collapsing method. Threshold 3 (Management planning) is set at the median HC5 and Threshold 4 (Source control measures) is set at the median HC10.
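The parameter choices described above can be summarized as a small lookup structure, together with the per-species data collapsing step. The following is a minimal sketch: the dictionary keys, the function name, the example values, and the use of the "inclusive" quartile definition are assumptions made for illustration, not details taken from Table 1.

```python
from statistics import median, quantiles

# SSD parameters per threshold, as described in the text
THRESHOLDS = {
    1: {"name": "Investigative monitoring", "collapse": "q1",
        "endpoints": "all levels", "hc_percent": 5,
        "estimate": "lower 95% confidence limit"},
    2: {"name": "Discharge monitoring", "collapse": "q1",
        "endpoints": "all levels", "hc_percent": 5,
        "estimate": "median"},
    3: {"name": "Management planning", "collapse": "median",
        "endpoints": "organismal and population", "hc_percent": 5,
        "estimate": "median"},
    4: {"name": "Source control measures", "collapse": "median",
        "endpoints": "organismal and population", "hc_percent": 10,
        "estimate": "median"},
}


def collapse_species_data(values, method):
    """Collapse all toxicity values for one species into a single value."""
    if method == "median":
        return median(values)
    if method == "q1":
        if len(values) < 2:
            return values[0]  # a single value is its own quartile
        # Assumption: "inclusive" quartile definition
        return quantiles(values, n=4, method="inclusive")[0]
    raise ValueError(f"unknown collapsing method: {method}")


# Hypothetical NOECs (particles/L) for one species
example = [5.0, 1.0, 3.0, 2.0, 4.0]
protective = collapse_species_data(example, THRESHOLDS[1]["collapse"])
predictive = collapse_species_data(example, THRESHOLDS[4]["collapse"])
```

Collapsing to the 1st quartile gives more weight to the lower (more sensitive) values for a species than collapsing to the median, which is consistent with the protective intent of Thresholds 1 and 2.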

Table 1 Species sensitivity distribution (SSD) parameters and data filters used to derive multiple thresholds as part of the tiered framework

Analyses to populate the thresholds

Effect-concentration data were collected using the Toxicity of Microplastics Explorer (ToMEx) database, built to gather peer-reviewed literature on the biological effects of different shapes and polymer types of MPs ranging between 0.001–5000 μm (Hampton et al. A, [23]). Due to uncertainties in environmental distributions for MPs smaller than 1 μm, only toxicity data between 1 and 5000 μm were included in our analysis. Each study was evaluated against a pre-defined set of 14 quality criteria for experimental set up, particle characterization and dose-response. Our criteria were adapted from those described in de Ruijter et al. [12], with some modifications (see Table S1A for more details). Out of the 162 peer-reviewed toxicity studies representing 5871 datapoints and 109 species included in the database, only 21 studies met our pre-defined set of quality criteria. A total of 290 datapoints was extracted using the following dose descriptors: no observed effect concentration (NOEC), lowest observed effect concentration (LOEC), lethal effect concentration (LCx), and percent effect concentration (ECx). Highest observed no effect concentrations (HONEC) were excluded from our analyses due to their limited reliability [2].

SSDs were constructed using the ssdtools package in R as described by Thorley and Schwarz [56]. The analyses were performed using NOECs for chronic exposures. To do so, other effect metrics (i.e., LOECs and EC50/LC50 values) were converted to NOECs using assessment factors of 2 and 10, respectively. For the conversion of acute to chronic data, an assessment factor of 10 was applied [62]. The basis for the assessment factors applied is provided in Tables S1B and S1C. Raw data used in the SSDs are provided in Table S1D. Separate SSDs were developed for two different hypothesized mechanisms of toxicity. To identify the dataset relevant to food dilution, species-dependent ingestible size ranges based on mouth opening were used as the upper limits. Studies using algal species were excluded as ingestion is not a plausible mechanism. For tissue translocation, preliminary results of a binomial logistic regression model (using 27 studies for 19 species) suggested that particles shorter than 83 μm were most relevant to trigger this effect mechanism (supplemental information S2).
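The conversion of reported effect metrics to chronic NOECs can be sketched as follows. The factors of 2 (LOEC), 10 (EC50/LC50) and 10 (acute to chronic) are those stated above; the function and variable names are hypothetical, and the assumption that the two factors multiply when both apply (e.g., an acute LC50) is ours and should be checked against Tables S1B and S1C.

```python
# Assessment factors for converting a dose descriptor to a NOEC
METRIC_FACTORS = {"NOEC": 1.0, "LOEC": 2.0, "EC50": 10.0, "LC50": 10.0}
ACUTE_TO_CHRONIC_FACTOR = 10.0


def to_chronic_noec(value, metric, exposure):
    """Convert an effect concentration to an estimated chronic NOEC.

    Assumption: the metric factor and the acute-to-chronic factor are
    applied multiplicatively when both are needed.
    """
    factor = METRIC_FACTORS[metric]
    if exposure == "acute":
        factor *= ACUTE_TO_CHRONIC_FACTOR
    elif exposure != "chronic":
        raise ValueError(f"unknown exposure type: {exposure}")
    return value / factor


# Hypothetical datapoints (particles/L)
chronic_from_loec = to_chronic_noec(100.0, "LOEC", "chronic")
chronic_from_acute_lc50 = to_chronic_noec(100.0, "LC50", "acute")
```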

Prior to incorporation of data into the SSD, data were aligned and rescaled to 1–5000 μm for two size-related dose metrics, volume for food dilution and surface area for tissue translocation, following the methods by Koelmans et al. [33] and Kooi et al. [35]. This data alignment resolved an inconsistency in available toxicity data of different levels of polydispersity, including monodisperse (i.e., same size and/or shape) versus polydisperse MPs. This approach also allowed correction for bioavailability of particles for food dilution and tissue translocation. A detailed description of the data alignment approach used can be found in the supplemental information S3.
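The idea behind rescaling can be illustrated for the simplest case, a particle number concentration measured over a limited size range. The sketch below assumes that particle numbers follow a power law in size with exponent alpha, so that the ratio of particle numbers between two size ranges has a closed form. The function name and the example alpha are hypothetical, and the sketch does not reproduce the volume, surface area, or bioavailability corrections described in S3.

```python
import math


def rescale_factor(alpha, tested_min, tested_max,
                   default_min=1.0, default_max=5000.0):
    """Ratio of particle numbers in the default vs. the tested size range.

    Assumes a number-size distribution n(x) proportional to x**(-alpha).
    Sizes are in micrometres; the tested range must have non-zero width.
    """
    if not tested_min < tested_max:
        raise ValueError("tested range must satisfy tested_min < tested_max")
    if alpha == 1.0:
        # Integral of 1/x is log(x)
        return (math.log(default_max / default_min)
                / math.log(tested_max / tested_min))
    e = 1.0 - alpha
    return ((default_max ** e - default_min ** e)
            / (tested_max ** e - tested_min ** e))


# Hypothetical example: effect data for 10-100 um particles, alpha = 2.0
factor = rescale_factor(2.0, 10.0, 100.0)
aligned_concentration = 5.0 * factor  # 5 particles/L in the tested range
```

With an exponent above 1, small particles dominate by number, so extending the lower bound from 10 μm down to 1 μm increases the equivalent particle count considerably.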

Sensitivity analyses

Three types of sensitivity analyses were conducted, focused on the alignment procedure, endpoint selection, and study selection. The first examined the variability associated with various assumptions used in the alignment method proposed by Koelmans et al. [33]. Specifically, we assessed the influence of the alpha values proposed in Kooi et al. [35], which were used to convert effects data for particles used in the laboratory to effects data for polydisperse particles as seen in the environment. We also examined the influence of the assumed limitation in bioavailability of particles (i.e., estimated range for width and length for each particle type) for food dilution and tissue translocation. The second type of sensitivity analysis examined the importance of endpoint selection (all endpoints vs. fitness vs. mortality) in the SSDs used to calculate threshold values. The third type of analysis used the leave-one-out method to remove individual studies and assess the variability in the underlying data [57]. For all three types of sensitivity analyses, the specific parameter examined was modified one at a time and its impact on the resulting SSDs was evaluated. Additional details on the different sensitivity analyses are provided in Supplemental Information S3.
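The leave-one-out procedure can be sketched as follows: the HC5 is recalculated after removing each study in turn, and the change is expressed as a percentage of the HC5 obtained with all studies. The sketch uses a single log-normal SSD and treats every value as a separate species, both of which are simplifying assumptions; the study labels and values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def hc5(values):
    """HC5 of a log-normal SSD fitted by sample mean and standard deviation."""
    logs = [math.log10(v) for v in values]
    return 10 ** (mean(logs) + NormalDist().inv_cdf(0.05) * stdev(logs))


def leave_one_out(study_values):
    """Return the baseline HC5 and the percent change when each study is dropped."""
    everything = [v for vals in study_values.values() for v in vals]
    baseline = hc5(everything)
    changes = {}
    for dropped in study_values:
        remaining = [v for study, vals in study_values.items()
                     if study != dropped for v in vals]
        changes[dropped] = 100.0 * (hc5(remaining) - baseline) / baseline
    return baseline, changes


# Hypothetical species-level NOECs (particles/L) grouped by study
data = {
    "study_A": [0.5],
    "study_B": [20.0, 50.0],
    "study_C": [200.0, 900.0],
    "study_D": [3000.0],
}
baseline, changes = leave_one_out(data)
```

In this example, dropping the study with the lowest value raises the HC5, which mirrors the observation that the most sensitive study can have a marked influence on the resulting threshold.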

Experts’ confidence in the outcomes

Experts were asked to describe their relative confidence level in the decision framework and analytical process adopted. Workshop participants engaged in critical discussions on the suitability of a tiered-management construct with four thresholds. The relative level of confidence in the calculated threshold values was also evaluated based on the amount, quality, and consistency of data. As part of this process, experts conducted a detailed review of five studies that were driving the outcome of the SSDs to ensure that data were correctly entered in the database and aligned. The confidence vote was achieved using a semi-quantitative approach similar to that used by the Intergovernmental Panel on Climate Change (IPCC [41]). Each expert anonymously rated their confidence level on a scale from 1 to 5 (very low, low, medium, high, very high). Once the votes were tallied, the experts discussed the outcome, and shared their perspectives on the appropriateness and quality of the data.

Results

Thresholds

SSD-derived thresholds were calculated using toxicity data for 14–16 species from 6 or 7 taxonomic groups, depending on the threshold level and effect mechanism (Figure S1). The estimated threshold values for food dilution ranged between 0.3 and 34 particles/L (mass equivalent of 0.05 to 6 mg/L), with 95% confidence intervals (CI) spanning over two orders of magnitude (Table 2). Results of these SSDs identified the marine bivalve Pinctada margaritifera (black lip pearl oyster) and the estuarine fish Oryzias melastigma (marine medaka) as the two most sensitive species (Figure S1). P. margaritifera, deemed to be the most sensitive, was reported to have reduced assimilation efficiency and altered energy balance upon exposure to polystyrene microbeads (6–10 μm) for 2 months [18]. In Wang et al. [60], O. melastigma exposed to 10 μm polystyrene spheres for 2 months had altered antioxidant enzyme expression and lowered circulating concentrations of sex steroids. For tissue translocation, estimated threshold values ranged between 60 and 4110 particles/L (mass equivalent between 10 and 676 mg/L), also with wide 95% CI (Table 3). O. melastigma was also identified as the most sensitive to potential tissue translocation-related effects, together with the freshwater crustacean Ceriodaphnia dubia. Studies included in the SSDs for C. dubia showed that acute exposure to polyethylene MPs (2–5 μm) had a significant impact on reproduction and survival [27, 28, 69].

Table 2 Proposed microplastics toxicity thresholds for food dilution, relevant for particle sizes between 1 and 5000 μm
Table 3 Proposed microplastics toxicity thresholds for tissue translocation, relevant for particle sizes between 1 and 83 μm

While the studies used for threshold derivation met all our pre-defined quality standards, it should be noted that none of them met all 20 quality criteria considered necessary for risk assessment by de Ruijter et al. [12]. For example, most studies used virgin MPs, did not verify background contamination, and did not confirm actual exposure concentrations (i.e., only nominal concentrations were reported). Moreover, the in-depth review of the studies driving the lower portion of the SSD curves also indicated that some of these studies poorly described MP sample preparation and may have used unreliable quantification methods.

Sensitivity analysis

The results of the sensitivity analyses revealed that the assumed distribution values for polydisperse laboratory experiments had a negligible impact on the resulting thresholds (−15 to +5%; Table 4 and supplemental information S4). The assumed values used for bioaccessibility based on shape and for estimating the environmental MP polydispersity distribution had moderate impacts on the thresholds (−89 to +32% and −87 to +289%, respectively). The results of the leave-one-out analysis implied that the most sensitive study used to derive the SSD also had a moderate influence on the thresholds calculated (−47 to +300% difference) (Figure S4). For both tissue translocation and food dilution-based thresholds, endpoint selection had the largest impact on the resulting thresholds (Table 4). When comparing SSD-derived thresholds based on all endpoints to those calculated using fitness or mortality, threshold values varied between −38 and +628,000%, with the influence largely due to using only mortality as an endpoint (supplemental information S4).

Table 4 Sensitivity analyses to assess the impact of alignment method, endpoint selection and individual studies, on the threshold values calculated

Experts’ confidence in the outcomes

Overall, workshop participants expressed high confidence in the proposed multi-tiered management framework and in the use of SSDs and data alignment calculations to derive hazard threshold values. The experts’ scores ranged between 3.0 and 5.0 for both, with a mean score of 4.2 (high) for the framework and 3.9 (high) for the analytical approach (Fig. 2A). The confidence levels in the threshold values for food dilution and tissue translocation were highly variable among experts, with individual scores ranging from 1.0 (very low) to 4.0 (high). As a result, mean scores for individual thresholds ranged between 2.4 and 3.0, with slightly higher scores for food dilution-related thresholds (Fig. 2B).

Fig. 2 Mean confidence scores for (A) the management framework and (B) threshold values. Whiskers represent the range of experts’ votes. Scoring scale: 1, very low; 2, low; 3, medium; 4, high; 5, very high

Discussion

The tiered-management framework presented here is an enhancement of an approach used by the California State Water Resources Control Board to monitor other emerging contaminants [40]. The key to the framework is a recognition that there is not a single threshold that can be adopted to manage the potential environmental risks of MPs, but rather a need for multiple tiers representing varying levels of management decisions. Tiers 1–3 promote the incremental information gathering needed to better understand the issue prior to implementing costly regulations in Tiers 4 and 5. The proposed strategy is focused on decisions regarding water quality management in aquatic habitats and is not intended to discourage upstream pollution control measures such as source reduction. The same approach could also be applied to other types of MP management concerns, such as the presence of MPs in tissues and food.

The SSD approach used here has considerable precedent for toxicity threshold development and risk assessment for chemical contaminants [6, 16, 26, 45, 58]. This approach has also been applied to derive thresholds for non-chemical stressors including nanomaterials [10, 19], nanoplastics [65], and MPs (e.g., [2, 8, 15]). Coupled with Koelmans et al.’s [33] alignment approach, this probabilistic data integration approach proved particularly useful for MPs due to the large differences in particle size and shape used in toxicity studies [49]. A key decision in employing an SSD approach is the parameterization needed to produce four different thresholds that cover the gradation between Type I and Type II error regarding toxicity. The experts quickly reached consensus on the selection of HC values, point estimates and data collapsing methods, but discussed at length the selection of endpoints. The consensus was that Thresholds 1 and 2 should include endpoints from all levels of biological organization, including effects at the molecular and cellular level. This is consistent with the evolution of toxicity testing and the recognition that sub-organismal biomarkers can serve as early indicators to prevent adverse toxicity effects [30]. For Thresholds 3 and 4, the consensus was to limit the data to organismal- and population-level effects for greater confidence in ecologically relevant effects. However, the experts did not restrict the categories of endpoints to consider, thus including data for fitness, metabolism, and behavioral endpoints. Adverse outcome pathways associated with microplastics exposure are still in development [29] and currently available data are not sufficient to determine low or medium priority endpoints. While regulatory frameworks have favored the use of fitness endpoints (e.g., growth, development, mortality) for threshold development, other categories of endpoints such as immune or behavioral changes may also lead to impaired fitness. Moreover, the use of severe effects only (e.g., mortality data) would not be appropriate for a precautionary management approach aiming to prevent impacts to aquatic life. As more information is garnered on the toxicity pathways for MPs, data selection for SSD implementation should be reconsidered.

Application of the SSD approach yielded lower thresholds for food dilution than for tissue translocation. Microplastics toxicity is known to be size-related, and may be predicted by the volume and surface area of the exposed MPs [5, 24]. Thus, it is possible that a small number of large particles will occupy a large volume in the gut, leading to reduced food intake. It should be noted that our eight threshold values generally fell within the range of SSD-derived thresholds previously reported by Everaert et al. [15] for marine species (HC5 of 33.3 particles/L, with a 95% confidence interval of 0.36–13,943 particles/L) using unaligned data, and by Koelmans et al. [33] for freshwater ecosystems (HC5 of 75.6 particles/L, with a 95% confidence interval of 11–521 particles/L) using an alignment-based method. Other studies that calculated predicted no-effect concentrations (PNECs) using unaligned data also reported values within the range of thresholds calculated in this study [2, 67]. However, all these studies proposed a single value, often applicable to a specific habitat (i.e., freshwater or marine). Here, we chose to merge all aquatic species (freshwater and marine) in the meta-analysis based on the assumption that effect mechanisms have a stronger influence on toxicity outcome than habitat, and the recognition that some species live across salinity gradients. This decision allowed the use of a larger dataset, although managers may choose to repopulate this framework with taxa specific to their geography and habitat should sufficient data be available. Overall, food dilution thresholds were within the lower range of concentrations reported in aquatic habitats, while those developed for tissue translocation were much higher than concentrations reported in the environment [22, 68]. Our findings suggest that food dilution will drive management responses and that future studies should better define the dose-response relationship for this effect mechanism.

Evaluation of experts’ uncertainties revealed a high level of confidence in the management framework and analytical approach for populating the thresholds, but relatively low confidence in the thresholds themselves. One area of concern was the lack of separation observed between Thresholds 2 and 3, which failed to reflect the increase in likelihood of impact and did not support the need for more stringent management decisions. This suggests that additional parameterization of the SSDs may be needed. The main area of uncertainty, however, stemmed from the limited data of sufficient quality and environmental relevance. Most studies included in the SSDs used spheres or fragments, despite evidence that fibers are one of the shapes most frequently detected in the environment [4]. Fibers are also believed to exert higher toxicity in comparison to other MP shapes [46, 52]. The alignment method partially addresses this concern, but a more representative underlying dataset would reduce some of these uncertainties. In addition, over 90% of studies in the database did not meet all quality criteria described by de Ruijter et al. [12]. Instead, a reduced set of quality criteria was identified, focusing on select details of exposure conditions (e.g., polymer type, shape, size, nominal concentration) and a minimum of 3 MP concentrations (excluding controls). To ensure that sufficient data were included in the SSDs, other key quality criteria critical to the assessment of dose-response relationships, such as verification of MP exposure concentrations or chemical composition of tested MPs, were not considered. These shortcomings had an influence on the relative confidence in the threshold values derived. Recommendations to improve data quality in future studies are expanded in Hampton et al. C [25]. Another concern was the absence of established toxicity pathways for MPs. While several studies reported gene expression changes, altered metabolism, oxidative stress or tissue inflammation following microplastic exposure, few provided clear relationships between these endpoints and more apical effects at the organism or population level. Studies on adverse outcome pathways that integrate responses across multiple biological scales would increase the weight of evidence and overall confidence in health-based thresholds for MPs.

Conclusions

We introduce an MPs risk management framework that identifies multiple levels of potential management action depending on MPs concentrations and associated biological effect thresholds. Included in that framework are four health-based thresholds that distinguish those management levels and a process for calculating the thresholds. While this work was done to address a California legislative mandate, it has relevance to other jurisdictions globally. More importantly, the multi-tiered framework is adaptive to new management decisions and to the incorporation of additional data as they become available. The data most needed to improve confidence in thresholds produced by the framework are those that enhance knowledge of dose-response relationships and of the effects of environmentally realistic (polydisperse) mixtures of diverse particles, along with studies that better establish adverse outcome pathways.