A data-driven approach to identifying PFAS water sampling priorities in Colorado, United States

Barton, Kelsey E.; Anthamatten, Peter J.; Adgate, John L.; McKenzie, Lisa M.; Starling, Anne P.; Berg, Kevin; Murphy, Robert C.; Richardson, Kristy

doi:10.1038/s41370-024-00705-7

A data-driven approach to identifying PFAS water sampling priorities in Colorado, United States

Article
Open access
Published: 01 August 2024

(2024)
Cite this article

Download PDF

You have full access to this open access article

Journal of Exposure Science & Environmental Epidemiology Submit manuscript

A data-driven approach to identifying PFAS water sampling priorities in Colorado, United States

Download PDF

Kelsey E. Barton ORCID: orcid.org/0000-0001-6103-8975^1,2,
Peter J. Anthamatten³,
John L. Adgate²,
Lisa M. McKenzie²,
Anne P. Starling⁴,
Kevin Berg¹,
Robert C. Murphy⁵ &
…
Kristy Richardson¹

691 Accesses
1 Altmetric
Explore all metrics

Abstract

Background

Per and polyfluoroalkyl substances (PFAS), a class of environmentally and biologically persistent chemicals, have been used across many industries since the middle of the 20^th century. Some PFAS have been linked to adverse health effects.

Objective

Our objective was to incorporate known and potential PFAS sources, physical characteristics of the environment, and existing PFAS water sampling results into a PFAS risk prediction map that may be used to develop a PFAS water sampling prioritization plan for the Colorado Department of Public Health and Environment (CDPHE).

Methods

We used random forest classification to develop a predictive surface of potential groundwater contamination from two PFAS, perfluorooctane sulfonate (PFOS) and perfluorooctanoate (PFOA). The model predicted PFAS risk at locations without sampling data into one of three risk categories after being “trained” with existing PFAS water sampling data. We used prediction results, variable importance ranking, and population characteristics to develop recommendations for sampling prioritization.

Results

Sensitivity and precision ranged from 58% to 90% in the final models, depending on the risk category. The model and prioritization approach identified private wells in specific census blocks, as well as schools, mobile home parks, and public water systems that rely on groundwater as priority sampling locations. We also identified data gaps including areas of the state with limited sampling and potential source types that need further investigation.

Impact statement

This work uses random forest classification to predict the risk of groundwater contamination from two per- and polyfluoroalkyl substances (PFAS) across the state of Colorado, United States. We developed the prediction model using data on known and potential PFAS sources and physical characteristics of the environment, and “trained” the model using existing PFAS water sampling results. This data-driven approach identifies opportunities for PFAS water sampling prioritization as well as information gaps that, if filled, could improve model predictions. This work provides decision-makers information to effectively use limited resources towards protection of populations most susceptible to the impacts of PFAS exposure.

The Utility of Machine Learning Models for Predicting Chemical Contaminants in Drinking Water: Promise, Challenges, and Opportunities

Article Open access 17 December 2022

Guideline levels for PFOA and PFOS in drinking water: the role of scientific uncertainty, risk assessment decisions, and social factors

Article Open access 08 January 2019

Health-Risk Assessment of Arsenic and Groundwater Quality Classification Using Random Forest in the Yanchi Region of Northwest China

Article 25 November 2019

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Introduction

Per- and polyfluoroalkyl substances (PFAS) comprise several thousand chemicals that are ubiquitous in both the environment and human serum due to their wide use, environmental persistence, and potential for bioaccumulation [1,2,3,4]. PFAS have been manufactured since the 1940s for a variety of applications ranging from stain- and water-resistant consumer products to aqueous film forming foams (AFFF) used to fight fuel fires [1, 3,4,5]. PFAS are estimated to persist untransformed in water and sediment for hundreds of thousands of years [1, 3, 4]. Some longer-chain PFAS (typically defined as PFAS containing ≥6 carbons), such as perfluorooctane sulfonate (PFOS), and perfluorooctanoate (PFOA), have estimated elimination half-lives as long as 2–8 years in humans [6,7,8,9]. Exposure to certain PFAS associated with health effects including increased cholesterol, decreased infant birth weight, and increased risk of certain types of cancer [10]. Certain populations, including infants, young children, and populations disproportionately burdened by environmental pollution, may face greater risk of health impacts from PFAS exposure [11,12,13,14].

Between 2013 and 2015, the United States Environmental Protection Agency (USEPA) tested approximately 8% of U.S. public water systems for a panel of six PFAS. This effort sampled water provisions for a total of 240 million people as part of the Third Unregulated Contaminant Monitoring Rule (UCMR3) [15]. UCMR3 detected PFAS in the water of 16 million people across 33 states, including multiple water systems serving a combined 65,000 people in Fountain Valley, Colorado (El Paso County) where PFAS were detected over the 2016 USEPA health advisory (70 ng/L) [16, 17].

After the discovery of PFAS contamination in El Paso County, the Colorado Department of Public Health and Environment (CDPHE) conducted additional water testing and identified other contaminated water systems and surface water sites across the state [18,19,20]. In the CDPHE’s 2020 PFAS Sampling Project, all 71 surface water sites had detectable concentrations of PFAS and approximately 25% of the 397 treated drinking water systems had detectable concentrations of PFOS or PFOA [19]. While the CDPHE was able to collect data on the drinking water exposure of 75% of the population, many smaller and rural public water systems, and nearly all private wells, remain untested [19, 21].

Due to the ubiquitous nature of PFAS and the multitude of potential sources, prioritizing limited resources for water sampling efforts is challenging. Building on studies that aimed to develop similar prediction models and improved understanding of the dynamics driving PFAS in the environment [22,23,24,25], the goal of this project was to incorporate data on all known and potential PFAS sources, including factors that may impact groundwater vulnerability and PFAS transport in the environment, and existing PFAS water sampling data in Colorado into a PFAS groundwater contamination risk prediction map. Additional goals included evaluating the potential for exposures in disproportionately impacted communities and developing recommendations for a prioritization plan to inform future sampling efforts and resource allocation.

Methods

Data sources

This project used a variety of data sources, including PFAS water sampling results, information on known and potential PFAS sources, physical environmental characteristics, population density, and population vulnerability. Basic information on each data source can be found in Tables 1–3, with additional information included in Supplementary Tables 1, 2, and 4.

Table 1 PFAS groundwater sampling results compiled from 10 different sampling efforts used as the training data^a.

Full size table

Table 2 Potential PFAS sources, categorization approach, and sub-categories.

Full size table

Table 3 Additional predictors of contamination (non-sources).

Full size table

PFAS sampling results (training data)

Results from groundwater sampling from ten different efforts conducted between January, 2016 and April, 2022 were used for the training data [26]. These data included 1232 data points from public water systems, private wells, and monitoring wells. Data collected at sources, such as points from within military sites or corrective action sites, were excluded from the training data set in order to avoid introducing bias into the training data. Further, for wells sampled more than once, we used the highest value reported during the study period to generate conservative estimates with the mission of protecting public health. Maximum values represented 36 (2.9%) of the training data points. To investigate the potential impact of using maximum values rather than average values, we evaluated the classification status using average values and determined that only 8 of those 36 samples (0.06% of training data points) would be classified differently (Supplementary Table 5). More information on the training dataset can be found in Table 1 and Supplementary Table 4.

Samples that were below the limit of detection (LoD) were assigned the value of ½ the LoD. Ten datasets (Table 1 and Supplementary Table 4) were included in the training data, including data collected over multiple years for different purposes. Consequently, detection limits and the number of analyzed PFAS varied greatly throughout the training data. All datasets measured PFOS and PFOA, although some reported the summation data only.

Known and potential PFAS sources

To distinguish known PFAS sources from unknown PFAS sources we divided each type of point source into sub-categories where possible. For example, fire stations were divided into four sub-categories based on information collected by the CDPHE in a state-wide survey of fire departments, which asked about their use of PFAS-containing firefighting foams and their participation in the fire-fighting foam registration program [27]. Not only can detailed classifications improve model performance, but this information may also be useful for identifying important point sources for predicting PFAS contamination and determining point sources that warrant further investigation. Point sources were treated as a “distance” feature in the model; specifically, the Euclidean distance from each PFAS sampling point to its nearest neighboring point source, within each specific category, was recorded. We report the distribution of known and potential PFAS point sources in Fig. 1.

**Fig. 1: Distribution of potential and known PFAS sources in Colorado used in development of the prediction model.**

We employed calculated density rasters that show where point or line features are more or less concentrated to explore the impact of varied spatial relationships for some potential source types [28]. A density raster enables analytical evaluation of potential PFAS sources that have greater uncertainty regarding the likelihood of PFAS contamination of groundwater, such as the oil and gas industry [29,30,31]. We calculated density rasters using the kernel density tool in ArcPro with default input parameters (see Table 2 and Supplementary Table 1) [32].

Physical environmental characteristics

In addition to these point sources, we considered other explanatory variables that may predict groundwater vulnerability or shed insight into how PFAS travel through the environment. These data include elevation, soil properties, geologic features, annual average precipitation, directionality of groundwater flow, and land use (e.g., urban, agricultural, forested). Environmental characteristics were either represented as vector polygons (e.g., alluvial aquifers) or rasters (e.g., soil permeability). We present complete information on these in Table 3 and Supplementary Table 2.

Population vulnerability

We considered population vulnerability, using Colorado’s definition of a disproportionately impacted (DI) community, as a factor in development of the sampling prioritization plan. DI communities are less likely to have equitable access to healthcare and more likely to experience a higher cumulative burden of environmental and psychosocial stressors, both of which may contribute to greater susceptibility to adverse health impacts from environmental exposures [13, 33,34,35,36]. To incorporate vulnerability into the assessment, we downloaded data from the CDPHE’s EnviroScreen (2022) and used the binary variable, “DI community” at the census block group level [37].

Colorado’s Environmental Justice Act defines a disproportionately impacted (DI) community as a census block group with at more than 40% low-income households, more than 40% people of color households, or more than 40% housing cost-burdened households [38]^. These demographic variables may impact community vulnerability to environmental exposure and disease in the following ways: low-income households face disproportionate burdens of unhealthy environmental conditions including higher exposures to air and noise pollution, water contamination, toxic waste sites and unsafe conditions of the built environment (e.g., schools and homes) [13]; due to past and present policies that perpetuate structural racism (e.g., redlining) people of color are disproportionately exposed to air pollution and heavy metals, have less access to green space, and are more likely to live near hazardous waste sites [33,34,35, 39, 40]; housing cost-burdened households are more likely to face material hardships including food insecurity and medical-care hardship [36].

Statistical analysis

We employed a supervised machine learning spatial analysis technique using forest-based classification in ArcGIS Pro 2.6.3 [41,42,43]. This technique creates hundreds of decision trees which are developed using a random selection of the original data (i.e., the PFAS water sampling results), referred to as the “training data” [41,42,43]. The model was cross-validated (in iterations of 20) with the remaining original data not used in the initial model creation. Forest-based classification uses results from each decision tree in combination to predict the outcome of an unknown sample (i.e., locations where PFAS have not yet been measured in water) [43]. We used a random sample of 75% of the data to train the model and the remaining 25% for validation [43]. The validation dataset includes the same category proportions (low/moderate/high) as the complete training dataset. The following metrics were used to determine the most appropriate set of predictors and input parameters: false positive rate, true positive rate (sensitivity), true negative rate (specificity), positive predictive value (precision), and accuracy (Supplementary Table 6) [42, 44].

To generate the most stable model possible using criteria delineated above, the model was “tuned” by adjusting tree depth and number of trees as well as the predictive variable set. The number of randomly sampled variables was set to the default which is the square root of the total number of variables. The model was run in iterations of 20 to evaluate the distribution of the variable importance box plots, which show the similarity of each variable’s importance between runs; narrower boxplots indicate greater model stability [42, 43]. Because nearly 60% of the training dataset is comprised of the “low risk” category, the “compensation for sparse categories” parameter was applied to ensure that all training data categories were equally represented in the decision tree analysis. Specifically, selecting this parameter guarantees that each category is included in each tree to create balanced models [42, 43].

Once we selected the final explanatory variable set, we ran the model again using the “Train and Predict” option to generate predictions for untested locations. No data were excluded for validation in this final model run [42]. Based on the distribution of data and Colorado policies, developed from the 2016 USEPA health advisories, categories were defined as follows: 1) summation of PFOS and PFOA below 5 ng/L (“low risk”; 2) summation of PFOS and PFOA between 5 ng/L and 35 ng/L (“moderate risk”); 3) summation of PFOS and PFOA above 35 ng/L (“high risk”). The training dataset was categorized prior to training the model (Fig. 2); the model subsequently predicts a category, rather than a concentration, at unsampled locations. The number of points in each category in the training dataset are shown in Table 1. We created a one-mile point grid with ArcPro’s fish-net tool for the state of Colorado, and we ran the model to predict to each point in the grid across the state. Once predictions were assigned to the point grid, we transformed the grid to a surface (with continuous values ranging from 0 to 2 representing predicted risk of PFAS contamination) using an inverse distance weighting (IDW) model, a common and effective general interpolation technique [45]. To determine which variables had the most important influence on model predictions and accuracy, we assessed each variable’s Gini coefficient within the variable importance table [42].

**Fig. 2: The distribution and categorization of 1232 data points with actual PFOS and PFOA used to train the model.**

The explanatory variables included all known and potential PFAS sources, as well as geographic features that could impact groundwater vulnerability or PFAS transport in the environment (Tables 2 and 3). Euclidean distances from each PFAS sampling location to each PFAS point source were defined by the model and assigned the term “explanatory distance features”.

Development of a prioritization plan

Once we created the predictive map, we identified unsampled public water systems (including transient non-community systems [TNCs], non-transient non-community systems [NTNCs]), and census blocks where a high proportion of residents are private well-users. TNCs are public water systems that provide water to at least 25 people over a short period of time (e.g., gas stations and campgrounds). NTNCs are public water systems that provide water to at least 25 people for at least 6 months per year (e.g., schools and hospitals) [46]. We assigned all unsampled water systems that rely on groundwater, and private well locations determined from the Colorado Division of Water Resources dataset, a PFAS groundwater contamination risk value based on the risk surface described above. Due to the interpolation procedure, each assigned risk value was a continuous number between 0 (lowest risk) and 2 (highest risk). We developed the priority sampling list based on the potential exposure of vulnerable populations and the predicted groundwater contamination risk. Specifically, we evaluated schools with independent water systems, mobile home parks, and census blocks considered to be disproportionately impacted that were at elevated groundwater contamination risk and had higher private well density. We also considered other unsampled systems at elevated risk not covered under the Fifth Unregulated Contaminant Monitoring Rule (UCMR5) sampling [47]. Finally, we assessed data gaps and evaluated the variable importance ranking and distance to point sources to explore data and mapping needs to improve the predictive power of the model.

Results

Map diagnostics

Table 4 presents diagnostics for the selected model. This model yielded the best fit after adjusting input parameters and exploratory variables. Several non-source predictors were excluded from the final model, based on low Gini coefficients and lack of influence in the model’s prediction power; the excluded predictors were depth to water table, land use, the complete alluvial aquifer, and irrigated lands (see Table 3). Following validation analysis, the model performed best for the “low” and “high” risk categories with 85% sensitivity and 90% precision for “low” risk and 80% sensitivity and 71% precision for “high” risk. Sensitivity and precision for the “moderate” risk category were lower at 58% and 55%, respectively. The final model, run using the full training dataset (n = 1232), correctly classified 96.5% of points (Supplementary Fig. 1).

Table 4 Diagnostics developed from the validation data (i.e., 25% training data held out for validation) for the selected model.

Full size table

Variable importance

The model calculates variable importance using Gini coefficients, which rank variables based on how well each variable separates samples into classes as a means of assessing its discriminatory power [48]. Population density consistently remained the most important variable throughout model runs. Other variables that ranked in the top ten for discriminatory power included ski resorts, soil permeability class, elevation, airports (without part 139 certification for AFFF testing), airports (with part 139 certification for AFFF testing), annual average precipitation, AFFF spills, fire stations (reported possession and use) and water flow direction. While the variables listed above offered marginally more predictive power than the others (shown in Tables 2 and 3), there was not a clear pattern of potential PFAS source importance with the exception of population density, which suggests a need for additional investigation and understanding of most potential source types (Supplementary Table 3).

Sampling prioritization: drinking water

We derived sampling prioritization from the predictive map (Fig. 3) alongside information on DI communities, unsampled public water systems (including very small community water systems, TNCs and NTNCs) and private wells. With this approach, we developed recommendations to target specific schools, mobile home parks, census blocks with high proportions of private-wells, and other potentially at-risk public water systems for the CDPHE’s PFAS sampling programs [19].

We identified fifteen schools and nineteen mobile home parks with unsampled drinking water systems as having potentially elevated risk of PFOS and PFOA groundwater contamination. Three of the schools and twelve of the mobile home parks are located in DI communities. We also identified over 300 public water systems at potentially elevated risk for PFOS and PFOA groundwater contamination, most of which are very small (n = 70) or considered TNCs (n = 152) and NTNCs (n = 32). Lastly, we identified 20 priority census blocks in DI communities each containing at least 5,000 household use or domestic use private wells according to available data.

Sampling prioritization: source investigation

As noted above, the variable importance ranking indicated a need for additional investigation of many source types. In Colorado, the primary source types that have been investigated with targeted sampling include Department of Defense sites, and a limited number of fire stations, airports, corrective action sites, and wastewater treatment plants. Many source types included in this model have not undergone targeted sampling to evaluate occurrence of PFAS releases or contaminant plume delineations. We evaluated the distance metrics (i.e., the distance from each sampling point in the training dataset to the nearest point of the source type) for each source type to determine which source types had the least amount of PFAS sampling near them.

The distance variable (Supplementary Table 3) suggests that improving our understanding of historic and ongoing PFAS-containing AFFF use would aid in PFAS investigation and enable better mapping and characterization of PFAS contamination in Colorado. While the CDPHE has conducted surveys of fire stations on PFAS-containing AFFF use and possession, historic use of PFAS-containing AFFF is largely not recorded. Identifying past use cases and conducting representative sampling could help to further identify and reduce exposure.

Discussion and conclusion

The purpose of this prediction map was to inform a first-round prioritization approach that targets areas of Colorado with higher risk of PFOS and PFOA contamination and determine information that could be collected to improve the predictive power of future mapping work. From 2023 to 2025, small (between 3000 and 10,000 people) and large (>10,000 people) public water systems are required to test for PFAS during implementation of UCMR 5 [47]. This leaves over 1800 public water systems (including TNCs and NTNCs) without a testing requirement in Colorado and does not include testing of private wells [49]. Further, in 2024 the USEPA released finalized maximum contaminant levels for six PFAS, including PFOS and PFOA [50, 51]. This rule requires sampling for PFAS by public water systems but will not impact private wells or TNCs. The lack of information for private well users means that PFAS risk may continue unchecked.

Recommendations

To address these data gaps, we recommend a particular focus on smaller systems serving rural areas, mobile home parks, and schools that provide water to vulnerable or disproportionately impacted populations. We also recommend a focus on census blocks that have a high proportion of private well users and are considered DI communities.

To maintain focus on vulnerable populations, we have prioritized schools and mobile home parks for sampling and impact assessment efforts. While water systems that serve schools are considered to be NTNC systems, students and employees may spend a majority of their waking time at school, where they may consume significant amounts of drinking water. The lack of information on the temporal relationship between PFAS exposure and the onset of observable symptoms obscures the causal links. However, children are more likely to experience higher exposure per body weight than adults, and studies have found they often carry a higher PFAS body burden than adults [11, 14]. Further, many health effects associated with PFAS exposure have been shown to manifest during adolescence [11, 14, 52].

Mobile home parks often are commonly located in DI communities, providing housing for lower-income and socially vulnerable individuals. Mobile home parks often do not have the resources to provide adequate environmental services, including drinking water, storm water and wastewater drainage [53]. A nationwide study found that living in a mobile home park was negatively associated with water service reliability [54]. Further studies in California observed that mobile home parks were more likely to incur violations for health-based standards than their non-mobile home counterparts [55]. To address some of these concerns, Colorado passed a Mobile Home Park Water Quality bill in 2023 which created a water testing program, which may include testing for PFAS, for mobile home parks [56].

This work suggests that TNCs and NTNCs should be prioritized as sampling sites. While TNCs and NTNCs are often overlooked in sampling programs due to the limited exposure duration for many who utilize them, people may use these systems for their primary or secondary water source throughout the year. Key examples include employees and students who drink from water systems at churches and schools. We therefore include these system types, alongside very small public water systems (<3000) in our sampling list if they are in an area predicted to have elevated contamination risk. Care should be taken to prioritize sampling of these systems not only by predicted risk levels and DI community status, but also by assessing impacts to potentially vulnerable populations including infants and young children, and people who are pregnant or planning to become pregnant, or currently breastfeeding (e.g., evaluate age structure of community served or metrics such as proportion of people participating in the Special Supplemental Nutrition Program for Women, Infants, and Children [i.e., WIC]).

Because there are limited resources to sample private wells, resources should be focused on DI communities with a high proportion of private well users whose groundwater may be at risk of PFAS contamination. Private well owners in DI communities may not have the resources to regularly test and treat their water. CDPHE may also have resources to connect well-owners to free or reduced-cost filtration options through its emergency assistance program [19]. Therefore, a one goal of this effort was to highlight areas with high densities of private wells for targeted outreach to increase enrollment into the CDPHE’s PFAS sampling programs. Because private wells are spatially dispersed throughout Colorado [57]; increased well testing will also help to address important data gaps.

Finally, this report suggests attention be given to specific source types where limited information is available on PFAS releases. Records of PFAS use or release are absent because over decades of use these substances were not subject to regulation under various environmental regulations that monitor and regulate releases. The lack of clear patterns in variable importance for most potential PFAS source types points to a need for additional source investigation. CDPHE may explore its authority to investigate various PFAS source types. In 2018, CDPHE added PFOS and PFOA to its state hazardous constituent list. This gives CDPHE the authority to monitor for and address PFOS and PFOA at facilities subject to corrective action under the Resource Conservation and Recovery Act (RCRA). Other sources could be investigated indirectly by sampling groundwater from private wells and/or surface water likely influenced by those sources [19, 58]. At a national level, changes to Toxic Release Inventory reporting to remove the de minimis exemption, as well as designation of PFOS and PFOA as hazardous substances under Comprehensive Environmental Response, Compensation, and Liability Act (CERCLA), will improve accuracy of information available for future releases of PFAS; however, gaps will persist without investigation of historic or ongoing releases at potential sources [50, 59]. Improved understanding of release volumes of PFAS containing substances at each source location may also be used in the future as a way to rank and prioritize potential sources.

In future model iterations where the variable importance ranking provides better understanding of the predictive power of various PFAS sources, it will be important to evaluate the spatial structure of the data to consider the effects of spatial autocorrelation. Additionally, as we move towards more consistency in analytical detection capabilities and the number of PFAS analyzed, future work could incorporate time-trend analysis to assess changes in PFAS contamination patterns over time.

Model strengths

Some strengths of this model include the ability to incorporate and auto-calculate spatial factors (e.g., distance to source) into the prediction model, employ numerous and varied explanatory variables, assess which factors are most predictive of PFAS contamination through the model’s variable importance function, validate many different model iterations with varied input specifications before running the final prediction model, and visually present results in a manner that facilitates understanding and decision-making. This analytical approach has been demonstrated to be effective in projects with similar objectives [22, 24, 60,61,62], including a recent study that evaluated the effectiveness of random forest classification in comparison to logistic regression for predicting PFAS contamination in private wells in New Hampshire [23]. The authors also found that random forest classification (performed in R rather than ArcGIS Pro), performed better than logistic regression across all five PFAS evaluated [23].

Random forest classification is a particularly effective analytical tool for this project due to its ability to develop predictive maps using many different covariates without an assumed linear relationship with the outcome variable, as is the case with the data used here. Further, random forest classification has no assumptions about normalcy, yielding a model that can effectively handle non-parametric, ordinal, and categorical data [41].

Model limitations

Limitations largely stem from data sources themselves, including high LoDs in some portion of the samples, preferential sampling of PFAS at contaminated sites and across Colorado’s Front Range, inconsistency in the number of PFAS analyzed per site, and limited knowledge of occurrence, magnitude, and duration of PFAS releases at most point sources.

With respect to the range of LoDs, there are 51 samples (4% of training data) included in the training dataset which are non-detect but classified in the moderate category rather than the low category due to high LoDs and the substitution method we employed in this work. It is possible that this biases the model predictions high near some sources. For example, one dataset collected in Frisco, Colorado has LoDs of 20 ng/L and 10 ng/L for PFOA and PFOS, respectively. Eleven samples were collected in this area because of nearby sampling results indicating PFAS contamination. All eleven samples in this area came back below detection, but in the training dataset are classified as moderate risk. We do not have extensive data in this area, which is near a ski resort. This could bias the model towards the ski resort source. More information needs to be collected in areas like this to verify results for subsequent model iterations.

On the other hand, 27 of the 51 high LoD samples were collected in El Paso County in relation to investigations of AFFF release from a military base. We have hundreds of samples from this area indicating widespread contamination. The LoDs for these samples are higher compared to other datasets because they were collected in the earlier years of PFAS investigation (2016 and 2018). Given known contamination in this area, it is not as likely that the classification for these 27 samples biases model predictions. Finally, changing LoDs and differences between regulatory quantitation limits and laboratory methods have complicated our ability to assess risk in the past. As we continue to collect data via standardized methods for PFAS with increasingly lower LoDs, we will improve risk classification models and decision-making.

Random forest classification is more effective at handling preferential sampling than alternative methods [61]. It is important to ensure the training dataset (75% of the PFAS sampling results) represents adequate geographic spread and has similar proportions in each category as the full dataset. To account for differences in PFAS results data (number of PFAS analyzed and differences in detection limits) we decided to move forward with a simple aggregation of PFOS and PFOA. While this limits our ability to predict the full spectrum of PFAS contributing to contamination, it enables us to tease out point sources or geographic features that contribute to commonly detected PFAS. While there are some differences in the sources of PFOS and PFOA, there are also many similarities and these two PFAS are often found together. In 92% of the training data sampling points the detection status of PFOS and PFOA are the same. In other words, only in 8% of the samples was PFOA detected and PFOS not detected or vice versa. As Colorado continues to collect data with lower LoDs and an expanded suite of PFAS we would work to rerun this model for separate PFAS types to better understand source signatures, fate, and transport in Colorado. Finally, while a continuous-scale analysis may be more informative for identifying water systems at highest relative risk, the current data do not suggest benefits of performing regression analysis with continuous values over the type of classification we employed. The merits of classification are further supported in a review of machine learning models to predict potential groundwater contamination, which determined that models generally performed better with classification than regression [63]. While the data used in this work was better suited for classification analysis, whether to run classification or regression is dependent on the type and extent of available data.

Additional information on source types could be useful for better refining the model in the future. For example, landfills likely have a different risk of PFAS contamination based on factors such as age of waste and how the landfill is constructed [64,65,66]. Further, the depth to water table data set could be improved to better approximate well depth across Colorado and potentially improve its predictive capability. We do not have or know of a reliable method for evaluating well depths at the statewide scale. The high variability of aquifer types and characteristics across Colorado (fracture flow, alluvial, confined, unconfined, etc.), fluctuating and depleted aquifers without water-level monitoring data, and significant variability in transmissivity, additionally introduces complexity that becomes a challenge at the statewide scale. Moving forward, we have interest in undertaking a similar spatial/modeling exercise at more localized or regional scales that incorporate well depth or “uppermost groundwater aquifer” depth in heavily studied and understood aquifer systems with time-series monitoring data of water table depth. Finally, information on some PFAS source types and some potentially useful environmental predictors (e.g., groundwater age) [67], was not available for inclusion in the model.

Conclusions and future work

This map represents the first iteration of this work, which we will develop further as new data become available. The primary goal of this modeling effort was to identify data gaps and drive prioritization for subsequent rounds of PFAS sampling. With its framework in place, the CDPHE will re-run this model with larger and more comprehensive training data on an annual basis. Additional data with more consistent and lower LoDs will allow for an assessment that is more refined and improve model predictions. Further, a better understanding of releases from potential PFAS sources, along with targeted sampling in higher risk areas, will assist resource allocation efforts and improve our big-picture understanding of who is at greatest risk for PFAS exposure and health effects in Colorado.

Data availability

Data sources used in this work are described in detail in the supplemental material and on the Colorado Department of Public Health and Environment’s PFAS mapping webpage: https://cdphe.colorado.gov/pfas-mapping.

References

(ITRC) IT& RC. Per- and Polyfluoroalkyl Substances Technical and Regulatory Guidance. 2022. Available from: https://pfas-1.itrcweb.org/.
Lewis AJ, Yun X, Spooner DE, Kurz MJ, McKenzie ER, Sales CM. Exposure pathways and bioaccumulation of per- and polyfluoroalkyl substances in freshwater aquatic ecosystems: Key considerations. Sci Total Environ. 2022;822:153561.
Article CAS PubMed Google Scholar
Prevedouros K, Cousins IT, Buck RC, Korzeniowski SH. Sources, Fate and Transport of Perfluorocarboxylates. Environ Sci Technol. 2006;40:32–44.
Article CAS PubMed Google Scholar
The Stockholm Convention. The new POPs under the Stockholm Convention. 2017. Available from: http://chm.pops.int/TheConvention/ThePOPs/TheNewPOPs/tabid/2511/Default.aspx.
Glüge J, Scheringer M, Cousins IT, DeWitt JC, Goldenman G, Herzke D, et al. An overview of the uses of per- and polyfluoroalkyl substances (PFAS). Environ Sci Process Impacts. 2020;22:2345–73.
Article PubMed PubMed Central Google Scholar
Bartell SM, Calafat AM, Lyu C, Kato K, Ryan PB, Steenland K. Rate of Decline in Serum PFOA Concentrations after Granular Activated Carbon Filtration at Two Public Water Systems in Ohio and West Virginia. Environ Health Perspect. 2010;118:222–8.
Article CAS PubMed Google Scholar
Li Y, Fletcher T, Mucs D, Scott K, Lindh CH, Tallving P, et al. Half-lives of PFOS, PFHxS and PFOA after end of exposure to contaminated drinking water. Occup Environ Med. 2018;75:46–51.
Article CAS PubMed Google Scholar
Olsen GW, Burris JM, Ehresman DJ, Froehlich JW, Seacat AM, Butenhoff JL, et al. Half-Life of Serum Elimination of Perfluorooctanesulfonate,Perfluorohexanesulfonate, and Perfluorooctanoate in Retired Fluorochemical Production Workers. Environ Health Perspect. 2007;115:1298–305.
Article CAS PubMed PubMed Central Google Scholar
Xu Y, Fletcher T, Pineda D, Lindh CH, Nilsson C, Glynn A, et al. Serum Half-Lives for Short- and Long-Chain Perfluoroalkyl Acids after Ceasing Exposure from Drinking Water Contaminated by Firefighting Foam. Environ Health Perspect. 2020;128:077004.
Article CAS PubMed PubMed Central Google Scholar
Fenton SE, Ducatman A, Boobis A, DeWitt JC, Lau C, Ng C, et al. Per‐ and Polyfluoroalkyl Substance Toxicity and Human Health Review: Current State of Knowledge and Strategies for Informing Future Research. Environ Toxicol Chem. 2021;40:606–30.
Article CAS PubMed Google Scholar
Anderko L, Pennea E. Exposures to per-and polyfluoroalkyl substances (PFAS): Potential risks to reproductive and children’s health. Curr Probl Pediatr Adolesc Health Care. 2020;50:100760.
PubMed Google Scholar
Brown P. Race, Class, and Environmental Health: A Review and Systematization of the Literature. Environ Res. 1995;69:15–30.
Article CAS PubMed Google Scholar
Evans GW, Kantrowitz E. Socioeconomic Status and Health: The Potential Role of Environmental Risk Exposure. Annu Rev Public Health. 2002;23:303–31.
Article PubMed Google Scholar
Rappazzo K, Coffman E, Hines E. Exposure to Perfluorinated Alkyl Substances and Health Outcomes in Children: A Systematic Review of the Epidemiologic Literature. Int J Environ Res Public Health. 2017;14:691.
Article PubMed PubMed Central Google Scholar
United States Environmental Protection Agency (USEPA). Fact Sheets about the Third Unregulated Contaminant Monitoring Rule (UCMR 3). 2023. Available from: https://www.epa.gov/dwucmr/fact-sheets-about-third-unregulated-contaminant-monitoring-rule-ucmr-3.
Barton KE, Starling AP, Higgins CP, McDonough CA, Calafat AM, Adgate JL. Sociodemographic and behavioral determinants of serum concentrations of per- and polyfluoroalkyl substances in a community highly exposed to aqueous film-forming foam contaminants in drinking water. Int J Hyg Environ Health. 2020;223:256–66.
Article CAS PubMed Google Scholar
Bruce Finley. Drinking water in three Colorado cities contaminated with toxic chemicals above EPA limits. The Denver Post; 2016. Available from: https://www.denverpost.com/2016/06/15/colorado-widefield-fountain-security-water-chemicals-toxic-epa/.
Bruce Finley. North metro Denver groundwater contaminated with PFCs is flowing into a drinking-water system that supplies 50,000 residents. The Denver Post; 2018. Available from: https://www.denverpost.com/2018/07/12/north-metro-denver-contaminated-groundwater/.
Colorado Department of Public Health and Environment (CDPHE). Projects and programs addressing chemicals from firefighting foam and other sources. 2023. Available from: https://cdphe.colorado.gov/pfas-projects.
Grace Hood. Boulder Well Marks 1st PFC-Contaminated Water Case Outside of El Paso County [Internet]. Colorado Public Radio; 2018. Available from: https://www.cpr.org/2018/05/31/boulder-well-marks-1st-pfc-contaminated-water-case-outside-of-el-paso-county/.
Johnson TD, Belitz K, Lombard MA. Domestic Wells in the United States. United States Geological Survey (USGS); 2020. Available from: https://ca.water.usgs.gov/projects/USGS-US-domestic-wells.html.
George S, Dixit A. A machine learning approach for prioritizing groundwater testing for per-and polyfluoroalkyl substances (PFAS). J Environ Manag. 2021;295:113359.
Article CAS Google Scholar
Hu XC, Ge B, Ruyle BJ, Sun J, Sunderland EM. A Statistical Approach for Identifying Private Wells Susceptible to Perfluoroalkyl Substances (PFAS) Contamination. Environ Sci Technol Lett. 2021;8:596–602.
Article CAS PubMed PubMed Central Google Scholar
Roostaei J, Colley S, Mulhern R, May AA, Gibson JM. Predicting the risk of GenX contamination in private well water using a machine-learned Bayesian network model. J Hazard Mater. 2021;411:125075.
Article CAS PubMed Google Scholar
Salvatore D, Mok K, Garrett KK, Poudrier G, Brown P, Birnbaum LS, et al. Presumptive Contamination: A New Approach to PFAS Contamination Based on Likely Sources. Environ Sci Technol Lett. 2022;9:983–90.
Article CAS PubMed PubMed Central Google Scholar
Colorado Department of Public Health and Environment (CDPHE). PFAS Mapping. 2023. Available from: https://cdphe.colorado.gov/pfas-mapping.
Colorado Department of Public Health and Environment (CDPHE). Colorado laws and policies related to chemicals from firefighting foam and other sources. 2023. Available from: https://cdphe.colorado.gov/pfas-laws.
Esri. Understanding density analysis. 2023. Available from: https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-analyst/understanding-density-analysis.htm.
Blair BD, McKenzie LM, Allshouse WB, Adgate JL. Is reporting “significant damage” transparent? Assessing fire and explosion risk at oil and gas operations in the United States. Energy Res Soc Sci. 2017;29:36–43.
Article Google Scholar
Horwitt D, Gottlieb B, Allison G. Fracking with “Forever Chemicals” in Colorado. Physicians for Social Responsibility; 2022. Available from: https://psr.org/wp-content/uploads/2022/01/fracking-with-forever-chemicals-in-colorado.pdf.
United States Environmental Protection Agency (USEPA). Hydraulic Fracturing for Oil and Gas: Impacts from the Hydraulic Fracturing Water Cycle on Drinking Water Resources in the United States (Final Report). Washington D.C.; 2016. Report No.: EPA/600/R-16/236F. Available from: https://cfpub.epa.gov/ncea/hfstudy/recordisplay.cfm?deid=332990.
Esri. Kernel Density (Spatial Analyst). 2023. Available from: https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-analyst/kernel-density.htm.
Tessum CW, Paolella DA, Chambliss SE, Apte JS, Hill JD, Marshall JD. PM _2.5 polluters disproportionately and systemically affect people of color in the United States. Sci Adv. 2021;7:eabf4491.
Article PubMed Google Scholar
Alvarez CH. Structural Racism as an Environmental Justice Issue: A Multilevel Analysis of the State Racism Index and Environmental Health Risk from Air Toxics. J Racial Ethn Health Disparities. 2023;10:244–58.
Article PubMed Google Scholar
Shkembi A, Smith LM, Neitzel RL. Linking environmental injustices in Detroit, MI to institutional racial segregation through historical federal redlining. J Expo Sci Environ Epidemiol. 2022. Available from: https://www.nature.com/articles/s41370-022-00512-y.
Shamsuddin S, Campbell C. Housing Cost Burden, Material Hardship, and Well-Being. Hous Policy Debate. 2022;32:413–32.
Article Google Scholar
Colorado Department of Public Health and Environment (CDPHE). Colorado EnviroScreen. 2023. Available from: https://cdphe.colorado.gov/enviroscreen.
Colorado Department of Public Health and Environment (CDPHE). Environmental Justice. 2023. Available from: https://cdphe.colorado.gov/environmental-justice.
Jones DH, Yu X, Guo Q, Duan X, Jia C. Racial Disparities in the Heavy Metal Contamination of Urban Soil in the Southeastern United States. Int J Environ Res Public Health. 2022;19:1105.
Article CAS PubMed PubMed Central Google Scholar
Okorie CN, Thomas MD, Méndez RM, Di Giuseppe EC, Roberts NS, Márquez-Magaña L. Geospatial Distributions of Lead Levels Found in Human Hair and Preterm Birth in San Francisco Neighborhoods. Int J Environ Res Public Health. 2021;19:86.
Article PubMed PubMed Central Google Scholar
Breiman L. Random Forests. Mach Learn. 2001;45:5–32.
Article Google Scholar
ESRI. How Forest-based Classification and Regression works. 2023. Available from: https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/how-forest-works.htm.
ESRI. Forest-based Classification and Regression (Spatial Statistics). 2023. Available from: https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/forestbasedclassificationregression.htm.
Novakovic JDJ, Veljovic A, Ilic SS, Papic Z, Tomovic M. Evaluation of Classification Models in Machine Learning. Theory Appl Math Comput Sci. 2017;7:39–46.
Google Scholar
ESRI. How IDW works. 2023. Available from: https://pro.arcgis.com/en/pro-app/latest/tool-reference/3d-analyst/how-idw-works.htm.
United States Environmental Protection Agency (USEPA). Information about Public Water Systems. 2022. Available from: https://www.epa.gov/dwreginfo/information-about-public-water-systems.
United States Environmental Protection Agency (USEPA). Fifth Unregulated Contaminant Monitoring Rule. 2023. Available from: https://www.epa.gov/dwucmr/fifth-unregulated-contaminant-monitoring-rule.
Menze BH, Kelm BM, Masuch R, Himmelreich U, Bachert P, Petrich W, et al. A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data. BMC Bioinforma. 2009;10:213.
Article Google Scholar
Colorado Department of Public Health and Environment (CDPHE). Web Drinking Water Info. 2023. Available from: https://lookerstudio.google.com/u/0/reporting/18DpQAMm-riBo5DfqEUCgDqMspPPhu-Ul/page/q5Fz?params=%7B%22df12%22:%22include%25EE%2580%25800%25EE%2580%2580IN%25EE%2580%2580A%22,%22df5%22:%22exclude%25EE%2580%25800%25EE%2580%2580IN%25EE%2580%2580Non-Public%22,%22df24%22:%22include%25EE%2580%25800%25EE%2580%2580IN%25EE%2580%2580NITRATE%22,%22df27%22:%22include%25EE%2580%25800%25EE%2580%2580IN%25EE%2580%2580No%22%7D.
United States Environmental Protection Agency (USEPA). PFAS Strategic Roadmap: EPA’s Commitments to Action 2021-2024. 2021. Available from: https://www.epa.gov/system/files/documents/2021-10/pfas-roadmap_final-508.pdf.
United States Environmental Protection Agency (USEPA). Per- and Polyfluoroalkyl Substances (PFAS): Proposed PFAS National Primary Drinking Water Regulation. 2023. Available from: https://www.epa.gov/sdwa/and-polyfluoroalkyl-substances-pfas.
Pitter G, Zare Jeddi M, Barbieri G, Gion M, Fabricio ASC, Daprà F, et al. Perfluoroalkyl substances are associated with elevated blood pressure and hypertension in highly exposed young adults. Environ Health. 2020;19:102.
Article CAS PubMed PubMed Central Google Scholar
State of Vermont Agency of Natural Resources. Manufactured Housing Community Solutions. 2023. Available from: https://anr.vermont.gov/special-topics/arpa-vermont/manufactured-housing-community-solutions.
Pierce G, Jimenez S. Unreliable Water Access in U.S. Mobile Homes: Evidence From the American Housing Survey. Hous Policy Debate. 2015;25:739–53.
Article Google Scholar
Pierce G, Gonzalez SR. Public Drinking Water System Coverage and Its Discontents: The Prevalence and Severity of Water Access Problems in California’s Mobile Home Parks. Environ Justice. 2017;10:168–73.
Article Google Scholar
Colorado General Assembly. Mobile Home Park Water Quality. 2023. Available from: https://leg.colorado.gov/bills/hb23-1257.
Colorado Department of Natural Resources. DWR Well Permit Research. 2023. Available from: https://maps.dnrgis.state.co.us/dwr/Index.html?viewer=dwrwellpermit.
Colorado Department of Public Health and Environment (CDPHE). Addition of Perfluorooctanoic acid (PFOA) and its anion perfluorooctanoate, and Perfluorooctane sulfonic acid (PFOS) and its anion, perfluorooctane sulfonate, to the Part 261, Appendix VIII List of Hazardous Constituents. Solid and Hazardous Waste Commission/Hazardous Materials and Waste Management Division; 2018. Report No.: 6 CCR 1007-3. Available from: https://www.sos.state.co.us/CCR/Upload/AGORequest/AdoptedRules02018-00017.rtf.
United States Environmental Protection Agency (USEPA). EPA Proposes Rule to Enhance Reporting of PFAS Data to the Toxics Release Inventory. 2022. Available from: https://www.epa.gov/newsreleases/epa-proposes-rule-enhance-reporting-pfas-data-toxics-release-inventory.
Chen W, Xie X, Wang J, Pradhan B, Hong H, Bui DT, et al. A comparative study of logistic model tree, random forest, and classification and regression tree models for spatial prediction of landslide susceptibility. CATENA. 2017;151:147–60.
Article Google Scholar
Mi C, Huettmann F, Guo Y, Han X, Wen L. Why choose Random Forest to predict rare species distribution with few samples in large undersampled areas? Three Asian crane species models provide supporting evidence. PeerJ. 2017;5:e2849.
Article PubMed PubMed Central Google Scholar
Zabihi M, Pourghasemi HR, Pourtaghi ZS, Behzadfar M. GIS-based multivariate adaptive regression spline and random forest models for groundwater potential mapping in Iran. Environ Earth Sci. 2016;75:665.
Article Google Scholar
Hu XC, Dai M, Sun JM, Sunderland EM. The Utility of Machine Learning Models for Predicting Chemical Contaminants in Drinking Water: Promise, Challenges, and Opportunities. Curr Environ Health Rep. 2022;10:45–60.
Article PubMed PubMed Central Google Scholar
Lang JR, Allred BM, Field JA, Levis JW, Barlaz MA. National Estimate of Per- and Polyfluoroalkyl Substance (PFAS) Release to U.S. Municipal Landfill Leachate. Environ Sci Technol. 2017;51:2197–205.
Article CAS PubMed Google Scholar
Benskin JP, Li B, Ikonomou MG, Grace JR, Li LY. Per- and Polyfluoroalkyl Substances in Landfill Leachate: Patterns, Time Trends, and Sources. Environ Sci Technol. 2012;46:11532–40.
Article CAS PubMed Google Scholar
Hamid H, Li LY, Grace JR. Review of the fate and transformation of per- and polyfluoroalkyl substances (PFASs) in landfills. Environ Pollut. 2018;235:74–84.
Article CAS PubMed Google Scholar
McMahon PB, Tokranov AK, Bexfield LM, Lindsey BD, Johnson TD, Lombard MA, et al. Perfluoroalkyl and Polyfluoroalkyl Substances in Groundwater Used as a Source of Drinking Water in the Eastern United States. Environ Sci Technol. 2022;56:2279–88.
Article CAS PubMed PubMed Central Google Scholar

Download references

Acknowledgements

We acknowledge the support and contributions of colleagues at the Colorado Department of Public Health & Environment, including David Dani, John Duggan, Eric Brown, Shannon Barbare, Meghan Williams, and Margaret Horton.

Funding

This work was supported by the Grant or Cooperative Agreement Number, 1 NUE1EH001463-01-00, funded by the Centers for Disease Control and Prevention. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the Centers for Disease Control and Prevention or the Department of Health and Human Services.

Author information

Authors and Affiliations

Toxicology and Environmental Epidemiology Office, Colorado Department of Public Health & Environment, Denver, CO, USA
Kelsey E. Barton, Kevin Berg & Kristy Richardson
Department of Environmental & Occupational Health, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
Kelsey E. Barton, John L. Adgate & Lisa M. McKenzie
Department of Geography and Environmental Science, University of Colorado Denver, Denver, CO, USA
Peter J. Anthamatten
Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
Anne P. Starling
Source Water Assessment & Protection Program, Colorado Department of Public Health & Environment, Denver, CO, USA
Robert C. Murphy

Authors

Kelsey E. Barton
View author publications
You can also search for this author in PubMed Google Scholar
Peter J. Anthamatten
View author publications
You can also search for this author in PubMed Google Scholar
John L. Adgate
View author publications
You can also search for this author in PubMed Google Scholar
Lisa M. McKenzie
View author publications
You can also search for this author in PubMed Google Scholar
Anne P. Starling
View author publications
You can also search for this author in PubMed Google Scholar
Kevin Berg
View author publications
You can also search for this author in PubMed Google Scholar
Robert C. Murphy
View author publications
You can also search for this author in PubMed Google Scholar
Kristy Richardson
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

KEB contributed to conception and design of the project, conceptualization of the paper and review and synthesis of the literature, acquisition of data, analysis and interpretation of data, writing of the initial draft of the manuscript or a section of the manuscript, critical review and revision of the manuscript for important intellectual content, and statistical analysis; PJA contributed to supervision, conceptualization of the paper and review and synthesis of the literature, and critical review and revision of the manuscript for important intellectual content; JLA contributed to supervision, conception and design of the project, and critical review and revision of the manuscript for important intellectual content; LMM contributed to supervision, critical review and revision of the manuscript for important intellectual content; APS contributed to supervision, and critical review and revision of the manuscript for important intellectual content; KB contributed to obtaining funding, analysis and interpretation of data, critical review and revision of the manuscript for important intellectual content. RCM contributed to acquisition of data, and critical review and revision of the manuscript for important intellectual content; KLR contributed to supervision, conception and design of the project, obtaining funding, and critical review and revision of the manuscript for important intellectual content.

Corresponding author

Correspondence to Kelsey E. Barton.

Ethics declarations

Competing interests

The authors have no financial disclosures to declare. JLA is engaged in various aspects of PFAS litigation.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

aj-checklist

Supplemental Figure 1 Caption

Supplemental Figure 1

Supplemental Tables

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Barton, K.E., Anthamatten, P.J., Adgate, J.L. et al. A data-driven approach to identifying PFAS water sampling priorities in Colorado, United States. J Expo Sci Environ Epidemiol (2024). https://doi.org/10.1038/s41370-024-00705-7

Download citation

Received: 08 November 2023
Revised: 08 July 2024
Accepted: 08 July 2024
Published: 01 August 2024
DOI: https://doi.org/10.1038/s41370-024-00705-7
Springer Nature America, Inc.

A data-driven approach to identifying PFAS water sampling priorities in Colorado, United States

Abstract

Background

Objective

Methods

Results

Impact statement

Similar content being viewed by others

The Utility of Machine Learning Models for Predicting Chemical Contaminants in Drinking Water: Promise, Challenges, and Opportunities

Guideline levels for PFOA and PFOS in drinking water: the role of scientific uncertainty, risk assessment decisions, and social factors

Health-Risk Assessment of Arsenic and Groundwater Quality Classification Using Random Forest in the Yanchi Region of Northwest China

Explore related subjects

Introduction

Methods

Data sources

PFAS sampling results (training data)

Known and potential PFAS sources

Physical environmental characteristics

Population vulnerability

Statistical analysis

Development of a prioritization plan

Results

Map diagnostics

Variable importance

Sampling prioritization: drinking water

Sampling prioritization: source investigation

Discussion and conclusion

Recommendations

Model strengths

Model limitations

Conclusions and future work

Data availability

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Supplementary information

aj-checklist

Supplemental Figure 1 Caption

Supplemental Figure 1

Supplemental Tables

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation