Introduction

Drinking water represents an important source of exposure to inorganics (e.g., arsenic and nitrate), radionuclides (uranium, alpha particles), disinfection byproducts (DBPs), and volatile organic compounds (VOCs) for populations in the United States (U.S.) and worldwide [1]. Approximately 90% of the U.S. population is served by public water systems, and 10% by private wells [2]. In California, approximately 95% of the population is served by public water systems [3]. Public water systems include at least 15 service connections or serve at least 25 people; community water systems (CWS) are a type of public water system that serve the same population year-round [2]. Public water systems are regulated by the U.S. Environmental Protection Agency (EPA) under the Safe Drinking Water Act [4]. The contaminants we evaluated are regulated through federally enforceable maximum contaminant levels (MCLs), which were determined based on economic and technical feasibility, treatment technologies, cost-benefit analysis, and public health benefit for specific health endpoints [4]. States generally have primacy over enforcement of federal drinking water regulations. Notably, the MCL goal, a non-enforceable standard based solely on risk to health, is 0 µg/l for arsenic, uranium, alpha particles, trichloroethylene (TCE), tetrachloroethylene (PCE), bromodichloromethane, bromoform, and dichloroacetic acid, as there is no known safe level of exposure to these contaminants [4]. Private wells are not federally regulated or monitored.

Nitrate is a common contaminant of drinking water supplies in agricultural areas, due to use of nitrogen fertilizers and concentrated animal feeding operation waste [5, 6]. Atmospheric deposition, erosion of natural deposits, and septic tank or sewage leakage contribute to nitrate contamination in rural and urban areas [4]. Geogenic arsenic occurs in groundwater across the U.S., with regional differences due to climatic and geological factors; arid climates can cause evaporative concentration of arsenic in shallow groundwater supplies and lead to high levels, such as in the San Joaquin Valley of California [7,8,9,10]. Mining and historical arsenical pesticide use are anthropogenic sources of arsenic contamination in water supplies [8]. Uranium is present in different rock types and is leached from host mineral phases to surface and ground water supplies; uranium mining/milling and mobilization of uranium via nitrate fertilizer use are anthropogenic sources of uranium contamination [11,12,13,14,15]. Uranium and other radionuclides can decay and release alpha radiation, often quantified as total gross alpha for monitoring compliance purposes. DBPs are formed by the reaction of chlorine and bromine with natural organic compounds during the disinfection of water supplies to treat pathogens [16]. DBPs are commonly found in public water supplies across the U.S., with the highest concentrations observed in those reliant on surface water or shallow groundwater [16]. While over 700 DBPs have been identified, the most abundant classes are trihalomethanes (THMs, which include the chemicals chloroform, dibromochloromethane, bromodichloromethane, and bromoform, and regulated as the sum total, TTHM), and haloacetic acids (HAA5, regulated as the sum of dichloroacetic acid, trichloroacetic acid, monochloroacetic acid, bromoacetic acid, and dibromoacetic acid) [4, 17, 18]. 
The VOCs TCE and PCE are solvents used in dry cleaning, metal degreasing, textile, art, and industrial processes, and may be found in some consumer products [19]. Toxic waste disposal sites, sometimes recognized as Superfund sites under the Comprehensive Environmental Response, Compensation, and Liability Act (CERCLA), are anthropogenic sources of inorganic arsenic, uranium, TCE, and PCE in groundwater [12, 19,20,21,22,23,24].

Numerous studies implicate one or more of these drinking water contaminants in adverse health effects, including cancer, cardiovascular disease, reproductive and developmental toxicity, nephrotoxicity, and other adverse health conditions [1, 16, 20, 25,26,27,28,29,30,31,32,33,34,35,36,37,38,39]. Inorganic arsenic is classified by the International Agency for Research on Cancer (IARC) as a cause of cancers of the bladder, lung, and skin, and is associated with increased risk of cancers of the kidney, liver, and prostate [26]. Inorganic arsenic is also a potent toxicant associated with numerous adverse health outcomes, including cardiovascular disease, hypertension, and reproductive disorders [26, 31, 32]. Uranium exposure through drinking water is associated with renal damage and nephrotoxicity, and an increased risk of colorectal, breast, kidney, prostate, and total cancer [20, 27, 33]. Nitrate is classified by IARC as a probable human carcinogen when ingested under conditions that result in the endogenous formation of N-nitroso-compounds, most of which are animal carcinogens [28, 34]. Cancers of the stomach, colon, bladder, kidney, ovary, and thyroid, and thyroid disease are associated with elevated nitrate ingested from drinking water; however, the number of studies of most cancer sites is limited [29, 35]. Higher intake of DBPs through drinking water is associated with increased bladder cancer risk, and a limited number of studies suggest DBP exposures are potential risk factors for colon, rectum, and endometrial cancer [16, 36]. TCE is classified as carcinogenic to humans based on kidney cancer, and PCE (Group 2A) as probably carcinogenic to humans based on bladder cancer evidence [30]. Occupational studies also support adverse developmental, neurological, and hepatotoxic effects of TCE and PCE exposures [19]. 
Assessment of long-term drinking water contaminant exposures and associated health risks has traditionally been limited by the lack of water quality data that could be assigned to individuals in epidemiologic cohorts; understanding large-scale water quality data at the level of consumer intake is a critical research gap [40]. Additionally, there are relatively few cohort studies evaluating drinking water exposures at levels below the MCLs and World Health Organization guidelines that are commonly experienced by the general U.S. population [1]. Inequalities in CWS arsenic, uranium, and nitrate exposures by sociodemographic characteristics such as race and ethnicity, income, education, region, and rurality/urbanicity have been documented [41, 42]. Few studies have evaluated sociodemographic inequalities in DBP and TCE/PCE exposures in the United States.

Our primary objective for this study was to describe exposure to regulated, frequently detected and measured contaminants in drinking water in the California Teachers Study (CTS), a large prospective cohort of women. We described the spatial linkage of participants’ residences to their drinking water source and corresponding estimates of contaminant concentrations. For a subset, we evaluated the agreement between address-assigned and self-reported drinking water source and described the daily intake of tap water and CWS contaminants. Additionally, we examined inequalities in CWS exposures across sociodemographic groups.

Materials and methods

Study population

We included participants from the CTS, a prospective cohort of women designed to investigate the etiology of breast and other cancers [43]. The CTS was originally designed to study higher rates of breast and other cancers observed in female teachers. Female public school teachers and administrators enrolled in the California State Teachers Retirement System were mailed a self-administered questionnaire, and 133,477 completed the questionnaire and joined the cohort in 1995–1996 (Supplementary Fig. S1) [43]. Of those, 124,685 had valid California addresses at the time of enrollment. Participants provided information on sociodemographic characteristics (race and ethnicity, age, education), anthropometrics (height and weight), smoking and alcohol consumption, and personal and family medical history. Participant race and ethnicity were categorized at enrollment as follows: non-Hispanic white (only white reported), Black (Black only or white and Black reported), Hispanic (Hispanic only or white and Hispanic reported), Native American (Native American only or white and Native American reported), Asian/Pacific Islander (Asian/Pacific Islander only [Chinese, Filipino, Hawaiian, Japanese, Vietnamese, or Korean] or white and Asian/Pacific Islander), or Other/multi-racial (Other reported or more than one of the aforementioned groups reported).

Drinking water data and linkages

Our study used a geospatial dataset of statewide drinking water boundaries from the Water Boundary Tool (version updated in 2019), created by the Public Health Institute’s California Environmental Health Tracking Program (CEHTP) [44, 45]. The California Office of Environmental Health Hazard Assessment (OEHHA) then cleaned and processed the geospatial layer by repairing geometry and reconciling overlaying boundaries. The OEHHA dataset includes CWS service areas collected previously by water system operators and local regulatory agencies; for areas outside of CWS boundaries, regions were partitioned into Public Land Survey System (PLSS) sections, approximately 1 × 1 miles, and ambient groundwater contaminant concentrations were estimated (described below) [46]. For 124,665 eligible and consenting participants with a geocoded address at enrollment in California (Texas A&M, USC Geocoding Platform) [47], we linked their addresses to drinking water boundaries using QGIS [44]. Participants whose geocodes did not intersect a water boundary (N = 36) were manually assigned to the nearest water boundary. Most participants lived within a CWS service area (N = 115,206 (92%); CWS N = 1249) and were assumed to be using the corresponding CWS as their residential water source; CWS serving CTS participants were located across California (Fig. S2). The CWS for Los Angeles City was subdivided into five subsections based on their source of water and treatment plants. The remaining participants with addresses located outside of a CWS service area (N = 9459; 8%) were assumed to be served by domestic wells (systems that serve fewer than 5 service connections) or state small water systems (5–14 connections), which are not regulated or monitored by California.

CWS-level monitoring data (1990–2020) were obtained from and processed by OEHHA [44, 47]. Average annual concentrations of each contaminant were computed for each CWS as follows. We prioritized contaminants measured in samples of treated/delivered drinking water. When treated samples were not available for a contaminant, we averaged results from raw/untreated samples [44]. Only samples collected from CWS during active periods of use were included. Uranium concentrations were converted from pCi/l to µg/l using 1.49 as the conversion factor (pCi/l × 1.49 = µg/l) [48]. CWS are required to report non-detections and concentrations based on the detection limits for the purposes of reporting (DLR) (Table S1), which are often higher than the laboratory limit of quantification. When sample concentrations were zero or were reported as equal to, half of, or below the DLR, concentrations were imputed using Tobit regression based on the existing measurement data and assuming a log-normal distribution [49]. The upper bound for imputation was contaminant specific and was derived from the median of reported concentrations below the DLR (Table S1); the lower bound was zero. Among CWS matched to participants’ enrollment addresses, the population-weighted average percent of years (1990–2020) that CWS reported detectable levels of contaminants was 86% for gross alpha, 80% for uranium, 83% for nitrate-nitrogen (N) (hereafter referred to as nitrate), 88% for HAA5, 70% for TTHM, 64% for arsenic, 28% for TCE, and 35% for PCE (Table S2). Fewer CWS reported measurement data for uranium and HAA5 compared to the number of CWS that reported data for the other contaminants; years with missing measurement data were not included in the computation of average concentrations.
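
The averaging and unit-conversion steps above can be illustrated with a minimal Python sketch. It is not the OEHHA processing code (the analyses in this study were run in R). The record layout, the function names, and the choice to apply the treated-over-raw preference per system and year are assumptions made for illustration; Tobit imputation of values below the DLR is not shown.

```python
from collections import defaultdict
from statistics import mean

# Conversion factor stated in the text: pCi/l * 1.49 = µg/l.
PCI_TO_UG_URANIUM = 1.49


def uranium_pci_to_ug(pci_per_l):
    """Convert a uranium activity (pCi/l) to a mass concentration (µg/l)."""
    return pci_per_l * PCI_TO_UG_URANIUM


def annual_averages(samples):
    """Average samples per (system, year), preferring treated samples.

    `samples` is an iterable of (system_id, year, is_treated, concentration)
    tuples. This layout is hypothetical, not the OEHHA data format.
    """
    treated = defaultdict(list)
    raw = defaultdict(list)
    for system_id, year, is_treated, conc in samples:
        (treated if is_treated else raw)[(system_id, year)].append(conc)

    result = {}
    for key in set(treated) | set(raw):
        # Assumption: fall back to raw/untreated samples only when no
        # treated sample exists for that system and year.
        values = treated[key] if key in treated else raw[key]
        result[key] = mean(values)
    return result
```

Years with no samples simply do not appear in the returned dictionary, which mirrors the statement that years with missing data were excluded from the averages.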

Domestic well estimates were computed by OEHHA using groundwater measurement data collected from 2011–2019 in the Groundwater Ambient Monitoring and Assessment Program, and were available at the Public Land Survey System level [44, 50]. Briefly, each section was assigned the average groundwater quality data from all wells within the section. Non-domestic wells were limited to those with well depths similar to domestic wells. Sections with no ambient groundwater quality data were assigned the average of all neighboring sections’ data. In cases where all neighboring sections had no data, the section was assigned the average ambient groundwater quality data from the township (6 × 6 miles) [51]. Domestic well data below the limits of detection were not imputed.
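
The fallback order described above (section, then neighboring sections, then township) can be sketched as follows. This is an illustrative Python sketch, not OEHHA's implementation. The three dictionaries are hypothetical stand-ins for the geospatial layers, and averaging the neighbors' section-level means (rather than pooling their wells) is an assumption, since the text does not specify which was used.

```python
from statistics import mean


def section_estimate(section, well_values, neighbors, township_of):
    """Return a groundwater estimate for one section, or None if no data exist.

    well_values: dict mapping section -> list of well concentrations
    neighbors:   dict mapping section -> list of adjacent sections
    township_of: dict mapping section -> township identifier
    All three structures are hypothetical.
    """
    # Step 1: use the section's own wells when available.
    own = well_values.get(section, [])
    if own:
        return mean(own)

    # Step 2 (assumption): average the section-level means of the
    # neighboring sections that have data.
    neighbor_means = [
        mean(well_values[n])
        for n in neighbors.get(section, [])
        if well_values.get(n)
    ]
    if neighbor_means:
        return mean(neighbor_means)

    # Step 3: pool all wells located in the same township.
    township = township_of.get(section)
    if township is None:
        return None
    pooled = [
        value
        for other, values in well_values.items()
        if township_of.get(other) == township
        for value in values
    ]
    return mean(pooled) if pooled else None
```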

Self-reported drinking water source and consumption at questionnaire 6

In the CTS questionnaire 6 (2017–2019, N = 39,031, 34% response rate), participants were asked their drinking water source at the current home (municipal water, private well, bottled water, other, don’t know), and whether the current home tap water was filtered/treated (which could include filtering pitchers such as Brita®, but did not include water softeners) [52]. Participants were also asked the number of glasses of water, cups of coffee, and cups of tea, usually consumed per day (made with household tap water) with responses of “Never”, “1 cup [or glass] a day”, “2 cups a day” up to “6+ cups a day”. The latter was coded as 6; responses of “Occasionally, i.e., not every day,” were coded as 0.5; responses of “Never” and “Skipped” were coded as zero. We treated responses of “Drank but don’t know how much” as missing (N = 491). We converted the number of cups per day to liters (l) per day assuming that a glass contained 12 ounces [53]. We linked the addresses of participants who completed questionnaire 6 to a CWS service boundary or PLSS section. Among participants who were linked to a CWS and self-reported using municipal water, we calculated the daily contaminant intake from tap water as the product of concentration and intake rate: daily intake = contaminant concentration × intake rate (l/day). We calculated the intake rate for tap water only (N = 22,067 with self-reported tap water consumption) and total water including coffee and tea (N = 20,657 with self-reported tap water consumption of water, coffee, and tea). We calculated the contaminant concentration using monitoring data from 2016–2020, to align with the time frame of questionnaire 6.
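
The response coding and intake calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the study code. The shortened response labels, the function names, the use of the US fluid ounce, and the application of the same volume to cups of coffee and tea are assumptions; the glass volume is a parameter whose default is the value stated in the text.

```python
LITERS_PER_US_FL_OUNCE = 0.0295735  # assumption: US fluid ounces

# Hypothetical shortened labels for the questionnaire responses.
RESPONSE_CODES = {
    "Never": 0.0,
    "Skipped": 0.0,
    "Occasionally": 0.5,
    "6+": 6.0,
}


def cups_per_day(response):
    """Map one response to cups (or glasses) per day; None means missing."""
    if response == "Drank but don't know how much":
        return None
    if response in RESPONSE_CODES:
        return RESPONSE_CODES[response]
    return float(response)  # "1" through "5"


def liters_per_day(responses, ounces_per_glass=12.0):
    """Sum the beverages in `responses` and convert to liters per day."""
    cups = [cups_per_day(r) for r in responses]
    if any(c is None for c in cups):
        return None  # treated as missing, as in the text
    return sum(cups) * ounces_per_glass * LITERS_PER_US_FL_OUNCE


def daily_intake(concentration_per_l, intake_l_per_day):
    """Daily intake = contaminant concentration * intake rate (l/day)."""
    return concentration_per_l * intake_l_per_day
```

For example, `daily_intake(1.5, liters_per_day(["2", "1", "Never"]))` would give the daily intake in µg/day for a participant reporting two glasses of water and one cup of coffee, served by a CWS with an average concentration of 1.5 µg/l.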

A subset of participants with a self-reported municipal water source were able to provide the name of the water company at their current home (N = 9834) [54]. We compared the agreement of the self-reported and the assigned CWS name, to evaluate the accuracy of our linkage-based exposure assessment. First, we converted the water system names to lowercase and removed blank spaces. We then calculated the Levenshtein distance between the CWS names to assess similarity between the strings (we defined a match as strings with Levenshtein distance ≤5 based on visual assessment of different distances), matched by the first letter and partial common components of the strings, and performed a manual check to correct matches and non-matches that were inaccurately classified using the automated processes.
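
The automated part of this name comparison (normalization followed by a Levenshtein distance threshold of 5) can be sketched as below. This is an illustrative Python sketch rather than the authors' code; the function names are hypothetical, and the first-letter and partial-component matching and the manual review step are not reproduced.

```python
def normalize(name):
    """Lowercase a water system name and remove all whitespace."""
    return "".join(name.lower().split())


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between strings."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def names_match(reported, assigned, max_distance=5):
    """Flag a match when the normalized names are within `max_distance` edits."""
    return levenshtein(normalize(reported), normalize(assigned)) <= max_distance
```

A fixed threshold of this kind tolerates small misspellings but can also pair short, unrelated names, which is one reason a manual check of the automated classification remains useful.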

Census tract covariates

A neighborhood socioeconomic status (SES) metric for the enrollment address was previously created for CTS participants, incorporating three 1990 census block group variables [55]. Briefly, quartile values were computed based on the statewide estimates of educational attainment (percentage of adults over age 25 completing a college degree or higher), income (median family income), and occupation (percentage of adults employed in managerial/professional occupations). Census block groups were assigned a score of 1 (low) through 4 (high) for each of the SES attributes. Scores were summed across the attributes and census block groups were categorized into four groups based on quartiles of this overall score [55].
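
The scoring scheme described above can be sketched in Python as follows. This is a hedged illustration, not the method of Hurley et al. [55]: the function names are hypothetical, quartile cut points are computed from the supplied values rather than from statewide estimates, and the handling of values that fall exactly on a cut point is an assumption.

```python
from bisect import bisect_left
from statistics import quantiles


def quartile_scores(values):
    """Score each value from 1 (lowest quartile) to 4 (highest quartile)."""
    cuts = quantiles(values, n=4)  # three cut points
    # bisect_left counts the cut points strictly below the value.
    return [bisect_left(cuts, v) + 1 for v in values]


def ses_categories(education, income, occupation):
    """Combine three block-group attributes into a four-level SES category.

    Each argument is a list with one value per census block group, in the
    same order. Attribute scores (1-4) are summed (range 3-12) and the sum
    is categorized again by quartile.
    """
    per_attribute = [quartile_scores(x) for x in (education, income, occupation)]
    totals = [sum(scores) for scores in zip(*per_attribute)]
    return quartile_scores(totals)
```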

Urbanicity of the enrollment address was also previously characterized based on information from the 1990 U.S. Census [55]. Metropolitan urban area was defined as the highest quartile of population density within an Urbanized Area (population ≥ 1,000,000) and Metropolitan suburban area was defined as the rest of the population (lower three quartiles) within an Urbanized Area (population ≥ 1,000,000). City was defined as Census Places outside of an Urbanized Area with a population ≥50,000 people, and Town was defined as Census Places outside of an Urbanized Area with a population <50,000 people and in the upper three quartiles of population density. Rural was defined as Census Places outside of an Urbanized Area with <50,000 people and in the lowest quartile of population density, and unpopulated areas (Table S3) [55]. We dichotomized urbanicity into metropolitan areas (metropolitan urban and metropolitan suburban) and non-metropolitan areas (city, town, rural).

Statistical analyses

All statistical analyses were conducted in R version 4.0.2 within the CTS Researcher Platform [56]. We described participant characteristics by water source at enrollment. Because most participants were served by CWS, we did not conduct further analysis of the domestic well exposures. We estimated long-term averages of CWS contaminants at the enrollment address (1990–2015), and the percent of years that annual average concentrations were ≥half the MCL and ≥MCL, out of the total number of years of measurement data reported per CWS in the 1990–2015 period. To align with the timing of self-reported information on residential drinking water source at questionnaire 6 (2017–2019), we described 2016–2020 average CWS concentrations at the questionnaire 6 address, among participants who self-reported drinking municipal water.
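
The long-term summary metrics defined above (the average concentration and the percent of reported years at or above half the MCL and the MCL) can be illustrated with a short Python sketch. The analyses themselves were run in R; the function name and input layout here are hypothetical.

```python
def exceedance_summary(annual_means, mcl):
    """Summarize one CWS and contaminant over the years with data.

    annual_means: dict mapping year -> annual average concentration; years
                  without measurement data are simply absent.
    mcl:          maximum contaminant level in the same units.
    """
    values = list(annual_means.values())
    if not values:
        return None
    n_years = len(values)  # denominator: years with reported data only
    return {
        "long_term_average": sum(values) / n_years,
        "pct_years_ge_half_mcl": 100.0 * sum(v >= mcl / 2 for v in values) / n_years,
        "pct_years_ge_mcl": 100.0 * sum(v >= mcl for v in values) / n_years,
    }
```

Because the denominator is the number of years with reported data, a system that reported in only a few years can show a high percentage from a single exceedance.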

Residential duration at the enrollment address was previously estimated for CTS participants [57]. Briefly, addresses from enrollment (1995–1996) through 2019 were obtained from participants who completed follow-up questionnaires, as well as from the U.S. Postal Service, LexisNexis, Experian, and the California Cancer Registry databases [55]. The mean (median) total residential duration at the enrollment address was 22.2 (22.7) years. As a sensitivity analysis, we computed long-term average exposures restricted to participants residing at the enrollment address for at least 20 years before and/or after enrollment (N = 60,972).

We performed Spearman correlation analyses to describe correlations between contaminants. We compared CWS exposures by sociodemographic subgroups: participant race and ethnicity, and census block group-level SES quartile and urbanicity. We used generalized linear regression to estimate unadjusted and adjusted geometric mean ratios (GMRs) of CWS exposures by race and ethnicity (non-Hispanic white as reference), census block group SES quartile (quartile 1 as reference), and urbanicity (non-metropolitan areas as reference); adjusted analyses were co-adjusted for the other two sociodemographic variables. Finally, we compared self-reported drinking water source to the spatially linked drinking water source (CWS or domestic well) at questionnaire 6. We evaluated the percent agreement between participants’ self-reported water company name and the spatially linked, standardized CWS name, as described above. Among participants self-reporting municipal tap water as their drinking water source, we computed daily tap water and CWS contaminant intake.
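
For intuition about the GMR, the unadjusted case with a single two-level comparison reduces to a ratio of geometric means, which equals the exponentiated coefficient from a linear model of the natural-log-transformed exposure. The Python sketch below shows only that special case, with hypothetical function names; the adjusted models require a regression fit and are not reproduced here.

```python
from math import exp, log
from statistics import mean


def geometric_mean_ratio(group, reference):
    """Ratio of the geometric mean of `group` to that of `reference`.

    Both inputs are lists of strictly positive concentrations.
    """
    log_diff = mean([log(x) for x in group]) - mean([log(x) for x in reference])
    return exp(log_diff)


def percent_change(gmr):
    """Express a GMR as a percent change: (GMR - 1) * 100."""
    return (gmr - 1.0) * 100.0
```

Under this convention, a GMR of 1.14 relative to the reference group corresponds to concentrations that are 14% higher.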

In supplemental analyses, we described temporal trends in CWS-level contaminant concentrations, among the 1249 CWS linked to CTS participants at enrollment. We used generalized linear regression models to estimate GMRs of 10-year average CWS concentrations (1990–1999 average concentration as reference), adjusting for water source type (groundwater, groundwater under the influence of surface water, surface water) and population served (very small (≤500 people), small (>500–3300 people), medium (>3300–10,000 people), large (>10,000–<1,000,000 people), and very large (≥1,000,000 people)). Due to limited HAA5 data prior to 2000, we compared average HAA5 concentrations in 2000–2009 (reference) to the average concentration in 2010–2020.

Results

Descriptive characteristics by enrollment drinking water source

We included 124,665 CTS participants with a valid California residential address at the time of enrollment and who had consented to non-breast cancer research. Median age at enrollment (1995–1996) was 52 years old with almost all (99%) participants having attained a bachelor’s degree or higher degree (Table 1). A majority of all participants were non-Hispanic white (87%), followed by Hispanic (4%), Asian (4%), Black (3%), other/multi-racial (1%), and Native American (1%). By water source, 80% of CWS users and 50% of domestic well users lived in census block groups in the upper two quartiles of SES. While 86% of domestic well users lived in non-metropolitan areas, 32% of CWS users lived in non-metropolitan areas, and 68% lived in metropolitan areas (Table 1). Comparing CWS and domestic well exposures, average nitrate concentrations were more than twice as high among domestic well users (median = 1.25 mg/l) compared to CWS users (median = 0.54 mg/l); median arsenic and gross alpha concentrations were higher among CWS users (1.03 µg/l and 2.21 pCi/l, respectively) compared to domestic well users (0.62 µg/l and 0.38 pCi/l, respectively), though the 75th percentile of arsenic concentrations was higher among domestic well users. Uranium concentration estimates were not available for domestic well users. TTHM, HAA5, TCE, and PCE exposures occurred mostly among those using CWS (median (µg/l) of HAA5 = 8.67, TTHM = 12.86, TCE = 0.01, PCE = 0.02), compared to domestic well users (median = 0 µg/l).

Table 1 Characteristics of California Teachers Study (CTS) participants, by drinking water source (community water system, CWS, and domestic well) assigned by enrollment address.

In Spearman correlation analyses of CWS exposures, the participants’ long-term (1990–2015) average concentrations of uranium and gross alpha were highly correlated (rho = 0.85) (Fig. S3). More moderate (0.08–0.67) positive pairwise correlations were observed between nitrate, arsenic, uranium, gross alpha, TCE, and PCE. The summed classes of disinfection byproducts HAA5 and TTHM were positively correlated with each other (rho = 0.52), and negatively or negligibly correlated with all other contaminants.

CWS contaminant exposures are summarized as average concentrations from 1990–2015, and as the percent of years that annual average concentrations were ≥1/2 MCL and ≥MCL in that period (Table 2). Overall, the means and medians of the average concentrations were below regulatory limits for all contaminants, though there was considerable variability in ranges. Of note, fewer CWS reported measurement data in this period for uranium (CWS N = 860) and HAA5 (CWS N = 882) than the other contaminants [44]. Almost double the number of CWS reported HAA5 data in 2010–2020 (N = 957) compared to before 2010 (N = 480) (Fig. S4). Median and mean average concentrations and percent of years of data ≥1/2 MCL and ≥MCL were similar when we restricted analyses to participants that resided at their enrollment address for at least 20 years (pre and/or post enrollment) (Table S4).

Table 2 Distributions of long-term (1990–2015) average concentrations of community water system (CWS) exposuresᵃ, and percent of years the annual average concentration was at or above the maximum contaminant level (MCL) and half of the MCLᵇ.

CWS exposures by sociodemographic factors

We described average CWS exposures stratified by census block-group level SES quartile and urbanicity, and participant race and ethnicity (Table 3 and Figs. S5 and S6). In adjusted analyses (model 2), relative to non-Hispanic white participants, arsenic concentrations were 14% higher (95% CI 11, 17%), 12% higher (95% CI 10, 15%), and 8% higher (95% CI 4, 13%) for Black, Hispanic, and other/multi-racial participants, respectively (Fig. 1). Relative to non-Hispanic white participants, uranium was 15% higher (95% CI 11, 19%), 24% higher (95% CI 20, 28%), and 8% higher (95% CI 1, 16%) for Black, Hispanic, and Native American participants, respectively, with similar patterns observed for gross alpha. Nitrate levels were 32% (95% CI 28, 37%) higher for Hispanic participants, 6% (95% CI 2, 10%) higher for Asian participants, and 11% (95% CI 4, 19%) higher for other/multi-racial participants. Relative to non-Hispanic white participants, TTHM levels were 4% (95% CI 0, 9%) higher for Black participants, but lower for all other racial/ethnic groups. Similarly, HAA5 levels were lower for all other racial/ethnic groups compared to non-Hispanic white participants. Conversely, TCE and PCE levels were elevated for all racial/ethnic groups compared to non-Hispanic white participants. Results were similar in univariate analyses (model 1, Fig. S7).

Table 3 Distributionsᵃ of long-term (1990–2015) average concentrations of community water system (CWS) exposures by census block group-level socioeconomic status quartile (SES)ᵇ and urbanicityᶜ, and participant race and ethnicity.

Fig. 1: Adjusted percent (%) change (95% CI)¹ of long-term (1990–2015) average concentrations of community water system (CWS) exposures² by census block group-level socioeconomic status quartile (SES)³ and urbanicity⁴, and participant race and ethnicity (N = 114,183 participants with race/ethnicity, SES, and urbanicity information).

Reference groups are in black, non-reference groups are in blue. Shapes represent each sociodemographic group category (square = urbanicity, triangle = SES, circle = race/ethnicity). ¹Generalized linear regression was used to compute geometric mean ratios (GMRs) (95% confidence intervals, CI) of natural-log transformed exposure estimates by SES quartile (reference = SES quartile 1), urbanicity (reference = non-metropolitan areas), and participant race/ethnicity (reference = non-Hispanic white participants). Model was co-adjusted for SES, urbanicity, participant race/ethnicity. We calculated percent changes as: (GMR − 1) × 100%. ²Contaminants are as follows: arsenic (µg/l), uranium (µg/l), gross alpha (pCi/l), nitrate-nitrogen (Nitrate-N, mg/l), total trihalomethanes (TTHM, µg/l), five haloacetic acids (HAA5, µg/l), trichloroethylene (TCE, µg/l), and tetrachloroethylene (PCE, µg/l). Measurement data below the detection limit for the purposes of reporting (DLR) were imputed using Tobit regression. ³SES metric was created incorporating three 1990 census block group variables (occupation, education, and income) (Hurley et al. [55]). ⁴Urbanization categories were created using 1990 census block groups (Hurley et al. [55]) and dichotomized as non-metropolitan (rural, town, city), and metropolitan (metropolitan suburban and metropolitan urban).

Self-reported drinking water source and characteristics

Among 38,865 participants who participated in questionnaire 6 (2017–2019) and completed the drinking water section, 33,276 participants (86%) resided in California and could be assigned to a water source (CWS or PLSS domestic well) based on their Q6 address. Seventy percent responded that their tap water source was municipal water, 6% reported private well water, 15% reported bottled water, 4% reported other, and 5% reported don’t know or had a missing response (Table 4). Based on the Q6 address, 30,737 participants (92%) were assigned as CWS users, while 2539 (8%) were assigned as domestic well users. Of participants spatially linked as CWS users, 74% responded that their tap water source was municipal water, 2% reported private well water, 15% reported bottled water, 4% reported other, and 5% reported don’t know or had a missing response. Among those assigned as domestic well users, 53% said they used a private well, 26% reported municipal water, 11% reported bottled water, 4% reported other, and 5% don’t know/missing.

Table 4 Comparisons of drinking water sources at the 6th survey (2017–2019): self-reported drinking water source vs. assigned water sourceᵃ (N = 33,276).

The daily intakes of contaminants in tap water, based on the CWS 2016–2020 average concentrations, are described in Table 5. Daily intakes of each contaminant were calculated for participants who were assigned as CWS users and self-reported municipal tap water and intake of water, coffee, and tea. Median tap water intake including water, coffee, and tea was 2.01 l/day compared to 1.42 l/day for tap water alone. Mean daily intake from water, coffee, and tea was as follows: 1.97 µg arsenic/day, 6.85 µg uranium/day, 5.03 pCi gross alpha/day, 2.27 mg nitrate/day, 46.83 µg TTHM/day, 23.32 µg HAA5/day, 0.19 µg TCE/day, and 0.18 µg PCE/day (Table 5).

Table 5 Daily intake of community water system (CWS, 2016–2020) arsenic, uranium, gross alpha, nitrate-nitrogen (N), five haloacetic acids (HAA5), total trihalomethanes (TTHM), trichloroethylene (TCE) and tetrachloroethylene (PCE) per L of home tap water consumed per day.

Among 27,594 participants with self-reported information about home tap water filtration/treatment, 52% used filtered/treated tap water, 45% did not filter or treat their tap water, and 3% did not know (Table S5). Fifty-four percent of self-reported municipal water users and 47% of domestic well users reported using some type of filtration/treatment. The most common types of treatment specified were refrigerator filters (39%) followed by pitcher filters such as Brita/PUR® (30%); only 15% used reverse osmosis treatment. The use of reverse osmosis was higher among self-reported private well users (27%) compared to municipal water users (15%). Participants who used “other” filtration/treatment methods (16%) reported using, for example, distillation, ZeroWater® filters, and boiling water.

We assessed the agreement between the self-reported and assigned CWS name. Among 23,356 participants with self-reported municipal water, 9834 (42%) provided a water company name. Of those, 9567 participants could be assigned to a CWS based on their address (the other 267 were assigned as domestic well users). The self-reported and assigned CWS name agreed for 8591 participants (90%).

In analyses evaluating temporal changes in 10-year average CWS concentrations for 2000–2009 and 2010–2020 compared to the 1990–1999 average, we observed statistically significant declines in CWS arsenic, uranium, gross alpha, and nitrate, while TTHM concentrations were statistically significantly higher (Table S6). Compared to the 2000–2009 average concentrations, HAA5 concentrations were significantly higher in 2010–2020. No clear temporal trend was observed for TCE and PCE.

Discussion

We successfully linked CTS participants residing in California to their drinking water source at enrollment through the 6th survey (2017–2019), and characterized long-term exposures to eight regulated water contaminants, leveraging water quality monitoring data for CWS and domestic wells. We observed high consistency between self-reported and assigned water source, and high agreement between self-reported water system name and assigned CWS name, among a subset of participants who completed the 2017–2019 survey. Participant-reported information about drinking water source and CWS name is useful for validating exposure assessments that rely on spatial intersections of geocoded addresses with CWS service area boundaries.

We calculated contaminant intake via tap water based on self-reported information on drinking water source and consumption. Additionally, by assessing the proportion of participants that use tap water filtration or treatment, we can better understand how CWS exposures may differ within a service area, based on point-of-use filtration or treatment. While over half of participants said they drank filtered tap water, only a small percent reported using treatment techniques (e.g., reverse osmosis) that would be likely to remove arsenic, nitrate, and uranium. Additional treatment techniques may have been used that were not captured by the CTS questionnaire. Even among participants who used drinking water treatments or who self-reported ingesting bottled water in 2017–2019 (15%), dermal and inhalation exposure remain potential routes of exposure for some water contaminants like DBPs and VOCs.

Average CWS nitrate exposures among CTS participants (median = 0.54 mg/l) were similar to the median nitrate concentration estimated for the population using CWS in the state of California from 2011–2019 (~0.6 mg/l), while median arsenic concentrations were slightly higher in the CTS (1.03 µg/l) compared to statewide estimates (~0.6 µg/l) [58]. CWS nitrate levels in the CTS were slightly lower than previous estimates for the Iowa Women’s Health Study (1.07 mg/l), while TTHM concentrations were higher in our cohort (median = 12.86 µg/l compared to 4.77 µg/l) [36, 59]. CWS arsenic exposures were similar to estimates in the National Health and Nutrition Examination Survey (NHANES, CWS exposures assigned by residential county, median arsenic = 1.35 µg/l), a series of cross-sectional surveys representative of the non-institutionalized general U.S. population [60]. CWS arsenic and uranium (median = 3.48 µg/l) exposures in the CTS were also similar to those in the Multi-Ethnic Study of Atherosclerosis (MESA, exposures assigned by zip code; median arsenic and uranium = 0.35 and 1.14 µg/l, respectively), a prospective cohort of urban, racially and ethnically diverse U.S. adults across six urban centers including Los Angeles [61]. California participants in MESA had CWS arsenic concentrations ranging from 0.35 to 5.86 µg/l [62]. We are not aware of other U.S. epidemiologic cohorts that have assessed exposure to gross alpha. Differences between studies in the time periods assessed, regions covered, and amount of monitoring data available may contribute to the differences observed between our study and others. Still, similar patterns and magnitudes compared to other cohorts are notable given the geographic and demographic differences in study populations.

We observed differences in drinking water exposures across participant race and ethnicity group and census block group-level SES and urbanicity. Black and Hispanic participants had elevated arsenic, uranium, gross alpha, TCE, and PCE exposures compared to non-Hispanic white participants (Fig. 1). Other race and ethnic groups and multi-racial participants had elevated arsenic, nitrate, TCE, and PCE exposures. This study is consistent with prior findings that CWS arsenic and nitrate concentrations increased per 10% increase in the Latinx population served by CWS in California (2011–2019) [58]. CWS arsenic concentrations (2006–2011) were also higher among Hispanic, Black, and Chinese-American participants compared to Non-Hispanic white participants in California in MESA [61]. Native American participants had elevated uranium and gross alpha exposures, which is consistent with the legacy of uranium mining and milling on or near tribal areas in the American west [63, 64]. Arsenic, uranium, gross alpha, and nitrate exposures were lower in upper SES and metropolitan areas, while DBP exposures were higher. Previous studies have documented disparities in drinking water metal/metalloid, nitrate, and other regulated contaminants in the U.S. by region, SES, race and ethnicity, and rurality/urbanicity [10, 41, 42, 65].

Prior evidence on differences in HAA5, TTHM, TCE, and PCE exposures across subpopulations is limited; our study underscores the need for future evaluations of DBP and VOC exposures in drinking water. As expected, TTHM and HAA5 concentrations were higher among CWS users compared to domestic well users; CWS users would be more likely to have DBP exposures than domestic well users due to the regular use of chlorination and other disinfection processes for CWS, compared to less frequent disinfection of domestic wells. The groundwater quality data used to estimate exposures in areas served by domestic wells included some data from CWS wells, which may lead to overestimation of TTHM and HAA5 concentrations for domestic well users. Though TCE and PCE concentrations were higher among CWS users compared to domestic well users, detections of TCE and PCE in CWS were reported for only about 35% of years (population-weighted average) from 1990–2020 (Table S2).

Strengths and limitations

Strengths of this study include our use of long-term water quality monitoring data for multiple regulated contaminants with extensive geographic coverage across California. Epidemiologic analyses of drinking water exposures have previously been limited by the inability to link water quality data to epidemiologic cohorts. Based on enrollment addresses, we were able to match all CTS participants living in California to a CWS or domestic well exposure estimate. We assessed assignment accuracy and potential exposure misclassification at follow-up (Q6, 2017–2019). Bottled water use was the primary reason for potential exposure misclassification of the drinking water source; however, additional routes of exposure to water contaminants, such as dermal exposure to DBPs, remain relevant even to participants who did not report drinking their home tap water [66]. Almost all of the self-reported municipal water users were correctly found to live within CWS boundaries (22,697 of 23,356; 97%), and most self-reported private well users were correctly found to live outside of CWS boundaries (1345 of 1927; 70%). However, some participants who were spatially assigned as domestic well users self-reported drinking municipal water (26%); this is likely explained by CWS service boundaries lacking sufficient granularity to pinpoint households within the service area that relied on a private well. This information may be used to inform and improve future updates to the shapefiles of CWS service boundaries. With high agreement between assigned vs. self-reported water system name (90%), our findings suggest that exposure misclassification of CWS source is not likely to be large. We did not have information at enrollment on home drinking water consumption; however, per capita consumption of bottled water has generally increased in the U.S. from 1999 to 2022 [67]. Data from NHANES suggest that the prevalence of home tap water consumption (vs. bottled water) was consistent for U.S. adults between 2007 and 2016 [68]. Our findings of median tap water intake (1.42 l of water, 2.01 l of water, tea, and coffee) were comparable to previous estimates of tap water intake for women in NHANES [69, 70]. Few U.S. cohort studies have quantified tap water contaminant exposures using self-reported information on home tap water consumption, though this represents an important component of estimating drinking water dose and can enable more precise evaluation of toxicant dose-response relationships [71].
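The agreement percentages reported above follow directly from the stated counts. The short sketch below recomputes them as a check; the helper name `percent_agreement` is hypothetical and introduced only for this illustration, and the counts are those given in the text.

```python
def percent_agreement(n_concordant: int, n_total: int) -> float:
    """Return the percentage of participants whose self-report matched the spatial assignment."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_concordant / n_total


# Counts as reported in the text.
municipal = percent_agreement(22_697, 23_356)    # self-reported municipal users inside CWS boundaries
private_well = percent_agreement(1_345, 1_927)   # self-reported private well users outside CWS boundaries

print(f"Municipal water users: {municipal:.0f}%")
print(f"Private well users: {private_well:.0f}%")
```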

Limited CWS data were available for uranium and HAA5. For uranium, CWS were permitted by Section 64442(f) of Title 22 of the California Code of Regulations to substitute gross alpha activity measurements for uranium measurements, if the gross alpha concentration did not exceed 5 pCi/l [44]. As such, CWS with gross alpha <5 pCi/l, particularly small (>500–3300 people served) and very small (≤500 people served) systems, had more missing uranium measurements. Future studies may consider the use of gross alpha and other co-contaminants, and hydrological and geochemical characteristics of the water source, to estimate uranium when measurement data are sparse. For HAA5, data were available for a limited number of CWS prior to 2003 (CWS N = 14). In the absence of historical monitoring data, it may be reasonable to use recent HAA5 data to estimate past exposures if water sources and treatment methods did not change. When earlier CWS data are available, future studies may also evaluate whether HAA5 levels were stable over time when treatment methods and water sources did not change. Additional limitations include our inability to evaluate drinking water sources and exposures outside of the home, such as places of employment and recreation. We did not consider residential history in this analysis, and linked participants to their corresponding water source based on their address at enrollment and at the 6th survey. Future analyses that use the complete residential history should provide improved exposure estimates that allow for lagging of exposures as appropriate for the health outcome being studied.

Conclusions

In this assessment of drinking water exposures for the California Teachers Study, we generated estimates of long-term exposure for study participants that will be useful for epidemiologic studies of chronic disease outcomes. This study demonstrated that linkage of epidemiologic study populations’ address data to CWS boundaries represents a reasonable approach that is not likely to introduce substantial exposure misclassification. We observed heterogeneity in CWS exposures by participant race and ethnicity, and neighborhood SES and urbanicity. Identification of differential exposure to water contaminants is critical to the development of effective public health interventions that reduce drinking water exposures, exposure disparities, and associated health risks [72].