Introduction

Fecal contamination of water continues to be a major public health concern, with new challenges necessitating a renewed urgency in developing rapid and reliable methods to detect contamination and prevent human exposures. Aging sewer infrastructure in the USA and elsewhere will require rapid methods to assess fecal contamination of water [1, 2•, 3]. Extreme weather events, including floods, are forecast to become more frequent with climate change and have been associated with contamination of water resources [4,5,6]. In addition, the increasing threat of antimicrobial resistance is making it all the more important to lower the rates of infections across the globe, especially infections that require antibiotic treatment, and to identify environments contaminated with antibiotic-resistant pathogens [7,8,9].

Fecal indicator bacteria (FIB) have been used for over 150 years to indicate fecal contamination of water and associated health risks (Table 1). The latter half of the nineteenth century saw the discovery of waterborne disease transmission, perhaps most famously in the analysis of drinking water systems by John Snow during the 1854 London cholera outbreak and the isolation of Vibrio cholerae by Robert Koch in 1884 (though first identified in 1854 by Filippo Pacini) [10,11,12]. Recognition that sewage contamination of water sources spreads diseases such as cholera and typhoid necessitated a means by which to ascertain the presence of sewage in drinking water. Coliform bacteria, a group of typically harmless Gram-negative bacteria that constitute part of the natural gut microbiota in humans and other warm-blooded animals, provided a simple and reasonably reliable tool for diagnosing sewage pollution in drinking water samples owing to their high concentrations in sewage and ease of culture [13, 14]. A growing concern about fecal pollution in the wider environment and the potential for human exposure to enteric pathogens through additional environmental pathways, especially recreational and foodborne exposure routes, encouraged subsequent efforts to assess fecal contamination in an increasing variety of environmental matrices [15, 16]. Feces can contain a wide range of pathogens, which when introduced to the environment may persist for varying amounts of time, often at concentrations too low for reliable detection but still hazardous to human health [17,18,19]. Furthermore, conventional methods of enteric pathogen detection are generally time-consuming, expensive, and often insensitive even in fresh feces [20]. The use of FIB such as the fecal coliform Escherichia coli to suggest the presence of hazardous fecal pollution therefore continues to be a valuable tool to assess water quality.

Table 1 Indicators of fecal contamination for water quality assessments

Limitations of the fecal indicator paradigm have long been acknowledged [21,22,23]. Researchers have identified many challenges and limitations to the effective use of both traditional and alternative fecal indicators to characterize risk, identify sources, and evaluate interventions [24,25,26]. Arguably, one of the most significant limitations is the inconsistent relationships between FIB occurrence, enteric pathogens, and health risks [25, 27]. In settings with high rates of enteric infection and inadequate fecal waste management, E. coli in drinking water (but not other coliforms) has often been associated with increased risk of illness [28,29,30]. Similarly, the health risks of recreational uses of surface waters have been found to increase with FIB density, but generally only at locations with known human fecal inputs or under high-risk conditions, such as following precipitation or the removal of physical barriers [25, 31,32,33,34,35]. The FIB found to correlate with health risks also vary widely by site [32]. The co-occurrence of enteric pathogens and FIB in ambient waters is inconsistent at best [27, 36], and commonly used FIB are known to persist and grow in the environment [37,38,39,40,41,42].

In this paper, we review recent progress in the quest for improved indicators of fecal contamination in water. We summarize recent advances in alternative indicators with a focus on microbial source tracking markers. We also recognize the advances in molecular methods that are increasingly being used to detect fecal contamination of water and to identify sources of contamination. Improvements in detection capabilities, analytical sensitivity, and data quality are discussed along with barriers that must be overcome for wider adoption. We review efforts to characterize health hazards associated with fecal contamination, and we distinguish the timing and conditions when indicators appear best suited to identify risks to human health. Finally, we identify opportunities for continued improvements in the use of indicator organisms to assess environmental fecal pollution and to safeguard human health.

Alternative Indicators

Many alternative fecal microbes have been proposed to address the limitations of traditional FIB as indicators of fecal pollution [21, 22, 43]. Better surrogates that share environmental fate and transport mechanisms with pathogens of concern—particularly coliphages, viruses that infect E. coli bacteria, and obligate anaerobes, thought to have host specificity and to derive exclusively from recent fecal contamination—have frequently been identified as potential alternative indicators expected to better represent risks to human health [44,45,46]. Health risks from exposure to ambient fecal contamination are largely a function of the specific pathogens present and their concentrations in the exposure matrix, which is strongly influenced by the source of the fecal pollution [47,48,49,50,51]. Enteric viruses that primarily derive from human sources, notably human noroviruses, drive the global burden of gastrointestinal illness, including from recreational water exposures [2•, 52,53,54,55,56]. Highly persistent and infectious protozoan parasites are shed at high rates by livestock [2•, 57,58,59], and avian sources introduce bacterial pathogens like Campylobacter spp. and non-typhoidal Salmonella spp., particularly the poultry common in domestic environments in low- and middle-income countries (LMIC) where frequent exposure is likely [57, 60,61,62,63,64]. Given the differences in human health risks and appropriate mitigation strategies for fecal pollution from different sources, there is a need to identify not only the presence of fecal contamination but also its origins. Because traditional FIB cannot discriminate between fecal sources, host-associated fecal microbes have been the subject of extensive research in recent years for use as indicators of source-attributable fecal pollution, an approach known as microbial source tracking (MST) [23, 43].

Microbial Source Tracking

Numerous host-associated organisms and gene markers have been proposed for identifying sources of fecal pollution in water, none of which has demonstrated perfect source sensitivity and specificity [46]. Many MST markers target members of the order Bacteroidales, many of which are obligate anaerobes and abundant constituents of the gut microbiota in warm-blooded animals, including one of the most common and earliest-proposed human-associated molecular markers, HF183 [43, 46, 65, 66]. Because MST targets specific constituents of the gut microbiota, the diagnostic performance of each MST marker can vary substantially between populations. Validation of existing MST markers for use in new geographic locations is increasingly standard practice [24], with recently published MST validation studies conducted in Australia [67, 68], Bangladesh [69, 70], Costa Rica [71], India [72], Japan [73, 74], Mozambique [75], New Zealand [76], Nepal [77, 78], Singapore [79], Thailand [80], and the USA [81, 82], and a global evaluation of markers in sewage from 13 countries on 6 continents [83•]. Potential MST markers continue to be identified, most notably human-associated crAssphage, a bacteriophage infecting Bacteroides intestinalis recently discovered to be an abundant, globally distributed constituent of the human gut virome [84,85,86,87,88,89]. Human-associated E. coli markers, long-desired for their direct correspondence to a common FIB used for regulatory purposes, have also been developed [90,91,92], though they may lack the analytical sensitivity for effective use in ambient waters [93]. The identification of new markers is increasingly supported by advances in sequencing technology and bioinformatics [84, 94,95,96], and next-generation sequencing (NGS)–based MST approaches continue to be refined [97].
Although highly dependent on fecal library composition (the collection of metagenomic sequences from known fecal sources that informs source identification algorithms) [97,98,99,100,101], NGS-MST has the potential to identify finer distinctions between sources, as demonstrated by a study in Kenya that distinguished between fecal contamination from young children and adults [102•]. The recent introduction of more affordable and portable long-read sequencing platforms, while currently error prone, promises to accelerate the use of sequencing to characterize fecal contamination [103, 104].
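
The source sensitivity and specificity referred to above are typically estimated by testing a marker against fecal or sewage samples of known origin. The following is a minimal sketch of that calculation, assuming a simple presence/absence validation design; the class name, function names, and counts are hypothetical and are not taken from any of the cited validation studies.

```python
from dataclasses import dataclass


@dataclass
class ValidationCounts:
    """Presence/absence results for one MST marker tested on feces of known origin."""
    true_pos: int   # target-host samples in which the marker was detected
    false_neg: int  # target-host samples in which the marker was not detected
    true_neg: int   # non-target samples in which the marker was not detected
    false_pos: int  # non-target samples in which the marker was detected


def sensitivity(c: ValidationCounts) -> float:
    """Fraction of target-host samples that tested positive."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ValidationCounts) -> float:
    """Fraction of non-target samples that tested negative."""
    return c.true_neg / (c.true_neg + c.false_pos)


def accuracy(c: ValidationCounts) -> float:
    """Fraction of all samples classified correctly."""
    total = c.true_pos + c.false_neg + c.true_neg + c.false_pos
    return (c.true_pos + c.true_neg) / total


# Hypothetical validation of a human-associated marker:
# 20 human fecal samples and 50 animal fecal samples.
counts = ValidationCounts(true_pos=18, false_neg=2, true_neg=45, false_pos=5)
print(f"sensitivity={sensitivity(counts):.2f}, "
      f"specificity={specificity(counts):.2f}, "
      f"accuracy={accuracy(counts):.2f}")
```

Because marker performance can differ between populations, the same calculation would need to be repeated with locally collected reference samples before a marker is applied in a new setting.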

MST proponents typically advocate a “toolbox approach” to fecal source attribution that combines multiple MST markers, detection methods, and sampling strategies in recognition of the limitations of any single MST marker to reliably and conclusively characterize fecal pollution [105,106,107]. Two toolbox constituents recently receiving much attention in the literature are pepper mild mottle virus (PMMoV), a plant virus infecting Capsicum species acquired by humans from dietary sources, and crAssphage, both viruses that hold promise as human-associated viral surrogates owing to their global distribution in sewage at densities typically much higher than other viruses [2•, 71, 74, 78, 108,109,110]. Nonetheless, Bacteroides HF183 and its variants have arguably consolidated their role as the default tool for human source tracking [43], featuring consistently high concentrations in sewage globally [83•], frequent detection in surface waters [61, 93, 110,111,112], standardized protocols [81, 113], and validated multiplex assays [89, 114]. Even so, the diagnostic performance of HF183 and most other human-associated markers has typically been poor in highly contaminated settings in many LMIC [58, 69, 70, 72, 75, 115], with the exception of high sensitivity to child feces in urban Kenya [102•].

Successful identification of non-human fecal sources may best demonstrate the value of MST for informing management and research priorities. Unlike human-associated markers, animal fecal markers (e.g., livestock-associated BacCow and canine-associated BacCan, both with Bacteroidales targets, and avian-associated GFD, which targets the genus Helicobacter) have performed well in LMIC settings and have repeatedly identified livestock as major sources of fecal contamination and pathogens in the domestic environment, supporting recent calls for renewed emphasis on animal waste management [58, 116]. Likewise, MST investigations can impact management and mitigation programs by determining that wildlife, livestock, or pets contributed substantially to fecal pollution in certain watersheds and beaches [81, 117•, 118•, 119]. The forensic potential of MST was demonstrated during a 2019 Campylobacter outbreak in Norway, which was attributed to non-human sources, most likely horses, using a combination of FIB, MST, and direct pathogen detection [120]. MST has also been used to identify sources of antimicrobial resistance, the environmental dimensions of which remain poorly understood [121, 122]. Similar investigations are likely to increasingly carry legal implications, for instance, by implicating animal agriculture industries in unpermitted surface water pollution [13, 123, 124].

Molecular Methods: Challenges and Advances

The historic infeasibility of comprehensive direct pathogen detection in environmental waters continues to motivate the use of fecal indicators, which have traditionally been detected using culture-based methods. Growing FIB from water samples on selective media is routine and relatively inexpensive but generally requires a minimum of 18 hours to obtain results, by which time conditions at the sampling location may have dramatically changed [125,126,127]. Furthermore, the obligate anaerobes and host-specific viruses proposed as alternative indicators for MST are often not amenable to laboratory culture [43]. A range of alternative detection methods continue to be developed and are the subject of several recent comprehensive reviews [97, 128,129,130,131], with quantitative real-time polymerase chain reaction (qPCR) and related molecular methods that infer the presence of fecal microbes from their genetic material experiencing particularly widespread adoption [43].

Detection

By bypassing culture, samples can be analyzed by qPCR in as few as three hours or stabilized for transport and extended storage prior to analysis [126]. However, this analysis may detect residual signals from organisms that are not viable or infectious at the time of collection [132]. Although gene markers must be pre-specified, qPCR (alongside reverse-transcription PCR (RT-PCR) for RNA markers) provides a consistent approach for detecting targets ranging from FIB to viruses, human mitochondrial DNA, and genes conferring pathogenicity or antimicrobial resistance [27, 43, 133]. qPCR assays can also be multiplexed to detect a limited number of targets in a single reaction. Furthermore, the recent development of qPCR array cards that enable simultaneous detection of dozens of gene targets in a single sample demonstrates the growing feasibility of direct detection of a comprehensive set of enteric pathogens alongside functional genes and fecal indicators [134,135,136,137,138,139,140,141,142].

Analytical Sensitivity

Although qPCR is a sensitive method relative to culture and conventional PCR [20, 143], it is vulnerable to interference from other substances common in environmental waters that can reduce the availability of target DNA or inhibit polymerase function, limiting assay sensitivity [144]. Strategies to mitigate matrix interference include sample dilution or chemical treatment, nucleic acid purification, inhibition-resistant reagents, and the use of multiple processing and internal controls to both identify inhibited samples and competitively bind interfering substances [144,145,146]. Such approaches increase the complexity, expense, and time requirements for analysis, and physical removal of inhibitors through dilution or purification also reduces target DNA, providing diminishing returns to analytical sensitivity [147, 148]. Complete abatement of qPCR inhibition is likely unrealistic; nevertheless, recent efforts to standardize qPCR procedures for water quality assessment suggest that a set of existing mitigation practices is sufficient to render matrix interference a manageable nuisance in most applications [81, 144, 149•]. Increasing adoption of digital PCR (dPCR), a quantitative PCR approach robust to inhibition, will likely further alleviate the challenge of inhibition for routine molecular detection of fecal microbes [61, 114, 128, 150,151,152].
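
One common way internal controls are used to identify inhibited samples is to compare the quantification cycle (Cq) of a control spiked into each sample against the same control in clean reference reactions. The sketch below illustrates that logic only; the function name and the 1.5-cycle acceptance threshold are assumptions chosen for illustration, and the applicable criterion should be taken from the protocol actually in use.

```python
from statistics import mean


def flag_inhibition(sample_control_cq, reference_control_cqs, max_shift=1.5):
    """Flag a sample as potentially inhibited.

    sample_control_cq     -- Cq of the internal control measured in the sample
    reference_control_cqs -- Cq values of the same control in clean reference wells
    max_shift             -- tolerated Cq delay in cycles (illustrative assumption)

    Returns (is_inhibited, shift), where shift is the Cq delay in cycles.
    """
    shift = sample_control_cq - mean(reference_control_cqs)
    return shift > max_shift, shift


# Hypothetical run: three clean reference wells and two environmental samples.
reference = [30.1, 29.9, 30.0]
for name, cq in [("sample A", 30.6), ("sample B", 32.4)]:
    inhibited, shift = flag_inhibition(cq, reference)
    status = "possible inhibition" if inhibited else "acceptable"
    print(f"{name}: control delayed by {shift:.1f} cycles ({status})")
```

A flagged sample would typically be diluted or re-purified and analyzed again, at the cost of the reduced target DNA noted above.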

Improved assay sensitivity offers little benefit if the target is unlikely to be present in the test sample due to low ambient concentrations. While simulation studies indicate that the concentrations at which bacterial indicators represent elevated risk of illness are well above the limits of detection [153], enteric viruses, protozoan parasites, and some alternative indicators (e.g., coliphages) commonly require larger sample volumes for reliable capture, necessitating concentration methods to obtain test sample volumes that can be accommodated by the chosen detection method [44, 128]. Filtration approaches that allow simultaneous concentration of a wide range of organisms are increasingly used to process samples, including as part of automated large-volume samplers, prior to culture or molecular detection [128, 154,155,156]. Ultrafiltration techniques in particular have demonstrated reasonably efficient and consistent recovery for a variety of organisms, water types, and sample volumes, providing a natural complement to multitarget arrays, and increasingly appear to be the default concentration approach for many applications [154, 157,158,159,160,161]. Co-concentration of qPCR inhibitors during ultrafiltration is a concern, but effective inhibition mitigation has been demonstrated by further processing of the concentrate prior to analysis [160, 162,163,164].
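
The dependence of detection on sample volume can be illustrated with a simple calculation. The sketch below assumes that organisms are randomly (Poisson) distributed in the water and that any organism reaching the assay is detected; the function name, concentrations, volumes, and recovery efficiency are hypothetical.

```python
import math


def detection_probability(conc_per_l, volume_l, recovery=1.0, fraction_analyzed=1.0):
    """Probability that at least one target organism reaches the assay.

    conc_per_l        -- ambient concentration (organisms per liter)
    volume_l          -- volume of water sampled (liters)
    recovery          -- fraction of organisms recovered by the concentration method
    fraction_analyzed -- fraction of the final concentrate that is actually assayed
    """
    expected_count = conc_per_l * volume_l * recovery * fraction_analyzed
    # Under a Poisson assumption, P(count >= 1) = 1 - P(count = 0).
    return 1.0 - math.exp(-expected_count)


# Hypothetical pathogen present at 1 organism per 100 L.
p_grab = detection_probability(conc_per_l=0.01, volume_l=0.1)
p_concentrated = detection_probability(conc_per_l=0.01, volume_l=100.0, recovery=0.5)
print(f"100 mL grab sample:          {p_grab:.4f}")
print(f"100 L sample, 50% recovery:  {p_concentrated:.4f}")
```

Under these assumptions the large-volume sample is several hundred times more likely to capture the target, even after losing half of the organisms during concentration.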

Data Quality

Generalized data reporting guidelines notwithstanding [165], differences in analytical procedures and data handling practices were identified as major sources of variability in a multilaboratory comparison study of primarily qPCR-based MST approaches [24, 166]. Substantial effort has been devoted in recent years to the development and implementation of standardized protocols and quality control metrics for fecal indicator assessment by qPCR [113, 167, 168]. A notable feature shared by these protocols and other recent recommendations for improved reliability of molecular detection methods is a reliance on numerous controls throughout the procedure [128, 137]. While the use of positive and negative controls is standard for most analytical techniques, requirements outlined in standardized qPCR protocols generally include multiple serial dilutions of standard reference material to construct calibration curves, sample processing controls (SPC), method extraction blanks (MEB), internal amplification controls (IAC), and no template controls (NTC), with two or three replicates of each sample, control, and standard dilution series concentration analyzed on each instrument run [113]. Each additional reference or control material must be obtained, prepared, stored, and used in the appropriate manner, increasing per-sample costs and introducing considerable complexity and opportunity for user error. In a recent large demonstration project, some laboratories without extensive previous qPCR experience struggled to achieve adequate quality control despite receiving method-specific instrumentation, materials, and training. Even laboratories with substantial qPCR experience regularly failed to meet data quality criteria in this study [149•]. 
The compounding complexity required for reliable results suggests that qPCR in its current form may be unsuitable for routine monitoring purposes except in particularly well-resourced laboratories that regularly process sufficient sample numbers to warrant the equipment, properly maintain assay materials, and ensure sustained institutional experience. By contrast, culture-based FIB detection has grown increasingly accessible following recent efforts to develop low-cost field tests that can be performed with minimal equipment at ambient temperatures [125, 169, 170]. While still requiring substantial resources and expertise, dPCR requires fewer controls and fewer precise reference materials than qPCR because it is robust to matrix interference and offers absolute quantification, features which may position dPCR to be increasingly adopted for general use [171]. Both up-front and per-reaction costs are considerably higher for dPCR compared to qPCR, but the improved multiplexing performance, fewer required control reactions, and greater precision of dPCR present opportunities to mitigate differences in per-sample costs [114, 143, 172, 173].
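
The contrast between calibration-based and absolute quantification can be sketched as follows. qPCR estimates copy numbers by interpolating on a standard curve fitted to a dilution series, whereas dPCR estimates them from the fraction of positive partitions using Poisson statistics. All function names and numerical values below are hypothetical; the partition volume in particular is instrument-specific and is included only as an illustrative assumption.

```python
import math


def fit_standard_curve(log10_copies, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_copies, cq_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amplification_efficiency(slope):
    """Efficiency implied by the curve slope (1.0 means doubling every cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


def qpcr_copies(cq, slope, intercept):
    """Copies per reaction interpolated from the standard curve."""
    return 10 ** ((cq - intercept) / slope)


def dpcr_copies_per_partition(positive, total):
    """Mean copies per partition from the fraction of positive partitions."""
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total partitions")
    return -math.log(1.0 - positive / total)


# Hypothetical five-point dilution series of reference material (qPCR).
standards_log10 = [1, 2, 3, 4, 5]
standards_cq = [34.68, 31.36, 28.04, 24.72, 21.40]
slope, intercept = fit_standard_curve(standards_log10, standards_cq)
efficiency = amplification_efficiency(slope)
estimate = qpcr_copies(28.04, slope, intercept)

# Hypothetical dPCR run: 5000 positive partitions out of 20000.
lam = dpcr_copies_per_partition(5000, 20000)
partition_volume_ul = 0.00085  # illustrative assumption, instrument-specific
copies_per_ul = lam / partition_volume_ul

print(f"qPCR slope={slope:.2f}, efficiency={efficiency:.3f}, copies={estimate:.0f}")
print(f"dPCR copies per partition={lam:.4f}")
```

The qPCR estimate is only as good as the reference material and the fitted curve, which is one reason standardized protocols require a dilution series on every run; the dPCR estimate needs no curve but does depend on an accurate partition volume.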

Health Relevance and Protection

In addition to revealing fecal pollution and elucidating its sources, fecal indicators are widely used to characterize health hazards in waters potentially impacted by fecal contamination. This approach has proven somewhat effective in drinking water and for recreational exposures during wet weather or near point sources of fecal pollution [25, 28, 35]. A recent review found increased likelihood of co-detection of fecal indicators and enteric pathogens in recreational waters under similar conditions [27]. However, relationships between fecal indicators and gastrointestinal illness have mostly not been observed in waters impacted by non-point source pollution [25, 33, 174], despite well-documented risks to swimmers [31].

In the absence of consistent empirical relationships, quantitative microbial risk assessment (QMRA) has been used to estimate the health implications of various indicators introduced by different fecal sources [153, 175]. Notably, threshold concentrations at which MST markers correspond to increased risk of illness have been estimated in several QMRA studies; these thresholds are comfortably above the typical limits of detection, suggesting that the markers are highly likely to be detected should their associated pathogens be present at hazardous levels [2•, 3, 49, 56, 62, 176].
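
The logic by which a QMRA links an MST marker concentration to a risk threshold can be sketched in a few lines. The example below assumes a fixed pathogen-to-marker ratio, a fixed ingestion volume, and an exponential dose-response model; every parameter value is hypothetical, and published assessments generally use pathogen-specific dose-response models and propagate uncertainty by simulation rather than relying on point estimates.

```python
import math


def infection_probability(marker_per_100ml, pathogen_per_marker, ingested_ml, r):
    """Risk of infection from one exposure under an exponential dose-response model.

    marker_per_100ml    -- marker gene copies per 100 mL of water
    pathogen_per_marker -- assumed pathogens per marker gene copy
    ingested_ml         -- volume of water ingested per exposure (mL)
    r                   -- exponential dose-response parameter
    """
    dose = marker_per_100ml / 100.0 * pathogen_per_marker * ingested_ml
    return 1.0 - math.exp(-r * dose)


def marker_threshold(target_risk, pathogen_per_marker, ingested_ml, r):
    """Marker concentration (per 100 mL) at which risk reaches target_risk."""
    dose = -math.log(1.0 - target_risk) / r
    return dose / (pathogen_per_marker * ingested_ml) * 100.0


# Hypothetical inputs, chosen for illustration only.
params = dict(pathogen_per_marker=1e-3, ingested_ml=30.0, r=0.5)
threshold = marker_threshold(target_risk=0.03, **params)
check = infection_probability(threshold, **params)
print(f"marker threshold: {threshold:.0f} copies/100 mL (risk at threshold: {check:.3f})")
```

Comparing a threshold of this kind with the assay's limit of detection indicates whether the marker would be expected to be detectable when the associated risk is elevated.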

Indicator-based risk assessment requires defining the relationships between the indicator concentration and the index pathogens selected for consideration, typically a subset of pathogens expected to account for the majority of the risk and for which dose-response relationships have been characterized [55, 175]. A substantial body of research characterizing processes affecting indicator-pathogen relationships has culminated in the recent publication of several comprehensive reviews and meta-analyses of the occurrence, transport, and persistence of indicators and common index pathogens in fecal waste streams and surface water [17, 27, 177,178,179]. Associations between indicators and pathogens in surface water have been largely inconsistent, although empirical determination of these relationships is challenged by the limitations of direct pathogen detection; associations are more commonly observed among more frequently detected organisms [27, 36]. Microbial occurrence is more consistent in feces and particularly in sewage, which smooths the high individual variability in fecal microbe shedding by representing the combined fecal inputs of populations [177, 178, 180,181,182]. Despite less frequent detection of alternative indicators in recreational waters [27], high concentrations of multiple human-associated markers have been reported worldwide in both raw and biologically treated wastewater [83•]. Meta-analyses have also found high coliphage and norovirus densities in raw sewage around the world [178, 180]. A wide range of pathogens have frequently been detected in stormwater, though with greater variability and typically at lower concentrations than in sewage [179].

Upon introduction to the environment, microbial contaminants are subject to highly variable dispersal and decay processes [17]. Differential transport and decay of indicators and pathogens reduce associations between them that may have been present at the source, but the numerous factors affecting environmental fate and transport were previously poorly understood [24]. Many studies have since investigated the persistence of different organisms under various conditions, often using seeded mesocosms [17]. A recent QMRA study incorporated a meta-analysis of decay rates and found that the risk represented by a particular concentration of sewage-derived HF183 increased with time because the marker decayed faster than norovirus, the principal driver of risk [2•]. Conversely, another QMRA found that failing to account for differential decay overestimated the risk posed by animal fecal sources but did not meaningfully affect the risk from human sources, which in this study was dominated by viruses with decay characteristics similar to those of human-associated markers [49]. However, the multitude of factors that affect microbial fate and transport calls into question the generalizability of such assessments given the wide spatial and temporal variation in natural conditions, particularly across organisms and fecal sources [17, 183].
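
The effect of differential decay on the meaning of a marker measurement can be illustrated under the common simplifying assumption of first-order decay, C(t) = C0 · exp(−k·t). The sketch below is not a reproduction of either cited assessment; the function names, decay rates, and initial ratio are hypothetical.

```python
import math


def remaining_fraction(k_per_day, days):
    """Fraction of the initial concentration remaining after first-order decay."""
    return math.exp(-k_per_day * days)


def pathogen_per_marker(initial_ratio, k_marker, k_pathogen, days):
    """Pathogen-to-marker ratio after both have decayed for the same time."""
    return initial_ratio * math.exp((k_marker - k_pathogen) * days)


# Hypothetical rates: the marker decays faster than the pathogen.
K_MARKER = 1.0     # per day (assumption)
K_PATHOGEN = 0.3   # per day (assumption)
INITIAL_RATIO = 1e-4  # pathogens per marker copy in fresh sewage (assumption)

for day in (0, 1, 3, 5):
    ratio = pathogen_per_marker(INITIAL_RATIO, K_MARKER, K_PATHOGEN, day)
    print(f"day {day}: marker remaining={remaining_fraction(K_MARKER, day):.3f}, "
          f"pathogens per marker copy={ratio:.2e}")
```

When the marker decays faster than the pathogen, each remaining marker copy corresponds to more pathogens as contamination ages, so a fixed marker concentration implies greater risk; when the two rates are equal, the ratio is unchanged and aging has no effect on the interpretation.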

Applications and Recommendations

The use of FIB for fecal contamination assessment continues to have many applications and has expanded with the broad adoption of MST approaches. Routine monitoring of surface waters is widely conducted in order to assess regulatory compliance, characterize water quality trends, and provide timely warnings to protect public health [184,185,186,187,188,189]. Specific investigations, often supplemented with historical monitoring data, may be conducted to inform management strategies and remediation efforts and to evaluate the impacts of infrastructure, policy, and practices [59, 118•, 179, 190,191,192,193]. Forensic applications are increasingly pursued to assign responsibility for fecal pollution, largely enabled by wider adoption of MST approaches and molecular detection methods [13, 69, 70, 106, 117•, 120, 123].

Despite their widespread use, evidence for the suitability of indicators in evaluative applications remains mixed and appears to vary depending on the timing and conditions under which they are applied. Under favorable conditions that provide more proximate connections between indicators and their sources (e.g., near wastewater outfalls or in household drinking water), indicator abundance may be associated with increased risk of illness that one would expect with elevated fecal loads [25, 29, 194]. Interventions that directly impact sources, such as gull deterrence at beaches, may also be reflected in indicator concentrations [118•]. Contamination through less direct processes, such as non-point source pollution, is subject to the numerous factors affecting microbial fate and transport, which may account for the large temporal variability often observed in FIB concentrations and the lack of association with illness [25, 127, 195]. Such variability limits the amount of information conveyed by individual observations, requiring much larger datasets to disentangle trends in indicator occurrence from the inherent variance in indicator measurements [188, 189]. These limitations are especially pronounced when anticipated effects are indirect and small relative to typical indicator concentrations, which may be maintained in part by other sources and pathways of contamination [138, 192, 193, 196]. The outsized influence of precipitation on microbial concentrations may obscure less dramatic dynamics in many systems [195]. Furthermore, clear long-term indicator trends do not necessarily represent concomitant changes in pathogen hazards [157].

The increasing feasibility of comprehensive direct pathogen detection suggests that situations demanding a high degree of confidence about the presence of hazardous fecal contamination may be best served by assaying pathogens directly, utilizing concentration methods to improve sensitivity as appropriate [128]. The possibility of false negatives due to temporal and spatial variability, while partially addressable through strategies such as composite sampling, nevertheless suggests that general fecal indicators should continue to be assessed to complement direct pathogen detection efforts. Despite the recent introduction of procedures to simultaneously quantify multiple FIB, MST, and pathogen genes in under 4 hours [139], the expense, necessary expertise, and rapid pace of change likely preclude the routine application of direct pathogen detection for some time to come. Meanwhile, protecting public health in recreational waters remains an important (and legally mandated) goal. High-traffic beaches with established daily microbial water quality testing programs and dedicated laboratory facilities are likely to benefit from implementing rapid FIB qPCR monitoring with same-day notification [126, 197]. As such beaches are often located near large urban areas and impacted by human sources, they may further benefit from instead implementing simultaneous monitoring of FIB and human-associated markers by duplex dPCR to establish time trends in human-source contamination at little additional cost [112, 114]. Although associations between human-associated markers and gastrointestinal illness are generally lacking [32, 174], their regular application across multiple human-impacted locations may provide useful information to prioritize remediation efforts [111, 112].

Locations that host fewer recreators, have limited monitoring resources and sampling frequency, or are impacted by non-point sources, for which generalizable relationships between indicators and risk are lacking, are unlikely to realize similar benefits from adopting rapid molecular monitoring while incurring substantial additional expense, complexity, and opportunity for error [149•, 198, 199]. Rather, supplementing existing FIB monitoring programs with predictive modeling may present a more feasible approach for expanding the scope of microbial water quality assessment in the numerous surface waters for which monitoring resources are limited [200, 201]. Precipitation—perhaps the most consistent factor in recreational water quality, reliably increasing ambient fecal microbe concentrations and the risk of illness—likewise tends to drive predictive FIB model outcomes [119, 179, 195, 202,203,204]; for many applications, providing recreational guidance on the basis of recent precipitation may well present the most reliable method for protecting public health [110].
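
Guidance based on recent precipitation can be as simple as a cumulative rainfall rule. The sketch below assumes hourly rain gauge totals are available; the function name, the 48-hour window, and the 10 mm threshold are hypothetical placeholders that would need to be calibrated against local monitoring data.

```python
def rain_advisory(hourly_rain_mm, window_hours=48, threshold_mm=10.0):
    """Return True if cumulative rainfall in the most recent window meets the threshold.

    hourly_rain_mm -- hourly rainfall totals (mm), oldest first
    window_hours   -- number of most recent hours to sum (illustrative assumption)
    threshold_mm   -- rainfall total that triggers an advisory (illustrative assumption)
    """
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    recent = hourly_rain_mm[-window_hours:]
    return sum(recent) >= threshold_mm


# Hypothetical 72-hour records.
dry_spell = [0.0] * 72
recent_storm = [0.0] * 71 + [15.0]   # 15 mm in the most recent hour
old_storm = [15.0] + [0.0] * 71      # 15 mm fell 72 hours ago

for label, record in [("dry", dry_spell), ("recent storm", recent_storm),
                      ("old storm", old_storm)]:
    print(f"{label}: advisory={rain_advisory(record)}")
```

A rule of this kind requires no laboratory analysis and can be issued immediately, which is its main appeal for locations with limited monitoring resources.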

Conclusions

The value of fecal indicators as investigative tools to identify fecal pollution has been reaffirmed and expanded with the broad adoption of MST approaches, despite imperfect sensitivity and specificity [58, 81, 110, 118•, 119]. Also, the literature on many technical aspects of fecal indicators and their applications has notably matured, as demonstrated by the recent publication of several comprehensive reviews [2•, 17, 25, 27, 44, 144, 153]. The improved understanding of microbial dynamics and detection approaches has supported the development of more nuanced and robust procedures for characterizing fecal pollution. Nevertheless, this body of work also serves to emphasize the incredible complexity and variability of fecal microbes in the environment and reinforces the challenges to their effective use.

Major challenges remain in source apportionment, risk characterization, and impact evaluation. Additional research is needed to further refine indicators of fecal contamination and to add tools to the toolbox appropriate for emerging challenges. New indicators are needed to detect antimicrobial-resistant bacteria and resistance genes in water samples and to link environmental antimicrobial resistance to health risks. Better risk characterizations are needed to improve risk modeling and to expand the timing and conditions under which these models can reliably predict threats to human health. Also, empirical models that identify associations between indicators and co-measured predictors, particularly rainfall, can likely alleviate some of the sample burden associated with water quality assessments. Direct pathogen detection is becoming more feasible than in previous years and is likely to be more of a focus for water quality tests in the future. Together, these advances are improving water quality assessments and identifying appropriate actions to safeguard public health across the globe.