Quantifying the contribution of microbial immigration in engineered water systems
Immigration is a process that can influence the assembly of microbial communities in natural and engineered environments. However, it remains challenging to quantitatively evaluate the contribution of this process to the microbial diversity and function in the receiving ecosystems. Currently used methods, i.e., counting shared microbial species, microbial source tracking, and the neutral community model, rely on abundance profiles to reveal the extent of overlap between the upstream and downstream communities. Thus, they cannot quantify the contribution of immigrants to the downstream community function, because the activities of individual immigrants after entering the receiving environment are not considered. This limitation can be overcome by using an approach that couples a mass balance model with high-throughput DNA sequencing, i.e., ecogenomics-based mass balance. It calculates the net growth rate of individual microbial immigrants and partitions the entire community into active populations that contribute to the community function and inactive ones that carry minimal function. Linking the activities of immigrants to their abundance further quantifies the contribution from an upstream environment to the downstream community. Considering only active populations can improve the accuracy of identifying key environmental parameters dictating process performance using methods such as machine learning.
Keywords: Microbial immigration, Mass balance, Engineered water systems, Microbiome
Abbreviations: WWTP, wastewater treatment plant
Microbial communities play essential roles in biogeochemical cycles in natural and engineered ecosystems. To study how different microorganisms assemble into a community and contribute to the function of an ecosystem, various theories, including the niche and neutral theories, have been developed. In the neutral theory of biodiversity and biogeography, immigration is one of the key stochastic processes that, together with death and birth, change the community assemblage. This process, sometimes referred to as migration, was originally used in macroecology to estimate the rate at which new bird species enter a remote island from the nearest land mass, i.e., the chance of immigration, which plays a pivotal role in the equilibrium of the island fauna’s diversity. As the definition of immigration can vary considerably, this review adopts the one stated by Bell and defines immigration as the process of a microbial individual being added to a local community from the species pool of the metacommunity, which consists of a set of local communities that are physically linked by immigration and can exchange colonists of multiple species. A similar term often used is dispersal [8, 9, 10]. While dispersal and immigration may differ slightly in specific contexts and one may even include the other, there is still no consensus on the difference [10, 11, 12, 13]. Therefore, we do not attempt to discuss the differences between immigration and dispersal here. We further consider all microorganisms that arrive at the local communities as immigrants, regardless of how they arrive (e.g., facilitated by cell motility or flow of water or air) and how they contribute to the local community after arrival.
While microbial immigration is frequently reported in engineered water systems, it remains challenging to quantitatively address “to what extent does immigration contribute to the assembly and function of the downstream community?” In this article, we focus on the methodology for quantifying microbial immigration, first reviewing the methods that are currently used and identifying their limitations. We then review an approach that calculates the net growth rates of individual microbial immigrants in the downstream community. It couples a mass balance model with high-throughput DNA sequencing to partition the microbial assembly into active populations that contribute to community function and inactive ones that carry minimal microbial function. Finally, its potential use together with machine learning to identify key environmental parameters affecting the microbial ecosystem’s function is discussed.
Methods commonly used to evaluate immigration impact
The second approach is microbial source tracking, which estimates the proportion of taxa in the downstream (sink) community that come from multiple upstream (source) environments (Fig. 2b). The basic rationale is that taxa that are more abundant in a source have a higher probability of being observed in the sink, and this relationship is used to infer the contribution of each source. This method has been applied to sink environments that receive immigration from multiple sources, such as the residential kitchen microbiome, which is subject to source microbiota from human palm skin, produce, and faucet water, and the public restroom microbiome, which is subject to source microbiota from soil, water, and human urine, gut, mouth, and skin. However, the method assumes that all the observed microbial populations in the sink community come from source environments and ignores the fact that some active microorganisms can reproduce rapidly after entering the sink. When the abundance of immigrants increases through such growth, the source tracking method cannot fully explain this fraction of the community composition from known sources. As a result, the majority of the sink community is sometimes labeled as unknown, whereas only a small proportion can be explained by known sources. The observation of unknown sources is especially common in systems with high microbial activities, such as wastewater treatment processes [32, 33, 34].
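The mixing rationale behind source tracking can be illustrated with a deliberately simplified sketch. It is not the procedure used by published source-tracking tools: it swaps in a plain expectation-maximization estimate of mixing proportions, and the function name, taxon labels, and read counts are all hypothetical. Reads from taxa that are absent from every source are simply reported as "unknown".

```python
def estimate_source_proportions(sink_counts, sources, n_iter=200):
    """Estimate the fraction of sink reads attributable to each source.

    sink_counts: dict taxon -> read count in the sink sample.
    sources: dict source name -> dict taxon -> relative abundance in that source.
    Returns dict source name (plus "unknown") -> fraction of sink reads.
    """
    names = list(sources)
    total = sum(sink_counts.values())
    if total == 0 or not names:
        raise ValueError("need a non-empty sink sample and at least one source")

    # Reads from taxa absent from every source cannot be explained by mixing.
    explained = {
        taxon: count
        for taxon, count in sink_counts.items()
        if count > 0 and any(sources[k].get(taxon, 0.0) > 0 for k in names)
    }
    explained_total = sum(explained.values())

    # EM iterations: start from equal mixing proportions.
    alpha = {k: 1.0 / len(names) for k in names}
    for _ in range(n_iter):
        assigned = {k: 0.0 for k in names}
        for taxon, count in explained.items():
            weights = {k: alpha[k] * sources[k].get(taxon, 0.0) for k in names}
            denom = sum(weights.values())
            if denom == 0.0:
                continue
            # E-step: split this taxon's reads among sources by posterior weight.
            for k in names:
                assigned[k] += count * weights[k] / denom
        norm = sum(assigned.values())
        if norm == 0.0:
            break
        # M-step: renormalize the expected read assignments.
        alpha = {k: assigned[k] / norm for k in names}

    result = {k: alpha[k] * explained_total / total for k in names}
    result["unknown"] = (total - explained_total) / total
    return result


# Hypothetical example: two sources and one sink sample.
sources = {
    "source_A": {"taxon_1": 0.8, "taxon_2": 0.2},
    "source_B": {"taxon_1": 0.1, "taxon_2": 0.1, "taxon_3": 0.8},
}
sink = {"taxon_1": 590, "taxon_2": 170, "taxon_3": 240, "taxon_4": 250}
proportions = estimate_source_proportions(sink, sources)
```

In this toy case, `taxon_4` occurs in neither source, so its reads end up in the "unknown" fraction. The estimate uses abundance alone and carries no information about whether any taxon is growing in the sink.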
The third approach uses the neutral model (Fig. 2c) developed by Sloan and coworkers. By fitting the abundance-frequency distribution of individual microbial species, a species-independent immigration probability m is calculated, which is uniform for every community member. A small m value suggests that the community as a whole comprises a low proportion of immigrants from the source. This model has been frequently used to evaluate the relative importance of neutral mechanisms and to directly assess the immigration rate at the community level by calculating the m value. Using this model, Ayarza and Erijman revealed that neutral processes were important in the assembly of the activated sludge community. In a drinking water distribution system, the model was used to demonstrate that the role of immigration from the city water supply to tap water was greater at the proximal end than at the distal end of indoor plumbing. Likewise, studies have compared m values to reveal a higher immigration impact in planktonic communities than in sedimentary communities in the Yangtze River, and in deep-water communities than in surface water communities. Despite the success of the model in explaining the general trend of the abundance-frequency distribution, there are always species that deviate significantly from the S-shaped fitted curve. This is likely caused by assuming a constant immigration rate for all community members, which ignores the fact that immigrants that grow actively in the new environment can become more abundant than predicted.
All three methods described above merely enumerate immigrants and cannot fully reveal the impact of microbial immigration on community functions. As microorganisms, in contrast to inert and homogeneous particles, can carry out considerably diverse activities after entering a new environment, it is important to address “how many microbial immigrants are able to actively contribute to ecological functions.”
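For orientation, the quantity fitted in this model can be written as below. This is a sketch of the form in which the neutral model is commonly applied, not a derivation. The symbols $N_T$ (number of individuals in the local community, often taken as the number of reads per sample), $p_i$ (mean relative abundance of taxon $i$ in the source), $d$ (detection limit, often set to $1/N_T$), and $f_i$ (expected detection frequency) do not appear in the text above and are introduced here as assumptions.

```latex
f_i \;=\; \Pr(x_i \ge d)
    \;=\; 1 - I_{d}\!\bigl(N_T\, m\, p_i,\; N_T\, m\,(1 - p_i)\bigr)
```

Here $x_i$ is the relative abundance of taxon $i$ in a local community, assumed to follow a beta distribution with the two shape parameters shown, and $I_d(a, b)$ is the regularized incomplete beta function evaluated at $d$. Because a single $m$ is shared by all taxa, fitting $m$ to the observed pairs of $p_i$ and detection frequency yields one S-shaped curve for the whole community.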
Quantifying immigration impact with the consideration of microbial activity
To quantitatively evaluate the immigration impact, both the abundance and activity of individual immigrants should be considered. Compared with determining abundance, assessing the activities or growth of individual immigrants in a given microbial ecosystem is challenging. In pure cultures, microbial activity can be assessed by measuring substrate consumption, metabolite production, or cell density change during a period of incubation. However, only a small fraction of microbes in nature can be cultivated, and microbial activities determined in pure culture can differ drastically from those in a complex community under environmental conditions. Sequencing 16S ribosomal RNA (rRNA) genes and other biomarkers is often used to identify microorganisms present in the environment but cannot effectively distinguish active populations from inactive or dormant species. Likewise, metagenomics reveals the functional potential of community members but cannot distinguish expressed from non-expressed pathways. Directly sequencing rRNA can identify active microbes, but the consistency of this approach can be affected by discrepancies between rRNA content and microbial activity. Sequencing messenger RNA (mRNA), i.e., metatranscriptomics, provides accurate identification of highly expressed genes and active populations from environmental samples, but this approach is challenged by the scarcity of well-annotated, high-quality reference genomes. Nucleotide sequencing can be coupled with methods such as microautoradiography, stable isotope probing, or nano secondary ion mass spectrometry that label specific substrates to link substrate uptake activity with microbial identity. It is also possible to label and visualize specific active populations using fluorescence in situ hybridization designed to target rRNA. These methods, however, cannot target all community members because of the cost and time associated with labeling individual substrates or organisms.
In addition, metaproteomics and metabolomics can characterize the entire collection of proteins or metabolites of a given sample and provide direct measurement of microbial activity, but they also face challenges in preparing high-quality samples from complex environments and in linking proteins or metabolites with microbial identity. Overall, it is still expensive and time-consuming to use these ecological tools to quantify the in situ activities of most microbial populations in a complex ecosystem.
Quantifying immigration impact at individual population level using ecogenomics-based mass balance approach
Saunders et al. first developed this method and studied three WWTPs. Thirty-five percent of the observed species in activated sludge reactors were also detected in the influent wastewater, suggesting a strong immigration impact. However, the mass balance revealed that the majority of the shared species had a negative net growth rate, suggesting that they did not actively contribute to the metabolism in activated sludge. A few immigrants had a positive growth rate in activated sludge, indicating that they were indeed active in situ. Overall, the authors concluded that immigration had a modest impact on the activated sludge community, considering that there were both inactive and active immigrants with moderate abundance.
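The calculation behind such a mass balance can be sketched as follows. This is a minimal illustration under stated assumptions, not the published implementation. It assumes a steady-state control volume with one influent and one effluent. The cell flux of a population is taken as flow times total cell concentration times relative abundance, and the net growth rate as (flux out minus flux in) divided by the standing stock in the reactor. All class names, function names, and numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Compartment:
    """A stream or reactor described by sequencing plus a bulk cell count."""
    cells_per_m3: float              # total cell concentration
    rel_abundance: Dict[str, float]  # taxon -> fraction of sequences

    def cells(self, taxon: str, volume_m3: float) -> float:
        """Cells of one taxon contained in (or carried by) a given volume."""
        return volume_m3 * self.cells_per_m3 * self.rel_abundance.get(taxon, 0.0)


def net_growth_rates(influent, effluent, reactor, flow_m3_per_day, reactor_volume_m3):
    """Steady-state net growth rate (per day) of each taxon found in the reactor."""
    rates = {}
    for taxon in reactor.rel_abundance:
        standing_stock = reactor.cells(taxon, reactor_volume_m3)
        if standing_stock == 0.0:
            continue  # rate is undefined for taxa absent from the reactor
        flux_in = influent.cells(taxon, flow_m3_per_day)    # cells per day entering
        flux_out = effluent.cells(taxon, flow_m3_per_day)   # cells per day leaving
        rates[taxon] = (flux_out - flux_in) / standing_stock
    return rates


def classify(rates):
    """Label populations as active (rate > 0) or inactive (rate <= 0)."""
    return {taxon: ("active" if mu > 0 else "inactive") for taxon, mu in rates.items()}


# Hypothetical, fully mixed reactor without biomass retention,
# so the effluent has the same composition as the reactor.
influent = Compartment(1e12, {"immigrant": 0.5, "other_influent": 0.5})
reactor = Compartment(1e12, {"immigrant": 0.1, "resident": 0.9})
effluent = reactor

rates = net_growth_rates(influent, effluent, reactor,
                         flow_m3_per_day=100.0, reactor_volume_m3=1000.0)
labels = classify(rates)
```

In this toy case the population labeled `immigrant` is abundant in the influent but diluted in the reactor, so its net growth rate is negative. The `resident` population is absent from the influent and must grow to offset washout, so its rate is positive.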
Mei et al. applied the ecogenomics-based mass balance approach to anaerobic digesters that receive massive biomass from upstream activated sludge. Based on the result that populations with a negative net growth rate accounted for 25% of the total sequences in the digesters, a strong immigration impact from the aerobic waste sludge fed to the digesters was concluded. Phylogenetic analysis confirmed that inactive populations were associated with aerobes or facultative anaerobes, whereas active populations were associated with obligate anaerobes. This study also reported the bias associated with the use of 16S rRNA-based relative abundance as an activity indicator in environments under high immigration impact. It is possible that some immigrants, which are the major populations in the previous environment, still contain a high copy number of rRNA after moving to a downstream environment. Thus, rRNA-based calculations can overestimate the relative abundance and contribution of these populations in the downstream community. The same rationale was further applied to multiple full-scale digesters around the world [52, 53], and the findings revealed that immigration from feed sludge to the digester communities was a ubiquitous phenomenon and that the extent of its contribution was also influenced by the operating conditions and pretreatments related to anaerobic digestion.
The ecogenomics-based mass balance approach was also applied to an industrial WWTP to demonstrate its effectiveness in teasing apart the interaction between neutral and niche-based mechanisms. In the studied WWTP, immigrants from an upstream anaerobic reactor were inactive (net growth rate ≤ 0) and represented a negligible fraction (1% of the total sequences) of the downstream activated sludge community, implying a weak immigration impact. Nevertheless, these immigrants were found to affect the prediction of key environmental parameters from community composition using a machine-learning tool. To do so, a supervised learning regressor was first trained on a set of samples with known physicochemical parameters such as temperature, pH, and nutrient concentrations and then used to predict the target values of the remaining samples. Parameters predicted with higher accuracy were considered to play more important roles in shaping the microbial community. After inactive immigrants were removed from the downstream community, the prediction accuracy improved greatly. This result suggests that key environmental parameters identified from community composition should be interpreted with more caution. Commonly used methods, including k-means clustering, principal components analysis, principal coordinate analysis, non-metric multidimensional scaling, and redundancy analysis, rely solely on DNA-based microbial abundance and pay little attention to the existence of inactive immigrants. Efforts to correlate observed species abundance with environmental conditions would be notably biased in an open ecosystem where inactive populations are introduced by immigration.
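The filtering step that would precede such model training might look like the sketch below. The text does not specify an implementation, so the function name, the example values, and the decision to retain taxa that lack a growth-rate estimate are all assumptions made for illustration. The filtered profiles would then be passed to whichever regressor is used.

```python
def remove_inactive(counts, growth_rates):
    """Drop taxa with net growth rate <= 0 and renormalize the remainder.

    counts: dict taxon -> read count in one downstream sample.
    growth_rates: dict taxon -> net growth rate from the mass balance.
    Taxa without a growth-rate estimate are retained (an assumption).
    Returns dict taxon -> relative abundance among the retained taxa.
    """
    kept = {}
    for taxon, count in counts.items():
        mu = growth_rates.get(taxon)
        if mu is None or mu > 0:
            kept[taxon] = count
    total = sum(kept.values())
    if total == 0:
        return {}
    return {taxon: count / total for taxon, count in kept.items()}


# Hypothetical sample: taxon_c is an inactive immigrant.
sample_counts = {"taxon_a": 60, "taxon_b": 30, "taxon_c": 10}
sample_rates = {"taxon_a": 0.2, "taxon_b": 0.05, "taxon_c": -0.3}
filtered = remove_inactive(sample_counts, sample_rates)
```

Renormalizing after removal keeps each sample's profile summing to one, so the predictor variables stay on the same scale before and after filtering.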
Wastewater systems are often designed with complex process configurations in which a high biomass flux from one process to the next can take place. Sampling and control in these systems at different temporal and spatial scales are easier than in natural environments. Therefore, these systems present an excellent opportunity to apply the ecogenomics-based mass balance approach to quantify the contribution of microbial immigration to community composition and function. Furthermore, this method can be applied to non-wastewater systems where microbial immigration is commonly present but its contribution is rarely quantified. The differentiation of inactive populations is especially valuable in environments where the growth rates of microorganisms can be more heterogeneous than in highly selective wastewater systems. For example, biofilm growth in drinking water distribution pipes has been recognized as an important process that affects drinking water quality [21, 22]. The mass balance model can be used to characterize the growth and immigration of different organisms in the biofilm, especially those posing risks to human health. To do so, a section of pipe can be considered as the control volume, with fresh city water and tap water as the influent and effluent, respectively. Organisms that can scavenge substrates in the pipe will exhibit higher growth rates and can be released into the tap water. Different conditions and parameters related to the mass balance can be tested to assess their impacts on immigration, including the disinfection method of the city water supply, the size and material of the pipe, the temperature of the environment, and the duration of water stagnation. These results can provide guidance for improving drinking water quality and preventing waterborne disease outbreaks from the perspective of microbial ecology.
While measuring the mass flux and cell count related to biofilm can be challenging, these challenges can be addressed, for example, by harvesting the biofilm after a period of development and enumerating cells with flow cytometry, as demonstrated in a recent study. Optical coherence tomography is another effective and non-destructive way to determine the biofilm mass, and possibly its change (e.g., in terms of volume), on the pipe inner surface. In other systems where diverse ecological functions are carried out, functional genes, such as mcrA for methanogenesis and amoA for nitrification, can be used instead of the 16S rRNA gene to monitor a subset of populations with specific function(s). Besides marker genes, metagenomics and metatranscriptomics can be used to estimate the abundance and activity, respectively, of individual immigrants with higher resolution. Beyond prokaryotic populations, the immigration of viruses and eukaryotes from an upstream process to a downstream process can also be monitored and quantified.
Table: Comparison of the commonly used methods that quantify immigration impact and the ecogenomics-based mass balance
Methods compared (columns): commonly used methods; ecogenomics-based mass balance
Criteria compared (rows): abundance of total immigrants; abundance of individual immigrants; activity of individual immigrants; multiple upstream environments; multiple downstream environments; cell number estimation
Microbial immigration is a ubiquitous and important process in engineered water systems, and it allows microbes present in an upstream system to enter a downstream receiving system and influence its microbial assembly and function. To understand the impact of microbial immigration, both qualitative and quantitative methods are necessary and have been developed. Commonly used methods are recognized to have limitations in quantifying the immigration impact. The ecogenomics-based mass balance approach provides a solution by quantitatively determining the activity profile of all microbial populations in a community. This approach can effectively identify inactive populations, especially those resulting from immigration, and pinpoint the microorganisms that are actually carrying out the process of interest. Furthermore, when coupled with methods such as machine learning, it can better identify key environmental parameters affecting system performance, which can guide the monitoring and design of biological processes. It is foreseen that such an approach can be widely applied to various engineered and possibly natural environments, where the contribution of microbial immigration remains to be further characterized.
RM conceived the idea and wrote the manuscript. WTL conceived the idea and reviewed the manuscript. Both authors read and approved the final manuscript.
The authors declare that they have no competing interests.
- 3. Hubbell SP. The unified neutral theory of biodiversity and biogeography (MPB-32). Princeton: Princeton University Press; 2001.
- 24. Wells GF, Wu CH, Piceno YM, Eggleston B, Brodie EL, DeSantis TZ, Andersen GL, Hazen TC, Francis CA, Criddle CS. Microbial biogeography across a full-scale wastewater treatment plant transect: evidence for immigration between coupled processes. Appl Microbiol Biotechnol. 2014;98(10):4723–36.
- 34. Ahmed W, Staley C, Sadowsky MJ, Gyawali P, Sidhu JPS, Palmer A, Beale DJ, Toze S. Toolbox approaches using molecular markers and 16S rRNA gene amplicon data sets for identification of fecal pollution in surface water. Appl Environ Microbiol. 2015;81(20):7067.
- 42. Poretsky RS, Gifford S, Rinta-Kanto J, Vila-Costa M, Moran MA. Analyzing gene expression from marine microbial communities using environmental transcriptomics. J Vis Exp. 2009;24:e1086.
- 47. Sekiguchi Y, Kamagata Y, Nakamura K, Ohashi A, Harada H. Fluorescence in situ hybridization using 16S rRNA-targeted oligonucleotides reveals localization of methanogens and selected uncultured bacteria in mesophilic and thermophilic sludge granules. Appl Environ Microbiol. 1999;65(3):1280–8.
- 55. McHardy IH, Goudarzi M, Tong M, Ruegger PM, Schwager E, Weger JR, Graeber TG, Sonnenburg JL, Horvath S, Huttenhower C. Integrative analysis of the microbiome and metabolome of the human intestinal mucosal surface reveals exquisite inter-relationships. Microbiome. 2013;1(1):17.
- 60. Shen Y, Huang PC, Huang C, Sun P, Monroy GL, Wu W, Lin J, Espinosa-Marzal RM, Boppart SA, Liu W-T, et al. Effect of divalent ions and a polyphosphate on composition, structure, and stiffness of simulated drinking water biofilms. NPJ Biofilms Microbiomes. 2018;4(1):15.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.