Introduction

Groundwater ecosystems harbour the largest terrestrial biome, accounting for up to 40% of the earth’s freshwater prokaryotic biomass (Griebler and Lueders 2009; Griebler et al. 2014). This rich biodiversity is threatened on a global scale by groundwater contamination, and landfill operations alone account for thousands of cases worldwide. A considerable body of research has focused on characterising landfill leachate chemistry (Kjeldsen et al. 2002; Masoner et al. 2020; Zhao et al. 2018) and groundwater leachate plumes (Abiriga et al. 2020, 2021d; Bjerg et al. 2011; Christensen et al. 2001). While research on leachate microbiology is now gaining momentum (Rajasekar et al. 2018; Song et al. 2015b; Staley et al. 2018; Zainun and Simarani 2018), our understanding of the microbial ecology of landfill-leachate-impacted aquifers remains scant. Landfill leachate is chemically complex (Christensen et al. 2000; Eggen et al. 2010; Moody and Townsend 2017; Mouser et al. 2005) and may contain chemicals that are toxic to microorganisms in the leachate-receiving groundwater. As landfill leachate production may last for decades to centuries (Bjerg et al. 2011), the long-term release of toxic compounds can even permanently eliminate native species through chronic disturbance (Herzyk et al. 2017; Song et al. 2015a). Landfill contaminations are thus press perturbations, that is, disturbances caused by the persistent discharge of contaminants into an environmental medium such as groundwater, soil, lakes, or rivers (Zhou et al. 2014).

The complex mix of contaminants in landfill leachate limits the applicability of conventional treatments for landfill leachates and necessitates more robust approaches. Natural attenuation is considered superior in this respect (Mouser et al. 2005). The better treatment outcome from natural attenuation signifies the roles of the intrinsic microorganisms, because biodegradation, which is a core process in natural attenuation, is mediated by microbes. Studying the microbiology of landfill leachate plumes not only informs on the effect of the leachate on the microbial communities, but also informs on the potential of the resident microbial communities to degrade contaminants in the plumes.

Previous microbiological studies from landfill leachate plumes (Abiriga et al. 2021a; Lu et al. 2012; Mouser et al. 2005; Taş et al. 2018) have almost exclusively focused on one aspect of microbial ecology such as alpha diversity, beta diversity, or microbial functions. We previously showed how multiple factors can affect microbial community composition in an aquifer (Abiriga et al. 2021b). While these studies have given significant insights into the microbiology of landfill leachate plumes, microbial co-occurrence and the relative importance of deterministic versus stochastic microbial community assembly remain unexplored. Knowing whether microbial communities assemble deterministically or stochastically is crucial to understanding how the communities evolve and persist. This presents an important knowledge gap in our understanding of the microbial ecology of landfill-perturbed aquifers.

Network analysis has been successfully applied to study microbial co-occurrence across a multitude of habitats (Barberán et al. 2012; de Vries et al. 2018; Horner-Devine et al. 2007; Ju et al. 2014; Lupatini et al. 2014; Williams et al. 2014) and has helped resolve aspects of microbial ecology that cannot be addressed by community metrics such as alpha and beta diversities (Lupatini et al. 2014). Analysis of co-occurrence patterns can decipher otherwise inaccessible aspects of complex microbial systems, for example by providing information on the ecological traits of uncharacterised microbes that co-occur with well characterised microbes (Barberán et al. 2012; Fuhrman 2009; Williams et al. 2014). This may allow such taxa, which are very difficult to cultivate in the laboratory, to be grown in co-culture with the well characterised species (Lupatini et al. 2014). In the present study, the contribution of deterministic processes in shaping the aquifer microbiology was quantified by applying a multivariate statistic that leverages the environmental data. Coupling network analysis to multivariate statistics offers a better interpretation of microbial community data (Williams et al. 2014).

The main objectives of the study were to answer the following questions: (i) which taxa show strong and significant interactions? (ii) Which are the keystone taxa and how do they compare among sampling wells? (iii) Do the microbial taxa in the aquifer assemble deterministically or stochastically? Answering these fundamental ecological questions should give insight into the microbial ecology of understudied landfill-perturbed environments.

Materials and methods

The study aquifer and field procedures

The study aquifer is a confined aquifer of Quaternary glaciofluvial deposit located in southeast Norway (Fig. 1). The aquifer matrix is characterised by medium to high permeability sand and gravel (Abiriga et al. 2020; Klempe 2004, 2015). It is a small aquifer fed by a small watershed (Klempe 2015). In the period 1974–1996, a municipal landfill was operated in the area and because no leachate containment system was in place, the leachate from the landfill contaminated the aquifer. Additional information on the study site is accessible elsewhere (Abiriga et al. 2020, 2021c, d; Klempe 2004, 2015).

Fig. 1 Sample site showing the landfill, sampling wells and the site hydrogeology

Groundwater samples were collected twice a year, in spring and autumn, in 2018 and 2019 from four monitoring wells: R1 (the proximal well), R2 (the intermediate well) and R4 (the distal well) located in the contaminated aquifer, and R0 (the background well) located in a nearby uncontaminated aquifer (Fig. 1). The proximal, intermediate, and distal wells are located downstream of the landfill at 26 m, 88 m, and 324 m, respectively. The proximal and intermediate wells are multilevel sampling wells constructed using the Waterloo Groundwater Monitoring System, equipped with five and four levels, respectively. The distal well consists of a cluster of three 25 mm diameter PVC pipes installed at different depths. In this study, however, only one level was considered in each of the wells: R104 (proximal), R203 (intermediate), and R402 (distal); all from the middle level of the aquifer.

In total, 48 groundwater samples were analysed: three samples were taken in spring and another three in autumn of each year, giving 12 samples from each of the four wells over the 2-year period. Groundwater samples from the proximal and intermediate wells were obtained by a repeated cycle of applying nitrogen pressure through drive valves and venting, until groundwater emerged through the Teflon tubes with a gentle pulsating flow. Samples from the distal well were taken using a hand pump, while those from the background well were taken using a submersible pump. In all cases, samples were collected after purging the well volume (distal and background wells) or micro-purging (proximal and intermediate wells) in accordance with ISO 5667-11 (2009). Samples for microbiology were collected in sterile 350 ml PETE bottles (VWR, UK) without headspace, while those for groundwater geochemistry were collected in 500 ml PETE bottles. pH and electrical conductivity were determined onsite, while dissolved oxygen was fixed onsite and later determined in the laboratory using the Winkler method (Winkler 1888). The samples were maintained at ≤ 4 °C in cooler boxes and transported to the laboratory at the University of South-Eastern Norway.

Laboratory procedures

Groundwater chemical analyses have been described previously (Abiriga et al. 2021a). The samples were analysed for 15 physicochemical parameters: pH, dissolved oxygen, electrical conductivity, sodium, potassium, ammonium, calcium, magnesium, iron, manganese, chloride, nitrate, alkalinity, sulphate, and total nitrogen using standard analytical methods (Abiriga et al. 2021a).

A total of 300 ml of each of the samples for microbiology in sterile PETE bottles was filtered through a 0.2 μm polycarbonate membrane filter upon arrival at the laboratory. The filters were stored at −70 °C prior to DNA extraction. DNA was extracted from one half of each filter using the DNeasy PowerSoil Kit (Qiagen, Germany), following the manufacturer’s instructions. DNA quantity was measured using a Qubit Fluorometer 3.0 (Life Technologies, Malaysia), while the quality was assessed using a Nanodrop spectrometer (Thermo Scientific, China) and 2% agarose gel electrophoresis. The DNA samples were sent to the Norwegian Sequencing Centre (https://www.sequencing.uio.no), where PCR amplification, library preparation and sequencing were conducted. The V3-V4 hypervariable region of the 16S rRNA gene was amplified using the primer set 319F (5′-ACTCCTACGGGAGGCAGCAG-3′) (Lane 1991) and the modified 805R (5′-GGACTACNVGGGTWTCTAAT-3′) (Apprill et al. 2015). Library preparation was conducted following the protocol of Fadrosh et al. (2014), with the forward and reverse oligos consisting of an Illumina-specific adaptor sequence, a 12-nucleotide barcode sequence, a heterogeneity spacer, and the primer set. The 16S rRNA gene fragment library was sequenced using Illumina MiSeq, by applying the 300 bp paired-end protocol v3 (600-cycle kit) with 10% PhiX as the control library.

Sequence analysis

The DNA sequences were demultiplexed using a demultiplexer accessible at https://github.com/nsc-norway/triple_index-demultiplexing/tree/master/src. During this step, barcode sequences and the heterogeneity spacers were removed. The DNA sequences were quality-filtered (primer trimming, and removal of short sequences and chimeras), dereplicated, merged, and assigned to amplicon sequencing variants (ASVs) using the DADA2 (Callahan et al. 2016) plug-in for QIIME2 v.2019.1.0 (Bolyen et al. 2019). All the steps were run using the default parameters except the primer length (set to 20 bp) and minimum length of reads (set to 280 bp). The ASVs were subjected to taxonomic assignment using a Naïve Bayes classifier trained on data from SILVA v.138, conducted in QIIME2 v.2020.2.0 (Bolyen et al. 2019). The library statistics are provided in the supplementary information (Table S3).

Statistical data analysis

Statistical analyses were performed using R v.4.0.2 (R Core Team 2020). The microbial community dataset used in all the analyses was classified at the genus level of taxonomy. The alpha diversity (Shannon index) was calculated using package phyloseq v.1.38.0 (McMurdie and Holmes 2013). Differences in Shannon diversity index across the sampling wells were tested for significance using one-way ANOVA with a post hoc Tukey’s HSD test for pairwise comparisons. Differences in Shannon diversity index between 2018 and 2019 and between autumn and spring were tested for significance using Student’s t test. The multivariate analyses, namely nonmetric multidimensional scaling (NMDS), permutational multivariate analysis of variance (PERMANOVA) (Anderson 2001), and variation partitioning (Borcard et al. 1992), as well as rarefaction (Fig. S1), were performed using package vegan v.2.5.6 (Oksanen et al. 2019). As water chemistry datasets are dimensionally heterogeneous (measured in different units), the data were standardised prior to variation partitioning. The microbial community dataset was square-root transformed and Hellinger standardised (Legendre and Gallagher 2001) prior to multivariate analysis. NMDS was used to visualise beta diversity based on the Bray-Curtis dissimilarity measure. The sample clusters in the NMDS were tested for significant difference using PERMANOVA with 9999 permutations. Group homogeneity was assessed using the function ‘betadisper’ (Anderson 2006). Likewise, changes in beta diversity between the 2 years (2018 and 2019) and between seasons (spring and autumn) were tested for significance using PERMANOVA with 9999 permutations. The contribution of the measured variables to explaining the variation in the microbial community composition was analysed using variation partitioning (Borcard et al. 1992). Total nitrogen and ammonium were removed from the dataset during variation partitioning, due to missing observations.
The species community dataset was used without filtering (background, proximal, intermediate, distal wells; N = 48, 1979 taxa) as it is important to perform the above analysis on the full community dataset. Statistical tests were considered significant at P ≤ 0.05.
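
The diversity measures named above can be illustrated with a minimal Python sketch. It is not the R workflow used in the study (phyloseq and vegan); the function names and the toy count table are hypothetical, and the Hellinger step is assumed to be the usual square root of row-wise relative abundances.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def hellinger(table):
    """Hellinger standardisation: square root of each row's relative abundances.
    Assumes every sample (row) has a non-zero total."""
    return [[math.sqrt(c / sum(row)) for c in row] for row in table]

def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(a + b for a, b in zip(x, y))

# Hypothetical toy table: rows are samples, columns are genera.
toy_table = [
    [10, 10, 0, 0],
    [5, 5, 5, 5],
    [0, 0, 10, 10],
]

alpha = [shannon_index(row) for row in toy_table]          # one H' per sample
standardised = hellinger(toy_table)                        # input to ordination
dissimilarity = bray_curtis(toy_table[0], toy_table[2])    # samples share no taxa
```

In this toy table the first and third samples share no genera, so their Bray-Curtis dissimilarity takes its maximum value of 1, while the evenly distributed second sample has the highest Shannon index.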

Prior to the network analysis, the community data from each of the wells was filtered by selecting taxa present more than 5 times in at least 50% of the samples from each of the wells. Subsequently, the 25 most abundant taxa in the respective samples were chosen for further analysis. This reduced the number of taxa remarkably to include only the core members of the communities (from 616 to 82 taxa in R0; 1103 to 94 in R104; 1223 to 117 in R203; and 1186 to 81 in R402), thereby reducing the network complexity and eliminating taxa that were rare and/or showed multiple zero abundances, which should be avoided (Banerjee et al. 2018). From the quality-filtered data, taxa co-occurrence based on Spearman’s rank correlations was calculated separately for each of the wells. The co-occurrence network was generated using package igraph v.1.2.6 (Csardi and Nepusz 2006), using an R script from the literature (Ju et al. 2014) available on GitHub (https://github.com/RichieJu520/Co-occurrence_Network_Analysis). Only taxa having significant positive correlations (Benjamini-Hochberg corrected Spearman’s rank correlations, ρ > 0.6; P < 0.01) were displayed in the co-occurrence network. We focused exclusively on the positive associations because, in environmental systems such as the studied aquifer, which is influenced by the operation of the landfill, microbial communities may need to cooperate and/or may prefer common conditions, and positive associations may indicate such common preferences or cooperative associations (Fuhrman 2009). In our setting, cooperative interactions may reflect co-metabolism, a function that is central to biodegradation in contaminated systems such as the present study aquifer, hence the need to focus on positive associations. Network visualization was performed in Gephi v.0.9.2 (Bastian et al. 2009).
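
The edge-selection rule described above (Spearman’s ρ > 0.6 with a Benjamini-Hochberg corrected P < 0.01) can be sketched in plain Python as follows. This is an illustration only, not the R script of Ju et al. (2014) used in the study; the function names, the taxon labels, and the raw P-values in the example are hypothetical, and the P-values are assumed to have been computed beforehand.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values given the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks (inputs must not be constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_edges(pairs, rho_min=0.6, alpha=0.01):
    """Keep (taxon_a, taxon_b, rho) where rho > rho_min and adjusted P < alpha."""
    adjusted = benjamini_hochberg([p for _, _, _, p in pairs])
    return [(a, b, rho) for (a, b, rho, _), q in zip(pairs, adjusted)
            if rho > rho_min and q < alpha]

# Hypothetical correlations: (taxon_a, taxon_b, rho, raw P-value).
pairs = [
    ("TaxonA", "TaxonB", 0.85, 0.0004),
    ("TaxonA", "TaxonC", 0.70, 0.004),
    ("TaxonB", "TaxonC", -0.75, 0.001),  # strong but negative, so excluded
    ("TaxonC", "TaxonD", 0.30, 0.20),    # weak and non-significant
]
edges = select_edges(pairs)
```

With these hypothetical inputs only the two strong positive pairs survive, which mirrors the decision to display positive associations only.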
The network topologies of the final model were compared with those generated from a random network according to the literature (Erdős and Rényi 1960). The key network topological properties evaluated to identify community functions included betweenness centrality (the number of shortest paths going through a vertex (node)) used to delineate keystone taxa (Williams et al. 2014; Guo et al. 2022), node degree (the number of connections to other nodes) (Faust and Raes 2012; Guo et al. 2022), and closeness centrality. In addition, parameter estimations: within-module connectivity (Zi) and between-module connectivity (Pi) for identification of topological roles were calculated using package microeco (Liu et al. 2021). Prior to the network analysis, the taxa co-occurrence was evaluated for randomness by simulating a null community co-occurrence using the checkerboard-score (C-score) in package EcoSimR v.0.1.0 (Gotelli et al. 2015) for each of the wells, which were treated as independent communities. The null community co-occurrence (null model) assumes that co-occurrence patterns arise by chance (Gotelli et al. 2015).
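
For readers unfamiliar with the checkerboard score, a minimal Python sketch of the metric and of one simple null model is given below. The study used EcoSimR; the randomisation shown here (shuffling each taxon’s occurrences across samples, which preserves only the number of samples a taxon occupies) is an assumption chosen for brevity and is not claimed to be the algorithm applied in the study. The function names and the presence/absence matrix are hypothetical.

```python
import random
from itertools import combinations

def c_score(matrix):
    """Mean number of checkerboard units over all taxon pairs.
    Rows are taxa, columns are samples, entries are 0 (absent) or 1 (present)."""
    units = []
    for row_i, row_j in combinations(matrix, 2):
        shared = sum(a * b for a, b in zip(row_i, row_j))
        units.append((sum(row_i) - shared) * (sum(row_j) - shared))
    return sum(units) / len(units)

def simulate_null_scores(matrix, n_sim=1000, seed=1):
    """C-scores of randomised matrices (assumed null model: each row shuffled
    independently, so every taxon keeps its number of occupied samples)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_sim):
        shuffled = [rng.sample(row, len(row)) for row in matrix]
        scores.append(c_score(shuffled))
    return scores

# Hypothetical presence/absence matrix: 3 taxa x 4 samples.
presence = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 0],
]
observed = c_score(presence)
null_scores = simulate_null_scores(presence, n_sim=200)
```

An observed C-score well above the simulated distribution points to segregation between taxa, whereas a value well below it points to aggregation.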

Results

Community diversity metrics and variation partitioning

Alpha diversity was highest in the intermediate well and lowest in the distal well (Fig. 2a). Tukey’s honestly significant difference (HSD) test (Table 1) indicated that the Shannon diversity index differed significantly between most pairs of wells, except between the background and distal wells, and between the proximal and intermediate wells. Further, a t test performed on the Shannon diversity index indicated non-significant differences between 2018 and 2019, and between spring and autumn, for nearly all the wells, except a significant seasonal difference in the background well (P = 0.04) and a significant yearly difference in the distal well (P = 0.04) (Table S1).

Fig. 2 a Shannon diversity index across the sampling wells, calculated from data collected in the period 2018–2019. Each point represents a sample. The asterisk indicates an outlier. b Nonmetric multidimensional scaling (NMDS) plot of sites based on Bray-Curtis dissimilarity distance. R0: background well; R104: proximal well; R203: intermediate well; R402: distal well

Table 1 Tukey’s honestly significant difference (HSD) test for the pairwise comparisons of every combination of wells

Beta diversity analysis based on Bray-Curtis dissimilarity metric shows distinct microbial community composition across the wells, although a slight overlap between the proximal and intermediate wells exists (Fig. 2b). The first axis (NMDS1) separates the wells by aquifer type. The proximal, intermediate, and distal wells from the contaminated aquifer correlated positively with NMDS1, while the background well from the uncontaminated aquifer correlated negatively with NMDS1. The second axis (NMDS2) separates the wells by the degree of groundwater contamination. The uncontaminated groundwater from the background well and the less contaminated groundwater from the distal well both correlated negatively with NMDS2. On the other hand, the more contaminated groundwater from the proximal and intermediate wells correlated positively with NMDS2.

The groups in the NMDS were tested for significant difference using PERMANOVA. Both the global and pairwise analyses showed statistically significant differences across (F3.0 = 11.8, P = 0.001) and between the wells (Table 2). Similarly, differences in microbial community composition between spring and autumn and between 2018 and 2019 were tested. Results indicated non-significant differences in the microbial community composition between spring and autumn (F1.0 = 1.13, P = 0.273) and between 2018 and 2019 (F1.0 = 1.11, P = 0.288).

Table 2 PERMANOVA pairwise comparisons of microbial community composition between the wells

The variation in the microbial community composition (Fig. 3) was partitioned among the variables water chemistry (47.6%, F = 4.29, P = 0.001), well (44.8%, F = 13.7, P = 0.001), and both season and time (year) (0.4%, P > 0.05). Removing the effects of covariables resulted in explained variances of 7.5%, 4.5%, 1.1%, and 0.7% for water chemistry, well, season, and year, respectively. Of the total explained variance (55.3%), 42.5% was accounted for jointly by the groundwater chemistry and well, leaving only 12.8% of the variance attributable to the other terms in the model. The collective variance (that explained by all the variables together) was only 0.4%, and the unexplained variance was 44.7%. A summary of the groundwater geochemistry can be accessed from the supplementary information (Table S2).
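
The distinction between marginal, unique, and joint fractions can be made concrete with the bookkeeping for the simpler case of two explanatory sets. The sketch below is illustrative only: the study partitioned variation among four sets with vegan, and the function name and the adjusted R² values used here are hypothetical rather than taken from the results.

```python
def partition_two_sets(r2_x1, r2_x2, r2_both):
    """Split explained variation between two explanatory sets.

    r2_x1, r2_x2: adjusted R² of each set fitted alone (marginal effects).
    r2_both: adjusted R² of the model containing both sets.
    """
    return {
        "unique_x1": r2_both - r2_x2,        # x1 after removing the effect of x2
        "unique_x2": r2_both - r2_x1,        # x2 after removing the effect of x1
        "shared": r2_x1 + r2_x2 - r2_both,   # variation explained jointly
        "residual": 1.0 - r2_both,           # unexplained variation
    }

# Hypothetical values, not the study's: two sets that overlap strongly.
fractions = partition_two_sets(r2_x1=0.45, r2_x2=0.40, r2_both=0.55)
```

When two sets are strongly collinear, as water chemistry and well are here, most of the explained variation ends up in the shared fraction and the unique fractions are small.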

Fig. 3 Variation partitioning of proportions of variation in microbial community composition explained by water chemistry, year, season, and well. Values in parentheses indicate the variances explained by the respective variables but without removing the contribution from covariables

Co-occurrence network

Implementing the quality filtering and network selection criteria resulted in 33 nodes and 29 edges (background well), 70 nodes and 196 edges (proximal well), 58 nodes and 86 edges (intermediate well), and 8 nodes and 13 edges (distal well) (Figs. 4 and 5). The taxa with the highest number of connections were Nitrospira, Acidobacteriae, and Babeliales in the background well, each with 3 connections; Candidate Kaiserbacteria, Omnitrophales, and Chloroflexi in the proximal well, with 15, 13, and 12 connections, respectively; Patulibacter, Legionella, and Neisseriaceae in the intermediate well, with 9, 8, and 7 connections, respectively; and Chloroflexi and Nitrosomonadaceae in the distal well, each with 2 connections (Figs. 4 and 5).

Fig. 4 Co-occurrence network in the background (a) and proximal (b) wells. Each connection represents a strong positive and significant Spearman’s correlation (ρ > 0.6, P < 0.01) and the thickness of the connections is proportional to the correlation coefficient. The size of the nodes is proportional to the node degree (number of connections), and the node colours represent microbial phyla

Fig. 5 Co-occurrence network in the intermediate (a) and distal (b) wells. Each connection represents a strong positive and significant Spearman’s correlation (ρ > 0.6, P < 0.01) and the thickness of the connections is proportional to the correlation coefficient. The size of the nodes is proportional to the node degree (number of connections), and the node colours represent microbial phyla

Based on a combination of the network topological parameters (node degree, betweenness centrality, and closeness centrality), 19 taxa were designated as putative keystone taxa in the four communities (Tables S4–S7). When the Zi-Pi model was applied, however, neither network hubs nor module hubs were identified; only a node connector was found (Figs. S2–S5). The keystone taxa varied among the four communities (wells) in both number and composition. Those in the background well comprised three taxa: Nitrospira, Acidobacteriae, and Babeliales. The most diverse and numerous (eight) keystone taxa were from the proximal well and consisted of Vicinamibacterales, Chloroflexi, Candidate Kaiserbacteria, Parcubacteria, Gemmatimonadaceae, Candidate phylum MBNT15, Omnitrophales, and Elusimicrobiota. Six keystone taxa were identified in the intermediate well: Patulibacter, Legionella, Neisseriaceae, Nitrospira, Nitrosomonadaceae, and Steroidobacteraceae. The least diverse and smallest set of keystone taxa was recorded in the distal well, with Chloroflexi and Nitrosomonadaceae as the only taxa.
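
Two of the topological parameters used to rank candidate keystone taxa, node degree and betweenness centrality, can be illustrated with a short Python sketch. The study computed them with igraph; the version below is a plain-Python rendering of Brandes’ algorithm for an unweighted, undirected network, reports unnormalised values, and uses hypothetical function names and a hypothetical toy network.

```python
from collections import deque

def node_degree(adjacency):
    """Number of connections of each node."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}

def betweenness_centrality(adjacency):
    """Unnormalised betweenness for an unweighted, undirected graph (Brandes)."""
    centrality = {v: 0.0 for v in adjacency}
    for source in adjacency:
        visited_order = []
        predecessors = {v: [] for v in adjacency}
        n_paths = {v: 0 for v in adjacency}
        distance = {v: -1 for v in adjacency}
        n_paths[source], distance[source] = 1, 0
        queue = deque([source])
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            visited_order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        dependency = {v: 0.0 for v in adjacency}
        while visited_order:  # accumulate dependencies from the farthest node back
            w = visited_order.pop()
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each unordered pair of nodes was counted from both ends, so halve the totals.
    return {v: c / 2 for v, c in centrality.items()}

# Hypothetical toy network: a hub taxon linked to three others.
toy_network = {
    "Hub": ["TaxonA", "TaxonB", "TaxonC"],
    "TaxonA": ["Hub"],
    "TaxonB": ["Hub"],
    "TaxonC": ["Hub"],
}
degrees = node_degree(toy_network)
betweenness = betweenness_centrality(toy_network)
```

In the toy network every shortest path between two peripheral taxa passes through the hub, so the hub has both the highest degree and the highest betweenness, which is the pattern used to flag putative keystone taxa.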

A well-by-well simulation of null communities showed significant non-random taxa co-occurrence in the proximal, intermediate, and distal wells (Table 3). Among these wells, the standardised effect size (SES) was highest in the intermediate well, moderate in the distal well and lowest in the proximal well. The background well, by contrast, showed a marginally higher but non-significant observed C-score (2.9333) than expected under the random null model (2.9301), with 70 of the 1000 simulated C-scores exceeding the observed value.
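
The two summary quantities reported from the null models can be sketched as follows, assuming the conventional definition of the SES as the observed C-score minus the mean of the simulated C-scores, divided by their standard deviation. The function names and the simulated values are hypothetical; only the figure of 70 exceedances out of 1000 simulations is taken from the text.

```python
import statistics

def standardised_effect_size(observed, simulated):
    """(observed - mean of simulated) / sample standard deviation of simulated."""
    return (observed - statistics.mean(simulated)) / statistics.stdev(simulated)

def upper_tail_fraction(observed, simulated):
    """Fraction of simulated values that exceed the observed value."""
    return sum(s > observed for s in simulated) / len(simulated)

# Hypothetical simulated C-scores, built so that 70 of 1000 exceed the observed
# value, as reported for the background well.
observed_c = 2.9333
simulated_c = [2.95] * 70 + [2.92] * 930
tail = upper_tail_fraction(observed_c, simulated_c)
ses = standardised_effect_size(observed_c, simulated_c)
```

A tail fraction of 70/1000 corresponds to 0.07, which is above the 0.05 threshold used in the study and is why the background well is described as non-significant.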

Table 3 Results of null model simulations for the four different communities

Discussion

The network complexity varied notably among the four communities, with the proximal well showing the most complex taxa co-occurrence network and the distal well the least complex (Figs. 4 and 5). Similarly, the putative keystone taxa varied remarkably among the communities (Tables S4–S7). These results hint at inherent variations in community composition and interactions in situ. Taxa such as Gemmatimonadaceae, Nitrospira, and Nitrosomonadaceae were identified as putative keystone taxa (Tables S4–S7). Bioaccumulation of polyphosphate is a feature of two of the three species (as of writing this manuscript) of phylum Gemmatimonadetes (Pascual et al. 2018; Zhang et al. 2003). Making phosphorus bioavailable can be viewed as provision of ‘public goods’ to the microbial community that increase its stability and diversity (Konopka et al. 2015). Nitrospira and Nitrosomonadaceae may be involved in nitrogen cycling, as both nitrate and ammonium were present in the groundwater samples (Table S2). Moreover, the identification of Nitrosomonadaceae in the background, intermediate, and distal wells suggests the taxon is a potential cosmopolitan taxon in the area. Nitrospira, in turn, was designated as a node connector in the intermediate well (Fig. S4), implying that it plays an important role in inter-module communication within the community (Guo et al. 2022).

Taxon Parcubacteria (Candidate Jorgensenbacteria and Candidate Kaiserbacteria), belonging to phylum Patescibacteria, forms an important part of the network, particularly in the proximal well. Patescibacteria are episymbionts (Castelle et al. 2018) and the strong correlations with other taxa in the co-occurrence network may therefore suggest potential host-symbiont relationships. Network analyses provide a starting point for empirical observation and hypothesis testing, as well as for identifying ecological traits (Banerjee et al. 2018; Fuhrman 2009; Williams et al. 2014). Thus, the connection of Patescibacteria to many cultivable microbes suggests a way forward in the in vitro co-cultivation of Patescibacteria. Currently, no cultured representatives of the taxon exist (Brown et al. 2015; Kantor et al. 2013; Wrighton et al. 2012) and little is known about them (Castelle et al. 2018), yet they are abundant in groundwater (Herrmann et al. 2019).

Well-by-well null models showed non-random community co-occurrences in the contaminated aquifer. Non-random co-occurrence implies that deterministic factors operate to shape the microbial communities (Horner-Devine et al. 2007). In the present study, the main driving factor is the landfill leachate and, because this varies from well to well due to natural attenuation (Abiriga et al. 2020), a gradient exists and the communities from the wells showed idiosyncratic co-occurrence patterns. Thus, the proximal and distal wells represent opposing ends of a spectrum, with the proximal well being highly influenced and the distal well least impacted. From an ecological point of view, this presents differences in niche-based processes that may be characterised by successions analogous to those observed in perturbation experiments (Herzyk et al. 2017; Zhou et al. 2014) as conditions revert to normal. The low SES (Table 3) recorded in the proximal well indicates that the microbial taxa in the proximal well co-occur more than in the intermediate and distal wells. This supports our assertion that environmental filtering due to the leachate is strongest in the proximal well due to its proximity to the landfill (Abiriga et al. 2021a). The strong disturbance in the proximal well, like other perturbations, will increase cell mortality and niche selection, and decrease microbial diversity and ecological drift (Zhou et al. 2014), causing the taxa to coexist more than in the intermediate and distal wells.

In the intermediate well, disturbance is expected to be of an intermediate strength. The higher SES and alpha diversity in the intermediate well agree with the ‘intermediate disturbance’ hypothesis that the highest diversity occurs at an intermediate level of disturbance (Miller et al. 2011; Svensson et al. 2012). Probable mechanisms shaping the microbial community in the intermediate well are niche-based processes such as predation and symbiosis. Some of the endosymbionts showed higher abundance in the intermediate well (Fig. S6).

In the distal well, the influence of leachate is expected to be minimal due to leachate attenuation (Abiriga et al. 2020), which gives room for other ecological processes to drive the microbial community. These may include variable selection, competition, predation, and phylogenetic history. The same endosymbionts present in the intermediate well were also present here.

In contrast to the contaminated aquifer, the null model analysis of the background well showed only a marginally larger but non-significant observed C-score. This suggests that the microbial community in the background well exhibits some degree of aggregation. Evidence of putative aggregation is that 7% of the 1000 simulations yielded a C-score higher than the observed value. Possible explanations for species aggregation are mutualistic and syntrophic interactions (Horner-Devine et al. 2007).

Identifying the ecological processes shaping community compositions in any system involves identifying whether they are deterministic or stochastic. While our analysis does not identify the causal mechanistic processes, the non-random community assembly patterns do indicate the dominance of deterministic processes (Horner-Devine et al. 2007). Variation partitioning was employed to quantify the overall contribution of deterministic factors in shaping the microbial community compositions across the four wells sampled. The model explained 55.3% of the variation in the microbial community compositions, which is higher than that reported earlier (see below). Of the explained variance, the groundwater chemistry and well jointly accounted for most of the variance (42.5%), indicating that both the microbial community composition and the groundwater chemistry have similar spatial structuring (Borcard et al. 1992), which is attributed to the influence of the landfill leachate (Abiriga et al. 2021b). Given that not all environmental variables are measurable in any single study, the unexplained variance (44.7%) may represent both unmeasured deterministic factors and stochastic factors, and stochastic processes may indeed play a partial role in shaping microbial community compositions (Stegen et al. 2012; Williams et al. 2014). We posit that in systems such as the present aquifer, which is under press perturbation (Zhou et al. 2014), deterministic processes are more important than stochastic processes. At present, the nature of the sampling design does not allow a full account of any mechanistic processes (variable selection, homogeneous selection, dispersal limitation, etc.) to be drawn, as samples were taken from one level in each well. Yet, there is a significant vertical variation in the aquifer (see below).
To ensure a full account for the entire aquifer, an in-depth analysis taking into consideration both the depth-wise (small-scale) and longitudinal (larger-scale) processes will be conducted and communicated in a future manuscript.

Earlier, we reported a lower explained variance with variation partitioning, 33.2% (Abiriga et al. 2021b) versus 55.3% in the present study. The difference arises because the present study was restricted to a single level in the aquifer, while the previous study included all levels of the multilevel sampling systems in each well, which suggests that variation along the vertical axis is controlled by variables which we did not measure. This may be due to the inherent variation arising from the aquifer layering, which may respond differently to changes in hydrologic regimes (Smith et al. 2018), resulting in differences in the microbial community compositions across the depths in the aquifer (Abiriga et al. 2021b). Sampling groundwater from only one level avoided this bottleneck. The small variance attributable to season (0.4%, Fig. 3) further indicates that seasonal variability due to differential response of aquifer layers to hydrologic regimes, which causes shifts in microbial communities (Pilloni et al. 2019), was minimal. This finding has a serious implication for future studies on subsurface microbiology, where a great deal of attention needs to be given to designing sampling for heterogeneous systems.

Conclusion

Our study shows taxa co-occurrence in four communities. Both the structure and complexity of the networks varied remarkably among the communities, which highlights inherent variations in composition and taxa interactions in situ. Similarly, the putative keystone taxa varied among the communities in composition as well as in number. Putative biogeochemical cycling potentials of the keystone taxa include carbon cycling, nitrogen cycling, and phosphorus cycling, which may suggest taxa cooperation, although taxa with potential for symbiosis and parasitism were also present. The study identified deterministic processes as the driving force shaping the microbial community assembly in the landfill-leachate-impacted aquifer, a finding further substantiated by variation partitioning, which indicated that the measured environmental variables explained most of the variation in the microbial community composition. The novelty of this research is the application of a combination of network analysis, ecological null model analysis, and multivariate statistics to microbial data from an environment which has not been previously studied for the ecological processes. Findings from this study should therefore advance our understanding of microbial community assembly in ecosystems subjected to press perturbations from landfill operations.