Introduction

The loess hills area in northern Shaanxi, China, is an important oil and gas reserve and development area. Leaks from oil wells, crude oil pipelines, and storage tanks during the development of oil and gas fields release petroleum-based products onto the soil, seriously damaging the local ecological environment, disturbing the ecological balance, and leading to gradual environmental degradation (Zhen et al. 2015; Wang et al. 2017). Petroleum hydrocarbons are hard-to-degrade organic pollutants that can seriously affect soil productivity and cause different degrees of harm to microorganisms (Mambwe et al. 2021), plants, and animals in the ecosphere around the pollution source (Wang et al. 2016). The entry of petroleum-based substances into the soil causes a dramatic increase in soil organic carbon content, resulting in an imbalance between soil C and N contents (He et al. 2021; Zhang et al. 2018; Wen et al. 2017). This imbalance destroys the habitat of soil microorganisms and changes the number and diversity of soil microbial populations (Zheng et al. 2021; Dong 2020). Microorganisms play a very important role in maintaining soil ecosystem functions: they effectively decompose plant and animal residues; promote the cycling of soil C, N, S, P, and other nutrients (Pan et al. 2012; Li et al. 2017; Mu 1994); and regulate soil material and energy cycles (Iyobosa et al. 2021; Kong et al. 2021; Wang et al. 2021). Therefore, it is important to investigate the current status of petroleum-contaminated soil in northern Shaanxi and to analyze the effects of petroleum pollution on the diversity of soil microbial populations, in order to explore remediation methods suitable for petroleum-contaminated soil in the region (Ajona and Vasanthi 2021; Fan et al. 2015).

Current research on the microorganisms of petroleum-contaminated soil has mainly focused on the remediation effect of soil microorganisms (screening of efficient degrading bacteria, degradation characteristics, etc.) (Ueno et al. 2010), while the structural diversity of indigenous microbial communities in petroleum-contaminated soil of Loess hilly areas has been less studied (Margesin et al. 2007). In this study, we investigated soil microbial diversity under different levels of oil pollution and analyzed the response of soil microbial community structure to oil pollution through field sampling (Yang et al. 2013), indoor simulation tests, detection, and analysis. The study aims to provide a scientific basis for the bioremediation and treatment of oil-contaminated soil in the Loess hills of northern Shaanxi Province (Wang et al. 2020; Li et al. 2021).

Materials and methods

Sample collection

The test soil for this study was collected from a typical oil well area in Baota District, Yan’an City, Shaanxi Province, China. Soil samples were collected from the 0-30 cm layer in the vicinity of oil wells, at sites not contaminated with oil, using the S-shaped sampling method and following the principles of random, equal, and multi-point mixed sampling. The samples were then passed through a 2-mm sieve to remove impurities and preserved at low temperature for indoor simulation tests (Li et al. 2009).

Indoor simulation test

Crude oil was weighed to give oil contents of 15 g/kg (C15), 30 g/kg (C30), and 45 g/kg (C45); added to clean soil; and mixed thoroughly to prepare soils with a gradient of oil contamination (Li et al. 2011). After 1 week of static aging, indoor simulated incubation tests were carried out, i.e., incubation under constant conditions (ambient temperature controlled at 25 ± 2 °C and soil moisture content maintained at 15%) for 21 days, with three replicates for each treatment (Table 1). After incubation, soil was collected from four different locations in each container using the multi-point mixed sampling method and then mixed thoroughly to form a composite sample of about 20 g for determining the microbiological properties of the soil (Liu et al. 2012, 2020).
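The dosing arithmetic behind the contamination gradient can be sketched as follows. This is a minimal illustration only: the 2 kg container size is a hypothetical value not given in the text, and treating CK as 0 g/kg is an assumption based on its later use as the uncontaminated control.

```python
# Target oil contents (g crude oil per kg clean soil) taken from the text.
# CK = 0 g/kg is an assumption: CK is used later as the uncontaminated control.
TREATMENTS_G_PER_KG = {"CK": 0.0, "C15": 15.0, "C30": 30.0, "C45": 45.0}


def oil_mass_g(soil_mass_kg: float, oil_content_g_per_kg: float) -> float:
    """Return the mass of crude oil (g) to mix into a given mass of clean soil."""
    if soil_mass_kg < 0 or oil_content_g_per_kg < 0:
        raise ValueError("masses and oil contents must be non-negative")
    return soil_mass_kg * oil_content_g_per_kg


# Hypothetical container size; the source does not state the soil mass used.
SOIL_PER_CONTAINER_KG = 2.0

for name, content in TREATMENTS_G_PER_KG.items():
    grams = oil_mass_g(SOIL_PER_CONTAINER_KG, content)
    print(f"{name}: add {grams:.1f} g crude oil to {SOIL_PER_CONTAINER_KG} kg soil")
```

With three replicates per treatment, the per-container mass would simply be weighed out three times for each oil content.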

Table 1 Experimental treatments

Testing indicators and measurement methods

Soil microbial diversity was determined by DNA extraction and high-throughput sequencing. Genomic DNA was extracted by the CTAB or SDS method, and its purity and concentration were checked by agarose gel electrophoresis. PCR was performed using barcoded specific primers and Phusion High-Fidelity PCR Master Mix with GC Buffer (New England Biolabs), a high-efficiency, high-fidelity enzyme system, to ensure amplification efficiency and accuracy. Bacterial diversity was identified using primers targeting the 16S V4 region (515F and 806R) (Zhang and Xu 2022; Sun et al. 2018).

The PCR products were checked by electrophoresis on a 2% agarose gel. Products that passed this check were purified with magnetic beads, quantified by enzyme labeling, and pooled in equal amounts according to their concentrations. After thorough mixing, the pooled products were again run on a 2% agarose gel, and the target bands were recovered using a gel recovery kit provided by Qiagen (Fig. 1). The library was constructed using the TruSeq DNA PCR-Free Sample Preparation Kit, quantified by Qubit and Q-PCR, and, once qualified, sequenced on the NovaSeq6000 (Lusk et al. 2021).

Fig. 1 Amplification product result graph

Statistical analysis of data

Data for each sample were split from the raw sequencing data according to the Barcode sequences and PCR amplification primer sequences. After the Barcode and primer sequences were truncated, the reads of each sample were spliced using FLASH (V1.2.7) (Liu and Peng 2001); the spliced sequences were the original Tags data (Raw Tags). The Raw Tags were then strictly filtered to obtain high-quality Tags data (Clean Tags). Referring to the Tags quality control process of Qiime (V1.9.1), the following operations were performed: (a) Tags interception: Raw Tags were truncated at the first low-quality site where the number of consecutive low-quality bases (default quality threshold <=19) reached a set length (default length value is 3) (Ali et al. 2020); (b) Tags length filtering: the Tags obtained after interception were further filtered by length. Finally, the Tags obtained after the above processing were compared with the species annotation database to detect chimeric sequences, which were removed to obtain the final effective data (Effective Tags) (Wagh and Tiwari 2020).
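The Tags interception rule in step (a) can be illustrated with a short sketch. It is not the Qiime implementation; the function name `truncate_tag` and the toy read are hypothetical, and only the two defaults (quality threshold 19, run length 3) come from the text. The length filter in step (b) is not shown because its threshold is not stated here.

```python
from typing import List, Tuple


def truncate_tag(seq: str, quals: List[int],
                 q_threshold: int = 19, run_length: int = 3) -> Tuple[str, List[int]]:
    """Cut a tag at the start of the first run of low-quality bases.

    A base is low quality when its score is <= q_threshold. The tag is cut
    at the first base of the first run containing run_length consecutive
    low-quality bases; shorter runs are tolerated.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists must have equal length")
    run = 0
    for i, q in enumerate(quals):
        if q <= q_threshold:
            run += 1
            if run >= run_length:
                cut = i - run_length + 1  # index where the low-quality run began
                return seq[:cut], quals[:cut]
        else:
            run = 0  # a high-quality base interrupts the run
    return seq, quals  # no qualifying run: keep the whole tag


# Toy read: one isolated low-quality base (kept), then a run of three (cut point).
toy_seq = "ACGTACGTAC"
toy_quals = [30, 30, 30, 10, 30, 12, 15, 18, 30, 30]
kept_seq, kept_quals = truncate_tag(toy_seq, toy_quals)
print(kept_seq)  # prints ACGTA
```

The isolated low-quality base at position 4 does not trigger truncation, which is the point of requiring a run of consecutive low-quality bases rather than a single one.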

Results and discussion

Dilution curve

For the dilution (rarefaction) curve, a given number of sequences is randomly drawn from a sample, and the Alpha diversity index (here, the number of OTUs) corresponding to these sequences is counted. The horizontal axis represents the number of sequencing reads randomly drawn from the sample, and the vertical axis represents the number of operational taxonomic units (OTUs) that can be constructed from that number of reads; the adequacy of the current sequencing depth is judged by whether the curve reaches a plateau (Ma et al. 2020). According to Fig. 2, the number of microbial OTUs in petroleum-contaminated soils was significantly lower than in the uncontaminated control, i.e., the overall number of microbial species decreased. The number of soil microbial OTUs at the different oil contamination concentrations followed the order C15 > C45 > C30, with only a small difference between C30 and C45, indicating that the number of soil microbial species first decreased and then leveled off as the contamination level increased (Chen et al. 2020).
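The subsampling idea behind such a curve can be sketched in a few lines. This is an illustrative sketch, not the pipeline used in the study: the function name, the toy read list, and the sampling depths are all hypothetical, and each depth is drawn once without replacement.

```python
import random
from typing import List, Sequence, Tuple


def rarefaction_curve(read_otus: Sequence[str], depths: Sequence[int],
                      seed: int = 0) -> List[Tuple[int, int]]:
    """Return (depth, observed OTU count) pairs for random subsamples of reads.

    read_otus holds one OTU label per sequencing read. For each depth, that
    many reads are drawn without replacement and the distinct OTUs counted.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    curve = []
    for depth in depths:
        subsample = rng.sample(list(read_otus), depth)
        curve.append((depth, len(set(subsample))))
    return curve


# Toy sample of 100 reads: two common OTUs and two rarer ones (hypothetical data).
toy_reads = ["otu_a"] * 50 + ["otu_b"] * 30 + ["otu_c"] * 15 + ["otu_d"] * 5
for depth, n_otus in rarefaction_curve(toy_reads, [10, 25, 50, 100]):
    print(depth, n_otus)
```

When the OTU count stops rising as the depth grows, additional sequencing would recover few new OTUs, which is the plateau criterion described above. In practice the draw at each depth is usually repeated and averaged to smooth the curve.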

Fig. 2 Dilution curves of soil microorganisms under different oil pollution concentrations

Microbial α-diversity analysis for variability between groups

Alpha diversity (α-diversity) refers to the species diversity within a community or habitat. A common approach is to calculate the Chao, Shannon, Simpson, and coverage indices from the OTU results for biodiversity analysis (Liu et al. 2021). In this study, the Shannon and Simpson indices were selected to characterize the α-diversity of soils at different oil contamination levels, and an inter-group difference test of the indices was used to examine the differences between treatments. The results showed that the microbial α-diversity of petroleum-contaminated soils (C15, C30, C45) was significantly lower than that of the control group with no petroleum contamination, but the differences in soil microbial α-diversity between contamination concentrations did not reach a statistically significant level (Fig. 3).
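For readers unfamiliar with the two indices, a minimal sketch of how they are computed from OTU counts follows. The definitions used are assumptions, since the text does not state them: Shannon as -Σ p·log(p) with a selectable logarithm base (sequencing pipelines often use base 2) and Simpson in the 1 - Σ p² form. The count vectors are hypothetical.

```python
import math
from typing import Sequence


def shannon_index(counts: Sequence[int], base: float = math.e) -> float:
    """Shannon index H = -sum(p * log(p)) over OTUs with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p, base)
    return h


def simpson_index(counts: Sequence[int]) -> float:
    """Simpson index in the 1 - sum(p^2) form (higher means more diverse)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Hypothetical OTU count vectors: an even community and a dominated one.
even_counts = [25, 25, 25, 25]
dominated_counts = [85, 5, 5, 5]
print(shannon_index(even_counts), simpson_index(even_counts))
print(shannon_index(dominated_counts), simpson_index(dominated_counts))
```

Both indices fall when a few OTUs dominate the community, which is the direction of change reported for the contaminated soils.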

Fig. 3 Soil microbial α-diversity under different oil pollution concentrations

Microbial species composition analysis through Venn diagrams

Venn diagrams are obtained by comparing OTUs between samples or between groups and visualize the similarity and overlap of the OTU composition of environmental samples (Wang et al. 2019). The results showed that the four treatments together produced a total of 14,090 OTUs, of which 1791 were shared, accounting for 12.71% of the total OTUs (Fig. 4). The numbers of OTUs in the C15, C30, and C45 treatments were 3719, 3194, and 3345, respectively, all lower than that of CK (3832). The number of OTUs shared between contaminated and clean soils showed a decreasing trend as the oil contamination gradient increased. Among the contamination gradients, the highest overlap was found between the C15 and C45 treatments, indicating a high similarity of bacterial community structure. In addition, the different oil contamination treatments did not significantly affect the number of soil-specific OTUs.
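The counts in a Venn diagram reduce to set operations on the OTU identifiers of each treatment. The sketch below uses small hypothetical OTU sets to show the shared and treatment-specific counts, and then re-derives the reported 12.71% from the per-treatment totals given in the text; the function name and OTU labels are illustrative only.

```python
from typing import Dict, Set, Tuple


def shared_and_specific(otus: Dict[str, Set[str]]) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Return OTUs shared by all groups and OTUs found in only one group."""
    shared = set.intersection(*otus.values())
    specific = {}
    for name, members in otus.items():
        others = set().union(*(m for n, m in otus.items() if n != name))
        specific[name] = members - others
    return shared, specific


# Hypothetical OTU identifiers for the four treatments.
toy = {
    "CK": {"o1", "o2", "o3", "o4"},
    "C15": {"o1", "o2", "o5"},
    "C30": {"o1", "o3", "o6"},
    "C45": {"o1", "o2", "o7"},
}
shared, specific = shared_and_specific(toy)
print(sorted(shared), {k: sorted(v) for k, v in specific.items()})
print(len(toy["C15"] & toy["C45"]))  # pairwise overlap between two treatments

# Consistency check on the figures reported in the text.
per_treatment = {"CK": 3832, "C15": 3719, "C30": 3194, "C45": 3345}
total = sum(per_treatment.values())         # 14090
shared_pct = round(100 * 1791 / total, 2)   # 12.71
print(total, shared_pct)
```

The check shows that the stated total of 14,090 is the sum of the four per-treatment OTU counts, so OTUs present in several treatments are counted more than once in that total.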

Fig. 4 Venn diagram of soil microbial OTU numbers at different oil pollution concentrations

Microbial β-diversity analysis for variability between groups

To study the similarities and differences in community structure among samples, cluster analysis was performed on the sample community distance matrix to build a hierarchical clustering tree of the samples, and principal component analysis (PCA) was applied. PCA effectively identifies the “major” elements and structures in the data, removes noise and redundancy, reduces the dimensionality of the original complex data, and reveals the simple structure hidden behind the complex data (Zhang et al. 2017).

In Fig. 5, the different symbol shapes represent the control and the three soil samples with different concentrations of oil contaminants. Principal component analysis revealed differences in the microbial community composition of CK, C15, C30, and C45. The community composition differed among the contamination treatments, although not significantly, and differed significantly from that of CK. The PC1 and PC2 axes explained 20.03% and 9.23% of the variation, respectively.
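The percentage attached to each axis is the share of total variance captured by that principal component. The following sketch shows where such percentages come from, using only the standard library: it builds the covariance matrix of a samples-by-taxa table and extracts the leading eigenvalues by power iteration with deflation. This is a teaching sketch under stated assumptions, not the software used in the study; the toy abundance table is hypothetical, and production analyses would normally rely on a numerical library.

```python
import math
from typing import List, Sequence, Tuple


def covariance_matrix(table: Sequence[Sequence[float]]) -> List[List[float]]:
    """Sample covariance between the columns (taxa) of a samples-by-taxa table."""
    n = len(table)
    columns = list(zip(*table))
    means = [sum(col) / n for col in columns]
    centered = [[x - m for x in col] for col, m in zip(columns, means)]
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
            for ci in centered]


def top_eigenpair(matrix: List[List[float]], iterations: int = 1000) -> Tuple[float, List[float]]:
    """Largest eigenvalue and eigenvector of a symmetric matrix by power iteration."""
    size = len(matrix)
    vec = [1.0 + 0.1 * i for i in range(size)]  # fixed, deterministic start vector
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(size)) for i in range(size)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-12:  # matrix is numerically zero: no variance left
            return 0.0, vec
        vec = [x / norm for x in nxt]
    value = sum(vec[i] * matrix[i][j] * vec[j] for i in range(size) for j in range(size))
    return value, vec


def explained_variance_ratios(table: Sequence[Sequence[float]], n_components: int = 2) -> List[float]:
    """Fraction of total variance explained by the first n_components axes."""
    cov = covariance_matrix(table)
    size = len(cov)
    total = sum(cov[i][i] for i in range(size))  # trace equals total variance
    ratios = []
    for _ in range(n_components):
        value, vec = top_eigenpair(cov)
        ratios.append(value / total)
        # Deflation: remove the component just found before searching for the next.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(size)]
               for i in range(size)]
    return ratios


# Hypothetical abundance table: 4 samples (rows) by 3 genera (columns).
toy_table = [[120, 30, 5], [80, 45, 10], [60, 50, 20], [65, 48, 22]]
pc1, pc2 = explained_variance_ratios(toy_table)
print(f"PC1 {100 * pc1:.2f}%  PC2 {100 * pc2:.2f}%")
```

Low percentages on the first two axes, as reported here, mean that much of the variation among samples lies along further components not shown in the two-dimensional plot.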

Fig. 5 Principal component analysis of soil microorganisms under different oil pollution concentrations

Differential analysis of microbial community composition

The histogram presents the community composition and species abundance at different taxonomic levels (Zhu et al. 2020). In this study, community composition and species abundance analyses were conducted at the genus level. In the absence of oil contamination (CK), the dominant genera included Pantoea, Streptomyces, Alkanindiges, and Massilia. After oil contamination, the dominant genera of the microbial community changed significantly, with Pantoea being dominant. After D2 treatment, the abundance of Streptomyces increased, and the difference between C30 and C45 decreased (Fig. 6). In addition, the abundance of Nocardioides and Acinetobacter increased with increasing oil pollution concentration, while that of Thiothrix, Sphingomonas, and Gemmatimonas decreased significantly.

Fig. 6 Community composition characteristics of soil microorganisms at the genus level

Conclusions

In this study, we used high-throughput sequencing technology to analyze soil microbial diversity under different gradients of petroleum pollution concentration by sequencing soil 16S amplicons on an Illumina platform. The results showed that microbial species in contaminated soil were less abundant than in clean soil, the main dominant bacterial groups changed, and bacteria capable of degrading petroleum pollutants increased significantly. Soil microbial diversity and community structure did not differ significantly between contamination levels but differed significantly from those of clean soil, consistent with the results of previous studies (Wu et al. 2020; Xu et al. 2021; Sun et al. 2021; Zhao et al. 2020).

At the genus level, Pantoea, Sphingomonas, Thiothrix, and Nocardioides were dominant, with high abundance, in uncontaminated soil. The input of petroleum hydrocarbons resulted in a decrease in the abundance of Thiothrix, Sphingomonas, and Gemmatimonas, whereas the abundance of Pseudomonas, Acinetobacter, Pedobacter, and Massilia increased significantly. This may be because petroleum contaminants caused changes in water content, total organic carbon, total petroleum hydrocarbons, total carbon, total sulfur, and pH, which are the limiting factors for the growth of the dominant genera (Liang et al. 2016), thus changing the relative abundance of Sphingomonas, Thiothrix, and Pseudomonas. The oleophilic bacterial genera in the contaminated soil of the typical oil extraction area in the Loess hilly area mainly include Pseudomonas, Acinetobacter, Pedobacter, Nocardioides, and Alkanindiges, among others (Xu et al. 2021; Zhen et al. 2015).

These data help us to better understand the evolutionary characteristics of indigenous microorganisms in oil-contaminated soils in the region and provide a theoretical reference and data support for remediation and treatment technologies for these soils. In the future, several kinds of oleophilic indigenous bacteria need to be screened and acclimated from the contaminated soil in the study area to prepare bacterial agents, and indoor simulation experiments need to be conducted to explore the applicability of indigenous oleophilic microorganisms in the bioremediation of petroleum-contaminated soil.