Geochemical characteristics and genetic families of crude oil in DWQ oilfield, Kuqa Depression, NW China

As one of the most petroliferous oil-producing areas in the Kuqa depression, the Dawanqi (DWQ) oilfield has attracted great attention. The geochemical characteristics and oil families of the DWQ field were investigated by molecular analysis using GC and GC-MS techniques. The oils from the DWQ oilfield display complicated molecular compositions, including relatively high Pr/Ph ratios (1.4 ~ 4.26, with an average of 2.4), high concentrations of light hydrocarbons, and a certain abundance of pentacyclic triterpanes and steranes. The C7 light hydrocarbon and isoprenoid ratios indicate that the oils were derived from terrestrial higher plant input deposited in a weakly oxidizing to weakly reducing environment. Most of the oils in the study area are mature, except for a few samples identified as slightly biodegraded on the basis of C7 hydrocarbons. Three oil families are identified in the DWQ oilfield from biomarker analysis and geochemical parameters. Family A is characterized by higher amounts of tricyclic terpanes, such as the C19 and C20 tricyclic terpanes, higher C24 tetracyclic terpane, a lower concentration of gammacerane (< 0.6) and poor diasteranes. Family C is characterized by a lower content of C19 tricyclic terpane than C20 tricyclic terpane, lower C24 tetracyclic terpane than C23 tricyclic terpane, a relatively high concentration of gammacerane (> 0.6) and poor diasteranes. The oils of family B show mixed features of families A and C. The results can shed light on petroleum exploration in the study area.


Introduction
The Kuqa depression is a petroliferous district of the Tarim basin, northwest China (He et al. 2009; Liu et al. 2008; Niu et al. 2020) (Fig. 1a, b). Both crude oil and gas are abundant in this area (Zeng et al. 2020; Zhu et al. 2015), and previous studies have paid attention to its petroleum resources (Pan et al. 2013; Zhu et al. 2015; Liang et al. 2003). However, the limited number of geochemical studies conducted in the Kuqa depression have focused only on natural gas resources or on other central and southern parts of the depression, such as the southern frontal uplift and the northwest areas (Shen et al. 2017; Ju et al. 2018; Li et al. 2019). These studies showed that the hydrocarbon sources in the Kuqa depression are diverse and multi-derived (Qin et al. 2007; Tang et al. 2014; Huang et al. 2019). Located in the northwest of the Kuqa depression, the Dawanqi (DWQ) oilfield is rich in crude oil, especially light oil and condensate (Fig. 1c, d). As one of the typical and indispensable crude oil production areas, the DWQ oilfield deserves more attention. The purpose of this study is to characterize the geochemistry of the crude oil samples through a series of molecular compositions, including light hydrocarbons and biomarker indices of terpenoids and steroids. The analysis will provide a better basis for petroleum exploration in the DWQ oilfield and shed light on the hydrocarbon potential of this area.

Geologic setting
Situated in the northern Tarim basin, the Kuqa depression is a superimposed foreland basin formed by the superposition of Mesozoic and Cenozoic foreland basins. It extends in a nearly NE-SW direction and covers an area of 20,000 km² (Fig. 1a, b, c; Zhang et al. 2014). The Dawanqi (DWQ) oilfield is situated in the Baicheng sag, in the northwest of the Kuqa depression (Fig. 1c, d). It is a tectonic uplift formed during the late Himalayan orogeny and pitching toward the Baicheng sag. The northern portion of the uplift is close to the Tuzimazha anticline, and its southern portion is close to the Yakeleagh belt of the Qiulitagh structure. Secondary faults are well developed in the DWQ oilfield and separate it into several blocks (Fig. 1d).
The stratigraphy of the Kuqa depression is well documented in previous studies (Fig. 2). Economic oil accumulations in the DWQ oilfield occur in two sets of strata, the Quaternary and the Neogene Kangcun Group. Both are dominated by sandstones and fine clastic rocks (Sun et al. 2017; Zhang et al. 2016a, b). The Kuqa depression developed a thick Mesozoic succession: it subsided deeply in the Early and Middle Triassic, was filled in the Late Triassic, and became shallower and wider in the Late Jurassic and Cretaceous. Correspondingly, the paleoclimate went through arid, humid and arid stages. The arid conditions continued into the Neogene, when the Kuqa Group of the DWQ oilfield developed lacustrine, fluvial and alluvial facies with terrestrial clastic sediments (Yu et al. 2017; Sun et al. 2017; Wu et al. 2019) (Fig. 2).

Samples and methods
Twenty-nine crude oil samples from the DWQ oilfield were analyzed geochemically in this study. Most of the samples are light oils with saturated hydrocarbons as the predominant component. Gas chromatography (GC) and gas chromatography-mass spectrometry (GC-MS) were used, and both analyses were conducted at Yangtze University.
The samples were deasphalted with n-hexane and then fractionated by column chromatography into saturate, aromatic and NSO fractions by sequential elution with n-hexane, toluene and chloroform. Gas chromatography (GC) of the hydrocarbons was performed on an HP6890 gas chromatograph equipped with a fused silica column (HP-PONA, 50 m × 0.32 mm × 0.25 μm). The oven temperature was programmed from 35 °C (held 5 min) to 300 °C (held 20 min) at 4 °C/min. Helium was used as the carrier gas. The gas chromatography-mass spectrometry (GC-MS) analysis of the saturated fraction was conducted on an Agilent 6890 N-59751IMSD fitted with an HP-5MS capillary column. The GC oven temperature for saturated hydrocarbons was programmed from 50 °C (1 min) to 100 °C at 20 °C/min and then to 315 °C (18 min) at 3 °C/min. The mass ranges for the saturated and aromatic hydrocarbons were 50 ~ 550 amu and 50 ~ 450 amu, respectively.
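As a quick check on the two oven programs described above, the total run time per injection can be worked out from the start hold, the ramps and the final holds. The sketch below is illustrative only: the function name `program_minutes` is hypothetical, and it assumes that "(18 min)" denotes a hold at 315 °C and that there is no hold at 100 °C, neither of which the text states explicitly.

```python
def program_minutes(start_c, start_hold_min, ramps):
    """Total run time (min) of a GC oven temperature program.

    ramps: list of (rate_c_per_min, target_c, hold_min) steps applied in order.
    """
    total = start_hold_min
    temp = start_c
    for rate, target, hold in ramps:
        if rate <= 0:
            raise ValueError("ramp rate must be positive")
        # Time spent ramping to the target plus the hold at that target.
        total += abs(target - temp) / rate + hold
        temp = target
    return total


# GC program: 35 °C (5 min) -> 300 °C at 4 °C/min, held 20 min.
gc_minutes = program_minutes(35, 5, [(4, 300, 20)])

# GC-MS program: 50 °C (1 min) -> 100 °C at 20 °C/min (assumed no hold),
# then -> 315 °C at 3 °C/min, assumed held 18 min.
gcms_minutes = program_minutes(50, 1, [(20, 100, 0), (3, 315, 18)])
```

Under these assumptions the GC run takes 91.25 min and the GC-MS run about 93.2 min.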

Total hydrocarbon distribution
The vast majority of the crude oils are of the typical pre-peak type. The main peak carbon numbers of some oils are C10-C14 (Fig. 3a), and those of others are C17-C18 (Fig. 3b, c). Generally, the content of normal alkanes before nC15 is higher than that after nC16. Some crude oil samples show the opposite distribution, which may be related to microbial degradation. The CPI values of the crude oil samples are between 1.0 and 1.2, indicating a high degree of maturity (Marzi et al. 1993). The total hydrocarbon distribution observed by GC-MS is quite complicated in the DWQ oilfield, and this complexity reflects changes in crude oil composition. Three types of total hydrocarbon distribution were observed (Fig. 3). Type A oil is characterized by abundant low-carbon-number compounds and an intact distribution of medium-molecular-weight compounds (Fig. 3a), but only a few samples show this chromatographic appearance. Type B is the most common in the DWQ crude oils, with more than half of the samples showing this appearance. The distribution of n-alkanes in these oils is complete (Fig. 3b). Among the low-carbon-number compounds, light hydrocarbons are abundant, and the benzene series and methylcyclohexane are also abundant, at comparable levels (Fig. 3b). Type C crude oil is characterized by a heavy loss of low-carbon-number n-alkanes, whereas the high-carbon-number n-alkanes are relatively complete and the benzene series and methylcyclohexane among the light hydrocarbons show significantly high abundance (Fig. 3c). Since the high-carbon-number n-alkanes of this type are preserved intact, the loss of low-carbon-number n-alkanes is probably related to slight biodegradation. The light hydrocarbon composition of these crude oils provides further evidence for this, as discussed in a later section.

Light hydrocarbon distribution
Light oils and condensate make up the predominant part of the oils. Most of the gas chromatograms show a broad distribution of n-alkanes and a high abundance of benzene compounds and methylcyclohexane (Fig. 4).
The crude oils are rich in light hydrocarbons. Among the light hydrocarbons, the content of paraffins is much higher, the content of branched alkanes is relatively low, and the content of cycloalkanes varies over a broad range, with significantly higher contents of benzene and toluene compounds (Fig. 4a, b). The abnormally high amounts of benzene and toluene compounds in the crude oils indicate a typical terrestrial origin (Wang et al. 2008; Hu et al. 1990, 2014).

Saturated hydrocarbon distribution
Compared with the light hydrocarbons, the abundance of steranes and terpanes in the crude oils is generally lower. Since tricyclic and tetracyclic terpanes can be systematically detected on the m/z 191 mass chromatogram, an effective comparative analysis can be carried out. Abundant tricyclic and tetracyclic terpanes were detected in the DWQ crude oils, whereas the content of steranes was relatively low. The saturated hydrocarbons of the DWQ crude oils have the following distribution features.
The tricyclic and tetracyclic terpanes show two distinctive distribution patterns. One is shown in Fig. 5a, b and the other in Fig. 5c. In Fig. 5a, b, the relative abundance of the tricyclic terpanes follows the pattern C19 > C20 > C21 (Fig. 5a, b left). The content of C24 tetracyclic terpane (C24Te) is much higher, and the oil is rich in Ts (18α(H)-trisnorneohopane) and Tm (17α(H)-trisnorhopane), with relatively high C30 diahopane (Dia C30-H) and C30 hopane (C30-H) (Fig. 5a, b left). Among them, type B has significantly high C24 tetracyclic terpane (C24Te), C30 diahopane and rearranged hopanes. Gammacerane (G) is relatively low, and the C34 and C35 hopanes are not developed. The G/C31H ratio is between 0.33 and 1.09 (Table 1, Fig. 5b left). The steranes are characterized by relatively abundant diasteranes. Among the regular steranes, the overall abundance of C29 is relatively high, and the abundances of C27 and C28 are relatively low (Fig. 5a, b right).
The other distribution pattern is shown in Fig. 5c. The abundance of C19 tricyclic terpane is generally lower than that of C20 tricyclic terpane, and the C28 and C30 tricyclic terpanes are present in a certain abundance. The abundance of C24 tetracyclic terpane is lower than that of C23 tricyclic terpane. The abundance of gammacerane is clearly high, and the gammacerane/C31 hopane (22R) ratio (G/C31-H) is above 0.60 (Table 1, Fig. 5c left). The C30 hopane content is very high, whereas C30 rearranged hopane, C29 rearranged hopanes and C29Ts are relatively low. The C31-35 homohopanes are well developed, with relatively high abundance (Fig. 5c left). The steranes are characterized by fully developed diasteranes and regular steranes. Among the steranes, the overall abundance of C29 regular sterane is relatively high, and the abundances of C27 and C28 steranes are relatively low (Fig. 5c right).
The distribution characteristics of the terpanes and steranes indicate some differences in their genesis (Peters et al. 2004). The redox conditions of the depositional environment can affect the formation of diasteranes, and a strongly reducing environment can inhibit the rearrangement of steranes (Hu 1991; Jiang et al. 2018). The abundance of diasteranes in the DWQ crude oils is low, reflecting a weakly oxidizing to weakly reducing environment.

Origins and depositional environment
As the principal components of the crude oils, light hydrocarbons can provide significant geochemical information on origin and depositional environment. Different types of C7 compounds in the light hydrocarbon fraction often have different parent material sources (Wever 2000; Wang et al. 2008). It has been reported that n-heptane (nC7) in the C7 light hydrocarbons is mainly derived from algae and bacterial lipids, but it is very sensitive to maturation. Methylcyclohexane (MCyC6) is mainly derived from higher plant lignin, cellulose, sugars, etc., and its thermodynamic properties are relatively stable (Zhang 2016b), so it is a good parameter for terrigenous parent material. Abundant methylcyclohexane in the light hydrocarbons points to a coal-derived origin. The various dimethylcyclopentanes (ΣDMCC5) are mainly derived from the lipid compounds of aquatic organisms, and their occurrence in the light hydrocarbons points to a sapropelic origin. Therefore, the C7 light hydrocarbon ternary diagram of nC7, MCyC6 and ΣDMCC5 is a good way to distinguish crude oils of different parent material types (Zhang 2016b; Wang et al. 2008). According to the IMMC6 distribution and the ternary diagram (Fig. 6), the crude oils of the DWQ oilfield are related to mixed sources and to coal-derived oils generated from higher land plant organic matter.
The IMMC6 index proposed by Hu et al. (1990) is another good parameter for identifying organic matter type and depositional environment. According to this standard, an index value over 50% indicates that the oil was derived from humic organic matter. The oils are plotted on the nC7-MCyC6-ΣDMCC5 diagram in Fig. 6. MCyC6 ranges from 46.06 to 75.34%, which is abnormally high, and the relative abundance of nC7 ranges from 0.51 to 37.99%. The significantly high MCyC6 content and IMMC6 index show that the oils are lacustrine oils with terrestrial higher plant input (Table 1, Fig. 6).
Abbreviations: TT is for tricyclic terpanes; Te is for tetracyclic terpanes; C19TT stands for C19 tricyclic terpane, C20TT for C20 tricyclic terpane and so on. Ts is for 18α-trisnorhopane; Tm is for 17α-trisnorhopane; H is for 17α(H)-hopane; C29Ts is for C29 18α(H)-30-norneohopane; C31H is for C31 22S/(22S + 22R) homohopane, and the same for C32H, C33H, C34H and C35H; C21 and C22 are for C21 and C22 pregnane; C27, C28 and C29 are for C27 sterane, C28 sterane and C29 sterane, respectively.
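The ternary normalization behind the C7 diagram can be sketched as follows. This is a minimal illustration, not the authors' workflow: the function names and the peak areas are hypothetical, and the code assumes that the index compared against the 50% threshold is the MCyC6 share of the three normalized components.

```python
def c7_ternary(n_c7, mcyc6, dmcp_sum):
    """Normalize nC7, MCyC6 and summed dimethylcyclopentane areas to 100%."""
    total = n_c7 + mcyc6 + dmcp_sum
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return tuple(100.0 * x / total for x in (n_c7, mcyc6, dmcp_sum))


def humic_input(mcyc6_pct, threshold=50.0):
    """True when the MCyC6 share exceeds the 50% cut-off quoted in the text."""
    return mcyc6_pct > threshold


# Hypothetical peak areas, for illustration only.
n_c7_pct, mcyc6_pct, dmcp_pct = c7_ternary(10.0, 30.0, 10.0)
is_humic = humic_input(mcyc6_pct)
```

With these made-up areas the MCyC6 share is 60%, so the sample would plot toward the humic (higher plant) corner of the diagram.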
The isoprenoid hydrocarbons pristane and phytane are among the most widely found biomarkers in the geosphere (Brassell et al. 1981). Based on the assumption that pristane is formed from the phytyl side chain of chlorophyll by an oxidative pathway, whereas phytane is generated through various reductive pathways, the pristane/phytane ratio (Pr/Ph) has been proposed as an indicator of the oxicity of the depositional environment (Didyk et al. 1978). Generally, Pr/Ph < 1 indicates a reducing lacustrine or marine environment, and a very low Pr/Ph ratio (< 0.8) indicates a highly saline, reducing environment (Peters et al. 2004). The Pr/Ph values of the DWQ oils range from 1.28 to 4.26, with an average of 2.6 (Table 1), indicating weakly oxidizing to weakly reducing conditions during source rock deposition. The crossplot of Pr/nC17 versus Ph/nC18 (Fig. 7) is often used to define the maturity and organic matter type of oils. The Pr/nC17 values of the DWQ samples range from 0.11 to 0.25, and the Ph/nC18 values range from 0.05 to 0.15. In this crossplot, the majority of the samples fall in the "mixed and terrestrial source organic matter" field, with type II and mixed type (II/III) organic matter. The results also show that the maturity of the analyzed oils is quite high.
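The Pr/Ph thresholds quoted above can be written as a small screening function. This is only a sketch: the function name and labels are hypothetical, and the text gives no upper threshold, so values of 1 or more are simply reported as showing no reducing signal.

```python
def pr_ph_setting(pristane, phytane):
    """Return (Pr/Ph, label) using the thresholds quoted in the text."""
    if phytane <= 0:
        raise ValueError("phytane abundance must be positive")
    ratio = pristane / phytane
    if ratio < 0.8:
        label = "highly saline, reducing"
    elif ratio < 1.0:
        label = "reducing"
    else:
        # No upper cut-off is given in the text; the paper reads its own
        # range (1.28 to 4.26) as weakly oxidizing to weakly reducing.
        label = "no reducing signal at these thresholds"
    return ratio, label


# Hypothetical abundances chosen to give the reported average Pr/Ph of 2.6.
ratio, label = pr_ph_setting(2.6, 1.0)
```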

Maturity
The sterane isomerization ratios C29 ααα 20S/(20S + 20R) and C29 αββ/(ααα + αββ) are very effective parameters for maturity evaluation (Moldowan et al. 1986; Peters et al. 2004). The C29 ααα 20S/(20S + 20R) ratio increases with maturity and attains equilibrium at 0.52 ~ 0.55. The C29 αββ/(ααα + αββ) ratio increases from nonzero values toward 0.7 through isomerization, with the equilibrium state given as 0.57 ~ 0.62. The oils were classified into two categories, "immature" and "normal maturity" oils, using C29 αββ/(ααα + αββ) and C29 ααα 20S/(20S + 20R) cut-off values of 0.25 and 0.2, respectively. On this basis, most samples in this study fall into the "normal maturity" category, with averages of the two sterane ratios of 0.49 and 0.54 (Table 1), showing that the oils are mature (Fig. 8). Thompson (1983) proposed that the heptane and isoheptane values are important indicators of the degree of thermal evolution of oils: as maturity increases, both values continue to increase (Thompson 1983). Therefore, the correlation between the heptane value and the isoheptane value can be used to evaluate the maturity of crude oil. However, it was found that biodegradation can also reduce these two values. When Ro < 1.18%, the heptane and isoheptane values increase with maturity, in line with Thompson's theory. When Ro > 1.18%, however, the two values invert and become smaller as the degree of evolution increases (Akinlua et al. 2006; Mark et al. 2002; Wang et al. 2008).
Fig. 6 The ternary plot of nC7 in light hydrocarbons
Fig. 7 The crossplot of Pr/nC17 and Ph/nC18. Pr = pristane; Ph = phytane
The heptane and isoheptane values of the crude oils in the study area range from 0.51 to 37.99% and from 0.4 to 3.05, respectively (Table 1). Most of the oils are mature, which is consistent with the sterane maturity parameters (Fig. 9). However, some samples have abnormally low values, which are caused by biodegradation. These results are consistent with the light hydrocarbon chromatograms (Fig. 9).
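The sterane cut-offs used in this section can be expressed as a simple two-way screen. The sketch below is illustrative: the function name is hypothetical, and it assumes that a sample must exceed both cut-offs to count as "normal maturity", which the text does not spell out.

```python
def sterane_maturity(s20_ratio, abb_ratio, s20_cutoff=0.2, abb_cutoff=0.25):
    """Classify an oil from its C29 sterane isomerization ratios.

    s20_ratio: C29 ααα 20S/(20S + 20R)
    abb_ratio: C29 αββ/(ααα + αββ)
    """
    for value in (s20_ratio, abb_ratio):
        if not 0.0 <= value <= 1.0:
            raise ValueError("sterane ratios must lie between 0 and 1")
    # Assumption: failing either cut-off marks the oil as immature.
    if s20_ratio < s20_cutoff or abb_ratio < abb_cutoff:
        return "immature"
    return "normal maturity"


# The two averages reported in the text; either assignment of the values
# to the two ratios exceeds both cut-offs.
average_class = sterane_maturity(0.49, 0.54)
```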

Biodegradation
The total hydrocarbon distribution in Fig. 3 and the correlation of the heptane and isoheptane values in Fig. 9 show that several oils have undergone a certain degree of biodegradation. The degradation of the oils in the DWQ oilfield is quite mild compared with the clearly biodegraded oils reported elsewhere. Many classic biodegradation indicators, such as those based on hopanes (Peters et al. 2004), are therefore not suitable here. Because the degradation is so minor, the light hydrocarbons become an excellent means of evaluating secondary alteration. The C7 light hydrocarbon parameters proposed by Halpern (1995) are used in this paper as follows.
Halpern proposed that 1,1-dimethylcyclopentane (1,1-DMCP, denoted X) is the C7 hydrocarbon most resistant to biodegradation. The toluene/1,1-DMCP ratio (Tr1) reveals water washing during or before biodegradation. The ratios of n-heptane, the methylhexanes and the dimethylcyclopentanes to 1,1-DMCP (Tr2-Tr7) reflect different degrees of biodegradation, and the Tr8 ratio (methylhexane/dimethylpentane) is not affected by microbial activity (Halpern 1995). Plotting these light hydrocarbon ratios on a star diagram shows that the alteration of the DWQ crude oils is mainly expressed in the Tr1 and Tr2 parameters (Fig. 10a, b). Tr1 ranges from 0.09 to 48.63, with an average of 27.214, and Tr2 ranges from 1.24 to 21.69, with an average of 14.64. Tr3 to Tr5 are less than 10, and Tr6 and Tr7 are less than 2 (Table 2; Fig. 10a, b). These results indicate that some of the crude oils in the study area have undergone water washing and varying degrees of biodegradation, but that the degree of degradation is relatively low.
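The Tr1 and Tr2 ratios described above share the same denominator, so they can be computed in one pass from a table of peak areas. The sketch below is a hedged illustration: the function name, the dictionary keys and the peak areas are hypothetical, and only the two ratios the text defines explicitly are named.

```python
def ratios_to_reference(areas, reference="1,1-DMCP"):
    """Divide every peak area by the area of the reference compound."""
    ref = areas[reference]
    if ref <= 0:
        raise ValueError("reference peak area must be positive")
    return {name: area / ref for name, area in areas.items() if name != reference}


# Hypothetical peak areas (arbitrary units), for illustration only.
peak_areas = {"toluene": 27.0, "n-heptane": 14.0, "1,1-DMCP": 1.0}
ratios = ratios_to_reference(peak_areas)

tr1 = ratios["toluene"]    # Tr1: toluene / 1,1-DMCP (water washing)
tr2 = ratios["n-heptane"]  # Tr2: n-heptane / 1,1-DMCP (biodegradation)
```

The same helper would extend to Tr3-Tr7 once the individual methylhexane and dimethylcyclopentane peaks are added to the dictionary.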

Oil family classification
Based on the differences in sterane and terpane biomarker compositions, four sets of parameters were selected as criteria for oil family classification. These parameters reflect the origins and genesis of the oils in the DWQ oilfield (Fig. 11).
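The family criteria summarized in the abstract and conclusion can be sketched as a rule-based classifier. This is an assumption-laden illustration rather than the scheme of Fig. 11: the function name is hypothetical, "higher C24 tetracyclic terpane" in family A is read as C24Te > C23TT (the mirror of the family C criterion), and any sample that does not meet all family A or all family C criteria is treated as the mixed family B.

```python
def oil_family(c19tt, c20tt, c24tet, c23tt, g_c31h):
    """Assign family A, B or C from terpane abundances and G/C31H."""
    family_a = c19tt > c20tt and c24tet > c23tt and g_c31h < 0.6
    family_c = c19tt < c20tt and c24tet < c23tt and g_c31h > 0.6
    if family_a:
        return "A"
    if family_c:
        return "C"
    # Mixed or borderline signatures are assigned to family B.
    return "B"


# Hypothetical relative abundances, for illustration only.
example_a = oil_family(c19tt=3.0, c20tt=2.0, c24tet=5.0, c23tt=1.0, g_c31h=0.4)
example_c = oil_family(c19tt=1.0, c20tt=2.0, c24tet=1.0, c23tt=3.0, g_c31h=0.9)
example_b = oil_family(c19tt=3.0, c20tt=2.0, c24tet=1.0, c23tt=3.0, g_c31h=0.9)
```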

Conclusion
(1) The hydrocarbon composition and distribution in the DWQ oilfield are quite complicated, with several types. Light hydrocarbons make up the predominant part of the saturated hydrocarbons.
(2) The C7 light hydrocarbon parameters and the abundant terpanes indicate terrestrial organic matter input in a relatively weakly oxidizing to weakly reducing environment.
(3) The heptane and isoheptane values and the C7 series composition show that some of the oils have undergone varying degrees of water washing and biodegradation, but the biodegradation is mild.
(4) The analyzed oils can be classified into three families. Family A is characterized by higher amounts of tricyclic terpanes, a lower concentration of gammacerane (< 0.6) and poor diasteranes. Family C is characterized by a lower content of C19 tricyclic terpane than C20 tricyclic terpane, lower C24 tetracyclic terpane than C23 tricyclic terpane, a relatively high concentration of gammacerane (> 0.6) and poor rearranged steranes. The oils of family B are mixtures of the two types.
Funding This study was funded by Natural Science Foundation of Guangdong Province, China (20201515110555), State Key Laboratory of Shale Oil and Gas Enrichment Mechanisms and Effective Development (33550007-21-ZC0613-0015), and Projects of Talents Recruitment of GDUPT (519017).

Fig. 11 The oil family classification of the analyzed oils in the DWQ oilfield. C19TT stands for C19 tricyclic terpane, C20TT for C20 tricyclic terpane and so on; C24-TeT is for C24 tetracyclic terpane; G is for gammacerane; Dia C30-H is for C30 diahopane; C29-H and C30-H are for C29 and C30 17α(H)-hopane, respectively; C29Ts is for C29 18α(H)-30-norneohopane

Declaration
Conflict of interest On behalf of all the co-authors, the corresponding author states that there is no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.