Application of principal component analysis on water flooding effect evaluation in natural edge-bottom water reservoir

Water flooding effect evaluation is the basic work for formulating comprehensive adjustment measures and improving the effectiveness of oilfield development. However, natural edge-bottom water energy is seldom considered in conventional evaluation methods, so they cannot reflect the combined effect of natural edge-bottom water and injected water. Principal component analysis is a multivariate statistical analysis method that has been widely used in social science and other fields. Based on this method, the water flooding effect of 5 edge-bottom water reservoirs is comprehensively evaluated. First, 11 indicators are selected from four aspects: natural edge-bottom water energy, production change, water injection development and utilization, and energy maintenance and deficit compensation. Then, the selection of principal components is optimized. To retain as much information as possible and obtain more convincing results, three principal components are extracted. Finally, five oilfields are taken as examples for comprehensive evaluation. The results indicate that the natural energy of B oilfield is quite sufficient and water injection was timely in the later stage of development, so its water flooding effect is the best among the five oilfields, with a comprehensive principal component value of 1.434. The values for A and C oilfields are 0.527 and 1.021, ranking third and second, respectively. Although D oilfield has quite sufficient natural energy, water injection has not been timely, so its water flooding effect is poor, with a comprehensive principal component value of 0.259. The value for E oilfield is −3.241, indicating the worst water flooding effect. The rankings of the five oilfields based on principal component analysis and Tong's chart are consistent, both being B, C, A, D and E, which verifies the feasibility and practicability of the method. Additionally, compared with a single index, the method reflects the comprehensive water flooding effect of both natural edge-bottom water and injected water. Specific oilfield cases are evaluated by the proposed method, which helps to better understand its potential for evaluating the water flooding effect of natural edge-bottom water reservoirs.


Abbreviations
$N_{pr}$: Dimensionless elastic production ratio
$N_p$: Cumulative oil production, $10^4$ m$^3$
$N$: Geological reserve, $10^4$ m$^3$
$B_o$: Oil volume coefficient at current formation pressure
$B_{oi}$: Oil volume coefficient at original formation pressure
$C_t$: Comprehensive compression factor, MPa$^{-1}$
$P$: Current formation pressure, MPa
$P_i$: Original formation pressure, MPa
$D_{pr}$: Pressure drop per recovery of reservoir reserve, MPa
$R_{om}$: Water flooding reserves utilization degree
$b$: The slope of water flooding characteristic curve
$X_i$: The ith original evaluation index, i = 1,2, …, p
$F_j$: The jth principal component, j = 1,2, …, k
$\mu_{ij}$: The linear combination coefficient between the ith evaluation index and the jth principal component
$\mathrm{Cov}(F_i, F_j)$: The covariance between the ith and jth principal components
$\mathrm{Var}(F_i)$: The variance of the ith principal component
$\beta(k)$: The cumulative variance contribution rate
$e_i$: The eigenvector corresponding to the eigenvalue $\lambda_i$
$F$: Comprehensive score

Introduction
Water flooding is often used for pressure maintenance and oil displacement to enhance oil recovery (Olayiwola and Dejam 2019; Rostami et al. 2019). Water flooding effect evaluation is the basic work for objectively understanding the current status of oilfield development and improving its effectiveness (Luo et al. 2012; Wen et al. 2016). However, natural edge-bottom water energy is seldom considered in conventional evaluation methods (Pajonk et al. 2011; Pang et al. 2013; Mamonov et al. 2017). Natural water flooding reservoirs exist widely and have rich reserves (Eric and Wilson 2015; Cui et al. 2016; Wei et al. 2019). The evaluation of water flooding effect is the key to reasonable evaluation of well pattern infill and fine adjustment of development technology policies. It is therefore important to accurately and comprehensively evaluate the water flooding development effect of natural edge-bottom water reservoirs. Regarding the evaluation of development effect in water injection reservoirs, scholars have established a variety of evaluation methods and systems. Huang and Liu (1998) used the state comparison method to compare the deviation between the theoretical curve and the actual production curve. Commonly used contrast curves include the relationship between recovery degree and water storage rate, and between recovery degree and water cut (Zhang 1992). This method, which has been widely used in oilfields, makes full use of abundant dynamic data, but the selection of indicators is relatively simple. Based on the principle of material balance, Liu and Xiao (2010) regarded water injection and oil recovery as a unified system and proposed new evaluation indexes from the perspective of water injection to supplement the evaluation indicators of water drive effect. However, the evaluation results obtained by using different indicators may be inconsistent or even contradictory.
So it is necessary to take into account as many indexes as possible and establish a more comprehensive evaluation system. The fuzzy comprehensive evaluation method was proposed by Zhang and Feng (2018) to evaluate the effect of water flooding development. Compared with the traditional method, it can reflect the differences among the evaluation units. The gray correlation and analytic hierarchy process (William and Hanaa 2001; Xiao et al. 2019) were also applied to solve similar problems in oilfields. These mathematical methods can solve the problem of inconsistent or inaccurate evaluation results that arises when a single indicator is used, but some problems still exist. First, the selected indicators are not representative and cannot reflect the characteristics of natural edge-bottom water flooding. Second, the weight coefficients of different indexes are often influenced by subjective judgments, which may lead to evaluation results deviating from objective reality.
The principal component analysis method is a multivariate statistical analysis technique (Martina et al. 2016) that converts multiple variables into a few uncorrelated principal components retaining most of the information contained in the original variables (Scheevel and Payrazyan 2001; Chopra and Marfurt 2014). Compared with previous studies in the literature, this work evaluates the water flooding effect among several oilfields comprehensively and quantitatively. First, indicators reflecting the impact of both injected water and natural edge-bottom water are considered at the same time, avoiding the one-sidedness of traditional reservoir engineering methods. Then, the selection of principal components is optimized, which is the application innovation of this method. Additionally, the method objectively assigns weights to the different indicators, overcoming the subjectivity of other mathematical methods. Finally, quantitative ranking results of water flooding effect among several oilfields can be obtained.
Therefore, to comprehensively evaluate the water drive development effect of natural edge-bottom water reservoirs, this paper introduces the principal component analysis method. Five typical oilfields in the target block are selected as illustrations. First, the target oilfields are overviewed. Second, the water flooding effect is evaluated based on a single index to obtain a preliminary understanding. On this basis, 11 indicators are screened from four perspectives: natural edge-bottom water energy, production change, water injection development and utilization, and energy maintenance and deficit compensation. Then, the principal component analysis method is used to take the 11 indexes into consideration and evaluate the water flooding effect in natural edge-bottom water reservoirs. Finally, the evaluation result obtained from principal component analysis is compared with the understanding from field practice to verify its reasonability. The advantage of this study is that it fully considers the impact of both injected water and natural edge-bottom water, and that the water flooding effect of several oilfields can be ranked. The disadvantage is the lack of indicator data: because the target oilfields are located overseas, access to some indicator data is limited. However, the important indicators can be obtained and calculated, which is sufficient to obtain reasonable results.

Geological and production analysis
The target block is located in northeastern Ecuador and is dominated by tectonic-lithologic traps. There is some erosion at the top of the main oil layer, and the downdip direction is closed by faults. High-quality sandstone is developed in the Cretaceous Napo formation, which is dominated by marine deposition and has a burial depth of 2100-2900 m. The reservoir is a natural edge-bottom water sandstone reservoir. A, B, C, D and E oilfields are the 5 typical edge-bottom water oilfields in this block. The section of the typical B oilfield is shown in Fig. 1. The Napo formation can be further divided into T, U, M2 and M1 sandstone layers, of which the M1 sandstone layer is the main development layer of this block. The porosity of the target block is between 20 and 32%, with an average of 25%. The permeability is between 1000 and 8000 mD, with an average of about 4000 mD.
The key properties of typical B oilfield, including porosity, permeability and effective thickness are shown in Fig. 2.
B and C oilfields were officially put into production in 1978. Initially, they relied mainly on natural energy for development, and water injection to supplement formation energy began after 2000. D oilfield was put into production in 1998 and has relied only on natural energy. So far, no water injection has been implemented, causing the oil production to decline year by year after a certain period of stable production. In A oilfield, production declined quickly as the natural energy was exhausted, but water injection measures were taken in time in 2006 to make up for formation energy. E oilfield was put into production in 2015. As exploitation has deepened, problems such as the rapid drop in reservoir pressure have become prominent, but it has not yet entered the water injection stage. A, B, C and D oilfields have all entered the ultrahigh water cut development stage, with water cut over 95%, and the current water cut of E oilfield is 78.4%.

Natural energy evaluation
Reservoir natural energy evaluation is necessary during oil and gas production (Li and Zhu 2014). It is the key to determine reservoir development mode and countermeasures. Also, it lays the foundation for the comprehensive evaluation of the water flooding effect in edge-bottom water reservoir. According to the petroleum industry standard (1995), two indicators are usually used to quantitatively evaluate the natural energy in reservoir. One is dimensionless elastic production ratio, and the other is pressure drop per recovery of reservoir reserve.

Dimensionless elastic production ratio
Based on the principle of material balance, the dimensionless elastic production ratio is defined as the ratio of cumulative oil production to the production under closed elastic drive. This indicator is usually used to compare the contributions of natural energy and elastic energy of the reservoir. The calculation formula is as follows:

$$N_{pr} = \frac{N_p B_o}{N B_{oi} C_t (P_i - P)}$$

where $N_{pr}$ is the dimensionless elastic production ratio; $N_p$ is cumulative oil production, $10^4$ m$^3$; $N$ is geological reserve, $10^4$ m$^3$; $B_o$ and $B_{oi}$ are oil volume coefficients at current and original formation pressure; $C_t$ is the comprehensive compression factor, MPa$^{-1}$; $P$ and $P_i$ are current and original formation pressure, MPa. When $N_{pr}$ = 1, the reservoir is purely elastically driven, indicating that there is no natural edge-bottom water energy. A larger $N_{pr}$ represents more sufficient natural energy.

Pressure drop per recovery of reservoir reserve
This indicator reflects the adequacy of natural energy in the initial stage of the reservoir. The smaller value shows that the natural energy is more abundant and the edge-bottom water is more active. It is defined as follows: where D pr is pressure drop per recovery of reservoir reserve, MPa. Based on the production data of 5 target oilfields, the two indicators were calculated and plotted on the natural energy evaluation chart (Fig. 3). The results show that B and D oilfields have more sufficient natural energy, which belongs to level I. A and C oilfields have sufficient natural energy, which belongs to level II. E oilfield has weak natural energy, which belongs to level IV. Through natural energy evaluation, the natural energy adequacy of five oilfields is recognized and two important evaluation indicators N pr and D pr are calculated.
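The two natural-energy indicators can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation. The N_pr expression follows the material-balance definition given in the text (cumulative production divided by closed elastic production). The D_pr expression, pressure drop divided by recovery degree in percent, is an assumption, because the printed formula is not available here. The function names and all input values are hypothetical.

```python
def elastic_production_ratio(n_p, n, b_o, b_oi, c_t, p_i, p):
    """Dimensionless elastic production ratio N_pr.

    Assumed form: N_pr = N_p * B_o / (N * B_oi * C_t * (P_i - P)).
    n_p and n in 10^4 m^3, c_t in MPa^-1, p_i and p in MPa.
    """
    dp = p_i - p
    if dp <= 0:
        raise ValueError("current pressure must be below original pressure")
    return n_p * b_o / (n * b_oi * c_t * dp)


def pressure_drop_per_recovery(n_p, n, p_i, p):
    """Pressure drop per 1% recovery of geological reserves, D_pr (MPa).

    Assumed form: D_pr = (P_i - P) / R, with R = 100 * N_p / N in percent.
    """
    recovery_pct = 100.0 * n_p / n
    return (p_i - p) / recovery_pct


# Hypothetical reservoir, for illustration only (not data from the paper).
n_pr = elastic_production_ratio(n_p=200.0, n=1000.0, b_o=1.10, b_oi=1.12,
                                c_t=1.5e-3, p_i=20.0, p=18.0)
d_pr = pressure_drop_per_recovery(n_p=200.0, n=1000.0, p_i=20.0, p=18.0)
print(f"N_pr = {n_pr:.2f}, D_pr = {d_pr:.3f} MPa per 1% recovery")
```

A large N_pr together with a small D_pr would plot toward the "sufficient natural energy" region of a chart such as Fig. 3; the level boundaries themselves come from the industry standard and are not reproduced in this sketch.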

Single index evaluation
Water flooding reserves utilization degree is a common index for evaluating water injection development effect. The traditional calculation method introduced in the criteria (1996) is based on the thickness of all test water wells' absorption profiles and all test oil wells' fluid production profiles. At present, the water drive reserve is usually calculated by water drive characteristic curve, from which water flooding reserves utilization degree can also be obtained. This index is used to make a preliminary evaluation of water flooding effect in target oilfields, and the specific steps are as follows.

Plot the water cut relationship curve
According to production data, the relationship curve between water cut and recovery degree of recoverable reserves is plotted for the 5 target oilfields (Fig. 4).

Choose the appropriate water flooding characteristic curve
First, the relationship curves between water cut and recovery degree of recoverable reserves of the 5 target oilfields are compared with the charts of four types of water drive characteristic curves (Duan et al. 2014; Miao et al. 2014). Then, the type of water drive characteristic curve with the highest degree of agreement is chosen according to the morphological characteristics of the curves. The results show that A, B, C and D oilfields conform to the features of the Type B water drive characteristic curve, whereas E oilfield conforms to the features of the Type C curve.

Calculate water flooding reserves utilization degree
Combined with the results of water drive characteristic curve selection, the curve slope b can be obtained by regression (Fig. 5). According to the formulas (Geng et al. 2014) for calculating the water drive reserves of type B and type C curves, the water flooding reserves utilization degree of 5 target oilfields can also be obtained. The results are shown in Table 1.
The calculation results show that water flooding reserves utilization degree of C oilfield is 78.97%, indicating that the water flooding development effect is good. Water flooding reserves utilization degree of A and B oilfields are 68.84% and 73.33%, respectively, showing that the water flooding effect is medium. That of D and E oilfields are 46.41% and 42.77%, respectively, indicating that the water flooding effect is poor.
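The regression step used to obtain the slope b can be illustrated with an ordinary least-squares fit. The sketch assumes the commonly cited semi-log form of the Type B curve, lg(L_p) = a + b·N_p, where L_p is cumulative liquid production. The source does not print the curve equations, so this form, the helper name `fit_line` and the synthetic data are all assumptions. The conversion from b to water drive reserves (Geng et al. 2014) is not reproduced.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Synthetic production history generated from lg(L_p) = 1.0 + 0.01 * N_p
# (hypothetical values, 10^4 m^3), standing in for the straight-line segment.
cum_oil = [50.0, 100.0, 150.0, 200.0, 250.0]              # N_p
cum_liquid = [10 ** (1.0 + 0.01 * v) for v in cum_oil]    # L_p

a, b = fit_line(cum_oil, [math.log10(v) for v in cum_liquid])
print(f"intercept a = {a:.4f}, slope b = {b:.4f}")
```

In practice only the straight-line segment of the curve would be passed to the fit, since early production data usually do not follow the linear relationship.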

Theory of principal component analysis
Principal component analysis is a multivariate analysis method in mathematical statistics. It converts multiple indicators into a few principal components while retaining most of the original information, so that a high-dimensional variable space is transformed into a low-dimensional one. This method can solve the problem of information overlap among evaluation indexes and gives objective weights to different indicators. The principle of this method is as follows.
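Suppose the original evaluation indexes of water flooding development effect are $X_1, X_2, \ldots, X_p$ and the new variables obtained from principal component analysis are $F_1, F_2, \ldots, F_k$ ($k \le p$), which can be expressed as follows:

$$\begin{cases} F_1 = \mu_{11}X_1 + \mu_{21}X_2 + \cdots + \mu_{p1}X_p \\ F_2 = \mu_{12}X_1 + \mu_{22}X_2 + \cdots + \mu_{p2}X_p \\ \quad\vdots \\ F_k = \mu_{1k}X_1 + \mu_{2k}X_2 + \cdots + \mu_{pk}X_p \end{cases}$$

where $F_1, F_2, \ldots, F_k$ represent the 1st, 2nd, …, and kth principal components of the original evaluation indicators $X_1, X_2, \ldots, X_p$, respectively; $\mu_{11}, \mu_{12}, \ldots, \mu_{pk}$ are the linear combination coefficients between principal components and evaluation indexes.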
The criterion for extracting principal components is to keep the top k comprehensive indicators with eigenvalues greater than 1 and a cumulative variance contribution rate over 85%. In this case, the following three conditions must be satisfied. (1) Each principal component is a linear combination of all original evaluation indicators. (2) All principal components are uncorrelated. (3) The first principal component $F_1$ is the one with the largest variance among all linear combinations, the second principal component $F_2$ has the second largest variance, and so on. The mathematical description is as follows:

$$\mathrm{Cov}(F_i, F_j) = 0, \quad i \ne j$$

$$\mathrm{Var}(F_1) \ge \mathrm{Var}(F_2) \ge \cdots \ge \mathrm{Var}(F_p)$$

where $\mathrm{Cov}(F_i, F_j)$ represents the covariance between the ith and jth principal components; $\mathrm{Var}(F_1), \mathrm{Var}(F_2), \ldots, \mathrm{Var}(F_p)$ are the variances of $F_1, F_2, \ldots, F_p$, respectively.

Steps of principal component analysis
There are 6 steps in the process of principal component analysis: standardizing the original data, calculating the correlation coefficient matrix, calculating the principal component eigenvalues and variance contribution rates, calculating the linear combination coefficients, determining the number of principal components, and calculating the comprehensive score of water flooding effect of the target oilfield. This process is shown in Fig. 6, and the specific calculation process is described in detail.
Fig. 4 Relation curve between water cut and recovery degree of recoverable reserves

Standardize original data
Suppose there are n target oilfields and each oilfield has p evaluation indexes; then the original sample data matrix is $(X_{ij})_{n \times p}$, where i = 1,2, …, n and j = 1,2, …, p. Because the dimensions of the evaluation indexes are not uniform, the original data need to be converted into dimensionless data. The Z-score method is used to standardize the original data, giving a standardized matrix $Z_{ij}$ in which each index has an average value of 0 and a standard deviation of 1:

$$Z_{ij} = \frac{X_{ij} - \bar{X}_j}{S_j} \quad (7)$$

where $\bar{X}_j$ is the average value of the jth original variable and $S_j$ is its standard deviation. When the actual value of an indicator is greater than the average value of that indicator, the standardized value is positive; otherwise, it is negative.
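The standardization step can be sketched as follows. This is a minimal illustration: each column (indicator) is shifted by its mean and divided by its standard deviation. Using the sample standard deviation (n − 1 denominator) is an assumption, since the source does not say which estimator is used. The function name and the 5 × 2 data matrix are hypothetical.

```python
import statistics


def standardize(data):
    """Z-score each column of an n x p matrix (rows = oilfields)."""
    columns = list(zip(*data))
    means = [statistics.mean(col) for col in columns]
    # Sample standard deviation (n - 1 denominator): an assumption.
    stds = [statistics.stdev(col) for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)]
            for row in data]


# Hypothetical raw indicators with very different magnitudes.
raw = [
    [1.0, 10.0],
    [2.0, 30.0],
    [3.0, 20.0],
    [4.0, 50.0],
    [5.0, 40.0],
]
z = standardize(raw)
for row in z:
    print([round(v, 3) for v in row])
```

After this step every column has mean 0 and unit standard deviation, so indicators with different units can be compared directly.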

Calculate the correlation coefficient matrix R based on the standardized data
The correlation coefficient reflects the degree of correlation among different evaluation indicators, and the calculation formula is as follows:
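A minimal sketch of this step, assuming the usual Pearson form in which the correlation between indicators i and j is the sum of products of their standardized values divided by n − 1. The function name and the data are hypothetical; the third column is deliberately the mirror image of the first so that one coefficient is exactly −1.

```python
import statistics


def correlation_matrix(data):
    """Pearson correlation matrix of the columns of an n x p matrix."""
    n = len(data)
    z_cols = []
    for col in zip(*data):
        mean = statistics.mean(col)
        std = statistics.stdev(col)          # sample standard deviation
        z_cols.append([(x - mean) / std for x in col])
    p = len(z_cols)
    return [[sum(a * b for a, b in zip(z_cols[i], z_cols[j])) / (n - 1)
             for j in range(p)]
            for i in range(p)]


# Hypothetical indicators for 5 oilfields (rows) and 3 indexes (columns).
raw = [
    [1.0, 10.0, 5.0],
    [2.0, 30.0, 4.0],
    [3.0, 20.0, 3.0],
    [4.0, 50.0, 2.0],
    [5.0, 40.0, 1.0],
]
R = correlation_matrix(raw)
for row in R:
    print([round(v, 3) for v in row])
```

Strongly correlated indicators such as the first and third columns carry overlapping information, which is the redundancy that principal component analysis is meant to remove.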

Calculate principal component eigenvalues, variance contribution rate and cumulative variance contribution rate
Find the principal component eigenvalues by solving the eigenvalue equation of the correlation coefficient matrix R and rank them so that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$. Get the eigenvector $e_i$ (i = 1,2, …, p) corresponding to the eigenvalue $\lambda_i$. The variance contribution rate of a principal component reflects its share of the total information and can be expressed as follows:

$$\alpha_i = \frac{\lambda_i}{\sum_{j=1}^{p} \lambda_j}$$

where $\alpha_i$ and $\lambda_i$ are the variance contribution rate and eigenvalue of the ith principal component.
The cumulative variance contribution rate of the first k principal components is as follows:

$$\beta(k) = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{j=1}^{p} \lambda_j}$$

where $\beta(k)$ is the cumulative variance contribution rate.
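The eigenvalue step can be sketched without external libraries by using cyclic Jacobi rotations, a standard method for real symmetric matrices. The source does not say which solver was used, so this choice is an assumption, as are the function name `jacobi_eigen` and the small correlation matrix.

```python
import math


def jacobi_eigen(matrix, tol=1e-20, max_sweeps=50):
    """Eigen-decomposition of a real symmetric matrix by Jacobi rotations.

    Returns (eigenvalues, eigenvectors) sorted by descending eigenvalue;
    eigenvectors[i] is the unit eigenvector for eigenvalues[i].
    """
    n = len(matrix)
    a = [[float(x) for x in row] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):   # update columns p, q of A and V
                    a[k][p], a[k][q] = (c * a[k][p] - s * a[k][q],
                                        s * a[k][p] + c * a[k][q])
                    v[k][p], v[k][q] = (c * v[k][p] - s * v[k][q],
                                        s * v[k][p] + c * v[k][q])
                for k in range(n):   # update rows p, q of A
                    a[p][k], a[q][k] = (c * a[p][k] - s * a[q][k],
                                        s * a[p][k] + c * a[q][k])
    order = sorted(range(n), key=lambda i: a[i][i], reverse=True)
    eigenvalues = [a[i][i] for i in order]
    eigenvectors = [[v[k][i] for k in range(n)] for i in order]
    return eigenvalues, eigenvectors


# Hypothetical 3 x 3 correlation matrix.
R = [[1.0, 0.5, 0.2],
     [0.5, 1.0, 0.3],
     [0.2, 0.3, 1.0]]
eigenvalues, eigenvectors = jacobi_eigen(R)
total = sum(eigenvalues)
running = 0.0
for lam in eigenvalues:
    running += lam / total
    print(f"lambda = {lam:.4f}, contribution = {lam / total:.4f}, "
          f"cumulative = {running:.4f}")
```

Because R is a correlation matrix, its eigenvalues sum to the number of indicators, so each contribution rate is simply the eigenvalue divided by p.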

Get the number of principal components and calculate the comprehensive score of all principal components
Based on the principle of principal component extraction, the number of principal components is determined and the comprehensive score of all principal components is calculated. The extracted k principal components are weighted by eigenvalues and the comprehensive score F of the principal components can be expressed as follows:

Calculate the comprehensive score of water drive development effect in the target oilfield
By substituting the standardized data of each target oilfield into the expression of F, the comprehensive score of water drive effect in each target oilfield is obtained, and the water drive effect can also be ranked.
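The scoring step can be sketched as follows. Two assumptions are made, because the printed expression is not available here: the comprehensive score is taken as the eigenvalue-weighted average of the k retained component scores, F = Σ λ_j F_j / Σ λ_j with both sums over j = 1…k, and the linear combination coefficients are taken to be the eigenvector components. The eigenpairs and the centred data are hypothetical toy values chosen to keep the arithmetic readable.

```python
import math


def component_scores(z, eigenvectors):
    """Principal component scores F_j = sum_i mu_ij * Z_i for each row of z."""
    return [[sum(zi * mu for zi, mu in zip(row, vec)) for vec in eigenvectors]
            for row in z]


def comprehensive_score(z, eigenvalues, eigenvectors, k):
    """Eigenvalue-weighted average of the first k component scores."""
    kept = eigenvalues[:k]
    weights = [lam / sum(kept) for lam in kept]
    scores = component_scores(z, eigenvectors[:k])
    return [sum(w * f for w, f in zip(weights, fs)) for fs in scores]


# Hypothetical eigenpairs of a 2 x 2 correlation matrix [[1, 0.6], [0.6, 1]].
r = 1.0 / math.sqrt(2.0)
eigenvalues = [1.6, 0.4]
eigenvectors = [[r, r], [r, -r]]

# Hypothetical centred indicator values for four units.
z = [[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]]
names = ["unit 1", "unit 2", "unit 3", "unit 4"]

F = comprehensive_score(z, eigenvalues, eigenvectors, k=2)
ranking = sorted(zip(names, F), key=lambda item: item[1], reverse=True)
for name, score in ranking:
    print(f"{name}: F = {score:.4f}")
```

A larger F places a unit higher in the ranking, which mirrors how the oilfields are ordered by comprehensive score.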

Evaluation of water flooding effect in target oilfield
Water flooding development systems of oilfield are complex and have many characterization indicators. The evaluation of water flooding effect in edge-bottom water reservoir needs to fully consider the impact of natural edge-bottom water energy.
Based on the mature evaluation index system of water injection oilfields, 11 indicators are selected from four aspects: natural edge-bottom water energy, production change, water injection development and utilization, and energy maintenance and deficit compensation. To comprehensively reflect the overall development effect of the target oilfields under the impact of both natural edge-bottom water and injected water, the 11 evaluation indicators are dimensionless elastic production ratio, recovery degree of reserves, water cut, rising rate of water cut, water storage rate, water consumption rate, liquid production rate, oil recovery rate, yearly injection-production ratio, cumulative injection-production ratio and monthly decline rate. Then, principal component analysis is used to convert the 11 evaluation indicators into several uncorrelated principal components. It also objectively assigns weights to the different indexes, allowing a comprehensive evaluation of the water flooding effect of the 5 target oilfields.
Fig. 6 Flow chart of principal component analysis
The value range of original evaluation index data in five oilfields is shown in Table 2. According to the theory and calculation steps of principal component analysis, the original index data is first standardized by using Eq. (7) to eliminate the influence of different dimensions and orders of magnitude. The result is shown in Fig. 7, from which it can be seen that 11 indicators are all in the same magnitude and dimensional range, facilitating comparison and analysis.
Based on the standardized sample data, the principal component eigenvalue, variance contribution rate and cumulative variance contribution rate can be calculated, and the result is shown in Table 3.
It can be seen from Table 3 that the eigenvalues of the first, second and third principal components are 7.324, 2.152 and 1.304, all greater than 1. To keep as much information as possible and obtain more convincing results, three principal components are finally selected. Their cumulative variance contribution rate reaches 97.997%, which means that they represent 97.997% of the information in the original 11 indicators. According to the extraction criterion, the number of principal components is therefore determined to be 3. The correlation coefficients between the three principal components and each evaluation index are calculated based on Eq. (11), and the result is shown in Table 4.
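The reported selection can be checked with a short calculation. The sketch uses only the three eigenvalues quoted in the text and relies on the fact that the eigenvalues of a correlation matrix of 11 standardized indicators sum to 11. The function name is hypothetical, and the small gap from the reported 97.997% is attributed to rounding of the quoted eigenvalues.

```python
def contribution_rates(eigenvalues, total_variance):
    """Variance contribution rates and their running (cumulative) sums."""
    alphas = [lam / total_variance for lam in eigenvalues]
    betas = []
    running = 0.0
    for alpha in alphas:
        running += alpha
        betas.append(running)
    return alphas, betas


leading = [7.324, 2.152, 1.304]   # eigenvalues quoted from Table 3
p = 11                            # number of standardized indicators

alphas, betas = contribution_rates(leading, total_variance=p)
for j, (lam, alpha, beta) in enumerate(zip(leading, alphas, betas), start=1):
    print(f"F{j}: eigenvalue = {lam:.3f}, contribution = {alpha:.2%}, "
          f"cumulative = {beta:.2%}")

# The remaining eight eigenvalues share what is left of the total variance,
# so none of them can exceed 1 and exactly three components pass the rule.
remainder = p - sum(leading)
k = sum(1 for lam in leading if lam > 1.0)
print(f"remaining variance = {remainder:.3f}, components kept = {k}")
```

With these inputs the cumulative contribution of the three components is about 98%, well above the 85% threshold.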
According to Eq. (12), the extracted three principal components are weighted and summed in terms of eigenvalues to obtain the expression of the comprehensive score F (Eq. (13)). The standardized data of the evaluation indicators are substituted into the expression of F to obtain the comprehensive score of each oilfield, and the oilfields are ranked accordingly. The result is shown in Table 5; a larger comprehensive principal component value means a better water flooding effect.
It can be seen from Table 5 that B oilfield has the best water flooding development effect, with a comprehensive score of 1.434. This oilfield has sufficient natural edge-bottom water energy, and water injection was also timely in supplementing formation energy in the later stage of development, so the overall water flooding effect is the best. For A and C oilfields, the comprehensive scores are 0.527 and 1.021, respectively. The natural edge-bottom water energy of the two oilfields is sufficient, and the formation energy was later supplemented by water injection, so the overall water flooding development effect is also good. The comprehensive score of D oilfield is 0.259. Although it has sufficient natural energy, water injection measures have not been taken in time, resulting in a poor overall water flooding effect. The water drive development effect of E oilfield is the worst, with a comprehensive score of −3.241, because it is not only weak in natural edge-bottom water energy but also has not carried out artificial water injection so far. Therefore, the adequacy of natural edge-bottom water energy alone cannot determine the overall water flooding development effect of an oilfield. It is crucial to replenish formation energy in time through artificial water injection.

Results and discussion
Tong's chart has a good applicability for the evaluation of water flooding development effect in medium-high permeability reservoir (Zhang 2019). Based on Tong's chart, the accuracy of principal component analysis applied in evaluation of water flooding effect is further discussed from the perspective of field practice.
To make the relation curves clear, the data of A, B and C oilfields and those of D and E oilfields are plotted separately. As can be seen from Fig. 8a, when A, B and C oilfields were developed relying merely on natural energy, the water cut rose rapidly. Even when measures such as drilling new production wells were adopted, the water cut still rose quickly after a short drop, and the relation curves fluctuated greatly. After artificial water injection began, the water flooding scale gradually increased in the three oilfields, so the water cut rose slowly and the predicted recovery improved significantly, indicating that the water drive effect of the three oilfields was improved. Among them, the water flooding effect of B oilfield is the best, with the highest recovery degree for the same production time. The predicted recovery of B oilfield is about 55%, followed by C and A oilfields. However, D and E oilfields have not yet been injected with water and rely only on natural edge-bottom water energy for development. The water cut of D oilfield rose fast and is close to 97%. If no corresponding measures are taken, production will rapidly decline to the abandonment level. The predicted recovery of D oilfield is less than 35%, showing that the water drive effect is poor. At present, E oilfield is still in the early stage of development, but its water cut is close to 80% and the recovery degree is only 10%. If no measures are taken, the predicted recovery is expected to be only 25%, giving the worst water flooding effect (Fig. 8b).
The evaluation results of water flooding effect based on different methods are shown in Table 6. It can be seen that the evaluation result of principal component analysis is in line with that of field practice, verifying its reasonability and accuracy. Compared with the single index, whose evaluation result deviates from field verification, principal component analysis is more comprehensive because it can fully reflect the impact of both natural edge-bottom water energy and artificially injected water energy. The method proposed in this paper provides a novel idea for the evaluation of water flooding effect in natural edge-bottom water reservoirs.
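The agreement between methods can be quantified with a rank correlation. The sketch uses the comprehensive scores and water flooding reserves utilization degrees reported in this paper, together with the Tong's chart ordering B, C, A, D, E. Using Spearman's coefficient is an illustrative choice that is not part of the original evaluation, and the helper names are hypothetical.

```python
def ranks_descending(values):
    """Map each name to its rank (1 = largest value); assumes no ties."""
    order = sorted(values, key=values.get, reverse=True)
    return {name: position for position, name in enumerate(order, start=1)}


def spearman(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same names."""
    n = len(rank_a)
    d2 = sum((rank_a[name] - rank_b[name]) ** 2 for name in rank_a)
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


# Values reported in the paper for oilfields A to E.
pca_score = {"A": 0.527, "B": 1.434, "C": 1.021, "D": 0.259, "E": -3.241}
utilization = {"A": 68.84, "B": 73.33, "C": 78.97, "D": 46.41, "E": 42.77}
field_rank = {"B": 1, "C": 2, "A": 3, "D": 4, "E": 5}   # Tong's chart order

pca_rank = ranks_descending(pca_score)
single_rank = ranks_descending(utilization)

print("PCA vs field:         ", spearman(pca_rank, field_rank))
print("single index vs field:", round(spearman(single_rank, field_rank), 3))
```

The single index swaps the order of B and C relative to field practice, whereas the principal component ranking reproduces it exactly.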

Summary and conclusions
To comprehensively evaluate the water drive development effect of natural edge-bottom water reservoirs, the principal component analysis method is introduced in this paper. Taking five oilfields as illustrations and comparing the evaluation results of water flooding effect based on different methods, we have drawn the following conclusions.
1. B and D oilfields have more sufficient natural energy, which belongs to level I. A and C oilfields have sufficient natural energy, which belongs to level II. The natural energy of E oilfield is weak, which belongs to level IV.
2. Based on principal component analysis, the comprehensive scores of water flooding effect of A, B, C, D and E oilfields are 0.527, 1.434, 1.021, 0.259 and −3.241, respectively. The result indicates that B oilfield has the best water flooding effect, followed by C, A and D oilfields, and that of E oilfield is the worst.
3. The evaluation result obtained from principal component analysis is in line with that of field practice, verifying its reasonability and accuracy. The method proposed in this paper can be regarded as a novel idea for the evaluation of water flooding effect in natural edge-bottom water reservoirs.
Funding This work was funded by the National Science and Technology Major Project (Grant No. 2016ZX05031-001).
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.