Introduction

Due to the increasing demand for energy over the last decades, adding substantial oil reserves is a necessity. Besides exploration and unconventional field development, enhanced oil recovery (EOR) technology has demonstrated very promising results in improving the oil recovery factor.

Currently, the ultimate oil recovery factor is about 35%, which means that roughly two-thirds of the oil remains underground. Increasing the recovery factor from 35 to 45% would bring about 1 trillion bbl of oil (Labastie 2011).

The WAG injection process in oil fields has shown an increase in the recovery factor typically ranging from 5 to 10% over water or gas injection in many fields around the world (Christensen et al. 2001). However, the application of this technology is still very limited due to the complexity of the WAG process and the difficulty of quantifying the expected performance prior to building a complex full-field simulation model and running a pilot test.

During the last decades, many research papers were published focusing on WAG performance (Giordano et al. 1998), WAG simulation (Kohata et al. 2017), WAG pilot results analysis (Christensen et al. 2001), WAG process mechanisms (Afzali et al. 2018), WAG hysteresis (Lazreg et al. 2017), and WAG management (Sanchez 1999). However, the WAG incremental recovery factor over secondary recovery remains difficult to predict in advance.

Furthermore, field-scale WAG project approval requires a detailed technical study based on laboratory experiments and WAG modeling, followed by a WAG pilot test for WAG performance calibration, all of which are very time-consuming and costly.

Figure 1 demonstrates the typical WAG project approval workflow.

Fig. 1
figure 1

Typical workflow for field-scale WAG project approval

The objective of this study was to develop a semi-numerical model that can predict WAG incremental recovery factor based on data mining of published WAG pilots worldwide. This research study started by literature review on WAG process and data mining techniques, followed by building a database of WAG pilot test results and input, and then building a predictive model from the WAG pilots using regression and GMDH.

This technical research paper shares the background of the WAG process and its mechanisms, summarizes WAG pilot lessons learned, and demonstrates a new approach for predicting the WAG incremental recovery factor using a semi-numerical model developed from WAG data mining.

Water alternating gas recovery factor and mechanisms

The overall recovery factor (efficiency) (RF) of any secondary or tertiary oil recovery method is the product of a combination of three individual efficiency factors as given by the following generalized expression (Ahmed 2010):

$${\text{RF}} = E_{\text{D}} E_{\text{A}} E_{\text{V}}$$
(1)

where RF = the overall recovery factor (0–1), ED = the displacement efficiency (0–1), EA = the areal sweep efficiency (0–1), and EV = the vertical sweep efficiency (0–1).
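As a quick arithmetic illustration of Eq. (1), the sketch below multiplies three assumed efficiency values; the numbers are illustrative only, not field data:

```python
# Worked example of Eq. (1): RF = E_D * E_A * E_V
# The three efficiencies below are illustrative assumptions, not field data.
E_D = 0.60  # displacement efficiency (0-1)
E_A = 0.80  # areal sweep efficiency (0-1)
E_V = 0.70  # vertical sweep efficiency (0-1)

RF = E_D * E_A * E_V
print(f"Overall recovery factor: {RF:.3f}")  # 0.336
```

Even with individually reasonable efficiencies, the product is well below any single factor, which is why improving more than one efficiency at once (as WAG aims to do) matters.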

The overall recovery factor is a function of multiple factors including fluid mobilities, injection patterns, areal and vertical heterogeneities, the degree of gravity segregation, and total pore volume injected (Ahmed 2010).

The WAG incremental recovery factor results from an increase in both displacement and volumetric sweep efficiencies, due to the reduction in residual oil saturation and the improvement of both areal and vertical sweep efficiencies. WAG injection is one of the promising and effective EOR technologies for enhancing the oil recovery factor over secondary recovery. However, WAG recovery mechanisms are complex processes that require in-depth understanding of the factors controlling their performance.

Multiple research papers were published during the last decades on WAG recovery mechanisms, which include three-phase WAG hysteresis, residual oil reduction by gas, mobility control, and oil vaporization and swelling.

Lazreg et al. (2017) demonstrated the impact of two-phase and three-phase WAG hysteresis on WAG incremental recovery factor based on an integrated research study that incorporated findings from both laboratory experiments and reservoir simulation from multiple Malaysian oil fields. This technical paper illustrated that three-phase WAG hysteresis could increase WAG incremental recovery factor by 1–2% on top of secondary recovery. Skauge and Larsen (1994) demonstrated that the residual oil saturation by three-phase flow was significantly lower than the residual oil saturation from two-phase waterflood and gas injection.

Mobility ratio is an important factor that controls the volumetric sweep efficiency of the gas injection process, with a favorable mobility ratio being less than one (< 1). A reduction in the mobility ratio can be obtained by increasing the gas viscosity or reducing the relative permeability of the fluids. Reduced mobility of the gas phase can be achieved by injecting water and gas alternately. It is essential to adjust the amounts of water and gas to achieve the best possible displacement efficiency. Too much water will result in poor microscopic displacement, and too much gas will result in poor vertical, and possibly horizontal, sweep (Christensen et al. 2001).

Oil swelling and vaporization in the presence of oil and gas phases are components of the WAG incremental recovery factor. The improvement of oil recovery during gas EOR includes oil swelling, gas–oil interfacial tension (IFT) reduction, oil viscosity reduction, and extraction of light and intermediate hydrocarbons, ranging from immiscible flooding to completely miscible displacement (Tunio et al. 2011; Cao and Gu 2013; Blunt et al. 1993; Chen et al. 2010). Chordia and Trivedi (2010) showed that when CO2 contacts the oil, swelling occurs, causing the oil to expand and move toward the producing well. Observations suggest that when the oil and gas mix, drainage rates become higher in the oil zone, driving the excess oil toward the fractures.

Water alternating gas injection technology application

Water alternating gas (WAG) injection technology is a method that may improve oil recovery efficiency by combining the effects of two traditional techniques: water and gas flooding. Both microscopic oil displacement and sweep efficiency can be enhanced by WAG implementation (Surguchev et al. 1992). WAG injection consists of injecting water and gas alternately from the same injection wells, which contributes to the improvement of overall sweep efficiency by:

  • Reducing oil viscosity and increasing reservoir oil volume due to oil swelling (Moffitt and Zornes 1992).

  • Vaporizing light oil components and stripping higher hydrocarbons (Jakobsson and Christian 1994).

  • Reducing residual oil saturation under gas injection as compared to waterflooding (Moreno et al. 2010).

  • Increasing oil recovery factor through improved waterflood conformance (Champion and Shelden 1989).

  • WAG hysteresis which involves reduction in water and gas mobilities, and reduction in residual oil saturation (Lazreg et al. 2017).

  • Increasing attic oil production through gas injection (Owens et al. 2018; Christensen et al. 2001).

  • Maintaining reservoir pressure (Hermansen et al. 1997).

The improvements in gas handling and in situ lifting are some other benefits of WAG injection.

The WAG injection process is divided into miscible and immiscible processes depending on the occurrence of miscibility between the injection gas and the in situ reservoir oil under reservoir conditions. If gas injection happens at a pressure higher than the minimum miscibility pressure (MMP), the WAG process is miscible; otherwise, it is immiscible. Field application has demonstrated that miscible WAG injection performance and recovery factor are generally higher than those of immiscible WAG injection (IWAG) (Christensen et al. 2001).

The injected gas during the WAG process can be hydrocarbon or non-hydrocarbon. Hydrocarbon gases are low-molecular-weight paraffins such as methane and ethane, and non-hydrocarbon gases include carbon dioxide (CO2) and nitrogen (N2).

The WAG injection scheme is one of the critical parameters that control WAG injection performance; it comprises the WAG ratio, WAG cycle, WAG slug size, injection rates, WAG duration, and start-up timing (Yang et al. 2008).

Most of the published successful WAG pilot projects are from fields located in the USA and Canada (Christensen et al. 2001). A few published successful projects are in the North Sea, China, Venezuela, North Africa, the Middle East, Malaysia, and Croatia.

With the progress in computer hardware during the last decades and the development of multiple reservoir modeling software packages, evaluating large-scale WAG project performance is now possible. However, modeling the WAG process at full field scale is expensive and very time-consuming, and its results depend on the modeling quality and input data uncertainty.

Pilot WAG data summary

Twenty-eight oil fields with published pilot WAG data were used in this research study. Field locations include the USA, Canada, Malaysia, China, Norway, Venezuela, and India. Table 1 summarizes the selected fields' locations. Many other reviewed WAG pilot papers did not reveal the observed WAG incremental recovery factor, which is required in this research work.

Table 1 Fields used during this research study (Christensen et al. 2001; Ramachandran et al. 2010; Sanchez 1999; Rosman et al. 2011)

Figure 2 shows the WAG incremental recovery factor versus hydrocarbon pore volume of injected gas for the selected fields. Figure 3 shows each WAG project's parameters, which include WAG process type, rock and fluid properties, and WAG injection scheme. A few projects reported the WAG recovery factor at a specific hydrocarbon pore volume of injected gas and/or only the ultimate recovery factor from the WAG project, while others reported the full trend of the pilot WAG recovery factor.

Fig. 2
figure 2

Summary of WAG incremental recovery factor from WAG pilots

Fig. 3
figure 3

WAG pilot data used in this research study

WAG pilot data analysis

One method to quantify dependence between random variables is to rank them according to how closely one is a function of another. Several measures based on this idea have been proposed. Many researchers have published different approaches for defining the dependency between random variables, including those based on copulas associated with Markov operators (Boonmee and Tasena 2016).

Understanding the dependency of input vectors is very important in reducing the number of predictors, which simplifies the study and cuts the required analysis time. Linear regression is a widely useful statistical learning method for predicting a quantitative response. Moreover, it serves as a good jumping-off point for newer approaches, as many fancier statistical learning approaches can be seen as generalizations or extensions of linear regression (James et al. 2015).

The Pearson correlation coefficient (PCC) is a statistical metric that measures the strength and direction of a linear relationship between two random variables. It has been applied across statistics, including data analysis, classification, clustering, decision making, financial analysis, and biological research (Zhou et al. 2016).

The Pearson correlation coefficient of two variables X and Y is formally defined as the covariance of the two variables divided by the product of their standard deviations (which acts as a normalization factor), and it can be equivalently defined by (Zhou et al. 2016):

$$r_{xy} = \frac{{\sum \left( {x_{i} - \bar{x}} \right)\left( {y_{i} - \bar{y}} \right)}}{{\sqrt {\sum \left( {x_{i} - \bar{x}} \right)^{2} } \sqrt {\sum \left( {y_{i} - \bar{y}} \right)^{2} } }}$$
(2)

where \(\bar{x} = \frac{1}{n}\sum\nolimits_{i = 1}^{n} {x_{i} }\) denotes the mean of x and \(\bar{y} = \frac{1}{n}\sum\nolimits_{i = 1}^{n} {y_{i} }\) denotes the mean of y.

The coefficient rxy indicates the strength of the linear relationship between the two variables, and it ranges between − 1 and 1. The rxy is positive if the two variables are directly related and negative if they are inversely related. The closer |rxy| is to 1, the stronger the linear relationship between the two variables. An rxy of 0 indicates uncorrelated variables.
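Eq. (2) can be computed directly from these definitions; below is a minimal sketch using only the Python standard library, with illustrative sample arrays rather than pilot data:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient per Eq. (2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of products of deviations from the means
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator: product of the root sums of squared deviations
    sx = math.sqrt(sum((xi - mean_x) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - mean_y) ** 2 for yi in y))
    return cov / (sx * sy)

# Perfectly (directly) linearly related variables give r = 1
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))   # 1.0
# Perfectly inversely related variables give r = -1
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))   # -1.0
```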

The results of the input vector dependency analysis demonstrated low correlation coefficients which reflect the independence of the input vectors. Table 2 shows the results of the input vector dependency.

Table 2 WAG pilot input vectors Pearson’s correlation coefficients

Data mining

Data mining is the computational process of discovering patterns in large data sets (“big data”) involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems (James et al. 2015).

Data mining is an important part of the processes of knowledge discovery in medicine, economics, finance, telecommunication, and various scientific fields. Data mining helps to uncover hidden information from an enormous amount of data that are valuable for the recognition of important facts, relationships, trends, and patterns (Medvedev et al. 2017).

Nowadays, data mining (DM) has attracted a lot of attention in the data analysis area, and it has become a recognized tool for data analysis that can extract valuable and meaningful knowledge from data (Ahmed et al. 2016).

As a highly application-driven domain, data mining has incorporated many techniques from other domains such as statistics, machine learning, pattern recognition, database and data warehouse systems, information retrieval, visualization, algorithms, high-performance computing, and many application domains (Fig. 4) (Han et al. 2012).

Fig. 4
figure 4

Data mining adopts techniques from many domains (Han et al. 2012)

Statistics studies the collection, analysis, interpretation or explanation, and presentation of data. Data mining has an inherent connection with statistics. A statistical model is a set of mathematical functions that describe the behavior of the objects in a target class in terms of random variables and their associated probability distributions. Statistical models are widely used to model data and data classes (Han et al. 2012).

Machine learning automates the acquisition of knowledge, and it plays an important role in artificial intelligence research. An intelligent system without learning ability cannot be regarded as a truly intelligent system, yet intelligent systems in the past generally lacked learning ability (Xiuyi and Yuxia 2018).

Machine learning

Machine learning, by definition, is a field of computer science that evolved from studying pattern recognition and computational learning theory in artificial intelligence. It is the study and construction of algorithms that can learn from and make predictions on data sets. These procedures operate by constructing a model from example inputs in order to make data-driven predictions or decisions, rather than following strictly static program instructions (Simon et al. 2016).

Machine learning involves two types of tasks:

  • Supervised machine learning, where the program is trained on a pre-defined set of data, which then enables the program to reach accurate conclusions when given new data.

  • Unsupervised machine learning, where the program is given a set of data without pre-defined outputs, and the program must find relationships and patterns therein.

The most popular approaches to machine learning are artificial neural networks and genetic algorithms (Michael 2011). Artificial neural networks (ANNs) are computing systems inspired by the biological neural networks that constitute human brains. An ANN is capable of approximating nonlinear functional relationships between input and output variables (Kim et al. 2018). The basic processing elements of neural networks are neurons. Neurons in an ANN are characterized by a single, static, continuous-valued activation. A collection of neurons is referred to as a layer, and the collection of interconnected layers forms the neural network (Kim et al. 2018). Figure 5 shows a typical structure for an artificial neural network.

Fig. 5
figure 5

Neural network typical structure

The development of neural networks was introduced to partly improve the modeling procedure, but the high degree of subjectiveness in the definition of some of their parameters, as well as their demand for long data samples, remains a significant obstacle (Anastasakis and Mort 2001).

The group method of data handling (GMDH) is a family of inductive, self-organizing, data-driven approaches that requires only small data samples and has the ability to optimize the structure of neural network models objectively. The GMDH technique has been used in data mining, knowledge discovery, prediction, complex system modeling, and pattern recognition (Lemke and Motzev 2016).

WAG incremental recovery factor prediction

Regression

Regression is a statistical technique for determining the relationship between two or more variables. It is used for predicting an output as a function of given input vectors. There are multiple types of regression techniques, starting with the simplest, linear regression, and extending to more advanced techniques.

Multiple regression is a technique for modeling the association between a scalar dependent variable V and one or more descriptive variables indicated by Y. It predicts the value of the dependent variable from the other variables: \(V = w_{0} + w_{1} y_{1} + \cdots + w_{n} y_{n} + \varepsilon\), where V is the dependent variable, w0 to wn are the coefficients, y1 to yn are the independent variables, and ε is the random error (Bini and Mathew 2016).
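A multiple regression model of this form can be fitted by ordinary least squares; the minimal numpy sketch below uses synthetic, noise-free data, so the variable names and coefficient values are illustrative assumptions, not the study's actual model:

```python
import numpy as np

# Synthetic data generated from known coefficients: V = 2 + 3*y1 - 1*y2
rng = np.random.default_rng(0)
Y = rng.random((50, 2))          # independent variables y1, y2 (50 observations)
V = 2 + 3 * Y[:, 0] - Y[:, 1]    # dependent variable (noise-free)

# Ordinary least squares: prepend a column of ones so w[0] is the intercept w0
A = np.column_stack([np.ones(len(V)), Y])
w, *_ = np.linalg.lstsq(A, V, rcond=None)
print(np.round(w, 3))  # recovers [2, 3, -1]
```

Because the synthetic data contain no noise, the fit recovers the generating coefficients exactly; with real pilot data, ε would leave residual scatter.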

Regression model and validation

The WAG data were divided into three categories: miscible WAG in sandstone formation, immiscible WAG in sandstone formation, and miscible WAG in carbonate formation.

For the development of the WAG incremental recovery factor predictive model for each category, 70% of the WAG data were used for training the model and 30% were used for model validation.
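A 70/30 split of this kind can be sketched as a shuffled index partition; the array sizes and seed below are illustrative assumptions, not the study's actual data or procedure:

```python
import numpy as np

def train_val_split(X, y, train_frac=0.7, seed=42):
    """Shuffle observation indices, then split features/targets 70/30."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))        # random ordering of observations
    cut = int(train_frac * len(y))       # boundary between train and validation
    tr, va = idx[:cut], idx[cut:]
    return X[tr], y[tr], X[va], y[va]

X = np.arange(20).reshape(10, 2)   # 10 observations, 2 predictors (illustrative)
y = np.arange(10, dtype=float)
X_tr, y_tr, X_va, y_va = train_val_split(X, y)
print(len(y_tr), len(y_va))  # 7 3
```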

Figures 6, 7, 8, 9, 10 and 11, and Table 3 summarize the results of WAG incremental recovery factor for training and validation.

Fig. 6
figure 6

Regression WAG incremental recovery factor prediction model training results for miscible WAG in sandstone formation

Fig. 7
figure 7

Regression WAG incremental recovery factor prediction model training results for immiscible WAG in sandstone formation

Fig. 8
figure 8

Regression WAG incremental recovery factor prediction model training results for miscible WAG in carbonate formation

Fig. 9
figure 9

Regression WAG incremental recovery factor prediction model validation results for miscible WAG in sandstone formation

Fig. 10
figure 10

Regression WAG incremental recovery factor prediction model validation results for immiscible WAG in sandstone formation

Fig. 11
figure 11

Regression WAG incremental recovery factor prediction model validation results for miscible WAG in carbonate formation

Table 3 Regression WAG incremental recovery factor prediction model output

The prediction model details are shared in the Appendix.

Group method of data handling

GMDH is a supervised feed-forward networking model in which the original input vectors are used to generate the initial layer of the network, with each subsequent layer feeding its outputs to the next layer. The model’s underlying concept resembles animal evolution or plant breeding, as it adheres to the principle of natural selection. The multilayer criterion preserves superior networks for successive generations, eventually yielding an optimal network (Tsai and Yen 2017).

The topology of the GMDH network is determined through a layer-by-layer pruning process based on a pre-defined criterion for selecting the best nodes at each layer. Farlow (1981) recognized that many types of mathematical models require the modeler to know the system variables, which may generally be very difficult to identify. The modeler is then forced to guess these variables; this guessing is not only time-consuming but also produces unreliable prediction models.

GMDH was developed to produce a model by looking only at input data and the desired output (Semenov et al. 2010).

GMDH uses an iterative polynomial regression procedure to synthesize any model. The polynomial regression equations can produce a high-order polynomial model using effective predictors. Farlow (1981) started by computing the quadratic polynomial regression equation:

$$y = a + bx_{i} + cx_{j} + dx_{i}^{2} + ex_{j}^{2} + fx_{i} x_{j}$$

where y is the output sample; (xi, xj) is a pair of input samples; and a, b, c, d, e, and f are the polynomial coefficients to be determined by the training data set.
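The six coefficients a through f can be estimated for each input pair by least squares; the sketch below uses synthetic data built from a known quadratic, so all values are illustrative:

```python
import numpy as np

def fit_partial(xi, xj, y):
    """Least-squares fit of y = a + b*xi + c*xj + d*xi**2 + e*xj**2 + f*xi*xj."""
    A = np.column_stack([np.ones_like(xi), xi, xj, xi**2, xj**2, xi * xj])
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs

# Target built from a known quadratic, so the fit recovers a..f exactly
rng = np.random.default_rng(1)
xi, xj = rng.random(30), rng.random(30)
y = 1 + 2 * xi - 3 * xj + 0.5 * xi**2
coeffs = fit_partial(xi, xj, y)
print(np.round(coeffs, 3))  # approximately [1, 2, -3, 0.5, 0, 0]
```

In GMDH this fit is repeated for every pair of input columns, and each fitted polynomial becomes a candidate neuron.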

The matrix of input variables xij with m predictors, n observations, and the output variable yi is defined below.

$$\left( {\begin{array}{*{20}l} {x_{11} } \hfill & \cdots \hfill & {x_{m1} } \hfill \\ \vdots \hfill & \cdots \hfill & \vdots \hfill \\ {x_{1n} } \hfill & \cdots \hfill & {x_{mn} } \hfill \\ \end{array} } \right) \to \left( {\begin{array}{*{20}c} {y_{1} } \\ \vdots \\ {y_{n} } \\ \end{array} } \right)$$

During the training, GMDH uses the input matrix of n observations and m + 1 variables (m independent variables xij and one dependent variable yi).

The training iterations start by taking the independent variables two columns at a time and constructing the quadratic regression polynomial that best fits the dependent variable. Each pair of input vectors forms a quadratic regression polynomial equation. The first layer is constructed from the m independent variables and the dependent variable to form k = m(m − 1)/2 regression polynomials. The new variables (z1n, z2n, …, zkn) that better describe the dependent variable become the input to the second layer, and so on. Less effective variables are eliminated using either a regularity criterion or the root-mean-squared error.

Training and iteration continue, producing new variables that describe the solution better than the previous ones, until the minimum error value of the current layer exceeds that of the previous layer. Figures 12 and 13 show a typical GMDH method layout and the GMDH algorithm.

Fig. 12
figure 12

GMDH method layout

Fig. 13
figure 13

GMDH method algorithms

GMDH external criterion

The external criterion is a regularity criterion used to test model adequacy. It evaluates the output of each new neuron in the GMDH network using a pre-defined regularity criterion, such as the root-mean-square error (RMSE) between the predicted and actual outputs of a neuron. Neurons that fulfill the criterion survive and are used as inputs to the next layer, while neurons that do not fulfill the criterion are discarded.

Building GMDH model procedure

The steps in building a GMDH model are:

Step 1 Divide the input data into training and test sets.

The input data are divided into training and test sets. The training set is used to train the model and estimate certain characteristics of the nonlinear system, and the test set is then used to validate the model and determine the complete set of characteristics.

Step 2 Generate new variables in each layer.

New variables (neurons) for each layer are generated from combinations of the input variables. The number of combinations is given by:

\(C_{r}^{m} = \frac{m!}{{r!\left( {m - r} \right)!}} ,\) where m is the number of input vectors and r is usually set to two (Farlow 1981).

With r = 2, the number of new variables per the previous equation is \(C_{2}^{m} = \frac{{m\left( {m - 1} \right)}}{2}\).

Step 3 Optimization principle for elements in each layer.

Regression analysis is applied to the training data to calculate the optimum partial descriptions of the nonlinear system.

Usually, the root-mean-square error (RMSE) is used as an external criterion (index) to screen out underperforming neurons in each layer. RMSE is defined as:

$${\text{RMSE}} = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {\hat{y}_{i} - y_{i} } \right)^{2} }}{n}}$$

where \(\hat{y}_{i}\) is the predicted output, \(y_{i}\) is the actual output, and n is the number of observations.

The RMSE balance criterion is used in this study; in this case, the RMSE is penalized by the difference between the training and testing sample RMSE values.

Step 4 Stopping rule for the multilayer structure generation.

By comparing the index value of the current layer with that of the next layer to be generated, further layers are prevented from being developed once the index value no longer improves or falls below a pre-defined target value; otherwise, steps 2 and 3 are repeated until the stopping condition above is met.
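Steps 1 through 4 can be combined into a compact layer-by-layer sketch. The data, the number of neurons kept per layer, and the layer cap below are illustrative assumptions, not the settings used in this study:

```python
import numpy as np
from itertools import combinations

def fit_partial(xi, xj, y):
    # Quadratic partial description: y = a + b*xi + c*xj + d*xi^2 + e*xj^2 + f*xi*xj
    A = np.column_stack([np.ones_like(xi), xi, xj, xi**2, xj**2, xi * xj])
    return np.linalg.lstsq(A, y, rcond=None)[0]

def predict_partial(c, xi, xj):
    A = np.column_stack([np.ones_like(xi), xi, xj, xi**2, xj**2, xi * xj])
    return A @ c

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

def gmdh_layer(X_tr, y_tr, X_te, y_te, keep=4):
    """Steps 2-3: fit a partial description for every column pair, score each
    candidate neuron on the test set (external criterion), keep the best `keep`."""
    scored = []
    for i, j in combinations(range(X_tr.shape[1]), 2):
        c = fit_partial(X_tr[:, i], X_tr[:, j], y_tr)
        err = rmse(predict_partial(c, X_te[:, i], X_te[:, j]), y_te)
        scored.append((err, i, j, c))
    scored.sort(key=lambda t: t[0])
    best = scored[:keep]
    # Surviving neurons become the input variables of the next layer
    Z_tr = np.column_stack([predict_partial(c, X_tr[:, i], X_tr[:, j]) for _, i, j, c in best])
    Z_te = np.column_stack([predict_partial(c, X_te[:, i], X_te[:, j]) for _, i, j, c in best])
    return Z_tr, Z_te, best[0][0]

# Step 1: 70/30 split of synthetic data; Step 4: stop when the test RMSE worsens
rng = np.random.default_rng(0)
X = rng.random((60, 4))
y = X[:, 0] * X[:, 1] + X[:, 2]
X_tr, y_tr, X_te, y_te = X[:42], y[:42], X[42:], y[42:]

best_err = np.inf
for _ in range(5):  # cap the number of layers to keep the sketch bounded
    Z_tr, Z_te, err = gmdh_layer(X_tr, y_tr, X_te, y_te)
    if err >= best_err:
        break
    best_err, X_tr, X_te = err, Z_tr, Z_te
print(round(best_err, 4))
```

Note that in this sketch the test set doubles as the external criterion; in practice a separate holdout set would be reserved for final validation.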

Difference between GMDH and neural networks

The major differences between the two techniques can be summarized as the ability of GMDH to objectively select the optimum model, avoid model overfitting, and select the most relevant input variables. Anastasakis and Mort (2001) summarized the features of both neural networks and self-organizing modeling across a variety of categories, as shown in Table 4.

Table 4 Comparison between artificial neural network and group method of data handling

Group method of data handling results

The WAG data were divided into three categories: miscible WAG in sandstone formation, immiscible WAG in sandstone formation, and miscible WAG in carbonate formation.

For the development of the prediction model for the WAG incremental recovery factor, 70% of the WAG data were used for training the model, while 30% were used for model validation.

Figures 14, 15, 16, 17, 18, 19, and Table 5 summarize the results of WAG incremental recovery factor for training and validation.

Fig. 14
figure 14

GMDH WAG incremental recovery factor prediction model training results for miscible WAG in sandstone formation

Fig. 15
figure 15

GMDH WAG incremental recovery factor prediction model training results for immiscible WAG in sandstone formation

Fig. 16
figure 16

GMDH WAG incremental recovery factor prediction model training results for miscible WAG in carbonate formation

Fig. 17
figure 17

GMDH WAG incremental recovery factor prediction model validation results for miscible WAG in sandstone formation

Fig. 18
figure 18

GMDH WAG incremental recovery factor prediction model validation results for immiscible WAG in sandstone formation

Fig. 19
figure 19

GMDH WAG incremental recovery factor prediction model validation results for miscible WAG in carbonate formation

Table 5 GMDH WAG incremental recovery factor prediction model output

The prediction model details are shared in the Appendix.

Results summary and discussions

The research study results can be divided into two parts: the WAG incremental recovery factor prediction model results and the WAG pilot lessons learned.

WAG incremental recovery factor predictive model

The WAG data obtained from the WAG pilots include rock properties, fluid properties, and injection scheme; however, a complete data set was not available for all fields. The preliminary analysis of the WAG parameters, as per Table 2, demonstrated that the selected parameters are independent.

WAG pilot reservoir parameters and WAG incremental recovery factor, with one hundred and seventy-seven (177) observations, were prepared and input to the two widely used data mining techniques, regression and GMDH. Input data include rock type, WAG process (miscible, immiscible), reservoir permeability, reservoir pressure and temperature, oil gravity and viscosity, and hydrocarbon pore volume of injected gas.

All prediction models were trained using 70% of the observations and validated using the remaining 30% for the three input classes: miscible WAG in sandstone formation, immiscible WAG in sandstone formation, and miscible WAG in carbonate formation. Sensitivity analysis on the data splitting (i.e., 80/20, 60/40, 70/30) demonstrated no significant improvement in prediction model accuracy. After multiple iterations, both the regression and GMDH models were optimized. For the regression method, a linear regression model was tested first, leading to a low coefficient of determination (R2) (i.e., R2 < 65% for validation); hence, complexity was added to the regression model until the mismatch error between observed and predicted data was minimized.

For the GMDH method, multiple iterations were performed, adjusting the maximum number of layers and neurons, until the optimum model was achieved with minimum mismatch error. RMSE balance was used as the validation criterion. The final WAG incremental recovery factor models achieved a high coefficient of determination (R2), as per Tables 6 and 7.

Table 6 Summary of the two prediction model results: training set
Table 7 Summary of the two prediction model results: validation set

Tables 6 and 7 summarize the results of WAG incremental recovery factor prediction model parameters achieved for both regression and GMDH methods. GMDH algorithm results demonstrate its strength and ability in selecting the effective predictors in the prediction model, optimizing the network structure, and achieving predictive model with high accuracy as compared to regression technique.

Figure 20 shows the comparison between the regression and GMDH models' prediction results. Both predictive models showed good predictivity of the WAG recovery factor; however, the regression model showed significant mismatch for a few of the observed data points.

Fig. 20
figure 20

Comparison between GMDH and regression model prediction results

Compared to the traditional way of estimating the WAG incremental recovery factor from time-consuming reservoir simulation studies, which depend on the reservoir model's history-matching quality, technical team expertise, and input data uncertainty, these WAG prediction models offer an easy tool for quick WAG evaluation.

These WAG prediction models are expected to help technical teams estimate the field WAG recovery factor before launching a very detailed and costly technical study, and to assess the risks associated with the WAG project.

These WAG predictive models apply mainly to CO2 WAG, since most of the WAG projects used in this study are CO2-WAG, as per Fig. 3.

WAG lessons learned

WAG success criteria:

  • Higher production rate and reduced water production (Ma and Youngren 1994).

  • Immediate increase in oil production and an increase in the ultimate recovery (Hinderaker et al. 1996).

  • Oil viscosity was decreased, and light components of oil were increased (Jingcun et al. 1997).

  • Laboratory corefloods indicated that the residual oil to waterflood, Sorw, is reduced by trapped gas. To reduce this residual "immobile" oil saturation, the oil mobility must have increased (Champion and Shelden 1989).

  • WAG injection tends to increase waterflood efficiency and reduce water–oil ratio (WOR) (Champion and Shelden 1989).

  • A gentle increase in well GOR and a flattening of the water cut and oil production rate (Choudhary et al. 2011).

  • The decline rate prior to WAG was about 55%, and within a year of the commencement of WAG the decline rate dropped below 25% (Choudhary et al. 2011).

  • Laboratory displacement study indicated an improvement in displacement efficiency by 14.5% over water flood in the target sand (Ramachandran et al. 2010).

  • Oil production from pilot wells was improved after initialization of WAG with a decline in water cut (Ramachandran et al. 2010).

WAG recovery mechanisms and benefits:

  • Reduction in oil viscosity and an increase in reservoir oil volume due to oil swelling (Moffitt and Zornes 1992).

  • Pressure maintenance, gravity drainage, compositional effects such as vaporization and viscous displacement of oil by gas (Hermansen et al. 1997).

  • Pressure support, swelling of the oil and vaporization, and striping of higher hydrocarbons (Jakobsson and Christian 1994).

  • Viscous displacement, molecular diffusion, gravity drainage, vaporization/stripping—below bubble point, swelling of oil—above bubble point (Jakobsson and Christian 1994).

  • Mobility control of the injected gas, thereby reducing gas-handling requirements (Champion and Shelden 1989).

  • Oil recovery factor increases through improved waterflood conformance (Champion and Shelden 1989).

  • Oil swelling and viscosity reduction contribute to the mobilization of the oil (Hsie and Moore 1998).

  • Reservoir heterogeneity, dispersion, and diffusion were believed to promote spreading of CO2 and mixing between CO2 and reservoir oil, thereby increasing volumetric sweep (Hsie and Moore 1998).

  • Better reservoir management (improved gas handling, in situ lifting) (Ma and Youngren 1994).

Pilot WAG challenges and issues:

  • Formation of in situ carbonic acid is believed to cause dissolution of carbonate cementing materials in the reservoir (Hsie and Moore 1998).

  • Erosion caused by sand production accelerated downhole corrosion (Hsie and Moore 1998).

  • Downdip injection is dominated by gas segregation, as observed in gas injection (Hinderaker et al. 1996).

  • Reservoir complexity/thickness of the formation, communication between the main field and the east flank, production well coverage, the distance between the WAG injector and the production wells, extensive upward gas migration in the formation, and underestimation of WAG performance in simulation (Crogh et al. 2002).

  • Due to grid size and vertical reservoir permeability, the representation of the WAG process in a full-field simulation model is not satisfactory (Crogh et al. 2002).

  • Gas override during WAG process (Pritchard et al. 1990).

  • Injected CO2 moved mainly along the high permeable path (Jingcun et al. 1997).

  • The injection pressure of the water injector increased (Jingcun et al. 1997).

  • Produced liquid was slightly acidic, while calcium, magnesium, chloride, and salinity increased (Jingcun et al. 1997).

  • Wellhead pressure increased with the increase in gas–oil ratio (Jingcun et al. 1997).

  • Well temperature decreased, and liquid yield of producers declined due to mobility decrease in the reservoir (Jingcun et al. 1997).

  • Controlling the injection between upper and lower zones was a key factor in deciding the success of the development (Al Shamsi et al. 2012).

Conclusions

  • The incremental recovery factor from the WAG pilots mostly ranged from 5 to 10%, with a few WAG pilots showing an incremental recovery factor of up to 20%.

  • Three WAG incremental recovery factor predictive models were developed for sandstone formation with immiscible gas injection, sandstone formation with miscible gas injection, and carbonate formation with miscible gas injection.

  • A predictive model that estimates WAG incremental recovery factor was developed based on input from WAG pilots, with one hundred and seventy-seven (177) observations.

  • Seventy percent of the data were used for training the model and 30% for validating it. Splitting the data 50/50, 60/40, or 80/20 for model training and validation, respectively, changes the coefficient of determination by 0 to 0.05.

  • The coefficient of determination (R2) for the GMDH WAG predictive models ranged from 0.964 to 0.981 and from 0.934 to 0.974 for the training and validation sets, respectively.

  • The coefficient of determination (R2) for the regression predictive models ranged from 0.892 to 0.946 and from 0.854 to 0.917 for the training and validation sets, respectively.

  • The GMDH results demonstrate its ability to develop highly accurate predictive models that are easy for reservoir engineers to use.

  • WAG incremental recovery factor is estimated as a function of rock type, WAG process type, reservoir horizontal permeability, API, oil viscosity, reservoir pressure and temperature, and hydrocarbon pore volume of injected gas.

  • This WAG incremental recovery factor prediction model is expected to help reservoir engineers generate WAG incremental production profiles and assess WAG project risk before launching detailed technical studies.

  • The developed WAG incremental recovery factor models apply mainly to the CO2 WAG process.
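The train/validation procedure summarized above (a 70/30 split and R2 reported separately on each set) can be sketched as follows. This is a minimal illustration on synthetic data, not the study's database or models; the three inputs, the target, and the coefficients are all hypothetical.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for the 177-observation WAG pilot database
rng = np.random.default_rng(42)
n = 177
X = rng.uniform(0, 1, (n, 3))                      # hypothetical scaled inputs
y = 5 + 10 * X[:, 0] - 4 * X[:, 1] + rng.normal(0, 0.5, n)

# Random 70/30 split for training and validation
idx = rng.permutation(n)
n_train = int(0.7 * n)
train, valid = idx[:n_train], idx[n_train:]

# Ordinary least-squares regression fitted on the training set only
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)

# R2 reported separately on the training and validation sets
r2_train = r_squared(y[train], A[train] @ coef)
r2_valid = r_squared(y[valid], A[valid] @ coef)
```

Reporting R2 on the held-out validation set, as the study does, guards against mistaking an overfitted training-set fit for genuine predictive skill.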