Introduction

History matching, a process in which certain input parameters to the reservoir simulator such as porosity, permeability, thickness, saturations, depth of oil/water contact, connate water saturation, relative permeability, etc. are altered singly or collectively to obtain a match between simulator prediction values and observed historic data relating to flow rates of oil, gas, water, pressures, GOR (gas–oil ratio), WOR (water–oil ratio), and their variations as a function of time. The spatial inhomogeneity and anisotropic nature of the reservoir rocks result in very large dimensionality of the reservoir model which make this task complex. Reservoir history matching is considered to be an inverse problem, where one seeks to back calculate the system parameters from a given system output. In the normal modeling exercise, the system parameters are known, and our aim is to develop appropriate relationships so as to be able to predict the system performance. In history matching, the reservoir production data are available, but the reservoir static parameters (permeabilities and porosities) are unknown which need to be estimated. The spatial variation of these properties due to rock heterogeneity makes it an ill-posed problem since a very large number of permeability maps may lead to the same output, where most of these may be unrealistic. There are many stochastic soft computing techniques available to solve this inverse problem. In this study, an evolutionary optimization technique—called the genetic algorithm (GA)—has been employed to solve the history-matching problem. This optimization technique usually involves minimizing the objective function that describes the mismatch between the available field historic data and reservoir simulator response. GA as a stochastic optimization tool outperforms other gradient based methods (steepest descent, Gauss- Newton method, conjugate gradient etc.,) toward reaching a global optimal solution escaping the local optima (Gill 1981; Ouenes 1992; Tamhane et al. 2000; Gomez et al. 2001; Romero and Carter 2001; Schulze-Riegert et al. 2001; Choudhary et al. 2007).

The application of genetic algorithm for reservoir modeling was first introduced by Sen et al. (1995), for generating stochastic permeability distributions from a set of reservoir outcrops and tracer flow data, followed with uncertainty quantification of production forecasts. A modified GA was proposed by Bush and Carter (1996), for estimating parameters such as sand permeability, shale permeability, and fault throw. Their modified GA incorporated a nonstandard binary encoding, modified breeding strategies and niching, and was tested on a vertical cross section of a synthetic PUNQ-S3 reservoir. Another successful application of GA in identifying the heterogeneous reservoir properties by matching the tracer breakthrough profiles using six reservoir parameters was demonstrated by Guerreiro et al. (1998). They tested the proposed method on a heterogeneous quarter of five-spot synthetic reservoir, considering six parameters such as; the geometry of insertion and porosity inside and outside the insertion, for matching tracer breakthrough profile. Soleng (1999), applied steady state GA to condition the petrophysical properties of a reservoir to field observations. He examined the methodology on PUNQ S3 synthetic reservoir of grid dimensions 19 × 28 × 5. The grid block horizontal and vertical permeabilities and porosities were considered as the petrophysical parameters to be estimated, such that the reservoir description was conditioned to field observations (bottom hole pressure, gas/oil ratio, and water cut). A population size of 50 chromosomes was utilized and single point crossover, simple swap mutation and replacement operators were carried out for GA evolutions. An extensive testing of the GA optimizer for reservoir history matching and a comparison with simulated annealing and GA with hill climbing were attempted by Romero and Carter (2001). They tested their techniques on a coarse scale model of synthetic PUNQ- S3 complex reservoir of grid dimensions 12 × 44 × 10 having 11 producers and 6 injectors. 57 pilot points that included 17 well locations and 40 distributed pilot point locations in each of the nine layers (one inactive layer) were used for estimating the grid block permeabilities, porosities and shale volume. The authors concluded that genetic algorithm produced better optimal solutions than the results from simulated annealing and manual history-matching process. They showed the method to be inherently suitable for parallelization and reasonably insensitive to parameters settings used to control GA.

Williams et al. (2004), proposed the concept of top down reservoir modeling (TDRM) in history matching and uncertainty quantification. The approach utilizes genetic algorithm in conjunction with reservoir simulation for TDRM workflow to find reasonable multiple history-matched models. The authors reported that the tool had been successfully implemented in development of 18 oil and gas reservoirs. Automatic history matching, subsurface uncertainty quantification and infill well optimization was attempted by Choudhary et al. (2007) who developed a structured workflow that used evolutionary strategy and genetic algorithm optimization methods for re-evaluation of multiple history-matched models. Ballester and Carter (2007), designed a real-coded non-generational GA optimizer, to run on a cluster of parallel computers, for characterizing a real petroleum reservoir using available production data. Lange (2009), employed an inversion methodology based on GA optimization that was coupled with discrete fracture network (DFN) flow simulator to characterize the fractured reservoir models that are consistent with the well-test data. Han et al. (2011), presented multiobjective optimization using modified GA for history matching of waterflooding projects. Monfared et al. (2012) and Murgante et al. (2012) used GA for automatic history matching by reservoir parameterization for different case studies. While several workers have tried the use of GA for automatic history match in synthetic reservoirs, only a few attempted natural reservoirs which are more complex and hence more difficult to deal with.

A simple GA (SGA) and an adaptive genetic algorithm (AGA) have been employed in the present work. The algorithms and methodology were tested and validated on a synthetic 2D reservoir from 10th SPE Comparison Solution Project (Case Study#1) (Chitralekha et al. 2010; Christie and Blunt 2001). The methodology was subsequently used with a small real 3D, reservoir (Case Study#2). The history-matched model was, then, successfully used for reservoir production forecasting.

Genetic algorithm

The GA or simple genetic algorithm (SGA) utilizes the computer logic to mimic the mechanism of natural selection and natural genetics (Holland 1975). The procedure starts with a set of several initial feasible solutions, called chromosomes. These chromosomes evolve through consecutive iterations called generations based on the principles of natural selection, inheritance, crossover and mutation operations to generate new chromosomes, which have better fitness values as compared to the previous population. The fitness of the chromosomes is analyzed through an objective function called the fitness function that characterizes the performance of individual chromosomes in the search space. The superior the fitness value of a chromosome, greater the chances of it being selected to the next generation. Some parents and the offspring chromosomes may get rejected to maintain a fixed population size during generations. The algorithm converges to the best set of chromosomes after numerous iterations, which is considered as the potential set of solutions to the problem.

Objective function

The objective of history matching is to find those static parameters of the reservoir which minimize the error between field observations and simulator predictions. In reservoir history matching, the objective function (Q) is minimized which is the square of difference between the field historic production data and simulator response. As history matching is a minimization problem, the best model has the lowest numerical value of the objective function. Since the GA code is written for minimization, in the present case, objective function is same as fitness function. In general, the formulation of objective function used to find the optimal history-matched models is expressed as (Romero and Carter 2001):

$$ Q = \sum\limits_{i}^{{n_{p}}} {\sum\limits_{j}^{{n_{w}}} {\sum\limits_{k}^{{n_{t}}} {\left({\frac{{d_{ijk}^{O} - d_{ijk}^{S}}}{{d_{ijk}^{O}}}} \right)}}}^{2}, $$
(1)

where Q denotes the objective function, d O is the observed data such as fluid production and injection rates; bottom hole flowing pressure etc.; d S is the corresponding simulator (CMG®- IMEX™ reservoir simulator) output. The summation indices i, j, and k run over the production data types, number of wells and reported time steps with n p , n w and n t being the corresponding number of samples. The optimization is carried over the static parameters (permeability and porosity) of the reservoir.

The objective function has been formulated based on the objectives of the case study. For the 2D synthetic reservoir history-matching problem (see “History matching of a 2D Synthetic Reservoir (Case Study #1)” section for details), the objective function includes the quarterly oil and gas production data from one producing well. Hence, n p  = 2, n w  = 1 and n t  = 32 quarters over which data are available.

Selection mechanism

Selection or the reproduction operator selects the chromosomes from the population based on their fitness. A popular selection operator called the tournament selection has been applied in this work. The fitness of the solution represented by the objective function is calculated using Eq. 1. The string with a lower fitness value has greater chance of being copied in the mating pool compared to the string with a higher fitness value. Strings with low fitness values may be copied more than once, whereas strings with high values may be left out thus leading to a pool of strings (chromosomes) with improved overall fitness but of the same size of population.

The performance of genetic algorithm is mainly powered by crossover and mutation operators. The crossover operator induces a randomized exchange of genetic material between a pair of chromosomes with an assumption that the good chromosomes produce better ones that are fitter and hence closer to the optimal solution. The crossover operation is carried out with a probability, called crossover probability (P c ) on the chromosomes selected for recombination. The optimal values for P c reported in literature ranges between 0.5 and 1.00 for SGA, which is usually predefined by the user. Some of the established crossover operators are; single point, two point, k-point crossover, uniform crossover etc. Further, the chromosomes are subjected to mutation operation with a probability, called the mutation probability (P m ). Usually, the P m ranges between 0.001 and 0.05 for SGA. During mutation, the genetic material of chromosomes get modified to maintain genetic diversity. The mutation operation helps to recall the lost or uncharted genetic materials into the population, in order to avoid early convergence to local optimal solutions. Swap mutation, arithmetic mutation, jump mutation, uniform mutation and creep mutation are some of the well-known mutation operators. There are several publications that describe various recombination operators (Goldberg 1989, Eiben and Smith 2003, Schmitt 2004, Sivanandam and Deepa 2007, Picek et al. 2012). This process of GA continues with the newly generated offsprings until a termination criterion is satisfied. More detailed description and mathematics of genetic algorithm can be found in the books of Goldberg (1989); Deb (1998) and from other GA literature.

Determining the values of P c and P m is a crucial step, and there are no definite rules for choosing suitable values. In fact, the choice of optimal values for P c and P m depend on the problem under consideration (Srinivas and Patnaik 1994). Various studies detailing the effect of P c and P m on the performance of GA have been attempted (De Jong 1988; Grefenstette 1986; Schaffer and Morishma 1987; Goldberg 1989; Eiben et al. 1999; Herrera and Lozano 2003; Fernández-Prieto et al. 2010), and can serve as guide for choosing optimal values for P c and P m . The k-point crossover and uniform mutation operations were utilized for the present study.

The adaptive genetic algorithm (AGA) is an improved form of simple genetic algorithm in the sense that the crossover and mutation probabilities are no longer kept fixed at pre-assigned values but adjusted by the algorithm at each generation according to the fitness function response of the solution (chromosome). The design of AGA proposed by Srinivas and Patnaik (1994), adaptively tune the crossover and mutation probabilities between the maximum fitness and the average fitness value of the population to the fitness of the individual solution. In their design, the individual solutions with sub-average fitness are completely removed while retaining the high fitness solutions. This leads the algorithm to get stuck at local optimal solutions. Moreover, tuning P c and P m based on individual fitness solutions require large computation time (Wang and Shen 2001). Several researchers; Wang and Shen (2001); Fernández-Prieto et al. (2010); Wang and Tang (2011); Tang (2012) have subsequently improvised the adaptation mechanism leading to more efficient AGA algorithms. In the present work, P c and P m were tuned according to the fitness of whole population during evolution rather than of individual solution fitness (Wang and Tang 2011). The following section describes the AGA methodology employed in this work.

Adaptive crossover operator (P c )

The breeding of two chromosomes from the population based on the crossover rate and chromosome length have been utilized to generate new chromosomes. The newly generated population inherits the properties of their parent population. The formulation of P c is mathematically expressed as (Wang and Tang 2011):

$$ P_{c} = P_{c}^{0} \left[{1\; + \;\xi \frac{{(f_{\rm avg})^{\eta}}}{{(f_{\max} - f_{\min})^{\eta} + (f_{\rm avg})^{\eta}}}} \right], $$
(2)

where \( \xi \) and \( \eta \) are the coefficient factors; \( P_{c}^{0} \) and \( P_{c} \) are the initial crossover probability and adaptive crossover probability. f avg ; f min ; and f max denote the average fitness, minimum fitness, and maximum fitness of the population in each generation, respectively. The adaptive crossover operation implemented in this work is described in the following text.

Let, R be a randomly generated positive number (0 ~ 1), L c be the length of the chromosome, and \( P_{c} \) be the crossover probability for ith generation, \( K_{c} \) is the number of locations in the chromosome that undergo crossover. \( K_{c} \) is computed by multiplying length of the chromosome; L c by the crossover probability; \( P_{c} \) for the corresponding i th generation. This is mathematically represented as

$$ K_{c} = L_{c} \times P_{c} $$
(3)

Another random number between (0, L c  − 1) is generated K c times to find cross-site randomly. Crossover is performed at any Kth location, if R 1 and R 2 > R, where R 1 and R 2 are two more random numbers (0–1) corresponding to the Kth location. For example, if L c  = 100; P c  = 0.5, then the corresponding number of crossover locations, K c  = 50. Figure 1 illustrates an example of crossover operation implemented in this study. Let R be 0.4 and R 1 and R 2 as given in the figure corresponding to the locations of crossover for the two strings C 1 and C 2 participating in this operations. Figure 1 shows gene values of only crossover sites. Now check each of these locations, one at a time and effect a crossover of gene values if both R 1 and R 2 are greater than R (=0.4). The first crossover site (from the right) has R 1 = 0.78 and R 2 = 0.41. Since both are greater than 0.4, the crossover takes place, and at 72 and 8 from C 1 and C 2, they are crossed in the new strings N 1 and N 2. This process is repeated for the remaining 49 locations to complete the operation between C 1 and C 2. For the adaptive crossover operation presented here, the number of locations for crossover is controlled by the adaptive crossover probability (P c ) generated at each generation. As the value of P c increases, the number of locations (K c ) for crossover also increases.

Fig. 1
figure 1

Example of adaptive crossover operation

Generation of new chromosomes because of the crossover operation increases the pool size and hence the population is doubled. One simple way to keep the population fixed is to discard all the parents and use only children in the new pool. However, a preferred way is what is known as “elitism” in which the chromosome with lower fitness values are retained, be they from parents or from children and the rest are discarded keeping the population size constant.

Adaptive mutation operator (P m )

The rate at which the chromosomes are subjected to the mutation operation is controlled by the mutation probability (P m ). Mathematically, the calculation of P m at each iteration, is done according to,

$$ P_{m} = P_{m}^{0} \left[{1 + \omega \frac{{(f_{\max} - f_{\min})^{\eta} - (f_{\rm avg})^{\eta}}}{{\psi (f_{\max} - f_{\min})^{\eta} + (f_{\rm avg})^{\eta}}}} \right] $$
(4)

and

$$ \psi = \left({\frac{{(f_{\max} - f_{\min})}}{{f_{\rm avg}}}} \right)^{\eta} $$
(5)

where ω and η are the coefficient factors, P m and \( P_{m}^{0} \) represent the adaptive mutation probability and initial mutation probability, respectively (Wang and Tang 2011).

If L c is the length of the chromosome, P m is the mutation probability at the ith generation and K m is the number of locations in the chromosome that undergo mutation, then K m is calculated as,

$$ K_{m} = L_{c} \times P_{m} $$
(6)

A random number (0, L c  − 1) is generated K m times to locate the mutation sites. Then the mutation operator adds a randomly generated number (0, 1) to gene value at the mutation site of the chromosome. The number of location in the chromosome for mutation increases with the increase in adaptive mutation probability.

The genetic algorithm is terminated at a specified number of generations. Then the quality of population is checked against the problem definition else the process continues until achieving a satisfactory solution.

Workflow of genetic algorithm

Figure 2 shows the workflow of the methodology adopted for history matching. SGA and AGA codes were developed in MATLAB® environment for minimization. The GA code was interfaced with the CMG®-IMEX™ reservoir simulator for forward simulations. The program initializes with a set of initial realizations (population) of reservoir generated from geostatistical software (see next section for details). The fitness values of each of the realizations (set of initial solutions) are calculated using the CMG® simulator along with Eq. 1. The initial population passes through the GA operators (selection, crossover and mutation) to generate new reservoir realizations.

Fig. 2
figure 2

Workflow of genetic algorithm

While carrying out GA operations, it is necessary to recognize that the static parameters at well locations are observed values and hence cannot be allowed to change. This means that the permeabilities and porosities of the grid blocks which coincide with well locations should not take part in crossover and mutation operations. The values of chromosomes at well locations remain constant throughout the procedure. This completes one generation by the algorithm, and the process is repeated until satisfactory realizations are generated, which represent good history-matched models.

Step-by-step calculation procedure of GA used history matching

  1. (1)

    Generate initial population using a geostatistical algorithm. Each chromosome (string or realization) in this population is a full description of the reservoir parameters.

  2. (2)

    For each chromosome, the fitness function value is calculated using Eq. 1 which implies using the simulator to find the calculated values of the production terms and then comparing these with the observed values.

  3. (3)

    Use genetic operators: (a) Selection operator is used to create mating pool, keeping the population constant. (b) Crossover operator is used to modify the population with probability, P c , which means not all chromosomes in the pool undergo crossover. (c) Mutation operator is used to keep the population diversified with probability, P m .

  4. (4)

    This completes one generation (iteration). At the end of the first generation, we still have the same population size as in the initial population but with several of the chromosomes having been changed.

  5. (5)

    Repeat the above from Step (2) onward until convergence is reached.

Initial population generation

An approximate set of feasible solutions—called the initial population—are utilized to initialize the genetic algorithm. Conventionally, initial population is generated randomly, taking care that the values of all variables are within the bounds prescribed by the problem description. However, in history-matching context, the population represents reservoir realizations which contain the reservoir rock properties such as permeability and porosity etc. In order to generate initial population or the initial realizations, geostatistical methods (Deutsch and Journel 1998; Deutsch 2002) are used. Several authors have reported the use of geostatistical methods in generating initial realizations that represent prior knowledge of the distribution of static variables (permeability and porosity). These realizations are conditioned to honor the spatial correlation and variogram of the reservoir properties. A geostatistical method called, stochastic conditional simulation is used to generate the multiple equally probable descriptions of reservoir parameters. The method is stochastic and conditional as the reservoir properties are generated by hybrid method which is partially deterministic and partially random. The generated reservoir realizations honor the observations at the well locations (Romero and Carter 2001). In the present case study, the initial realizations were generated using GSLIB’s VSIM and SGeMS geostatistical software packages using the ‘mGstat’ interface of MATLAB®.

Inputs to the CMG® simulator

Reservoir simulation of multiple phase flow requires a geologic reservoir model which consists of complete description of the geology, rock properties, fluid properties, rock-fluid interactions etc., before it can calculate flow rates of each fluid and pressure drop. The CMG® simulator requires the following: structure of the reservoir, area, shoaliness, gross thickness, reservoir rock properties (geostatistical properties) such as porosity and permeability distribution maps etc., fluid model (PVT properties), rock-fluid model (relative permeability, saturation), fluid–fluid contact, faulting, aquifer properties, location, etc.

Geologic model

A proper geologic framework should be planned before building a reservoir simulation model. The framework consists of reservoir rock gross-thickness map which establishes the reservoir’s bulk volume, structure maps that provide the orientation and extension of sedimentary bodies, net-pay thicknesses, depth of fluid contacts, values of porosity and permeability obtained from core analysis, pressure-transient testing, etc. The distributions of porosity and permeability map are generated by geostatistical software package that incorporates core log and 3D seismic data in a consistent and realistic manner.

Grid selection

The reservoir under investigation is divided into grid blocks for ease of computations using numerical integration of flow equations embedded in the CMG® software. These grid blocks can be two- or three-dimensional, and grid types can be of variable depths and thicknesses: Cartesian, radial, or cylindrical, orthogonal corner point, and non-orthogonal corner point grid types depending on the reservoir. The size of each grid block depends on the size of the reservoir. A larger number of grid blocks (or smaller size of each grid block) make the algorithm to be slower with the increasing computational load. On the other hand, if the physical size of each grid block is too large, then the results become less accurate since it is assumed that throughout a single grid block, permeability and porosity (static parameters) are uniform. Grid selection is, therefore, problem dependent, and will, therefore, be discussed separately for the case studies investigated.

Faulting

An important factor influencing the reservoir behavior is the distribution of faults within the reservoir. The fault affects the petrophysical properties of the fault rock and modify the connectivity in sedimentologic flow units. The location of the fault in the reservoir is obtained from geologic analysis. The effects of fault transmissibility, such as sealing or nonsealing fault, must be inferred from special pressure testing (pulse and interference testing), analysis of production data, and field pressure survey.

History matching of a 2D synthetic reservoir (case study #1)

Before embarking on a meaningful real-field problem, it is important to validate the GA code, the problem formulation for history matching and the proposed scheme, and methodology of history matching. For this purpose, a 2D synthetic reservoir was chosen (Chitralekha et al. 2010) (see Fig. 3). The authors used Ensemble Kalman Filter for history matching of this reservoir. Since the synthetic reservoir construction started with known permeability distribution, the problem suited our purpose well.

Fig. 3
figure 3

2D heterogeneous black-oil reservoir with 2 producing wells and 1 injector well

The 2D synthetic reservoir under study

The synthetic reservoir presented here was taken from Chitralekha et al. (2010), which is a modification of the 10th SPE Comparative Solution project (Christie and Blunt 2001). The synthetic black-oil reservoir is a simple 2-phase, 2D model consisting of 20 layers discretized in a Cartesian coordinate system. The phases present in the reservoir are oil and gas. The reservoir is considered to be fully saturated by oil initially (no connate water). There is no fault presence in the reservoir. There are two producers (Well-1 and Well-2) that are placed symmetrically on either side of 1st injector (I-1), which is located at the center of the reservoir [grid block, (50,1)]. Wells, Well-1; Well-2, and I-1, are perforated through all the 20 layers of the model reservoir. Figure 3 shows the sectional view of the 2D, 2-phase, heterogeneous black-oil reservoir model. The reservoir has a constant porosity of 0.2 throughout all the layers with varying permeabilities in i direction. The permeabilities in j and k directions are set equal to permeability values in i direction. In addition, two core holes are considered to be drilled vertically through all layers and are located at grid block locations (25, 1) and (75, 1). The permeability values at the wells and core hole locations are assumed to be known. The problem is to find the permeability in the reminder of the 2000 grid blocks so as to match the known fluid production history.

Selection of GA parameters

The genetic operators for SGA are the tournament selection as the selection operator. A k-point crossover operator with P c  = 0.5 and a uniform mutation operator with P m  = 0.001 and 0.005 were used as the other GA operators.

The AGA methodology used the same three operators as used with SGA. However, the crossover and mutation probabilities were updated in every generation. Initial crossover probability, \( P_{c}^{0} = 0.5 \) and initial mutation probability, \( P_{m}^{0} = 0.001 \) and 0.005 were used in conjunction with Eqs. 2 and 4, and 5 to find P c and P m . The coefficient factor values were preassigned as ξ = 0.02, ω = 0.02, and η = 0.05.

Input to CMG® simulator: reservoir properties

•Initial reservoir Pressure

100 psia

•Datum Depth

0.0 ft

•Porosity

0.2

PVT properties

The pressure–volume–temperature data for the synthetic reservoir are given below. The fluids are assumed to be incompressible and immiscible.

•Initial reservoir pressure

100 psia

•Oil density

43.68 lb/ft3

•Gas density

0.0624 lb/ft3

•Oil viscosity

1 cp

•Gas viscosity

0.01 cp

Grid selection and initial population generation

A 2D grid, 100 × 20 was imposed on the reservoir which divided it into 2000 grid blocks, each measuring 25ft × 25ft. The porosity was constant throughout the reservoir. This meant that history-matching exercise required to estimate only permeability for each of the grid blocks, given the oil and gas production history. Clearly, GA formulation will lead to chromosomes of string length 2000, each element representing unknown permeability of each grid block with the exception that at the well location, the permeability is known and must not be allowed to change during GA operations. A population size of 40 was chosen, and hence, 40 initial realizations were generated using conditional direct sequential simulation in VISIM geostatistical software. Figure 4 shows a few typical initial realizations.

Fig. 4
figure 4

2D view of few initial permeability realizations of the model reservoir

Objective function

The objective function aims to minimize the mismatch between the quarterly oil production (bbl/day) and gas production (ft3/day) from Well-1 and the simulator output. The objective function, Q for this case becomes

$$ Q = \sum\limits_{k = 1}^{32} {\left( {\frac{{d_{k,{\rm oil}}^{O} - d_{k,{\rm oil}}^{S} }}{{d_{k,{\rm oil}}^{O} }}} \right)_{Well - 1}^{2} } + \sum\limits_{k = 1}^{32} {\left( {\frac{{d_{k,{\rm gas}}^{O} - d_{k,{\rm gas}}^{S} }}{{d_{k,{\rm gas}}^{O} }}} \right)_{Well - 1}^{2} } $$

where \( d_{{k,{\rm oil}}}^{O} \) and \( d_{{k,{\rm gas}}}^{O} \) are the quarterly observed oil and gas production data, respectively; \( d_{{k,{\rm oil}}}^{S} \) and \( d_{{k,{\rm gas}}}^{S} \) denote the corresponding simulator predictions; and k is the time period that represents 8 years’ (or 32 quarters) production history.

Results and discussion (case study #1)

Table 1 shows the objective function values of the initial realizations of the reservoir. The quarterly oil and gas productions from Well-1 from 40 initial realizations of the reservoir (before history matching) are shown in Fig. 5. Included in this figure are the observed production histories for comparison.

Table 1 Value of objective function (Q) for the initial realizations of the reservoir
Fig. 5
figure 5

Quarterly production data from Well-1 for 40 initial realizations: a oil production data and b gas production data

Results from SGA (case study #1.a)

The algorithm has been tested for history matching using crossover probability of 0.5 and mutation probability of 0.001 and 0.005. Figure 6 shows box-and-whisker plots at every 50th iteration illustrating how the value of the fitness function decreased with the increasing number of iterations. In a box-and-whisker plot, the bottom and the top of the box are the first and the third quartiles, respectively, the band inside the box is the median (or the second quartile), and the ends of the whiskers are the minimum and the maximum of all the data.

Fig. 6
figure 6

Objective function value versus iteration number (every 50 iterations) for SGA a \( P_{c}^{0} = 0.5 \) and \( P_{m}^{0} = 0.001 \) and b \( P_{c}^{0} = 0.5 \) and \( P_{m}^{0} = 0.005 \)

The objective function value of initial realizations reported in Table 1 are: Q min  = 1.77, and Q max  = 68.12. The SGA with crossover probability; P c  = 0.5 and mutation probability P m  = 0.001 produced the objective function values; Q min  = 0.83 and Q max  = 23.36 at 400th iteration. The SGA with P c  = 0.5 and P m  = 0.005, resulted in Q min  = 0.69 and Q max  = 17.42 after 400 iterations (Fig. 6a, b). From these plots, it is clear that the initial permeability realizations of the reservoir must be moving toward the real map as the number of iterations increases. The better history-matched models have lower objective function values.

Figure 7 shows history match for 10 best realizations. Compared with Fig. 5, it is clear that the deviations from the true values have been reduced significantly. As seen in this figure, for these best 10 realizations, the data of the calculated production rated for both oil and gas compare favorably with the observations.

Fig. 7
figure 7

History match for 10 best realizations resulting from SGA with \( P_{c}^{0} = 0.5 \) and \( P_{m}^{0} = 0.005 \) a quarterly oil production rate (bbl/day) and b quarterly gas production rate (ft3/day)

Results from AGA (case study #1.b)

The adaptive genetic algorithm was subsequently used for the history matching of the same 2D synthetic reservoir. For AGA, the following values of the genetic operators were used: Population size 40, initial crossover probability, \( P_{c}^{0} = 0.5 \) and initial mutation probability, \( P_{m}^{0} = 0.001 \) and 0.005. Equations 2, 4, and 5 were used to calculate P c and P m for subsequent iterations. The values of coefficient factors ξ(=0.02), ω(=0.02) and η(=0.05) required in these equations were chosen through experiments. Figure 8 shows how the crossover and mutation probabilities evolve over the generations for the two cases.

Fig. 8
figure 8

Adaptive crossover and adaptive mutation probability versus iteration for a \( P_{c}^{0} = 0.5 \) and \( P_{m}^{0} = 0.001 \) ; b \( P_{c}^{0} = 0.5 \) and \( P_{m}^{0} = 0.005 \)

Figure 9a, b shows box-and-whisker plots for the AGA with \( P_{c}^{0} = 0.5 \) and \( P_{m}^{0} = 0.001 \); and \( P_{c}^{0} = 0.5 \) and \( P_{m}^{0} = 0.005 \) as the initial probability values. AGA with \( P_{m}^{0} = 0.005 \) generated better reservoir realizations than that with \( P_{m}^{0} = 0.001 \) as shown in this figure. The objective function values (Q min  = 1.77 and Q max  = 68.13) of initial realizations converged to Q min  = 0.616, and Q max  = 15.706 at 174th iteration for \( P_{m}^{0} = 0.001 \) and to Q min  = 0.502, and Q max  = 12.22 at 172th iteration for \( P_{m}^{0} = 0.005 \), respectively. An increase in the initial adaptive mutation probability (\( P_{m}^{0} = 0.005 \)) enhanced the convergence rate and produced better results in fewer iterations compared to AGA with initial mutation probability; \( P_{m}^{0} = 0.001 \). Figure 10 shows the comparison of the best realization with observed production history. The match is very good.

Fig. 9
figure 9

Objective function values versus iteration number for AGA a \( P_{c}^{0} = 0.5 \) and \( P_{m}^{0} = 0.001 \); b \( P_{c}^{0} = 0.5 \) and \( P_{m}^{0} = 0.005 \)

Fig. 10
figure 10

History match for the best reservoir realization resulting from AGA with \( P_{c}^{0} = 0.5 \) and \( P_{m}^{0} = 0.005 \) a quarterly oil production (bbl/day); b quarterly gas production (ft3/day)

The permeability distribution of 2D heterogeneous reservoir obtained from the history-matched model which is conditioned to quarterly oil and gas production data acquired from the AGA with \( P_{c}^{0} = 0.5 \) and \( P_{m}^{0} = 0.005 \) is shown in Fig. 11b. Figure 11a shows the true permeability map and has been juxtaposed for comparison.

Fig. 11
figure 11

Permeability distribution of 2D synthetic reservoir: a True reservoir map (10th SPE comparative project). b Realization of best history-matched reservoir from AGA

Comparison between SGA and AGA: A comparison between the results from SGA and AGA shows that AGA was able to converge to optimal reservoir realizations much faster than the SGA. Table 2 demonstrates a comparison between the results obtained from SGA and AGA and in terms of minimum objective function values at different iterations. It is observed from this table that the AGA evolved to optimal solution in fewer number of iterations compared to SGA. This is due to the fact that the adaptive capability of the genetic operators adjusts the crossover and mutation probability according to the objective function value of the realization generated at each iteration. However, SGA may also result in equally good realization, if the algorithm evolves for more number of iterations or optimized values of crossover and mutation probability are used.

Table 2 Comparison of results from SGA and AGA

History matching using GA methodology has been successfully validated for 2D synthetic reservoir. The history match for oil and gas production from Well-1 obtained through GA technique shows equally good match as presented by Chitralekha et al. (2010) using Ensemble Kalman Filter for history matching of the same reservoir. The permeability map generated by AGA showed similarity to true permeability map hidden from algorithm. The permeability distribution map would have perhaps replicated the same if the history match for Well-2 was included in the objective function calculation. Since the objective of the synthetic case study was to validate the GA code and methodology developed for history matching, the study was restricted to match the history for productions from Well-1 only.

History matching of a real 3D reservoir (case study #2)

The oil field is located in the south-western part of Cambay Basin and to the west of Cambay Gas Field in Gujarat, India. The field was discovered in July 1999. The field consists of a total of eight oil-producing wells. The oil-producing sandstone has varying thickness up to 25 m and the sandstone is divided into three layers; Layer-1, Layer-2 and Layer-3.The sandstone layers are separated by thin shales that vary 1 to 2 m in thickness. The structure of the field trends NNW-SSE in direction and is bounded by a fault on either side, which separates the structure from the adjoining lows. The reservoir structure is controlled by East–West trending normal fault in the north, and it narrows down toward south. The fault surrounding the reservoir is noncommunicating and hence it is assumed that there is no hydrodynamic flow between the reservoir and the remaining area.

The initial reservoir pressure was recorded as 144 kg/cm2 at 1397 m. The quantity of reserved oil in-place was 2.9MMm3, and the cumulative oil production until September 2009 was 0.85MMm3 which is 29.1 % of the in-place reserve and 64.5 % of ultimate reserve. The marginal drop in reservoir pressure against cumulative oil production of 0.85MMm3 indicates that the reservoir is operating under active water drive. The presence of two aquifers toward the N-W side and toward the narrow region of the reservoir in Layer-3 has been reported. Most of the wells are producing gas–oil ratio (GOR) in the range of 30–35 volume/volume as producing wells are flowing above the bubble point pressure. Hence, the model shows well producing constant GOR values. The grid bottom structure 3D real reservoir is shown in Fig. 12.

Fig. 12
figure 12

3D view of grid bottom structure of real reservoir

The field started producing through the wells Well-1 and Well-2 from February 2000 and December 2000, respectively. The initial reservoir pressure recorded at Well-1 was 144.6 kg/cm2 at 1385 m. The cumulative productions of oil, gas, and water from Well-1 till September 2009 are 0.176 MMm3, 8.1MMm3, and 7.2 MMm3, respectively. Subsequently, the other wells (Well-3 to Well-8) were drilled and put on production in different years until 2009. The producing wells; Well- 3 and Well- 5 are perforated in Layer- 1 and Layer- 2; while Well- 1; Well-2; Well- 4; and Well- 6 are perforated through Layer-2 and Layer-3.

The case studies carried out here consider six oil-producing wells (Well-1 to Well-6) from the total of 8 oil-producing wells. The historic production is available for a period of 9 years. For the case studies, 70 months’ historic productions for the period of 2000–2005 were used for history matching using GA methodology and remaining data till 2009 were used for validating the model and the technique. Well-7 and Well-8 were put on production in January 2009.

Inputs to CMG®-Builder™ suit

The reservoir model is constructed by amalgamating many parameters such as petrophysical data, geologic structure (structural contour map, pay-sand thickness map etc.,), grid definition (size and type), PVT properties, reservoir fluid properties, well completion data, initial conditions etc.,. The reservoir rock, fluid, PVT parameters and initial conditions used to built a reservoir model through CMG®-Builder™ are produced in Tables 3 and 4. The measured permeability values at the well locations are given in Table 5. The relative permeability data have been generated using Corey’s correlation.

Table 3 Reservoir model parameters
Table 4 Reservoir PVT properties
Table 5 Permeability (k) values at well locations

The reservoir model consists of 3 layers and 6 oil-producing wells. The three layers have different porosities but remain constant within each layer. Layer-1 of the reservoir has a homogeneous permeability of 300 mD, whereas Layer-2 and Layer-3 have heterogeneous permeability distributions.

Grid selection

For the numerical integration of flow equations using finite difference method, the CMG® simulator uses a 50 m × 50 m size block grid on the reservoir which for the present case will result in 100 × 120 × 3 grid blocks. However, in the present study, a coarse scale grid was used to limit the dimensionality of the GA variables, and hence a 100 m × 100 m size was used for each block resulting in 50 × 60 × 3 grid blocks.

For the real reservoir shown in Fig. 13, region shown in light shade represents the active grid blocks, while region in dark shade denotes the inactive grid blocks. In the figure, the grid blocks are located inside the active region, and those shown in red represent the well locations. The genetic operators are programmed such that it operates on the region in light shade only.

Fig. 13
figure 13

Graphic view of active (light shade) and inactive grid blocks (dark shade) of real reservoir for a Layer-2 and b Layer-3

The Layer-1 of the reservoir has homogeneous permeability distribution of 300 mD for all the grid blocks. The objective of the present study was to estimate the active grid block permeability distributions in Layer-2 and Layer-3, since both the layers are highly heterogeneous.

Generation of initial population

The initial population was generated using geostatistical toolbox of MATLAB®, mGstat, which is interfaced to the SGeMS (geostatistical modeling software by GSLIB). The sequential Gaussian simulation (SGSIM) method was employed for generating initial realizations which honor the spatial variations and histograms of the real reservoir. The sequential Gaussian simulation determines each distribution of petrophysical properties under a multivariate Gaussian model. A Gaussian variogram model having correlation range of 20 grid blocks and with a sill value of 1 were used to estimate the permeability of each grid block in the realizations. The population size of 30 was chosen, and hence a set of 30 initial realizations representing the permeability distributions was generated using Gaussian simulations that honor the permeability values at the well locations in the reservoir. Figure 14 shows some of the initial permeability distributions generated by SGSIM.

Fig. 14
figure 14

3D view of few initial realizations generated using SGeMS for Layer-2 and Layer-3 of the reservoir

Selection of GA parameters

For SGA and AGA, the tournament selection operator was employed for selecting the fittest members from the population to the mating pool. In case of SGA, a k-point crossover and uniform mutation operator were used as the other genetic operators with crossover probability; P c  = 0.5 and mutation probability; P m  = 0.005. In case of AGA, the same operators were used except with initial crossover probability, \( P_{c}^{0} = 0.5 \) and initial mutation probability \( P_{m}^{0} = 0.005 \). The coefficient factors: ξ, ω, and η were assigned the same values as in the Case Study #1.

Objectives function

For this reservoir, porosity values were considered to be reliably established for each sand layer by the field geologists. The other most sensitive parameter, the field permeability distribution that has significant impact on field performance, was the only control variable. The GA procedure updates the initial solutions of permeability distributions called the initial realizations through generations to achieve a match between the field observations and the simulator output in terms of oil production rates, gas–oil ratio (GOR), water cut (WC), and bottom hole flowing pressure (BHP). There could be other uncertain parameters such as transmissibility, connate water saturation, depth of water–oil contact (DWOC) and aquifer properties which are sensitive to field observations. These uncertain parameters were not included in the prior information used for parameter estimation because of the computational constraints. However, some of these were adjusted manually. For example, aquifer properties will affect only those grid blocks which are located on the boundary between the aquifer and the reservoir, and it is much easier to adjust these manually rather than including these as control variables in the entire grid block population.

The objective function is formulated based on Eq. 1 taking into account the type of field observations, number of wells, time period, etc. In this case study, the field data comprise oil production rate, GOR, WC, and BHP from all six producing wells over a period of nearly 6 years (70 months) of production history (Mar 2000 to Dec 2005). Hence, the objective function Q for this case becomes

$$ \begin{aligned} Q & = \sum\limits_{w = 1}^{6} {\sum\limits_{k = 1}^{70} {\left({\frac{{d_{{ijk,{\rm oil}}}^{O} - d_{{ijk,{\rm oil}}}^{S}}}{{d_{{ijk,{\rm oil}}}^{O}}}} \right)}}^{2} + \left({\frac{{d_{{ijk\;{\rm GOR}}}^{O} - d_{{ijk\;{\rm GOR}}}^{S}}}{{d_{{ijk\;{\rm GOR}}}^{O}}}} \right)^{2} + \left({\frac{{d_{{ijk\;{\rm WC}}}^{O} - d_{{ijk\;{\rm WC}}}^{S}}}{{d_{{ijk\;{\rm WC}}}^{O}}}} \right)^{2} \\ & \quad + \left({\frac{{d_{ijk\;BHP}^{O} - d_{{ijk\;{\rm BHP}}}^{S}}}{{d_{{ijk\;{\rm BHP}}}^{O}}}} \right)^{2} \\ \end{aligned} $$

where subscripts w and k denote the number of wells and time period, respectively; \( d_{ijk}^{0} \) and \( d_{ijk}^{S} \) are the field observations and the corresponding CMG® simulator outputs in terms of monthly oil production rate, GOR, WC, and BHP. Q was minimized using GA, and the search was terminated when successive iterations produced essentially same values of the objective function.

Results and discussion for case study #2

The minimum and the maximum objective function values of the 30 initial realizations, approximating the field permeability distributions, range between 24.58 and 68.19 with the median, Qmed at 28.43 and the average value of Q avg being 35.096. The oil production rates (m3/day), water cut-%, GOR (m3/m3) and BHP (kg/cm2) for the entire field resulted from the initial realizations is shown in Fig. 15. Also included in this figure are field observations for comparison. As seen in Fig. 15a, c the oil production rate and GOR appear to match well for all the 30 initial guesses of the permeability distributions but water cut and bottom hole pressures show significant variations. This is due to the fact that the reservoir is producing under strong water drive mechanism provided by the two aquifers, which maintain near constant reservoir pressure for oil and gas productions, and there is no free gas cap.

Fig. 15
figure 15

Comparison between the field observations and the simulator output generated from 30 initial realizations. a Oil production rate SC (m3/day). b Water cut SC-  % (c) GOR (m3/m3) (d) BHP (kg/cm2)

Results from SGA (case study #2.a)

The SGA search was terminated after 240 iterations which resulted in an average value, Q avg  = 25.67; median value, Qmed = 23.54; and minimum value of Q min  = 19.98 (range 19.98–54.34). The objective function values resulting from SGA do not appear to be very small compared to the initial realizations Q values (see box-and-whisker plot in Fig. 16a). However, the WC and BHP showed better match with the field data (Fig. 17).

Fig. 16
figure 16

Objective functions versus number of iterations resulted from a SGA and b AGA

Fig. 17
figure 17

Comparison between the field observations and the simulator output after 240 iterations for best realizations using SGA. a Water cut SC- %. b BHP (kg/cm2)

Results from AGA (case study #2.b)

The minimum and the maximum objective function values after 120 iterations are Q min  = 19.606 and Q max  = 40.018 with the median value, Qmed = 19.98 and average value, Q avg  = 25.515. As mentioned earlier, the high values of the objective function is due to large error in predictions of water cut. The objective function values of the realizations resulting from AGA after every 20 iterations are shown in Fig. 16b. As seen in this figure, Q decreased rapidly up to 20 iterations and then gradually to the final value.

Figure 18 shows the variation of crossover and mutation probabilities with every 20 iterations. After 120 iterations, the values of the probabilities were P c  = 0.767 and P m  = 0.0077 A comparison of AGA results to those of SGA clearly establishes the superiority of AGA over SGA. The converged range and average values of the objective functions in case of AGA are lower than the corresponding numbers for SGA, achieved in half the number of iterations. Figure 19 shows the final average permeability distribution after AGA optimization (Fig. 19c for Layer-2, Fig. 19d for Layer-3). Also included in the figure are average initial permeability distributions for comparison (Fig. 19a for Layer-2 and Fig. 19b for Layer-3).

Fig. 18
figure 18

Adaptive crossover and mutation probabilities versus iteration number for \( P_{c}^{0} = 0.5 \) and \( P_{m}^{0} = 0.005 \)

Fig. 19
figure 19

Permeability distribution: a, b average initial distributions in Layers 2 and 3; c, d final distribution resulting from AGA in Layers 2 and 3

The simulator predictions using the best permeability map obtained from the application of AGA are compared with the field data in Fig. 20 for the period between March 2000 and December 2005. As seen in this figure, the oil production and GOR continue to show good match. The initial high values of GOR in the first year cannot be predicted from the model for reasons not well understood. It is, however, possible and suspected that the calculations of PVT properties may be in error which was fixed at a later date. The water cut match is also reasonable barring some period around 2003. The reason for the mismatch during this period is not clear and must perhaps be related to some unusual event. The pressure data also show a sudden dip around the same time. Also a course grid of 100 m × 100 m was used in the present case, but one can expect better match if a finer grid of say, 25 m × 25 m or a normal grid of 50 m × 50 m was used. This was not attempted since that would have increased GA variables to 16 or 4 times, making simulation calculation very lengthy. Usually, it is difficult to match everything over the entire time period owing to inhomogeneities and structural complexities of actual reservoirs, no matter which history-matching technique is used. The bottom hole pressure, however, shows a much better match in the entire range, validating the history-matching procedure developed in this study.

Fig. 20
figure 20

Production forecast of using history-matched model from 6 wells (2006–2008) and 8 wells (2009) a oil production rate (m3/day), b water cut -%, c GOR (m3/m3) and d BHP (kg/cm2)

Validation of the reservoir model

The history-matched reservoir permeability map (Fig. 19c, d) was used to predict the reservoir performance over the next 3 years (January, 2006 to December, 2008). The model predicted values were compared with field data available for this period but not used for model development (history matching). These comparisons are also shown in Fig. 20. A very good match, during 2006–2008, between simulator results and field data lends support to the technique of extracting reservoir properties using GA optimization.

Two new wells (Well-7 and Well-8) were drilled in 2009, and their locations are marked in Fig. 12. The production from these wells was included in the cumulative production data (from all the 8 wells) for the period January–September, 2009. For this period, the validated model was used to predict the productions profile, and Fig. 20 includes these comparisons for the said period. This further confirms that GA is a reliable history-matching optimization technique and is capable of future predictions as well as field development by way of drilling new wells.

Figure 21 shows a match between model predictions and field production data including bottom hole flowing pressure for individual wells. While calculations were made for all the eight wells for the entire period from 2000 to 2009, results for only three wells (Well-3, Well-6, and well-8) are included in this figure. For Well-7 and Well-8, the field data are available for a few months only. For these two wells, the bottom hole pressure (BHP) predictions were made for the entire duration between 2000 and 2008, which simply means what the pressure profile would have been if these wells existed at these locations. Only Well-8 has been included in Fig. 21.

Fig. 21
figure 21

History match and production forecast of individual wells in terms of a oil production rate, b GOR, c WC, and d BHP. (1) Well-3 (2) Well-6 (3) Well-8

Conclusions

The successful application of genetic algorithm in extracting a realistic permeability map of a 2D synthetic reservoir showed the technique as a promising optimization tool toward automatic history matching. The history-matched model when used with CMG flow simulator was able to predict production of oil and gas which was in good agreement with field measurements. The results were comparable to those reported by Chitralekha et al. (2010) for the same 2D reservoir using Ensemble Kalman Filtering technique. Adaptive genetic algorithm (AGA), in which crossover and mutation probabilities are dynamically adjusted according to the population fitness through generations, outperformed simple GA (SGA). AGA required less than half the iterations and resulted in smaller fitness function values compared to SGA. This validated history-matching methodology using GA as optimization tool was then applied to a real 3D petroleum reservoir. The results showed good match for oil production rate, gas–oil ratio (GOR), bottom hole flowing pressure (BHP), and reasonable match for water cut (WC). The WC mismatch during the period around 2003 and initial high value for GOR production may be due to unusual events and perhaps error in the PVT calculations. The coarse grid size, with each block measuring 100 m × 100 m, used in the present investigation may have contributed to higher error in certain wells. AGA was found to be more efficient and accurate compared to SGA for the real 3D reservoir also. Successful match of historic production of oil, water, and gas and satisfactory future predictions from existing and new wells drilled at later date in the reservoir established the power and efficacy of the technique. While only permeability was included in the present study, the technique can easily be extended to include other parameters in the search vector to make it a general tool for more complex reservoirs.