Introduction

In conventional formations, the resistivity ratio of the oil layer to the water layer is usually higher than 3. In some unconventional formations, however, this ratio can be less than 2; such formations are usually defined as low-resistivity pay zones. With the development of subtle-reservoir exploration, low-resistivity pay zones with considerable reserves have been found in oil fields all over the world (Oyang 2009; Boyd et al. 1995), and these reserves have attracted a great deal of attention. However, the capability to identify low-resistivity pay zones is still very limited because of the unconventional logging responses of such zones (Poerboyo and Suharya 2014). Therefore, quick and accurate identification of low-resistivity pay zones using advanced mathematical methods is of significant importance.

The backpropagation (BP) algorithm is a classical artificial neural network algorithm that uses performance learning as its training rule. Owing to its self-learning ability, it has great adaptability and fault tolerance (Hagan et al. 2002), especially when dealing with nonlinear problems such as image recognition and pattern classification. It has therefore been introduced to solve problems in petroleum engineering (Al-Kaabi and John Lee 1993; Farshad et al. 2000; Shokir 2004). For example, Shokir (2004) successfully applied the BP algorithm to predict hydrocarbon saturation in low-resistivity formations.

The evaluation of oil and gas pay zones relies on qualitative and quantitative interpretation of the logging data, which involves analyzing various logging data from various formations. Identification of low-resistivity pay zones is therefore a problem of mining and processing large amounts of strongly nonlinear data. Because the logging response of low-resistivity pay zones is concealed, it is critical to filter out the interference of electrical, lithologic and physical properties on pay zone evaluation. The BP algorithm is a workable method for identifying low-resistivity pay zones. In this paper, the momentum backpropagation algorithm (MOBP), an improved version of the traditional BP method, was used for logging data processing. In MOBP, momentum is introduced into the traditional BP algorithm to speed up convergence without compromising its capability to solve nonlinear problems.

Logging response of low-resistivity pay zones

Logging curves record various physical parameters of the drilled formation, and every single curve reflects one geological feature indirectly and conditionally (Ding 2002). An integrated study of several types of logging data can provide the physical properties and fluid flow characteristics of underground formations. Hydrocarbon zones can then be identified and evaluated according to the unique characteristics of the formations.

The conductivity of formation water is much higher than that of hydrocarbons. By analyzing the resistivity, conventional hydrocarbon zones can be accurately identified and quantitatively evaluated using existing theoretical and/or empirical models (Yong and Zhang 2002). However, the electrical characteristics of low-resistivity pay zones are far less apparent compared with conventional formations. Therefore, a comprehensive analysis of the causes of low-resistivity response and its associated logging characteristics is necessary for identifying low-resistivity pay zones. It should be pointed out that although some new techniques, e.g., nuclear magnetic resonance (NMR) imaging, can accurately recognize low-resistivity pay zones (Guru et al. 2005; Murphy and Owens 1972), the main objective of the model presented in this paper is to improve utilization of the commonly available traditional logging data. The most common logging data include spontaneous potential (SP), gamma ray (GR), dual lateral resistivity (LLD and LLS), compensated neutron porosity (CNL), compensated formation density (FDC) and acoustic log (AC). Causes of low-resistivity pay zones and their associated logging characteristics will be discussed in the following.

When the bound water saturation in a formation is relatively high, the formation will show abnormally low resistivity even if it is a pay zone (Cheng 2008; Oifoghe 2014). High bound water saturation is usually encountered in formations with fine rock grains, high clay contents and/or complicated pore structures. In addition, when the clay in the pay zone is primarily montmorillonite, its high cation exchange capacity further lowers the resistivity of the zone. Therefore, when logging curves show low resistivity, high GR, small SP and/or a large separation between the neutron and density curves, a low-resistivity pay zone with high bound water saturation can be identified, as illustrated in Fig. 1.

Fig. 1 Typical characteristics of logging curves of low-resistivity pay zone with high bound water saturation (Cheng 2008)

In another scenario, when thin shale layers are sandwiched in the pay zone, their influence leads to a measured resistivity far below the actual value. This is usually caused by the limited precision of the logging tools. When logging curves show low resistivity, overall high and fluctuating GR, low SP, a large separation between the neutron and density curves, and/or a jagged Microspherical Focused Log (MSFL) close to the dual lateral resistivity, a low-resistivity pay zone with shale sandwiches can be identified, as illustrated in Fig. 2.

Fig. 2 Typical characteristics of logging curves of low-resistivity pay zone with shale sandwiches (Cheng 2008)

Besides high bound water saturation and shale sandwiches, the causes of low resistivity also include salinity difference between oil and water, invasion of brackish drilling mud, coexistence of hydrocarbon and water in the carbonate transition zones (Griffiths et al. 2006), and existence of conductive minerals, such as pyrite and siderite (Evdokimova 2013; Hamada and Al-awad 2000). In addition to the common low-resistivity signature, each cause has its own combination of logging characteristics. The identification of a low-resistivity pay zone is a process of abstracting, organizing and matching useful logging information, and the MOBP neural network used in this paper is well suited to this kind of information identification. The development of the low-resistivity pay zone identification model using the MOBP neural network is described in detail in the following section.

MOBP neural network

As mentioned above, the MOBP neural network was used to develop the identification model for low-resistivity pay zones in this paper. The sigmoid function was used as the neuron transfer function. The initial weights were generated randomly between −1 and +1. The relative error used as the convergence threshold was set to 0.01, and the momentum factor was set to 0.8. The design of the MOBP network mainly consists of determining the number of neural layers and the number of nodes in each layer. Because the network design directly determines whether MOBP can successfully identify low-resistivity pay zones, close attention should be paid to it.
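The effect of the momentum factor can be illustrated with a minimal sketch. The source does not print the update rule, so the form used here, Δw(k) = γ Δw(k−1) − (1 − γ) α ∇E, is an assumption (one common textbook form of momentum backpropagation); the learning rate `lr`, the toy error function and the function names are likewise hypothetical.

```python
import numpy as np

def init_weights(shape, rng):
    """Draw initial weights uniformly from [-1, 1), matching the range stated in the text."""
    return rng.uniform(-1.0, 1.0, size=shape)

def momentum_step(w, grad, prev_delta, lr=0.1, gamma=0.8):
    """One momentum update (assumed form):
    delta = gamma * prev_delta - (1 - gamma) * lr * grad
    """
    delta = gamma * prev_delta - (1.0 - gamma) * lr * grad
    return w + delta, delta

# Toy usage: minimise E(w) = 0.5 * ||w||^2, whose gradient is w itself.
rng = np.random.default_rng(0)
w = init_weights((3,), rng)
delta = np.zeros_like(w)
for _ in range(200):
    w, delta = momentum_step(w, grad=w, prev_delta=delta)
```

With γ = 0.8, each step keeps 80 % of the previous weight change and adds 20 % of the plain gradient step, which smooths the trajectory across iterations.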

An MOBP neural network should consist of three or more layers, including one input layer, one output layer and at least one hidden layer, as illustrated in Fig. 3.

Fig. 3 Classic architecture of MOBP neural network

Design of input layer

The input layer is mainly used as buffer storage. Source data are loaded into the neural network through the input layer. The node number of the input layer depends on the dimensions of the source data, with each dimension corresponding to one node. In other words, the node number of the input layer is equal to the number of parameters that describe each sample. Because the main purpose of this paper is to improve the utilization of traditional logging data, the selected characteristic parameters are the most commonly used logging data, including SP, GR, LLD, LLS, CNL, FDC and AC; new logging technologies (e.g., NMR) are not considered. As mentioned in the above section, in addition to the difference in resistivity, there are some slight differences in other logging responses between the high- and low-resistivity pay zones. These differences usually reflect the causes of low resistivity. Therefore, the characteristic parameters of a sample should contain all kinds of available logging information to enhance the utilization of the logging data. The following indicator was used to describe the samples.

$$x_{i} = \left( {S_{{{\text{SP}}i}} ,S_{{{\text{GR}}i}} ,S_{{{\text{LLD}}i}} ,S_{{({\text{LLD}}/{\text{LLS}})i}} ,S_{{{\text{AC}}i}} ,S_{{{\text{CNL}}i}} ,S_{{({\text{CNL}}/{\text{FDC}})i}} } \right)$$
(1)

Seven parameters were used to describe the samples. The subscript i indicates the parameter related to sample i. \(S_{\text{SP}i}\) is the parameter of SP; \(S_{\text{GR}i}\) is the parameter of GR; \(S_{\text{LLD}i}\) is the parameter of LLD; \(S_{(\text{LLD}/\text{LLS})i}\) is the parameter related to the difference between LLD and LLS; \(S_{\text{AC}i}\) is the parameter of AC; \(S_{\text{CNL}i}\) is the parameter of CNL; and \(S_{(\text{CNL}/\text{FDC})i}\) is the parameter related to the difference between CNL and FDC. These parameters can be expressed as:

$$\begin{aligned} & S_{\text{SP}i} = f_{\text{pureline}} \left( \frac{v_{\text{SP}i} - v_{\text{SP}0}}{v_{\text{SSP}} - v_{\text{SP}0}} \right) \\ & S_{\text{GR}i} = f_{\text{pureline}} \left( \frac{v_{\text{GR}i} - v_{\text{GR}0}}{v_{\text{GR}1} - v_{\text{GR}0}} \right) \\ & S_{\text{LLD}i} = f_{\text{pureline}} \left( \lg \left( \frac{v_{\text{LLD}i}}{v_{R0}} \right) / \lg \left( \frac{v_{Rx}}{v_{R0}} \right) \right) \\ & S_{(\text{LLD}/\text{LLS})i} = f_{\log\text{sig}} \left( \lg \left( \frac{v_{\text{LLD}i}}{v_{\text{LLS}i}} \right) \right) \\ & S_{\text{AC}i} = f_{\text{pureline}} \left( \frac{v_{\Delta ti} - v_{\Delta tma}}{v_{\Delta tf} - v_{\Delta tma}} \right) \\ & S_{\text{CNL}i} = f_{\text{pureline}} \left( \frac{v_{\Phi i} - v_{\Phi ma}}{v_{\Phi f} - v_{\Phi ma}} \right) \\ & S_{(\text{CNL}/\text{FDC})i} = f_{\log\text{sig}} \left( \frac{v_{\Phi i} - v_{\Phi ma}}{v_{\Phi f} - v_{\Phi ma}} - \frac{v_{\rho i} - v_{\rho ma}}{v_{\rho f} - v_{\rho ma}} \right) \\ \end{aligned}$$
(2)

where the subscript i again indicates the parameter related to sample i. \(v_{\text{SP}i}\) is the spontaneous potential of the sample, \(v_{\text{SP}0}\) is the spontaneous potential of the thick shale in the same well and \(v_{\text{SSP}}\) is the static spontaneous potential of water-saturated pure sand in the same area. \(v_{\text{GR}i}\), \(v_{\text{GR}0}\) and \(v_{\text{GR}1}\) are the GR values of the sample, pure sand and pure shale, respectively. \(v_{\text{LLD}i}\) is the deep lateral resistivity of the sample, \(v_{R0}\) is the resistivity of 100 % water-saturated formation, \(v_{Rx}\) is the resistivity of pure shale and \(v_{\text{LLS}i}\) is the shallow lateral resistivity of the sample. \(v_{\Delta ti}\), \(v_{\Delta tma}\) and \(v_{\Delta tf}\) are the interval acoustic transit times of the sample, pure sand and drilling mud, respectively. \(v_{\Phi i}\), \(v_{\Phi ma}\) and \(v_{\Phi f}\) are the hydrogen indexes of the sample, pure shale and drilling mud, respectively. \(v_{\rho i}\), \(v_{\rho ma}\) and \(v_{\rho f}\) are the densities of the sample, pure sand and drilling mud, respectively.

Two normalization functions (i.e., the linear function and the sigmoidal function) were used, defined as

$$\begin{aligned} & f_{\text{pureline}} (x):y = x \\ & f_{{\log {\text{sig}}}} (x):y = \frac{1}{{1 + e^{ - x} }} \\ \end{aligned}$$
(3)

The plots of the two normalization functions are shown in Fig. 4. The sigmoidal function is relatively flat at both ends but steep in the middle. It can be used to normalize input data and to highlight differences for parameters with relatively wide ranges. In the proposed model, the sigmoidal function is used for \(S_{(\text{LLD}/\text{LLS})i}\) and \(S_{(\text{CNL}/\text{FDC})i}\), while the linear function is used for normalizing the other parameters.

Fig. 4 Normalization functions: the linear function \(f_{\text{pureline}}\) (left) and the sigmoidal function \(f_{\log\text{sig}}\) (right)
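The construction of one input sample from Eqs. 1 to 3 can be sketched as follows. This is only an illustration: the dictionary keys, the helper `scaled` and all numerical readings and reference values are hypothetical and are not taken from the paper's data.

```python
import math

def f_pureline(x):
    """Linear function of Eq. 3: y = x."""
    return x

def f_logsig(x):
    """Sigmoidal function of Eq. 3: y = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def scaled(v, v_a, v_b):
    """Linear scaling (v - v_a) / (v_b - v_a) that recurs in Eq. 2."""
    return (v - v_a) / (v_b - v_a)

def feature_vector(log, ref):
    """Build the seven-component sample x_i of Eq. 1 from raw log readings."""
    phi_n = scaled(log["CNL"], ref["phi_ma"], ref["phi_f"])  # neutron term
    phi_d = scaled(log["FDC"], ref["rho_ma"], ref["rho_f"])  # density term
    return (
        f_pureline(scaled(log["SP"], ref["SP0"], ref["SSP"])),
        f_pureline(scaled(log["GR"], ref["GR0"], ref["GR1"])),
        f_pureline(math.log10(log["LLD"] / ref["R0"])
                   / math.log10(ref["Rx"] / ref["R0"])),
        f_logsig(math.log10(log["LLD"] / log["LLS"])),
        f_pureline(scaled(log["AC"], ref["dt_ma"], ref["dt_f"])),
        f_pureline(phi_n),
        f_logsig(phi_n - phi_d),
    )

# Hypothetical readings and reference values, for illustration only.
log = {"SP": -40.0, "GR": 80.0, "LLD": 8.0, "LLS": 8.0,
       "AC": 250.0, "CNL": 0.2, "FDC": 2.4}
ref = {"SP0": 0.0, "SSP": -80.0, "GR0": 40.0, "GR1": 120.0,
       "R0": 4.0, "Rx": 16.0, "dt_ma": 180.0, "dt_f": 620.0,
       "phi_ma": 0.0, "phi_f": 1.0, "rho_ma": 2.65, "rho_f": 1.0}
x = feature_vector(log, ref)
```

Because the two difference-type parameters pass through the sigmoidal function, equal deep and shallow resistivities (a ratio of 1) map to the midpoint value 0.5.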

Design of the output layer

The output layer is mainly used to output the final results. For a functional neural network, the number and values of the nodes in the output layer must meet the users' expectations and cover all possible interpretation circumstances. However, some constraints should be taken into account during network development. For example, to ensure usability of the network, the node number of the output layer must be less than that of the input layer. Experience shows that the fewer the nodes in the output layer, the faster the network converges and the more stable it is. An interpretation method is proposed in this paper that uses a single node in the output layer to cover all possible logging interpretations. An oil saturation function was defined as an indicator function for the single output node:

$$y_{i} = 0.5 + \xi f_{\log\text{sig}} \left( \overline{k_{i}}\, \overline{\phi_{i}}\, \overline{S_{i}^{*}} - 1 \right)$$
(4)

where \(y_{i}\) is a quantized indicator value; \(\overline{{k_{i} }}\) is the dimensionless permeability; \(\overline{{\phi_{i} }}\) is the dimensionless porosity; \(\overline{{S_{i}^{*} }}\) is the dimensionless saturation; and ξ is the correlation coefficient. \(\overline{{k_{i} }}\) and \(\overline{{\phi_{i} }}\) are expressed as:

$$\begin{aligned} & \overline{k_{i}} = \frac{k_{i}}{k_{0}}; \\ & \overline{\phi_{i}} = \frac{\phi_{i}}{\phi_{0}} \\ \end{aligned}$$
(5)

where \(k_{i}\) and \(k_{0}\) are the permeability of sample i and the minimum permeability in the sample database, respectively; \(\phi_{i}\) and \(\phi_{0}\) are the porosity of sample i and the minimum porosity in the sample database, respectively.

The correlation coefficient ξ and dimensionless saturation \(\overline{{S_{i}^{*} }}\) differ between gas-bearing zones and non-gas-bearing zones. In gas-bearing zones:

$$\begin{aligned} & \xi = -0.5; \\ & \overline{S_{i}^{*}} = \left( \frac{S_{gi}}{S_{wi} - S_{wci}} \right) \Big/ \left( \frac{S_{g0}}{S_{w0} - S_{wc0}} \right) \\ \end{aligned}$$
(6)

where \(S_{gi}\), \(S_{wi}\) and \(S_{wci}\) are the gas saturation, water saturation and irreducible water saturation of sample i, respectively; \(S_{g0}\), \(S_{w0}\) and \(S_{wc0}\) are the gas saturation, water saturation and irreducible water saturation of the gas-bearing zone that has the smallest ratio of gas saturation to the difference between water saturation and irreducible water saturation among all the gas-bearing zones.

In non-gas-bearing zones:

$$\begin{aligned} & \xi = 0.5; \\ & \overline{S_{i}^{*}} = \left( \frac{S_{oi}}{S_{wi} - S_{wci}} \right) \Big/ \left( \frac{S_{o0}}{S_{w0} - S_{wc0}} \right) \\ \end{aligned}$$
(7)

where \(S_{oi}\), \(S_{wi}\) and \(S_{wci}\) are the oil saturation, water saturation and irreducible water saturation of sample i, respectively; \(S_{o0}\), \(S_{w0}\) and \(S_{wc0}\) are the oil saturation, water saturation and irreducible water saturation of the non-gas-bearing zone that has the smallest ratio of oil saturation to the difference between water saturation and irreducible water saturation among all the non-gas-bearing zones.
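As a rough illustration of Eqs. 4 to 7, the indicator value of a single zone could be computed as in the sketch below. The function and key names (`indicator`, `ref`, `ratio0`) and the numerical values are hypothetical; `ratio0` stands for the reference saturation ratio in the denominator of Eq. 6 or Eq. 7.

```python
import math

def f_logsig(x):
    """Sigmoidal function of Eq. 3."""
    return 1.0 / (1.0 + math.exp(-x))

def saturation_ratio(s_hc, s_w, s_wc):
    """Hydrocarbon saturation divided by (S_w - S_wc), as in Eqs. 6 and 7."""
    return s_hc / (s_w - s_wc)

def indicator(k, phi, s_hc, s_w, s_wc, ref, gas_bearing):
    """Indicator value y_i of Eq. 4 for one zone.

    ref holds the reference values: minimum permeability k0, minimum
    porosity phi0 and the smallest saturation ratio ratio0.
    """
    k_bar = k / ref["k0"]                                      # Eq. 5
    phi_bar = phi / ref["phi0"]                                # Eq. 5
    s_bar = saturation_ratio(s_hc, s_w, s_wc) / ref["ratio0"]  # Eq. 6 or 7
    xi = -0.5 if gas_bearing else 0.5
    return 0.5 + xi * f_logsig(k_bar * phi_bar * s_bar - 1.0)

# Hypothetical reference values and zones, for illustration only.
ref = {"k0": 1.0, "phi0": 0.1, "ratio0": 2.0}
y_ref_oil = indicator(1.0, 0.1, 0.4, 0.5, 0.3, ref, gas_bearing=False)
y_good_oil = indicator(10.0, 0.2, 0.4, 0.5, 0.3, ref, gas_bearing=False)
y_good_gas = indicator(10.0, 0.2, 0.4, 0.5, 0.3, ref, gas_bearing=True)
```

Under these definitions, the sign of ξ places non-gas-bearing zones above 0.5 and gas-bearing zones below 0.5, so a single output node can separate the two cases.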

The indicator value y (Eq. 4) ranges between 0 and 1. Its interpretation is shown in Table 1.

Table 1 Interpretation of the indicator function

Design of the hidden layer

Solving XOR problems with a BP neural network (see Fig. 5) shows that the hidden layer is very important to the solution of nonlinear problems (Ma 2010). The more nodes in the hidden layer, the better the neural network matches the training data; the fewer the nodes, the better the network generalizes. After a great number of test calculations using the empirical models proposed by Lippmann (1987), Mirchandani and Cao (1989) and Maren et al. (1988), it was found that a network with two hidden layers, with 20 nodes in the first hidden layer and 4 nodes in the second, converges reasonably fast and provides sufficient calculation accuracy. Figure 6 shows the architecture of the MOBP neural network for interpreting low-resistivity pay zones. Table 2 reports the node numbers at each layer.

Fig. 5 The training times of XOR problem for different node numbers in the hidden layer

Fig. 6 Architectures of MOBP neural network for evaluating low-resistivity pay zone

Table 2 Node numbers at each layer of the MOBP network

In the XOR tests (Fig. 5), the initial weights of all three networks were the same set of random numbers between −1 and +1, the convergence threshold was 0.01 and the momentum factor was 0.8. The results show that the 2-3-1 network needed only about 4000 training iterations (middle plots), the 2-2-1 network needed around 8000 (lower plots), and the 2-1-1 network did not converge even after more than 10,000 iterations (upper plots).
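The XOR experiment described above can be reproduced in outline with the following sketch. It is not the authors' implementation: the learning rate, the random seed, the use of mean squared error as the convergence measure and the assumed momentum form Δw(k) = γ Δw(k−1) − (1 − γ) α ∇E are all assumptions, so the iteration counts it produces need not match those reported in the text.

```python
import numpy as np

def logsig(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_xor(n_hidden, max_epochs=10000, lr=0.5, gamma=0.8, tol=0.01, seed=0):
    """Train a 2-n_hidden-1 sigmoid network on XOR with momentum backpropagation.

    Returns (epochs_used, last_mse); epochs_used equals max_epochs when the
    error never drops below tol.
    """
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    t = np.array([[0], [1], [1], [0]], dtype=float)
    # Weights and biases drawn uniformly from [-1, 1).
    shapes = [(2, n_hidden), (1, n_hidden), (n_hidden, 1), (1, 1)]
    params = [rng.uniform(-1.0, 1.0, s) for s in shapes]
    deltas = [np.zeros_like(p) for p in params]
    mse = float("inf")
    for epoch in range(1, max_epochs + 1):
        W1, b1, W2, b2 = params
        h = logsig(X @ W1 + b1)  # hidden-layer activations
        y = logsig(h @ W2 + b2)  # network output
        err = y - t
        mse = float(np.mean(err ** 2))
        if mse < tol:
            return epoch, mse
        # Backpropagate the error through both sigmoid layers
        # (constant factors of the MSE gradient are absorbed into lr).
        d2 = err * y * (1.0 - y)
        d1 = (d2 @ W2.T) * h * (1.0 - h)
        grads = [X.T @ d1, d1.sum(axis=0, keepdims=True),
                 h.T @ d2, d2.sum(axis=0, keepdims=True)]
        for i, g in enumerate(grads):
            deltas[i] = gamma * deltas[i] - (1.0 - gamma) * lr * g
            params[i] = params[i] + deltas[i]
    return max_epochs, mse

if __name__ == "__main__":
    for n_hidden in (1, 2, 3):
        print(n_hidden, train_xor(n_hidden))
```

A network with a single hidden node produces an output that is monotonic in one linear combination of the two inputs, so it cannot separate the XOR classes, which agrees with the non-convergence of the 2-1-1 network noted above.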

In addition, after testing various network architectures in this study, it was found that the two-hidden-layer network converges much faster when the node number of the first hidden layer is larger than that of the input layer (6 in this study) and the node number of the second hidden layer is smaller than that of the input layer. This observation is not consistent with the empirical models mentioned above. However, since this is not the research focus of this paper, no further details are presented.
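The two-hidden-layer layout (20 and 4 hidden nodes, one output node) can be sketched as a plain forward pass. The input width is an assumption here, set to one node per component of Eq. 1; the helper names and the random test inputs are hypothetical.

```python
import numpy as np

def logsig(x):
    return 1.0 / (1.0 + np.exp(-x))

def build_network(layer_sizes, rng):
    """Create (weights, bias) pairs with entries drawn uniformly from [-1, 1)."""
    return [
        (rng.uniform(-1.0, 1.0, (n_in, n_out)), rng.uniform(-1.0, 1.0, (1, n_out)))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def forward(network, x):
    """Propagate a batch of samples through every sigmoid layer."""
    a = np.atleast_2d(x)
    for W, b in network:
        a = logsig(a @ W + b)
    return a

rng = np.random.default_rng(0)
n_inputs = 7  # assumption: one node per component of Eq. 1; adjust as needed
net = build_network([n_inputs, 20, 4, 1], rng)
samples = rng.uniform(0.0, 1.0, (5, n_inputs))  # stand-in for normalized samples
y = forward(net, samples)
```

Each row of `y` is a single value between 0 and 1, matching the range of the single-node indicator output.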

Case study

The proposed MOBP model was applied to interpret the low-resistivity pay zones in the Daqingzi oilfield in Jilin, China. High test production from some evaluation wells in the Heidimiao formation and the Putaohua formation in the north Daqingzi area reveals that these two pay zones have great development potential. However, the oil–water relationships of these two formations are very complicated. Affected by many factors, oil zones and water zones appear alternately, which poses great challenges to the secondary logging interpretation of this area. An effective pay zone identification method is needed.

Core analyses have shown that siltstone and argillaceous siltstone are the major lithologies of the Putaohua formation. Illite and a mixture of illite and montmorillonite, which have strong attached conductance, are the major clay minerals, while the content of kaolinite, which has weak attached conductance, is low. Therefore, it can be preliminarily inferred that low-resistivity pay zones exist in this area, caused by high irreducible water saturation due to fine lithology and high shale content.

There are 99 confirmed zones in total in this area. In this study, 90 of them were used to train the model and the remaining 9 were used as test samples. Table 3 reports the training database.

Table 3 Training sample database

The MOBP network converged after 3900 training iterations, as shown in Fig. 7, with the initial weights set randomly between −1 and 1, the relative error for the convergence threshold set to 0.01 and the momentum factor set to 0.8.

Fig. 7 Training errors

The accuracy of the proposed neural network model was tested by comparing the predicted results with the actual conclusions. The test results are shown in Table 4. All the predicted results are consistent with actual conclusions. This result validated the reliability of the proposed model.

Table 4 Test samples and results

Conclusions

Although the electrical responses of low-resistivity pay zones are not obvious, the lithologic and physical properties of the formation that cause low resistivity can be reflected in other logging curves. The identification of a low-resistivity pay zone is a process of abstracting, organizing and matching useful logging information. Artificial intelligence algorithms, such as artificial neural networks, can be used to identify low-resistivity pay zones intelligently and automatically.

Based on investigation of the causes of low-resistivity response in the pay zones, a new artificial neural network model using the MOBP algorithm was developed in this paper. The model provides a fast and accurate identification of low-resistivity pay zones. This model can be used to reevaluate old wells using conventional logging data.

Some measures that improve the convergence of the MOBP network were identified. A two-hidden-layer network converges much faster when the node number of the first hidden layer is larger than that of the input layer and the node number of the second hidden layer is smaller than that of the input layer.