Introduction

Permeability is one of the most difficult rock properties to estimate. It describes the ability of a rock to transmit fluids. The permeability of a rock to oil, gas or water is a function of the absolute permeability and the fluid viscosity.

Like porosity, permeability is influenced by many depositional and diagenetic factors. The most important factors on which the permeability of a rock depends are the shape, diameter and connectivity of the pores (Schön 2011). Among the depositional factors affecting permeability, the size and sorting of the grains should also be mentioned (Beard and Weyl 1973; Bloch 1991; Lucia 1995): coarse, well-sorted grains ensure better flow in the reservoir. Diagenetic factors such as compaction and cementation reduce permeability. The type of minerals that build the rock also affects this parameter. The presence of quartz increases the absolute permeability, despite the smaller porosity, while clay minerals, owing to their properties and distribution in the rock, significantly reduce it (Neuzil 1994).

Permeability is related to the productivity of the rock formation. Producing oil or gas from low-permeability reservoirs requires additional operations, such as hydraulic fracturing or drilling of horizontal wells, to obtain hydrocarbon flow. Determining the absolute permeability is crucial for properly identifying the reservoir parameters and assessing the profitability of possible hydrocarbon production.

Permeability can be estimated by many different methods. The most reliable are laboratory measurements, which are mainly based on injecting fluid into a core sample under known conditions (Tiab and Donaldson 2000; Jarzyna and Puskarczyk 2009) and are carried out on liquid permeameters. Laboratory measurements provide only point information and are mainly used to calibrate deterministic calculations based on well logging (Iturrarán-Viveros and Parra 2014; Wawrzyniak-Guz 2016).

Absolute permeability of materials is described by Darcy’s equation. Nevertheless, deterministic calculation of permeability in rocks is very difficult because it depends on many factors that are hard to determine from laboratory measurements and from the interpretation of well logs. Despite this, many methods of permeability estimation are available in the literature and in interpretation systems (Asquith and Krygowski 2004). Most of the equations are based on the relationship between permeability and porosity, because permeability depends mainly on the structure and specific surface of the pore space (Such and Leśniak 2006). Kozeny (1927) and Carman (1937), using the Darcy equation, proposed a formula (Eq. 1) combining these two parameters, but it is challenging to correctly determine all the factors present in the formula (Amaefule et al. 1993):

$$K = \frac{{\phi_{{\text{e}}}^{3} }}{{\left( {1 - \phi_{{\text{e}}} } \right)^{2} }}\left[ {\frac{1}{{F_{{\text{s}}} \tau^{2} S_{{{\text{gv}}}}^{2} }}} \right]$$
(1)

where \(K\)—absolute permeability, \(\phi_{{\text{e}}}\)—effective porosity, \(F_{{\text{s}}}\)—shape factor, \(\tau\)—tortuosity, \(S_{{{\text{gv}}}}\)—surface area per unit grain volume.
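
As a minimal sketch, Eq. 1 can be evaluated directly once all four quantities are known. The function name, the input values and the unit convention (specific surface in 1/µm, giving permeability in µm²) are assumptions made for illustration only; they are not taken from the cited works.

```python
def kozeny_carman(phi_e: float, shape_factor: float, tortuosity: float, s_gv: float) -> float:
    """Absolute permeability from Eq. 1.

    Assumed convention: s_gv in 1/um, so the result is in um^2.
    """
    if not 0.0 < phi_e < 1.0:
        raise ValueError("effective porosity must be a fraction in (0, 1)")
    if shape_factor <= 0.0 or tortuosity <= 0.0 or s_gv <= 0.0:
        raise ValueError("shape factor, tortuosity and specific surface must be positive")
    # Porosity term: phi_e^3 / (1 - phi_e)^2
    pore_term = phi_e**3 / (1.0 - phi_e) ** 2
    # Grain/pore geometry term: 1 / (F_s * tau^2 * S_gv^2)
    return pore_term / (shape_factor * tortuosity**2 * s_gv**2)


# Hypothetical inputs chosen only to exercise the formula.
k_example = kozeny_carman(phi_e=0.10, shape_factor=2.0, tortuosity=1.5, s_gv=10.0)
```

The sketch makes the practical difficulty visible: the porosity term is easy to obtain, whereas the shape factor, tortuosity and specific surface have to be supplied from measurements that are rarely available.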

Other porosity-based equations additionally use quantities such as irreducible water saturation or shale volume. In Poland, the empirical equation proposed by Zawisza (1993), which includes porosity and irreducible water saturation, is often used (Eq. 2):

$$K = a\phi^{3.15} \left( {1 - S_{{{\text{wir}}}} } \right)^{2}$$
(2)

where \(a\)—regional factor, \(\phi\)—porosity, \(S_{{{\text{wir}}}}\)—irreducible water saturation.
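
Eq. 2 translates into a one-line function. In the sketch below the regional factor is a required argument because its value is not given in the text; the value used in the example call is a placeholder, not a calibrated constant.

```python
def zawisza_permeability(phi: float, s_wir: float, a: float) -> float:
    """Permeability from Eq. 2: K = a * phi^3.15 * (1 - S_wir)^2.

    phi and s_wir are fractions; `a` is the regional factor, which must be
    calibrated for the study area (no value is assumed here).
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("porosity must be a fraction in [0, 1]")
    if not 0.0 <= s_wir <= 1.0:
        raise ValueError("irreducible water saturation must be a fraction in [0, 1]")
    return a * phi**3.15 * (1.0 - s_wir) ** 2


# Placeholder regional factor (a = 1.0) used only to show the call.
k_relative = zawisza_permeability(phi=0.10, s_wir=0.50, a=1.0)
```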

An interesting approach to permeability determination is the use of artificial neural networks based on laboratory measurements on core samples, on logs from modern, high-resolution logging tools, or on the results of their qualitative interpretation.

Method

The precursor of artificial neural networks was the model of the neuron in the human brain developed by McCulloch and Pitts (1943), which explained the mechanism of memorizing information in a biological network (Tadeusiewicz 1992). The first neural network to be designed and constructed was the perceptron developed by Rosenblatt (1958). Initially, neural networks did not attract much interest because the capabilities of single-layer networks are limited. Later work demonstrating that multilayer, nonlinear networks are far more capable led to a significant increase in their use. Neural networks are also widely used in geophysical and petrophysical problems (Huang et al. 1996; Aminzade and De Groot 2006; Bhatt and Helle 2002; Sudakov et al. 2019).

Artificial neural networks (ANNs), with their ability to reproduce complicated functions, are used for noise reduction, classification, and prediction of data and parameters. Neural networks can learn, memorize and generalize on the basis of the training data set. They cope very well with inconsistent, distorted data. ANNs are stable and resistant to damage. Given the simplicity of the network structure, calculations with a trained network are very efficient, even on large data sets.

For parameter prediction, feedforward multilayer networks stand out: they give very good results and are relatively easy to design and use. A Multilayer Perceptron (MLP) is built from at least three layers: the input layer, one or more hidden layers and the output layer (Fig. 1).

Fig. 1
Scheme of a multilayer perceptron (Tutak and Brodny 2019; modified)
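
The layered structure described above can be sketched as a plain forward pass. The example assumes the exponential activation and the 16-10-1 topology reported later in this paper; the weights are random placeholders rather than trained values, and the zero input vector stands in for a set of scaled log readings.

```python
import math
import random


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for every neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(inputs, layers, activation=math.exp):
    """Propagate an input vector through all layers of a feedforward MLP."""
    signal = list(inputs)
    for weights, biases in layers:
        signal = dense(signal, weights, biases, activation)
    return signal


def random_layer(n_in, n_out, rng, scale=0.1):
    """Untrained layer with small random weights and zero biases (placeholder)."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
layers = [random_layer(16, 10, rng), random_layer(10, 1, rng)]  # 16-10-1 topology
placeholder_inputs = [0.0] * 16  # stands in for 16 scaled input variables
prediction = mlp_forward(placeholder_inputs, layers)
```

Training would adjust the weights and biases in `layers`; the forward pass itself stays the same, which is why applying a trained network to a whole log is computationally cheap.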

When designing neural networks, the most important stages are the selection of the network configuration and the learning stage, because they determine how the network behaves and the results obtained. Romeo (1994) distinguishes three main causes of bad network performance: bad network configuration, the algorithm becoming stuck in a local minimum, and a wrong learning set. The key decision is the number of neurons in the hidden layer. There are no clearly defined rules for this number. However, the number of hidden neurons and layers should not be too large, because the network can then become overly adapted to the training data (overfitting). The best method of selecting the right size is trial and error, starting with a small network. When selecting a training sample, it should be ensured that the data are as representative as possible and, as far as possible, free from measurement errors. If the above-mentioned errors are avoided, it is possible to design networks that are capable of solving complex problems and predicting parameters whose relationship with measurements is not easy to describe using simple mathematical functions. One such parameter used in petrophysics is the absolute permeability.

An important element of creating a network is the validation process, which makes it possible to evaluate the quality of the solution obtained by the network. For this purpose, a subset of the data, usually selected at random, is withheld from the network learning process. During learning and evaluation an error function is used, and the task of the network is to minimize it in each subsequent iteration. The error drops faster for the learning sample and more slowly for the validation sample. The behaviour of the validation error is also an indicator of proper network construction, because its increase during the learning process may suggest too many hidden layers (Krogh and Vedelsby 1994). Various functions are used to calculate the error, such as the mean absolute error, the cross-entropy or the maximum likelihood function; the most frequently used is the sum of squares (Falas and Stafylopatis 1999) (Eq. 3):

$$E = \mathop \sum \limits_{i = 1}^{N} \left( {y_{i} - t_{i} } \right)^{2}$$
(3)

where \(N\)—number of cases used for learning, \(y_{i}\)—network prediction result, \(t_{i}\)—measured parameter value.
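
Eq. 3 and the role of the validation error can be illustrated with two small helpers. The second function is only one possible way of flagging a rising validation error; the `patience` parameter and the error history in the example are hypothetical and do not describe the procedure implemented in Statistica.

```python
def sum_of_squares_error(predictions, targets):
    """Eq. 3: sum over all cases of (y_i - t_i)^2."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    return sum((y - t) ** 2 for y, t in zip(predictions, targets))


def validation_keeps_rising(validation_errors, patience=3):
    """True if the validation error increased in each of the last `patience` epochs."""
    if len(validation_errors) <= patience:
        return False
    recent = validation_errors[-(patience + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


error = sum_of_squares_error([1.0, 2.0, 3.0], [1.0, 1.0, 5.0])

# Hypothetical validation-error history: falling at first, then rising.
history = [5.0, 4.0, 3.0, 3.5, 4.0, 4.5]
overfitting_suspected = validation_keeps_rising(history)
```

A learning error that keeps falling while the validation error rises is the pattern described above as a sign of an overly large network.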

The presented analysis was performed in Statistica software (version 13) using the Multilayer Perceptron algorithm with an exponential activation function for the hidden and output neurons (StatSoft 2011).

Materials

The research was carried out on data from three wells located in the Lublin Syncline in Poland. The analysis covered the Silurian and Ordovician shale and mudstone formations, which are potential unconventional shale gas deposits. These formations are characterized by poor reservoir properties, in particular low permeability (Krakowska and Puskarczyk 2015). Because the wells are located in different geological regions, analogous formations are buried at different depths. The distance between the wells and the complex tectonics of the area may cause differences in petrophysical parameters between the layers.

The neural networks were trained on the results of laboratory measurements of absolute permeability from a gas permeameter and of effective porosity from mercury porosimetry, both measured on core samples from one of the wells. For each well, well logs and the results of their interpretation were also available; they are summarized in Table 1.

Table 1 List of available data. Symbols are explained in the section: list of symbols

Results

The first and very important step in using neural networks is the appropriate selection of the input data. The input data should be related to the output data in the training, test and validation sets. When creating neural networks, it is not entirely true that more is better, because variables unrelated to the output data may degrade the performance of the network (StatSoft 2011).

The input data used to create the artificial neural networks were the following well logs: RHOB, NPHI, DT, LLD, LLS, MSFL, THOR, URAN, POTA, PE and the results of their interpretation: VCL, VANH, VDOL, VLIM, VSAN, VPYR, VTOC, PHI, K, SWI. Absolute permeability from laboratory measurements was used as the quantitative target for the training well. Before the data selection for all wells was performed, basic statistics and histograms of the reservoir parameters were calculated (Table 2).

Table 2 Basic statistics of laboratory data

Basic statistics of laboratory data indicate that the studied formations exhibit very poor reservoir parameters, in particular absolute permeability, which is typical of unconventional hydrocarbon resources such as tight gas or shale gas (Xiao et al. 2014). For the training well A, the absolute permeability is very low, mostly not exceeding 0.01 mD, as evidenced by the median; at such values hydrocarbons practically cannot flow. The network learning process was carried out by trial and error, both by checking various activation functions of the hidden and output neurons and by observing the behaviour of the error function for individual network dimensions. Network design was carried out using automated algorithms in which the data were randomly assigned to the learning, test and validation sets.
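
The random assignment of samples to the learning, test and validation sets can be sketched as follows. The 70/15/15 proportions and the fixed seed are assumptions made for illustration, since the split used in the study is not stated; 109 is the number of laboratory permeability values reported in this section.

```python
import random


def split_samples(n_samples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Randomly assign sample indices to learning, test and validation sets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_learn = round(fractions[0] * n_samples)
    n_test = round(fractions[1] * n_samples)
    learn = indices[:n_learn]
    test = indices[n_learn:n_learn + n_test]
    validation = indices[n_learn + n_test:]  # remainder goes to validation
    return learn, test, validation


learn_idx, test_idx, validation_idx = split_samples(109)
```

With so few samples the validation set holds fewer than twenty points, so a single split can give a noisy estimate of network quality; repeating the split with different seeds is one way to check its stability.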

At the beginning, networks for well A were designed on the basis of well logs for the whole available depth interval. In the second step, the division into stratigraphic units (periods: Silurian and Ordovician) was added as a qualitative input. In the third step, based on the same data, the stratigraphic division was refined using the Polish informal stratigraphic units: Ludlow, Wenlock, Llandovery, Ashgill, Caradoc. Absolute permeabilities equal to the minimum value detectable by the gas permeameter (K = 0.0001 mD) were not considered, because they are not informative for the ANN. The total number of permeability values from laboratory measurements was 109. This procedure yielded three absolute permeabilities: K_ANN (no stratigraphic units applied), K_ANN1 (stratigraphic units applied) and K_ANN2 (informal stratigraphic units applied).
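
The two preparation steps described above, rejecting samples at the detection limit and supplying the stratigraphic unit as a qualitative input, can be sketched as below. One-of-N (one-hot) encoding is assumed here as a common way of feeding a categorical variable to an MLP; the sample records and field names are invented for illustration and are not measured data.

```python
DETECTION_LIMIT_MD = 0.0001  # minimum value detected by the gas permeameter (from the text)
UNITS = ("Ludlow", "Wenlock", "Llandovery", "Ashgill", "Caradoc")


def drop_detection_limit(samples, limit=DETECTION_LIMIT_MD):
    """Keep only samples whose laboratory permeability exceeds the detection limit."""
    return [sample for sample in samples if sample["k_lab"] > limit]


def one_hot_unit(unit, units=UNITS):
    """Encode a stratigraphic unit as a vector with a single 1.0 entry."""
    if unit not in units:
        raise ValueError(f"unknown stratigraphic unit: {unit}")
    return [1.0 if unit == name else 0.0 for name in units]


# Invented records, used only to show the calls.
samples = [
    {"k_lab": 0.0001, "unit": "Ludlow"},   # at the detection limit: rejected
    {"k_lab": 0.0030, "unit": "Caradoc"},  # kept
]
kept = drop_detection_limit(samples)
encoded = [one_hot_unit(sample["unit"]) for sample in kept]
```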

Table 3 presents the quality of three different constructions of the Multilayer Perceptron, expressed as the determination coefficient between the absolute permeabilities estimated by the ANN and those measured with the gas permeameter. The best results were obtained for the Multilayer Perceptron MLP 16-10-1 (ANN2), which has the highest determination coefficient for the learning, test and validation procedures.

Table 3 Quality of artificial neural network

Calculated absolute permeabilities were compared quantitatively and qualitatively with laboratory data (K_LAB) and with permeability determined from the Zawisza (K_ZAWISZA) and Wyllie–Rose (K_WYLLIE–ROSE) formulas using scatterplots (Fig. 2) and basic statistics (Table 4). Figure 2 compares the absolute permeability from laboratory measurements with that from the ANN at the depths where laboratory measurements on core samples are available. Absolute permeability from the ANN has the form of a log, so values are available every 0.1 m. The ANN2 network was applied to the other wells because it showed the best validation results, which indicated that the network could be applied to other data. The determination coefficient between the absolute permeability from the gas permeameter (K_LAB) and the absolute permeability from the artificial neural network ANN2 is 0.72.
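
The determination coefficient used in this comparison can be computed as the squared Pearson correlation between two sets of values at the same depths. This is a sketch under that assumption; the text does not state whether the coefficient was calculated on linear or logarithmic permeability, so no transform is applied here, and the numbers in the example are arbitrary.

```python
def determination_coefficient(x, y):
    """Squared Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0.0 or s_yy == 0.0:
        raise ValueError("determination coefficient is undefined for constant data")
    return s_xy**2 / (s_xx * s_yy)


# Arbitrary numbers, used only to show the call.
r2_example = determination_coefficient([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
```

Because permeability spans several orders of magnitude, a coefficient calculated on linear values is dominated by the few highest samples; this should be kept in mind when comparing coefficients reported for different methods.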

Fig. 2
Comparison of laboratory data with estimated and calculated absolute permeabilities

Table 4 Basic statistics of measured, estimated and calculated absolute permeabilities

Permeabilities from the Zawisza formula (available from the interpretation data set) and the Wyllie–Rose formula (calculated by the authors) were estimated in order to check the quality of the ANN results. The results obtained for the Wyllie–Rose equation were better than, but similar to, those for the Zawisza equation. The determination coefficient for K_LAB and K_WYLLIE–ROSE is 0.26. During the analysis, an attempt was also made to train a network using a division into layers derived from XRMI tool measurements, based on the resistivity contrast between layers. However, owing to the small number of samples, the results were unsatisfactory.
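
The Wyllie–Rose formula is not written out in this paper. The sketch below uses the form commonly quoted in log-interpretation textbooks, K = (C φ³ / S_wir)², with C = 79 for dry gas and C = 250 for medium-gravity oil, porosity and saturation as fractions and K in mD. Which form and constant were actually used in the study is not stated, so both are assumptions here.

```python
def wyllie_rose_permeability(phi: float, s_wir: float, c: float = 79.0) -> float:
    """Textbook Wyllie-Rose form: K = (c * phi^3 / S_wir)^2, K in mD.

    c = 79 (dry gas) or 250 (medium-gravity oil); the default is an assumption.
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("porosity must be a fraction in [0, 1]")
    if not 0.0 < s_wir <= 1.0:
        raise ValueError("irreducible water saturation must be a fraction in (0, 1]")
    return (c * phi**3 / s_wir) ** 2


# Hypothetical inputs chosen only to show the call.
k_oil_example = wyllie_rose_permeability(phi=0.10, s_wir=0.50, c=250.0)
```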

Based on the calculated statistics, the variability of the parameter with depth and the qualitative fit between the estimated permeability (K_ANN, K_ANN1, K_ANN2) and the input permeability (K_LAB, K_ZAWISZA), the best match was judged to be given by the networks created for the most detailed stratigraphy (K_ANN2), even though the calculated matching factors are not the best.

The FZI (Flow Zone Index) parameter was calculated from the estimated permeability using Eq. 4 and was then used as an input parameter in the creation of artificial neural networks based only on interpretation data from well logs. Interpretation logs were available for the whole Silurian depth interval and for part of the Ordovician. The FZI parameter characterizes the ability of the rock to transmit fluids through the pore space. Determining FZI classes allows the formation to be divided into intervals of similar hydraulic properties.

$${\text{FZI}} = 0.0314\frac{{\sqrt {\frac{K}{\phi }} }}{{\frac{\phi }{1 - \phi }}}$$
(4)
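
Eq. 4 can be written as the ratio of two intermediate quantities, which are conventionally called the reservoir quality index (RQI) and the normalized porosity in the flow-unit literature; these names do not appear in the text above and are used here only to structure the code. The sketch assumes K in mD and porosity as a fraction, which gives FZI in µm, and the inverse function is added to show how a permeability follows from a given FZI.

```python
import math


def flow_zone_index(k_md: float, phi: float) -> float:
    """Eq. 4: FZI = 0.0314 * sqrt(K / phi) / (phi / (1 - phi))."""
    if not 0.0 < phi < 1.0:
        raise ValueError("porosity must be a fraction in (0, 1)")
    if k_md < 0.0:
        raise ValueError("permeability must be non-negative")
    rqi = 0.0314 * math.sqrt(k_md / phi)  # numerator of Eq. 4
    normalized_porosity = phi / (1.0 - phi)  # denominator of Eq. 4
    return rqi / normalized_porosity


def permeability_from_fzi(fzi: float, phi: float) -> float:
    """Eq. 4 solved for K (in mD)."""
    if not 0.0 < phi < 1.0:
        raise ValueError("porosity must be a fraction in (0, 1)")
    normalized_porosity = phi / (1.0 - phi)
    return phi * (fzi * normalized_porosity / 0.0314) ** 2


# Hypothetical sample: K = 0.01 mD and 5% porosity.
fzi_example = flow_zone_index(k_md=0.01, phi=0.05)
k_back = permeability_from_fzi(fzi_example, phi=0.05)
```

Since FZI is itself computed from permeability and porosity, a network that receives both FZI and PHI as inputs has access to the permeability that was used to calculate FZI, which should be considered when judging its learning statistics.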

The calculated FZI served as one of the inputs for neural networks built on the data obtained from the petrophysical interpretation. The network was trained with a procedure similar to that used for the well logging data. The Multilayer Perceptron had the form MLP 9-6-1, with determination coefficients for the learning, test and validation sets of 0.79, 0.70 and 0.28, respectively. The result, absolute permeability K_ANN3, is presented in Fig. 3 and in Table 5. Again, the best match was obtained for the more detailed stratigraphy: K_ANN2.

Fig. 3
Comparison between estimated absolute permeability (K_ANN3) and from laboratory measurements (K_LAB)

Table 5 Basic statistics of absolute permeability (K_ANN3) estimated based on interpretation data and FZI parameter

The estimated absolute permeabilities were plotted on a single track, on which the results of laboratory measurements were superimposed (Fig. 4).

Fig. 4
Results of absolute permeability estimation by artificial neural network in well A. Symbols of Polish informal stratigraphic units: A—Ludlow, B—Wenlock, C—Llandovery, D—Ashgill, E—Caradoc

The analysis of the logs shows that, as indicated by the calculated statistics, the differences between the permeabilities estimated from well logs are small, and the best fit was found for K_ANN. Nevertheless, the ANN2 network was applied to the other wells because it had the best validation statistics and the difference in fit to K_LAB relative to K_ANN was slight. Networks trained on interpretation data were also applied to the other wells; however, the results were not satisfactory. The results of the applied networks are presented on the track in Fig. 5. Unfortunately, because no laboratory measurements on core samples were available for these wells, the results could not be compared with measured values; only a comparison with the permeability calculated from the Zawisza formula was possible.

Fig. 5
Results of absolute permeability estimation by artificial neural network in wells B and C. Symbols of Polish informal stratigraphic units as in Fig. 4

The correlation coefficient between the absolute permeability from the artificial neural networks and that from the Zawisza equation is lower than 0.5 for both wells. Nevertheless, because no laboratory measurements on core samples are available, the estimated permeability cannot be assessed accurately. In particular, the absolute permeability calculated from the Zawisza formula for well C seems to be overestimated in relation to the actual parameters of the formation in the studied area. The lack of success in applying the best ANN to wells B and C is attributed to the strong heterogeneity of reservoir parameters in the Silurian and Ordovician deposits and to active tectonics in the research area. Even small changes in diagenesis in the thin-layered mudstone and shale deposits cause large changes in petrophysical parameters such as effective porosity and absolute permeability.

Conclusions

The aim of the presented work was to assess the validity of using artificial neural networks to determine absolute permeability. Tests were carried out on data from Silurian and Ordovician shale and mudstone formations. These formations are characterized by poor reservoir parameters, such as effective porosity and absolute permeability, which made the task more difficult.

Different sets of variables were tested as input data. The best results were obtained for well logs combined with the results of well log interpretation: RHOB, NPHI, DT, LLD, LLS, MSFL, THOR, URAN, POTA, PE, VCL, VANH, VDOL, VLIM, VSAN, VPYR, VTOC, PHI, K, SWI.

The best network was ANN2, an MLP 16-10-1 that uses the Polish informal stratigraphic units as an input division. In both cases, networks trained on smaller depth intervals worked better. The determination coefficient between the absolute permeability derived from ANN2 and that measured with the gas permeameter is \(R^{2}\) = 0.72.

Unfortunately, because no laboratory measurements on core samples were available for wells B and C, an unambiguous and objective assessment of the results in these wells was impossible. A possible poor fit may result from the complicated tectonic structure of the research area. An attempt to train the network using measurements from the XRMI tool, which seems to be a promising direction, also failed because of the small number of samples.

Artificial neural networks can be used in petrophysical analysis because they provide information in the form of a log, draw on a larger amount of data and can be calibrated with laboratory measurements. ANN calculations are fast and relatively cheap in comparison with measurements, and they give good results for parameters that are nonlinearly related to well logs, such as permeability.