Introduction

As a property of a porous medium, permeability indicates the capability of the medium to host fluid flow. For a reservoir, higher permeability means a higher hydrocarbon production rate. Knowledge of the variations of this parameter across the reservoir zone is therefore key to determining the optimal locations and trajectories of production wells. Permeability is a function of porosity and of pore geometry and structure, and since the pore geometry is controlled by the effective confining stress, permeability may change, especially in the near-well zone, as production from the reservoir continues. Indeed, as the reservoir produces, the effective stress increases, making the reservoir rock more compact and less porous and ultimately lowering its permeability. During injection, the effective stress decreases and the opposite happens to the permeability. Studying the hydraulic behavior of a reservoir becomes even more complex for carbonate reservoirs, where heterogeneity is relatively high due to the presence of vuggy pores developed upon dissolution of the constituent minerals (Bust et al. 2011); this adds to the difficulty of predicting permeability accurately and of formulating a highly generalizable model for such reservoirs.

There are various methods for evaluating the permeability of a hydrocarbon reservoir. These fall into three broad classes: core studies, image analysis at different rock scales (X-ray imagery, focused ion beam imagery, and optical microscopy), and laboratory experiments (routine core analysis (RCAL)/special core analysis (SCAL)) (Al-Bazzaz and Al-Mehanna 2007; Ganat 2020; Alyafei 2021). These techniques are time-intensive and costly, as they require imaging efforts and laboratory experiments, and their results represent only a limited number of points along the well. Well-testing data have been used to obtain average reservoir permeability (Mohaghegh et al. 1995). Petrophysical assessment of acquired well logs, such as nuclear magnetic resonance (NMR) and dipole shear sonic imager (DSI) logs, has also been used to obtain reservoir permeability (Babadagli and Al-Salmi 2004). Although this latter method produces continuous permeability logs along the studied interval, the required analyses are time-consuming and expensive. Therefore, researchers have tried to relate permeability to other directly measured parameters. Two classes of methods have been developed on this basis: physics-based methods and machine learning methods.

In general, the main physics-based models used for predicting permeability include the index model, the Kozeny–Carman model, the Timur model, and the Herron model (Male et al. 2020; Huang et al. 2021). In these models, permeability is primarily treated as a function of effective porosity, with the function developed from experimental work on a set of cores. These methods suffer from serious limitations, including their inability to capture the effect of complex nonlinear stress–strain relations, their failure to consider the elastoplastic behavior of the rock under injection/withdrawal conditions, their inadequacy for unconventional reservoirs and the difficulty of determining particular parameters in such reservoirs, and their limited capability for predicting permeability in carbonate reservoirs with heterogeneous particle size distributions (Huang et al. 2021).

Thanks to their promising potential for modeling complex nonlinear relations and their flexibility regarding the number and type of input parameters, intelligent algorithms (IAs) have attracted attention for predicting permeability. Common machine learning methods can be categorized as simple machine learning methods (SMLMs) and hybrid machine learning methods (HMLMs). Focusing on SMLMs, researchers have used artificial neural networks (ANNs) in the form of multilayer perceptron (MLP) and radial basis function (RBF) networks (Ali and Chawathé 2000; Jamialahmadi and Javadpour 2000; Handhel 2009; Verma et al. 2012; Adeniran et al. 2019; Okon et al. 2020), deep learning (Zhong et al. 2019; Kamrava et al. 2020; Tian et al. 2020; Tang et al. 2022), fuzzy neural networks (Ahmadi et al. 2014; Elkatatny et al. 2018; Adeniran et al. 2019), the extreme learning machine (ELM) (Sitouah et al. 2014; Lin et al. 2015), the support vector machine (SVM) and least squares SVM (LSSVM) (Ahmadi et al. 2014; Adeniran et al. 2019; Mahdaviara et al. 2020a), the group method of data handling (Mathew Nkurlu et al. 2020; Kamali et al. 2022; Mulashani et al. 2022), random forest (RF) (Talebkeikhah et al. 2021), decision trees (Sitouah et al. 2014; Otchere et al. 2021; Talebkeikhah et al. 2021), and a variety of regression methods such as Gaussian process and multivariate linear regression (Gholami et al. 2014; Kamali et al. 2022). When it comes to HMLMs, metaheuristic optimization algorithms have been used to determine the hyperparameters of estimator algorithms during model training. Examples of such models include hybrid ANN-GA (genetic algorithm) and ANN-PSO (particle swarm optimization) (Al Khalifah et al. 2020; Farouk et al. 2021; Kardani et al. 2021; Matinkia et al. 2022b; Tian et al. 2022), PSO-XGBoost (Gu et al. 2021; Liu 2022), PSO-SVM (Akande et al. 2017; Mahdaviara et al. 2020a; Yin et al. 2020; Tian et al. 2022), hybrid ELM-PSO and ELM-GA (Mahdaviara et al. 2020a; Kardani et al. 2021), and hybrid RF-PSO and RF-GA (Wang et al. 2020; Tian et al. 2022). Table 1 summarizes the studies focusing on the prediction of permeability using HML and SML models.

Table 1 Published studies focusing on the prediction of permeability applying HML and SML models

Depending on the nature of the method used to assess rock permeability, different parameters can serve as input variables to a machine learning model (MLM). Laboratory core studies have attempted to use porosity, specific surface area of the pores, and residual water saturation for estimating absolute permeability (Gholinezhad and Masihi 2012; Mahdaviara et al. 2020b; Kamali et al. 2022), whereas field studies have used petrophysical logs for modeling permeability. Mulashani et al. (2022) used the gamma ray (GR), density (RHOB), effective porosity, shale volume, and neutron porosity (NPHI) logs for estimating permeability. In another work, P-wave travel time (DTCO), NPHI, formation resistivity (RT), GR, and RHOB logs were utilized to predict permeability (Farouk et al. 2021). Otchere et al. (2021) based their work on the caliper, DTCO, GR, NPHI, photoelectric (PEF), RHOB, and resistivity (RT) logs as well as the rate of penetration (ROP) to model permeability. Zhang et al. (2021) used the petrophysical rock-typing method (flow zone indicator (FZI) or FZI*) for estimating permeability from DTCO, RHOB, and GR logs.

A review of the studies on permeability modeling shows that formulating models that predict permeability with higher accuracy and generalizability remains an active field of research, since no single universal model has been achieved yet. Accordingly, the application of new modeling techniques, especially from the field of artificial intelligence with its proven superiority over empirical models, has been seen as a priority. In this respect, we used the multilayer ELM (MELM) for modeling permeability based on petrophysical logs. The MELM not only offers the benefits of the ELM, including a high convergence rate (Huang 2015) thanks to the absence of an iterative phase for optimizing the weights and biases (Huang et al. 2006, 2011; Ebtehaj et al. 2016), a simple learning process (Zhang et al. 2016), improved efficiency (Huang et al. 2006; Ding et al. 2014), good accuracy and high performance (Huang 2015; Huang et al. 2015), and limited need for optimization compared to the LSSVM (Zhang et al. 2016), but also remedies the weakness of the ELM in handling big data by devising multiple hidden layers (Huang et al. 2015; Zhang et al. 2016). On the other hand, due to the random initialization of weights and biases in the ELM, it is susceptible to getting stuck in local optima (Cao et al. 2012). Since the MELM inherits this limitation, the present research hybridizes the MELM with three optimization algorithms (the cuckoo optimization algorithm (COA), PSO, and GA) to reach the global optimum.

Based on the literature review performed (see Table 1), four major differences between the present study and previously published studies can be distinguished: (1) as can be seen from Table 1, the CNN and MELM algorithms have not so far been applied to the prediction of permeability; here, the improved version of the SVM, the LSSVM, was also used as a predictor algorithm and its prediction performance was compared with those of the MELM and CNN models; (2) simple forms of learning algorithms have often been used to predict permeability, although previous studies have shown that hybridizing optimization and predictor algorithms leads to improved results; (3) the input parameters used in the present study, unlike in some previous studies, are among the most common logs, which are recorded in practically any well, so the proposed models can be readily used; (4) most of the models developed in previous studies were based on laboratory permeability measurements on core samples. Core samples expand after being released from the in-situ stress, so the permeability measured in the laboratory is not fully representative of rock permeability under reservoir conditions. In addition, the core samples used in the laboratory are essentially free of natural fractures and vugs. Therefore, models developed from core permeability measurements are typically associated with some degree of uncertainty, especially in highly heterogeneous carbonate reservoirs.

Research methodology

Workflow of permeability prediction methodology

Figure 1 diagrammatically illustrates the methodology applied in the present research for modeling permeability. This methodology comprises eight steps, which are briefly explained in the following:

Fig. 1

Diagrammatic workflow representing the eight-step methodology used for developing, testing, and verifying intelligent models to predict permeability

Step 1: The methodology starts by compiling two banks of data, one for developing and testing the permeability prediction models (well A data) and one for evaluating and verifying the prediction performance of the best-performing model (well B data). The compiled databanks contained eight petrophysical parameters recorded versus depth, namely depth, caliper (HCAL), spectral gamma ray (CGR), photoelectric factor (PEF), neutron porosity (NPHI), density (RHOB), formation resistivity (RT), and compressional wave travel time (DTCO).

Step 2: In the second step, data cleaning is performed to identify and eliminate outliers in the well A dataset using the Tukey method. The procedure applied and the necessity of data preprocessing (cleaning) are explained below.

A high-accuracy, high-generalizability model cannot be achieved unless a powerful modeling algorithm is coupled with high-quality data. This implies that a preprocessing step must be applied to the input data before the actual modeling of permeability. It begins by evaluating the data quality and attenuating the effect of noise on the data. Noise attenuation techniques can be broadly categorized into two classes: noise suppression methods and outlier elimination techniques. With the former, the count of data points remains the same before and after the process; proper implementation requires identifying the share of noise in the measurements, which is a function of the measurement environment, the measurement device quality, and the operator's experience. With the latter, data points that significantly diverge from the rest of the data are identified as outliers and eliminated from the dataset, so the number of data points becomes smaller. Since the first approach required more information than was available in this study, we used the Tukey method to identify and eliminate outliers. To this end, one calculates the first and third quartiles (Q1 and Q3, respectively) of each studied parameter and evaluates the inter-quartile range (IQR) via Eq. (1). In this equation, the value of k is chosen according to the data collection conditions and the importance of the problem at hand: the smaller the value of k, the more conservative the fences and the more data points are flagged as outliers. Based on an evaluation of the data and a sensitivity analysis (see Appendix), k was set to 1.3. Next, the lower inner fence (LIF) and upper inner fence (UIF) are obtained by subtracting the IQR from Q1 and adding it to Q3, as expressed in Eqs. (2) and (3), respectively. Finally, data points with values below the LIF or above the UIF are flagged as outliers and omitted from the rest of the analysis; a minimal implementation sketch is given after Eq. (3).

$${\text{IQR}} = k \times \left( {Q_{3} - Q_{1} } \right)$$
(1)
$${\text{LIF}} = Q_{1} - {\text{IQR}}$$
(2)
$${\text{UIF}} = Q_{3} + {\text{IQR}}$$
(3)
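As an illustration, the following is a minimal sketch of this outlier screening in Python, assuming pandas and a hypothetical DataFrame `logs` holding the eight petrophysical columns; it is not the authors' original code.

```python
import pandas as pd

def tukey_inlier_mask(df: pd.DataFrame, k: float = 1.3) -> pd.Series:
    """Return True for rows lying inside the Tukey inner fences of every column.

    k is the fence coefficient of Eq. (1); the study settled on k = 1.3
    after a sensitivity analysis (see Appendix).
    """
    q1 = df.quantile(0.25)
    q3 = df.quantile(0.75)
    iqr = k * (q3 - q1)            # Eq. (1)
    lif = q1 - iqr                 # Eq. (2): lower inner fence
    uif = q3 + iqr                 # Eq. (3): upper inner fence
    return (df.ge(lif) & df.le(uif)).all(axis=1)

# Hypothetical usage: keep only inlier rows of the well A databank.
# cleaned = logs[tukey_inlier_mask(logs)]
```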

Step 3: In the third step, the cleaned databank of well A was randomly divided into two sets: a training set for training the permeability prediction models, containing 70% of the data records, and a testing set of unseen data for testing the trained models, containing the remaining 30%. Since the hyperparameters are initialized randomly and the division of the data into training and testing subsets is also performed randomly, each run of the algorithm leads to a different solution. Accordingly, to reduce the effect of these random factors on the final outcome, we used tenfold cross-validation (10FCV). This procedure divides the original dataset into 10 equally sized clusters and runs the estimator algorithm 10 times, each time taking a different cluster as testing data and the remaining 9 clusters as training data. As the final output, it reports the average error over the 10 runs, as sketched below.
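The following is a minimal sketch of this 10FCV loop, assuming scikit-learn and NumPy; `model_factory` is a hypothetical callable returning a fresh, untrained estimator and is not part of the original study.

```python
import numpy as np
from sklearn.model_selection import KFold

def ten_fold_rmse(model_factory, X, y, seed=42):
    """Average RMSE over 10 folds; each fold serves once as the test set."""
    kf = KFold(n_splits=10, shuffle=True, random_state=seed)
    rmses = []
    for train_idx, test_idx in kf.split(X):
        model = model_factory()                 # fresh estimator per fold
        model.fit(X[train_idx], y[train_idx])
        residual = y[test_idx] - model.predict(X[test_idx])
        rmses.append(np.sqrt(np.mean(residual ** 2)))
    return float(np.mean(rmses))                # averaged over the 10 runs
```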

Step 4: In the fourth step, to remove the effect of the differing scales of the data, which can adversely affect the weights and biases as well as the feature selection results, all data are normalized before feature selection. For this purpose, the minimum and maximum values (Zmin and Zmax, respectively) of each feature (Z) are calculated, and normalization is performed via Eq. (4) to yield values in the range [-1, 1].

$$Z_{{{\text{norm}}}} = 2\left( {\frac{{Z - Z_{{{\text{min}}}} }}{{Z_{{{\text{max}}}} - Z_{{{\text{min}}}} }}} \right) - 1$$
(4)
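A minimal NumPy sketch of Eq. (4) follows; returning the training minima and maxima is an implementation choice assumed here so that unseen data (e.g., from well B) can be scaled with the well A statistics.

```python
import numpy as np

def normalize_unit_range(Z: np.ndarray):
    """Scale each column of Z to [-1, 1] per Eq. (4)."""
    z_min = Z.min(axis=0)
    z_max = Z.max(axis=0)
    Z_norm = 2.0 * (Z - z_min) / (z_max - z_min) - 1.0
    return Z_norm, z_min, z_max   # keep the stats to rescale unseen data

def apply_unit_range(Z_new: np.ndarray, z_min, z_max):
    """Apply a previously fitted scaling to new data."""
    return 2.0 * (Z_new - z_min) / (z_max - z_min) - 1.0
```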

Step 5: As the second preprocessing step on the input data, selecting a subset of the most effective input parameters can significantly reduce the processing time compared with taking all features as input to an estimator model. In addition, using less noisy features translates into higher modeling accuracy, so redundant features can be eliminated from further analysis. Therefore, following data normalization, a feature selection technique is used in the fifth step to identify the most effective set of features to pass to the permeability modeling.

Numerous methods have been presented for this purpose, which can broadly be classified into three classes: filters, wrapper methods, and embedded techniques. With a filter, feature selection is based on a statistical index, such as the correlation coefficient. A wrapper method, by contrast, performs the modeling with different sets of features and compares the results to choose the best subset, which generally makes wrappers slower than filters. An embedded approach combines the mechanisms of filters and wrappers. Osman et al. (2018) showed that wrapper techniques are generally more accurate than filters. Therefore, in the present research, we combined the MLP with the nondominated sorting genetic algorithm II (NSGA-II) to obtain a wrapper procedure for feature selection. In this hybrid algorithm, minimization of the number of selected features and of the root-mean-square error (RMSE) of the permeability estimator were set as the objectives of the NSGA-II. The algorithm starts by randomly selecting a set of features and passing it to the MLP, which runs a training and testing process and feeds the resulting modeling error back to the NSGA-II. Using the crossover and mutation operators in successive iterations, the NSGA-II introduces new feature sets into the MLP that lead to lower errors than in the previous iteration. This iterative process continues until a predefined stopping criterion is met. Finally, the hybrid MLP-NSGA-II provides the user with a set of solutions from which the most appropriate one can be selected considering the problem conditions and constraints (a simplified wrapper sketch is given below). To avoid repeating the relevant explanations of these algorithms, readers are referred to previously published studies (Deb et al. 2002; Yegnanarayana 2009; Yusoff et al. 2011; Anemangely et al. 2017, 2018, 2019; Matinkia et al. 2022a).
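To illustrate the wrapper idea (not the authors' actual MLP-NSGA-II code), the sketch below brute-forces every subset of up to four features and scores each with a small MLP under 10FCV; with only eight candidate logs, this single-objective simplification is tractable, whereas the NSGA-II explores the feature-count/error trade-off multi-objectively. The (5, 4) hidden-layer sizes mirror the MLP structure reported later in this paper.

```python
from itertools import combinations
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor

def wrapper_select(X, y, feature_names, max_features=4):
    """Exhaustive wrapper search: score every feature subset with an MLP
    under 10-fold CV and return the subset with the lowest RMSE."""
    best_rmse, best_cols = np.inf, None
    for r in range(1, max_features + 1):
        for cols in combinations(range(X.shape[1]), r):
            mlp = MLPRegressor(hidden_layer_sizes=(5, 4),
                               max_iter=2000, random_state=0)
            rmse = -cross_val_score(
                mlp, X[:, cols], y, cv=10,
                scoring="neg_root_mean_squared_error").mean()
            if rmse < best_rmse:
                best_rmse, best_cols = rmse, cols
    return [feature_names[c] for c in best_cols], best_rmse
```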

Step 6: In this step, the MELM and LSSVM predictor models, hybridized with the metaheuristic algorithms COA, PSO, and GA, were applied to the training data. In addition to these hybrid methods, we trained standalone LSSVM and convolutional neural network (CNN) models on the same training data. To this end, the most appropriate structure of the MELM and the most efficient kernel function of the LSSVM were first determined. In total, eight models were trained to predict absolute permeability from conventional well-logging data. The hybridization of the predictor and optimizer algorithms is described in detail in Sect. 2.2, and brief theoretical descriptions of these algorithms are given in Appendices 2, 3, and 4.

Step 7: In this step, the trained models were evaluated on the testing data of well A in terms of accuracy and generalizability of permeability prediction, using five statistical error metrics (Sect. 2.3). Comparing the errors achieved by the eight permeability prediction models, the best model (i.e., the one with the lowest RMSE and the highest R2) was identified.

Step 8: Since the generalizability of intelligent predictive models determines their practical applicability, the accuracy and generalizability of the best-performing permeability prediction model were then assessed on a new, unseen databank from another well (well B) drilled in the same field as well A, applying the error metrics presented in Sect. 2.3.

Hybridization of predictor and optimizer algorithms

Not only the proper selection of an architecture for the MELM and a kernel function for the LSSVM, but also the optimal setting of the hyperparameters of these algorithms, plays a crucial role in the accuracy and generalizability of the developed model. Metaheuristic optimization algorithms have been shown to achieve better accuracy than classic optimizers (Özkaraca 2018). Therefore, once a proper structure for the MELM and a suitable kernel function for the LSSVM had been determined, we used COA, PSO, and GA to determine optimal values of the hyperparameters. For this purpose, the hyperparameters of the predictor algorithms were defined as the decision variables of the optimizer algorithms. On this basis, at each iteration, the optimizer algorithm proposes a set of hyperparameter values for the MELM and LSSVM. The error of modeling with these values on the training data is then fed back to the optimizer, which adjusts the hyperparameters so as to reduce the modeling error at the next iteration. This iterative process continues until the stopping criterion (usually a maximum number of iterations) is satisfied. The MELM and LSSVM models obtained at the final iteration of the optimizer algorithm were then applied to the testing data. This process is shown schematically in Fig. 2.
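As an illustration of this loop (a textbook PSO with fixed inertia and acceleration coefficients, not the exact settings or the COA/GA variants used in this study), the sketch below searches a bounded hyperparameter space for the vector minimizing a user-supplied training-error function `score_fn`:

```python
import numpy as np

def pso_tune(score_fn, lb, ub, n_particles=30, n_iter=200, seed=0):
    """Minimize score_fn (e.g., training RMSE of MELM or LSSVM) over the
    box [lb, ub] with a basic particle swarm; returns best params and score."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    x = rng.uniform(lb, ub, size=(n_particles, lb.size))
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_f = np.array([score_fn(p) for p in x])
    gbest = pbest[np.argmin(pbest_f)].copy()
    for _ in range(n_iter):                      # stopping criterion: max iterations
        r1, r2 = rng.random((2, *x.shape))
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = np.clip(x + v, lb, ub)               # keep particles inside the bounds
        f = np.array([score_fn(p) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        gbest = pbest[np.argmin(pbest_f)].copy()
    return gbest, float(pbest_f.min())
```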

Fig. 2

Hybridization of predictor and optimizer algorithms for modeling the permeability

Validation of models

The trained models were validated by applying them to unseen data (30% of the entire dataset) and analyzing the modeling error and R2. To this end, the percent deviation (PD) at each data point (i) was calculated from the measured (\({\text{Perm}}_{\text{measured}}\)) and predicted (\({\text{Perm}}_{\text{predicted}}\)) values of permeability via Eq. (5). The PD values were then used to calculate the average percent deviation (APD) and average absolute percent deviation (AAPD) for each model using Eqs. (6) and (7), respectively. Using the error (Er) at each data point and the mean error (Ermean) over all data points (n), the standard deviation (SD) was obtained via Eq. (8). The RMSE and R2 were obtained from Eqs. (9) and (10), respectively. It is worth noting that these validation criteria were also used in the training stage.

$${\text{PD}}_{i} = \frac{{\text{Perm}}_{{\text{measured}}_{i}} - {\text{Perm}}_{{\text{predicted}}_{i}}}{{\text{Perm}}_{{\text{measured}}_{i}}} \times 100$$
(5)
$${\text{APD}} = \frac{{\mathop \sum \nolimits_{i = 1}^{n} {\text{PD}}_{i} }}{n}$$
(6)
$${\text{AAPD}} = \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left| {{\text{PD}}_{i} } \right|}}{n}$$
(7)
$${\text{SD}} = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {{\text{Er}}_{i} - {\text{Er}}_{{{\text{mean}}}} } \right)^{2} }}{n - 1}}$$
(8)
$${\text{RMSE}} = \sqrt{\frac{1}{n}\sum_{i = 1}^{n} \left( {\text{Perm}}_{{\text{measured}}_{i}} - {\text{Perm}}_{{\text{predicted}}_{i}} \right)^{2}}$$
(9)
$$R^{2} = 1 - \frac{\sum_{i = 1}^{n} \left( {\text{Perm}}_{{\text{measured}}_{i}} - {\text{Perm}}_{{\text{predicted}}_{i}} \right)^{2}}{\sum_{i = 1}^{n} \left( {\text{Perm}}_{{\text{measured}}_{i}} - \frac{1}{n}\sum_{j = 1}^{n} {\text{Perm}}_{{\text{measured}}_{j}} \right)^{2}}$$
(10)
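For concreteness, a NumPy sketch of Eqs. (5)-(10) is given below; the dictionary layout is an assumption for illustration, not the authors' code.

```python
import numpy as np

def validation_metrics(measured, predicted):
    """Compute the PD-based and error-based criteria of Eqs. (5)-(10)."""
    measured = np.asarray(measured, float)
    predicted = np.asarray(predicted, float)
    n = measured.size
    pd_i = (measured - predicted) / measured * 100            # Eq. (5)
    err = measured - predicted                                # Er at each point
    return {
        "APD":  pd_i.mean(),                                  # Eq. (6)
        "AAPD": np.abs(pd_i).mean(),                          # Eq. (7)
        "SD":   np.sqrt(((err - err.mean()) ** 2).sum() / (n - 1)),  # Eq. (8)
        "RMSE": np.sqrt((err ** 2).mean()),                   # Eq. (9)
        "R2":   1.0 - (err ** 2).sum()
                / ((measured - measured.mean()) ** 2).sum(),  # Eq. (10)
    }
```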

Studied data

In this study, we used a full suite of log data including depth, caliper (HCAL), spectral gamma ray (CGR), photoelectric factor (PEF), neutron porosity (NPHI), density (RHOB), formation resistivity (RT), and compressional wave travel time (DTCO). The logs were acquired across the Fahlian Formation in two wells of a hydrocarbon field in southwestern Iran. For confidentiality, the wells are referred to as A and B. Over the studied depth intervals, a total of 1530 and 560 data points were acquired at a spacing of 0.1524 m for wells A and B, respectively. Given the high heterogeneity of the Fahlian Formation, which is composed of limestone, formation microimager (FMI) data were additionally used for estimating permeability. The obtained permeability was then corrected for water saturation and calibrated against laboratory data from core analyses.

Figures 3 and 4 show the variations of the petrophysical logs and permeability over the studied depth range of the Fahlian Formation for wells A and B, respectively. As seen in Fig. 3, in the depth intervals where washouts are evident (indicated by a sudden increase in the caliper readings, e.g., the 4400–4420 m interval), the petrophysical log readings are strongly affected. This underlines the necessity of a preprocessing step for noise attenuation. Tables 2 and 3 report five statistical indicators for the data collected from the Fahlian Formation at wells A and B, respectively. Since well A contains more data points than well B, and the ranges of the parameter values in well A are wider than and encompass those of well B, well A was used for modeling (training and testing) and well B for validating the generalizability of the trained models.

Fig. 3

Variations of petrophysical log readings and permeability over the studied depth interval at the studied well A

Fig. 4

Variations of petrophysical log readings and permeability over the studied depth interval at the studied well B

Table 2 Statistics on the collected data over the studied depth interval at the studied well (Matinkia et al. 2022b)
Table 3 Statistics on the collected data over the studied depth interval at the studied well B

Results and discussion

In this section, the development and evaluation of the permeability prediction models are discussed first, followed by an evaluation of the generalizability of the proposed models on the validation well (well B). Finally, to assess how the detection and elimination of outliers with the Tukey method enhances the permeability prediction performance of the proposed model, the best-performing algorithm is trained on the uncleaned primary data of well A (data containing outliers), and the resulting model is applied to the data of well B to predict permeability.

Developing and evaluating permeability prediction models

First, the data were screened with the Tukey method to address outliers, which led to the elimination of 261 data points recognized as outliers. Following the outlier elimination stage, the relationship between the input parameters and the target parameter was examined based on the heat map of the correlation matrix (Fig. 5). As seen in Fig. 5, DTCO and NPHI were the most strongly positively correlated with permeability. Although an increase in PEF and depth was also associated with higher permeability, their correlation coefficients were smaller than those of DTCO and NPHI. At the other end of the spectrum, RHOB exhibited the strongest inverse correlation with permeability among the petrophysical logs over the studied depth interval. Although higher values of RT, CGR, and HCAL were also associated with lower permeability, the correlation was much stronger for RHOB.
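Such a heat map can be produced in a few lines; the sketch below assumes pandas, seaborn, and Matplotlib and a hypothetical DataFrame `cleaned` holding the outlier-free logs plus permeability.

```python
import matplotlib.pyplot as plt
import seaborn as sns

corr = cleaned.corr()                       # pairwise Pearson coefficients
sns.heatmap(corr, annot=True, fmt=".2f", cmap="RdBu_r", vmin=-1, vmax=1)
plt.title("Correlation of petrophysical logs with permeability")
plt.tight_layout()
plt.show()
```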

Fig. 5

Heat map of the correlation matrix of petrophysical logs and permeability over the studied depth interval at the studied well

In order to find the best set of features using the MLP-NSGA-II, one should begin by finding a proper architecture (numbers of layers and neurons per layer) for the MLP. This was done by trial and error: several models were built with different numbers of layers (from 1 to 3) and an equal number of neurons per layer, taking the entire set of available petrophysical logs as input. The validation results indicated that an MLP with two layers provides higher accuracy than those with either one or three layers. Therefore, a sensitivity analysis was conducted on the two-layer MLP to find the optimal number of neurons in each layer. For this purpose, taking all petrophysical logs as input, permeability was modeled with two-layer MLPs with different numbers of neurons (from 1 to 8) per layer. Notably, 10FCV was used to obtain reliable results.

Figure 6 shows the modeling error of the different models in the form of a contour map. The figure suggests that the RMSE decreases as the number of neurons per layer increases; beyond a certain point, however, further increases in the number of neurons have the opposite effect, causing the RMSE to rise. The lowest error was obtained for the model with 5 and 4 neurons in the first and second layers, respectively. Accordingly, the MLP with this structure was combined with the NSGA-II to form the hybrid model for feature selection.

Fig. 6

Variations of RMSE with the number of neurons in the first and second layers of the MLP neural network

After optimizing the MLP structure, one should further tune the control parameters of the NSGA-II so as to achieve a high accuracy level. For this, the MLP-NSGA-II was run with different values of the mutation and crossover rates. Here again, 10FCV was used to split the dataset into training and testing subsets so as to ensure reliable results. Based on these analyses, the best values of the mutation and crossover rates were found to be 0.04 and 0.52, respectively. Moreover, the population size and the maximum number of iterations of the NSGA-II were set to 90 and 70, respectively.

Figure 7 depicts the variations of RMSE and R2 when applying the MLP-NSGA-II algorithm to the training data for modeling permeability. As the figure suggests, R2 increased sharply as the number of input features rose from 1 to 3, increased only slightly from 3 to 6 features, and was essentially unchanged for 6 or more features. Likewise, the error dropped rapidly as the number of input features increased from 1 to 4, while the reduction was insignificant beyond 4 features. Since R2 is susceptible to outliers, we opted for RMSE as the final criterion for selecting the optimum number of inputs, which was thus set to 4. Table 4 presents the best combinations of features for various counts of input variables to the MLP-NSGA-II. Given the optimum of 4 input features, the rest of the modeling was performed with DTCO, NPHI, RHOB, and RT as input features.

Fig. 7

Variations of error and R2 for modeling permeability with the MLP using the optimum feature sets for various counts of input features to the NSGA-II

Table 4 The best sets of features for modeling permeability for various counts of input features to the NSGA-II (the selected set of features for modeling is shown in bold)

In order to apply the hybridized forms of the MELM to the data of the selected features, one should begin by determining the optimal structure (i.e., the numbers of hidden layers and of neurons in each layer) of the MELM so as to achieve the model with the lowest prediction error. This was done here by a trial-and-error procedure, which showed that a structure with three hidden layers containing 6, 5, and 7 neurons in the first to third layers, respectively, yielded the lowest errors. The biases and weights of the MELM with this structure were then optimized by the metaheuristic optimization algorithms. Prior to applying the hybrid LSSVM-based algorithms to the data of the selected features, the most appropriate kernel for the LSSVM had to be determined. To this end, the LSSVM algorithm was applied to the training data with different kernels (linear (Lin), polynomial (Poly), MLP, and RBF).
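For orientation, the sketch below shows a simplified multilayer-ELM forward pass with sigmoid hidden layers and a closed-form (regularized least-squares) output layer; in this study the hidden weights and biases are the decision variables handed to COA/PSO/GA rather than drawn once at random, so treat this purely as an illustrative skeleton.

```python
import numpy as np

def melm_forward(X, hidden_layers):
    """Propagate X through sigmoid hidden layers; `hidden_layers` is a list
    of (W, b) pairs, e.g., with widths 6, 5, and 7 as in this study."""
    H = X
    for W, b in hidden_layers:
        H = 1.0 / (1.0 + np.exp(-(H @ W + b)))
    return H                                   # last hidden representation

def solve_output_weights(H_last, y, reg=1e-6):
    """ELM-style closed-form output layer: ridge least squares, the step
    that avoids iterative weight optimization and speeds up training."""
    n_hidden = H_last.shape[1]
    return np.linalg.solve(H_last.T @ H_last + reg * np.eye(n_hidden),
                           H_last.T @ y)

# Prediction: melm_forward(X_new, hidden_layers) @ beta
```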

Figure 8 shows a bar plot of the RMSE values obtained with the LSSVM using the different kernel functions. The figure suggests that the LSSVM with the RBF kernel produced the smallest error; therefore, this kernel function was used for modeling permeability with the LSSVM. Next, both the regularization parameter \(\gamma\) and \({\sigma }^{2}\) (the variance of the RBF kernel) had to be optimized by the metaheuristic optimizers considered, since these adjustable parameters control the generalizability, learning power, and predictive power of the formulated LSSVM.
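A minimal NumPy sketch of LSSVM regression training in its standard (Suykens-type) dual form is given below to show where \(\gamma\) and \(\sigma^2\) enter; the kernel convention \(K(x,z)=\exp(-\lVert x-z\rVert^2/\sigma^2)\) is one common choice, and the code is an illustration rather than the authors' implementation.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)   # squared distances
    return np.exp(-d2 / sigma2)

def lssvm_train(X, y, gamma, sigma2):
    """Solve the LSSVM dual linear system
        [0    1^T    ] [b]     [0]
        [1    K + I/g] [a]  =  [y]
    where gamma (g) and sigma2 are the hyperparameters tuned by COA/PSO/GA."""
    n = y.size
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]                                 # bias b, alphas

def lssvm_predict(X_new, X_train, b, alpha, sigma2):
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b
```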

Fig. 8

Bar plot of modeling error for the LSSVM models with different kernel functions when applied to the training data

Once the optimal structure of the MELM and the proper kernel function of the LSSVM were determined, the adjustable parameters of COA, PSO, and GA had to be set for the problem at hand before hybridizing them with the MELM and LSSVM. Once again, a trial-and-error procedure was adopted to set these parameters; their resulting values are reported in Table 5. In the relevant analyses, the optimization algorithms converged to a solution in fewer than 160 iterations on average. Given this finding, we limited the number of iterations to 200.

Table 5 Determined values of adjustable parameters of the optimizer algorithms for modeling the permeability following a trial-and-error approach

Figure 9 shows the variations of RMSE over the iterations of the hybrid MELM and LSSVM algorithms for modeling permeability on the training data. As the figure suggests, except for the LSSVM-GA, the variations in error are very limited beyond the 100th iteration. On average, the convergence rate was higher with the COA than with the other two optimization algorithms. Among the MELM-based hybrid models, MELM-COA produced the most accurate results; among the LSSVM-based hybrid models, the highest accuracy was likewise obtained with the LSSVM-COA. As a general conclusion, the MELM-based hybrid models produced smaller errors than the LSSVM-based hybrids in predicting permeability from the training data, with the MELM-COA achieving the highest accuracy.

Fig. 9

Variations of RMSE versus iteration number for the hybrid (a) MELM and (b) LSSVM algorithms for modeling the permeability using the training data

Shown in Fig. 10 are the cross-plots of measured and predicted permeabilities on the training data for the hybrid models and for the standalone LSSVM and CNN. As this figure suggests, for all algorithms, the best-fit line to the data points closely coincides with the Y = T line, indicating that the developed models are, on average, free of overestimation or underestimation problems. Small deviations between the two lines exist at low or high permeability values, especially for the CNN model, but these are subtle and negligible. The two lines coincide best for the MELM-COA model, which also exhibits the least dispersion of the data points, i.e., the points are well concentrated around the Y = T line. Dispersion was higher with the simple and hybrid forms of the LSSVM than with the MELM-based hybrid models, suggesting that the LSSVM-based models were generally less accurate. Table 6 quantitatively compares the error and R2 values of the hybrid models and the simple LSSVM and CNN. This table indicates that the MELM-COA model is more accurate than the other models. Moreover, the hybrid forms of the LSSVM showed smaller RMSE values than the simple LSSVM, reflecting the power of the metaheuristics for optimizing the hyperparameters.

Fig. 10

Cross plots of measured and predicted permeabilities based on the training data using the hybrid MELM and LSSVM models and the standalone LSSVM and CNN. a MELM-COA b MELM-PSO c MELM-GA d LSSVM-COA e LSSVM-PSO f LSSVM-GA g LSSVM h CNN

Table 6 Analyzing the results of different models for predicting the permeability from the training data using different error indices and R2

Figure 11 shows the cross-plots of measured and predicted permeabilities on the testing data for the hybrid LSSVM and MELM models as well as the standalone CNN and LSSVM. As this figure suggests, the dispersion of the points around the Y = T line is smaller with the MELM-COA and CNN models than with the other models, indicating their superior accuracy. A closer comparison of these two models shows that the underestimation of low permeability values is more pronounced with the CNN model than with the MELM-COA model. This highlights that, even in the testing stage, the MELM-COA model tends to produce smaller errors than the other models. The quantitative evaluation of error and R2 in the testing stage, presented in Table 7, further confirms the higher accuracy of the MELM-COA compared with the other models and indicates its wider generalizability.

Fig. 11

Cross plots of measured and predicted permeabilities based on the testing data using the hybrid MELM and LSSVM and standalone LSSVM and CNN techniques. a MELM-COA b MELM-PSO c MELM-GA d LSSVM-COA e LSSVM-PSO f LSSVM-GA g LSSVM h CNN

Table 7 Analyzing the results of different models for predicting the permeability from the testing data using different error indices and R2

Figure 12 shows histograms of the modeling errors of the various models on the testing data, together with fitted normal distribution functions. It is clear from this figure that the MELM-COA produces the lowest average error (\(\mu\)) of all the models. Its lower standard deviation likewise indicates a more limited dispersion of the errors around the mean. Since a fitted error distribution that is closer to a zero-mean, low-variance normal distribution reflects better model performance, it can be expected that the MELM-COA model will outperform the other models over similar depth intervals in other wells of the same field.

Fig. 12

Histogram of permeability modeling error for the testing data along with best-fit normal distribution function (red line). a MELM-COA b MELM-PSO c MELM-GA d LSSVM-COA e LSSVM-PSO f LSSVM-GA g LSSVM h CNN

Generally, for each of the metaheuristic optimization algorithms, the hybrid forms of the MELM yield lower errors than the corresponding hybrid forms of the LSSVM. Considering the sensitivity of the LSSVM to outliers and the superior characteristics of the MELM relative to the LSSVM (see the advantages/disadvantages of the predictor algorithms in Appendix 4), such better predictions from the hybrid forms of the MELM were to be expected. Comparing the results across the optimization algorithms shows that the error with the COA is generally lower than with the other two algorithms. This result is consistent with the poor performance of GA and PSO in local search (Zhuo and Yu 2019; El-Mihoub et al. 2021) and with the balance between local and global search maintained by the COA (Mareli and Twala 2018).

Figure 13 compares the measured permeability with the values predicted by the MELM-COA model for all the data of well A (including the outliers). As can be seen in Fig. 13, the MELM-COA model accurately reproduced the trend of permeability changes along the studied depth interval of well A. However, it underestimated the high permeability values, and overestimated the low permeability values, that had been detected and eliminated as outliers in the preprocessing stage. The RMSE and coefficient of determination obtained when applying the MELM-COA model developed on the clean well A data (without outliers) were 1.9525 and 0.9921, respectively.

Fig. 13

Comparison of measured and predicted permeability for the complete dataset (with outliers) using the MELM-COA model developed on the outlier-free data over the studied depth range of well A

Validation of the trained models

To evaluate the generalizability of the models, the trained models were applied to the data of well B to estimate permeability. Table 8 compares the error values and coefficients of determination obtained by applying the permeability prediction models to the validation data. As can be seen in this table, the MELM-COA algorithm predicted the permeability over the studied depth interval of well B with the least error, although the other models also produced satisfactory predictions. These findings confirm the high accuracy and generalizability of the proposed models in predicting permeability on the unseen data of well B.

Table 8 Comparison of prediction results for the models proposed for predicting the permeability from the validation data of well B with respect to four error indices and R2

The cross-plot of the measured and predicted permeability values obtained with the MELM-COA model is shown in Fig. 14. As can be seen, the MELM-COA model showed outstanding capability in predicting the permeability values across the studied interval of the validation well, as witnessed by the absence of systematic under- or overestimation in its predictions. Figure 15 compares the measured and predicted values of the MELM-COA model over the studied depth interval of well B. These results further confirm the strong prediction performance of the proposed MELM-COA, distinguished by the close match between the predicted and measured permeability values.

Fig. 14

Cross-plot of the measured and predicted values of permeability achieved by the MELM-COA model in the studied depth interval of well B

Fig. 15

Comparison of measured and predicted values of permeability for the MELM-COA model across the studied depth interval of well B (owing to the wide range of permeability values in the studied interval, the vertical axis is logarithmic for a clearer presentation of the results)

Conclusions

In this study, absolute permeability was modeled by integrating MELM and LSSVM, as estimator algorithms, with COA, PSO, and GA, as optimizer algorithms. For comparison, modeling was also performed with the standalone LSSVM and a CNN. We began by evaluating a dataset comprising petrophysical logs (HCAL, CGR, PEF, NPHI, RHOB, RT, and DTCO) and permeability measurements from well A, which penetrates a carbonate reservoir, identifying outliers with the Tukey technique and removing them from the input data. Next, the NSGA-II was applied to the training data (70% of the entire dataset, i.e., 888 data points) to select the most appropriate features. After the algorithms had been applied to the training data, the developed models were applied to the testing data (381 data points). To assess the generalizability of the developed models, they were used to predict permeability in another well (well B) of the same field and formation. The following conclusions were drawn:

  • The Tukey technique recognized 261 data points as outliers. Accordingly, these points were omitted for the rest of the analysis.

  • Applying the NSGA-II algorithm to select the features with the highest degree of influence on the target parameter, DTCO, NPHI, RHOB, and RT were identified as the most influential input features.

  • Testing the hybrid methods as well as the conventional LSSVM and CNN on the training data, it was found that the MELM-COA algorithm yields the smallest RMSE (0.5600 mD) and the highest R2 (0.9931) of all the models.

  • The MELM-based hybrid models presented smaller errors than those of the LSSVM-based hybrid models.

  • The hybrid LSSVM-based models showed smaller errors than the simple LSSVM, indicating the better performance of the metaheuristic algorithms relative to classic optimizers.

  • Evaluation of the MELM-COA model on the testing data showed that it achieves the lowest RMSE (0.6019) and the highest R2 (0.9919).

  • The deep learning model exhibited a comparable performance to that of the MELM-COA model in both the training and testing stages.

  • Given the high accuracy and generalizability of the proposed MELM-COA model, this robust hybrid model can be recommended for predicting the permeability in other wells of the field studied.