Introduction

The application of geomechanics in the petroleum industry can lead to significant economic gains and optimized operations. In geomechanical studies, parameters such as elastic properties and rock strength are needed for analyses of wellbore stability, wellbore collapse, hydraulic fracturing, and sand production (Gholami et al. 2014).

In general, geomechanical (GM) parameters are obtained in two ways: laboratory tests and in-well measurements. In the laboratory, strength and elastic properties are measured on cores using the necessary equipment (Sanei et al. 2013, 2015, 2021a; Sanei and Faramarzi 2014). In-well measurements, in contrast, use wave velocities to compute the GM characterization (Tiab and Donaldson 2015; Mavko et al. 2020). The problems with laboratory tests are the unavailability of samples, the lack of access to appropriate laboratory facilities, and the high cost of the tests. Consequently, attention to in-well measurements has increased in recent years. The advantages of dynamic in-well measurement methods are that they are non-destructive, efficient in terms of cost and time, and cover the entire wellbore interval (Zhang and Bentley 2005).

Dynamic GM parameters are calculated from wave velocities computed from logging data (Chang et al. 2006; Xu et al. 2016). To reduce geomechanics-related problems during drilling and production, engineers need to determine the GM parameters accurately. Dynamic parameters such as the Poisson ratio, Young’s modulus, shear modulus, and bulk modulus have been computed from logging data by geophysicists and geoscientists since the 1950s. These dynamic parameters can then be converted to static parameters (Archer and Rasouli 2012; Najibi et al. 2015).

Many researchers have developed predictive and empirical models to estimate GM parameters. These models are based on physical, petrophysical, and index parameters (Sachpazis 1990; Ulusay et al. 1994; Jamshidi et al. 2013; Aladejare 2016, 2020). Empirical relationships can estimate the parameters from well-logging data such as p-wave velocity, s-wave velocity, compressional transit time, shear transit time, density, and porosity (Chang et al. 2006). In addition to these models, drilling data, such as the penetration rate (POR) and rotations per minute (RPM), have been used to forecast rock strength parameters (Rampersad et al. 1994; Hareland and Nygård 2007). Tables 2 and 3 list some of the most commonly used empirical relationships for estimating Young’s modulus, unconfined compressive strength, and the Poisson ratio.

In the last few years, machine learning methods have been applied for many purposes. Machine learning includes artificial neural networks (ANNs), radial basis function (RBF) networks, support vector machines (SVMs), random forests (RF), genetic algorithms (GA), and other techniques that can perform accurately (Mohaghegh 2000). Machine learning is widely used in fields such as medicine, education, economics, and transportation (Elsafi 2014). In the petroleum industry, several machine learning methods have been used to predict different properties with high accuracy compared to other methods (Doraisamy et al. 1998). Examples include history matching (Costa et al. 2014), pore pressure estimation (Aliouane et al. 2015; Rashidi and Asadi 2018), wellbore stability (Okpo et al. 2016), well planning (Fatehi and Asadi 2017), prediction of in situ stresses (Abusurra 2017; Ibrahim et al. 2021), fracture pressure prediction (Ahmed et al. 2019), estimation of reservoir fluid properties (Elkatatny and Mahmoud 2018), permeability prediction (Gu et al. 2018), oil recovery factor estimation (Mahmoud et al. 2019a), prediction of water saturation (Tariq et al. 2019), oil rate forecasting (Kubota and Reinert 2019), prediction of relative permeability (Zhao et al. 2020), prediction of permeability impairment (Ahmadi and Chen 2020), prediction of fracture density (Rajabi et al. 2021), fault diagnosis (Jin et al. 2022), prediction of movable fluid (Gong et al. 2023), and optimization of drilling parameters (Delavar et al. 2023).

In addition, several researchers have used machine learning to estimate GM parameters. For example, Tariq et al. (2017a) proposed a method to predict failure parameters using ANNs. Naeini et al. (2019) estimated pore pressure and geomechanical properties with an integrated deep learning solution. Elkatatny et al. (2019) proposed a scheme to estimate Young’s modulus using machine learning methods. Mahmoud et al. (2019b, 2020) used machine learning methods to evaluate Young’s modulus. He et al. (2019) estimated rock strength parameters by applying machine learning methods. Gowida et al. (2020) estimated the unconfined compressive strength with machine learning methods based on drilling data. Khatibi and Aghajanpour (2020) applied machine learning to estimate in situ stresses and GM parameters from offshore gas reservoir data. Ahmed et al. (2021) estimated the Poisson ratio with machine learning methods using drilling parameters. Aghakhani Emamqeysi et al. (2023) predicted elastic parameters in gas reservoirs using an ensemble method.

In fact, most empirical relationships for predicting GM parameters (Tables 2 and 3) are limited in accuracy and in applicability to a specific field. The results of such GM relationships are strongly affected by lithology type, which may lead to inadequate prediction accuracy (Güllü and Jaf 2016). In addition, the lack of generality of these methods for other field data and their poor fit limit their reliability (Güllü and Pala 2014). In recent years, the computational efficiency and predictive accuracy of GM parameter estimation have been improved by a variety of machine learning (ML) methods (Wang et al. 2021). The data applied in these methods are generally verified against a small number of core measurements. Table 1 presents some recent studies that calculate GM parameters using ML methods. Although the ML methods in Table 1 improve prediction and extensive datasets are available worldwide, there is still considerable room to improve the prediction accuracy of GM parameters. In addition, there is insufficient research on selecting the appropriate features for a correct estimation of GM properties such as the Poisson ratio, Young’s modulus, and unconfined compressive strength.

Table 1 ML methods previously presented for predicting GM parameters

The aim of this study is to provide a robust and user-friendly ML model to compute GM parameters from the complex dataset of the Volve field, with its various anisotropic reservoir rocks. The model can be used to obtain GM parameters in offset wells without laboratory geomechanical measurements, using only standard well logs. The ML methods include two techniques, MLP and RBF, which are coupled with a genetic algorithm (GA). The main novelty of this research is to develop, apply, and compare the GM parameters predicted by the GA–MLP and GA–RBF methods on a very large multiple-well dataset from the complex Volve oil field, which includes six wells. The process is carried out by introducing a suitable feature spectrum to provide appropriate GM parameters. The feature spectrum is selected from the well log data and consists of compressional wave transit time (DT), lithology index (LI), density (RHOB), coordinate X, coordinate Y, coordinate Z, measured depth (MD), shear wave transit time (DTs), and neutron porosity (NPHI). For the same dataset, the performance of the GA–MLP and GA–RBF methods in predicting GM parameters is also compared with that of commonly used empirical models. This process identifies the most effective model for predicting the GM parameters. To verify the results and address possible concerns about the integrity of the proposed ML models, they are applied to additional data from three other wells in the field. Although this approach is known as a fast and low-cost solution compared with existing methods, it has some disadvantages: for example, hardware and software limitations (the need for a computer system with adequate processing power) and, at times, the lack of access to high-quality well data lead to errors in the prediction of GM parameters. In general, however, the advantages of ML methods outweigh their disadvantages, and they can be widely used in the industry as suitable computational platforms are developed.

Methodology

Multilayer perceptron

An artificial neural network (ANN) is a type of machine learning method used for various problems (Ali 1994). The multilayer perceptron (MLP) is a subset of the ANN family that acts as a universal approximator (Hornik 1991). An MLP has three types of layers, namely input, output, and hidden, as indicated in Fig. 1. The input layer receives the signal for processing. Tasks such as classification and prediction are performed by the output layer. An MLP includes one or several hidden layers. Each node in the hidden and output layers is a neuron with an activation function. The difference between the output and the target is the residual error (Wei et al. 2021).

Fig. 1
figure 1

Schematic representation of the MLP with a single hidden layer

As presented in Fig. 1, the input layer consists of a set of neurons \(\left\{ {{x_i}\left| {{x_1},{x_2}, \cdots ,{x_i}} \right.} \right\}\). Each neuron in the hidden layer transforms the values from the previous layer with a weighted summation followed by a nonlinear activation function \(g\left( \cdot \right):R \to R\), such as the hyperbolic tangent function. The computations are as follows (Menzies et al. 2014):

$${h_j} = s\left( {{w_{ji}}{x_i} + {b_i}} \right)$$
(1)
$${f_k} = G\left( {{w_{kj}}{h_j} + {b_k}} \right)$$
(2)

where \({b_i}\) and \({b_k}\) are the bias vectors, \({w_{ji}}\) and \({w_{kj}}\) are the weight matrices, \(s\) and \(G\) are the activation functions of the hidden and output layers, respectively, \({h_j}\) is the output of hidden neuron \(j\), and \({f_k}\) is the network output.
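As an illustration of Eqs. (1) and (2), the minimal sketch below implements the forward pass of a single-hidden-layer MLP; the layer sizes, random weights, and the hyperbolic-tangent hidden activation are illustrative assumptions rather than the exact configuration used in this study.

```python
import numpy as np

def mlp_forward(x, W_hidden, b_hidden, W_out, b_out):
    """Forward pass of a single-hidden-layer MLP (Eqs. 1-2):
    h = s(W_hidden @ x + b_hidden), f = G(W_out @ h + b_out)."""
    h = np.tanh(W_hidden @ x + b_hidden)   # hidden layer, tanh activation s(.)
    f = W_out @ h + b_out                  # output layer, linear activation G(.)
    return f

# Example with illustrative dimensions: 9 inputs (well-log features), 12 hidden neurons, 1 output
rng = np.random.default_rng(0)
x = rng.normal(size=9)                      # one normalized log sample
W_hidden, b_hidden = rng.normal(size=(12, 9)), np.zeros(12)
W_out, b_out = rng.normal(size=(1, 12)), np.zeros(1)
print(mlp_forward(x, W_hidden, b_hidden, W_out, b_out))
```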

Radial basis function (RBF)

A radial basis function (RBF) network is composed of three layers: input, hidden, and output (Haykin 1999; Yu and He 2006). The first layer holds the inputs of the network, the second is a hidden layer consisting of a number of RBF units, and the last one is the output layer. An RBF network includes just one hidden layer. Figure 2 shows an example of the RBF structure.

Fig. 2
figure 2

Example of RBF network

The input is modeled by a vector of real numbers \(X \in {R^n}\). The output is a scalar function of the input, \(Y:{R^n} \to R\), and is presented by:

$${y_k} = \mathop \sum \limits_{j = 1}^n {w_{jk}}{\phi_j}$$
(3)

where \(n\) is the number of hidden neurons, \({\phi_j}\) is the radial basis (e.g., Gaussian) activation of hidden neuron \(j\), \(w_{jk}\) is the weight connecting neuron \(j\) to output \(k\), and \({y_k}\) is the output of neuron \(k\).
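A compact sketch of Eq. (3) with Gaussian basis functions follows; the centres, spread, and weights are placeholders chosen for illustration, not parameters fitted to the field data.

```python
import numpy as np

def rbf_output(x, centers, spread, weights):
    """RBF network output (Eq. 3): y = sum_j w_j * phi_j(x),
    with Gaussian basis phi_j(x) = exp(-||x - c_j||^2 / (2*spread^2))."""
    phi = np.exp(-np.sum((centers - x) ** 2, axis=1) / (2.0 * spread ** 2))
    return phi @ weights

rng = np.random.default_rng(1)
centers = rng.normal(size=(15, 9))   # 15 hidden neurons, 9 input features (illustrative)
weights = rng.normal(size=15)
x = rng.normal(size=9)
print(rbf_output(x, centers, spread=1.0, weights=weights))
```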

Genetic algorithm

The genetic algorithm (GA) is a metaheuristic algorithm proposed by Holland (1992). This method is applied to solving constrained and unconstrained optimization problems based on the natural selection process, which mimics biological evolution. The algorithm iteratively changes the population of individual solutions. At each step, the genetic algorithm randomly chooses individuals from the current population and uses them as parents to produce offspring for the next generation.

GA starts with an initial population of randomly generated chromosomes, where each chromosome represents a possible solution to the given problem. Each chromosome is associated with a fitness value, which is a measure of how good a solution is to the given problem. In each generation, the population evolves toward better fitness using evolutionary operators such as selection, crossover, and mutation. This process continues until a solution is found or the maximum number of iterations is reached. Figure 3 shows the GA process in a flowchart.

Fig. 3
figure 3

Flowchart of the execution sequence of a GA
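To make the flowchart in Fig. 3 concrete, a minimal GA loop with tournament selection, one-point crossover, and uniform mutation is sketched below; the operators, rates, and the toy fitness function are illustrative assumptions rather than the settings used later in this study.

```python
import numpy as np

def run_ga(fitness, n_genes, pop_size=30, generations=50,
           crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Minimal real-coded GA: selection -> crossover -> mutation, repeated
    until the generation budget is exhausted (higher fitness is better)."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, n_genes))   # initial random population
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        new_pop = [pop[scores.argmax()].copy()]               # elitism: keep the best
        while len(new_pop) < pop_size:
            # tournament selection of two parents
            i, j = rng.integers(pop_size, size=2)
            p1 = pop[i] if scores[i] > scores[j] else pop[j]
            i, j = rng.integers(pop_size, size=2)
            p2 = pop[i] if scores[i] > scores[j] else pop[j]
            child = p1.copy()
            if rng.random() < crossover_rate:                 # one-point crossover
                cut = rng.integers(1, n_genes) if n_genes > 1 else 0
                child[cut:] = p2[cut:]
            mask = rng.random(n_genes) < mutation_rate        # uniform mutation
            child[mask] += rng.normal(0.0, 0.1, size=mask.sum())
            new_pop.append(child)
        pop = np.array(new_pop)
    scores = np.array([fitness(ind) for ind in pop])
    return pop[scores.argmax()]

# Toy usage: maximize -(x - 0.5)^2 summed over genes (optimum at 0.5 for every gene)
best = run_ga(lambda ind: -np.sum((ind - 0.5) ** 2), n_genes=3)
print(best)
```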

GA–MLP algorithm

To improve the accuracy of the MLP method, it is essential to optimize the values of the parameters determined during the design of the MLP, namely the weights and biases. Various algorithms have been proposed to determine the weights and biases. Some researchers have suggested the genetic algorithm (GA) approach to generate an optimal set of weights and biases for the multilayer perceptron (MLP) method (Bansal et al. 2022). Therefore, in this study, the GA–MLP algorithm is applied in the subsequent sections. This algorithm improves the performance of the MLP by using a GA to generate the initial values of each parameter. Figure 4 (left) shows the workflow of the GA–MLP algorithm.

Fig. 4
figure 4

The workflow of: (left) GA–MLP, (right) GA–RBF algorithm

GA–RBF algorithm

To improve the accuracy of the RBF method, it is likewise essential to optimize parameters such as the weights and biases. A variety of algorithms have been developed to determine these quantities. As in the previous section, a genetic algorithm (GA) can generate an optimal set of weights and biases for the RBF model (Jia et al. 2014). Therefore, in this study, the GA–RBF algorithm is applied in the following sections. This algorithm improves the performance of the RBF method. Figure 4 (right) shows the workflow of the GA–RBF algorithm.

Case study

Regional background

This study is based on data from the Volve field, Norway, as indicated in Fig. 5. The reservoir structure is a small dome (Equinor Website Database 2021; Szydlik et al. 2006). The reservoir belongs to the Hugin formation and produces from sandstone.

Fig. 5
figure 5

The location of the Volve field (Sen and Ganguli 2019)

In this study, data from nine wells released by Equinor are used (Equinor Website Database 2021).

Geology of the region

In this field, the reservoir is the Hugin formation, and the caprock consists of the Heather and Draupne formations. The rocks are generally sandstone, siltstone, claystone, limestone, marl, calcite, tuff, and coal, as illustrated in Fig. 6. The Volve field is geologically complex, as illustrated in Fig. 7.

Fig. 6
figure 6

Stratigraphic column of the Volve field, Norway

Fig. 7
figure 7

Three-dimensional view of the Volve field

Estimation of geomechanical (GM) properties

There are many methods in the petroleum industry for measuring or estimating GM properties such as elastic properties and rock strength. Accurate estimates provide the knowledge needed to address various wellbore problems (Zhang 2020; Sanei et al. 2021b, 2022). Methods for the direct measurement of geomechanical properties are commonly expensive and limited. The empirical models for estimating geomechanical properties from well log data can provide correct values if the model parameters are computed appropriately.

Because we want to present a continuous profile of the parameters along each well, geomechanical properties such as Young’s modulus, uniaxial compressive strength (UCS), and the Poisson ratio are essential for training and testing the machine learning model. They can be generated synthetically through empirical relationships, as continuous data are not measured along the well and are limited to a few points. Although the machine learning methods used to predict these variables are demonstrated here with synthetic data, the scheme can also be applied to measured values of Young’s modulus, UCS, and the Poisson ratio, if available.

Elastic properties

The elastic parameters are computed by two methods: laboratory measurement on core samples (static method) and determination of the elastic constants from well log data (dynamic method). The values obtained from dynamic methods are generally larger than those from static methods (Plona and Cook 1995).

Dynamic elastic properties

The dynamic Young’s modulus \({E_{\text{d}}}\) \(\left[ {{\text{GPa}}} \right]\) and dynamic Poisson ratio \({\nu_{\text{d}}}\) [%] are calculated as (Fjær et al. 2008):

$${E_{\text{d}}} = \rho V_{\text{s}}^2\,\frac{3V_{\text{p}}^2 - 4V_{\text{s}}^2}{V_{\text{p}}^2 - V_{\text{s}}^2}$$
(4)
$${\nu_{\text{d}}} = \frac{V_{\text{p}}^2 - 2V_{\text{s}}^2}{2\left( V_{\text{p}}^2 - V_{\text{s}}^2 \right)}$$
(5)

where \(\rho\) is the bulk density \(\left[ {{\text{g/cm}}^3} \right]\), \({V_{\text{p}}}\) is the compressional wave velocity \(\left[ {{\text{km}}/{\text{s}}} \right]\), and \({V_{\text{s}}}\) is the shear wave velocity \(\left[ {{\text{km}}/{\text{s}}} \right]\).
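As a worked example of Eqs. (4) and (5), the snippet below computes the dynamic Young’s modulus and Poisson ratio from density and velocities; the input values are illustrative, not field data. With density in g/cm³ and velocities in km/s, the product \(\rho V_{\text{s}}^2\) comes out directly in GPa.

```python
def dynamic_elastic(rho_g_cm3, vp_km_s, vs_km_s):
    """Dynamic Young's modulus [GPa] (Eq. 4) and Poisson ratio [-] (Eq. 5).
    With rho in g/cm^3 and velocities in km/s, rho*Vs^2 is already in GPa."""
    vp2, vs2 = vp_km_s ** 2, vs_km_s ** 2
    e_dyn = rho_g_cm3 * vs2 * (3.0 * vp2 - 4.0 * vs2) / (vp2 - vs2)
    nu_dyn = (vp2 - 2.0 * vs2) / (2.0 * (vp2 - vs2))
    return e_dyn, nu_dyn

# Illustrative sandstone-like values (not measured Volve data)
e_d, nu_d = dynamic_elastic(rho_g_cm3=2.4, vp_km_s=3.5, vs_km_s=2.0)
print(f"E_dyn = {e_d:.1f} GPa, nu_dyn = {nu_d:.3f}")
```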

Static elastic properties

Many equations have been proposed for estimating static parameters from dynamic parameters. Well-known equations for estimating the static Young’s modulus are listed in Table 2.

Table 2 Models to estimate static Young’s modulus

The static Poisson ratio \({\nu_{\text{s}}}\) is calculated from dynamic Poisson ratio \({\nu_{\text{d}}}\) as follows:

$${\nu_{\text{s}}} = {\text{ml}} \times {\nu_{\text{d}}}$$
(6)

where \({\text{ml}}\) is the multiplier. Some researchers, including Archer and Rasouli (2012), considered \({\text{ml}} = 1\), whereas Afsari et al. (2009) suggested \({\text{ml}} = 0.7\).

The static Young’s modulus and Poisson ratio can be estimated from logging data using the different equations given above. Accordingly, the different models are applied and compared with each other to determine which is best for generating the synthetic static Young’s modulus and Poisson ratio data. This process shows that the John Fuller model estimates the static Young’s modulus well and that the best multiplier for the static Poisson ratio is \({\text{ml}} = 1\). The estimation of the static Young’s modulus and static Poisson ratio is carried out using the data of well F1A. The results in Fig. 8 indicate that the selected models estimate the parameters precisely. The same procedure applied to well F1A, as shown in Fig. 8, is then performed for the other wells.

Fig. 8
figure 8

Comparison of: (top-left) measured static Young’s modulus with John Fuller model results, (top-right) measured and estimated Poisson ratio with multiplier, \({\text{ml}} = 1\), (bottom) measured unconfined compressive strength with Dick Plumb model results for well F1A

Rock strength

Unconfined compressive strength (UCS) can be determined in two ways: from uniaxial compression tests on drill cores, or from empirical relationships based on well-log data. Some well-known relationships are presented in Table 3.

Table 3 Models for estimating the unconfined compressive strength

Unconfined compressive strength can be obtained from logging data using the various equations given above. Accordingly, the different models are applied and compared with each other to determine which is best for generating the synthetic unconfined compressive strength data. This process shows that the Dick Plumb model estimates the unconfined compressive strength correctly, as presented for well F1A in Fig. 8. The same procedure applied to well F1A is then performed for the other wells to produce continuous unconfined compressive strength profiles.

Development of neural networks

Normalization method

Data normalization is one of the basic requirements of machine learning (Anysz et al. 2016). Input data generally have different units and dimensions. Normalization helps the model converge quickly and minimizes error (Rojas 1996). Here, the min–max normalization method (Eq. 7) is used to scale the data to the range of zero to one.

$${x_{\left( {0,1} \right)}} = \frac{x_{\text{o}} - {\text{Min}}}{{\text{Max}} - {\text{Min}}}$$
(7)

where \({x_{\text{o}}}\) and \({x_{\left( {0,1} \right)}}\) are the original and normalized data, respectively; Min and Max represent the minimum and maximum values of the whole dataset.
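A minimal sketch of Eq. (7) is given below, applied column-wise so that each feature is scaled by its own minimum and maximum; the example array and feature names are illustrative.

```python
import numpy as np

def min_max_normalize(X):
    """Column-wise min-max normalization to [0, 1] (Eq. 7)."""
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    return (X - x_min) / (x_max - x_min)

# Illustrative: three samples of two features (e.g., DT and RHOB)
X = np.array([[60.0, 2.2], [80.0, 2.4], [100.0, 2.6]])
print(min_max_normalize(X))
```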

Statistical criteria

The performance of the machine learning models is analyzed using several important criteria that assess their efficiency and accuracy: the coefficient of determination (\({R^2}\)), the root-mean-square error (\({\text{RMSE}}\)), and the standard deviation (\({\text{SD}}\)), which together evaluate the overall performance of a model. Equations 8–10 are used to calculate these parameters.

$${R^2} = \frac{{{{\left( {\mathop \sum \nolimits_{i = 1}^n \left( {{M_{\text{m}}} - {{\bar M}_{\text{m}}}} \right)\left( {{M_{\text{p}}} - {{\bar M}_{\text{p}}}} \right)} \right)}^2}}}{{\mathop \sum \nolimits_{i = 1}^n {{\left( {{M_{\text{m}}} - {{\bar M}_{\text{m}}}} \right)}^2}\mathop \sum \nolimits_{i = 1}^n {{\left( {{M_{\text{p}}} - {{\bar M}_{\text{p}}}} \right)}^2}}}$$
(8)
$${\text{RMSE}} = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^n {{\left( {{M_{\text{p}}} - {M_{\text{m}}}} \right)}^2}}}{n}}$$
(9)
$${\text{SD}} = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^m {{\left( {{x_i} - \mu } \right)}^2}}}{m - 1}}$$
(10)

where \({M_{\text{m}}}\) and \({M_{\text{p}}}\) are the measured and predicted quantities, respectively, \({\bar M_{\text{m}}}\) and \({\bar M_{\text{p}}}\) are the averages of the measured and predicted quantities, respectively, \(m\) is the number of samples, \({x_i}\) is each data value, and \(\mu\) is the average of the \({x_i}\). According to previous studies, a system can be considered to perform well when it has a high coefficient of determination (close to 1) and a low RMSE value (close to zero) (Ranjbar-Karami et al. 2014; Armaghani et al. 2015; Elkatatny et al. 2019).
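The three criteria in Eqs. (8)–(10) can be computed as in the sketch below; the toy measured and predicted arrays are illustrative, not field results.

```python
import numpy as np

def performance_metrics(measured, predicted):
    """R^2 (Eq. 8), RMSE (Eq. 9), and sample standard deviation of the measured data (Eq. 10)."""
    m_bar, p_bar = measured.mean(), predicted.mean()
    r2 = (np.sum((measured - m_bar) * (predicted - p_bar)) ** 2) / (
        np.sum((measured - m_bar) ** 2) * np.sum((predicted - p_bar) ** 2))
    rmse = np.sqrt(np.mean((predicted - measured) ** 2))
    sd = np.sqrt(np.sum((measured - m_bar) ** 2) / (measured.size - 1))
    return r2, rmse, sd

measured = np.array([20.1, 22.4, 25.0, 27.3])    # e.g., static Young's modulus [GPa] (illustrative)
predicted = np.array([19.8, 22.9, 24.6, 27.8])
print(performance_metrics(measured, predicted))
```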

Data processing and analysis

Data processing

Machine learning models are data-driven, and feeding every available parameter into the model as an input does not always guarantee good results. Best practice is to determine which input parameters contribute positively and which contribute negatively. A feature selection scheme based on the multivariate linear correlation coefficient is used here to estimate the individual relationship between each input and each output parameter, as illustrated in Fig. 9. The correlation coefficient (\({\text{CC}}\)) between an input and an output can be calculated with Eq. (11) (Elkatatny et al. 2019).

Fig. 9
figure 9

Correlation coefficient between input data with Poisson ratio, Young’s modulus, and UCS

As noted above, the correlation coefficient (\({\text{CC}}\)) between each input and output parameter is used for feature selection (Fig. 9) and is calculated as:

$${\text{CC}} = \frac{{\text{ns}}\sum xy - \left( \sum x \right)\left( \sum y \right)}{\sqrt{{\text{ns}}\sum x^2 - \left( \sum x \right)^2}\,\sqrt{{\text{ns}}\sum y^2 - \left( \sum y \right)^2}}$$
(11)

where \(\text{ns}\) is the sample size and \(x\) and \(y\) are the input and output data, respectively. To analyze the input parameters in this study, the correlation coefficient of Eq. (11) is calculated between the available input data (DT, LI, RHOB, X, Y, Z, MD, DTs, and NPHI) and the output data, namely the Poisson ratio, Young’s modulus, and UCS. The results in Fig. 9 show that Young’s modulus and UCS have an inverse relationship with DT, LI, Z, DTs, and NPHI and a direct relationship with RHOB, X, Y, and MD. In addition, the Poisson ratio has an inverse relationship with LI and RHOB and a direct relationship with the other input parameters, namely DT, X, Y, Z, MD, DTs, and NPHI.
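The screening step can be reproduced as in the sketch below, which ranks the nine input logs by the Pearson correlation coefficient of Eq. (11) against one target property; the random placeholder data stand in for the actual log and target values.

```python
import numpy as np

def correlation_coefficient(x, y):
    """Pearson correlation coefficient between one input and one output (Eq. 11)."""
    ns = x.size
    num = ns * np.sum(x * y) - np.sum(x) * np.sum(y)
    den = np.sqrt(ns * np.sum(x ** 2) - np.sum(x) ** 2) * \
          np.sqrt(ns * np.sum(y ** 2) - np.sum(y) ** 2)
    return num / den

# Illustrative screening of the nine input logs against one target (random placeholders)
rng = np.random.default_rng(2)
inputs = {name: rng.normal(size=200) for name in
          ["DT", "LI", "RHOB", "X", "Y", "Z", "MD", "DTs", "NPHI"]}
target = rng.normal(size=200)                     # e.g., synthetic UCS
ranking = {name: correlation_coefficient(col, target) for name, col in inputs.items()}
for name, cc in sorted(ranking.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name:5s}  CC = {cc:+.3f}")
```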

Data description

Data preparation is one of the steps necessary for the success of machine learning algorithms, and the performance of the models depends on having high-quality information. In this research, according to the correlation coefficient results (Fig. 9), a suitable feature spectrum is chosen as the input: compressional wave transit time (DT), lithology index (LI), density (RHOB), coordinate X, coordinate Y, coordinate Z, measured depth (MD), shear wave transit time (DTs), and neutron porosity (NPHI). Data from nine wells, a total of 69,100 well-logging records covering different formations, wells, and depths, are used. The machine learning models are implemented in two modes; in both modes, the input data are divided into three categories: 70% training data, 15% test data, and 15% validation data. The statistics of the data before normalization are given in Table 4 and after normalization in Table 5.

Table 4 Data for the statistical analysis before normalizing
Table 5 Data for the statistical analysis after normalizing
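One way to obtain the 70%/15%/15% train/test/validation split described above is sketched below with scikit-learn; the synthetic arrays and the random state are illustrative assumptions standing in for the normalized log records.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays standing in for the 69,100 normalized log records (9 inputs, 1 target)
rng = np.random.default_rng(3)
X = rng.random((69100, 9))
y = rng.random(69100)

# First split off 70% for training, then split the remaining 30% in half (15% test, 15% validation)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, train_size=0.70, random_state=42)
X_test, X_val, y_test, y_val = train_test_split(X_rest, y_rest, test_size=0.50, random_state=42)
print(len(X_train), len(X_test), len(X_val))
```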

Building a GA–MLP algorithm configurations

The performance of the MLP model depends on the number of hidden layers and the number of neurons. The more complex the problem, the greater the required number of hidden layers and neurons, which leads to longer computation times. Optimizing the MLP structure can therefore increase the performance and accuracy of the model. A trial-and-error approach can be used to determine the number of hidden layers (Ham and Kostanic 2001), but it is time-consuming. Therefore, in this study, the GA is used to determine the number of MLP hidden layers and the number of neurons in each layer. A schematic of the GA–MLP implementation is shown in Fig. 4 (left). Table 6 lists the control parameter values of the GA–MLP algorithm, which are essential for optimizing the numbers of layers and neurons.

Table 6 Control parameter values for the GA–MLP algorithm

In the MLP, the neurons are connected in a feed-forward neural network. The tangent sigmoid transfer function is applied in the hidden layers, and a linear function is applied in the output layer. The Levenberg–Marquardt algorithm is used to train the network. Figure 10 shows a schematic of the different steps of the MLP model.

Fig. 10
figure 10

Schematic of the different steps of MLP model
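A hedged sketch of the GA–MLP idea is given below: each GA individual encodes the hidden-layer sizes, and its fitness is the validation RMSE of an MLP trained with those settings. The scikit-learn MLPRegressor (tanh hidden activation, linear output) stands in for the Levenberg–Marquardt-trained network described above, since Levenberg–Marquardt is not available in scikit-learn and lbfgs is used as a substitute; the placeholder data, population size, and GA operators are illustrative rather than the control parameters of Table 6.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(4)
X = rng.random((500, 9))                                # placeholder normalized logs
y = X @ rng.random(9) + 0.05 * rng.normal(size=500)     # placeholder target
X_tr, X_val, y_tr, y_val = X[:350], X[350:], y[:350], y[350:]

def fitness(layers):
    """Validation RMSE of an MLP with the encoded hidden-layer sizes (lower is better)."""
    model = MLPRegressor(hidden_layer_sizes=tuple(layers), activation="tanh",
                         solver="lbfgs", max_iter=500, random_state=0)
    model.fit(X_tr, y_tr)
    return np.sqrt(mean_squared_error(y_val, model.predict(X_val)))

# GA over architectures: 1-3 hidden layers, 2-20 neurons each (integer genes)
pop = [list(rng.integers(2, 21, size=rng.integers(1, 4))) for _ in range(10)]
for _ in range(5):                               # a few generations, for illustration only
    scored = sorted(pop, key=fitness)
    parents = scored[:4]                         # truncation selection
    children = []
    while len(children) < len(pop) - len(parents):
        p1, p2 = rng.choice(len(parents), size=2, replace=False)
        child = list(parents[p1])                # crossover: take layer count from parent 1
        for k in range(len(child)):              # gene-wise mix with parent 2 where possible
            if k < len(parents[p2]) and rng.random() < 0.5:
                child[k] = parents[p2][k]
        if rng.random() < 0.3:                   # mutation: perturb one layer size
            k = rng.integers(len(child))
            child[k] = int(np.clip(child[k] + rng.integers(-3, 4), 2, 20))
        children.append(child)
    pop = parents + children
print("best architecture:", min(pop, key=fitness))
```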

Building a GA–RBF algorithm configurations

The radial basis function (RBF) network has a simple structure and a fast training process (Wu et al. 2012). A sufficient number of neurons must be chosen for the single hidden layer of the RBF to satisfy a specified mean-squared error. Optimizing the RBF structure can increase the performance and accuracy of this method. Therefore, in this study, the GA is applied to determine the spread and the number of neurons of the RBF method. A schematic of the GA–RBF implementation is illustrated in Fig. 4 (right). Table 7 lists the control parameter values of the GA–RBF algorithm, which are essential for optimizing the spread and the number of neurons.

Table 7 Control parameter values for the GA–RBF algorithm

In the RBF network, the activations are computed with Gaussian functions: the radial basis function is applied in the hidden layer, and a linear function is applied in the output layer. Figure 11 illustrates a schematic of the different steps of the RBF model.

Fig. 11
figure 11

Schematic of the different steps of RBF model
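Analogously to the GA–MLP case, one way to evaluate a GA individual for the RBF network is sketched below: each candidate encodes a spread and a number of hidden neurons, the Gaussian hidden layer is built from randomly chosen training samples as centres, and the linear output weights are fitted by least squares. The centre selection, candidate values, and placeholder data are illustrative assumptions, not the configuration of Table 7.

```python
import numpy as np

def rbf_design_matrix(X, centers, spread):
    """Gaussian hidden-layer activations for every sample."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * spread ** 2))

def rbf_fitness(candidate, X_tr, y_tr, X_val, y_val, seed=0):
    """Fitness of one GA individual (spread, n_neurons): validation RMSE of an RBF
    whose linear output weights are fitted by least squares."""
    spread, n_neurons = candidate
    rng = np.random.default_rng(seed)
    centers = X_tr[rng.choice(len(X_tr), size=int(n_neurons), replace=False)]
    Phi_tr = rbf_design_matrix(X_tr, centers, spread)
    weights, *_ = np.linalg.lstsq(Phi_tr, y_tr, rcond=None)
    pred = rbf_design_matrix(X_val, centers, spread) @ weights
    return np.sqrt(np.mean((pred - y_val) ** 2))

# Placeholder data and two candidate (spread, n_neurons) individuals
rng = np.random.default_rng(5)
X = rng.random((400, 9))
y = X @ rng.random(9)
X_tr, X_val, y_tr, y_val = X[:300], X[300:], y[:300], y[300:]
for cand in [(0.5, 20), (1.5, 40)]:
    print(cand, "-> RMSE", round(rbf_fitness(cand, X_tr, y_tr, X_val, y_val), 4))
```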

Results and discussion

In this study, the datasets of the nine wells of the Volve oil field are used to build a robust ML model that computes GM parameters from a complex dataset and that can provide the GM parameters in offset wells without laboratory geomechanical measurements. The appropriate ML model for the geomechanical properties is selected based on two data sets: first, the datasets of five wells, namely F1A, F1B, F4, F5, and F11T2, and second, all data from the nine wells, with the records split 70%:30% between the training and validation subsets. The two sets of data are given separately to the ML methods to predict the GM properties, namely Young’s modulus, UCS, and the Poisson ratio, and the results are compared with the measured data.

Prediction of geomechanical properties for some wells

In this section, to demonstrate the capability of the GA–MLP and GA–RBF models to predict geomechanical parameters such as the static Poisson ratio, Young’s modulus, and UCS, the dataset of wells F1A, F1B, F4, F5, and F11T2 is used and referred to as F1–F11. The machine learning models for each geomechanical property are built from nine inputs, namely DT, LI, RHOB, X, Y, Z, MD, DTs, and NPHI. The models are evaluated using the statistical criteria, i.e., the coefficient of determination (\({R^2}\)), root-mean-square error (RMSE), and standard deviation (SD). First, the geomechanical parameters (Poisson ratio, Young’s modulus, and UCS) are predicted using the GA–MLP algorithm, and the corresponding cross-plots are shown in Fig. 12. The \({R^2}\) values between the measured and predicted geomechanical properties using GA–MLP for the F1–F11 datasets are presented in Table 8.

Fig. 12
figure 12

The results of \({R^2}\) between measured and predicted for the datasets of F1–F11 using the GA–MLP algorithm: (top-left) static Young’s modulus for train data, (top-right) static Young’s modulus for test data, (middle-left) Poisson ratio for train data, (middle-right) Poisson ratio for test data, (bottom-left) UCS for train data, (bottom-right) UCS for test data

Table 8 Prediction of each parameter using the GA–MLP model for datasets F1–F11

Second, the geomechanical properties of F1–F11 are predicted using the GA–RBF model, and the corresponding diagrams are shown in Fig. 13. The \({R^2}\) values between the measured and predicted GM properties using GA–RBF for the F1–F11 datasets are given in Table 9.

Fig. 13
figure 13

The results of \({R^2}\) between measured and predicted for the datasets of F1–F11 using the GA–RBF algorithm: (top-left) static Young’s modulus for train data, (top-right) static Young’s modulus for test data, (middle-left) Poisson ratio for train data, (middle-right) Poisson ratio for test data, (bottom-left) UCS for train data, (bottom-right) UCS for test data

Table 9 Prediction of each parameter using the GA–RBF model for datasets F1–F11

The other statistical criterion applied to assess the performance of the systems is the RMSE error. The RMSE values between the measured and predicted geomechanical properties have been calculated using the GA–MLP and GA–RBF methods for the F1–F11 datasets, and the results are presented in Tables 8 and 9.

To compare the efficiency of the GA–MLP and GA–RBF models in predicting the geomechanical properties, the coefficient of determination (\({R^2}\)), the RMSE, and the standard deviation are used. A model has acceptable performance when \({R^2}\) is as high as possible (close to 1) and the error values are as low as possible (close to zero).

The coefficient of determination and RMSE values given in Tables 8 and 9 demonstrate the efficiency of the models. To illustrate the estimation accuracy, the measured (targets) and estimated (outputs) geomechanical properties obtained with GA–MLP and GA–RBF for F1–F11 are compared in Fig. 14.

Fig. 14
figure 14

The comparison between measured (targets) and estimated (outputs) values of: (top-left) static Young’s modulus based on the GA–MLP algorithm, (top-right) static Young’s modulus based on the GA–RBF algorithm, (middle-left) Poisson ratio based on the GA–MLP algorithm, (middle-right) Poisson ratio based on the GA–RBF algorithm, (bottom-left) UCS based on the GA–MLP algorithm, (bottom-right) UCS based on the GA–RBF algorithm

The \({R^2}\) and RMSE results presented in Tables 8 and 9 for the estimation of the geomechanical properties show that the GA–MLP forecasting system performs better than the GA–RBF method.

Prediction of geomechanical properties for all wells

To demonstrate the ability of the MLP and RBF models to predict the geomechanical properties, all data from the nine wells are selected. The geomechanical parameters, including the static Poisson ratio, Young’s modulus, and UCS, are then predicted using the GA–MLP and GA–RBF models with the same procedure described for the F1–F11 datasets. The \({R^2}\), RMSE, and SD results of the GA–MLP and GA–RBF models for the data from the nine wells are presented in Table 10. In addition, to show the estimation accuracy, the measured and estimated geomechanical properties obtained from all nine-well data are compared in Fig. 15.

Table 10 Prediction of static Young’s modulus using the GA–MLP and GA–RBF models for all wells
Fig. 15
figure 15

The comparison between measured (targets) and estimated (outputs) values of: (top-left) static Young’s modulus based on the GA–MLP algorithm, (top-right) static Young’s modulus based on the GA–RBF algorithm, (middle-left) Poisson ratio based on the GA–MLP algorithm, (middle-right) Poisson ratio based on the GA–RBF algorithm, (bottom-left) UCS based on the GA–MLP algorithm, (bottom-right) UCS based on the GA–RBF algorithm

The \(R^2\) and RMSE results presented in Table 10 for the estimation of the geomechanical properties show that the GA–MLP forecasting system performs better than the GA–RBF method.

Since this paper uses a large dataset for the ML methods, the results suggest that the ML methods in this study can be applied to other datasets with fewer or more data and still yield similar results.

Finally, to guide other researchers who intend to estimate GM parameters, it should be stated that the anisotropy of the reservoir rock must be taken into account when estimating GM parameters with ML methods, because there is simultaneously a wide range of continuous well-logging data and only limited geomechanical laboratory data. This situation makes the estimation of GM parameters from well-logging data somewhat difficult. Nevertheless, the results of this research show that the most appropriate approach for the continuous estimation of GM parameters is estimation based on ML methods.

Thus, one proposal for estimating GM parameters, which is a very complex task that is not free of uncertainty, is the development of software based on ML methods. However, before the methods developed in such software are used, laboratory data must be measured with suitable apparatus on rock cores at the depths of interest, so that this information can be used to validate the results.

Conclusions

In this study, a robust machine learning (ML) model for computing geomechanical (GM) parameters from a complex dataset was proposed. In developing the model, the following conclusions were drawn:

  1. The large datasets of the Volve oil field were used to predict the GM parameters using two hybrid ML algorithms: genetic algorithm–multilayer perceptron (GA–MLP) and genetic algorithm–radial basis function (GA–RBF).

  2. Feature selection was used to avoid unessential inputs and achieve the best results.

  3. The MLP and RBF algorithms were optimized with GA. The GA–MLP and GA–RBF models first determined the number of hidden layers and neurons and then identified the optimal weights and biases of the networks. This process led to combining GA with MLP and RBF to predict the GM parameters accurately.

  4. The comparison of the GA–MLP and GA–RBF models indicated that GA–MLP predicts the GM parameters with higher accuracy.

  5. The proposed approach provides insights for applying ML methods to improve accuracy and obtain the best performance in the prediction of GM parameters.

  6. The results showed that the proposed ML models still give similar results for other, unseen datasets.