Workflow diagram
Four hybrid machine learning algorithms, built with two efficient optimizers, are used in this paper to predict FVDC for three wells in the large Marun oil and gas field in southwestern Iran. Figure 1 shows the workflow applied to predict FVDC. Following this diagram, data were first collected from the Asmari carbonate reservoir of the studied field. The data were then sorted to describe each variable, and the maximum and minimum values of each variable were determined. After normalizing the input data (Eq. 1), feature selection was performed to determine a suitable combination of inputs. Once the best combination of inputs was identified, the input data from two of the wells were divided into three subsets: training, validation, and test.
$$x_{i,{\text{norm}}}^{l} = \left( {\frac{{x_{i}^{l} - x_{\min }^{l} }}{{x_{\max }^{l} - x_{\min }^{l} }}} \right) \times 2 - 1$$
(1)
where \(x_{\min }^{l}\) and \(x_{\max }^{l}\) are the minimum and maximum values of attribute \(l\) in the dataset, \(x_{i}^{l}\) is the value of attribute \(l\) for data record \(i\), and \(x_{i,{\text{norm}}}^{l}\) is its normalized value.
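For illustration, Eq. (1) can be applied column-wise to an input matrix as follows; this is a minimal sketch assuming the data are arranged as samples by attributes, with NumPy used purely for convenience:

```python
import numpy as np

def normalize_to_unit_range(X):
    """Scale each attribute (column) of X to the [-1, 1] range, per Eq. (1)."""
    x_min = X.min(axis=0)   # minimum of each attribute
    x_max = X.max(axis=0)   # maximum of each attribute
    return (X - x_min) / (x_max - x_min) * 2.0 - 1.0
```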
Statistical parameters provide an objective basis for comparing the algorithms (Hazbeh et al. 2021a). At this stage, the algorithms are compared using these statistical parameters, and after the best algorithm is identified, it is applied to another well to test its generalizability.
Feature selection
One method for shortening the process and improving the estimates of each model is to use the best features rather than all features, which increases the program's speed, efficiency, and accuracy (Farsi et al. 2021). This selection considers all available features in order to find the subset giving the best performance estimate (Jain and Zongker 1997). When the number of input variables is large, exhaustively estimating the performance of every subset becomes prohibitively repetitive. For example, if 15 attributes are available, there are as many as \(2^{N}\) possible combinations (\(2^{15}\) = 32,768) (Chandrashekar and Sahin 2014). There are numerous feature selection methods (for example, filter, wrapper, and embedded methods); the most basic is the filter method, in which not every evaluated subset is necessarily optimal (John et al. 1994). Wrapper methods are more accurate and effective than the other methods because they use evolutionary algorithms (e.g., the genetic algorithm, GA) to identify ineffective variables and eliminate them from the candidate features. Candidate variable subsets are generated as potential solutions; each candidate dataset is then normalized, and the results are scored with the cost function, which in this case is the root mean squared error (RMSE). The GA transfers high-performance solutions (minimum RMSE) to the next optimizer iteration and applies the main genetic operators, crossover (recombination) and mutation, to generate the new iterations (Wahab et al. 2015).
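As a sketch of this wrapper idea, the following GA evolves binary feature masks and scores each candidate subset by the RMSE of a simple regressor. The linear regressor, the hold-out split, and all parameter values are illustrative stand-ins, not the predictive model actually used in the paper:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

def rmse_of_subset(mask, X, y):
    """Cost function: RMSE of a simple regressor trained on the selected features."""
    if not mask.any():
        return np.inf
    X_tr, X_te, y_tr, y_te = train_test_split(X[:, mask], y, test_size=0.3, random_state=0)
    model = LinearRegression().fit(X_tr, y_tr)
    return mean_squared_error(y_te, model.predict(X_te)) ** 0.5

def ga_feature_selection(X, y, pop_size=30, n_gen=40, p_mut=0.1, rng=None):
    """Evolve binary feature masks; lower RMSE means higher fitness."""
    rng = np.random.default_rng(rng)
    n_feat = X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n_feat)).astype(bool)
    for _ in range(n_gen):
        fitness = np.array([rmse_of_subset(ind, X, y) for ind in pop])
        parents = pop[np.argsort(fitness)[: pop_size // 2]]   # keep the best half
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_feat)                       # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n_feat) < p_mut                   # bit-flip mutation
            child[flip] = ~child[flip]
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    fitness = np.array([rmse_of_subset(ind, X, y) for ind in pop])
    return pop[np.argmin(fitness)]                              # best feature mask found
```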
Machine learning algorithms
Many researchers have conducted extensive research to determine the key factors in many areas of the oil and gas industry, including formation damage (Mohammadian and Ghorbani 2015), CO2 capture (Hassanpouryouzband et al. 2019; Hassanpouryouzband et al. 2018a, b), reservoir performance (Ghorbani et al. 2017a), wellbore stability (Darvishpour et al. 2019), production (Ghorbani and Moghadasi 2014; Ghorbani et al. 2017b, 2014), rheology and filtration (Mohamadian et al. 2018), wellbore blowout (Abdali et al. 2021), and drilling fluid optimization (Mohamadian et al. 2019). Many studies show that machine learning and new algorithms can aid in solving technical problems across the oil and gas industry (Choubineh et al. 2017; Ghorbani et al. 2020). In this study, four hybrid algorithms are built by combining two networks, the multi-layer extreme learning machine (MELM) and the MLP, with particle swarm optimization (PSO) and GA: MELM-PSO, MELM-GA, MLP-PSO, and MLP-GA.
Multi-layer perceptron
Artificial neural networks (ANNs) are one method for accurately predicting dependent variables governed by complex relationships and equations (Ali 1994). Because each dependent variable involves its own complexities, many different types of neural networks exist. The selection of attributes (i.e., the input variables to be considered), the network architecture (number of layers and nodes), the transfer functions between layers, and the training algorithm are factors that, when chosen correctly, increase the prediction accuracy of artificial networks (Ghorbani et al. 2018; Mohamadian et al. 2021). The multi-layer perceptron (Fig. 2) is one of the most practical and adaptable neural networks for large and complex datasets (Rashidi et al. 2020); MLP is therefore well suited to predicting FVDC. Combining networks with evolutionary algorithms is one way to improve the performance and results of network algorithms. To improve the results, the MLP is hybridized with two evolutionary algorithms: the genetic algorithm (GA) and particle swarm optimization (PSO). One training algorithm that helps the MLP converge quickly to the optimum is the Levenberg–Marquardt (LM) algorithm.
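A minimal sketch of an MLP configured for a regression task of this kind is shown below, assuming scikit-learn is available. Note that scikit-learn does not ship a Levenberg–Marquardt solver, so the 'lbfgs' quasi-Newton solver is used here purely as a stand-in, and the layer sizes are illustrative rather than the architecture used in the paper:

```python
from sklearn.neural_network import MLPRegressor

# Two hidden layers of 15 neurons each (illustrative only); tanh transfer
# functions between layers; 'lbfgs' stands in for the LM training mentioned above.
mlp = MLPRegressor(hidden_layer_sizes=(15, 15), activation="tanh",
                   solver="lbfgs", max_iter=2000, random_state=0)
# mlp.fit(X_train, y_train); y_pred = mlp.predict(X_test)
```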
Multi-layer extreme learning machine
The ELM, introduced by Liang et al. (2006), is one of the fastest and simplest computing networks, avoiding the time-consuming iterations of training a single-hidden-layer network (Liang et al. 2006). The input weights and biases in this network are selected at random from a uniform distribution and are not adjusted through subsequent network tuning (Fig. 3) (Cheng and Xiong 2017; Huang et al. 2011; Liang et al. 2006). The output weights of the ELM, which feed a linear output layer, are obtained analytically by pseudo-inverting the hidden-layer output matrix, identifying the output weights that minimize the root mean squared regression error (Huang et al. 2011). This eliminates the need for the repeated backpropagation passes that are an essential part of MLP networks (Yeom and Kwak 2017). The multi-layer ELM (MELM), with several hidden layers, is commonly used for complex nonlinear systems that require classification. This network (MELM) can achieve higher accuracy and generalizability than single-hidden-layer models. The structure of the MELM is illustrated in Rashidi et al. (2021).
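The single-hidden-layer ELM idea described above can be sketched in a few lines of NumPy: random, untuned input weights and biases, with the output weights solved analytically via the Moore-Penrose pseudoinverse. The function names and the tanh activation are illustrative choices, not the paper's implementation:

```python
import numpy as np

def train_elm(X, y, n_hidden=20, rng=None):
    """Single-hidden-layer ELM: random input weights, analytic output weights."""
    rng = np.random.default_rng(rng)
    n_features = X.shape[1]
    # Input weights and biases drawn at random and never tuned afterwards.
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = np.tanh(X @ W + b)            # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ y      # pseudoinverse solves the least-squares problem
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Linear output layer applied to the hidden-layer responses."""
    return np.tanh(X @ W + b) @ beta
```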
Genetic algorithm
The genetic algorithm (GA) is an evolutionary algorithm that mimics natural selection; in this work it is used both for feature selection and for building hybrids with the MLP and MELM networks, solving the problem iteratively (Simon 2013). In each iteration, the high-performance solutions are identified and preferentially used to generate modified solutions for the next GA iteration, while solutions with poor fitness are gradually eliminated (Mirjalili 2019).
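For continuous parameters (such as network weights or structural settings), the same survive-and-breed loop can be sketched with a real-coded GA. This is a generic minimizer under assumed blend crossover and Gaussian mutation operators, not the specific GA configuration of the paper:

```python
import numpy as np

def ga_minimize(cost, dim, pop_size=30, n_gen=50, p_mut=0.15, rng=None):
    """Minimal real-coded GA: the fittest solutions survive and breed the next generation."""
    rng = np.random.default_rng(rng)
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, dim))
    for _ in range(n_gen):
        fitness = np.array([cost(ind) for ind in pop])       # lower cost = fitter
        elite = pop[np.argsort(fitness)[: pop_size // 2]]    # weak solutions are dropped
        children = []
        for _ in range(pop_size - len(elite)):
            a, b = elite[rng.integers(len(elite), size=2)]
            alpha = rng.random(dim)
            child = alpha * a + (1 - alpha) * b               # blend crossover
            child += p_mut * rng.normal(size=dim)             # Gaussian mutation
            children.append(child)
        pop = np.vstack([elite, np.array(children)])
    fitness = np.array([cost(ind) for ind in pop])
    return pop[np.argmin(fitness)], fitness.min()
```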
Particle swarm optimization algorithm
The PSO algorithm is a common evolutionary algorithm based on the swarming behavior of groups. Each particle represents one candidate solution to the problem, and the swarm collectively searches the solution space (Eberhart and Kennedy 1995). The PSO algorithm starts from random solutions and improves them iteratively, with minimum (Vmin) and maximum (Vmax) velocity bounds constraining how far particles can move. The best position found by each individual particle (Pb) and the best position achieved globally (Gb) by the entire swarm are recorded in each iteration of the algorithm (Atashnezhad et al. 2014). This information is carried to the next iteration: each particle's velocity Vt is updated to Vt + 1 based on its current velocity, its personal best Pb, and the global best Gb, where t refers to the iteration number, and the particle's position is adjusted accordingly. In each iteration, the particles achieving the lowest RMSE objective values have the greatest influence on the next generation of particles (Rashidi et al. 2021). References to the related velocity-update equations and flowcharts can be found in Singh et al. (2020) and Cai et al. (2020).
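The velocity and position updates described above can be sketched as a generic minimizer as follows; the inertia and acceleration coefficients (w, c1, c2) and the velocity bounds are illustrative defaults, not the tuned control parameters reported later in Tables 3-6:

```python
import numpy as np

def pso_minimize(cost, dim, n_particles=30, n_iter=100,
                 v_min=-0.5, v_max=0.5, w=0.7, c1=1.5, c2=1.5, rng=None):
    """Minimal PSO loop: velocities are clipped to [v_min, v_max] every iteration."""
    rng = np.random.default_rng(rng)
    x = rng.uniform(-1.0, 1.0, size=(n_particles, dim))      # particle positions
    v = rng.uniform(v_min, v_max, size=(n_particles, dim))   # particle velocities
    pb, pb_cost = x.copy(), np.array([cost(p) for p in x])   # personal bests (Pb)
    gb = pb[np.argmin(pb_cost)]                               # global best (Gb)
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, dim))
        # Velocity update: inertia + pull toward Pb + pull toward Gb.
        v = np.clip(w * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x), v_min, v_max)
        x = x + v
        costs = np.array([cost(p) for p in x])
        improved = costs < pb_cost
        pb[improved], pb_cost[improved] = x[improved], costs[improved]
        gb = pb[np.argmin(pb_cost)]
    return gb, pb_cost.min()
```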
Hybrid MELM-PSO/GA model
One of the initial requirements of the MELM neural network is to determine the number of hidden layers (l) and the number of neurons (n) in each of them, which is usually done by trial and error (Tang et al. 2014). One of the benefits of using an optimizer algorithm is that it increases the speed and scope of the search for effective values. Finding the optimal structural parameters is sensitive and important, because an unsuitable choice can inflate the number of hidden layers and the number of neurons and nodes within them, resulting in structural complexity and an inefficient model, increasing run time, and even preventing the network structure from forming (Zhang et al. 2016). In this paper, we use one of the most recent MELM construction methods (Fig. 4) to identify the desired number of hidden layers and of nodes within those layers (instead of trial and error) and, once the structure is established, to identify the optimal weights applied to each neuron in each hidden layer as well as the biases applied to each hidden layer (replacing the random assignment of these values) (Farsi et al. 2021; Su et al. 2019; Zheng et al. 2019). A hybrid algorithm combining the PSO and GA optimization algorithms with the MELM network, which has not previously been used in this field, is applied here. The combination of PSO with MELM forms the PSO-MELM-PSO hybrid, and the combination of GA with this network forms the GA-MELM-GA hybrid. In this study, the number of hidden layers of the MELM is allowed to vary from 3 to 9, and the number of neurons in each hidden layer from 5 to 25, in order to determine and predict FVDC (Table 2). As shown in Table 2, MELM-PSO achieves higher accuracy (lower RMSE) with 3 to 7 hidden layers and 10 to 20 neurons per hidden layer; the control parameters used by the PSO and GA algorithms for fracture prediction are specified in Tables 3 and 4.
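The structure search described above can be illustrated with a short sketch. For brevity it sweeps an exhaustive grid over the reported ranges (3-9 hidden layers, 5-25 neurons per layer) and keeps the configuration with the lowest validation RMSE; in the paper this search is driven by PSO and GA rather than a grid, and train_melm and rmse are hypothetical placeholders for the MELM training routine and error measure:

```python
import itertools
import numpy as np

def search_melm_structure(X_tr, y_tr, X_val, y_val, train_melm, rmse):
    """Pick the (layers, neurons-per-layer) pair with the lowest validation RMSE."""
    best_config, best_err = None, np.inf
    for n_layers, n_neurons in itertools.product(range(3, 10), range(5, 26)):
        # train_melm is assumed to return a fitted model exposing .predict().
        model = train_melm(X_tr, y_tr, hidden_layers=[n_neurons] * n_layers)
        err = rmse(y_val, model.predict(X_val))
        if err < best_err:
            best_config, best_err = (n_layers, n_neurons), err
    return best_config, best_err
```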
Table 2 Report of RMSE values for the numbers of neurons in the layers of the MELM network for the prediction of FVDC
Table 3 Control parameters for the PSO algorithm in PSO-MELM-PSO for the prediction of FVDC
Table 4 Control parameters for the GA algorithm in GA-MELM-GA for the prediction of FVDC
Hybrid MLP-PSO/GA model
A schematic of how the MLP-PSO and MLP-GA models are implemented (Hazbeh et al. 2021b) is shown in Fig. 5. To optimize the weights and layer nodes (or neurons) of these hybrid neural networks, the MLP network is hybridized with the GA and PSO algorithms (Rashidi et al. 2021; Sabah et al. 2019); the control values of the PSO and GA parameters are given in Tables 5 and 6.
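As a rough illustration of how an evolutionary optimizer can stand in for backpropagation in such hybrids, the following sketch packs the weights and biases of a small MLP into a single flat vector and lets the pso_minimize routine sketched earlier (or ga_minimize) tune that vector against an RMSE cost. The names sizes, mlp_forward, and rmse_cost are illustrative, not the paper's implementation:

```python
import numpy as np

def mlp_forward(x, params, sizes):
    """Forward pass of a small MLP whose weights are packed into one flat vector."""
    idx, a = 0, x
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = params[idx: idx + n_in * n_out].reshape(n_in, n_out); idx += n_in * n_out
        b = params[idx: idx + n_out]; idx += n_out
        # tanh on every layer; a tanh output suits targets normalized to [-1, 1] per Eq. (1).
        a = np.tanh(a @ W + b)
    return a

def rmse_cost(params, X, y, sizes):
    """RMSE of the packed-parameter MLP; used as the PSO/GA objective."""
    pred = mlp_forward(X, params, sizes).ravel()
    return np.sqrt(np.mean((y - pred) ** 2))

# Example usage (hypothetical data): sizes = [n_inputs, 10, 1]
# dim = sum((a + 1) * b for a, b in zip(sizes[:-1], sizes[1:]))
# best_params, best_rmse = pso_minimize(lambda p: rmse_cost(p, X_train, y_train, sizes), dim)
```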
Table 5 Control parameters for the PSO algorithm in PSO-MELM-PSO for the prediction of FVDC
Table 6 Control parameters for the GA algorithm in GA-MELM-GA for the prediction of FVDC
Statistical errors for FVDC prediction
To evaluate the FVDC predictions of the HML models, the artificial intelligence methods are compared using statistical error measures: mean square error (MSE), percentage deviation (PDi), also called relative error (RE), average percentage deviation (APD), absolute average percentage deviation (AAPD), coefficient of determination (R2), root mean square error (RMSE; the objective function of the HML models), and standard deviation (SD). The equations used in this work are given as Eqs. (2)-(8) below; a short computational sketch of these measures follows the equations:
Mean Square Error (MSE):
$${\text{MSE}} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left( {Z_{{{\text{Measured}}\;i}} - Z_{{{\text{Predicted}}\;i}} } \right)^{2}$$
(2)
Percentage deviation (PDi) or relative error (RE):
$${\text{PD}}_{i} = \frac{{H_{{\left( {{\text{Measured}}} \right)}} - H_{{\left( {{\text{Predicted}}} \right)}} }}{{H_{{\left( {{\text{Measured}}} \right)}} }} \times { }100$$
(3)
Average percentage deviation (APD):
$${\text{APD}} = \frac{{\mathop \sum \nolimits_{i = 1}^{n} {\text{PD}}_{i} }}{n}$$
(4)
Absolute average percentage deviation (AAPD):
$${\text{AAPD}} = \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left| {{\text{PD}}_{i} } \right|}}{n}$$
(5)
Coefficient of Determination (R2):
$$R^{2} = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {H_{{{\text{Predicted}}\;i}} - H_{{{\text{Measured}}\;i}} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{n} \left( {H_{{{\text{Predicted}}\;i}} - \frac{{\mathop \sum \nolimits_{i = 1}^{n} H_{{{\text{Measured}}\;i}} }}{n}} \right)^{2} }}$$
(6)
Root Mean Square Error (RMSE):
$${\text{RMSE}} = \sqrt {{\text{MSE}}}$$
(7)
Standard Deviation (SD):
$$\begin{gathered} {\text{SD}} = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {D_{i} - D_{{{\text{mean}}}} } \right)^{2} }}{n - 1}} \hfill \\ D_{i} = H_{{{\text{Measured}}\;i}} - H_{{{\text{Predicted}}\;i}} ,\quad D_{{{\text{mean}}}} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} D_{i} \hfill \\ \end{gathered}$$
(8)
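These error measures can be computed directly from the measured and predicted values, for example as in the sketch below (a convenience function, with the R2 term following Eq. (6) exactly as written above):

```python
import numpy as np

def error_metrics(measured, predicted):
    """Compute the statistical error measures of Eqs. (2)-(8)."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    n = measured.size
    d = measured - predicted                                     # D_i in Eq. (8)
    mse = np.mean(d ** 2)                                        # Eq. (2)
    pd_i = d / measured * 100.0                                  # Eq. (3), PDi / RE
    apd = pd_i.mean()                                            # Eq. (4)
    aapd = np.abs(pd_i).mean()                                   # Eq. (5)
    r2 = 1.0 - (np.sum((predicted - measured) ** 2)
                / np.sum((predicted - measured.mean()) ** 2))    # Eq. (6), as written above
    rmse = np.sqrt(mse)                                          # Eq. (7)
    sd = np.sqrt(np.sum((d - d.mean()) ** 2) / (n - 1))          # Eq. (8)
    return {"MSE": mse, "PD": pd_i, "APD": apd, "AAPD": aapd,
            "R2": r2, "RMSE": rmse, "SD": sd}
```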