1 Introduction

1.1 Research background

Porosity is a crucial physical parameter in reservoir characterization, essential for guiding the efficient and accurate planning of oil and gas exploration and development, thereby helping to reduce costs. Traditional methods for predicting porosity rely on direct measurement, which offers high accuracy but is time-consuming and resource-intensive, particularly due to the costs of sampling and core analysis across entire working areas. In contrast, indirect estimation methods are simpler, cost-effective, and efficient. However, these methods often depend on empirical formulas and simplified geologic models that assume reservoir homogeneity; they rarely move beyond linear relationships and can produce significant relative errors in the calculated results [1].

Moreover, constructing a porosity prediction model from a single well-logging dataset requires a substantial volume of data for fitting, with model parameters intricately linked to the logging data, making it difficult to guarantee accurate results. If the logging features are selected inappropriately, the model also struggles to maintain its accuracy [2]. In addition, with increasing well depth, the pore structure of the formation grows more intricate, presenting significant challenges to the precision of porosity prediction models [3]. In recent years, artificial intelligence technology has developed rapidly and been widely adopted in the field of oil and gas geology. Traditional machine-learning methods such as multiple linear regression, artificial neural networks, and support vector machines have been extensively used for porosity prediction, yielding satisfactory outcomes [4, 5]. However, with the growing focus on unconventional oil and gas reservoirs in exploration and development, the target media for reservoir prediction exhibit pronounced heterogeneity and anisotropy. Traditional machine-learning methods have limited capability to capture robust nonlinear mapping relationships and spatial continuity between porosity and well-logging data, restricting their application in engineering practice.

In the era of exploring complex reservoirs, there is an urgent need to develop advanced deep learning models that can efficiently and accurately predict porosity, providing essential guidance for oil and gas field exploration and development. This study focuses on overcoming current challenges in porosity prediction, including complex datasets, inappropriate logging curve selection, and subpar model performance, through rigorous data preprocessing and model optimization efforts. The goal is to enhance porosity prediction accuracy and reduce costs associated with petroleum development, thereby ensuring reliable reservoir characterization.

1.2 Related Work

Deep learning techniques offer distinct advantages in handling vast, high-dimensional, and complex spatiotemporal geologic data. These methods have shown substantial potential across diverse applications, including well log-curve reconstruction [6], reservoir characterization [7], and permeability prediction in oil and gas reservoirs [8]. Data-driven deep learning approaches have also made significant advances in predicting porosity. Table 1 provides a comparative review of related studies, listing authors, datasets, preprocessing methods, modeling approaches, and evaluation metrics.

Table 1 Summary of related work

Yang [10, 11] used density (DEN), acoustic (AC), gamma ray (GR), and other parameters to predict porosity using a convolutional neural network (CNN) and a deep neural network (DNN). The prediction results of the model showed strong agreement with the porosity in the original logging data, achieving a correlation coefficient of 0.9725. Nonetheless, CNN models demand a significant quantity of labeled data, presenting difficulties in obtaining sufficient core data for practical engineering applications.

Shao et al. [12] introduced a multitask DNN model to predict porosity, permeability, and water saturation simultaneously, significantly streamlining the prediction process. However, no preprocessing was performed on the logging dataset, whereas our work employs a more thorough data-preprocessing approach.

Wang et al. [13] used a deep bidirectional recurrent neural network to predict porosity, compensating for the inability of traditional deep networks to provide contextual information and improving the accuracy and stability of porosity prediction. In our study, we used different preprocessing methods and deep learning models.

An et al. [14] utilized long short-term memory (LSTM) neural networks to predict reservoir porosity, with superior performance compared to DNNs. Chen et al. [15] developed a multilayer LSTM that demonstrated efficient and cost-effective porosity prediction, highlighting LSTM’s robustness and accuracy in forecasting logging data with serialized structural features. Wang et al. [16] proposed an LSTM model incorporating domain information enhancement through principal component analysis and the K-means algorithm, effectively improving porosity prediction accuracy in carbonate rocks. Wang et al. [17] established a model linking logging curves and reservoir properties based on a gate recurrent unit (GRU), achieving strong results in porosity prediction. In contrast to these studies, our research focuses on optimizing model parameters to mitigate uncertainties associated with manual parameter tuning.

Song et al. [18] combined a CNN and GRU to develop a porosity prediction model, improving accuracy by adjusting the learning rate during training. While we also explore combined neural network models for accuracy improvement, our methods differ in model optimization.

Wang et al. [19] proposed a transfer DNN for predicting shale total porosity, reducing reliance on logging and core data. While we share similarities in the missing-value completion method, our study additionally incorporates box plots for data processing.

Pan et al. [20] addressed poor accuracy in porosity prediction by optimizing extreme gradient boosting (XGBoost) parameters using grid search and genetic algorithms, achieving favorable outcomes. Our research parallels theirs in the data analysis module, but we employ four distinct data analysis methods.

Huo et al. [21] proposed an enhanced stacking ensemble learning model for reservoir parameter prediction, improving model generalization. Both studies use ensemble learning strategies to improve prediction accuracy, but we introduce this strategy in the feature-selection module.

Dai et al. [22] utilized the particle swarm optimization (PSO) algorithm to optimize a Relevance Vector Machine (RVM) model for porosity prediction, thereby reducing model uncertainty and enhancing precision; the coefficient of determination R2 between the prediction results and the original values reached 0.969. Both works enhance the model’s generalization capacity through the PSO algorithm, but our research further refines PSO with a good-point set and an adaptive compression factor.

Moreover, to address the challenge of low accuracy in predicting complex reservoirs, researchers have developed various neural network models based on logging data. These models have been integrated with optimization algorithms such as fruit fly optimization [23], artificial fish swarm [24], firefly [25], gray wolf [26], fireworks [27], PSO [28], shuffled frog leaping [29], etc., to automatically optimize crucial model parameters. The optimized models exhibit enhanced feature learning capabilities, improved generalization adaptability, and achieve higher prediction accuracy and efficiency for porosity compared to standard neural networks. In our study, we employed an enhanced PSO algorithm to optimize GRU model parameters.

1.3 Contributions

The principal contributions of this study are outlined as follows:

First, an optimization method for parameter selection based on ensemble learning was proposed to address the issues of low quality, redundancy, and multicollinearity in logging data.

  • Box plots were employed to detect outliers, which were treated as missing values and filled by linear interpolation; min–max normalization was then applied for scaling. In this way, the data are cleansed, missing values are filled, and the comparability and consistency of the data are ensured.

  • Following data cleaning, a committee-voting-based ensemble learning (EL) strategy was developed by incorporating random forest (RF), gray relation analysis (GRA), the maximum information coefficient (MIC), and the Spearman coefficient. This strategy selected a subset of logging features highly correlated with porosity to tackle the multiscale coupling issue of high-dimensional logging data with complex space–time dependence. This approach not only avoided the limitations of a single-feature selection algorithm but also simplified data dimensions, provided high-quality samples for deep learning-based porosity prediction models, and enhanced accuracy and efficiency, while reducing computational space and costs.

Second, to address the issues of low model accuracy due to random population initialization in the PSO algorithm and the risk of missing the global optimal solution due to excessively fast particle velocity during iteration, an improved algorithm termed GPSCF-PSO (PSO based on a good-point set and adaptive compression factor) was introduced.

  • The population diversity was enhanced through initialization with a good-point set, improving the accuracy of algorithm optimization.

  • In addition, a compression factor was introduced to adaptively adjust the particle velocity update method based on the iteration count to prevent the algorithm from being trapped in local optima.

  • Experimental comparisons were conducted on 10 benchmark test functions against four swarm intelligence algorithms to validate the superiority of the proposed algorithm.

Finally, an EL-IPSO-GRU porosity prediction model (GRU based on EL and improved PSO) was proposed, considering the complex space–time sequence characteristics of logging data.

  • The improved PSO algorithm was utilized to optimize the number of hidden layers, batch size, and learning rate, minimizing the influence of manual empirical values, and thus enhancing model accuracy and reliability.

  • The performance of the proposed model was compared with those of LSTM, GRU, and EL-GRU using the mean square error (MSE), mean absolute error (MAE), and coefficient of determination (R2) as evaluation metrics on four wells in a working area.

  • The experimental findings demonstrated that the proposed model exhibits higher prediction accuracy compared to the other three methods.

1.4 Organization of the Paper

This paper is structured into seven sections:

  • Introduction: This section provides an overview of the research background of porosity prediction, evaluates the strengths and weaknesses of prior methodologies, and introduces the EL-IPSO-GRU model adopted in this study.

  • Optimal selection of logging data features based on EL: This section details the data-preprocessing tasks and introduces an ensemble learning strategy incorporating four correlation analysis techniques.

  • Improved PSO algorithm based on a good-point set and adaptive compression factor: This section presents the fundamental concepts of the PSO algorithm, identifies existing challenges, proposes an enhanced PSO algorithm, and conducts comparative analyses to demonstrate the benefits of four distinct PSO algorithms.

  • Optimized GRU for porosity prediction based on EL and the improved PSO algorithm: This part integrates the approaches from the first two sections with the GRU to predict porosity and conducts comparative evaluations of the accuracy of the different models.

  • Discussion: This section interprets, analyzes, and discusses the experimental results.

  • Conclusions: This section recapitulates the research objectives and methodologies, underscoring the significance and contributions of this research.

  • References: This part catalogs all the references cited throughout the paper.

2 Optimal Selection of Logging Data Features Based on EL

A single-feature selection method cannot effectively characterize the complex nonlinear mapping relationship between logging curves and porosity. Logging data also exhibit multicollinearity, meaning that two or more logging features are highly correlated, so a model may become overly dependent on one or a few features. Therefore, this study investigated the optimal selection of porosity-sensitive logging features to eliminate multicollinearity. To address issues of logging data quality and integrity, outliers and missing data were initially handled using box plots and linear interpolation, followed by min–max normalization. Subsequently, a committee-voting model based on EL was constructed by integrating the Spearman coefficient, RF, GRA, and MIC. Feature subsets with high correlations with porosity were selected from the high-dimensional logging data. This process aimed to enhance the computational efficiency and prediction accuracy of GRU-based porosity prediction models by eliminating redundancy and multicollinearity in the high-quality sample data provided.

2.1 Dataset Selection and Preprocessing

2.1.1 Data Selection

The experimental data are sourced from four wells in a certain oilfield. The A01 well was the main experimental site, with a depth interval of 2405–4320 m and a sampling interval of 0.125 m. It includes 95 logging curves, totaling approximately 15,000 data points. Considering the wide variety of logging parameters and the varying sensitivity of different parameters to porosity, and to eliminate redundancy and multicollinearity, 15 logging parameters were initially selected based on expert experience and the literature [19], in addition to DEPTH and porosity (POR): acoustic (AC), compensated neutron (CNL), density (DEN), deviation (DEV), dip angle (DAZ), gamma ray (GR), natural gamma spectroscopy logging (K, U, and TH), gamma ray without uranium (KTH), resistivity microacoustic (RMA), resistivity (RT), resistivity array (RTA), flushed zone formation resistivity (RXO), and spontaneous potential (SP).

The specific distribution of the data is shown in Table 2, including logging curves, unit, average value, maximum value, and minimum value.

Table 2 Statistical analysis of logging data

2.1.2 Data Preprocessing

Due to complex geologic conditions, instrumentation, measurement costs, and other factors, logging data often contain missing and abnormal values, which significantly impact the performance of porosity prediction models [30]. Therefore, data preprocessing is essential prior to model training. Logging attributes with more than 50% missing values were removed outright, and for attributes with a small number of missing values, mean imputation was employed. In addition, outliers were identified using box plots based on the interquartile range, and data quality was enhanced using linear interpolation.

In addition, because the ranges and dimensions of the various logging data are inconsistent, neural network training often favors attributes with larger values over those with smaller values, impacting training speed and prediction accuracy. Hence, this study employs min–max normalization to scale logging attribute values to the range [0,1], as depicted in Eq. (1), where z and z* denote the original and normalized values of a specific logging attribute, respectively, and zmin and zmax represent its minimum and maximum original values.

$$ z^{*} = \frac{{z - z_{{\min }} }}{{z_{{\max }} - z_{{\min }} }} $$
(1)

Figure 1 illustrates the distribution of normalized logging data from well A01 before and after outlier processing. In the figure, red denotes outlier data, while blue indicates the median. The integration of box plot and linear interpolation methods effectively eliminates outliers from the logging curve, significantly enhancing the quality of the preprocessed logging data.

Fig. 1
figure 1

Distribution of normalized logging data of well A01
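For illustration, the preprocessing pipeline described above can be condensed into a few lines of Python. The following is a minimal sketch, assuming the logging curves are held in a pandas DataFrame ordered by depth; the column names and the conventional 1.5 × IQR whisker rule are illustrative assumptions rather than settings reported in this paper.

```python
import numpy as np
import pandas as pd

def preprocess_logs(df: pd.DataFrame) -> pd.DataFrame:
    """Box-plot (IQR) outlier removal, linear interpolation, min-max scaling."""
    df = df.copy()
    for col in df.columns:
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        # Treat outliers as missing values, as in Sect. 2.1.2
        df.loc[(df[col] < lower) | (df[col] > upper), col] = np.nan
        # Fill missing values by linear interpolation along depth
        df[col] = df[col].interpolate(method="linear", limit_direction="both")
        # Min-max normalization to [0, 1], Eq. (1)
        z_min, z_max = df[col].min(), df[col].max()
        df[col] = (df[col] - z_min) / (z_max - z_min)
    return df

# Hypothetical usage with a few of the selected curves
logs = pd.DataFrame(np.random.rand(100, 4), columns=["AC", "CNL", "DEN", "GR"])
clean = preprocess_logs(logs)
```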

2.2 Data Feature Correlation Analysis

To reduce the computational complexity of the model, further feature selection based on the relationship between the prediction target and the logging parameters is necessary. Key methods for analyzing these relationships include linear and nonlinear analyses such as the Pearson coefficient, Kendall coefficient, Spearman coefficient, and GRA [32]. Given the temporal characteristics and strong nonlinear relationships inherent in logging data, this study employed the Spearman coefficient, GRA, RF, and MIC to assess the degree of correlation among 17 logging features, using the A01 well-logging data as a case study. Subsequently, a committee-voting strategy was devised to identify a subset of porosity-sensitive logging features for input into the GRU model.

2.2.1 Gray Relation Analysis

The GRA method, rooted in gray system theory, assesses correlation by comparing a reference sequence that reflects system behavior characteristics (X0 = {X0(k) | k = 1, 2, …, n}) with comparison sequences (Xi = {Xi(k) | k = 1, 2, …, n}, i = 1, 2, …, m) that influence system behavior, where m denotes the number of factors and n the number of observations per factor. The method determines correlation from the consistency of trends between the reference and comparison sequences: correlation is strong when the trends align and weak when they diverge. The specific steps of GRA are as follows.

Step 1: Normalize the data, as shown in Eq. (2).

$$ \overline{X}_{i} = \frac{{X_{i} \left( k \right) - \mathop {\min }\limits_{k} \left\{ {X_{i} \left( k \right)} \right\}}}{{\mathop {\max }\limits_{k} \left\{ {X_{i} \left( k \right)} \right\} - \mathop {\min }\limits_{k} \left\{ {X_{i} \left( k \right)} \right\}}} = \left\{ {\overline{X}_{i} \left( 1 \right),\overline{X}_{i} \left( 2 \right), \ldots ,\overline{X}_{i} \left( n \right)} \right\},\quad i = 1,2, \ldots ,m. $$
(2)

Step 2: Find the difference between the comparison sequence and the reference sequence, as shown in Eq. (3).

$$ \Delta \overline{X}_{i} \left( k \right) = \left| {\overline{X}_{0} \left( k \right) - \overline{X}_{i} \left( k \right)} \right| = \left\{ {\Delta \overline{X}_{i} \left( 1 \right),\Delta \overline{X}_{i} \left( 2 \right), \ldots ,\Delta \overline{X}_{i} \left( n \right)} \right\},\quad i = 1,2, \ldots ,m. $$
(3)

Step 3: Calculate the two-level maximum difference D and minimum difference d, as shown in Eqs. (4) and (5), respectively.

$$ D = \max_{i} \max_{k} \Delta \overline{X}_{i} \left( k \right) $$
(4)
$$ d = \min_{i} \min_{k} \Delta \overline{X}_{i} \left( k \right) $$
(5)

Step 4: Calculate the correlation coefficient between the comparison sequence and the reference sequence, as shown in Eq. (6), where \({\Delta }_{i}(k)\) is the k-th difference of the i-th factor in the difference sequence matrix and \(\xi\) is the distinguishing coefficient, typically set to 0.5.

$$ \zeta_{0i} \left( k \right) = \frac{d + \xi D}{{\Delta_{i} \left( k \right) + \xi D}}. $$
(6)

Step 5: Calculate the gray correlation degree, as shown in Eq. (7).

$$ y_{0i} = \frac{1}{n}\mathop \sum \limits_{k = 1}^{n} \zeta_{0i} \left( k \right). $$
(7)

We used the logging curves of well A01 as input for the GRA algorithm. For any two logging curves, one serves as the reference sequence and the other as the comparison sequence. Based on Eqs. (2)–(7), we calculated the correlation degree between any two logging parameters and obtained the heat map shown in Fig. 2. In the GRA analysis, CNL, K, and SP are strongly correlated with porosity.

Fig. 2
figure 2

Gray correlation matrix of logging data in well A01
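For illustration, Steps 1–5 translate directly into NumPy. The sketch below handles a single comparison sequence, so the two-level extrema of Eqs. (4) and (5) reduce to the maximum and minimum over samples; the distinguishing coefficient ξ = 0.5 is the conventional default and an assumption here, not a value taken from this paper.

```python
import numpy as np

def grey_relational_degree(ref: np.ndarray, comp: np.ndarray, xi: float = 0.5) -> float:
    """Grey relational degree of one comparison sequence to a reference
    sequence, following Eqs. (2)-(7); xi is the distinguishing coefficient."""
    norm = lambda s: (s - s.min()) / (s.max() - s.min())   # Step 1, Eq. (2)
    x0, x1 = norm(ref), norm(comp)
    delta = np.abs(x0 - x1)                                # Step 2, Eq. (3)
    D, d = delta.max(), delta.min()                        # Step 3, Eqs. (4)-(5)
    zeta = (d + xi * D) / (delta + xi * D)                 # Step 4, Eq. (6)
    return float(zeta.mean())                              # Step 5, Eq. (7)

# Hypothetical usage: porosity as the reference, one curve as the comparison
por = np.random.rand(200)
cnl = 0.7 * por + 0.3 * np.random.rand(200)
print(grey_relational_degree(por, cnl))
```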

2.2.2 Maximum Information Coefficient

The MIC measures the degree of correlation between two variables X and Y. The mutual information value is transformed into a metric within the interval [0,1] by employing an optimal discretization method. The main steps are outlined as follows.

Step 1: Variables x and y are partitioned into an m × n grid, and the frequency of data points falling within grid cell (x, y) is estimated as P(x, y), as depicted in Eq. (8), where N(x, y) denotes the count of points within cell (x, y).

$$ P\left( {x,y} \right) = \frac{{N\left( {x,y} \right)}}{{\mathop \sum \nolimits_{i = 1}^{m} \mathop \sum \nolimits_{j = 1}^{n} N\left( {i,j} \right)}}. $$
(8)

Step 2: The grid partitioning that maximizes the mutual information is selected, and a normalization factor converts the mutual information value into the range [0,1], as demonstrated in Eqs. (9) and (10). The grid resolution limit is defined as m × n < B, with B given by Eq. (9); the empirical power exponent of 0.6 can be adjusted for specific datasets. I(X,Y) represents the mutual information between variables X and Y; P(x,y) denotes their joint probability density function; and P(x) and P(y) are the marginal probability density functions of X and Y, respectively. Because the maximum possible mutual information on an m × n grid is log(min{m, n}), the MIC is computed according to Eq. (11).

$$ B = \left( {\mathop \sum \limits_{i = 1}^{m} \mathop \sum \limits_{j = 1}^{n} N\left( {i,j} \right)} \right)^{0.6} $$
(9)
$$ I\left( {X,Y} \right) = \mathop \sum \limits_{y \in Y} \mathop \sum \limits_{{{\text{x}} \in {\text{X}}}} P\left( {x,y} \right){\text{log}}\left( {\frac{{P\left( {x,y} \right)}}{P\left( x \right)P\left( y \right)}} \right) $$
(10)
$$ {\text{MIC}}\left( {X,Y} \right) = \mathop {\max }\limits_{m \times n < B} \frac{{I\left( {X,Y} \right)}}{{\log \left( {\min \left\{ {m,n} \right\}} \right)}}. $$
(11)

Based on the above steps, we conducted data correlation analysis using the MIC method. Figure 3 displays the correlation between logging parameters and porosity. The analysis reveals that the DEPTH, GR, and AC curves exhibit significant correlation with porosity.

Fig. 3
figure 3

Correlation analysis results based on MIC
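A faithful MIC implementation optimizes the partition boundaries for every admissible grid size. The sketch below is a simplified approximation that searches equal-width grids only, subject to m × n < B from Eq. (9); it illustrates Eqs. (8), (10), and (11) but is not the reference algorithm.

```python
import numpy as np

def mic_approx(x: np.ndarray, y: np.ndarray) -> float:
    """Simplified MIC over equal-width m x n grids with m * n < B = N**0.6."""
    N = len(x)
    B = N ** 0.6
    best = 0.0
    for m in range(2, int(B) + 1):
        for n in range(2, int(B // m) + 1):
            if m * n >= B:
                break
            # Joint frequencies P(x, y) on the m x n grid, Eq. (8)
            joint, _, _ = np.histogram2d(x, y, bins=(m, n))
            p_xy = joint / N
            p_x = p_xy.sum(axis=1, keepdims=True)
            p_y = p_xy.sum(axis=0, keepdims=True)
            nz = p_xy > 0
            # Mutual information, Eq. (10)
            I = (p_xy[nz] * np.log(p_xy[nz] / (p_x @ p_y)[nz])).sum()
            # Normalize by log(min(m, n)), Eq. (11)
            best = max(best, I / np.log(min(m, n)))
    return best

# Hypothetical usage on a nonlinear relationship
x = np.random.rand(500)
y = np.sin(6 * x) + 0.1 * np.random.randn(500)
print(mic_approx(x, y))
```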

2.2.3 Random Forest

RF is a widely used and robust EL method. Its fundamental approach involves constructing multiple decision trees and aggregating their predictions to enhance the accuracy and resilience of prediction or classification tasks. In regression tasks, the RF algorithm employs averaging or voting mechanisms to determine the final predicted value. Figure 4 illustrates the RF model.

Fig. 4
figure 4

Random forest (RF) model

The study employs the RF model to analyze the importance of different logging parameters on porosity, with results depicted in Fig. 5. According to Fig. 5, the GR, KTH, and SP curves demonstrate significant correlation with porosity in the RF analysis.

Fig. 5
figure 5

Correlation analysis results based on RF
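For illustration, importance scores of the kind plotted in Fig. 5 can be obtained from the impurity-based feature importances of scikit-learn's RandomForestRegressor. The data, column names, and hyper-parameters below are assumptions for demonstration; the paper does not specify the RF settings.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical frame of preprocessed curves; "POR" is the target
data = pd.DataFrame(np.random.rand(500, 5),
                    columns=["AC", "CNL", "GR", "SP", "POR"])
X, y = data.drop(columns="POR"), data["POR"]

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X, y)

# Impurity-based importance of each logging parameter for porosity
importance = pd.Series(rf.feature_importances_, index=X.columns)
print(importance.sort_values(ascending=False))
```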

2.2.4 Spearman Coefficient

Logging data exhibit nonlinear characteristics, and the Spearman correlation coefficient outperforms the Pearson coefficient in handling nonlinear data; hence, this study employed the Spearman correlation coefficient for its data analysis. It assesses the strength of the relationship between two variables by evaluating their monotonicity, is typically denoted ρ, and is defined in Eq. (12).

$$ \rho = \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {x_{i} - \overline{x}} \right)\left( {y_{i} - \overline{y}} \right)}}{{\sqrt {\mathop \sum \nolimits_{i = 1}^{n} \left( {x_{i} - \overline{x}} \right)^{2} \mathop \sum \nolimits_{i = 1}^{n} \left( {y_{i} - \overline{y}} \right)^{2} } }}, $$
(12)

where xi and yi denote the ranks of the i-th of n samples under the two variables, and x̄ and ȳ are the corresponding mean ranks; ρ takes values in the range [−1, 1], and a larger |ρ| indicates a stronger monotonic correlation between the two variables.

Unlike the Pearson coefficient, which assumes linear relationships, the Spearman coefficient is adept at capturing nonlinear correlations between logging parameters and the porosity, with results illustrated in Fig. 6. According to Fig. 6, the GR, KTH, CNL, and AC curves exhibit significant correlation with porosity in the Spearman analysis.

Fig. 6
figure 6

Spearman coefficient matrix diagram of logging data in well A01
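Equation (12) is simply the Pearson correlation applied to ranks, which the following sketch makes explicit and checks against SciPy's built-in implementation; the synthetic monotone data are illustrative.

```python
import numpy as np
from scipy import stats

def spearman(x: np.ndarray, y: np.ndarray) -> float:
    """Spearman's rho as the Pearson correlation of the ranks, Eq. (12)."""
    rx = stats.rankdata(x)
    ry = stats.rankdata(y)
    return float(np.corrcoef(rx, ry)[0, 1])

x = np.random.rand(100)
y = x ** 3 + 0.05 * np.random.rand(100)   # monotone but nonlinear
print(spearman(x, y), stats.spearmanr(x, y)[0])
```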

2.3 Optimal Selection of Logging Features Based on Committee Voting

In traditional research concerning logging-feature selection, a single-feature selection algorithm is typically employed. However, each algorithm is suited to specific scenarios, and different algorithms may yield varying results for the same logging data. To mitigate the limitations of relying on a single-feature method, this paper proposes a logging-feature selection strategy using committee-voting EL.

The approach involves several distinct steps. Initially, the four feature selection methods were used to evaluate the importance of the logging parameters for porosity (Table 3). Subsequently, the results from each method were ranked in descending order. The ranked parameters were then divided into four tiers with sizes in a 3:3:3:1 ratio and assigned scores of 4, 2, 1, and 0 points, respectively. The total score for each logging parameter was then computed, as illustrated in Fig. 7. Finally, the top-ten logging parameters were selected as inputs for the porosity prediction model. Figure 8 presents the distribution of the selected logging data.

Table 3 Analysis results of the importance of logging parameters based on four feature selection methods
Fig. 7
figure 7

Committee voting scores of logging parameters of well A01

Fig. 8
figure 8

Distribution of selected logging parameters
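For illustration, the committee-voting score can be sketched as below, under the stated 3:3:3:1 tier ratio with 4/2/1/0 points. The tier-boundary rounding and the synthetic rankings are assumptions; in practice, the per-method rankings of Table 3 would be substituted.

```python
import numpy as np
import pandas as pd

def committee_vote(rankings: dict) -> pd.Series:
    """Committee voting over per-method rankings (best feature first).
    Each ranking is split into four tiers sized in a 3:3:3:1 ratio and
    scored 4, 2, 1, and 0 points respectively, as in Sect. 2.3."""
    points = (4, 2, 1, 0)
    total = {}
    for ranked in rankings.values():
        k = len(ranked)
        bounds = np.cumsum([round(k * 3 / 10)] * 3)   # tier boundaries
        for pos, feat in enumerate(ranked):
            tier = int(np.searchsorted(bounds, pos, side="right"))
            total[feat] = total.get(feat, 0) + points[tier]
    return pd.Series(total).sort_values(ascending=False)

# Hypothetical rankings from the four methods on ten features
feats = ["AC", "CNL", "GR", "SP", "K", "KTH", "DEN", "RT", "U", "TH"]
rng = np.random.default_rng(0)
rankings = {m: list(rng.permutation(feats)) for m in ("GRA", "MIC", "RF", "Spearman")}
print(committee_vote(rankings).head(10))   # top-scoring parameters feed the GRU
```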

3 Improved PSO Algorithm Based on a Good-Point Set and an Adaptive Compression Factor

3.1 Principles and Existing Problems of PSO

3.1.1 Basic Principles of the PSO Algorithm

The PSO algorithm, a well-known swarm intelligence method, draws inspiration from the study of bird foraging behavior. Through collective information sharing, the swarm aims to locate the optimal destination. Each particle in the swarm searches along a determined direction and keeps track of the location where the highest quantity of food has been found. Furthermore, particles exchange information regarding the location and quantity of food discovered during each iteration. Throughout the search process, the entire swarm adjusts its direction based on the global and individual optimal positions, as depicted in Eqs. (13) and (14).

$$ v_{i} = \omega {*}v_{i} + c_{1} {*}rand\left( {0,1} \right){*}\left( {p_{i} - x_{i} } \right) + c_{2} {* }rand\left( {0,1} \right){*}\left( {g_{best} - x_{i} } \right) $$
(13)
$$ x_{i} = x_{i} + v_{i} , $$
(14)

where vi represents the velocity of the i-th particle; xi denotes the position of the i-th particle; \(\omega \) is the inertia weight; pi is the best position experienced by the i-th particle; gbest is the best position experienced by all particles in the swarm; rand(0,1) is a random number uniformly distributed on (0,1); and c1 and c2 are learning factors, typically ranging from 0 to 2.
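For reference, Eqs. (13) and (14) in vectorized NumPy form, as a baseline for the modifications introduced in Sect. 3.2; the array shapes are an implementation choice.

```python
import numpy as np

def pso_step(x, v, p_best, g_best, w=0.9, c1=2.05, c2=2.05):
    """One standard PSO update for the whole swarm, Eqs. (13)-(14).
    x, v, p_best: (n_particles, dim); g_best: (dim,)."""
    r1 = np.random.rand(*x.shape)
    r2 = np.random.rand(*x.shape)
    v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)  # Eq. (13)
    return x + v, v                                              # Eq. (14)
```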

3.1.2 Problems with PSO Algorithm

The PSO algorithm is recognized for its rapid computational speed and dependable outcomes, rendering it appropriate for addressing optimization challenges in multidimensional spaces. Nevertheless, this study identifies limitations, primarily manifested in two areas.

(1) Sensitivity to initial values: The distribution and quality of the initial population notably influence the PSO algorithm. Minor variations in initialization can cause substantial differences in the quality of the ultimate solution. Hence, a random initialization may lead to inadequate diversity and reduced global search capabilities.

(2) Proneness to local optima: Owing to its local search characteristics, the PSO algorithm may become confined near a local optimum, failing to explore the entire space for superior solutions. This difficulty in escaping local optima constrains the algorithm’s ability to locate the global optimum.

3.2 GPSCF-PSO Algorithm

To address the deficiencies of the PSO algorithm, an enhanced algorithm named GPSCF-PSO was introduced, utilizing a good-point set and a compression factor. First, to tackle the inadequate population diversity and limited global search capability caused by random initialization in conventional PSO, a good-point set is used for population initialization, which improves the uniformity of the particle distribution and consequently the algorithm’s global search ability. Second, to mitigate the problem of high particle velocity during iterations, which may overshoot the global optimum, a compression factor is incorporated into the velocity update formula; this limits the particle search range and moderates velocity changes, facilitating faster and more effective convergence. Moreover, to keep the algorithm from stagnating at local optima during prolonged iterations, an adaptive position update strategy based on the iteration count was devised, promoting escape from local extremum points and thus enhancing robustness.

3.2.1 Population Initialization Based on a Good-Point Set

A good-point set is an effective method for selecting uniformly distributed points [33], with the following basic definition. Let \(G_s\) be the unit cube in s-dimensional Euclidean space. If, for \(r \in G_s\), the point set \(P_n(k)\) given by Eq. (15) has deviation \(\varphi(n)\) satisfying Eq. (16), where \(\varepsilon\) is any positive number, then \(P_n(k)\) is called a good-point set and \(r\) the good point.

$$ P_{n} \left( k \right) = \left\{ {\left( {\left\{ {r_{1}^{\left( n \right)} {*}k} \right\},\left\{ {r_{2}^{\left( n \right)} {*}k} \right\}, \ldots ,\left\{ {r_{s}^{\left( n \right)} {*}k} \right\}} \right),1 \le k \le n} \right\} $$
(15)
$$ { }\varphi \left( n \right) = C\left( {r,\varepsilon } \right)n^{ - 1 + \varepsilon } . $$
(16)

Set r = {2cos(2\(\pi \)k/p)}, where 1 ≤ k ≤ s and p is the smallest prime satisfying (p − 3)/2 ≥ s, and map the set to the search space as shown in Eq. (17), where \({upper}_{j}\) and \({low}_{j}\) are the upper and lower bounds of the j-th dimension.

$$ x_{i} \left( k \right) = \left( {upper_{j} - low_{j} } \right)\left\{ {P_{n} \left( k \right)} \right\} + low_{j} . $$
(17)
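A sketch of Eqs. (15)–(17): the good point r is built from the smallest admissible prime, the fractional parts of r·k form the point set, and Eq. (17) maps it into the search bounds. Only the swarm size and bounds in the usage lines are illustrative.

```python
import numpy as np

def good_point_set(n: int, s: int, lower: np.ndarray, upper: np.ndarray) -> np.ndarray:
    """Good-point-set initialization of n points in s dimensions.
    r_j = 2*cos(2*pi*j/p) with p the smallest prime with (p-3)/2 >= s;
    the points are the fractional parts {r_j * k}, mapped via Eq. (17)."""
    def is_prime(q):
        return q > 1 and all(q % d for d in range(2, int(q ** 0.5) + 1))
    p = 2 * s + 3
    while not is_prime(p):
        p += 1
    r = 2 * np.cos(2 * np.pi * np.arange(1, s + 1) / p)   # good point, Eq. (15)
    k = np.arange(1, n + 1).reshape(-1, 1)
    pts = np.mod(r * k, 1.0)                              # fractional parts
    return lower + (upper - lower) * pts                  # Eq. (17)

# Hypothetical 2-D swarm of 100 particles in [-10, 10]^2, as in Fig. 9
init = good_point_set(100, 2, np.array([-10.0, -10.0]), np.array([10.0, 10.0]))
```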

This study compares the two-dimensional initial populations created using both the good-point set method and the random method. With a population size set at 100, the individual distributions from the two methods are depicted in Fig. 9. The diagram illustrates that, compared to the random method, the good-point set method produces a more uniform distribution within the search space.

Fig. 9
figure 9

Initial population generated by two different strategies

Experimental findings show that the spatial pattern generated by the good-point set method remains consistent across different iteration counts, indicating stability. Consequently, this paper adopts the good-point set method for initializing the population in the PSO algorithm, enabling particles to traverse the search space more effectively and thereby enhancing global optimization performance.

3.2.2 Introducing a Compression Factor to Adaptively Adjust the Velocity Update Mode

In the early stages of iteration, if the optimal solution has not been found, global search capability diminishes while local search strengthens, the particle swarm converges, and the algorithm becomes prone to local optima. Thus, building on population initialization with a good-point set, this study introduces an adaptive velocity update strategy that incorporates a compression factor, as detailed in Eq. (18), where \(\varphi \) is the compression factor defined in Eq. (19) and flagnum is a flag variable for the update strategy, initially set to 0.

$$ v_{i} = \left\{ {\begin{array}{*{20}l} {\varphi {*}\left( {\omega {*}v_{i} + c_{1} {*}r_{1} {*}\left( {p_{i} - x_{i} } \right) + c_{2} {*}r_{2} {*}\left( {g_{best} - x_{i} } \right)} \right),} & {0 < flagnum \le 200} \\ {\omega {*}v_{i} + c_{1} {*}r_{1} {*}\left( {p_{i} - x_{i} } \right) + c_{2} {*}r_{2} {*}\left( {g_{best} - x_{i} } \right),} & {flagnum > 200} \\ \end{array} } \right. $$
(18)
$$ \varphi = \frac{2}{{\left| {2 - C - \sqrt {C^{2} - 4C} } \right|}},\quad C = c_{1} + c_{2} ,\;C > 4. $$
(19)

The algorithm adjusts flagnum depending on changes in the global optimum. If the global optimum remains unchanged after several iterations, a velocity update strategy employing a compression factor is used to reduce particle speed, allowing for more detailed exploration. Each iteration increments flagnum by one. If the global optimum changes, flagnum is reset to zero. However, if flagnum exceeds 200 and the global optimum remains unaltered, the original velocity update strategy without a compression factor is reinstated to facilitate the particles’ escape from local extremum points at increased speeds.

The velocity update with the compression factor effectively balances the global search capability in the early iterations with the local convergence ability in later stages. It can be adaptively adjusted based on the iteration count while the global optimum remains unchanged; thus, maintaining particle diversity, effectively preventing the algorithm from becoming stuck at a local optimum, and enhancing its global search performance.
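The flagnum mechanism and Eq. (18) can be sketched as follows; with c1 = c2 = 2.05 (Sect. 3.3.2), Eq. (19) gives φ ≈ 0.73. The stagnation threshold of 200 iterations follows the description above, and the helper assumes the same array shapes as the earlier PSO sketch.

```python
import numpy as np

C = 2.05 + 2.05                                    # C = c1 + c2 > 4
PHI = 2.0 / abs(2.0 - C - np.sqrt(C**2 - 4 * C))   # compression factor, Eq. (19)

def adaptive_step(x, v, p_best, g_best, flagnum, w=0.9, c1=2.05, c2=2.05):
    """Velocity update of Eq. (18): while the global best has stagnated for
    0 < flagnum <= 200 iterations, the compressed update is used; beyond 200
    the standard update is restored to help particles escape local optima."""
    r1, r2 = np.random.rand(*x.shape), np.random.rand(*x.shape)
    v_new = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
    if 0 < flagnum <= 200:
        v_new *= PHI
    return x + v_new, v_new
```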

3.2.3 Steps of the GPSCF-PSO Algorithm

The procedure for enhancing the adaptive PSO algorithm via the integration of a good-point set and a compression factor is delineated in Fig. 10.

Fig. 10
figure 10

Flowchart of GPSCF-PSO algorithm

The outlined steps and the corresponding pseudo-code are provided below, as shown in Algorithm 1.

Step 1: Establish algorithm-related parameters, including population size and the maximum number of iterations.

Step 2: Initialize the population position using the good-point set and set the compression factor.

Step 3: Compute the fitness for each particle’s position and update the individual’s historical optimal position alongside the global optimal position.

Step 4: Verify if there has been a change in the global optimal value.

Step 5: If the global optimal value remains unchanged after a predefined number of iterations, such as 200, advance to step 6; otherwise, proceed to step 7.

Step 6: Employ the standard PSO algorithm for updating velocity and position, as specified in Eqs. (13) and (14).

Step 7: Implement a strategy with a compression factor to update the velocity and position of the particles, as outlined in Eq. (18).

Step 8: Return to step 3 and repeat the procedure until the preset number of iterations is completed or the termination condition is fulfilled.

figure a

Algorithm 1: GPSCF-PSO algorithm
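A compact sketch of Algorithm 1 that reuses the good_point_set and adaptive_step helpers sketched above. The boundary clipping and the Sphere test function in the usage lines are illustrative choices not fixed by the paper.

```python
import numpy as np

def gpscf_pso(f, lower, upper, n=100, T=1000):
    """Sketch of Algorithm 1 (GPSCF-PSO); good_point_set and adaptive_step
    are the helpers from Sects. 3.2.1 and 3.2.2. `f` maps a position
    vector to a scalar fitness to be minimized."""
    dim = len(lower)
    x = good_point_set(n, dim, lower, upper)      # Step 2: good-point init
    v = np.zeros_like(x)
    p_best = x.copy()
    p_val = np.apply_along_axis(f, 1, x)          # Step 3: fitness
    g_idx = p_val.argmin()
    g_best, g_val = p_best[g_idx].copy(), p_val[g_idx]
    flagnum = 0                                   # stagnation counter
    for _ in range(T):
        # Steps 5-7: compressed update while stagnated <= 200, else standard
        x, v = adaptive_step(x, v, p_best, g_best, flagnum)
        x = np.clip(x, lower, upper)              # keep particles in bounds
        val = np.apply_along_axis(f, 1, x)
        better = val < p_val                      # update personal bests
        p_best[better], p_val[better] = x[better], val[better]
        g_idx = p_val.argmin()
        if p_val[g_idx] < g_val:                  # Step 4: global best moved
            g_best, g_val, flagnum = p_best[g_idx].copy(), p_val[g_idx], 0
        else:
            flagnum += 1
    return g_best, g_val

# Hypothetical usage on the Sphere function
best_x, best_f = gpscf_pso(lambda z: float((z ** 2).sum()),
                           np.full(10, -100.0), np.full(10, 100.0))
print(best_f)
```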

3.3 Algorithm Performance Testing

The efficacy of the modified GPSCF-PSO algorithm was assessed against four other PSO algorithms on ten benchmark test functions of varied types, confirming the superiority of the improved algorithm.

3.3.1 Benchmark Functions

Ten benchmark functions, frequently used in CEC2019 [34] and in earlier studies [35, 36], were selected for the experimental tests (https://www.sfu.ca/~ssurjano/optimization.html).

The principal characteristics of these functions are presented in Table 4. They serve to compare and validate the improved algorithm’s superiority over the comparison algorithms. In these functions, x denotes the independent variable; D represents the dimension; Range indicates the range of values for the independent variable; and Xmin denotes the global optimal position of the function.

Table 4 Ten benchmark test functions

3.3.2 Compared Algorithms and Parameter Settings

To ensure a rigorous comparison, the standard PSO, Linear Decreasing Inertia Weight PSO (LD-PSO) [37], PSO incorporating a compression factor (CF-PSO) [38], and PSO employing good-point sets (GP-PSO) were utilized. All algorithms were configured with identical parameters for fairness in testing. The population size was set at 100, the self-learning factor c1 and the global learning factor c2 were both set at 2.05, and the initial inertia weight was 0.9. For the LD-PSO algorithm, the inertia weight decreased linearly from 0.9 to 0.3 according to ω = 0.9 − 0.6(t/T), where t is the current iteration count and T is the total number of iterations.

3.3.3 Comparison with Other Algorithms

Each algorithm was independently executed 50 times on each function, with the maximum number of iterations, T, set at 1000. Evaluation indicators included the best value, mean, and standard deviation (denoted as std). The experimental results of the five algorithms are shown in Table 5.

Table 5 Experimental results on 10 benchmark test functions

According to the table, within the specified number of iterations, the proposed algorithm reached the theoretical optimal values on six functions. On the remaining four functions, although the theoretical optimum was not achieved, the proposed algorithm found averages closer to the theoretical optima than the other algorithms and exhibited lower standard deviations, indicating higher convergence accuracy. However, the optimization performance of the proposed algorithm was suboptimal for certain functions, such as F9 and F10, where it was outperformed by the LD-PSO algorithm. This suggests that its performance on complex functions requires further enhancement to improve its robustness.

3.3.4 Analysis of Convergence Performance

To visually demonstrate the optimization process, search history graphs for each test function were selected, with four graphs per function shown in Fig. 11. These graphs illustrate the initial broad global search by the particle swarm within the defined search space, followed by more focused local searches, culminating in as close a convergence as possible to the global optimal position.

Fig. 11
figure 11

Two-dimensional diagrams of iterative optimization history of the GPSCF-PSO algorithm on ten functions. a Two-dimensional diagrams of iterative optimization history of the GPSCF-PSO algorithm on the first five functions. b Two-dimensional diagrams of iterative optimization history of the GPSCF-PSO algorithm on the last five functions

In addition, to further assess the convergence of the proposed algorithm, comparisons with the four other algorithms were conducted on the 10 functions. Owing to space constraints, not all convergence curves could be displayed; the iterative convergence curves for the SumSquares and Rastrigin functions are depicted in Figs. 12 and 13.

Fig. 12
figure 12

Iterative convergence curve of SumSquares function

Fig. 13
figure 13

Iterative convergence curve of Rastrigin function

The experimental results indicate that, compared with the four other algorithms, the improved PSO algorithm proposed in this study achieves higher accuracy within the same number of iterations. Even after prolonged stagnation in the later stages of iteration, the optimization values continue to decrease significantly and the curve becomes smoother. This demonstrates that the improvements to the algorithm, through the integration of a good-point set and an adaptive compression factor, have resulted in greater stability, improved convergence accuracy, and an enhanced ability to escape local optima.

4 Optimized GRU for Porosity Prediction Based on EL and the Improved PSO Algorithm

The EL-IPSO-GRU model was constructed for porosity prediction. The logging data selected by the committee-voting-based EL strategy were used as model input, and the GPSCF-PSO algorithm was used to obtain the optimal hyper-parameters of the model. Porosity was then predicted using parameters such as GR and AC. Comparison with the LSTM, GRU, and EL-GRU models verified the prediction performance of the proposed model.

4.1 EL-IPSO-GRU Model

Key parameters in the traditional GRU typically depend on personal expertise, and excessive adjustments often result in suboptimal accuracy and reliability. Consequently, the GPSCF-PSO algorithm was employed to determine optimal values for the number of hidden layers, batch size, and learning rate to minimize human intervention and enhance accuracy and reliability. The hierarchical structure of the EL-IPSO-GRU model is shown in Fig. 14.

Fig. 14
figure 14

Layers of proposal model

The EL-IPSO-GRU model was implemented as follows.

Step 1: Define the structural parameters and hyper-parameters to be optimized for the GRU, set the parameter optimization range, and identify the optimization target for the GPSCF-PSO algorithm.

Step 2: Establish the fitness function, using the MSE as the evaluation criterion for each iteration; a smaller MSE indicates a better position.

Step 3: Initiate the population using a good-point set to determine the initial hyper-parameter combinations for the GRU neural network.

Step 4: Assess the fitness of the initial population, adjust particle positions based on fitness values, and cyclically feed the newly optimized individuals back into the GRU, until the termination condition is met.

Step 5: Employ the optimal individual generated by the GPSCF-PSO algorithm as the hyper-parameter combination for the GRU neural network, culminating in the trained EL-IPSO-GRU model.

The pseudo-code of the EL-IPSO-GRU model for porosity prediction is shown in Algorithm 2.

figure b

Algorithm 2: Computing model for porosity prediction
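A minimal PyTorch sketch of the fitness evaluation underlying Step 2: each particle encodes (number of hidden layers, batch size, learning rate), and its fitness is the validation MSE of a briefly trained GRU. The fixed hidden width, window length, and epoch budget are assumptions made for brevity; the paper's exact network and training schedule are not reproduced here.

```python
import torch
import torch.nn as nn

class GRURegressor(nn.Module):
    """Stacked GRU with a linear head mapping the last hidden state to porosity."""
    def __init__(self, n_features: int, num_layers: int, hidden: int = 64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, num_layers=num_layers, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                  # x: (batch, seq_len, n_features)
        out, _ = self.gru(x)
        return self.head(out[:, -1, :])    # prediction at the last depth step

def fitness(params, X_tr, y_tr, X_val, y_val, epochs=20):
    """PSO fitness: decode (num_layers, batch_size, learning_rate),
    train briefly, and return the validation MSE (smaller is better)."""
    layers, batch, lr = int(params[0]), int(params[1]), float(params[2])
    model = GRURegressor(X_tr.shape[-1], layers)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for i in range(0, len(X_tr), batch):
            opt.zero_grad()
            loss = loss_fn(model(X_tr[i:i + batch]), y_tr[i:i + batch])
            loss.backward()
            opt.step()
    with torch.no_grad():
        return float(loss_fn(model(X_val), y_val))

# Hypothetical usage with synthetic sequences of 10 selected curves
X = torch.randn(256, 20, 10); y = torch.rand(256, 1)
print(fitness([2, 32, 1e-3], X[:200], y[:200], X[200:], y[200:]))
```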

4.2 Porosity Prediction Process

Initially, due to the complex relationship between reservoir porosity and various logging parameters, the committee-voting EL strategy was utilized to identify features significantly correlated with porosity for model input. Subsequently, the enhanced PSO algorithm was employed to determine the hyper-parameters of the GRU. Porosity was then predicted using the EL-IPSO-GRU model, as illustrated in Fig. 15.

Fig. 15
figure 15

Porosity prediction process based on EL-IPSO-GRU

  • Data preprocessing and logging-feature selection: To address the quality and integrity of logging data, box plots and linear interpolation were initially applied to detect outliers and impute missing values, followed by min–max normalization. Integration of the Spearman coefficient, RF, GRA, and MIC yielded a committee-voting EL model, enabling the selection of sensitive and critical logging features as inputs for the GRU-based porosity prediction model. The selected data were divided into training and testing sets at an 8:2 ratio.

  • Model construction and training: Logging data from the training set were used as input with porosity as the output. Concurrently, settings for relevant parameters were established. The GPSCF-PSO algorithm was then applied to optimize the number of hidden layers, batch size, and learning rate for the GRU neural network. The model’s performance was evaluated using MSE, MAE and R2, with the final trained EL-IPSO-GRU model achieved after multiple training iterations.

  • Porosity prediction and model evaluation: The trained neural network model was utilized to predict porosity in the test set. The performances of LSTM, GRU, and EL-GRU models were assessed by comparing their prediction results with actual values.

4.3 Experimental Results and Analysis

4.3.1 Model Comparison and Parameter Settings

Following preprocessing, 15,000 logging data points from well A01 were input into the deep learning models. LSTM and GRU, established models for sequence prediction tasks such as time-series forecasting, provided baselines for evaluating the EL-IPSO-GRU model. Comparing the proposed model against these established methods makes it possible to assess its effectiveness and determine whether it outperforms them. Thus, the EL-IPSO-GRU model was compared with the LSTM, GRU, and EL-GRU models for porosity prediction, with the main hyper-parameter settings detailed in Table 6.

Table 6 Main hyper-parameter settings of the model

4.3.2 Model Performance Verification

Each method was independently run ten times on the training and testing sets, using MSE, MAE, and R2 as the evaluation metrics. Tables 7 and 8 display the errors and determination coefficients for the four methods.

Table 7 Evaluation indicators on training set
Table 8 Evaluation indicators on testing set

It is evident from Tables 7 and 8 that, compared to the other three methods, the proposed model exhibits the smallest error, the highest data fitting degree, and superior prediction performance for porosity on both the training and testing sets.
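For reference, the three evaluation metrics can be computed with scikit-learn; the arrays below are synthetic stand-ins for the actual predictions.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_true = np.random.rand(3000)             # hypothetical test-set porosity
y_pred = y_true + 0.003 * np.random.randn(3000)

print("MSE:", mean_squared_error(y_true, y_pred))
print("MAE:", mean_absolute_error(y_true, y_pred))
print("R2 :", r2_score(y_true, y_pred))
```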

In addition, the error-iteration curves of the four methods are depicted in Fig. 16. The error loss of the proposed model decreases most rapidly and converges to the optimal result earliest. Its convergence speed and accuracy are significantly better than those of the other three algorithms, indicating superior performance, high convergence accuracy, and stability.

Fig. 16
figure 16

Iterative curves of errors of four methods

A violin plot displays the distribution and probability density of multiple groups of data. It combines the characteristics of box plots and density plots and is particularly effective at conveying the density of a distribution. Therefore, violin plots were used to present the experimental results of the four models, showing the distributions of MSE and MAE over ten independent runs on the test set, as depicted in Fig. 17.

Fig. 17
figure 17

Distribution of errors of four methods
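For illustration, a violin plot of the kind shown in Fig. 17 can be produced with matplotlib; the per-run error values below are synthetic stand-ins generated around the reported means.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical per-run MSE values for the four models (10 runs each)
rng = np.random.default_rng(1)
runs = [rng.normal(m, m / 5, 10) for m in (5.9e-5, 2.3e-5, 1.1e-5, 7.2e-6)]

fig, ax = plt.subplots()
ax.violinplot(runs, showmedians=True)     # density shape plus median marker
ax.set_xticks([1, 2, 3, 4])
ax.set_xticklabels(["LSTM", "GRU", "EL-GRU", "EL-IPSO-GRU"])
ax.set_ylabel("MSE over ten runs")
plt.show()
```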

The experimental results, as shown in Fig. 17, demonstrate that the MSE distribution for the LSTM model ranges from 2.89E–05 to 6.80E–05, with a mean of 5.88E–05, and the MAE distribution ranges from 0.0155 to 0.025, with a mean of 0.021. The MSE for the GRU model ranges from 1.91E–05 to 8.04E–05, with a mean of 2.27E–05, and the MAE ranges from 0.01 to 0.022, with a mean of 0.0155. The MSE distribution for the EL-GRU model ranges from 8.22E–06 to 2.96E–05, with a mean of 1.07E–05, and the MAE ranges from 0.008 to 0.017, with a mean of 0.0092. The MSE distribution for the proposed model ranges from 4.56E–06 to 1.15E–05, with a mean of 7.19E–06, and the MAE ranges from 0.007 to 0.01, with a mean of 0.0082. Compared with the other three methods, the proposed model achieves the highest accuracy, with decreases in MSE of 32%–87% and in MAE of 11%–61%, and an improvement in R2 of 0.7%–3.4%. The predicted porosity values are close to the actual values, and the errors for most samples fall within a narrow range, indicating that the proposed model offers good stability and high prediction accuracy.

4.3.3 Analysis of the Porosity Prediction Results

To further assess the effectiveness and accuracy of the proposed method, the predicted porosity for 3000 test samples was compared with the actual values, as depicted in Fig. 18. From Fig. 18, it is evident that the discrepancy between the predicted and true porosity values is minimal, and a strong correlation exists between the predicted and actual curves, with a determination coefficient of 0.99.

Fig. 18
figure 18

The prediction results of the proposed model on test set

In addition, a visual analysis of the absolute error between the predicted porosity results of the four algorithms and the actual values on the test set was conducted, as shown in Fig. 19. Due to space constraints, the prediction results on the training set are not shown.

Fig. 19
figure 19

Absolute errors of the four models

The comparison reveals that the absolute error of the proposed model is lower than that of the other three methods, and the overall error between the predicted and actual values falls within a small range.

5 Discussion

The experimental results indicate that the committee-voting EL strategy employed in this paper for optimal logging-feature selection successfully eliminates data redundancy and multicollinearity while reducing dimensionality. This approach provides streamlined, high-quality sample data for the model, enhancing computational efficiency and prediction accuracy. Comparison of the results shows that EL-GRU achieves lower MSE and MAE than the GRU model, together with a higher R2.

Furthermore, the PSO algorithm was enhanced by integrating the good-point set and compression factor, and was utilized to derive the optimal hyper-parameters for the GRU. A comparison of the performances of four PSO algorithms on 10 benchmark test functions demonstrated that the improved PSO algorithm achieves higher convergence accuracy and enhanced stability.

The final evaluation shows that the proposed method improves the accuracy and reliability of the model, yielding superior porosity prediction results in actual work areas: the MSE on the test set was 7.19 × 10−6, the MAE was 0.0082, and R2 reached 0.99. This not only addresses the challenge of obtaining comprehensive porosity distribution characteristics of an entire work area in the absence of continuous coring but also markedly improves the efficiency and accuracy of porosity prediction, contributing to the reduction of costs in oil and gas exploration and development.

6 Conclusions

To address issues of redundant and nonlinear logging data in machine-learning-based reservoir porosity prediction, as well as the effects of empirically chosen neural network hyper-parameters on model accuracy and reliability, a logging parameter selection method based on EL and committee voting was introduced. Enhancements were applied to the PSO algorithm to secure the optimal hyper-parameter combination for the GRU model. Subsequently, porosity was predicted using the optimized GRU model, yielding the following results.

(1) The optimal logging parameter selection method, utilizing committee voting and EL, effectively addresses the nonlinear mapping problem between logging data and reservoir parameters, eliminates redundant information and multicollinearity, reduces dimensionality, and overcomes the limitations of single-feature selection methods. It provides streamlined, high-quality data input for machine-learning-based porosity prediction models, enhances prediction accuracy, and reduces computational space and costs.

(2) The PSO algorithm was enhanced by integrating good-point sets, a compression factor, and an adaptive strategy, which resolved the issues of poor global search capability due to random initialization of the population and the tendency to converge to local optima in later iterations. A comparison with four other PSO algorithms on 10 benchmark test functions demonstrated that the improved PSO algorithm achieves higher convergence accuracy and enhanced stability.

(3) By refining the PSO algorithm to obtain the optimal hyper-parameter combination of the GRU model, the complexity and uncertainty of personal parameter adjustment are minimized. The porosity prediction model, based on the optimized GRU, has shown promising performance in actual work areas, not only addressing the challenge of not obtaining complete distribution characteristics of porosity due to limited coring and high testing costs but also significantly improving the efficiency, accuracy, and reliability of porosity prediction. This offers valuable guidance for oil and gas exploration and development. Due to the lack of consideration for the long-term dependency of GRU, other methods such as attention mechanisms can be introduced in this study to address this issue. Moreover, future research will focus on the generalization ability of the model in combination with its practical applications in real work areas, providing insights into its engineering applicability.