Introduction

The complexity and high degree of heterogeneity of carbonate reservoirs, as a result of depositional factors and long-term diagenetic processes, play a substantial role in subsurface exploration and development. These processes control the rock types, storage capacity, and fluid flow within the reservoir.

Unlike most sandstone reservoirs, which are typically single-porosity systems of relatively homogeneous nature, carbonate reservoirs are commonly multiple-porosity systems that normally impart petrophysical heterogeneity to the rock (Mazzullo and Chilingarian 1992; Wardlaw 1996). Hence, the types, relative proportions, and distribution of pores affect the reservoir quality.

Following the development of depositional primary porosity, which ranges from 40 to 70% in modern carbonate sediments, the reservoir properties can be modified in the eogenetic (near-surface and shallow burial), mesogenetic (deep burial), and telogenetic (uplift) realms (Choquette and Pray 1970; Lucia 2007).

In carbonate reservoirs, dissolution commonly enhances porosity, while recrystallization and replacement may destroy and/or enhance porosity (Ahr 2008). Fractures can also improve porosity and permeability but may act as barriers to fluid flow if filled with gouge or mineral cement (Nelson 2001). Consequently, intricate pore systems are a characteristic feature of carbonate reservoirs, and nano-to-micro- and meter-scale pore connectivity can have a major effect on hydrocarbon storage and recovery (Lomando 1992; Hollis 2011; Burchette 2012; Amel et al. 2015; Ehrenberg and Walderhaug 2015; Hollis et al. 2017; Worden et al. 2018; Fallah-Bagtash et al. 2020; Kakemem et al. 2021; Abouelresh et al. 2022; Ghasemi et al. 2022). Hence, the modeling of the petrophysical properties of carbonates is challenging, particularly since the relationship between rock fabric and the pore system is complicated (e.g., Ping and Machel 1995; Woody et al. 1996; Tanguay and Friedman 2001; Beiranvand 2003; Varavur et al. 2005; Zhao et al. 2015; Sha et al. 2018). To evaluate carbonate reservoirs accurately, integrated and multiscale datasets are required to classify the carbonate deposits into different rock units according to their variation in terms of geological conditions and rock physical properties (Skalinski and Kenter 2015).

The Gulf of Suez (G.O.S) is the most productive hydrocarbon province in Egypt (Egyptian General Petroleum Corporation (EGPC), 1996). It hosts 80% of Egypt’s oil reserves. The Ras Fanar oil field lies offshore on the western side of the G.O.S. The Middle Miocene Nullipore reservoir is the main oil-producing unit in the field. Previous studies determined the electrofacies and petrophysical properties of the Nullipore reservoir from well logs (El Fakharany et al. 2016; Genedi et al. 2016). However, no core or petrographic data were analyzed to validate the reservoir zonation and the petrophysical assessment. Afife et al. (2017) integrated sedimentological, well logging, and core data to build a petrophysical model for the Nullipore reservoir. Nevertheless, that study does not provide a basis for determining the facies from well logs. Moreover, the authors modeled permeability as a linear function of porosity and did not use any validation metrics to evaluate the accuracy of the developed model. Such a model yields inaccurate results that cannot be generalized, because the relationship between the two parameters is nonlinear and complex. To the best of the authors’ knowledge, no systematic study in the literature has linked sedimentological analysis to petrophysical assessment and prediction. Therefore, microfacies analysis, diagenesis investigation, and petrophysical analysis were performed in this study. Three Machine Learning (ML) algorithms, namely Adaptive Boosting, Gradient Boosting, and Extreme Gradient Boosting, are applied to integrate the reservoir facies, porosity, and permeability and thereby improve the performance of permeability prediction.

This study aims to: a) investigate the Nullipore reservoir microfacies and facies associations from the detailed petrographic analysis, b) identify the facies from well logs to model the property in 3D, c) examine the reservoir heterogeneity and the relationship between facies, porosity, and permeability, and d) establish a Machine Learning-based permeability prediction approach to obtain a reliable 3D permeability model.

Location and geology of the study area

The Ras Fanar field lies about 2.5 km east of Ras Gharib, in the northeast-dipping Belayim dip province on the western side of the G.O.S, between latitudes 28° 13ʹ and 28° 18ʹ N and longitudes 33° 11ʹ and 33° 17ʹ E (Fig. 1) (Moustafa 1976).

Fig. 1 Location map of Ras Fanar field

Three depositional sequences characterize the stratigraphy of the Gulf of Suez: the Early Paleozoic-Late Eocene pre-rift, Early-Late Miocene syn-rift, and Late Miocene post-rift sequences. The stratigraphic succession in the Ras Fanar field ranges in age from the Paleozoic to Recent (Fig. 2). The Belayim, South Gharib, and Zeit Formations represent the syn-rift sequence (Souaya 1965; Hosny et al. 1986; Rateb 1988). The Nullipore reservoir is equivalent to the Hammam Faraun Member of the Belayim Formation and represents a part of the Middle Miocene syn-rift succession (Moustafa 1976). Thiébaud and Robson (1979) described the Nullipore reservoir as bioclastic limestone exposed in the G.O.S area, with an abundance of corals, reefs, and algae. It contributes about half of the field’s oil production (Afife et al. 2017).

Fig. 2 Lithostratigraphic column in the Ras Fanar field (El Naggar 1988)

Due to the rifting of the G.O.S, many horst blocks, grabens, and half-grabens were developed. The Miocene carbonate strata became highly tilted into narrow, elongated rectangular blocks aligned mainly in a NW–SE direction (Fig. 3). In addition, several stratigraphic successions were created on both sides of the gulf. The Ras Fanar area is a NW–SE structural trap bounded by a major fault system to the SW and tilted to the NE (Vaughan et al. 2003). During the Early Miocene, Ras Fanar was a positive highland area and was subjected to intense erosion. A thick organic-rich shale sequence was deposited in the adjacent lowland troughs. During the Early-Middle Miocene, the area was submerged by a shallow sea. This resulted in the accumulation of thick algal-reefal carbonate facies in the Nullipore reservoir. By the end of the Middle Miocene, more arid conditions occurred and led to vertical and lateral changes in facies from carbonates to alternating cycles of evaporites, siliciclastics, and carbonates of the South Gharib and Zeit Formations (Moustafa 1977; Thiébaud and Robson 1979; Chowdhary and Taha 1986; El Naggar 1988; Ouda and Masoud 1993; Khalil and McClay 2001).

Fig. 3 Structure contour map on Nullipore carbonate top (Lashin and El Din 2013)

Materials and methodology

Four wells from the Ras Fanar field, namely RF-B12, RF-B1, RF-B3, and RF-A2, were selected for this study (Fig. 1) based on data availability and maximum coverage. The wells targeted the Middle Miocene Nullipore reservoir interval. Conventional well logs, comprising gamma-ray (GR), neutron porosity (NPHI), and bulk density (RHOB) logs sampled every 0.5 ft (0.1524 m), are available from the four wells. The methods and techniques applied during reservoir evaluation are described in the following subsections:

Sedimentological analysis

Thin section photographs are available from the RF-B12 well and are used to determine the carbonate texture, microfacies, and facies associations (FAs). The classification scheme of Dunham (1962) was followed in the description of carbonate textures and Flügel and Munnecke (2010) in the interpretation of microfacies and FAs.

Facies recognition from well logs

The determined facies associations were integrated with the gamma-ray log to create a facies log for each well. Carbonate facies can be correlated with current energy in the depositional environment, hence to gamma radiation, because the amount of insoluble residue is inversely proportional to the current energy. Thorium and potassium are found in the insoluble residue-constituting minerals in carbonate rocks, such as clays and rock fragments. Uranium is most likely related to diagenesis and is concentrated in dolostone. Grain-dominated carbonates are deposited in high-energy environments and have a low gamma-ray activity, while mud-dominated ones are deposited in low-energy environments and have a high GR activity (Lucia 2007).

After the facies logs were created, they were upscaled and distributed in 3D using the Truncated Gaussian Simulation method.

Facies modeling

Truncated Gaussian simulation

Truncated Gaussian Simulation (TGSim) is a pixel-based stochastic modeling algorithm that generates realizations of a continuous Gaussian variable and thereafter truncates them at a series of thresholds to create realizations of categorical variables (such as facies). It requires a strict sequential ordering of facies: if four facies are present (A, B, C, and D), then facies A must be next to B, B must be next to C, and C must be next to D. The TGSim workflow can be summarized in four steps: truncate a continuous Gaussian distribution to honor the proportions of categories and their ordering relations, infer a single continuous variogram, compute a continuous simulation conditioned to the data transformed into the continuous variable, and truncate the continuous simulation into a categorical simulation (Armstrong et al. 2011; Pyrcz and Deutsch 2014; Cannon 2018).
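
The truncation step described above can be illustrated with a minimal Python sketch. It is not the workflow used in this study: the field is unconditional (no well data are honored), the spatial correlation comes from a simple moving average rather than a fitted variogram, and the facies proportions are hypothetical values chosen for illustration. The function names `gaussian_thresholds` and `truncate_field` are likewise assumptions.

```python
import numpy as np
from statistics import NormalDist

def gaussian_thresholds(proportions):
    """Thresholds on a standard normal variable that reproduce target proportions."""
    cumulative = np.cumsum(proportions)[:-1]          # drop the final 1.0
    return np.array([NormalDist().inv_cdf(float(p)) for p in cumulative])

def truncate_field(field, thresholds):
    """Map a continuous Gaussian field to ordered facies codes 0..k-1."""
    return np.digitize(field, thresholds)

rng = np.random.default_rng(0)

# Unconditional stand-in for a Gaussian simulation: white noise smoothed by a
# moving average in both directions, then re-standardised to zero mean, unit variance.
noise = rng.standard_normal((200, 200))
kernel = np.ones(9) / 9
smooth = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, noise)
smooth = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, smooth)
field = (smooth - smooth.mean()) / smooth.std()

# Hypothetical proportions for supratidal (0), intertidal (1), shallow subtidal (2)
proportions = [0.2, 0.3, 0.5]
thresholds = gaussian_thresholds(proportions)
facies = truncate_field(field, thresholds)

print(thresholds)
print(np.bincount(facies.ravel()) / facies.size)      # simulated proportions
```

Because a single continuous field is cut at increasing thresholds, facies 0 can only touch facies 2 by passing through facies 1, which is the ordering constraint the method imposes.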

Validation of the accuracy of the 3D facies model

To evaluate the accuracy of the developed 3D facies model, three wells (RF-B1, -B12, and -A2) were used to build the model, and the other (RF-B3) was used to test the model's performance. The developed facies sequence of the test well is compared to the actual facies, and the distribution of the populated facies is compared to the input data and the upscaled cells’ distribution.

Routine core analysis

Routine core analysis (RCAL) was performed on 518 core plug samples from the Nullipore reservoir penetrated by the RF-B12, -B1, and -B3 wells. RCAL involves porosity (Φ) measured by a helium porosimeter and horizontal permeability (K) determined by a permeameter. This aids in determining the quality and heterogeneity of the reservoir.

Machine learning

In this study, Machine Learning, specifically hypothesis boosting methods, is applied to integrate the reservoir facies, porosity, and permeability to predict permeability logs along the reservoir profile of each well to be further populated in 3D.

Hypothesis boosting refers to ensemble methods that are able to combine several weak learners into a strong learner. Its basic idea is to train predictors sequentially, each trying to correct the error of its predecessor (Géron 2022). The details of the boosting algorithms implemented in this study are described in the following subsections:

Adaptive boosting

To build an Adaptive Boosting (AdaBoost) regressor, a first base regressor (such as a Decision Tree) is trained and used to make predictions on the training set. AdaBoost then pays more attention to the training instances that the predecessor underfitted, so that each new predictor focuses more and more on the hard cases. A prediction does not need to match the target exactly; a margin is allowed, and a prediction that falls outside this margin is counted as an error. A weight (w) is assigned to each sample, initially 1/n, where n is the number of samples. The total error is determined by dividing the number of wrongly predicted samples by n. The weight of each tree (α), which scales its influence on the final prediction, is expressed by:

$$\alpha =\frac{1}{2}\ln\left[\frac{1-total\; error}{total\; error}\right]$$
(1)

The new weight of each sample is computed as follows:

$${w}_{new}={w}_{old}\cdot {e}^{\pm \alpha }$$
(2)

The sign of the exponent depends on whether the sample was predicted correctly: it is positive for wrongly predicted samples, which increases their weight, and negative for correctly predicted ones. The weights are then normalized so that they sum to 1. Bootstrapping is then performed to create a modified dataset, which is used to build the next model, and so on. The final prediction of the constructed ensemble AdaBoost model is the weighted average of the individual predictions of all the models (Solomatine and Shrestha 2004; Géron 2022).
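
As a worked illustration of Eqs. (1) and (2), the following Python sketch performs one reweighting round. The sample count, the number of wrongly predicted samples, and the function name `adaboost_reweight` are hypothetical, and the normalization (dividing by the sum so that the weights add up to 1) is an assumption about how the weights are rescaled.

```python
import math

def adaboost_reweight(weights, is_wrong):
    """One AdaBoost round: return (alpha, normalised new sample weights)."""
    n = len(weights)
    total_error = sum(is_wrong) / n                # fraction of wrongly predicted samples
    alpha = 0.5 * math.log((1 - total_error) / total_error)   # Eq. 1
    # Eq. 2: exponent is +alpha for wrong samples and -alpha for correct ones
    new = [w * math.exp(alpha if wrong else -alpha)
           for w, wrong in zip(weights, is_wrong)]
    total = sum(new)
    return alpha, [w / total for w in new]         # assumed normalisation: sum to 1

n = 10
weights = [1 / n] * n                              # initial weights 1/n
is_wrong = [True] * 3 + [False] * 7                # hypothetical: 3 of 10 samples missed
alpha, new_weights = adaboost_reweight(weights, is_wrong)
print(round(alpha, 4), [round(w, 4) for w in new_weights])
```

After normalization the wrongly predicted samples carry half of the total weight, so the next bootstrap sample is drawn preferentially from the hard cases.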

Gradient boosting

The Gradient Boosting (GB) algorithm adds predictors to an ensemble sequentially, each trying to correct the predecessor’s error, as in AdaBoost. However, instead of adjusting the instance weights at every iteration, this method fits the new predictor to the residual errors made by the previous predictor. GB has hyperparameters that control the ensemble training (such as the number of trees or estimators), as well as hyperparameters that control the growth of trees. The contribution of each tree is scaled by the learning rate hyperparameter (Géron 2022). The implementation can be summarized in the following steps:

  1. Initialize the model with a constant value:

    $${f}_{0}(x)={argmin}_{\rho }\sum\nolimits_{i=1}^{n}L ({y}_{i}, \rho )$$
    (3)

    where \({y}_{i}\) is the true value, \(\rho\) is the predicted value, \(n\) is the number of samples, and \(L ({y}_{i}, \rho )\) is the loss function.

  2. Repeat the following steps for each tree m = 1, 2, …, M, where M is the number of trees:

    (a) Solve for the negative gradient (\({g}_{im}\)):

      $${g}_{im}=-{\left[\frac{\partial L\left({y}_{i}, f\left({x}_{i}\right)\right)}{\partial f\left({x}_{i}\right)}\right]}_{f\left(x\right)={f}_{m-1}\left(x\right)},\quad i=1,2,\dots ,n$$
      (4)

    (b) Fit a regression tree to the residual values \({g}_{im}\) and create terminal regions \({R}_{jm}\), for j = 1, …, \({J}_{m}\), where j is the index of each leaf in the tree and \({J}_{m}\) is the number of leaves.

    (c) Determine the output value for each leaf. For j = 1, …, \({J}_{m}\), compute \({\rho }_{jm}\) as:

      $${\rho }_{jm}={argmin}_{\rho }\sum\nolimits_{{x}_{i}\in {R}_{jm}}L \left({y}_{i}, {f}_{m-1}\left({x}_{i}\right) +\rho \right)$$
      (5)

    (d) Update the model:

      $${f}_{m}(x)={f}_{m-1}(x)+\alpha \cdot {\rho }_{jm},\quad x\in {R}_{jm}$$
      (6)

      where α is the learning rate.

  3. Obtain the final model:

    $$f(x)={f}_{M}(x)$$
    (7)
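
The steps above can be sketched in Python for the special case of a squared-error loss, where the negative gradient equals the ordinary residual and the optimal leaf output is the mean residual in the leaf. This is an illustrative toy, not the implementation used in this study: each tree is a one-split stump on a single feature, and the data, function names, and hyperparameter values are assumptions.

```python
import numpy as np

def fit_stump(x, residual):
    """Fit a one-split regression tree (stump) to the residuals of a 1-D feature."""
    best = None
    for split in np.unique(x)[:-1]:               # candidate thresholds
        left, right = residual[x <= split], residual[x > split]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            # For squared error, the leaf output of Eq. 5 is the mean residual in the leaf
            best = (sse, split, left.mean(), right.mean())
    _, split, left_value, right_value = best
    return lambda q: np.where(q <= split, left_value, right_value)

def gradient_boost(x, y, n_trees=50, learning_rate=0.1):
    f0 = y.mean()                                 # Eq. 3: constant minimising squared error
    pred = np.full(len(y), f0, dtype=float)
    stumps = []
    for _ in range(n_trees):                      # step 2: m = 1..M
        residual = y - pred                       # Eq. 4: negative gradient of 0.5*(y - f)^2
        stump = fit_stump(x, residual)            # steps (b) and (c)
        pred = pred + learning_rate * stump(x)    # Eq. 6
        stumps.append(stump)
    return lambda q: f0 + learning_rate * sum(s(q) for s in stumps)   # Eq. 7

# Hypothetical toy data
x = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x)
model = gradient_boost(x, y)
mse_constant = np.mean((y - y.mean()) ** 2)
mse_boosted = np.mean((y - model(x)) ** 2)
print(mse_constant, mse_boosted)
```

A small learning rate makes each stump correct only part of the remaining residual, which is why many trees are needed but the fit improves gradually.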

Extreme gradient boosting

Extreme Gradient Boosting (XGB) is an efficient implementation of the GB algorithm proposed by Chen and Guestrin (2016). XGB makes two key improvements over GB: it adds a regularization term to the objective function to reduce the risk of overfitting, and it uses a second-order Taylor approximation of the objective function, whereas GB uses only the first derivative in optimization. This provides a more accurate approximation of the loss function.

The XGB objective function is defined as follows:

$${Obj}^{(t)}= \sum\nolimits_{i=1}^{n}L \left({y}_{i}, {\hat{y}}_{i}^{(t)}\right)+\sum\nolimits_{k=1}^{t}\Omega ({f}_{k})$$
(8)

where \({\hat{y}}_{i}^{(t)}\) is the prediction at round t, \({f}_{t}\) is the decision tree added at round t, and the regularization term \(\Omega ({f}_{t})\) is given by:

$$\Omega \left({f}_{t}\right)=\gamma T+\frac{1}{2}\lambda \sum\nolimits_{j=1}^{T}{w}_{j}^{2}$$
(9)

where γ is the penalty coefficient, T is the number of terminal nodes (leaves) in a tree, \({w}_{j}\) is the score of leaf j, and \(\frac{1}{2}\lambda \sum\nolimits_{j=1}^{T}{w}_{j}^{2}\) is the L2 penalty on the leaf scores.

The prediction after t iterations is the prediction of iteration (t-1) plus a new decision tree:

$${\hat{y}}_{i}^{(t)}={\hat{y}}_{i}^{(t-1)}+{f}_{t}({x}_{i})$$
(10)

Thereafter, the objective function is updated as follows (the regularization terms of the earlier trees are constant at round t and are dropped):

$${Obj}^{(t)}=\sum\nolimits_{i=1}^{n}L\left({y}_{i}, {\hat{y}}_{i}^{(t-1)}+{f}_{t}({x}_{i})\right)+\Omega ({f}_{t})$$
(11)

The second-order Taylor expansion of the objective function is:

$${Obj}^{(t)}\cong \sum\nolimits_{i=1}^{n}\left[L \left({y}_{i}, {\hat{y}}_{i}^{(t-1)}\right)+{g}_{i}{f}_{t}({x}_{i})+\frac{1}{2}{h}_{i}{f}_{t}^{2}({x}_{i})\right]+ \Omega ({f}_{t})$$
(12)

where \({g}_{i}\) and \({h}_{i}\) are the first and second derivatives of the loss function, respectively:

$${g}_{i}=\frac{\partial }{\partial {\hat{y}}_{i}^{(t-1)}}L \left({y}_{i}, {\hat{y}}_{i}^{(t-1)}\right)$$
(13)
$${h}_{i}=\frac{{\partial }^{2}}{\partial {\left({\hat{y}}_{i}^{(t-1)}\right)}^{2}}L\left({y}_{i}, {\hat{y}}_{i}^{(t-1)}\right)$$
(14)
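
As a reasoning step that links Eq. (12) to the scores in Eqs. (15) and (17), the following sketch assumes a squared-error loss, \(L=\frac{1}{2}{({y}_{i}-{\hat{y}}_{i})}^{2}\). This choice of loss and the symbols \({G}_{j}\), \({H}_{j}\), and \({I}_{j}\) (the set of samples falling in leaf j) are assumptions introduced here for illustration. Under this assumption \({g}_{i}\) is the negative residual and \({h}_{i}=1\), so minimizing Eq. (12) with respect to the score \({w}_{j}\) of leaf j gives:

$${w}_{j}^{*}=-\frac{{G}_{j}}{{H}_{j}+\lambda }=\frac{\sum\nolimits_{i\in {I}_{j}}\left({y}_{i}-{\hat{y}}_{i}^{(t-1)}\right)}{\left|{I}_{j}\right|+\lambda },\qquad {G}_{j}=\sum\nolimits_{i\in {I}_{j}}{g}_{i},\quad {H}_{j}=\sum\nolimits_{i\in {I}_{j}}{h}_{i}$$

Substituting \({w}_{j}^{*}\) back into Eq. (12), each leaf contributes \(-\frac{1}{2}\frac{{G}_{j}^{2}}{{H}_{j}+\lambda }\) to the objective, which is why the squared sum of residuals divided by the number of residuals plus λ serves as a quality score for a leaf.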

The following procedures summarize the process of building an XGB tree:

  1. Calculate the similarity score for the residuals as follows:

    $$Similarity\; score=\frac{{(Sum\; of\; residuals)}^{2}}{Number\; of\; residuals + \lambda }$$
    (15)

  2. Calculate the gain of splitting the residuals into two groups to quantify how much better the leaves cluster similar residuals than the root:

    $$Gain={Left}_{similarity\; score}+{Right}_{similarity\; score}-{Root}_{similarity\; score}$$
    (16)

  3. Prune the tree by calculating the difference between gain and gamma (γ; a user-defined tree-complexity parameter). If the result is negative, the split is pruned; if it is positive, the split is kept.

  4. Calculate the output value that minimizes the loss function, as follows:

    $$Output\; value=\frac{Sum\; of\; residuals}{Number\; of\; residuals + \lambda }$$
    (17)
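
Equations (15) to (17) and the pruning rule can be checked with a short Python sketch. The residual values, the candidate split, and the values of λ and γ are hypothetical, and the function names are assumptions chosen for readability.

```python
def similarity(residuals, lam):
    """Eq. 15: squared sum of residuals over (count + lambda)."""
    return sum(residuals) ** 2 / (len(residuals) + lam)

def gain(left, right, lam):
    """Eq. 16: improvement of the two leaves over their parent (root) node."""
    return similarity(left, lam) + similarity(right, lam) - similarity(left + right, lam)

def output_value(residuals, lam):
    """Eq. 17: leaf output, shrunk towards zero by lambda."""
    return sum(residuals) / (len(residuals) + lam)

residuals = [-10.5, 6.5, 7.5, -7.5]      # hypothetical residuals in the root node
lam, gamma = 1.0, 30.0                   # hypothetical regularisation settings
left, right = residuals[:1], residuals[1:]   # one candidate split

split_gain = gain(left, right, lam)
keep_split = (split_gain - gamma) > 0    # step 3: prune when gain - gamma is negative
print(split_gain, keep_split, output_value(left, lam), output_value(right, lam))
```

A larger λ lowers every similarity score and shrinks the leaf outputs, while a larger γ raises the gain a split must reach to survive pruning.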

Results and interpretation

Petrographic investigation

Seven microfacies (MF) are recognized in the detailed petrographic analysis of the Nullipore carbonate rocks in the RF-B12 well (Fig. 4). MF1 is represented by dark-colored, finely laminated dolomudstone with very few molluscan shell fragments and anhydrite occurring as light-colored patches (Fig. 4a). Molds, vugs, and interparticle pores were formed through the dissolution of bioclastic particles. Figure 4b shows dolomudstone with minor anhydrite cement (MF2). Small amounts of terrigenous clay exist, and intercrystalline pores are common between the crystals of dolomite. Figure 4c depicts recrystallized dolomudstone with dolomite rhombs up to 50 µm in size (MF3). Relics of the original micritic dolomite rhombs are present, and minor quantities of terrigenous clays are recorded. Molds and vugs were created via the dissolution of allochems. Figures 4d and e show algal-bioclastic wacke-packstone (MF4) with minor quantities of peloids embedded in a dolomicritic matrix. Various amounts of benthic foraminifera and a few planktonic foraminifera are present. The dissolution of algae and other bioclasts resulted in the development of molds, vugs, and intraparticle pores. Figure 4f illustrates the algal dolomudstone microfacies (MF5). It is composed mainly of calcareous algae embedded in a dolomicritic matrix. Considerable amounts of terrigenous clay exist. Figure 4g shows peloidal bioclastic packstone (MF6) with various types of allochems, including gastropods, calcareous algae, benthic and planktonic foraminifera, and other undifferentiated types. Minor terrigenous clays are found. Intraparticle pores and molds were created through the partial dissolution of bioclasts and algae. Figure 4h displays the dolomitic algal-peloidal-bioclastic packstone (MF7). It consists mainly of calcareous algae, biogenic fossils, and peloidal intraclasts. Coarse dolomite crystals replace the matrix and fill the biogenic particles.

Fig. 4 Thin section photomicrographs showing microfacies of the Nullipore reservoir penetrated by the RF-B12 well. (a) dolomudstone with patches of anhydrite, (b) dolomudstone with anhydrite cement, (c) recrystallized dolomudstone, (d and e) algal-bioclastic wacke-packstone, (f) algal dolomudstone, (g) peloidal-bioclastic packstone, and (h) dolomitic algal-peloidal-bioclastic packstone (Afife et al. 2017)

Facies associations and depositional environment

The seven microfacies recognized from the detailed petrographic analysis of the Nullipore carbonates can be grouped into three facies associations (FAs), each of which represents a specific reservoir rock type: i) supratidal FA, ii) intertidal FA, and iii) shallow subtidal FA. Each FA reflects the depositional environment of its group of facies (Reading 1996).

Supratidal FA

The supratidal FA encompasses the dolomudstone microfacies (MF1 and 2) and the recrystallized dolomudstone microfacies (MF3). It has a total thickness of 31 ft, over the interval 3706–3737 ft. The fine-grained dolomite matrix, rarity of fossils, lack of wave- and current-reworking features, and the presence of anhydrite indicate deposition in the low-energy supratidal zone of the tidal flat environment. Anhydrite is deposited by precipitation from standing water bodies isolated from the sea in arid climates (Flügel and Munnecke 2010).

Intertidal FA

The intertidal FA occupies the interval from 3737 to 3781 ft. It includes the algal-bioclastic wacke-packstone microfacies (MF4) and the algal dolomudstone microfacies (MF5). The appearance of calcareous algae in variable quantities and molluscan shell debris with benthic foraminifera, in addition to molds, vugs, and secondary intraparticle pores, reflects deposition in the intertidal zone of the tidal flat environment (Flügel and Munnecke 2010).

Shallow subtidal FA

The shallow subtidal FA comprises the peloidal bioclastic packstone microfacies (MF6) and the dolomitic algal-peloidal-bioclastic packstone microfacies (MF7). It occupies the interval from 3781 to 3850 ft. The increasing number of allochems (peloids, calcareous algae, gastropods, and benthic and planktonic foraminifera) and the prevalence of grain-supported fabrics (packstones) with a low micritic matrix content indicate deposition in the relatively current-dominated conditions of the shallow subtidal zone (Flügel and Munnecke 2010).

From the aforementioned criteria, it is evident that the sedimentary succession of the Nullipore reservoir was deposited in marginal marine and shoreline depositional environments. The carbonate facies are vertically arranged in a regressive, shallowing-upward succession, consisting of shallow marine subtidal-intertidal carbonates overlain by supratidal carbonates that were subjected to periods of subaerial exposure.

Fracturing

Numerous macro-scale fractures were detected in some core samples of the subtidal facies of the Nullipore reservoir, as shown in Fig. 5. Based on the criteria of Fossen (2010), three types of fractures were recognized:

Fig. 5 Core photograph showing macro-fractures within the Nullipore reservoir (Afife et al. 2017)

  1) Open fractures (OF): This type of fracture is a mode I fracture with no filling by particles or mineral precipitates. OF are created by extension stresses and can later be filled with fluids (hydrocarbons or water) or minerals (Fossen 2010). Several open fractures were recorded in the core sections (Fig. 5a and b).

  2) Closed fractures (CF): This type of fracture is a mode IV fracture that is developed by contraction stress (Fossen 2010). Figure 5a illustrates some closed fractures in the upper part of the cored section.

  3) Partially Closed Partially Open fractures (PCPO): This type of fracture is a combination of mode I and IV fractures (Fig. 5a). Since different factors create each mode (extension stress creates mode I and contraction stress creates mode IV), they cannot be formed at the same time by the same stress regime. However, it is impossible to determine which one is older, due to the lack of cross-cutting relationships (Marghani et al. 2023).

Gamma-ray log and facies recognition

The GR log of the Nullipore reservoir interval of the RF-B12 well and the corresponding facies are shown in Fig. 6. The figure illustrates that the supratidal and intertidal facies are well correlated with the GR log, where the supratidal facies exhibits a high GR response due to its deposition in a low-energy environment and high dolomite content (Fig. 4a, b and c). Relatively low GR distinguishes the intertidal facies as a result of its deposition in a relatively higher-energy environment than the supratidal facies. Despite the deposition of shallow subtidal facies in a high-energy environment, the GR is relatively higher than that of the intertidal facies. Since the clay content is very low (Fig. 4g), it seems that the high dolomite content of the packstone (Fig. 4h) is the reason for the GR increase.

Fig. 6 Correlation of the gamma-ray log to the determined facies of Nullipore reservoir penetrated by the RF-B12 well

Figure 7 depicts the correlation of the GR logs and the corresponding facies across the four studied wells, where a facies log is developed for each well. It is evident from the figure that the facies in the RF-B1 and -B3 wells are more developed and thicker than those in RF-B12. The repeated deposition of the intertidal facies in the RF-B3, -B1, and -A2 wells is evidence of cyclicity, which occurs in response to repeated sea level changes.

Fig. 7 Correlation of the GR logs and the corresponding facies across the four studied wells

Horizontal and vertical gridding of the 3D model

This section describes the model grid. Using Petrel software, a 3D grid was created, and the reservoir was divided into four layers based on the previously determined facies. Table 1 lists the parameters of the developed 3D grid. I, J, and K are the grid indices along the x-, y-, and z-axes, respectively.

Table 1 The parameters of the developed 3D grid

Variography and facies modeling

After creating the 3D grid, the upscaling of facies logs was performed. Thereafter, a variogram was calculated and modeled in the vertical, major horizontal, and minor horizontal directions to specify the spatial continuity of facies. A Gaussian variogram was used to model the experimental variogram. Figure 8 shows the experimental variograms and variogram models in the three directions. The parameters of the variogram models are presented in Table 2.

Fig. 8 The experimental and modeled variograms in the (a) vertical direction, (b) major horizontal direction, and (c) minor horizontal direction (nugget = 0.01 and sill = 0.96)

Table 2 The parameters of the variogram models in x, y, and z directions
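
For readers unfamiliar with the Gaussian variogram model, the following Python sketch evaluates one common parameterization using the nugget and sill reported for Fig. 8. Several points are assumptions: the sill of 0.96 is treated as the total sill (nugget included), the factor of 3 in the exponent follows the practical-range convention and may differ from the convention used by the modeling software, and the range of 10 lag units is a hypothetical placeholder rather than a value from Table 2.

```python
import numpy as np

def gaussian_variogram(h, nugget, sill, practical_range):
    """Gaussian variogram model for lag distances h > 0 (returns the nugget at h = 0).

    Assumes `sill` is the total sill and uses the practical-range convention,
    under which the model reaches about 95% of the sill at h = practical_range.
    """
    h = np.asarray(h, dtype=float)
    return nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * (h / practical_range) ** 2))

lags = np.linspace(0.0, 30.0, 7)                   # hypothetical lag distances
gamma = gaussian_variogram(lags, nugget=0.01, sill=0.96, practical_range=10.0)
for h, g in zip(lags, gamma):
    print(f"h = {h:5.1f}  gamma = {g:.4f}")
```

The parabolic behavior near the origin is what distinguishes the Gaussian model from the spherical and exponential models and implies a very smooth spatial variable at short distances.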

The TGSim method was used to distribute the upscaled facies in 3D. Twenty realizations were generated. The 3D facies model is depicted in Fig. 9. The figure shows that the three facies are well connected and that their stacking pattern is preserved. The model honors the facies distribution in the lateral and vertical directions. The simulated sequence of Nullipore facies at the RF-B3 well (test well) is the same as the actual facies sequence shown in Fig. 7, where the transition occurs from supratidal to intertidal to shallow subtidal to intertidal facies. This supports the accuracy of the TGSim method in determining the facies along the reservoir. Figure 10 illustrates the histograms of the input facies logs, upscaled cells, and distributed facies (index 0 represents the supratidal facies, 1 the intertidal facies, and 2 the shallow subtidal facies). The figure indicates that the distribution of the populated facies closely honors the distributions of the upscaled cells and the facies logs. This confirms that TGSim represents the facies accurately.

Fig. 9 3D facies model built using Truncated Gaussian Simulation

Fig. 10 Histograms of the facies logs, upscaled cells, and distributed facies using TGSim

Petrophysical assessment of the reservoir from core data

The routine core analysis (RCAL) was used to determine the quality and heterogeneity of the reservoir. The relationship between core porosity (Φ) and permeability (K) of the three reservoir rock types (RRTs) or facies associations of the Nullipore reservoir is shown in Fig. 11. The figure indicates that porosity is a weak predictor of permeability within RRT2 and RRT3 (R² = 0.47 and 0.53, respectively) and a relatively stronger one within RRT1 (R² = 0.73).

Fig. 11 A cross-plot of the core porosity vs. core permeability of the three Nullipore reservoir rock types

The following poro-perm relationships are observed from the plot:

$$RRT1: K= 58003\; {\Phi }^{4.7615}$$
(18)
$$RRT2: K= 26974\; {\Phi }^{4.424}$$
(19)
$$RRT3: K= 34517\; {\Phi }^{3.6091}$$
(20)
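
Equations (18) to (20) can be evaluated directly, as in the Python sketch below. The text does not state the unit of Φ; the sketch assumes that porosity is entered as a fraction (for example 0.20), which gives permeabilities of a plausible magnitude in md, and this should be checked against Fig. 11 before use. The dictionary and function names are hypothetical.

```python
# Power-law coefficients (a, b) of K = a * phi**b, taken from Eqs. (18)-(20)
POWER_LAWS = {
    "RRT1": (58003.0, 4.7615),
    "RRT2": (26974.0, 4.424),
    "RRT3": (34517.0, 3.6091),
}

def permeability_md(rrt, porosity_fraction):
    """Permeability (md) from porosity, ASSUMING porosity is a fraction (0-1)."""
    a, b = POWER_LAWS[rrt]
    return a * porosity_fraction ** b

for rrt in POWER_LAWS:
    print(rrt, round(permeability_md(rrt, 0.20), 1))
```

Each relationship is a straight line on a log-log plot, log K = log a + b log Φ, so a single porosity value maps to exactly one permeability per rock type and the scatter seen in the core data cannot be reproduced.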

Table 3 lists the statistical parameters of the two properties. Based on the criteria in Table 4, the three RRTs are prospective. However, RRT3 has the best reservoir quality, with K ranging from 1.4 to 1917 md (fair to very good K) and a mean K of 451.1 md.

Table 3 The statistical parameters of core porosity and permeability of the rock types of Nullipore reservoir
Table 4 Permeability categories (Tiab and Donaldson 2015)

Equations (18), (19), and (20) are used to estimate permeability from porosity for the three RRTs. Figure 12 shows a cross-plot of the permeability predicted by these conventional poro-perm relationships vs. core permeability. The figure shows a high degree of scatter, specifically for K > 100 md, and a poor correlation (R² = 0.55). This introduces significant uncertainties in the modeled permeability, which motivated the application of Machine Learning to achieve better results.

Fig. 12 A cross-plot of the predicted permeability using the conventional poro-perm fitting relationships vs. core permeability

Permeability modeling via machine learning

In this study, three Machine Learning algorithms, namely AdaBoost, GB, and XGB, are implemented to integrate the RRT, effective porosity log, and core permeability to predict permeability logs along the logged reservoir intervals. The target variable of the regression model is the logarithm of horizontal core permeability. The RRTs and effective porosity log (PHIE) are the input features. PHIE is developed by combining the two porosity logs (neutron and density logs). The dataset is randomly subsampled into training (80% of the data samples) and testing (20% of samples) sets to evaluate the performance of the developed models. The random generator seed is set to 20. A grid search (LaValle et al. 2004) was conducted to select the optimum hyperparameters of each model that achieve the best performance. The grid search function evaluates all the possible combinations of hyperparameters of each model. This can be summarized as follows:

  a. Define a search range for each hyperparameter of the ML model. For example, in the case of the AdaBoost model, we used a search range of “2 to 50” (48 values) for the number of estimators and “0.001 to 0.5” for the learning rate (7 values: 0.001, 0.01, 0.1, 0.2, 0.3, 0.4, and 0.5).

  b. The grid search trains 336 (48 × 7) different models, evaluates them, and returns the configuration of hyperparameters that achieves the best performance (we specify the R² metric to evaluate the models). Table 5 lists the ranges of the tuned hyperparameters and the values that achieved the best accuracy.
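
The size of the AdaBoost search grid described above can be reproduced with a few lines of Python. This is a minimal sketch of an exhaustive search, not the tuning code used in this study: the scoring function `toy_score` is a hypothetical stand-in for training a model and computing its R², and the upper bound of 50 is treated as exclusive so that the range contains the 48 values stated in the text.

```python
from itertools import product

# Search ranges for AdaBoost as described in the text
n_estimators_range = range(2, 50)                     # 48 values: 2, 3, ..., 49
learning_rates = [0.001, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5]

grid = list(product(n_estimators_range, learning_rates))
print(len(grid))                                      # number of candidate models

def grid_search(score_fn, candidates):
    """Return the hyperparameter combination with the highest score."""
    return max(candidates, key=lambda params: score_fn(*params))

def toy_score(n_estimators, learning_rate):
    # Hypothetical stand-in for the R2 of a model trained with these settings
    return -((n_estimators - 30) ** 2) / 1000.0 - (learning_rate - 0.1) ** 2

best = grid_search(toy_score, grid)
print(best)
```

Because every combination is evaluated, the cost grows multiplicatively with each added hyperparameter, which is why the ranges are kept short.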

Table 5 Summary of the process of hyperparameters’ tuning of the ML models

The accuracy of the developed models is evaluated using three statistical metrics, including the coefficient of determination (R²), mean absolute error (MAE), and mean squared error (MSE), as illustrated in Table 6.

Table 6 The statistical metrics used to assess the ML models’ performance
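
For reference, the three metrics can be computed with NumPy as sketched below, using their standard definitions. The function names and the sample values are hypothetical; in this study the metrics would be applied to the logarithm of permeability, since that is the target variable of the regression models.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))))

def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((np.asarray(y_true, float) - np.asarray(y_pred, float)) ** 2))

# Hypothetical log-permeability values
log_k_core = [1.0, 2.0, 3.0]
log_k_pred = [1.0, 2.0, 4.0]
print(r2(log_k_core, log_k_pred), mae(log_k_core, log_k_pred), mse(log_k_core, log_k_pred))
```

R² is dimensionless, whereas MAE and MSE carry the units of the target, so errors reported on log-permeability are not directly comparable with errors in md.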

Performance of the developed ML models

In this section, the permeability prediction performance of the AdaBoost, GB, and XGB models is evaluated and compared to core permeability for both the training and testing datasets. Figures 13 and 14 show the predicted permeability and core permeability (in logarithmic form) of the three models for the training and testing sets. The statistical evaluation metrics are shown in Fig. 15. The three models outperform the conventional model: their data samples lie much closer to the 45° line (the closer the data are to the 45° line, the higher the prediction accuracy). For the AdaBoost model, R² is 0.79 for the training set and 0.78 for the testing set, MAE is 0.32, and MSE is 0.21 for both sets. GB performs slightly better on the training set (R² = 0.80) but slightly worse on the testing set (R² = 0.76), with an MAE of 0.32 and an MSE of 0.19 for both sets. The XGB model provides the strongest correlation and lowest error, where R² = 0.85 for the training set and 0.81 for the testing set, MAE is 0.32, and MSE is 0.19 for both sets. Figure 16 shows the predicted K using XGB vs. the core K for the whole reservoir samples.

Fig. 13

Cross-plots of the predicted permeability and core permeability (in logarithmic form) of the training set

Fig. 14

Cross-plots of the predicted permeability and core permeability (in logarithmic form) of the testing set

Fig. 15

Evaluation metrics of permeability prediction of the three ML models for the training and testing datasets

Fig. 16

A cross-plot of the predicted permeability via XGB and core permeability of the whole reservoir samples

The developed XGB model was used to predict the permeability logs along the three studied wells (Fig. 17). The well data were imported into Petrel software, and the predicted K-logs were upscaled and populated in the geological model by setting the appropriate variogram and applying the Sequential Gaussian Simulation (SGS) method. Table 7 presents the variogram parameters. The final 3D permeability model is shown in Fig. 18. The histograms of the K-logs, upscaled cells, and distributed permeability are shown in Fig. 19. The figure shows that the distributed permeability closely honors the distribution of the K-logs and upscaled cells, which supports the reliability of the developed model.

Fig. 17

The predicted K-logs using XGB of the three studied wells

Table 7 The variogram parameters of the permeability model

Fig. 18

The 3D permeability distribution of the Nullipore reservoir

Fig. 19

Histograms of the K-logs, upscaled cells, and distributed permeability

Discussion

Diagenetic controls on reservoir quality

As stated in Sections "Petrographic investigation" and "Facies associations and depositional environment", seven microfacies and three FAs have been recognized in the Nullipore reservoir. The description of thin sections and cores reveals that the main factors controlling reservoir quality are:

Dolomitization

Dolomitization is considered one of the most important diagenetic processes affecting the studied carbonate rocks (Fig. 4a, b, c, f, and h). Both the carbonate grains and the matrix are replaced by fine- to medium-crystalline dolomite rhombs.

Dolomitization of limestone has a major effect on reservoir performance because it increases pore size: in mud-dominated fabrics, the particle size increases from fine crystalline to medium or coarse crystalline. The crystals in CaCO3 mud are usually smaller than 20 µm, whereas dolomite crystals range from several micrometers to 200 µm. This increase in crystal size is accompanied by a corresponding increase in pore size, which in turn improves the flow characteristics and capillary properties of the rock (Lucia 2007). This can explain the high permeability of the dolomitic reservoir facies (Table 3).

Dissolution

The dissolution of different carbonate allochems and fine-grained matrix in the studied rocks plays a significant role in improving reservoir quality by creating vugs, molds, and intraparticle pores (Figs. 4 and 5b), which in turn modify the reservoir pore spaces and increase permeability.

Carbonate sediments are composed of minerals with different solubilities. The relative solubility of carbonate minerals is a function of mineralogy and of the magnesium (Mg) content of Mg-calcite. The original Mg-calcitic bioclasts are more susceptible to dissolution than low-Mg calcitic bioclasts, although even low-Mg calcite can be dissolved in intensely aggressive waters (James and Choquette 1984; Boggs 2009). This may explain the partial and complete dissolution of the dolomite rhombs and Mg-calcitic bioclasts (Fig. 4d, e, g, and h), which formed molds, vugs, and intraparticle pores.

Anhydrite is more soluble than calcite and dolomite and is commonly dissolved selectively to form molds. Dissolution enlarges the interparticle pore spaces, hence improving the flow characteristics and capillary properties (Lucia 2007).

Fracturing

As mentioned in Section "Location and geology of the study area", an NW–SE structural trap controlled the paleo-topography of the Ras Fanar field. The area is characterized by intense faulting activity that was reactivated during the late Early-Middle Miocene due to the culmination of the Gulf rift (Moustafa 1977; Sultan and Moftah 1985). Fractures are mainly created by tectonic stresses such as compaction. They allow acidic and alkaline fluids to permeate into the particles and dissolve them (Fig. 5b), which in turn enlarges the pores and provides channels for fluid storage and migration. Moreover, macro-fractures can induce micro-fractures, which can be effective for oil and gas migration if they are greater than 0.1 µm (Berg and Gangi 1999; Kong et al. 2021). The high porosity and permeability of Nullipore RRT3 indicate that fractures may play a crucial role in improving pore connectivity and permeability.

Conventional and machine learning approaches

As mentioned in Section "Petrophysical assessment of the reservoir from core data", the conventional model fails to capture the true relationship between porosity and permeability, resulting in a model with high bias and high variance. This is attributed to the complex non-linear relationships between the two parameters in each rock type of the reservoir. On the other hand, the GB and XGB models performed better than the conventional model: R2 falls within the ranges of 0.76–0.80 and 0.81–0.85, respectively, across the training and testing sets. The regression trees that the GB algorithm builds to predict the residuals minimize the bias of the model. The variance of the model is decreased by scaling the contribution of each tree to the final prediction by the learning rate. The regularization term added in the XGB algorithm reduces the sensitivity of the predictions to individual observations, thereby reducing the variance of the model even further than the GB model does. This suggests that the XGB model is a reliable and potentially accurate method for predicting reservoir permeability.
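The mechanism described above (trees fitted to residuals, contributions scaled by the learning rate, and regularized leaf values) can be illustrated with a deliberately simplified sketch. It is not the implementation used in this study: it handles a single input feature, uses one-split trees, selects splits by plain squared error rather than the XGB gain criterion, and runs on hypothetical porosity and log-permeability values. The `reg_lambda` parameter mimics an L2 penalty on leaf values, which for squared-error loss turns the leaf mean into sum(residuals) / (count + lambda).

```python
def fit_stump(x, residuals, reg_lambda=0.0):
    """Fit a one-split regression tree to the residuals.

    Leaf values are sum(residuals) / (count + reg_lambda): reg_lambda = 0
    gives the plain mean, reg_lambda > 0 shrinks the leaf toward zero.
    """
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_value = sum(left) / (len(left) + reg_lambda)
        right_value = sum(right) / (len(right) + reg_lambda)
        sse = (sum((r - left_value) ** 2 for r in left)
               + sum((r - right_value) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    return best[1:]


def boost(x, y, n_trees=50, learning_rate=0.1, reg_lambda=0.0):
    """Boosting on residuals; returns (base value, learning rate, stumps)."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        # Each new tree is trained on what the current ensemble still gets wrong.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        threshold, left_value, right_value = fit_stump(x, residuals, reg_lambda)
        stumps.append((threshold, left_value, right_value))
        # The learning rate scales the contribution of each tree.
        pred = [pi + learning_rate * (left_value if xi <= threshold else right_value)
                for xi, pi in zip(x, pred)]
    return base, learning_rate, stumps


def predict(model, xi):
    """Sum the base value and the scaled contribution of every stump."""
    base, learning_rate, stumps = model
    return base + sum(learning_rate * (lv if xi <= t else rv) for t, lv, rv in stumps)


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical porosity (fraction) and log-permeability values, illustration only.
porosity = [0.05, 0.08, 0.12, 0.15, 0.18, 0.22, 0.25, 0.30]
log_k = [-1.0, -0.6, 0.1, 0.4, 1.2, 1.5, 2.1, 2.4]

model = boost(porosity, log_k, n_trees=50, learning_rate=0.1, reg_lambda=1.0)
mse_baseline = mse(log_k, [sum(log_k) / len(log_k)] * len(log_k))
mse_boosted = mse(log_k, [predict(model, p) for p in porosity])
print(mse_baseline, mse_boosted)
```

A smaller learning rate or a larger `reg_lambda` makes each tree contribute less, so more trees are needed to reach the same training error, but the fit becomes less sensitive to individual samples.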

Conclusions

This paper demonstrates how integrated sedimentological and petrophysical studies can provide better recognition and assessment of the reservoir quality of the Nullipore carbonate reservoir in the Ras Fanar field, Gulf of Suez, Egypt. Sedimentological and fracture analyses revealed that fracturing and long-term diagenetic processes, namely dolomitization of limestone and dissolution of allochems, are responsible for the complex distribution of reservoir facies and petrophysical properties. Dividing the reservoir into different facies, each of which represents a specific rock type with distinct geological characteristics, improves the geological and petrophysical characterization. Cross-validation showed that the Truncated Gaussian Simulation method modeled the carbonate facies accurately in the vertical and lateral directions. RCA indicates that each RRT has a unique petrophysical distribution and requires a specific regression model to predict permeability accurately. Machine learning was applied to predict permeability within each RRT. Cross-validation and hyperparameter tuning revealed that the XGB model exhibits higher prediction performance than the other ML models. The present study provides further insights into the characterization and modeling of facies and permeability in complex carbonate reservoirs. The workflow can be applied in similar geological settings to better interpret depositional and diagenetic controls on reservoir quality and to aid field development planning.