1 Introduction

Feature engineering is the formulation and compilation of a set of informative predictor features from raw data features to improve the performance of predictive machine learning and other statistical models [1, 2]. It includes selecting the most relevant available predictor features, removing redundant features and constructing new predictor features. For example, engineered predictor features based on local gradient filters are effective at highlighting data trends, and edge detection filters are effective at finding critical boundaries or transitions in data; therefore, it is common to use these engineered features in applications such as computer vision. Raw features may also be combined to construct new features; for a specific subsurface case, consider the rock quality index as a combination of rock permeability and porosity [3]. The rapid development of deep learning shifts the burden of feature engineering to the underlying learning system [4], such as convolutional neural networks that learn local image transformations represented as weighted filters known as kernels.

Spatial feature engineering is critical for subsurface machine learning applications because deep learning captures spatial features learned from dense data representations like images or time series, but the spatial hard data available for understanding the subsurface volume of interest (e.g., a hydrocarbon/water reservoir or mining deposit), such as core measurements or well logs, are sparse. Also, the spatial context of the subsurface data, such as the sampling manner, the spatial continuity and the multiple scales (known as support size) of data and models, obscures the relationships between subsurface predictor features (e.g., rock type, porosity, and permeability) and response features (e.g., flow rates and mineral grade) that data-driven prediction models rely on, and must be dealt with prior to feature engineering. Efforts have been made to address the spatial sampling issues for spatial machine learning model construction, including spatial sampling bias [5], spatial anomalies [6], spatial training and testing data splitting [7] and spatial statistical significance [8]. It is essential to capture other aspects of the spatial context, i.e., the spatial feature heterogeneity and the scale (volume support size) of the data, in model predictions. Heterogeneity is the variation in subsurface features as a function of spatial location and is a vital factor for predicting subsurface resource recovery [9]. Also, subsurface datasets and models span a large range of scales, from well and drill hole cores, core plugs and well logs to remote sensing and production or recovery data sources. A practical heterogeneity metric can be applied as a spatial-engineered feature input to predictive models to improve the integration of the spatial context and potentially improve model performance.

Common non-spatial heterogeneity metrics include the Dykstra-Parsons and Lorenz coefficients [10,11,12], which are relatively easy to estimate without much computational power. However, these metrics are calculated directly from the permeability and porosity data table and therefore ignore the spatial context, such as location, spatial continuity and support size. Some proposed heterogeneity metrics attempt to integrate spatial correlation, including Polasek and Hutchinson's heterogeneity factor [13] and Alpay’s sand index [14], but these metrics require extrapolation of the local heterogeneity measurements based on a fence diagram or sand isopach map, which is too smooth. Various research efforts focus on comprehensive heterogeneity characterization that is informative for a specific support size or type of reservoir [15,16,17], but they lack the flexibility and computational efficiency to be generalized as a spatial feature for predictive machine learning models.

Dispersion variance is a generalized form of variance that accounts for the volume support size of data samples and models [18, 19]. It is relatively fast to calculate based on well or drill hole measurements while integrating spatial continuity. Dispersion variance is denoted as \({D}^{2}\left(a,b\right)\), representing the variance of the feature of interest measured at volume support size \(a\) within the larger volume \(b\). Dispersion variance honors the additivity of variance relation, which is the foundation of the analysis of variance (ANOVA) in statistics. For example, for the scale change from core scale to geological modeling scale, consider the volume support of the spatial sample data (denoted as \(\cdot\)) and the support size of the geological modeling cells (denoted as \(v\)), while the volume of interest is denoted as \(V\). Then the dispersion variance can be decomposed as follows based on the additivity rule, known as Krige’s relation:

$${D}^{2}\left(\cdot , V\right)={D}^{2}\left(\cdot ,v\right)+{D}^{2}(v, V)$$
(1)

In words, the total variance of the samples in the volume of interest is equal to the sum of the variance of the samples in the geological model cells and the variance of the geological model cells in the volume of interest.
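As a minimal numerical check of the additivity in Eq. 1, the sketch below uses hypothetical synthetic values (not data from this study) grouped into equal-size cells. The cell count, sample count and distribution parameters are assumptions chosen only for illustration. With population variances and equal cell sizes, the within-cell and between-cell variances sum to the total variance.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical setup: 20 model cells (support v), each holding 50 point samples.
n_cells, n_per_cell = 20, 50
cell_centers = rng.normal(loc=0.15, scale=0.03, size=n_cells)
samples = cell_centers[:, None] + rng.normal(scale=0.02, size=(n_cells, n_per_cell))

# D^2(., V): variance of all point samples within the volume of interest.
d2_point_in_V = samples.var()

# D^2(., v): average variance of the point samples within each cell.
d2_point_in_v = samples.var(axis=1).mean()

# D^2(v, V): variance of the cell averages within the volume of interest.
d2_v_in_V = samples.mean(axis=1).var()

print(d2_point_in_V, d2_point_in_v + d2_v_in_V)
```

The two printed values agree to floating-point precision, which is the empirical counterpart of Krige's relation for this simple equal-size partition.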

The dispersion variance of different volume support sizes can be estimated from the volume-integrated semivariogram \(\gamma \left(\mathbf{h}\right)\), denoted as \(\overline{\gamma }\left(\mathbf{h}\right)\) and stated as ‘gamma bar’, where one extremity of the vector \(\mathbf{h}\) describes the domain \(v\left(\mathbf{u}\right)\) and the other extremity independently describes the domain \(V\left({\mathbf{u}}^{\mathbf{*}}\right)\). The semivariogram \(\gamma \left(\mathbf{h}\right)\), under the assumption of stationarity of the mean and variogram, is defined as:

$$2\gamma \left(\mathbf{h}\right)={\mathbb{E}}\left\{{\left[Z\left(\mathbf{u}+\mathbf{h}\right)-Z\left(\mathbf{u}\right)\right]}^{2}\right\}, \forall \mathbf{u},\mathbf{u}+\mathbf{h}\in AOI$$
(2)

where \(2\gamma \left(\mathbf{h}\right)\) is the variogram, \(\mathbf{u}\) is the coordinate location vector, \(\mathbf{h}\) is the lag distance vector separating each pair of data values, \(Z\left(\mathbf{u}\right)\) and \(Z\left(\mathbf{u}+\mathbf{h}\right)\), of the variable \(Z\), and AOI is the area of interest. The semivariogram is half of the variogram. Here we adopt the common shorthand of variogram to represent the semivariogram. To calculate dispersion variance from the volume-integrated variogram, a permissible, positive definite, parametric nested variogram model informed by the experimental variogram is needed to provide a continuous function that is valid for all distances and directions, as:

$$\gamma \left(\mathbf{h}\right)=\sum_{i=1}^{nst}{C}_{i}{\Gamma }_{i}\left(\mathbf{h}\right)$$
(3)

where nst is the number of nested variogram functions, \({C}_{i}\) is the variance contribution of each nested structure and \({\Gamma }_{i}\left(\mathbf{h}\right)\) is the variogram function for each variance contribution acting over all distances and directions modeled in the major and minor continuity directions and interpolated for all other directions with a geometric anisotropy model. Then the volume integrated variogram, gamma bar \(\overline{\gamma }\left(\mathbf{h}\right)\), is calculated by:

$$\overline{\gamma }\left( {v({\mathbf{u}}),\,V({\mathbf{u}}^{ * } )} \right) = \frac{1}{v \cdot V}\int_{v} {\int_{V} {\gamma \left( {y - y^{ * } } \right)} dydy^{ * } }$$
(4)

where \(v\) and \(V\) are volumes of \(v\left(\mathbf{u}\right)\) and \(V\left({\mathbf{u}}^{\mathbf{*}}\right).\) Then the dispersion variance can be estimated from \(\overline{\gamma }\) as:

$$D^{2} \left( {v,\,V} \right) = \overline{\gamma }(V,\,V) - \overline{\gamma }(v,\,v)$$
(5)

This volume-variance relation has been proven powerful in reconciling feature variance across multiple scales and volume support sizes while accounting for spatial continuity [20, 21]. Therefore, it is a reasonable hypothesis that dispersion variance could be a good spatial feature for predictive models by integrating heterogeneity for different volume support sizes.
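To make Eqs. 2 and 3 concrete, the following sketch computes an experimental semivariogram for regularly spaced one-dimensional data and evaluates a nested model. The spherical structure type, the function names and the parameter values are assumptions for illustration only; the structure types used in this work are not restated here.

```python
import numpy as np

def experimental_semivariogram(z, max_lag):
    """Eq. 2 for regularly spaced 1D data: gamma at lags 1..max_lag (in samples)."""
    z = np.asarray(z, dtype=float)
    lags = np.arange(1, max_lag + 1)
    # Half the mean squared difference of all pairs separated by k samples.
    gamma = np.array([0.5 * np.mean((z[k:] - z[:-k]) ** 2) for k in lags])
    return lags, gamma

def spherical(h, a):
    """Unit-sill spherical structure with range a (assumed structure type)."""
    r = np.minimum(np.abs(h) / a, 1.0)
    return 1.5 * r - 0.5 * r ** 3

def nested_model(h, contributions, ranges):
    """Eq. 3: sum of variance contributions C_i times structures Gamma_i(h)."""
    h = np.asarray(h, dtype=float)
    gamma = np.zeros_like(h)
    for c, a in zip(contributions, ranges):
        gamma = gamma + c * spherical(h, a)
    return gamma

# Hypothetical two-structure model with a total sill of 1.0.
h = np.linspace(0.0, 300.0, 7)
print(nested_model(h, contributions=[0.4, 0.6], ranges=[50.0, 200.0]))
```

In practice the model parameters would be fitted to the experimental points, and directional ranges would be combined through a geometric anisotropy model as described above.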

We propose a novel utilization of dispersion variance as a heterogeneity metric for spatial feature engineering for subsurface predictive machine learning. The proposed heterogeneity metric is not only as practical to calculate as common static heterogeneity metrics, but also integrates the spatial continuity and the volume support size of the predictor features, which are critical aspects of the spatial context [22]. We demonstrate that our proposed engineered feature is sensitive to variations over a variety of spatial settings while remaining easy to compute, and that its application as a new predictor feature improves machine learning prediction performance for the case of hydrocarbon recovery from a heterogeneous reservoir. The sensitivity of the proposed engineered feature (i.e., dispersion variance within well drainage radius) is shown by varying the possible controlling variables, such as well length, well drainage radius, and well trajectory. Then we model the variogram, map the volume of the data from the varying well trajectories and investigate the hydrocarbon recovery information provided by dispersion variance.

In the next section, we explain the methodology for practical calculation of our proposed dispersion variance-based heterogeneity spatial-engineered feature. In the results section, we first show the sensitivity analysis results for the possible factors that affect the dispersion variance calculation from the perspective of geological stratigraphy and well parameters. Second, we investigate the relation between dispersion variance and hydrocarbon production based on black-oil simulation while holding other simulation parameters constant. Lastly, we conduct a case study with data from the Kaybob field, Duvernay Formation, where we use dispersion variance as a spatial feature together with other completion and petrophysical features in machine learning models to demonstrate that dispersion variance is an informative spatial-engineered feature for hydrocarbon recovery predictive models.

2 Methodology

Our proposed method assumes stationary mean, \({\mu }_{z}\), variance, \({\sigma }_{z}^{2}\), and variogram \({\gamma }_{z}(\mathbf{h})\) of spatial features over the area of interest (AOI). Given this assumption of stationarity, for the calculation of the variogram, we omit the dependence on location and only consider the dependence on the lag vector, \(\mathbf{h}\).

$$\gamma \left(\mathbf{h};\mathbf{u}\right)=\gamma (\mathbf{h})$$
(6)

In the presence of non-stationarity, we can model a local trend model and work with a stationary residual or segment the area of interest into multiple stationary regions with domain expertise.
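One simple way to work with a stationary residual is sketched below. It assumes a linear trend in the coordinates fitted by least squares, which is only one possible trend model and is not necessarily the one used in this work; the function name `detrend_linear` is hypothetical.

```python
import numpy as np

def detrend_linear(coords, z):
    """Fit z ~ b0 + b . coords by least squares and return (trend, residual)."""
    coords = np.asarray(coords, dtype=float)
    z = np.asarray(z, dtype=float)
    # Design matrix with an intercept column followed by the coordinates.
    design = np.column_stack([np.ones(len(z)), coords])
    beta, *_ = np.linalg.lstsq(design, z, rcond=None)
    trend = design @ beta
    return trend, z - trend

# Hypothetical example: porosity with a gentle trend in x and y plus noise.
rng = np.random.default_rng(seed=0)
xy = rng.uniform(0.0, 590.0, size=(100, 2))
porosity = 0.10 + 1e-4 * xy[:, 0] - 5e-5 * xy[:, 1] + rng.normal(scale=0.01, size=100)
trend, residual = detrend_linear(xy, porosity)
print(residual.mean(), residual.var())
```

The variogram and dispersion variance would then be calculated on the residual, with the trend added back when predictions are needed in original units.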

The steps to calculate the dispersion variance-based heterogeneity spatial-engineered feature are:

  1.

    Calculate the representative feature distribution variance at data support. Declustering methods can be utilized to achieve the goal of statistical representativity by assigning each datum a weight [23, 24]. The representative feature distribution variance \({\sigma }_{z}^{2}\) can be approximated from sample variance \({s}_{z}^{2}\) by:

    $${s}_{z}^{2}= {\sum }_{i=1}^{n}{w}_{i}{\left({z}_{i}-\overline{z }\right)}^{2}$$
    (7)

    where the weights \({w}_{i}, i=1, 2, \dots ,n\) are between 0 and 1 and sum to 1, \({z}_{i}\) is the sample datum, and \(\overline{z }\) is the representative sample mean calculated from:

    $$\overline{z }={\sum }_{i=1}^{n}{w}_{i}{z}_{i}$$
    (8)
  2.

    Calculate and model the variogram integrating all available spatial data, analog information and trend model.

  3.

    Establish the volume support size, \(v\), of the sample data or imputed data. For the case of predicting hydrocarbon or water recovery from a well, the volume support size is based on the well drainage radius and the well length. The variance of data within the well volume is denoted as \({D}^{2}\left(\cdot ,v\right)\), where \(\cdot\) is the volume support of the spatial sample data and \(v\) is the volume within the well drainage radius, i.e., \({D}^{2}\left(data,well\right)\). According to Eq. 1:

    $${D}^{2}\left(data, reservoir\right)={D}^{2}\left(data,well\right)+{D}^{2}\left(well, reservoir\right)$$
    (9)

    The dispersion variance can be calculated from a volume-integrated variogram, \(\overline{\gamma }\left(\mathbf{h}\right)\) according to Eq. 4. The dispersion variance within the volume of drainage radius of a horizontal well is calculated with \(\overline{\gamma }\left(\mathbf{h}\right)\) according to Eq. 5 as follows:

    $${D}^{2}\left(data,well\right)= \overline{\gamma }\left(well, well\right)-\overline{\gamma }\left(data,data\right)$$
    (10)
  4.

    Apply numerical integration to calculate the dispersion variance of the feature over the volume support \(v\) of the sample data and apply it as a new engineered feature. The numerical approximation for \(\overline{\gamma }\left(\mathbf{h}\right)\) can be estimated as:

    $$\overline{\gamma }\left(v\left(\mathbf{u}\right),V\left({\mathbf{u}}^{\mathbf{*}}\right)\right) \approx \frac{1}{m\cdot {m}^{*}}{\sum }_{i=1}^{m}{\sum }_{j=1}^{{m}^{*}}\gamma ({\mathbf{u}}_{i}-{\mathbf{u}}_{j}^{*})$$
    (11)

    where the \(m\) points \({\mathbf{u}}_{i}\), \(i=1, 2, \dots , m\), discretize the volume \(v\left(\mathbf{u}\right)\) and the \({m}^{*}\) points \({\mathbf{u}}_{j}^{*}\), \(j=1, 2, \dots , {m}^{*}\), discretize the volume \(V\left({\mathbf{u}}^{\mathbf{*}}\right)\). For calculating \(\overline{\gamma }\left(well, well\right)\) with a given volume support of the well, the volume support size \(v\left(\mathbf{u}\right)=V\left({\mathbf{u}}^{\mathbf{*}}\right)\), which is the volume within the well drainage radius. In the \(\overline{\gamma }\) calculation, we assume the variogram of the feature averages linearly within the area of interest. If the original feature is standardized, then under the stationarity assumption, \(\overline{\gamma }\left(data,data\right)=0\) and \(\overline{\gamma }\left(reservoir, reservoir\right)={\sigma }_{z}^{2}=1\). So, the standardized dispersion variance is between 0 and 1. We will use standardized dispersion variance throughout our demonstrations.

From this workflow, we calculate the proposed dispersion variance of the volume support size and then apply it as a spatial-engineered feature with improved integration of the spatial context.
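The workflow steps above can be sketched as follows for a standardized feature. This is a minimal illustration under stated assumptions, not the implementation used in this work: a single unit-sill spherical structure, anisotropy axes aligned with the coordinate axes (major direction along x), a horizontal well along x, hypothetical ranges, and a coarse point discretization of the drainage volume. All function names are hypothetical.

```python
import numpy as np

def weighted_variance(z, w):
    """Eqs. 7-8: declustered mean and variance with weights normalized to sum to 1."""
    z, w = np.asarray(z, dtype=float), np.asarray(w, dtype=float)
    w = w / w.sum()
    mean = np.sum(w * z)
    return np.sum(w * (z - mean) ** 2)

def spherical_aniso(dx, dy, dz, ranges):
    """Unit-sill spherical variogram with axis-aligned geometric anisotropy."""
    h = np.sqrt((dx / ranges[0]) ** 2 + (dy / ranges[1]) ** 2 + (dz / ranges[2]) ** 2)
    h = np.minimum(h, 1.0)
    return 1.5 * h - 0.5 * h ** 3

def gamma_bar(pts_a, pts_b, ranges):
    """Eq. 11: average variogram over all pairs of discretization points."""
    d = pts_a[:, None, :] - pts_b[None, :, :]
    return spherical_aniso(d[..., 0], d[..., 1], d[..., 2], ranges).mean()

def well_points(length, radius, n_len=20, n_rad=5):
    """Discretize the drainage volume of a well along x: a regular grid of
    points clipped to a cylinder of the given radius."""
    x = (np.arange(n_len) + 0.5) / n_len * length
    t = (np.arange(n_rad) + 0.5) / n_rad * 2.0 * radius - radius
    X, Y, Z = np.meshgrid(x, t, t, indexing="ij")
    keep = Y ** 2 + Z ** 2 <= radius ** 2
    return np.column_stack([X[keep], Y[keep], Z[keep]])

def standardized_dispersion_variance(length, radius, ranges):
    """Eq. 10 for a standardized feature, where gamma_bar(data, data) = 0."""
    pts = well_points(length, radius)
    return gamma_bar(pts, pts, ranges)

# Hypothetical ranges (major, minor, vertical) in metres.
ranges = (300.0, 150.0, 20.0)
for length in (100.0, 200.0, 300.0):
    print(length, standardized_dispersion_variance(length, 20.0, ranges))
```

A finer discretization (larger `n_len`, `n_rad`) improves the approximation at a quadratic cost in the number of point pairs; for a non-standardized feature, the result would be scaled by the representative variance from `weighted_variance`.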

3 Results and Discussion

Based on the above workflow for calculating the spatial-engineered feature, dispersion variance at a given volume support size, the possible factors that affect dispersion variance are well length, well drainage radius and well trajectory (i.e., dip and azimuth deviating from the major direction of the spatial continuity model). We first demonstrate the impact of each factor on dispersion variance within well drainage radius \({D}^{2}\left(\cdot ,v\right)\) through a sensitivity analysis. Then we demonstrate \({D}^{2}\left(\cdot ,v\right)\) as a spatial-engineered feature for hydrocarbon production prediction models. Note that the features are standardized so that the standardized dispersion variance is bounded between 0 and 1.

To demonstrate the proposed heterogeneity metric as an engineered feature, we build a three-dimensional 590 m × 590 m × 90 m heterogeneous hydrocarbon reservoir model as a truth model, based on sequential Gaussian simulation for the primary variable, porosity, and collocated cokriging for the secondary variable, logarithm of permeability [25, 26]. Porosity and permeability in the logarithm scale have the same variogram model and a 0.7 correlation coefficient. The variogram model parameters are in Table 1. The truth reservoir model is shown in Fig. 1.

Table 1 Variogram model parameters to generate heterogeneous reservoir model

Fig. 1 Truth reservoir model sections including (a) porosity and (b) permeability in logarithm scale

We demonstrate the sensitivity of the spatial-engineered feature with respect to well length, drainage radius and well trajectories. Figure 2 shows a schematic indicating the spatial-engineered feature volume support size and variogram model parameters for the corresponding well trajectories. The volume support size is quantified by well drainage radius and well length. For well trajectories, we are interested in the dip angle and azimuth deviating from the major directions of the geological spatial continuity specified in the variogram model. Figure 3 shows an illustration of the proposed spatial-engineered feature, dispersion variance over the well drainage volume, as a function of the deviation of the well trajectory from the major spatial continuity azimuth and dip direction with fixed well length and well drainage radius. The well trajectory that is aligned with the major direction and the stratigraphic layer, i.e., when \(\Delta\) dip = 0 and \(\Delta\) azimuth = 0, has the minimum dispersion variance within drainage radius \({\mathrm{D}}^{2}\left(\cdot ,\mathrm{v}\right)\), because a larger variogram range decreases \({\mathrm{D}}^{2}\left(\cdot ,\mathrm{v}\right)\) and increases \({\mathrm{D}}^{2}\left(\mathrm{v},\mathrm{V}\right)\), the dispersion variance between wells. When the well trajectory aligns with the major direction and \(\Delta\) dip = 0, the variogram range within the well drainage radius is maximized. Figure 4 demonstrates the sensitivity of standardized dispersion variance within well drainage radius with respect to well length, well drainage radius and well trajectories. Table 2 shows the factor values for the base case and their ranges for the test cases applied to calculate the sensitivity shown in Fig. 4. The base case parameters and test case ranges are picked based on the major and minor direction ranges of the variogram model. The base case has a standardized dispersion variance value \({D}^{2}\left(\cdot ,v\right)\) = 0.696. Well length has the largest impact on the dispersion variance value in comparison with all other factors.
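The trajectory sensitivity described above can be illustrated with a simplified sketch. It assumes a unit-sill spherical variogram with hypothetical ranges, treats the well as a line (neglecting the drainage radius), and measures azimuth and dip as deviations from the major continuity direction. These simplifications and the function name are assumptions for illustration, not the configuration behind Figs. 3 and 4.

```python
import numpy as np

def line_dispersion_variance(length, d_azimuth_deg, d_dip_deg, ranges, n=200):
    """Standardized dispersion variance of point data within a line-shaped well.

    The well direction deviates from the major continuity direction (x axis)
    by the given azimuth and dip angles.
    """
    az, dip = np.radians(d_azimuth_deg), np.radians(d_dip_deg)
    direction = np.array([np.cos(dip) * np.cos(az),
                          np.cos(dip) * np.sin(az),
                          np.sin(dip)])
    # Anisotropy-scaled length of a unit step along the well direction;
    # its reciprocal is the effective variogram range along the well.
    scale = np.sqrt(np.sum((direction / np.asarray(ranges, dtype=float)) ** 2))
    t = (np.arange(n) + 0.5) / n * length
    h = np.minimum(np.abs(t[:, None] - t[None, :]) * scale, 1.0)
    return float(np.mean(1.5 * h - 0.5 * h ** 3))

# Hypothetical ranges (major, minor, vertical) in metres and a 200 m well.
ranges = (300.0, 150.0, 20.0)
for d_az, d_dip in [(0, 0), (45, 0), (90, 0), (0, 10), (0, 30)]:
    print(d_az, d_dip, line_dispersion_variance(200.0, d_az, d_dip, ranges))
```

Under these assumptions the aligned trajectory gives the smallest value, and any deviation shortens the effective range along the well and raises the dispersion variance, consistent with the behaviour described for Fig. 3.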

Fig. 2 Schematic illustration indicating the spatial-engineered feature volume support size and variogram parameters for the corresponding well trajectories

Fig. 3 Illustration of dispersion variance within well drainage radius changing with azimuth and dip angle of the well trajectory deviating from major direction with well length = 100–300 m, well drainage radius = 10–50 m, as examples

Fig. 4 Tornado plot for the result of sensitivity analysis of the proposed spatial-engineered feature, standardized dispersion variance within well drainage radius, for different well length, drainage radius, well trajectories (i.e., dip and azimuth deviating from major direction of the geological spatial continuity model)

Table 2 Base case and test case range of the sensitivity analysis for standardized dispersion variance within well drainage radius with different well length, drainage radius, well trajectories (i.e., dip and azimuth deviating from major direction of the geological spatial continuity model)

Next, we demonstrate the ability of the spatial-engineered feature to predict subsurface hydrocarbon production behavior with flow simulation. We construct a black-oil, finite difference, implicit pressure explicit saturation (IMPES) numerical simulation model with the truth reservoir model in Fig. 1. The production relies on pressure depletion only, to simplify the production forecast simulation so that we can focus on the impact of heterogeneity.

Since well length and well drainage radius have an obvious impact on production, we hold them constant at the base case values to isolate the impact of the dispersion variance, the proposed spatial feature, on hydrocarbon production; therefore, dispersion variance only changes with respect to well trajectory. We iterate over realizations of varying dip and azimuth of the well trajectory and calculate the corresponding dispersion variance within well drainage radius while keeping all other simulation parameters the same for each realization. For each well trajectory realization, there is a corresponding cumulative production curve over time. We group the cumulative production curves using the base case \({D}^{2}\left(\cdot ,v\right)\) value as the cut-off between high and low \({D}^{2}\left(\cdot ,v\right)\). Figure 5 shows the 95% confidence interval of the production curve conditional on the dispersion variance within the drainage radius of the well, which indicates significantly different cumulative oil production for low and high \({D}^{2}\left(\cdot ,v\right)\). Therefore, the dispersion variance is informative as a spatial-engineered feature to be utilized in data-driven predictive models for production.
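The grouping step can be sketched as below. The interval construction is an assumption: the sketch uses a normal-approximation 95% confidence interval for the group mean at each time step, which may differ from the construction used for Fig. 5. The synthetic curves and the function name are hypothetical; only the cut-off of 0.696 comes from the base case above.

```python
import numpy as np

def group_curve_summary(curves, d2, cutoff):
    """Split production curves by a dispersion variance cut-off and return,
    per group, the mean curve and an approximate 95% interval for the mean."""
    curves, d2 = np.asarray(curves, dtype=float), np.asarray(d2, dtype=float)
    summary = {}
    for name, mask in (("low", d2 <= cutoff), ("high", d2 > cutoff)):
        group = curves[mask]
        mean = group.mean(axis=0)
        # Normal approximation: 1.96 standard errors on each side of the mean.
        half = 1.96 * group.std(axis=0, ddof=1) / np.sqrt(len(group))
        summary[name] = (mean, mean - half, mean + half)
    return summary

# Hypothetical synthetic curves: one row per well trajectory, one column per time step.
rng = np.random.default_rng(seed=1)
time = np.linspace(0.0, 1.0, 12)
d2_values = rng.uniform(0.55, 0.85, size=40)
curves = np.outer(2.0 - d2_values, np.sqrt(time)) + rng.normal(scale=0.02, size=(40, 12))
summary = group_curve_summary(curves, d2_values, cutoff=0.696)
print(summary["low"][0][-1], summary["high"][0][-1])
```

Each group needs at least two curves for the sample standard deviation to be defined; non-overlapping intervals at late times would indicate a clear separation between the groups.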

Fig. 5 Cumulative oil production expectation curve (solid line) and 95% confidence interval (dashed line) grouped by high and low dispersion variance feature values with Dykstra-Parsons coefficient = 0.8, permeability mean = 0.08 mD

Additionally, a sensitivity analysis is conducted to investigate the flow performance over various permeability magnitudes and degrees of permeability heterogeneity quantified by the Dykstra-Parsons coefficient. We use the relative cumulative production change (%) between the group with high \({\mathrm{D}}^{2}\left(\cdot ,\mathrm{v}\right)\) and the group with low \({\mathrm{D}}^{2}\left(\cdot ,\mathrm{v}\right)\) to evaluate the sensitivity of production to the proposed metric. Figure 6 shows the cumulative oil production change under different Dykstra-Parsons coefficients and permeability magnitudes. Overall, production is more sensitive to dispersion variance when the Dykstra-Parsons coefficient is high and the permeability mean is low. The sensitivity of the cumulative production change per unit length to the dispersion variance metric increases as the Dykstra-Parsons coefficient increases.

Fig. 6 Cumulative oil production change between the group curves with high \({D}^{2}\left(\cdot ,v\right)\) and the group with low \({D}^{2}\left(\cdot ,v\right)\) under different Dykstra-Parsons coefficient and permeability magnitudes (solid: permeability mean = 0.08 mD; dash: permeability mean = 0.8 mD; dot: permeability mean = 8 mD)

When the permeability mean is low, i.e., close to the magnitude of a tight oil or shale reservoir, the dispersion variance within well drainage radius is informative and distinguishes high from low cumulative production. When the permeability mean is high, the well drainage radius is no longer a fixed near-wellbore value and instead extends to the whole reservoir. This explains why dispersion variance carries less information about production: the dispersion variance within well drainage radius now converges to the dispersion variance at the reservoir volume support size, which is equal to 1 in standardized scale.

4 Case Study

Based on the analysis in the previous section, we infer that the dispersion variance could be an informative predictor feature for tight oil or shale reservoirs. Therefore, we further demonstrate the dispersion variance as a spatial feature for predictive machine learning models with a case study in the Kaybob field, Duvernay formation.

The Duvernay formation was deposited in a sub-equatorial epicontinental seaway in the late Devonian (Frasnian) time, which corresponds to the maximum transgression of this late Devonian sea into the western Canadian craton. The shale was deposited in the paleo-lows within the confines of the surrounding Leduc reefs in a slope and basin environment. This results in a series of sub-basin deposits from the West shale basin to the East shale basin. The depth ranges from 2000 to 3700 m, and the formation produces across the oil, condensate and gas windows. The greater Kaybob area is located in the West shale basin. The formation is at its greatest thickness in the Kaybob area, thinning to the east. The mineralogy also changes from west to east: the Kaybob area is a more silica-rich shale, passing into the less quartz-rich, higher clay and higher carbonate content East shale basin [27, 28]. The Kaybob area of the West shale basin is the most developed and is where the majority of Duvernay production comes from (see Fig. 7).

Fig. 7 Duvernay formation map with vertical G&G wells (red) and horizontal producers (green)

For this case study, we use 110 horizontal wells with the features and response listed in Table 3. We use production per unit length as the response feature to remove the direct impact of well length, and barrel of oil equivalent (BOE) as a convenient summarization of the production response for oil, gas and condensate volumes in the same units. The dispersion variance within well drainage radius is calculated based on the porosity variogram model, and the maximum drainage radius for each well is approximated based on well spacing. A non-parametric conditional expectation plot of production given the dispersion variance is shown in Fig. 8. To further investigate the impact of our proposed spatial-engineered feature on a predictive machine learning model, we test the dispersion variance feature in random forest and gradient boosting models. Using grid search and k-fold cross-validation with k = 5, we find the optimal hyperparameters. The averages and standard deviations of the metrics calculated from the k-fold cross-validation testing sets of the optimal models are shown in Fig. 9, where the optimal models in Fig. 9a use all the features in Table 3 while the optimal models in Fig. 9b exclude the dispersion variance feature. Including the proposed spatial feature reduces the mean absolute error (MAE) and root mean squared error (RMSE) for random forest, while for the gradient boosting model, adding the spatial-engineered feature only reduces MAE. Since both random forest and gradient boosting are stochastic models, we iterate over 100 realizations with different random seeds to check whether the performance is stable. The relative difference of a metric (MAE or RMSE) is defined as the metric for the model with the spatial-engineered feature minus that without the spatial-engineered feature, divided by the metric value with the spatial-engineered feature. The relative difference distribution is shown in Fig. 10. In the majority of cases, including the proposed spatial-engineered feature improves the model performance, judging from a smaller MAE for both random forest and gradient boosting. This work demonstrates that our proposed dispersion variance feature may, in some cases, improve predictions and should be considered as a new spatially informed feature in building predictive models. Demonstrating the rank of feature importance of dispersion variance for predicting production relative to all other possible geological and engineering parameters would require a much more comprehensive study and is not the goal of this work.
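The error metrics and the relative difference defined above can be written compactly as follows. The function names and the example numbers are hypothetical and are not results from this study; a negative relative difference means the model with the spatial-engineered feature has the lower error.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def relative_difference(metric_with, metric_without):
    """(metric with the feature - metric without it) / metric with the feature."""
    return (metric_with - metric_without) / metric_with

def fraction_improved(relative_differences):
    """Share of realizations in which adding the feature lowered the error."""
    return float(np.mean(np.asarray(relative_differences, dtype=float) < 0.0))

# Hypothetical metric values over four random-seed realizations.
mae_with = np.array([0.80, 0.82, 0.79, 0.85])
mae_without = np.array([0.84, 0.81, 0.83, 0.88])
rel = relative_difference(mae_with, mae_without)
print(rel, fraction_improved(rel))
```

In a full comparison, `mae_with` and `mae_without` would come from cross-validated models trained with identical hyperparameter searches and random seeds, so that the only difference between each pair is the presence of the dispersion variance feature.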

Table 3 Summary of available data for the predictive machine learning model

Fig. 8 Conditional P10 (dashed line), expectation (solid line), and P90 (dashed line) of first 12-month cumulative production to dispersion variance within well drainage radius over 110 wells (scatter)

Fig. 9 Measured and predicted first 12-month cumulative production (scatter) along the 45-degree line (dashed line) with average metrics (mean absolute error and root mean squared error) ± the corresponding standard deviation via k-fold validation, k = 5 (a) using random forest and gradient boosting with all the features in Table 3 (b) without dispersion variance feature

Fig. 10 Relative difference distribution of the metrics, mean absolute error and root mean squared error, over 100 realizations with different random seeds for (a) random forest models and (b) gradient boosting models

5 Conclusion

The proposed spatial-engineered heterogeneity feature, well dispersion variance, integrates the impact of spatial continuity and data volume support size and is computationally efficient to calculate. Dispersion variance is sensitive to various spatial factors, such as well trajectory with respect to the major continuity direction, and is informative about the response feature for flow through porous media. Integrating spatial and scale information into a single, spatially aware feature also helps to reduce dimensionality for predictive machine learning models and may improve model performance. We demonstrate that the spatial-engineered feature could be especially useful for unconventional or tight oil reservoirs. We suggest augmenting predictive machine learning models with the proposed spatially engineered feature to improve subsurface resource prediction.