Machine Learning Methods for Sweet Spot Detection: A Case Study

Part of the Quantitative Geology and Geostatistics book series (QGAG, volume 19)

Abstract

In the geosciences, sweet spots are defined as areas of a reservoir that represent the best production potential. From the outset, it is not always obvious which reservoir characteristics best determine the location, and influence the likelihood, of a sweet spot. Here, we will view detection of sweet spots as a supervised learning problem and use tools and methodology from machine learning to build data-driven sweet spot classifiers. We will discuss some popular machine learning methods for classification, including logistic regression, k-nearest neighbors, support vector machine, and random forest, and highlight strengths and shortcomings of each method. In particular, we will draw attention to a complex setting and focus on a smaller real data study with limited evidence for sweet spots, where most of these methods struggle. We will illustrate a simple remedy that increases their performance by optimizing for precision. In conclusion, we observe that all methods considered need some sort of preprocessing or additional tuning to attain practical utility. While the application of support vector machine and random forest shows a fair degree of promise, we still stress the need for caution in naive use of machine learning methodology in the geosciences.

1 Introduction

In petroleum geoscience, sweet spots are defined as areas of oil or gas reservoirs that represent the best production potential. In particular, the term has emerged for unconventional reservoirs, where the reserves are not restricted to traps or structures but may exist across large geographical areas. In unconventional reservoirs, sweet spots are typically defined by combinations of certain key rock properties; total organic carbon (TOC), brittleness, and fractures are some of the properties influencing possible production. In identifying these sweet spots, operators face the challenge of working with large amounts of data from horizontal wells and modeling the complex relationships between reservoir properties and production.

In general, a more data-driven approach for sweet spot detection allows for a more direct use of less costly reservoir data, such as seismic attributes. Moreover, such an approach may potentially avoid parts of the expensive reservoir modeling. In particular, the time-consuming computations needed to build a full reservoir model can be avoided. Fast and reliable classification of the sweet spots is of high significance, as it allows for focusing efforts toward the most productive areas of a reservoir. This makes machine learning algorithms desirable, since these are typically fast to train, often easy to regularize, and have the ability to adapt and learn complex relationships.

The use of machine learning methodology for predicting and detecting potential areas of interest is gaining attention and is not new to the geosciences. A multidisciplinary workflow for predicting sweet spot locations is presented in Vonnet and Hermansen (2015). An example of applying support vector machines to well data for prediction purposes is given in Li (2005). In Wohlberg et al. (2006), the support vector machine is demonstrated as a tool for facies delineation, and in Al-Anazi and Gates (2010), the method is applied for predicting permeability distributions.

In this paper we continue this exploration and view sweet spot detection in a machine learning setting, framed as a traditional supervised learning problem, i.e., classification. The resulting classifiers are data-driven algorithms that learn relationships between the reservoir properties and sweet spots from labeled well-log training data. We illustrate different popular machine learning algorithms through a case study, considering a real and challenging data set with a weak signal for sweet spots. The algorithms we consider and compare are logistic regression, k-nearest neighbor (kNN), support vector machines (SVMs), and random forest.

We will emphasize a moderate and cautious approach, in contrast to uncritical use of machine learning for classification: awareness of what can actually be learned from the data is essential for interpreting the results. The main challenge here is the low data quality and the limited evidence for sweet spots (see Sect. 2). In such cases, the focus should be on confident evidence of sweet spots, even at the price of a low discovery rate. There is usually a high cost associated with exploration and development of a field, so it is generally better to sacrifice some sweet spots (i.e., detection rate) in order to gain accuracy and precision. This is our main focus, and we compare the ability of these machine learning algorithms to learn from a weak signal. We show how a simple modification can improve such methods, recovering their potential and providing sufficiently confident evidence of sweet spots. We also discuss the inadequacy of simple summary statistics for model validation and show that, in general, a more detailed investigation is needed to assess the actual performance.

In Sect. 2 we describe our real data set and set the sweet spot detection in a binary classification setting. Next, in Sect. 3, we discuss the machine learning algorithms used in this case study. Section 4 outlines the setup for training and validating the machine learning methods, before the numeric results are presented and discussed. Lastly, Sect. 5 concludes the case study.

The training and validation of the machine learning methods, the predictions, and the numeric comparisons are carried out in R, using the packages e1071, class, and randomForest.
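A minimal sketch of this computational setup (the package names are from the text; the data object `wells` is a hypothetical placeholder for the labeled well-log data of Sect. 2):

```r
## Packages used throughout this case study.
library(e1071)         # svm(), tune.svm(), tune.control()
library(class)         # knn(), knn.cv()
library(randomForest)  # randomForest(), importance()

set.seed(1)  # makes the random train/test/validation splits reproducible
## `wells`: hypothetical data frame with the six features of Sect. 2 and a
## binary factor `sweet` with levels "sweet" and "non-sweet" (assumed names).
```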

2 Data and the Problem

The case study consists of labeled observations from four vertical blocked wells in a reservoir, providing a total of 315 observation points. For each observation point, six reservoir properties are available for training, henceforth referred to as features. These are the seismic attributes P-wave velocity (Vp), S-wave velocity (Vs), density, acoustic impedance (AI), 4D residual of pre-stack time migration, and average magnitude of reflectivity. In addition, total organic carbon (TOC) and gamma ray (GR) are provided in the wells; these are used to set the labels, i.e., the sweet spots. See Fig. 1 and Table 1 for details on the number of observations and the ratio of sweet spots to non-sweet spots in the wells.
Fig. 1

The ranges of TOC and GR used to define the sweet spots. The observations in the red region are defined (by the geologist) as sweet spots

Table 1

Number of sweet spots and non-sweet spots in the four wells

 

Well     Sweet spots : non-sweet spots
Well 1   40 : 63
Well 2    9 : 40
Well 5   38 : 63
Well 6   19 : 43

Note that the first four features (Vp, Vs, density, and AI) have been corrected for a depth trend. Thus, the a priori background model for the parameters has been removed, since this introduced a systematic bias in the predictions.

To illustrate the complexity of this data set and the weak underlying relationship between sweet spots and reservoir properties, we show four cross plots of pairwise combinations of the six features in Fig. 2. These plots indicate a quite strong linear correlation between Vp and density, and also between density and AI. Moreover, there is no clear relationship between AI and the 4D residual, or between the 4D residual and the average magnitude of reflectivity; this appears to be caused by the high level of noise in the measured 4D residuals. In none of the four cross plots is there any trace of a geometric delimitation of the sweet spots, indicating that the well data are not easily linearly separable in the feature space; hence we expect complex, or limited, relationships.
Fig. 2

Cross plots of a selection of pairwise combinations of the six features. All features are plotted in the normalized domain, hence no units along the axes. Again, red marks observations defined as sweet spots; blue marks non-sweet spots

The reservoir model used for predicting sweet spots has dimensions 280 by 350 by 100 cells. Figure 3 shows the top layer and a vertical slice of the acoustic impedance. Note that the upper left and lower right corners of the lateral view do not contain defined values. Six wells are marked with circles in the lateral plot. Wells 1, 2, 5, and 6 provide the data defining the sweet spots. The two additional wells in the reservoir, Wells 3 and 4, lack values for total organic carbon and gamma ray; they cannot be used to define sweet spots and are therefore excluded from further analysis.
Fig. 3

Top layer of the acoustic impedance (left). The wells are numbered from 1 to 6, of which Wells 3 and 4 do not have defined sweet spots. A vertical slice of the acoustic impedance along the dashed line marked in the top layer (right). Note the values are corrected for a depth trend

The S-wave velocity in the labeled data set appears to be artificially constructed from the P-wave velocity, as the estimated correlation between the two is above 0.99, which is also confirmed visually. Traditionally, we would be inclined to exclude one of these variables from the statistical analysis, e.g., to avoid collinearity. However, we will keep both features in our training data to test and illustrate the robustness of the (probabilistic) model-free machine learning methods.

The sweet spot classification is a binary classification problem, where we identify the two classes: sweet spots and non-sweet spots. In a binary classification there are four possible outcomes summarized below:

True negative (TN): a true non-sweet spot correctly classified as a non-sweet spot

False positive (FP): a true non-sweet spot wrongly classified as a sweet spot

False negative (FN): a true sweet spot wrongly classified as a non-sweet spot

True positive (TP): a true sweet spot correctly classified as a sweet spot

A perfect classification identifies all true non-sweet spots as non-sweet spots and all true sweet spots as sweet spots. To measure the performance of a classification, several accuracy and error measures are available. In sweet spot detection, our primary goal is to obtain precise knowledge of the locations of possible sweet spots and of the corresponding accuracy and precision of the predictions. We therefore focus on the performance measures True Detection Rate (TDR) and True Prediction Rate (TPR), defined as
$$ \mathrm{TDR}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}=\frac{\text{number of correctly predicted sweet spots}}{\text{number of true sweet spots}} $$
(1)
$$ \mathrm{TPR}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}}=\frac{\text{number of correctly predicted sweet spots}}{\text{number of predicted sweet spots}} $$
(2)
The TDR measures the recall (or sensitivity) of the classification and describes how well the classification method detects the locations that actually are sweet spots. The TPR measures precision and gives the proportion of predicted sweet spots that are actual sweet spots. To combine the measures of recall and precision, we will use the Fβ-score, defined as the weighted harmonic mean of recall and precision:
$$ F_{\beta}=\frac{\left(1+\beta^2\right)\cdot \mathrm{TPR}\cdot \mathrm{TDR}}{\beta^2\cdot \mathrm{TPR}+\mathrm{TDR}}=\frac{\left(1+\beta^2\right)\cdot \mathrm{TP}}{\left(1+\beta^2\right)\cdot \mathrm{TP}+\beta^2\cdot \mathrm{FN}+\mathrm{FP}} $$
(3)
In the following, we will use the balanced weighting with β = 1, denoted the F1 score. The more general Fβ score will be of value when tuning the algorithms. Especially for the SVMs, this score will be used as a means of balancing TPR and TDR favorably, to avoid overfitting and collapsing the model to the uninteresting solution of predicting all locations as either sweet or non-sweet.
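The three measures translate directly from Eqs. (1)-(3) into a few lines of R; a sketch (not the authors' code), assuming `truth` and `pred` are logical vectors with TRUE marking sweet spots:

```r
## Performance measures of Eqs. (1)-(3).
sweet_spot_scores <- function(truth, pred, beta = 1) {
  TP <- sum(pred & truth)    # correctly predicted sweet spots
  FP <- sum(pred & !truth)   # non-sweet spots predicted as sweet
  FN <- sum(!pred & truth)   # sweet spots predicted as non-sweet
  TDR   <- TP / (TP + FN)    # recall, Eq. (1)
  TPR   <- TP / (TP + FP)    # precision, Eq. (2)
  Fbeta <- (1 + beta^2) * TPR * TDR / (beta^2 * TPR + TDR)  # Eq. (3)
  c(TDR = TDR, TPR = TPR, Fbeta = Fbeta)
}
```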

In the sweet spot setting, we argue that TPR is the most important, as an assurance of correct sweet spot predictions. On the other hand, a carefully balanced focus on the TDR will ensure that more sweet spots are found, at the cost of including non-sweet spots misclassified as sweet. Again, care is needed when tuning methods against these measures.

Moreover, we expect an overrepresentation of sweet spots in the data. This seems obvious, since wells are not placed randomly in the field, but exactly where the developers expect the greatest potential for success, i.e., in the sweet spots. This suggests that there is most likely an unobserved confounding, or omitted, variable. The information and process underlying the positioning of wells can be thought of as an unobserved (and highly complex) variable influencing both the response and the explanatory variables. This may in turn result in an unbalanced data problem (too many sweet spots) and introduce potentially complex correlations among the explanatory variables and the response; see, among others, He and Garcia (2009) and King and Zeng (2001) for additional discussion. It is generally hard, or even impossible, to correct for such effects; see Li et al. (2011) for an attempt to correct the support vector machine. The logistic regression model is particularly sensitive; see also Mood (2010). As a final remark, if we consider the overall reservoir from which the well logs are collected, we expect sweet spots to be in the minority, causing an additional imbalance, this time in the opposite direction. Proper treatment of such effects and possible extensions are outside the scope of this paper.

3 Machine Learning Methods

In general, machine learning refers to algorithms and statistical methods for data analysis. Here, we will focus on machine learning methodology for the prediction of binary class labels, i.e., two-class problems; all methods discussed can easily be generalized to multiclass problems. We will consider four common and popular supervised learning algorithms: logistic regression, random forest, k-nearest neighbor (kNN), and support vector machine (SVM).

3.1 Logistic Regression

Logistic regression is a classical and popular model-based classification algorithm; we refer the reader to, e.g., Hastie et al. (2009) or any introductory textbook in statistics for a general introduction. The logistic regression model provides estimates of the probability of a binary response as a function of one or more explanatory variables. Since it is model based, proper and valid statistical inference is possible, e.g., statistical tests for feature selection. In addition, compared to machine learning algorithms such as kNN, SVM, or tree-based models, the outputs of a fitted logistic regression model can be interpreted as actual class probabilities under the model conditions. Most (model-free) machine learning algorithms output only class labels; probabilistic proxies must be obtained and tuned from the raw outputs to mimic the output of a probabilistic model. See, for instance, Platt (1999) for an algorithm for obtaining class probabilities for SVMs.

The logistic regression model has certain well-known challenges. Firstly, compared to simple machine learning algorithms like kNN and SVMs, fitting a logistic regression model requires some form of semi-complex, iterative optimization algorithm (such as gradient descent). On the other hand, since it is based on a low-dimensional parametric model (the number of parameters is essentially the number of features plus one), the fitted model is very efficient for predicting on large grids. Further, logistic regression is sensitive to collinearity and confounding, is not particularly robust against outliers, and may become hard to tune automatically (i.e., to select the appropriate number of features to use); see, among others, Menard (2002) for details on the applied use of logistic regression.
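In R, such a model is fitted with the built-in `glm` function; a minimal sketch, where `train` and `newgrid` are hypothetical data frames holding the labeled wells and a prediction grid:

```r
## Logistic regression on all available features; `sweet` is a binary label.
fit <- glm(sweet ~ ., data = train, family = binomial)
summary(fit)  # per-feature Wald tests, usable for feature selection

## The fitted model outputs actual class probabilities, which are then
## thresholded to obtain class labels.
p    <- predict(fit, newdata = newgrid, type = "response")
pred <- p > 0.5  # the threshold itself can be tuned, cf. Sect. 4
```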

3.2 Random Forest

Random forest is an ensemble of multiple decision (classification) trees; see, e.g., Hastie et al. (2009). A decision tree is a greedy approach that recursively partitions the feature space. A single decision tree easily overfits the training data and has potentially a large variance; in particular, with noisy data, the generalization of a single decision tree is poor. To avoid overfitting, the ensemble of decision trees, i.e., the random forest, averages multiple decision trees grown on different resamplings of the training data. Each tree in the ensemble has potentially a high variance, and the averaging over the ensemble reduces this variance. In general, random forest is computationally efficient and easily interpreted. For more details, we refer the reader to Breiman (2001).
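A minimal `randomForest` sketch (again with the hypothetical `train`/`newgrid` data frames), including the variable importance measure that the feature selection in Sect. 4.2 relies on:

```r
library(randomForest)

## Random forest; Sect. 4.1 reports no significant gain beyond ~100 trees.
rf <- randomForest(sweet ~ ., data = train, ntree = 100, importance = TRUE)
importance(rf)  # features with negative importance scores are candidates
                # for exclusion, cf. the tuning in Sect. 4.2
pred <- predict(rf, newdata = newgrid)
```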

3.3 k-Nearest Neighbor (kNN)

The k-nearest neighbor (kNN) algorithm is one of the simpler and more robust supervised learning algorithms; an introduction can be found in any introductory machine learning textbook. The algorithm classifies a new observation, or location, by comparing it with the k nearest observations in the training set and assigning it to the dominant class. The algorithm is completely model-free and nonparametric. However, each new prediction needs its own nearest neighbor search, which makes the algorithm less efficient for large data sets and prediction grids. The best choice of the number of neighbors, k, depends on the data; in our case we use cross validation to find this parameter. In general, small values of k may give noisier results, while larger values of k reduce the effect of noise but make the boundaries between classes less distinct. The algorithm typically improves with more data and is known to work well in simpler classification problems; see also Beyer et al. (1999).
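A sketch of the cross validation for k using `class::knn.cv` (leave-one-out over the training set); `train_x`/`test_x` are hypothetical matrices of normalized features and `train_y` the corresponding labels:

```r
library(class)

## Choose k by leave-one-out cross validation over 1..30, cf. Sect. 4.1.
ks  <- 1:30
err <- sapply(ks, function(k)
  mean(knn.cv(train_x, cl = train_y, k = k) != train_y))
k_best <- ks[which.min(err)]

## Classify new points; there is no fitted model object, so knn() needs the
## training data at prediction time.
pred  <- knn(train_x, test_x, cl = train_y, k = k_best, prob = TRUE)
votes <- attr(pred, "prob")  # winning-class vote fraction, used for the
                             # threshold tuning of Sect. 4.2
```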

3.4 Support Vector Machine (SVM)

Lastly, support vector machine (SVM) classifies data by finding a hyperplane that separates the data. In the case of linearly separable data in two dimensions, the separating hyperplane is a separating line. Figure 4a shows an illustration of a linearly separable case with the separating line marked as the black line and the data points (support vectors) defining the line marked with circles. The dashed lines mark the margins, i.e., the distance from the separating line to the nearest data points.
Fig. 4

Illustration of SVM for (a) a linearly separable data set, (b) a non-separable data set with soft margins, and (c) a nonlinear separating hyperplane

For data sets that are not completely separable, the concept of a soft margin is introduced to allow some data points within the margin. The SVM then attempts to find a hyperplane that separates the data as cleanly as possible, without strictly enforcing that no data fall within the margin (hence the term soft margin). The soft margin is controlled through a regularization parameter, often referred to as C. A large value of this parameter aims for a smaller soft margin and fewer misclassified points; a small value aims for a larger soft margin, allowing more points to be misclassified and yielding a smoother decision boundary. In Fig. 4b, three points have been interchanged between the blue and red classes, making the data set linearly inseparable. The figure shows the separating hyperplane as the black line, with support vectors again marked by circles, and we observe that some points are now allowed within the margins (dashed lines).

The SVM handles nonlinear classification by applying the so-called kernel trick, which allows for nonlinear decision boundaries while the algorithm for the linear SVM can still be applied to determine the hyperplane. The kernel trick can be thought of as mapping the observation points into some higher-dimensional space, in which an optimal separating hyperplane is found; projecting the hyperplane back to the original space yields a nonlinear decision boundary. A typical choice of kernel function is the radial basis function; see, e.g., Hastie et al. (2009). The radial basis function kernel is a scaled version of the Gaussian kernel, in which the squared Euclidean distance between two feature vectors is scaled by a free parameter, in the following denoted γ. Adjusting these parameters allows the decision boundary to range from finely detailed to a coarser distinction between the classes. Figure 4c shows a nonlinear separating boundary.

SVMs are of great interest as sweet spot classifiers, as they are known to perform well in classification problems where the decision regions of the feature space are of a smooth geometric nature, as we expect to be the case in several applications in the geosciences. The SVM is often referred to as an “out-of-the-box” classifier, known for high accuracy and for its ability to deal with high-dimensional data, i.e., usually no preselection of features is needed.

For a more extensive introduction to SVMs, the reader is referred to Bishop (2006) and Cortes and Vapnik (1995).
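A sketch of an RBF-kernel SVM with the two parameters discussed above (in e1071 the regularization parameter C is called `cost`; `train`/`newgrid` are the hypothetical data frames from before):

```r
library(e1071)

## RBF-kernel SVM; cost = C controls the soft margin, gamma scales the kernel.
sv   <- svm(sweet ~ ., data = train, kernel = "radial", cost = 1, gamma = 0.1)
pred <- predict(sv, newdata = newgrid, decision.values = TRUE)
dv   <- attr(pred, "decision.values")  # raw scores behind the class labels;
                                       # the cutoff on these is tuned in Sect. 4.2
```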

4 Numeric Comparisons

In the following, we first outline the setup for validating the various machine learning methods. Next, we report results of several comparisons. Along with the discussion of the results, we present additional tuning of the methods to sharpen and balance the performances.

4.1 Training, Testing, and Validation of Methods

To evaluate the machine learning methods, we use the labeled data in a fitting (training and testing) and validation setup. In each of 100 rounds of validation, we randomly assign 30–70 % of the labeled data for validation, leaving the rest for fitting. In the validation, the fitted methods are applied to the validation data, and the F1 score, True Prediction Rate (TPR), and True Detection Rate (TDR) are recorded. For the fitting itself (training and testing), again 30–70 % is randomly assigned for testing, leaving the rest of the data for training. In both training and testing, cross validation is used to obtain optimal parameters for the algorithms; here we have focused mainly on maximizing the TPR, but also various Fβ-scores. After the 100 rounds of training, testing, and validating, we average the obtained performance measures.

Note that when validating the methods, we randomly choose the observations from all four wells. We also consider a more realistic prediction study, where we sequentially hold out one well, fit the methods on the remaining three wells, and investigate performance on the held-out well.

The optimal parameters, found by cross validation, are those yielding, e.g., the largest TPR score. For kNN we search for the optimal number of nearest neighbors over 0 < k < 30. For the SVM we cross validate the regularization parameter over \( 2^{-5}<C<2^{10} \) and the kernel parameter over \( 2^{-10}<\gamma <2^{5} \). For the random forest algorithm, ensembles of up to a couple of thousand trees were tested; interestingly, we saw no significant change in performance for ensembles of more than 100 trees.
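The grid search can be carried out with `e1071::tune.svm`, swapping the default misclassification error for one minus the score to be maximized via `tune.control`; a sketch under the parameter ranges stated above (the factor levels "sweet"/"non-sweet" are an assumption):

```r
library(e1071)

## Error function for tune(): tune() minimizes, so return 1 - F_beta.
fbeta_err <- function(truth, pred, beta = 1) {
  TP <- sum(pred == "sweet" & truth == "sweet")
  FP <- sum(pred == "sweet" & truth != "sweet")
  FN <- sum(pred != "sweet" & truth == "sweet")
  1 - (1 + beta^2) * TP / ((1 + beta^2) * TP + beta^2 * FN + FP)
}

## Cross validate (gamma, cost) over the stated grids.
tuned <- tune.svm(sweet ~ ., data = train, kernel = "radial",
                  gamma = 2^(-10:5), cost = 2^(-5:10),
                  tunecontrol = tune.control(cross = 10, error.fun = fbeta_err))
tuned$best.parameters  # optimal (gamma, cost) on the chosen criterion
```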

4.2 Results

Table 2 summarizes the performances of random forest, kNN, and SVM applied as described above. We report the performance measures obtained in the fitting, as well as on the validation set. The last column of Table 2 reports the performance of a random classifier, which randomly (with equal probability) assigns predictions as sweet or non-sweet spots. For logistic regression, we were not able to obtain any results better than a TPR of 0.10. The failure of the logistic regression is explained by the weak signal (the correlations between the explanatory variables and the sweet spots are all below 0.1) together with the nonlinear separation in the feature space, as previously described. Inspection of residuals and several corrections, such as feature and subset selection and a change of threshold on the probabilistic output, were tested without success. Other possible extensions of logistic regression, for instance, the introduction of hidden layers (neural nets) (see, e.g., Bishop 2006), are beyond the scope of this paper, and logistic regression is left out of the following discussion.
Table 2

Summary of the performance of random forest, kNN (tuned k), and SVM (tuned C and γ)

 

       Random forest    kNN             SVM             Random
       test   valid     test   valid    test   valid
F1     0.24   0.23      0.51   0.50     0.50   0.49     0.40
TPR    0.33   0.32      0.35   0.34     0.36   0.34     0.33
TDR    0.21   0.19      1.00   1.00     0.88   0.89     0.50

All tuning is optimized for TPR. For each method, the first ("test") column gives the measures obtained on the testing sets, and the second ("valid") column those obtained on the validation sets. The last column is the performance of the random classifier

From Table 2 we observe that all methods perform more or less on par with the random classifier, with quite low detection rates. Both kNN and SVM seem to perform well in terms of high detection rates; these rates, however, are a consequence of several cases of classifying all locations as sweet spots, hence finding all of them at the cost of significant misclassification rates. This suggests that additional fine-tuning, or preprocessing, is needed to realize the potential of the methods.

With further tuning, we were able to obtain better measures for all reported methods. Specifically, the tuning of random forest consists of excluding the features 4D residual and average magnitude of reflectivity, as these features have a negative variable importance score; see Liaw and Wiener (2002). Interestingly, the kNN algorithm was essentially insensitive to this preprocessing (scores unchanged). Furthermore, the SVM actually did worse on the reduced data set, suggesting that the SVM is able to make the feature selection on its own.

Therefore, to further improve performance, we included an additional fine-tuning parameter aimed at a higher level of precision, i.e., a higher TPR, by increasing the threshold each algorithm uses to classify observations into the respective classes. This makes it harder, by requiring more evidence, to classify locations as sweet spots.
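For the SVM, this amounts to sweeping the cutoff applied to the decision values (the default cutoff is 0); a sketch, with `sv` the fitted SVM from the sketch in Sect. 3.4, and `valid`/`truth` hypothetical validation data and logical sweet spot labels:

```r
## Demand stronger evidence: raise the cutoff on the SVM decision values.
pred <- predict(sv, newdata = valid, decision.values = TRUE)
dv   <- drop(attr(pred, "decision.values"))  # sign depends on factor level order

cuts <- quantile(dv, probs = seq(0.50, 0.95, by = 0.05))  # candidate cutoffs
prec <- sapply(cuts, function(cut) {
  hard <- dv > cut                 # stricter than the default cutoff of 0
  sum(hard & truth) / sum(hard)    # precision (TPR) at this cutoff
})
## choose the cutoff that trades the gain in TPR against the drop in TDR
```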

For kNN, this tuning acts on the threshold for the majority vote in the neighborhood; for SVM, on the cutoff applied to the decision function score, as sketched above. As alluded to earlier, the random forest algorithm was not sensitive to additional fine-tuning. For both kNN and SVM, the threshold tuning comes in addition to the tuning of the model parameters. By changing this cutoff, the hope is to modify the fraction of sweet spots detected. Table 3 summarizes the best rates achieved with the tuning described above.
Table 3

Summary of the performance for random forest (excluding the features 4D residual and average magnitude of reflectivity), kNN (also threshold tuned), and SVM (also threshold tuned)

 

       Random forest    kNN             SVM
       test   valid     test   valid    test   valid
F1     0.32   0.34      0.41   0.38     0.21   0.27
TPR    0.38   0.40      0.38   0.35     0.49   0.44
TDR    0.30   0.33      0.61   0.58     0.20   0.26

All tuning is optimized for TPR. Test and validation columns are as in Table 2

Comparing the results in Table 3 with those in Table 2, we see that (prior) feature selection for random forest increases both precision and detection, and random forest seems to be the winner among the three. Additional tuning provided no significant improvements to the kNN algorithm, suggesting that more sophisticated versions of kNN are required, e.g., the popular approach of Friedman (1994) or the more involved Goldberger et al. (2005). The SVM algorithm achieved a considerable increase in the TPR score, indicating good potential for additional fine-tuning of the SVM toward the most important properties (e.g., a predefined balance between TPR and TDR).

To evaluate how the obtained performance measures will transfer to the real field, we now fit the models by sequentially holding out one of the wells. Firstly, Fig. 5a shows predictions in all four wells using random forest with four features, as specified for Table 3. Here we get a visual impression of how well sweet spots are predicted: we note several sweet spots missing from the predictions, as well as sweet spots detected where the labeled data show non-sweet spots. We accompany the plots of predictions with Table 4, which reports the performance measures obtained in the wells.
Fig. 5

Prediction of sweet spots in the four wells using (a) random forest with four features, (b) kNN, (c) fine-tuned SVM, and (d) random forest with four features and without correction of the depth trend. For each well, the left column is the labeled data and the right column the prediction

Table 4

Obtained performance measures when sequentially holding out one well at a time

 

         Random forest     SVM                     kNN
         TPR     TDR       TPR     TDR     β       TPR     TDR
Well 1   0.55    0.15      0.60    0.30    0.30    0.39    1.00
Well 2   0.25    0.33      0.00    0.00    0.40    0.19    0.78
Well 5   0.54    0.18      0.50    0.37    0.30    0.38    0.55
Well 6   0.53    0.42      0.38    0.53    0.45    0.31    1.00

For random forest and kNN, the two columns are TPR and TDR. For the SVM, we additionally report the weight β used in the parameter optimization

Next, Fig. 5b shows predictions in all four wells using kNN, with tuning of the number of neighbors k only (as specified for Table 2). This poor performance is included to illustrate how seemingly “good” performance measures actually transfer to the real field. Although Table 2 might lead us to believe in the predictive power of kNN, here kNN is either useless (as in Wells 1 and 6) or yields quite noisy predictions (as in Wells 2 and 5). Also note that for Well 5 the performance measures in Table 4 are indeed what one would expect from a random classifier.

Acknowledging the need to balance TPR and TDR for the best trade-off, we introduced, for the SVM, additional tuning of the weighting of TPR against TDR. This is done by optimizing the parameters, by cross validation, against the Fβ-score of Eq. (3) for different values of β. The development of the three performance measures TPR, TDR, and Fβ score, as functions of the weight β, is shown for the four wells in Fig. 6. Note that a TDR of 1.0 corresponds to predicting all points as sweet spots, hence detecting all of them at the cost of a large number of misclassifications.
Fig. 6

Development of the three performance measures TPR (solid), TDR (dashed), and Fβ score (gray) as a function of the weight β for the four wells. Left column shows the development on the testing set, while the right column shows the validation set

Selecting an appropriate weight β for each well yields the predictions displayed in Fig. 5c. By appropriate we here refer to the weights that best balance TPR and TDR, typically at the point where the TPR and TDR curves cross in Fig. 6; this “optimal” balance point is determined by inspection of Fig. 6. Table 4 reports the weight β used for each of the wells. In Fig. 5c we now observe that detection has increased compared to random forest, while precision is kept at an acceptable level, indicating good generalization potential.

Note again that predictions in Well 2 more or less fail for all methods. Extracting wells as validation sets introduces a grouping of the observations, and there is reason to be skeptical regarding the results for Well 2. Figure 7 shows pairwise cross plots of some of the features, colored by well. We observe that for Well 2 (red) the features do not coincide with those of the three other wells. This well can therefore be interpreted as significantly noisier, or as representing something different. It is, of course, generally hard for a predictor to predict something it has never seen before. On the other hand, a simple linear classifier may still provide reasonable results, depending on the structure of the underlying problem.
Fig. 7

Cross plots of three pairs of features, colored by well; red marks Well 2. Values are in the normalized domain, hence no units on the axes

As pointed out earlier, four of the features in the data set have been corrected for a depth trend. Figure 5d displays the predictions obtained by applying random forest with depth included as an independent feature. We observe a seemingly good match, indicating a possibly spurious relationship: in the well data of our case study, the majority of the defined sweet spots are indeed located toward the bottom of the reservoir. However, none of the other methods performed acceptably with the depth trend included; their results were indeed worse.

5 Conclusion

In this paper we have illustrated the application of machine learning methods to a small, but challenging, real field case study of sweet spot detection. The data set has weak evidence of sweet spots, and validation of the methods confirms the difficulty of detection. To increase the performance of the methods, we have illustrated and discussed a simple solution. In summary, random forest, given proper preprocessing and feature selection, seems a safe and simple choice, at least for the described data set. Next, SVM shows flexibility and good potential by responding well to tuning of parameters; it is able to obtain acceptable rates and proves transferable to field predictions. However, unguided use of SVM easily leads to poor performance. The simple kNN with the described tuning does not seem to yield trustworthy results, and logistic regression failed at the onset of these analyses and did not recover. In general, machine learning algorithms should be used with caution, and proper preprocessing and guided tuning seem to be needed to obtain reasonable performance.

Acknowledgment

We thank Arne Skorstad and Markus Lund Vevle, both at Emerson Process Management Roxar AS, for the data set and for answering questions related to it.

Bibliography

  1. Al-Anazi A, Gates I (2010) A support vector machine algorithm to classify lithofacies and model permeability in heterogeneous reservoirs. Eng Geol 114(3–4):267–277
  2. Beyer K, Goldstein J, Ramakrishnan R, Shaft U (1999) When is “nearest neighbor” meaningful? In: Database theory — ICDT’99, vol 1540. Springer, Berlin, pp 217–235
  3. Bishop CM (2006) Pattern recognition and machine learning (Information science and statistics). Springer, New York
  4. Breiman L (2001) Random forests. Mach Learn 45(1):5–32
  5. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
  6. Friedman J (1994) Flexible metric nearest neighbor classification. Technical report, Stanford University
  7. Goldberger J, Roweis S, Hinton G, Salakhutdinov R (2005) Neighborhood components analysis. Adv Neural Inf Process Syst 17:513–520
  8. Hastie TJ, Tibshirani R, Friedman JH (2009) The elements of statistical learning: data mining, inference, and prediction. Springer, New York
  9. He H, Garcia E (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263–1284
  10. King G, Zeng L (2001) Logistic regression in rare events data. Polit Anal 9(2):137–163
  11. Li J (2005) Multiattributes pattern recognition for reservoir prediction. CSEG Natl Conv 2005:205–208
  12. Li L, Rakitsch B, Borgwardt K (2011) ccSVM: correcting support vector machines for confounding factors in biological data classification. Bioinformatics 27(13):i342–i348
  13. Liaw A, Wiener M (2002) Classification and regression by randomForest. R News 2(3):18–22
  14. Menard S (2002) Applied logistic regression analysis. Sage, Thousand Oaks
  15. Mood C (2010) Logistic regression: why we cannot do what we think we can do, and what we can do about it. Eur Sociol Rev 26(1):67–82
  16. Platt JC (1999) Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In: Advances in large margin classifiers. MIT Press, Cambridge, pp 61–74
  17. Vonnet J, Hermansen G (2015) Using predictive analytics to unlock unconventional plays. First Break 33(2):87–92
  18. Wohlberg B, Tartakovsky D, Guadagnini A (2006) Subsurface characterization with support vector machines. IEEE Trans Geosci Remote Sens 44(1):47–57

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

Norwegian Computing Center, Oslo, Norway
