Machine Learning Methods for Sweet Spot Detection: A Case Study
In the geosciences, sweet spots are defined as the areas of a reservoir that offer the best production potential. From the outset, it is not always obvious which reservoir characteristics best determine the location, and influence the likelihood, of a sweet spot. Here, we view the detection of sweet spots as a supervised learning problem and use tools and methodology from machine learning to build data-driven sweet spot classifiers. We discuss some popular machine learning methods for classification, including logistic regression, k-nearest neighbors, support vector machines, and random forests, and highlight the strengths and shortcomings of each. In particular, we draw attention to a complex setting and focus on a smaller real-data study with limited evidence for sweet spots, where most of these methods struggle. We illustrate a simple solution that aims to improve the performance of these methods by optimizing for precision. In conclusion, we observe that all methods considered need some form of preprocessing or additional tuning to attain practical utility. While the application of support vector machines and random forests shows a fair degree of promise, we still stress the need for caution against the naive use of machine learning methodology in the geosciences.
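The abstract does not say how the classifiers are optimized for precision. One common way to do it, shown here purely as an illustrative sketch and not as the authors' procedure, is to tune the decision threshold applied to a classifier's predicted sweet spot scores: raise the threshold until precision is as high as possible while recall stays above a chosen floor. The function names `precision_recall` and `best_threshold`, the `min_recall` parameter, and the idea of a per-location score in [0, 1] are all assumptions introduced for this example.

```python
from typing import List, Optional, Tuple


def precision_recall(labels: List[int], scores: List[float],
                     threshold: float) -> Tuple[float, float]:
    """Precision and recall when 'sweet spot' is predicted for score >= threshold.

    labels: 1 for a known sweet spot, 0 otherwise (hypothetical encoding).
    scores: classifier outputs, e.g. predicted probabilities (assumed).
    """
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def best_threshold(labels: List[int], scores: List[float],
                   min_recall: float = 0.5
                   ) -> Optional[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) with the highest precision
    among candidate thresholds whose recall is at least min_recall.

    Candidates are the observed scores, scanned in ascending order, so ties
    in precision are resolved in favour of the lower threshold (higher recall).
    Returns None if no candidate reaches min_recall.
    """
    best: Optional[Tuple[float, float, float]] = None
    for t in sorted(set(scores)):
        p, r = precision_recall(labels, scores, t)
        if r >= min_recall and (best is None or p > best[1]):
            best = (t, p, r)
    return best


if __name__ == "__main__":
    # Made-up scores for ten locations, three of which are sweet spots.
    labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
    scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.60, 0.70, 0.90]
    print("default 0.5 threshold:", precision_recall(labels, scores, 0.5))
    print("precision-tuned:", best_threshold(labels, scores, min_recall=0.5))
```

In a setting with few confirmed sweet spots, the threshold should be chosen on held-out data rather than on the training set, since a threshold tuned on very few positive examples can easily overfit.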
Keywords: Support Vector Machine, Total Organic Carbon, Random Forest, Machine Learning Algorithm, Machine Learning Method
We thank Arne Skorstad and Markus Lund Vevle, both at Emerson Process Management Roxar AS, for the data set and for answering questions related to it.