Machine Learning Methods for Sweet Spot Detection: A Case Study

  • Vera Louise Hauge
  • Gudmund Horn Hermansen
Part of the Quantitative Geology and Geostatistics book series (QGAG, volume 19)


In the geosciences, sweet spots are defined as the areas of a reservoir with the best production potential. From the outset, it is not always obvious which reservoir characteristics best determine the location, and influence the likelihood, of a sweet spot. Here, we view the detection of sweet spots as a supervised learning problem and use tools and methodology from machine learning to build data-driven sweet spot classifiers. We discuss some popular machine learning methods for classification, including logistic regression, k-nearest neighbors, support vector machines, and random forests, and highlight the strengths and shortcomings of each method. In particular, we draw attention to a complex setting and focus on a smaller real data study with limited evidence for sweet spots, where most of these methods struggle. We illustrate a simple solution that aims to increase the performance of these methods by optimizing for precision. In conclusion, we observe that all the methods considered need some form of preprocessing or additional tuning to attain practical utility. While the application of support vector machines and random forests shows a fair degree of promise, we still stress the need for caution against naive use of machine learning methodology in the geosciences.
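The abstract does not say how precision is optimized, so the following is only a minimal sketch of one common way to do it, under stated assumptions: a trained classifier is assumed to produce a score per location, and the decision threshold is tuned to maximize precision while keeping a minimum recall. The function names, the `min_recall` parameter, and the toy labels and scores are all hypothetical and are not taken from the chapter.

```python
from typing import Sequence, Tuple


def precision_recall(labels: Sequence[int], scores: Sequence[float],
                     threshold: float) -> Tuple[float, float]:
    """Precision and recall when score >= threshold is called a sweet spot."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if p and y == 1)
    fp = sum(1 for y, p in zip(labels, predicted) if p and y == 0)
    fn = sum(1 for y, p in zip(labels, predicted) if not p and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def best_threshold_for_precision(labels: Sequence[int],
                                 scores: Sequence[float],
                                 min_recall: float = 0.5
                                 ) -> Tuple[float, float, float]:
    """Return (threshold, precision, recall) with the highest precision
    among thresholds whose recall is at least min_recall.
    Ties in precision are broken in favor of higher recall."""
    best = None
    for t in sorted(set(scores)):  # every observed score is a candidate
        p, r = precision_recall(labels, scores, t)
        if r < min_recall:
            continue
        if best is None or (p, r) > (best[1], best[2]):
            best = (t, p, r)
    if best is None:
        raise ValueError("no threshold reaches the requested recall")
    return best


# Made-up, imbalanced toy data: 3 sweet spots (label 1) among 10 locations.
labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.60, 0.70, 0.90]

default_precision, default_recall = precision_recall(labels, scores, 0.5)
threshold, precision, recall = best_threshold_for_precision(labels, scores)
print(f"threshold 0.50: precision={default_precision:.2f}, recall={default_recall:.2f}")
print(f"threshold {threshold:.2f}: precision={precision:.2f}, recall={recall:.2f}")
```

In practice the threshold would be chosen on held-out data rather than on the data used to fit the classifier, and the recall floor reflects how many missed sweet spots are acceptable in exchange for fewer false alarms.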


Keywords (machine-generated, not supplied by the authors): Support Vector Machine; Total Organic Carbon; Random Forest; Machine Learning Algorithm; Machine Learning Method



We thank Arne Skorstad and Markus Lund Vevle, both at Emerson Process Management Roxar AS, for the data set and for answering questions related to it.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Norwegian Computing Center, Oslo, Norway
