Introduction to Geoscience Data Analytics Using Machine Learning

  • Y. Z. Ma


Before the arrival of big data, statistical methods used in science and engineering were predominantly model-based, with an emphasis on unbiased estimation. Although many traditional statistical methods work well with small datasets and a proper experimental design, they are less effective at handling some of the problems that have arisen from big data. Artificial intelligence (AI) has led the way in mining big data to discover patterns and regularities and to make predictions for scientific and technical applications. Although the movement was initially led by computer scientists, statisticians, scientists, and engineers are now all involved, strengthening the trend.

In exploration and production, data have also grown exponentially. Extracting information from data and making predictions from it have become increasingly important. Most data in exploration and production are soft data: each source may tell us something, but no source tells us everything, and few data give a definitive answer. Hard data remain sparse. How to integrate big soft data with small hard data is a central challenge for reservoir characterization and modeling. This chapter presents an introduction to machine learning and applications of neural networks to geoscience data analytics.
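To make the soft-data/hard-data idea concrete, the sketch below trains a small one-hidden-layer feed-forward network by batch backpropagation to calibrate abundant soft data (two synthetic seismic attributes) against sparse hard data (porosity values at well locations). All data, attribute names, and coefficients here are illustrative assumptions, not values from the chapter.

```python
import numpy as np

# Minimal one-hidden-layer network trained with batch gradient descent
# (backpropagation). Everything below is synthetic and illustrative:
# the "attributes" and the porosity relation are assumptions, not field data.
rng = np.random.default_rng(0)
n = 400
impedance = rng.normal(0.0, 1.0, n)   # standardized acoustic impedance (synthetic)
amplitude = rng.normal(0.0, 1.0, n)   # standardized seismic amplitude (synthetic)
# Assumed relation: porosity decreases with impedance, plus measurement noise.
porosity = 0.20 - 0.04 * impedance + 0.01 * amplitude + rng.normal(0, 0.005, n)

X = np.column_stack([impedance, amplitude])   # soft data, shape (n, 2)
y = porosity.reshape(-1, 1)                   # hard data, shape (n, 1)

h = 8                                          # hidden units
W1 = rng.normal(0, 0.5, (2, h)); b1 = np.zeros(h)
W2 = rng.normal(0, 0.5, (h, 1)); b2 = np.zeros(1)
lr = 0.1

for _ in range(5000):
    # forward pass
    z = np.tanh(X @ W1 + b1)          # hidden activations
    pred = z @ W2 + b2                # linear output layer
    err = pred - y
    # backward pass: gradients of mean-squared error
    g2 = z.T @ err / n
    gb2 = err.mean(axis=0)
    dz = (err @ W2.T) * (1 - z ** 2)  # backpropagate through tanh
    g1 = X.T @ dz / n
    gb1 = dz.mean(axis=0)
    # gradient-descent updates
    W2 -= lr * g2; b2 -= lr * gb2
    W1 -= lr * g1; b1 -= lr * gb1

mse = float(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2))
```

After training, `mse` should approach the noise variance of the synthetic porosity, far below the variance of porosity itself, i.e., the network has learned the assumed attribute-to-porosity mapping. In practice the attributes would come from seismic volumes and the targets from well logs, with held-out wells used for validation.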



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Y. Z. Ma, Schlumberger, Denver, USA
