Combination of Machine Learning and Kriging for Spatial Estimation of Geological Attributes

  • Original Paper
  • Published in: Natural Resources Research

Abstract

A growing number of studies in the spatial estimation of geological attributes use machine learning (ML) models, because these models promise efficient estimation, especially in non-Gaussian, non-stationary, and complex settings. However, ML models have two major limitations: (1) the data are treated as independent and identically distributed (i.e., spatially uncorrelated), and (2) the data are not reproduced at their locations. Kriging, on the other hand, has a long history of generating unbiased estimates with minimum error variance at unsampled locations, but it assumes stationarity and linearity. This study proposes a methodology that combines kriging and ML models to mitigate the disadvantages of each and obtain more accurate estimates. In the proposed methodology, a stacked ensemble model, also referred to as the super learner (SL) model, is applied for ML modeling. We show how the estimates generated by the SL model and those obtained from kriging can be combined through a weighting function based on the kriging variance, with the weights optimized using sequential quadratic programming. The methodology is demonstrated in two synthetic case studies containing non-stationary and non-Gaussian data; a real case study using a dataset from an oil sands deposit is also presented. The performance of the combined model is compared with that of the SL model and kriging using the coefficient of determination (R-squared), root mean squared error, and mean absolute error. The combined model yields more accurate estimates than either the SL model or kriging alone in all cases.
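The abstract outlines the combination step: kriging and SL estimates are merged through a weighting function driven by the kriging variance, with the weights tuned by sequential quadratic programming. The paper's exact weighting function is not reproduced in this excerpt, so the sketch below uses a hypothetical logistic weight fitted on synthetic data, with SciPy's SLSQP solver standing in for the SQP optimization:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy validation data: true values plus two sets of estimates.
truth = rng.normal(10.0, 2.0, 200)
krig_est = truth + rng.normal(0.0, 1.0, 200)   # kriging estimates
krig_var = rng.uniform(0.2, 2.0, 200)          # kriging variances
ml_est = truth + rng.normal(0.3, 0.8, 200)     # super-learner estimates

def combined(params, kv):
    # Hypothetical weighting: the weight on kriging is a logistic
    # function of the kriging variance, so low-variance (confident)
    # kriging estimates receive more weight.
    a, b = params
    w = 1.0 / (1.0 + np.exp(a + b * kv))
    return w * krig_est + (1.0 - w) * ml_est, w

def mse(params):
    # Objective: mean squared error of the combined estimate.
    est, _ = combined(params, krig_var)
    return np.mean((est - truth) ** 2)

# SLSQP is SciPy's sequential quadratic programming routine.
res = minimize(mse, x0=[0.0, 0.0], method="SLSQP",
               bounds=[(-5.0, 5.0), (-5.0, 5.0)])
est, w = combined(res.x, krig_var)
```

The logistic form and its two parameters are illustrative assumptions; the point of the sketch is only that the variance-dependent weight is a smooth, bounded function whose parameters can be fitted by an SQP solver against held-out validation error.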





Acknowledgments

The authors thank the industrial sponsors of the Centre for Computational Geostatistics (CCG) for providing the resources to prepare this manuscript.

Author information


Corresponding author

Correspondence to Gamze Erdogan Erten.


About this article


Cite this article

Erdogan Erten, G., Yavuz, M. & Deutsch, C.V. Combination of Machine Learning and Kriging for Spatial Estimation of Geological Attributes. Nat Resour Res 31, 191–213 (2022). https://doi.org/10.1007/s11053-021-10003-w

