Combination of Machine Learning and Kriging for Spatial Estimation of Geological Attributes

  • Original Paper
  • Published in: Natural Resources Research

Abstract

A growing number of studies in the spatial estimation of geological attributes use machine learning (ML) models, because these models promise efficient estimation, especially in non-Gaussian, non-stationary, and complex settings. However, ML models have two major limitations: (1) the data are treated as independent and identically distributed (i.e., spatially uncorrelated), and (2) the data are not reproduced at their locations. Kriging, on the other hand, has a long history of generating unbiased estimates with minimum error variance at unsampled locations, but it assumes stationarity and linearity. This study proposes a methodology that combines kriging and ML models to mitigate the disadvantages of each and obtain more accurate estimates. In the proposed methodology, a stacked ensemble model, also referred to as the super learner (SL) model, is applied for ML modeling. We show how the estimates generated by the SL model and those obtained from kriging can be combined through a weighting function based on the kriging variance, with the weights optimized using sequential quadratic programming. The methodology is demonstrated in two synthetic case studies containing non-stationary and non-Gaussian data; a real case study using a dataset from an oil sands deposit is also presented. The performance of the combined model is compared with that of the SL model and kriging using the coefficient of determination (R-squared), root mean squared error, and mean absolute error. The combined model yields more accurate estimates than either the SL model or kriging alone in all cases.
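The abstract outlines the combination step: kriging and SL estimates are merged through a weighting function driven by the kriging variance, with the weights tuned by sequential quadratic programming. The paper's exact weighting function is not reproduced in this excerpt, so the sketch below uses a hypothetical logistic weight fitted on synthetic data, with SciPy's SLSQP solver standing in for the SQP optimization:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy validation data: true values plus two sets of estimates.
truth = rng.normal(10.0, 2.0, 200)
krig_est = truth + rng.normal(0.0, 1.0, 200)   # kriging estimates
krig_var = rng.uniform(0.2, 2.0, 200)          # kriging variances
ml_est = truth + rng.normal(0.3, 0.8, 200)     # super-learner estimates

def combined(params, kv):
    # Hypothetical weighting: the weight on kriging is a logistic
    # function of the kriging variance, so low-variance (confident)
    # kriging estimates receive more weight.
    a, b = params
    w = 1.0 / (1.0 + np.exp(a + b * kv))
    return w * krig_est + (1.0 - w) * ml_est, w

def mse(params):
    # Objective: mean squared error of the combined estimate.
    est, _ = combined(params, krig_var)
    return np.mean((est - truth) ** 2)

# SLSQP is SciPy's sequential quadratic programming routine.
res = minimize(mse, x0=[0.0, 0.0], method="SLSQP",
               bounds=[(-5.0, 5.0), (-5.0, 5.0)])
est, w = combined(res.x, krig_var)
```

The logistic form and its two parameters are illustrative assumptions; the point of the sketch is only that the variance-dependent weight is a smooth, bounded function whose parameters can be fitted by an SQP solver against held-out validation error.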





Acknowledgments

The authors thank the industrial sponsors of the Centre for Computational Geostatistics (CCG) for providing the resources to prepare this manuscript.

Author information


Corresponding author

Correspondence to Gamze Erdogan Erten.


About this article


Cite this article

Erdogan Erten, G., Yavuz, M. & Deutsch, C.V. Combination of Machine Learning and Kriging for Spatial Estimation of Geological Attributes. Nat Resour Res 31, 191–213 (2022). https://doi.org/10.1007/s11053-021-10003-w

