Abstract
A method for predictive lithological mapping is proposed that combines geostatistical simulation of geochemical concentrations with coregionalization analysis and a decision-tree classification algorithm. Each target point is classified from simulated values of the geochemical concentrations, from which the short-scale spatial components corresponding to noise and measurement errors have been filtered out. The procedure is repeated over many simulations, and the most probable lithology at each target point is retained as the final result. The method is applied to a set of geochemical samples of soils and surface rocks for which lithology is recorded from an interpretive geological field map. Pre-processing the sampling data through geostatistical simulation with filtering of the nugget effect significantly improves classification: the rate of correctly classified data increases by 3.5 to 11 percentage points, depending on whether the training or the testing data subset is considered. The lithological prediction makes it possible to generate geological maps as a complement to mineral resource exploration, in order to forecast and/or validate the geology mapped at each point of the explored areas.
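The final aggregation step, in which the most probable lithology is retained at each target point across many simulations, can be illustrated with a minimal sketch. It assumes that each simulation has already been filtered and classified, so the input is simply one list of predicted labels per realization. The function name `most_probable_lithology`, the lithology names, and the toy predictions are hypothetical and are not taken from the article.

```python
from collections import Counter


def most_probable_lithology(labels_per_realization):
    """Return (label, frequency) for each target point.

    labels_per_realization: one inner list per simulation, each giving the
    predicted lithology at every target point, in the same point order.
    """
    n_realizations = len(labels_per_realization)
    result = []
    # zip(*...) regroups the labels so that each tuple holds all the
    # predictions made for a single target point.
    for point_labels in zip(*labels_per_realization):
        label, votes = Counter(point_labels).most_common(1)[0]
        result.append((label, votes / n_realizations))
    return result


# Hypothetical predictions from 4 realizations at 3 target points.
predictions = [
    ["granite", "schist", "andesite"],
    ["granite", "schist", "schist"],
    ["granite", "andesite", "schist"],
    ["schist", "schist", "schist"],
]
summary = most_probable_lithology(predictions)
```

The frequency returned alongside each label can be read as an empirical probability of that lithology at the point, which gives a simple measure of how confident the prediction is.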
Acknowledgments
This research was funded by the National Agency for Research and Development of Chile, through grant ANID/CONICYT PIA AFB180004; by the Ministry of Higher Education, Science, Technology and Innovation of Ecuador (SENESCYT), through the scholarship program “Open Call 2012 Second Phase” of the government of Ecuador; and by the Particular Technical University of Loja-Ecuador.
Cite this article
Guartán, J.A., Emery, X. Regionalized Classification of Geochemical Data with Filtering of Measurement Noises for Predictive Lithological Mapping. Nat Resour Res 30, 1033–1052 (2021). https://doi.org/10.1007/s11053-020-09779-0
DOI: https://doi.org/10.1007/s11053-020-09779-0