Surface Warping Incorporating Machine Learning Assisted Domain Likelihood Estimation: A New Paradigm in Mine Geology Modeling and Automation

Published in Mathematical Geosciences (Special Issue)

A Correction to this article was published on 11 November 2021.

Abstract

In surface mining, assay measurements taken from production drilling often provide useful information that enables initially inaccurate surfaces (for example, mineralization boundaries) created using sparse exploration data to be revised and subsequently improved. Recently, a Bayesian warping technique was proposed to reshape modeled surfaces using geochemical observations and spatial constraints imposed by newly acquired blasthole data. This paper focuses on incorporating machine learning (ML) into this warping framework to make the likelihood computation generalizable. The technique works by adjusting the position of vertices on the surface to maximize the integrity of modeled geological boundaries with respect to sparse geochemical observations. Its foundation is laid by a Bayesian derivation in which the geological domain likelihood given the chemistry, \(p(g\!\mid \!{\mathbf {c}})\), plays a role similar to \(p(y({\mathbf {c}})\!\mid \! g)\) for certain categorical mappings \(y:{\mathbb {R}}^K\rightarrow {\mathbb {Z}}\). This observation allows a manually calibrated process centered on the latter to be automated, since ML techniques may be used to estimate the former in a data-driven way. Machine learning performance is evaluated for gradient boosting, neural network, random forest, and other classifiers in a binary and multi-class context using precision and recall rates. Once ML likelihood estimators are integrated into the surface warping framework, surface shaping performance is evaluated using unseen data by examining the categorical distribution of test samples located above and below the warped surface. Large-scale validation experiments are performed to assess the overall efficacy of ML-assisted surface warping as a fully integrated component within an ore grade estimation system, where the posterior mean is obtained via Gaussian process (GP) inference with a Matérn 3/2 kernel. This article illustrates an application of machine learning within a complex system where grade estimation is accomplished by integrating boundary warping with ML and other components.

Authorship Statement

This paper was conceptualized and written by the first author. Raymond Leung performed the research, experiments, technical integration, and analysis. Mehala Balamurali and Alexander Lowe conducted preliminary investigations focused on binary domain classification and multi-class analysis of dolerite and hydrated domains. Those findings have informed and motivated this research.

Notes

  1. Material properties, for example, whether the rock is hard or friable, lumpy or fine, viscous or powdery, are important considerations for downstream ore processing. For instance, viscous materials can clog equipment in the processing plant.

  2. Recall that \({\mathbf {x}}\) and \(\varvec{\delta }\) denote the location and dimensions of the sample.

  3. For multi-class probability estimation, the one-versus-one training and cross-validation procedure described in Chang and Lin (2011) is employed.

  4. Previous studies have shown that chemical compositional data points (the \({\mathbf {c}}\) in \(p(g\!\mid \!{\mathbf {c}})\)) lie on the Aitchison simplex (Leung et al. 2019; Tsagris et al. 2011).

  5. In the bottom half of Fig. 5, each row corresponds to a particular geozone \(g_\text {actual}\), and the vertical axis is limited to \([-3,0]\) in logarithmic scale. The color of each histogram reflects the sample size for the relevant geozone (a darker shade indicates more samples).

  6. Except that the dynamic aspect is absent here; the probability distribution is not updated as information is gained over time.

  7. This varies from model to model, as the block spatial structures induced by the relevant warped surfaces are different.

References

  • Acosta ICC, Khodadadzadeh M, Tusa L, Ghamisi P, Gloaguen R (2019) A machine learning framework for drill-core mineral mapping using hyperspectral and high-resolution mineralogical data fusion. IEEE J Sel Top Appl Earth Observ Remote Sens 12(12):4829–4842

  • Breiman L (2001) Random forests. Mach Learn 45(1):5–32

  • Brier GW (1950) Verification of forecasts expressed in terms of probability. Mon Weather Rev 78(1):1–3

  • Caumon G, Collon-Drouaillet P, De Veslud CLC, Viseur S, Sausse J (2009) Surface-based 3D modeling of geological structures. Math Geosci 41(8):927–945

  • Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol (TIST) 2(3):27

  • Chieregati A, Delboni H, Coimbra Leite Costa J (2008) Sampling for proactive reconciliation practices. Min Technol 117(3):136–141

  • Clout J (2006) Iron formation-hosted iron ores in the Hamersley Province of Western Australia. Appl Earth Sci 115(4):115–125

  • Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297

  • Cressie N (1985) Fitting variogram models by weighted least squares. J Int Assoc Math Geol 17(5):563–586

  • Cressie N (2015) Statistics for spatial data. Wiley, Hoboken

  • Egozcue JJ, Pawlowsky-Glahn V, Mateu-Figueras G, Barceló-Vidal C (2003) Isometric logratio transformations for compositional data analysis. Math Geol 35(3):279–300

  • Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232

  • Garrett RG, Reimann C, Hron K, Kynčlová P, Filzmoser P (2017) Finally, a correlation coefficient that tells the geochemical truth. Newsl Assoc Appl Geochem 176

  • Glorot X, Bengio Y (2010) Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the international conference on artificial intelligence and statistics, pp 249–256

  • Greenacre M, Grunsky E et al (2019) The isometric logratio transformation in compositional data analysis: a practical evaluation. Economics Working Paper Series, Technical Report 1627, Barcelona Graduate School of Economics

  • Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction. Springer, Berlin

  • He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE international conference on computer vision, pp 1026–1034

  • Hinton GE (1990) Connectionist learning procedures. In: Machine learning. Elsevier, pp 555–610

  • Horrocks T, Holden EJ, Wedge D, Wijns C, Fiorentini M (2019) Geochemical characterisation of rock hydration processes using t-SNE. Comput Geosci 124:46–57

  • Jewbali A, Ramos FT, Melkumyan A (2011) A non-parametric Bayesian framework for automatic block estimation. In: Proceedings of the APCOM symposium, AusIMM, paper 056, pp 1–20

  • Karpatne A, Ebert-Uphoff I, Ravela S, Babaie HA, Kumar V (2018) Machine learning for the geosciences: challenges and opportunities. IEEE Trans Knowl Data Eng 31(8):1544–1554

  • Khushaba RN, Melkumyan A, Hill AJ (2021) A machine learning approach for material type logging and chemical assaying from autonomous measure-while-drilling (MWD) data. Math Geosci (accepted)

  • Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv:1412.6980

  • Leung R (2020) Modelling orebody structures: block merging algorithms and block model spatial restructuring strategies given mesh surfaces of geological boundaries. J Spat Inf Sci 21:137–174. https://doi.org/10.5311/JOSIS.2020.21.582

  • Leung R (2021b) Empirical observations on the effects of data transformation in machine learning classification of geological domains. arXiv:2106.05855

  • Leung R, Balamurali M, Melkumyan A (2019) Sample truncation strategies for outlier removal in geochemical data: the MCD robust distance approach versus t-SNE ensemble clustering. Math Geosci 53:105–130. https://doi.org/10.1007/s11004-019-09839-z

  • Leung R, Lowe A, Chlingaryan A, Melkumyan A, Zigman J (2021a) Bayesian surface warping approach for rectifying geological boundaries using displacement likelihood and evidence from geochemical assays. ACM Trans Spatial Algorithms Syst (to appear). https://doi.org/10.1145/3476979. arXiv:2005.14427

  • Lou W, Wang X, Chen F, Chen Y, Jiang B, Zhang H (2014) Sequence based prediction of DNA-binding proteins based on hybrid feature selection using random forest and Gaussian naive Bayes. PLOS ONE 9(1):e86703

  • Melkumyan A, Ramos F (2009) A sparse covariance function for exact Gaussian process inference in large datasets. In: International joint conference on artificial intelligence (IJCAI), vol 9, pp 1936–1942

  • Melkumyan A, Ramos F (2011a) Multi-kernel Gaussian processes. In: International joint conference on artificial intelligence (IJCAI)

  • Melkumyan A, Ramos F (2011b) Non-parametric Bayesian learning for resource estimation in the autonomous mine. In: Proceedings of the APCOM symposium, pp 209–215

  • Murphy KP et al (2006) Naive Bayes classifiers. Lecture notes (CS340-Fall), University of British Columbia

  • Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830

  • Platt J et al (1999) Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Adv Large Margin Classifiers 10(3):61–74

  • Rousseeuw PJ (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20:53–65

  • Sommerville B, Boyle C, Brajkovich N, Savory P, Latscha AA (2014) Mineral resource estimation of the Brockman 4 iron ore deposit in the Pilbara region. Appl Earth Sci 123(2):135–145

  • Song Y, Liang J, Lu J, Zhao X (2017) An efficient instance selection algorithm for k nearest neighbor regression. Neurocomputing 251:26–34

  • Strasdat H, Montiel JM, Davison AJ (2012) Visual SLAM: why filter? Image Vis Comput 30(2):65–77

  • Tahmasebi P, Hezarkhani A (2012) A hybrid neural networks-fuzzy logic-genetic algorithm for grade estimation. Comput Geosci 42:18–27

  • Tolosana-Delgado R, Mueller U, van den Boogaart KG (2019) Geostatistics for compositional data: an overview. Math Geosci 51:485–526. https://doi.org/10.1007/s11004-018-9769-3

  • Tsagris M, Preston S, Wood A (2011) A data-based power transformation for compositional data. In: CoDaWork'11: 4th international workshop on compositional data analysis, Girona, Spain

  • Vasudevan S (2012) Data fusion with Gaussian processes. Robot Auton Syst 60(12):1528–1544

  • Vasudevan S, Ramos F, Nettleton E, Durrant-Whyte H (2010) Heteroscedastic Gaussian processes for data fusion in large scale terrain modeling. In: 2010 IEEE international conference on robotics and automation, IEEE, pp 3452–3459

  • Wackernagel H (2013) Multivariate geostatistics: an introduction with applications. Springer, Berlin

  • Wedge D, Lewan A, Paine M, Holden EJ, Green T (2018) A data mining approach to validating drill hole logging data in Pilbara iron ore exploration. Econ Geol 113(4):961–972

  • Williams CK, Rasmussen CE (2006) Gaussian processes for machine learning, vol 2. MIT Press, Cambridge

  • Yu HF, Huang FL, Lin CJ (2011) Dual coordinate descent methods for logistic regression and maximum entropy models. Mach Learn 85(1–2):41–75

  • Zadrozny B, Elkan C (2001) Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers. In: Proceedings of the international conference on machine learning, pp 609–616

  • Zadrozny B, Elkan C (2002) Transforming classifier scores into accurate multiclass probability estimates. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 694–699

  • Zhu C, Byrd RH, Lu P, Nocedal J (1997) Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans Math Softw 23(4):550–560

Acknowledgements

This work was supported by the Australian Centre for Field Robotics and the Rio Tinto Centre for Mine Automation. The authors would like to acknowledge Corentin Plouët and John Zigman for their software contributions and for refactoring the code. Their efforts have facilitated this extension, allowing various ML techniques to be evaluated within an integrated surface warping framework. Katherine Silversides is thanked for proofreading this paper.

Author information

Correspondence to Raymond Leung.

Additional information

The original online version of this article was revised because the figure belonging to Table 3 was missing.

Appendix A: Validation Experiments for Grade Models Built Using ML Surface Warping and Gaussian Processes

The goal of these experiments is to provide compelling evidence that the proposed ML surface warping approach is competitive with the handcrafted y-chart surface warping approach when used in a real orebody grade estimation system. In terms of boundary representation, Leung et al. (2021a) have shown that inaccuracies in a modeled surface can affect how well the geochemistry can be estimated. Readers are referred to Leung et al. (2021a) for a demonstration of how a poorly estimated surface affects the spatial structure and inferencing ability of the resultant model.

The overall efficacy of ML surface warping can be measured by running end-to-end experiments. This involves using the warped surfaces produced by various machine learning techniques to update the spatial structure of a block model (described in Leung (2020)) and feeding this structure and sparse assay measurements to a Gaussian process inferencing engine (Leung et al. 2021a) to produce a block model with an estimated concentration of various chemical components. Once this process is complete, a quality assessment is performed. The proposed procedure, \(\hbox {R}_{2}\) reconciliation, compares the inferred values with grade-block reference values. A grade block refers to a generally non-convex polygonal region that has been extended to a prism by the height of a mining bench. Grade blocks are created for grade-control and excavation purposes; therefore, they exist at the mining scale and are much larger than the blocks within the block model. They contain location, volumetric information, and geologist-validated compositions and serve as the ground truth. This assessment captures the full complexity of the interactions between components and overcomes the deficiencies noted in Sect. 6.3.
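
To make the end-to-end experiment explicit, the sketch below outlines this pipeline in Python. Every name here is a hypothetical placeholder standing in for a published component (the block restructuring of Leung (2020), the GP inferencing engine of Leung et al. (2021a), and the \(\hbox {R}_{2}\) reconciliation described in Sect. A.2); it is not an actual API.

```python
from typing import Callable, Dict, Iterable

def validate_model(surfaces, block_model, assays, grade_blocks,
                   chemicals: Iterable[str],
                   restructure_blocks: Callable,   # stands in for Leung (2020)
                   gp_infer: Callable,             # stands in for Leung et al. (2021a)
                   r2_reconcile: Callable) -> Dict[str, float]:
    """Hypothetical end-to-end validation driver (illustrative only)."""
    # 1. Update the block model's spatial structure using the warped surfaces
    structured = restructure_blocks(block_model, surfaces)
    scores = {}
    for chem in chemicals:                          # e.g. Fe, SiO2, Al2O3, P
        # 2. Estimate grades in block volumes from sparse assay measurements
        estimated = gp_infer(structured, assays, chem)
        # 3. Compare volume-weighted predictions with grade-block ground truth
        scores[chem] = r2_reconcile(estimated, grade_blocks, chem)
    return scores
```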

A.1 Inferencing via Gaussian Process

Before describing the validation procedure, it is worth addressing the topic of Gaussian processes (GPs) and mentioning their connections with geostatistics. A GP may be viewed as a non-parametric Bayesian regression technique. Initially proposed under the name kriging in the geostatistical literature (Cressie 2015), it is an essential tool for modeling spatial stochastic processes. As a supervised learning problem, the general idea is to compute the predictive distribution \(f({\mathbf {x}}_*)\) at new locations \({\mathbf {x}}_*\) where \(y_*\) is unknown, given a training set \({\mathcal {D}}=\{{\mathbf {x}}_i,y_i\}_{i=1}^N\) comprising N input points \({\mathbf {x}}_i\in {\mathbb {R}}^D\) and N outputs \(y_i\in {\mathbb {R}}\). In this work, \({\mathbf {x}}_i\) represents the spatial coordinates and dimensions of an assay sample, and \(y_i\) denotes the concentration of a chemical component such as Fe or SiO2. Unlike kriging, where variograms (Cressie 1985) play a pivotal role, in GP the random process \(f({\mathbf {x}})\sim \mathcal {GP}(m({\mathbf {x}}),k({\mathbf {x}},{\mathbf {x}}'))\) is characterized by a covariance function \(k({\mathbf {x}},{\mathbf {x}}')\). For instance, the Matérn 3/2 covariance function (1D kernel) is given by \(k(x,x';\theta )=\sigma \left( 1+\sqrt{3}d/\rho \right) \exp (-\sqrt{3}d/\rho )\), where \(d=\left| x-x'\right| \) and the hyperparameters are \(\theta =[\rho ,\sigma ]\).

Grouping points within the training set together, the input, output, and distribution may be written collectively as \((X,{\mathbf {y}},{\mathbf {f}})\equiv (\{{\mathbf {x}}_i\},\{y_i\},\{f_i\})_{i=1}^N\) in matrix and vector form; similarly, \((X_*,{\mathbf {y}}_*,{\mathbf {f}}_*)\equiv (\{{\mathbf {x}}_{*,i}\},\{y_{*,i}\},\{f_{*,i}\})_{i=1}^{N_*}\) for the test points. Conditioning on the observed training points, the predictive distribution may be computed as \(p({\mathbf {f}}_*\!\mid \!X_*,X,{\mathbf {y}})={\mathcal {N}}(\varvec{\mu }_*,\varvec{\varSigma }_*)\), where the posterior mean and covariance are given by \(\varvec{\mu }_*=K(X_*,X)K_y^{-1}{\mathbf {y}}\) and \(\varvec{\varSigma }_*=K(X_*,X_*)-K(X_*,X)K_y^{-1}K(X,X_*)+\sigma ^2 I\), respectively, with \(K_y=K(X,X)+\sigma ^2 I\). The key observation is that the predictive mean is a linear combination of N kernel functions, each centered on a training point (Melkumyan and Ramos 2009): \(\varvec{\mu }_*=\sum _{i=1}^N \alpha _i k({\mathbf {x}}_i,{\mathbf {x}}_*;\theta )\) with \(\varvec{\alpha }=K_y^{-1} {\mathbf {y}}\). In practice, this amounts to a length-scale (\(\theta \))-dependent estimation of the local value using a subset of observations. The optimal hyperparameters \(\theta =(\rho _x,\rho _y,\ldots ,\sigma )\) are obtained by maximizing the log marginal likelihood \(\log p({\mathbf {y}}\!\mid \!X,\theta )\). The technical details are described in Melkumyan and Ramos (2009, 2011b) and Williams and Rasmussen (2006).
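
For concreteness, the following is a minimal numpy sketch of GP posterior inference with a Matérn 3/2 kernel. The isotropic kernel, the hyperparameter values, and the toy Fe data are illustrative assumptions; in the actual system, per-domain anisotropic hyperparameters are learned by maximizing the log marginal likelihood.

```python
import numpy as np

def matern32(X1, X2, rho=25.0, sigma=1.0):
    # Matern 3/2 kernel: k(d) = sigma * (1 + sqrt(3) d / rho) * exp(-sqrt(3) d / rho)
    d = np.linalg.norm(X1[:, None, :] - X2[None, :, :], axis=-1)
    s = np.sqrt(3.0) * d / rho
    return sigma * (1.0 + s) * np.exp(-s)

def gp_posterior(X, y, X_star, noise=0.1, rho=25.0, sigma=1.0):
    # Posterior mean and covariance at X_star, conditioned on training data (X, y)
    K_y = matern32(X, X, rho, sigma) + noise**2 * np.eye(len(X))
    K_star = matern32(X_star, X, rho, sigma)
    alpha = np.linalg.solve(K_y, y)                  # alpha = K_y^{-1} y
    mu = K_star @ alpha                              # mu_* = K(X_*, X) K_y^{-1} y
    cov = (matern32(X_star, X_star, rho, sigma)
           - K_star @ np.linalg.solve(K_y, K_star.T))
    return mu, cov

# Toy usage: estimate Fe (%) at five unsampled block centroids
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 100.0, size=(50, 3))            # assay sample coordinates (m)
y = 60.0 + rng.normal(0.0, 1.0, size=50)             # Fe (%) observations
mu, cov = gp_posterior(X, y, rng.uniform(0.0, 100.0, size=(5, 3)))
```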

In fairness, many technical alternatives exist. For instance, Melkumyan and Ramos (2011a) investigated multi-task GP and used it to infer the concentration of several, rather than a single, chemical component. The multi-task inference problem, perhaps better known as co-kriging (Wackernagel 2013), involves exploiting co-dependencies between multiple outputs given \({\mathbf {y}}\in {\mathbb {R}}^M\). In Vasudevan et al. (2010) and Vasudevan (2012), heteroscedastic GP is used to fuse multiple data sets with different noise parameters. That work is notable for deriving recursive expressions for the conditional mean and variance, showing that the difference in posterior uncertainty, as successive data sets are added, is a positive semi-definite matrix; this guarantees that the uncertainty either remains the same or decreases, but never increases. Although these approaches show tremendous potential, in this paper, ordinary GP is used for grade estimation into block volumes (Jewbali et al. 2011), and the hyperparameters are learned for each individual domain and chemical component.

A.2 Validation Method: \(\hbox {R}_{2}\) Reconciliation and General Interpretations

The validation experiments are conducted on ML-warped surface block models for deposit P9. Each grade model contains approximately 6.5 million blocks, equating to roughly 7.575 million tons. Each model comparison yields on average \(15,702\pm 158\) pairwise intersections (Note 7) with 237 grade blocks. \(\hbox {R}_{2}\) reconciliation compares the model-predicted chemistry against the ground truth; this involves computing volume-weighted model-predicted values based on each block's volumetric intersection with the grade blocks. By convention, the \(\hbox {R}_{2}\) value is defined (based on the "actual-versus-estimate" adjustment concept known as the mine call factor, described in Chieregati et al. (2008)) as "grade_block_value / model_predicted_value." Thus, a ratio less than 1 implies that the model is overestimating; conversely, a ratio greater than 1 implies that the model is underestimating. The comparisons are performed on a bench-by-bench basis. Each bench measures 10 m in height, with the five benches of interest designated 70, 80, 90, 100, and 110. One \(\hbox {R}_{2}\) value is reported for each chemical (in this paper, the main focus is on Fe, SiO2, Al2O3, and P).
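
As a minimal illustration of the reconciliation arithmetic (the data layout shown is an assumption, not the system's actual interface), the \(\hbox {R}_{2}\) ratio for a single grade block may be computed as follows.

```python
import numpy as np

def r2_for_grade_block(gb_value, intersections):
    """R2 for one grade block: gb_value / volume-weighted model prediction.

    `intersections` holds (volume, model_block_value) pairs, one per model
    block intersecting the grade block.  A ratio < 1 means the model
    overestimates; a ratio > 1 means it underestimates.
    """
    v = np.array([i[0] for i in intersections], dtype=float)
    pred = np.array([i[1] for i in intersections], dtype=float)
    model_value = (v * pred).sum() / v.sum()    # volume-weighted prediction
    return gb_value / model_value

# e.g. a grade block with Fe = 62.1% intersecting three model blocks
r2 = r2_for_grade_block(62.1, [(120.0, 61.0), (80.0, 63.5), (40.0, 62.8)])
```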

Following the approach in Leung et al. (2021a), \(\hbox {R}_{2}\) cdf error scores are computed to facilitate performance comparison. In essence, the \(\hbox {R}_{2}\) error score, \(e_{\text {R}_2}\), is obtained by integrating the difference between the ideal and actual cdf curves from a plot where the x and y axes correspond to sorted \(\hbox {R}_{2}\) values and cumulative tonnage percentages, respectively. An example of this is shown in Fig. 15. This generates 400 raw values given two pits, five benches, four chemical components, and ten models. For clarity, all ten models go through the common processes of block model spatial restructuring given a set of surfaces (Leung 2020) and inferencing (Leung et al. 2021a). Where they differ is that “unwarped” does not use any warped surfaces during this modeling process, while “y-chart” uses warped surfaces obtained via the \(y({\mathbf {c}})\!\mid \! g\) route, and the remaining eight models all use ML-warped surfaces that utilize \(p(g\!\mid \!{\mathbf {c}})\) estimates.
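
The error score computation may be sketched as below, assuming each model contributes a list of \(\hbox {R}_{2}\) values with associated tonnages. The ideal cdf is a step function at \(x=1\), so the score integrates the area between the empirical tonnage-weighted cdf and that step.

```python
import numpy as np

def r2_cdf_error(r2_values, tonnages):
    """Area between the empirical tonnage-weighted cdf of R2 values and
    the ideal step cdf (a step from 0 to 100% at R2 = 1); cf. Fig. 15."""
    order = np.argsort(r2_values)
    x = np.asarray(r2_values, dtype=float)[order]
    w = np.asarray(tonnages, dtype=float)[order]
    cdf = 100.0 * np.cumsum(w) / w.sum()         # cumulative tonnage %
    ideal = np.where(x < 1.0, 0.0, 100.0)        # perfect-model cdf
    return np.trapz(np.abs(cdf - ideal), x)      # shaded area under comparison
```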

Fig. 15 \(\hbox {R}_{2}\) reconciliation performance curves (deposit P9, pit A, chemical Fe, bench 100)

The raw \(\hbox {R}_{2}\) error scores are numerous and difficult to interpret individually. To extract overall trends, the geometric mean error score, \(\mu _{\text {R}_2}\), and normalized \(\hbox {R}_{2}\) error scores, \(e'_{\text {R}_2}\), are computed from the raw error scores, \(e_{\text {R}_2}\), and presented in Table 10. For a given chemical, the geometric mean is given by

$$\begin{aligned} \mu _{\text {R}_2}(\text {model})=\prod _i x_i^{w'_i}\equiv 10^{\sum _i w'_i\log _{10}x_i}\text { where }x_i \equiv e_{\text {R}_2}^{(\text {chemical,model},i)},\ w'_i=\frac{w_i}{\sum _i w_i}. \end{aligned}$$
(13)

Summation over i is done over benches, and the weights \(w_i\) are based on the contribution of each bench to the total tonnage. Comparison is facilitated by dividing each \(\mu _{\text {R}_2}(\text {model})\) by \(\mu _{\text {R}_2}(\text {unwarped})\) to highlight the performance of ML-warped models relative to the “unwarped” model. In particular, with \(e'_{\text {R}_2}(\small {\text {model}})\mathrel {\overset{{{ \tiny {\mathrm{def}}}}}{=}}\frac{\mu _{\text {R}_2}(\text {model})}{\mu _{\text {R}_2}(\text {unwarped})}\), \(e'_{\text {R}_2}<1\) represents an improvement.
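
A sketch of Eq. (13) and its normalization, using bench tonnages as weights (variable names are illustrative), is given below.

```python
import numpy as np

def geometric_mean_score(errors, tonnages):
    """Tonnage-weighted geometric mean of per-bench error scores, Eq. (13)."""
    e = np.asarray(errors, dtype=float)
    w = np.asarray(tonnages, dtype=float)
    w = w / w.sum()                          # normalized weights w'_i
    return 10.0 ** np.sum(w * np.log10(e))   # equivalent to prod(e_i ** w'_i)

# Normalized score: a value < 1 means the warped model improves on "unwarped"
# e_norm = geometric_mean_score(e_model, t) / geometric_mean_score(e_unwarped, t)
```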

A perfect model that always agrees with the grade-block values has an \(\hbox {R}_{2}\) cdf curve given by a step function that transitions from 0 to 100 (from left to right) at \(x=1\), because the model-predicted value is identical to the grade-block value; therefore, its \(\hbox {R}_{2}\) value (their ratio) is 1 throughout. In practice, a model like "unwarped" has a performance curve given by the orange line in Fig. 15 and an error score given by the shaded area. This shaded area, for each performance curve, is the basis on which the prediction/inferencing ability of each model is compared. The authors hasten to point out that this represents only a snapshot for P9 pit A, Fe, and bench 100; performance should not be generalized based on limited observations. However, it turns out this snapshot is fairly representative of the general trend: (i) the reference GS model is by far the least accurate (see the green curve in the top-left panel); (ii) the "unwarped" model is generally worse than any warped surface model (see the orange curve in the top-left panel); (iii) ML-warped models have performance similar to the y-chart-warped model (see GradientBoost, MLP, and RandomForest in the top-right panel); (iv) warped models with ilr transformation are not significantly different from the y-chart model (see the bottom-left panel).

A.3 Observations on \(\hbox {R}_{2}\) CDF Error Scores

In Table 10, a resource model created using low resolution and limited data (called GS) is also included for comparison. This GS model is inferior to both warped and unwarped models as it does not use surfaces to update the local model structure at all. The normalized error scores \(e'_{\text {R}_2}\) appear in bold if a “warped” model improves relative to the baseline “unwarped” model. Furthermore, when an ML warped model outperforms the “y-chart” warped model with respect to chemical c, the error score \(e'_{\text {R}_2}(c)\) is underlined. On this basis, one observes:

  • All ML-warped surface models perform better than the GS and unwarped models, certainly with respect to Fe, SiO2, and Al2O3, but also for P.

  • For the leading ML estimators, viz., GradientBoost, MLP, and RandomForest, grade estimation performance from the resultant warped surface models is comparable to the y-chart model.

  • In some instances, the ML-warped surface models perform better than the y-chart model: the error scores \(e'_{\text {R}_2}(\text {Fe})\) are 0.913 for ilr RandomForest and 0.933 for MLP, versus 0.964 for the y-chart model. Overall, across all elements, ilr RandomForest and MLP perform best, followed by ilr MLP, while GradBoost and RandomForest essentially break even.

Table 10 Performance statistics: \(\hbox {R}_{2}\) geometric mean (\(\mu _{\text {R}_2}\)) and normalized error score (\(e'_{\text {R}_2}\)) for deposit P9 pit A

A.3.1 Observations on \(\hbox {R}_{2}\) Distribution Interquartile Range

It is instructive to examine specific nonparametric univariate statistics such as the interquartile range (IQR), which indicates the spread of values. A distribution with a long, heavy tail (more extreme "ground truth / model" ratios) will have a larger IQR. The IQR is computed by first applying \(\log _2\) to the \(R_2\) values, so that inverse ratios have the same magnitude and the distribution is (ideally) symmetric. The transformed values \(x\sim \log _2 R_2\) are weighted by tonnage to give the weighted quantiles \(q_{25}\) and \(q_{75}\), and the interquartile range \(\text {IQR}=q_{75}-q_{25}\), providing a complementary interpretation of the data. The implication is that errors integrated at the tail of the distribution are considered worse than those near the center, where the ratios are close to parity.
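
A sketch of the tonnage-weighted IQR computation follows, under the assumption that weighted quantiles are obtained by interpolating the weighted empirical cdf.

```python
import numpy as np

def weighted_iqr(r2_values, tonnages):
    """Tonnage-weighted IQR of log2(R2), so that inverse ratios
    (e.g. 2 and 1/2) contribute with equal magnitude."""
    x = np.log2(np.asarray(r2_values, dtype=float))
    w = np.asarray(tonnages, dtype=float)
    order = np.argsort(x)
    x, w = x[order], w[order]
    cdf = np.cumsum(w) / w.sum()        # weighted empirical cdf
    q25 = np.interp(0.25, cdf, x)       # tonnage-weighted 25th percentile
    q75 = np.interp(0.75, cdf, x)       # tonnage-weighted 75th percentile
    return q75 - q25
```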

The findings echo the previous observations: variability decreases relative to GS, in line with the reduction observed for the y-chart model. In Table 11, the normalized IQRs for Fe are 0.7433 and 0.7274 for MLP and ilr RandomForest, respectively, which do not differ significantly from the y-chart model. The improvement for Fe with respect to the unwarped model ranges from 0.2195 to 0.2811, which is highly significant. "RandomForest + s" also has a lower IQR than the y-chart model with respect to SiO2 and Al2O3.

Table 11 Performance statistics: \(\hbox {R}_{2}\) interquartile ranges and normalized IQR scores for deposit P9 pit A

A.4 Concluding Remarks

The ML-warped surface models achieved performance gains similar to the y-chart model relative to the GS and unwarped models. Using ML estimators with high learning capacity, such as RandomForest, MLP (with ilr), and GradBoost, there is a reasonable prospect of matching or exceeding the performance of the y-chart model. A key advantage of the ML surface warping approach is the flexibility of learning a general domain likelihood function \(p(g\!\mid \!{\mathbf {c}})\), which is not restricted to a particular mineral (e.g., goethite/hematite versus shale) or commodity (e.g., iron or copper deposit).

This paper has adopted a three-pronged approach to surface warping performance evaluation. First, the classification performance of ML estimators was measured using precision and recall rates in a Monte Carlo cross-validation setup. Second, the categorical distribution of test samples was examined in terms of the number of HG/BL/LG/W samples located above and below the warped surfaces (modeled boundaries). Third, the grade estimation performance of the resultant warped surface block models was quantified using normalized \(\hbox {R}_{2}\) reconciliation error scores and IQR statistics. The measures considered vary in scope and complexity. Together, they provide a credible perspective on ML surface warping performance, both as a stand-alone component and as an integrated component within an orebody grade estimation system.

Cite this article

Leung, R., Balamurali, M. & Lowe, A. Surface Warping Incorporating Machine Learning Assisted Domain Likelihood Estimation: A New Paradigm in Mine Geology Modeling and Automation. Math Geosci 54, 533–572 (2022). https://doi.org/10.1007/s11004-021-09967-5