Surface Warping Incorporating Machine Learning Assisted Domain Likelihood Estimation: A New Paradigm in Mine Geology Modeling and Automation

Published in Mathematical Geosciences (Special Issue)

A Correction to this article was published on 11 November 2021.

Abstract

In surface mining, assay measurements taken from production drilling often provide useful information that enables initially inaccurate surfaces (for example, mineralization boundaries) created using sparse exploration data to be revised and subsequently improved. Recently, a Bayesian warping technique was proposed to reshape modeled surfaces using geochemical observations and spatial constraints imposed by newly acquired blasthole data. This paper focuses on incorporating machine learning (ML) into this warping framework to make the likelihood computation generalizable. The technique works by adjusting the position of vertices on the surface to maximize the integrity of modeled geological boundaries with respect to sparse geochemical observations. Its foundation is laid by a Bayesian derivation in which the geological domain likelihood given the chemistry, \(p(g\!\mid \!{\mathbf {c}})\), plays a role similar to \(p(y({\mathbf {c}})\!\mid \! g)\) for certain categorical mappings \(y:{\mathbb {R}}^K\rightarrow {\mathbb {Z}}\). This observation allows a manually calibrated process centered on the latter to be automated, since ML techniques may be used to estimate the former in a data-driven way. Machine learning performance is evaluated for gradient boosting, neural network, random forest, and other classifiers in a binary and multi-class context using precision and recall rates. Once ML likelihood estimators are integrated into the surface warping framework, surface shaping performance is evaluated using unseen data by examining the categorical distribution of test samples located above and below the warped surface. Large-scale validation experiments are performed to assess the overall efficacy of ML-assisted surface warping as a fully integrated component within an ore grade estimation system, where the posterior mean is obtained via Gaussian process (GP) inference with a Matérn 3/2 kernel. This article illustrates an application of machine learning within a complex system where grade estimation is accomplished by integrating boundary warping with ML and other components.

Authorship Statement

This paper was conceptualized and written by the first author. Raymond Leung performed the research, experiments, technical integration, and analysis. Mehala Balamurali and Alexander Lowe conducted preliminary investigations focused on binary domain classification and multi-class analysis of dolerite and hydrated domains. Those findings have informed and motivated this research.

Notes

  1. Material properties, for example, whether the rock is hard or friable, lumpy or fine, viscous or powdery, are important considerations for downstream ore processing. For instance, viscous materials can clog equipment in the processing plant.

  2. Recall that \({\mathbf {x}}\) and \(\varvec{\delta }\) denote the location and dimensions of the sample.

  3. For multi-class probability estimation, the one-versus-one training and cross-validation procedure described in Chang and Lin (2011) is employed.

  4. Previous studies have shown that chemical compositional data points (the \({\mathbf {c}}\) in \(p(g\!\mid \!{\mathbf {c}})\)) lie on the Aitchison simplex (Leung et al. 2019; Tsagris et al. 2011).

  5. In the bottom half of Fig. 5, each row corresponds to a particular geozone \(g_\text {actual}\), and the vertical axis is limited to \([-3,0]\) in logarithmic scale. The color of each histogram reflects the sample size for the relevant geozone (a darker shade indicates more samples).

  6. Except that the dynamic aspect is absent here; the probability distribution is not updated as information is gained over time.

  7. This varies from model to model, as the block spatial structures induced by the relevant warped surfaces are different.

References

  • Acosta ICC, Khodadadzadeh M, Tusa L, Ghamisi P, Gloaguen R (2019) A machine learning framework for drill-core mineral mapping using hyperspectral and high-resolution mineralogical data fusion. IEEE J Sel Top Appl Earth Observ Remote Sens 12(12):4829–4842

  • Breiman L (2001) Random forests. Mach Learn 45(1):5–32

  • Brier GW (1950) Verification of forecasts expressed in terms of probability. Mon Weather Rev 78(1):1–3

  • Caumon G, Collon-Drouaillet P, De Veslud CLC, Viseur S, Sausse J (2009) Surface-based 3D modeling of geological structures. Math Geosci 41(8):927–945

  • Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol (TIST) 2(3):27

  • Chieregati A, Delboni H, Coimbra Leite Costa J (2008) Sampling for proactive reconciliation practices. Min Technol 117(3):136–141

  • Clout J (2006) Iron formation-hosted iron ores in the Hamersley Province of Western Australia. Appl Earth Sci 115(4):115–125

  • Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297

  • Cressie N (1985) Fitting variogram models by weighted least squares. J Int Assoc Math Geol 17(5):563–586

  • Cressie N (2015) Statistics for spatial data. Wiley, Hoboken

  • Egozcue JJ, Pawlowsky-Glahn V, Mateu-Figueras G, Barceló-Vidal C (2003) Isometric logratio transformations for compositional data analysis. Math Geol 35(3):279–300

  • Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232

  • Garrett RG, Reimann C, Hron K, Kynčlová P, Filzmoser P (2017) Finally, a correlation coefficient that tells the geochemical truth. Newsl Assoc Appl Geochem 176

  • Glorot X, Bengio Y (2010) Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the international conference on artificial intelligence and statistics, pp 249–256

  • Greenacre M, Grunsky E et al (2019) The isometric logratio transformation in compositional data analysis: a practical evaluation. Economics Working Paper Series, Technical Report 1627, Barcelona Graduate School of Economics

  • Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction. Springer, Berlin

  • He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE international conference on computer vision, pp 1026–1034

  • Hinton GE (1990) Connectionist learning procedures. In: Machine learning. Elsevier, pp 555–610

  • Horrocks T, Holden EJ, Wedge D, Wijns C, Fiorentini M (2019) Geochemical characterisation of rock hydration processes using t-SNE. Comput Geosci 124:46–57

  • Jewbali A, Ramos FT, Melkumyan A (2011) A non-parametric Bayesian framework for automatic block estimation. In: Proceedings of the APCOM symposium, AusIMM, paper 056, pp 1–20

  • Karpatne A, Ebert-Uphoff I, Ravela S, Babaie HA, Kumar V (2018) Machine learning for the geosciences: challenges and opportunities. IEEE Trans Knowl Data Eng 31(8):1544–1554

  • Khushaba RN, Melkumyan A, Hill AJ (2021) A machine learning approach for material type logging and chemical assaying from autonomous measure-while-drilling (MWD) data. Math Geosci (accepted)

  • Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv:1412.6980

  • Leung R (2020) Modelling orebody structures: block merging algorithms and block model spatial restructuring strategies given mesh surfaces of geological boundaries. J Spat Inf Sci 21:137–174. https://doi.org/10.5311/JOSIS.2020.21.582

  • Leung R (2021b) Empirical observations on the effects of data transformation in machine learning classification of geological domains. arXiv:2106.05855

  • Leung R, Balamurali M, Melkumyan A (2019) Sample truncation strategies for outlier removal in geochemical data: the MCD robust distance approach versus t-SNE ensemble clustering. Math Geosci 53:105–130. https://doi.org/10.1007/s11004-019-09839-z

  • Leung R, Lowe A, Chlingaryan A, Melkumyan A, Zigman J (2021a) Bayesian surface warping approach for rectifying geological boundaries using displacement likelihood and evidence from geochemical assays. ACM Trans Spatial Algorithms Syst (to appear). https://doi.org/10.1145/3476979. arXiv:2005.14427

  • Lou W, Wang X, Chen F, Chen Y, Jiang B, Zhang H (2014) Sequence based prediction of DNA-binding proteins based on hybrid feature selection using random forest and Gaussian naive Bayes. PLOS ONE 9(1):e86703

  • Melkumyan A, Ramos F (2009) A sparse covariance function for exact Gaussian process inference in large datasets. In: International joint conference on artificial intelligence (IJCAI), vol 9, pp 1936–1942

  • Melkumyan A, Ramos F (2011a) Multi-kernel Gaussian processes. In: International joint conference on artificial intelligence (IJCAI)

  • Melkumyan A, Ramos F (2011b) Non-parametric Bayesian learning for resource estimation in the autonomous mine. In: Proceedings of the APCOM symposium, pp 209–215

  • Murphy KP et al (2006) Naive Bayes classifiers. Lecture notes (CS340-Fall), University of British Columbia

  • Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830

  • Platt J et al (1999) Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Adv Large Margin Classifiers 10(3):61–74

  • Rousseeuw PJ (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20:53–65

  • Sommerville B, Boyle C, Brajkovich N, Savory P, Latscha AA (2014) Mineral resource estimation of the Brockman 4 iron ore deposit in the Pilbara region. Appl Earth Sci 123(2):135–145

  • Song Y, Liang J, Lu J, Zhao X (2017) An efficient instance selection algorithm for k nearest neighbor regression. Neurocomputing 251:26–34

  • Strasdat H, Montiel JM, Davison AJ (2012) Visual SLAM: why filter? Image Vis Comput 30(2):65–77

  • Tahmasebi P, Hezarkhani A (2012) A hybrid neural networks-fuzzy logic-genetic algorithm for grade estimation. Comput Geosci 42:18–27

  • Tolosana-Delgado R, Mueller U, van den Boogaart KG (2019) Geostatistics for compositional data: an overview. Math Geosci 51:485–526. https://doi.org/10.1007/s11004-018-9769-3

  • Tsagris M, Preston S, Wood A (2011) A data-based power transformation for compositional data. In: CoDaWork'11: 4th international workshop on compositional data analysis, Girona, Spain

  • Vasudevan S (2012) Data fusion with Gaussian processes. Robot Auton Syst 60(12):1528–1544

  • Vasudevan S, Ramos F, Nettleton E, Durrant-Whyte H (2010) Heteroscedastic Gaussian processes for data fusion in large scale terrain modeling. In: 2010 IEEE international conference on robotics and automation, IEEE, pp 3452–3459

  • Wackernagel H (2013) Multivariate geostatistics: an introduction with applications. Springer, Berlin

  • Wedge D, Lewan A, Paine M, Holden EJ, Green T (2018) A data mining approach to validating drill hole logging data in Pilbara iron ore exploration. Econ Geol 113(4):961–972

  • Williams CK, Rasmussen CE (2006) Gaussian processes for machine learning, vol 2. MIT Press, Cambridge

  • Yu HF, Huang FL, Lin CJ (2011) Dual coordinate descent methods for logistic regression and maximum entropy models. Mach Learn 85(1–2):41–75

  • Zadrozny B, Elkan C (2001) Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers. In: Proceedings of the international conference on machine learning, pp 609–616

  • Zadrozny B, Elkan C (2002) Transforming classifier scores into accurate multiclass probability estimates. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 694–699

  • Zhu C, Byrd RH, Lu P, Nocedal J (1997) Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans Math Softw 23(4):550–560

Acknowledgements

This work was supported by the Australian Centre for Field Robotics and the Rio Tinto Centre for Mine Automation. The authors would like to acknowledge Corentin Plouët and John Zigman for their software contributions and for refactoring the code. Their efforts have facilitated this extension, allowing various ML techniques to be evaluated within an integrated surface warping framework. Katherine Silversides is thanked for proofreading this paper.

Author information

Correspondence to Raymond Leung.

Additional information

The original online version of this article was revised because the figure belonging to Table 3 was missing.

Appendix A: Validation Experiments for Grade Models Built Using ML Surface Warping and Gaussian Processes

The goal of these experiments is to provide compelling evidence that the proposed ML surface warping approach is competitive with the handcrafted y-chart surface warping approach when used in a real orebody grade estimation system. In terms of boundary representation, Leung et al. (2021a) have shown that inaccuracies in a modeled surface can affect how well the geochemistry can be estimated. Readers are referred to Leung et al. (2021a) for a demonstration of how a poorly estimated surface affects the spatial structure and inferencing ability of the resultant model.

The overall efficacy of ML surface warping can be measured by running end-to-end experiments. This involves using the warped surfaces produced by various machine learning techniques to update the spatial structure of a block model (described in Leung (2020)) and feeding this structure and sparse assay measurements to a Gaussian process inferencing engine (Leung et al. 2021a) to produce a block model with an estimated concentration of various chemical components. Once this process is complete, a quality assessment is performed. The proposed procedure, \(\hbox {R}_{2}\) reconciliation, compares the inferred values with grade-block reference values. A grade block refers to a generally non-convex polygonal region that has been extended to a prism by the height of a mining bench. Grade blocks are created for grade-control and excavation purposes; therefore, they exist at the mining scale and are much larger than the blocks within the block model. They contain location, volumetric information, and geologist-validated compositions and serve as the ground truth. This assessment captures the full complexity of the interactions between components and overcomes the deficiencies noted in Sect. 6.3.
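
To make the end-to-end experiment explicit, the sketch below outlines this pipeline in Python. Every name here is a hypothetical placeholder standing in for a published component (the block restructuring of Leung (2020), the GP inferencing engine of Leung et al. (2021a), and the \(\hbox {R}_{2}\) reconciliation described in Sect. A.2); it is not an actual API.

```python
from typing import Callable, Dict, Iterable

def validate_model(surfaces, block_model, assays, grade_blocks,
                   chemicals: Iterable[str],
                   restructure_blocks: Callable,   # stands in for Leung (2020)
                   gp_infer: Callable,             # stands in for Leung et al. (2021a)
                   r2_reconcile: Callable) -> Dict[str, float]:
    """Hypothetical end-to-end validation driver (illustrative only)."""
    # 1. Update the block model's spatial structure using the warped surfaces
    structured = restructure_blocks(block_model, surfaces)
    scores = {}
    for chem in chemicals:                          # e.g. Fe, SiO2, Al2O3, P
        # 2. Estimate grades in block volumes from sparse assay measurements
        estimated = gp_infer(structured, assays, chem)
        # 3. Compare volume-weighted predictions with grade-block ground truth
        scores[chem] = r2_reconcile(estimated, grade_blocks, chem)
    return scores
```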

A.1 Inferencing via Gaussian Process

Before describing the validation procedure, it is worth addressing the topic of Gaussian processes (GPs) and mentioning their connections with geostatistics. A GP may be viewed as a non-parametric Bayesian regression technique. Initially proposed under the name kriging in the geostatistical literature (Cressie 2015), it is an essential tool for modeling spatial stochastic processes. As a supervised learning problem, the general idea is to compute the predictive distribution \(f({\mathbf {x}}_*)\) at new locations \({\mathbf {x}}_*\) where \(y_*\) is unknown, given a training set \({\mathcal {D}}=\{{\mathbf {x}}_i,y_i\}_{i=1}^N\) comprising N input points \({\mathbf {x}}_i\in {\mathbb {R}}^D\) and N outputs \(y_i\in {\mathbb {R}}\). In this work, \({\mathbf {x}}_i\) represents the spatial coordinates and dimensions of an assay sample, and \(y_i\) denotes the concentration of a chemical component such as Fe or SiO2. Unlike kriging, where variograms (Cressie 1985) play a pivotal role, in GP the random process \(f({\mathbf {x}})\sim \mathcal {GP}(m({\mathbf {x}}),k({\mathbf {x}},{\mathbf {x}}'))\) is characterized by a covariance function \(k({\mathbf {x}},{\mathbf {x}}')\). For instance, the Matérn 3/2 covariance function (1D kernel) is given by \(k(x,x';\theta )=\sigma \left( 1+\sqrt{3}d/\rho \right) \exp (-\sqrt{3}d/\rho )\), where \(d=\left| x-x'\right| \) and the hyperparameters are \(\theta =[\rho ,\sigma ]\).

Grouping points within the training set together, the input, output, and distribution may be written collectively as \((X,{\mathbf {y}},{\mathbf {f}})\equiv (\{{\mathbf {x}}_i\},\{y_i\},\{f_i\})_{i=1}^N\) in matrix and vector form; similarly, \((X_*,{\mathbf {y}}_*,{\mathbf {f}}_*)\equiv (\{{\mathbf {x}}_{*,i}\},\{y_{*,i}\},\{f_{*,i}\})_{i=1}^{N_*}\) for the test points. Conditioning on the observed training points, the predictive distribution may be computed as \(p({\mathbf {f}}_*\!\mid \!X_*,X,{\mathbf {y}})={\mathcal {N}}(\varvec{\mu }_*,\varvec{\varSigma }_*)\), where the posterior mean and covariance are given by \(\varvec{\mu }_*=K(X_*,X)K_y^{-1}{\mathbf {y}}\) and \(\varvec{\varSigma }_*=K(X_*,X_*)-K(X_*,X)K_y^{-1}K(X,X_*)+\sigma ^2 I\), respectively, with \(K_y=K(X,X)+\sigma ^2 I\). The key observation is that the predictive mean is a linear combination of N kernel functions, each centered on a training point (Melkumyan and Ramos 2009): \(\varvec{\mu }_*=\sum _{i=1}^N \alpha _i k({\mathbf {x}}_i,{\mathbf {x}}_*;\theta )\) with \(\varvec{\alpha }=K_y^{-1} {\mathbf {y}}\). In practice, this amounts to a length-scale (\(\theta \))-dependent estimation of the local value using a subset of observations. The optimal hyperparameters \(\theta =(\rho _x,\rho _y,\ldots ,\sigma )\) are obtained by maximizing the log marginal likelihood \(\log p({\mathbf {y}}\!\mid \!X,\theta )\). The technical details are described in Melkumyan and Ramos (2009, 2011b) and Williams and Rasmussen (2006).
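
For concreteness, the following is a minimal numpy sketch of GP posterior inference with a Matérn 3/2 kernel. The isotropic kernel, the hyperparameter values, and the toy Fe data are illustrative assumptions; in the actual system, per-domain anisotropic hyperparameters are learned by maximizing the log marginal likelihood.

```python
import numpy as np

def matern32(X1, X2, rho=25.0, sigma=1.0):
    # Matern 3/2 kernel: k(d) = sigma * (1 + sqrt(3) d / rho) * exp(-sqrt(3) d / rho)
    d = np.linalg.norm(X1[:, None, :] - X2[None, :, :], axis=-1)
    s = np.sqrt(3.0) * d / rho
    return sigma * (1.0 + s) * np.exp(-s)

def gp_posterior(X, y, X_star, noise=0.1, rho=25.0, sigma=1.0):
    # Posterior mean and covariance at X_star, conditioned on training data (X, y)
    K_y = matern32(X, X, rho, sigma) + noise**2 * np.eye(len(X))
    K_star = matern32(X_star, X, rho, sigma)
    alpha = np.linalg.solve(K_y, y)                  # alpha = K_y^{-1} y
    mu = K_star @ alpha                              # mu_* = K(X_*, X) K_y^{-1} y
    cov = (matern32(X_star, X_star, rho, sigma)
           - K_star @ np.linalg.solve(K_y, K_star.T))
    return mu, cov

# Toy usage: estimate Fe (%) at five unsampled block centroids
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 100.0, size=(50, 3))            # assay sample coordinates (m)
y = 60.0 + rng.normal(0.0, 1.0, size=50)             # Fe (%) observations
mu, cov = gp_posterior(X, y, rng.uniform(0.0, 100.0, size=(5, 3)))
```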

In fairness, many technical alternatives exist. For instance, Melkumyan and Ramos (2011a) investigated multi-task GP and used it to infer the concentration of several, rather than a single, chemical component. The multi-task inference problem, perhaps better known as co-kriging (Wackernagel 2013), involves exploiting co-dependencies between multiple outputs given \({\mathbf {y}}\in {\mathbb {R}}^M\). In Vasudevan et al. (2010) and Vasudevan (2012), heteroscedastic GP is used to fuse multiple data sets with different noise parameters. That work is notable for deriving recursive expressions for the conditional mean and variance, showing that the difference in posterior uncertainty, as successive data sets are added, is a positive semi-definite matrix; this guarantees that the uncertainty either remains the same or decreases, but never increases. Although these approaches show tremendous potential, in this paper, ordinary GP is used for grade estimation into block volumes (Jewbali et al. 2011), and the hyperparameters are learned for each individual domain and chemical component.

A.2 Validation Method: \(\hbox {R}_{2}\) Reconciliation and General Interpretations

The validation experiments are conducted on ML-warped surface block models for deposit P9. Each grade model contains approximately 6.5 million blocks, equating to roughly 7.575 million tons. Each model comparison yields on average \(15,702\pm 158\) pairwise intersections (Note 7) with 237 grade blocks. \(\hbox {R}_{2}\) reconciliation compares the model-predicted chemistry against the ground truth; this involves computing volume-weighted model-predicted values based on each block's volumetric intersection with the grade blocks. By convention, the \(\hbox {R}_{2}\) value is defined (based on the "actual-versus-estimate" adjustment concept known as the mine call factor, described in Chieregati et al. (2008)) as "grade_block_value / model_predicted_value." Thus, a ratio less than 1 implies that the model is overestimating; conversely, a ratio greater than 1 implies that the model is underestimating. The comparisons are performed on a bench-by-bench basis. Each bench measures 10 m in height, with the five benches of interest designated 70, 80, 90, 100, and 110. One \(\hbox {R}_{2}\) value is reported for each chemical (in this paper, the main focus is on Fe, SiO2, Al2O3, and P).
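
As a minimal illustration of the reconciliation arithmetic (the data layout shown is an assumption, not the system's actual interface), the \(\hbox {R}_{2}\) ratio for a single grade block may be computed as follows.

```python
import numpy as np

def r2_for_grade_block(gb_value, intersections):
    """R2 for one grade block: gb_value / volume-weighted model prediction.

    `intersections` holds (volume, model_block_value) pairs, one per model
    block intersecting the grade block.  A ratio < 1 means the model
    overestimates; a ratio > 1 means it underestimates.
    """
    v = np.array([i[0] for i in intersections], dtype=float)
    pred = np.array([i[1] for i in intersections], dtype=float)
    model_value = (v * pred).sum() / v.sum()    # volume-weighted prediction
    return gb_value / model_value

# e.g. a grade block with Fe = 62.1% intersecting three model blocks
r2 = r2_for_grade_block(62.1, [(120.0, 61.0), (80.0, 63.5), (40.0, 62.8)])
```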

Following the approach in Leung et al. (2021a), \(\hbox {R}_{2}\) cdf error scores are computed to facilitate performance comparison. In essence, the \(\hbox {R}_{2}\) error score, \(e_{\text {R}_2}\), is obtained by integrating the difference between the ideal and actual cdf curves from a plot where the x and y axes correspond to sorted \(\hbox {R}_{2}\) values and cumulative tonnage percentages, respectively. An example of this is shown in Fig. 15. This generates 400 raw values given two pits, five benches, four chemical components, and ten models. For clarity, all ten models go through the common processes of block model spatial restructuring given a set of surfaces (Leung 2020) and inferencing (Leung et al. 2021a). Where they differ is that “unwarped” does not use any warped surfaces during this modeling process, while “y-chart” uses warped surfaces obtained via the \(y({\mathbf {c}})\!\mid \! g\) route, and the remaining eight models all use ML-warped surfaces that utilize \(p(g\!\mid \!{\mathbf {c}})\) estimates.
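
The error score computation may be sketched as below, assuming each model contributes a list of \(\hbox {R}_{2}\) values with associated tonnages. The ideal cdf is a step function at \(x=1\), so the score integrates the area between the empirical tonnage-weighted cdf and that step.

```python
import numpy as np

def r2_cdf_error(r2_values, tonnages):
    """Area between the empirical tonnage-weighted cdf of R2 values and
    the ideal step cdf (a step from 0 to 100% at R2 = 1); cf. Fig. 15."""
    order = np.argsort(r2_values)
    x = np.asarray(r2_values, dtype=float)[order]
    w = np.asarray(tonnages, dtype=float)[order]
    cdf = 100.0 * np.cumsum(w) / w.sum()         # cumulative tonnage %
    ideal = np.where(x < 1.0, 0.0, 100.0)        # perfect-model cdf
    return np.trapz(np.abs(cdf - ideal), x)      # shaded area under comparison
```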

Fig. 15 \(\hbox {R}_{2}\) reconciliation performance curves (deposit P9, pit A, chemical Fe, bench 100)

The raw \(\hbox {R}_{2}\) error scores are numerous and difficult to interpret individually. To extract overall trends, the geometric mean error score, \(\mu _{\text {R}_2}\), and normalized \(\hbox {R}_{2}\) error scores, \(e'_{\text {R}_2}\), are computed from the raw error scores, \(e_{\text {R}_2}\), and presented in Table 10. For a given chemical, the geometric mean is given by

$$\begin{aligned} \mu _{\text {R}_2}(\text {model})=\prod _i x_i^{w'_i}\equiv 10^{\sum _i w'_i\log _{10}x_i}\text { where }x_i \equiv e_{\text {R}_2}^{(\text {chemical,model},i)},\ w'_i=\frac{w_i}{\sum _i w_i}. \end{aligned}$$
(13)

Summation over i is done over benches, and the weights \(w_i\) are based on the contribution of each bench to the total tonnage. Comparison is facilitated by dividing each \(\mu _{\text {R}_2}(\text {model})\) by \(\mu _{\text {R}_2}(\text {unwarped})\) to highlight the performance of ML-warped models relative to the “unwarped” model. In particular, with \(e'_{\text {R}_2}(\small {\text {model}})\mathrel {\overset{{{ \tiny {\mathrm{def}}}}}{=}}\frac{\mu _{\text {R}_2}(\text {model})}{\mu _{\text {R}_2}(\text {unwarped})}\), \(e'_{\text {R}_2}<1\) represents an improvement.
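
A sketch of Eq. (13) and its normalization, using bench tonnages as weights (variable names are illustrative), is given below.

```python
import numpy as np

def geometric_mean_score(errors, tonnages):
    """Tonnage-weighted geometric mean of per-bench error scores, Eq. (13)."""
    e = np.asarray(errors, dtype=float)
    w = np.asarray(tonnages, dtype=float)
    w = w / w.sum()                          # normalized weights w'_i
    return 10.0 ** np.sum(w * np.log10(e))   # equivalent to prod(e_i ** w'_i)

# Normalized score: a value < 1 means the warped model improves on "unwarped"
# e_norm = geometric_mean_score(e_model, t) / geometric_mean_score(e_unwarped, t)
```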

A perfect model that always agrees with the grade-block values has an \(\hbox {R}_{2}\) cdf curve given by a step function that transitions from 0 to 100 (from left to right) at \(x=1\), because the model-predicted value is identical to the grade-block value; therefore, its \(\hbox {R}_{2}\) value (their ratio) is 1 throughout. In practice, a model like "unwarped" has a performance curve given by the orange line in Fig. 15 and an error score given by the shaded area. This shaded area, for each performance curve, is the basis on which the prediction/inferencing ability of each model is compared. The authors hasten to point out that this represents only a snapshot for P9 pit A, Fe, and bench 100; performance should not be generalized based on limited observations. However, it turns out this snapshot is fairly representative of the general trend: (i) the reference GS model is by far the least accurate (see the green curve in the top-left panel); (ii) the "unwarped" model is generally worse than any warped surface model (see the orange curve in the top-left panel); (iii) ML-warped models have performance similar to the y-chart-warped model (see GradientBoost, MLP, and RandomForest in the top-right panel); (iv) warped models with ilr transformation are not significantly different from the y-chart model (see the bottom-left panel).

A.3 Observations on \(\hbox {R}_{2}\) CDF Error Scores

In Table 10, a resource model created using low resolution and limited data (called GS) is also included for comparison. This GS model is inferior to both warped and unwarped models as it does not use surfaces to update the local model structure at all. The normalized error scores \(e'_{\text {R}_2}\) appear in bold if a “warped” model improves relative to the baseline “unwarped” model. Furthermore, when an ML warped model outperforms the “y-chart” warped model with respect to chemical c, the error score \(e'_{\text {R}_2}(c)\) is underlined. On this basis, one observes:

  • All ML-warped surface models perform better than the GS and unwarped models, certainly with respect to Fe, SiO2, and Al2O3, but also for P.

  • For the leading ML estimators, viz., GradientBoost, MLP, and RandomForest, grade estimation performance from the resultant warped surface models is comparable to the y-chart model.

  • In some instances, the ML-warped surface models perform better than the y-chart model: the error scores \(e'_{\text {R}_2}(\text {Fe})\) are 0.913 for ilr RandomForest and 0.933 for MLP, versus 0.964 for the y-chart model. Overall, across all elements, ilr RandomForest and MLP perform best, followed by ilr MLP, while GradBoost and RandomForest essentially break even.

Table 10 Performance statistics: \(\hbox {R}_{2}\) geometric mean (\(\mu _{\text {R}_2}\)) and normalized error score (\(e'_{\text {R}_2}\)) for deposit P9 pit A

A.3.1 Observations on \(\hbox {R}_{2}\) Distribution Interquartile Range

It is instructive to examine specific nonparametric univariate statistics such as the interquartile range (IQR), which indicates the spread of values. A distribution with a long, heavy tail (more extreme "ground truth / model" ratios) will have a larger IQR. The IQR is computed by first applying \(\log _2\) to the \(R_2\) values, so that inverse ratios have the same magnitude and the distribution is (ideally) symmetric. The transformed values \(x\sim \log _2 R_2\) are weighted by tonnage to give the weighted quantiles \(q_{25}\) and \(q_{75}\), and the interquartile range \(\text {IQR}=q_{75}-q_{25}\), providing a complementary interpretation of the data. The implication is that errors integrated at the tail of the distribution are considered worse than those near the center, where the ratios are close to parity.
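
A sketch of the tonnage-weighted IQR computation follows, under the assumption that weighted quantiles are obtained by interpolating the weighted empirical cdf.

```python
import numpy as np

def weighted_iqr(r2_values, tonnages):
    """Tonnage-weighted IQR of log2(R2), so that inverse ratios
    (e.g. 2 and 1/2) contribute with equal magnitude."""
    x = np.log2(np.asarray(r2_values, dtype=float))
    w = np.asarray(tonnages, dtype=float)
    order = np.argsort(x)
    x, w = x[order], w[order]
    cdf = np.cumsum(w) / w.sum()        # weighted empirical cdf
    q25 = np.interp(0.25, cdf, x)       # tonnage-weighted 25th percentile
    q75 = np.interp(0.75, cdf, x)       # tonnage-weighted 75th percentile
    return q75 - q25
```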

The findings echo the previous observations: variability decreases relative to GS, in line with the reduction observed for the y-chart model. In Table 11, the normalized IQRs for Fe are 0.7433 and 0.7274 for MLP and ilr RandomForest, respectively, which do not differ significantly from the y-chart model. The improvement for Fe with respect to the unwarped model ranges from 0.2195 to 0.2811, which is highly significant. "RandomForest + s" also has a lower IQR than the y-chart model with respect to SiO2 and Al2O3.

Table 11 Performance statistics: \(\hbox {R}_{2}\) interquartile ranges and normalized IQR scores for deposit P9 pit A

A.4 Concluding Remarks

The ML-warped surface models achieved performance gains similar to the y-chart model relative to the GS and unwarped models. Using ML estimators with high learning capacity, such as RandomForest, MLP (with ilr), and GradBoost, there is a reasonable prospect of matching or exceeding the performance of the y-chart model. A key advantage of the ML surface warping approach is the flexibility of learning a general domain likelihood function \(p(g\!\mid \!{\mathbf {c}})\), which is not restricted to a particular mineral (e.g., goethite/hematite versus shale) or commodity (e.g., iron or copper deposit).

This paper has adopted a three-pronged approach to surface warping performance evaluation. First, the classification performance of ML estimators was measured using precision and recall rates in a Monte Carlo cross-validation setup. Second, the categorical distribution of test samples was examined in terms of the number of HG/BL/LG/W samples located above and below the warped surfaces (modeled boundaries). Third, the grade estimation performance of the resultant warped surface block models was quantified using normalized \(\hbox {R}_{2}\) reconciliation error scores and IQR statistics. The measures considered vary in scope and complexity. Together, they provide a credible perspective on ML surface warping performance, both as a stand-alone component and as an integrated component within an orebody grade estimation system.

Cite this article

Leung, R., Balamurali, M. & Lowe, A. Surface Warping Incorporating Machine Learning Assisted Domain Likelihood Estimation: A New Paradigm in Mine Geology Modeling and Automation. Math Geosci 54, 533–572 (2022). https://doi.org/10.1007/s11004-021-09967-5