1 Introduction

Due to limited time and financial resources, ore grade estimation generally depends on limited data, which mostly consist of drill hole samples. The mass of rock samples collected from a deposit is typically on the order of 1 in 1,000,000 of the mineral deposit that will be exploited [1]. This limited data should be used with care to produce reliable estimates of the spatial distribution of the target ore grade. These estimates are used in many areas, such as assessing economic viability, making investment decisions about a deposit, short- and long-term mine planning, and estimating mine life [2,3,4,5]. In ore grade estimation, classical methods such as inverse distance weighting (IDW), stochastic simulation, and, most widely, kriging and its variants are used [6,7,8,9,10,11,12]. These geostatistical techniques are readily available in several commercial and open-source software packages [13]. The availability of numerous software packages does not mean that these techniques can be easily applied to ore grade estimation: steps such as variogram modelling, detection of possible anisotropy and trends, and determination of kriging plans remain challenging tasks [14]. These steps generally require expert knowledge and experience [15]. Due to the complexity of classical geostatistical methods such as kriging, alternative techniques are being developed. Among these alternatives, machine learning (ML) algorithms provide a rich spectrum of approaches complementary to classical geostatistical methods.

ML algorithms learn directly from available data to perform mainly regression and classification tasks, generally without making assumptions about the data [16]. Owing to the power and simplicity of ML algorithms, their use is gaining popularity in ore grade estimation [17]. Many researchers have used artificial neural networks and their variants, fuzzy logic, support vector machines, regression trees, extreme learning machines, and random forests in ore grade estimation with encouraging results [10, 18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33]. In these studies, the distributions of Au, Fe, Cu, Mo, Ag, Al2O3, and Zn contents were estimated. For resource estimation of iron ore deposits, neural networks and their variants are the dominant approaches [23, 25, 34,35,36], while other techniques such as random forests and fuzzy logic have also been used [27, 37]. Even though gradient boosting trees have been applied successfully to the estimation of gold deposits [38, 39], the method has not yet been applied to the estimation of Fe content. The estimation steps for an iron ore deposit are similar to those for other mineral resources. Nevertheless, mineral deposition processes are unique to each deposit, and this affects the spatial continuity of the underlying element [40,41,42]. For example, gold deposits show higher spatial variability than iron deposits. For this reason, it is important to show that methods that have been applied successfully to other commodities can also be used to estimate the content of iron ore, whose spatial variability differs from that of other commodities.

In this study, Fe grade estimation of an iron deposit is performed using the XGBoost algorithm, and the results are compared with traditional ordinary kriging. Due to the different natures of these estimation methods, different steps must be taken to perform the estimations. In estimation with XGBoost, hyper-parameter optimization is performed using grid search to determine the most appropriate hyper-parameter values, with root mean square error used to evaluate estimation performance. In estimation with ordinary kriging, experimental variograms are calculated in different directions, and model variograms are fitted to them. Since its introduction to geostatistics, cross-validation has been the cornerstone technique for assessing the acceptability of variogram models; it is therefore used here to assess the fitted models [43,44,45,46,47]. Estimations are then performed using the model variogram. For both methods, composites of Fe grades are used as input data. Finally, the estimations are compared with these composite values and with each other by means of summary statistics and, taking the kriging estimates as the base, residual values. Results show that the XGBoost algorithm produced estimates with a higher range than the kriging estimates. However, the method still suffers from smoothing, like kriging: the minimum and maximum values of the estimates were, respectively, higher and lower than those of the composites.

2 Methods

2.1 Extreme Gradient Boosting (XGBoost)

XGBoost is a member of the ensemble learning family: a supervised, parallel, scalable tree boosting system that can be used to solve both regression and classification problems [48]. An ensemble can be defined as the fusion of two or more trained models to improve on the performance of the underlying individual models [49, 50]. While the fusion of models increases generalization, local and specific information is also captured [50]. These properties make ensemble techniques among the most powerful machine learning alternatives [51]. Due to their simplicity and success, ensemble learners are used in many applications of classification, regression, ranking, and anomaly detection [52,53,54,55].

In recent years, a large number of ensemble machine learning approaches have been proposed [52]. Among the alternatives, boosting techniques show promising results, achieving the best performance as measured by squared correlation (R2) [56]. In boosting algorithms, weak learners, which are only slightly better than random guessing, are combined iteratively to generate a strong learner. The aim of gradient boosting is to approximate the function \({F}^{*}\left(x\right)\) that maps inputs x to outputs y, given the dataset \(D={\left\{{x}_{i},{y}_{i}\right\}}_{1}^{N}\). This approximation is reached by minimizing the loss function \(L\left(y,F\left(x\right)\right)\). The additive approximation of \({F}^{*}\left(x\right)\) is built as a weighted sum of functions:

$${F}_{m}\left(x\right)={F}_{m-1}\left(x\right)+{p}_{m}{h}_{m}\left(x\right)$$
(1)

where \({p}_{m}\) is the weight associated with the mth function \({h}_{m}\left(x\right)\). The approximation starts with a constant approximation of \({F}^{*}\left(x\right)\):

$${F}_{0}\left(x\right)=\text{arg}\;\underset{\alpha }{\text{min}}\sum _{i=1}^{N}L\left({y}_{i},\alpha \right)$$
(2)

and the subsequent models are obtained by minimizing:

$$\left({p}_{m},{h}_{m}\left(x\right)\right)=\text{arg}\;\underset{p,h}{\text{min}}\sum _{i=1}^{N}L\left({y}_{i},{F}_{m-1}\left({x}_{i}\right)+ph\left({x}_{i}\right)\right)$$
(3)

\({F}^{*}\) is optimized greedily using gradient descent. Each \({h}_{m}\) is trained on the dataset \(D=\left\{{x}_{i},{r}_{mi}\right\}\), where the pseudo-residuals \({r}_{mi}\) are calculated as follows:

$${r}_{mi}=-{\left[\frac{\partial L\left({y}_{i},F\left({x}_{i}\right)\right)}{\partial F\left({x}_{i}\right)}\right]}_{F\left(x\right)={F}_{m-1}\left(x\right)}$$
(4)

One can calculate the value of \({p}_{m}\) by applying line search optimization.
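The boosting recursion above can be sketched in a few lines of Python. This is an illustrative toy for squared loss only, using depth-1 trees (stumps) as the weak learners \({h}_{m}\) and a fixed shrinkage factor in place of the line-searched weight \({p}_{m}\); it is not the production XGBoost implementation:

```python
import numpy as np

def fit_stump(x, r):
    """Fit a depth-1 regression tree (stump) to the pseudo-residuals r."""
    best = None
    for t in np.unique(x)[:-1]:
        left, right = r[x <= t], r[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    _, t, lv, rv = best
    return lambda q: np.where(q <= t, lv, rv)

def gradient_boost(x, y, n_rounds=50, lr=0.1):
    f0 = y.mean()                 # Eq. (2): constant model minimizing squared loss
    F = np.full(len(y), f0)
    models = []
    for _ in range(n_rounds):
        r = y - F                 # Eq. (4): pseudo-residuals for squared loss
        h = fit_stump(x, r)       # weak learner h_m fitted to the residuals
        F = F + lr * h(x)         # Eq. (1): additive update, fixed weight p_m = lr
        models.append(h)
    return lambda q: f0 + sum(lr * h(q) for h in models)

# toy 1-D demonstration on synthetic data
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = np.sin(x) + rng.normal(0, 0.1, 200)
predict = gradient_boost(x, y)
rmse = np.sqrt(np.mean((predict(x) - y) ** 2))
```

After 50 rounds the boosted ensemble fits the smooth signal far better than the initial constant model, illustrating how weak learners combine into a strong one.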

In XGBoost, only decision trees are considered as base regressors or classifiers in the boosted ensemble [48]. The complexity of the trees is controlled by a regularized variant of the loss function:

$${L}_{xgb}=\sum _{i=1}^{N}L\left({y}_{i},F\left({x}_{i}\right)\right)+\sum _{m=1}^{M}{\Omega }\left({h}_{m}\right)$$
(5)
$$\Omega\left(h\right)=\gamma T+\frac12\lambda\left\|w\right\|^2$$
(6)

where \(T\) is the number of leaves and \(w\) is the vector of output scores associated with the leaves. XGBoost uses parallel processing, regularization, and tree pruning, which markedly increase regression and classification accuracy compared to plain gradient boosting. These strengths have made XGBoost widely used in regression problems [57].
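To see how the penalty in Eq. (6) acts, consider a single leaf under squared loss. Following the closed-form leaf weight of [48], \(w^{*}=-G/\left(H+\lambda \right)\), where G and H are the sums of the first and second loss derivatives over the examples in the leaf. A short illustrative sketch (the residual values are hypothetical):

```python
import numpy as np

def leaf_weight(residuals, lam):
    # Squared loss: g_i = -r_i and h_i = 1, so w* = -G / (H + lambda)
    G = -residuals.sum()
    H = float(len(residuals))
    return -G / (H + lam)

r = np.array([2.0, 3.0, 4.0])        # residuals falling into one leaf
w_plain = leaf_weight(r, lam=0.0)    # no penalty: the plain leaf mean, 3.0
w_reg = leaf_weight(r, lam=3.0)      # lambda shrinks the leaf score toward zero
```

The larger \(\lambda\) is, the more the leaf output is shrunk, which is how the regularization term tempers individual trees.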

2.2 Ordinary Kriging (OK)

Kriging is a widely used spatial estimation method that assigns weights to neighboring data to make estimates at unsampled locations.

Estimation with ordinary kriging depends on fitting a variogram model to the experimental variogram values, which are calculated for separation distance h as:

$$\gamma \left(h\right)=\frac{1}{2n}\sum _{i=1}^{n}{\left({Z}_{i}-{Z}_{i+h}\right)}^{2}$$
(7)

where \(\gamma \left(h\right)\) is the experimental variogram value at distance h and n is the number of pairs used in the calculation. In resource estimation, experimental variograms are generally calculated in the horizontal and vertical directions. In the horizontal plane, experimental variograms are calculated in four directions, at azimuths 0°, 45°, 90°, and 135°. The next step is to fit a model to the experimental variogram; the most commonly used model is a nested combination of a spherical structure and a pure nugget effect. Once a model variogram is fitted, estimation can be made with the kriging approach. Despite its widespread use, kriging has limitations in spatial estimation.
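As a concrete illustration, an omnidirectional experimental variogram (the conventional squared-difference form, with pairs binned by separation distance) can be computed as follows; the coordinates, values, and lag settings are purely hypothetical:

```python
import numpy as np

def experimental_variogram(coords, values, lags, lag_tol):
    """Omnidirectional experimental variogram: mean of 0.5*(Z_i - Z_j)^2
    over all pairs whose separation falls within lag_tol of each lag."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    half_sq = 0.5 * (values[:, None] - values[None, :]) ** 2
    iu = np.triu_indices(len(values), k=1)          # count each pair once
    d, half_sq = d[iu], half_sq[iu]
    gamma, n_pairs = [], []
    for h in lags:
        in_bin = np.abs(d - h) <= lag_tol
        gamma.append(half_sq[in_bin].mean() if in_bin.any() else np.nan)
        n_pairs.append(int(in_bin.sum()))
    return np.array(gamma), np.array(n_pairs)

# hypothetical 2-D data: structure along X plus a small nugget-like noise
rng = np.random.default_rng(1)
coords = rng.uniform(0, 100, (200, 2))
values = np.sin(coords[:, 0] / 20) + rng.normal(0, 0.1, 200)
gamma, n = experimental_variogram(coords, values, np.arange(5, 55, 10), lag_tol=5.0)
```

For spatially continuous data such as these, the computed variogram rises from a small short-range value toward the larger variability seen at long separations.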

One of the main limitations of kriging is that it requires estimation of a variogram model representing the spatial continuity of the target variable before unsampled locations can be estimated. Variogram modelling is subjective and often requires considerable experience. In addition, the variogram is sensitive to deviations from normality or symmetric distributions [58]. This sensitivity arises from its inherent reliance on squared differences: even a single outlier can significantly distort the experimental variogram, because it may participate in numerous paired comparisons across several, or even all, lag intervals [59]. Another limitation is that, while kriging theory assumes an infinitely large study area, practical applications are confined to finite domains, often delineated by geological boundaries. This discrepancy can manifest as the “string effect” when kriging along linear features such as drillholes. Due to the limited spatial extent of the domain, samples located at the ends of these linear features have fewer neighboring data points, so the kriging system assigns them greater weight, potentially leading to biased estimations, particularly in non-stationary domains where the spatial characteristics of the variable exhibit systematic variations. This effect becomes especially critical when the end and central samples within the drillhole data differ inherently [60, 61].

3 Case Study

3.1 Drillhole Data and Compositing

To estimate the Fe grade of a deposit located in Türkiye, 30 drillholes were drilled with an average horizontal spacing of 45 m. All drillholes are vertical except one with −75° inclination and 154° azimuth. The total length of all drillings is 10,650 m, with an average of 355 m per hole. A total of 1010 samples were collected from the drillholes, with an average length of approximately 1 m and lengths varying between 0.5 and 2 m. Each sample is represented by Easting (X), Northing (Y), Elevation (Z), and Fe (%) content. Due to the confidentiality agreement with the company that owns the deposit, no further information, such as the location and name of the deposit, can be given. Plan and oblique views of the drillholes are shown in Fig. 1.

Fig. 1
figure 1

a Plan and b oblique view of the drillhole traces

Compositing is a well-known and standard technique in grade estimation when data have unequal sampling lengths. All samples are composited into 1 m lengths using a length-weighted approach. The acceptance rate for composites is taken as 50%, meaning that composites shorter than 50 cm are considered short composites and discarded from the composite dataset. Only two short composites were discarded, and the remaining 1058 composites are used in the estimations. The relative frequency distribution of the Fe grades of the composites is shown in Fig. 2.
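The length-weighted compositing rule described above can be sketched as follows. This is a simplified single-hole, down-hole-depth version, and the sample intervals and grades are hypothetical:

```python
def composite(samples, comp_len=1.0, min_len=0.5):
    """Length-weighted compositing of (from, to, grade) intervals into
    fixed-length composites; composites shorter than min_len are discarded."""
    end = max(t for _, t, _ in samples)
    composites = []
    top = 0.0
    while top < end:
        bot = min(top + comp_len, end)
        length = grade_x_len = 0.0
        for f, t, g in samples:
            overlap = min(t, bot) - max(f, top)   # sample length inside interval
            if overlap > 0:
                length += overlap
                grade_x_len += overlap * g
        if length >= min_len:                     # the 50% acceptance rule
            composites.append((top, bot, grade_x_len / length))
        top += comp_len
    return composites

# hypothetical samples: 1.5 m at 40% Fe, 0.5 m at 50% Fe, 1.7 m at 46% Fe
comps = composite([(0.0, 1.5, 40.0), (1.5, 2.0, 50.0), (2.0, 3.7, 46.0)])
```

The second composite (1 to 2 m) mixes 0.5 m at 40% with 0.5 m at 50%, giving a length-weighted 45% Fe, and the final 0.7 m composite is kept because it exceeds the 0.5 m acceptance threshold.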

Fig. 2
figure 2

Distribution of Fe composites

As seen in Fig. 2, Fe grades show a negatively skewed distribution. Nearly 85% of the data lies between 45 and 50% Fe. Only 3 of the composites have Fe grades higher than 50%; these are isolated occurrences that do not show spatial continuity.

3.2 Estimations

Estimations are performed with both XGBoost and ordinary kriging. These methods demand different steps to estimate the ore grade distribution. All estimations are performed on a block model that represents the mineralization with 5 × 5 × 5 m blocks in the X, Y, and Z directions, consisting of 29,262 blocks. Estimations are made at the midpoints of these blocks and are taken as the Fe grade of the corresponding block. The estimation steps specific to each method are explained in detail in the following sections.

3.2.1 Estimation with XGBoost

The input data are the X, Y, and Z coordinates of the composites, and the output is the Fe analysis. Estimation with XGBoost requires selection of the algorithm's hyper-parameters. In general, the default parameters provided by the XGBoost package are not the best option: the performance of XGBoost is sensitive to the selected parameters, and inappropriate parameters result in unacceptable estimates. For this reason, some parameters should be tuned to estimate the Fe distribution of the deposit. In machine learning, data splitting is done to avoid possible over-fitting; the composite dataset was randomly divided into training and test sets of 80% and 20% of all data, respectively.

To find acceptable parameters, the grid search methodology is used. Grid search is easy to implement and understand [62]: it is an optimization method that tries all possible combinations of the given parameters, and the combination that gives the best result according to the performance criterion is selected. Many alternatives exist for parameter tuning of ML algorithms [63,64,65,66]. While grid search is computationally intensive, it offers the advantage of exhaustive exploration of the parameter space; this guarantees evaluation of all possible combinations, potentially leading to superior results [64, 67]. In this study, root mean square error (RMSE) is the performance criterion, and the combination with the lowest RMSE is considered the best option. Relying only on grid search may result in overfitting, which is undesirable in estimation with ML in general. Cross-validation techniques such as train/test split, K-fold, stratified K-fold, and leave-one-out can be used to mitigate overfitting, although completely avoiding it is impossible. In this study, the K-fold cross-validation technique is used: the data points are split into K equal-sized subsets, called folds, and each subset in turn is used to test the performance of a model trained on the remaining subsets. To tune the parameters of XGBoost, the grid search parameters given in Table 1 are used with K-fold cross-validation (K = 10).

Table 1 Parameters that are tuned in XGBoost estimations (Default parameters provided by Python XGBoost package are shown in bold)

The parameter ranges in Table 1 are determined as follows. The eta range starts from 0.01, a value close to zero, and its upper boundary is 1, the maximum value that eta can take. The max_depth parameter starts from 3, since smaller values prevent the model from fitting in practice, and its maximum is set to 15, since increasing tree depth increases the possibility of overfitting. The min_child_weight parameter is the minimum weight required to create a new node; its range is selected between 1 and 10, where 1 is the minimum integer value the parameter can take. An upper boundary higher than 10 generally smooths the estimation results, because large groups in the leaf nodes prevent the algorithm from capturing values at the upper and lower tails of the dataset. By nature, the subsample parameter can take values between 0 and 1. In ensemble learning methods like XGBoost, the subsample ratio typically ranges from 0.5 to 1.0 during decision tree construction; this parameter governs the proportion of training data points used to grow each individual tree, so a subsample ratio of 0.5 means that half of the training data is randomly sampled before building each tree. To prevent overfitting, the subsample range is set between 0.5 and 1. The parameter combination visited in grid search with the lowest RMSE is taken as the best alternative. The average RMSE in K-fold cross-validation was 0.68. The fitted model was then used to estimate the test data, whose values are already known; the RMSE between the test data and the predictions was 0.69, very close to the cross-validation RMSE. As an alternative to RMSE, mean absolute error (MAE) was also monitored during parameter tuning; the average MAE was 0.67 and 0.65 for K-fold cross-validation and test data, respectively, the lowest values among the alternatives. This can be interpreted as the model generalizing well enough to be used in estimation.
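The grid search plus K-fold loop can be sketched generically as below. To keep the example dependency-free, a simple nearest-neighbour regressor with one hypothetical hyper-parameter stands in for XGBoost, and the data and search grid are synthetic stand-ins for the composite coordinates and Fe grades:

```python
import itertools
import numpy as np

def knn_predict(Xtr, ytr, Xq, k):
    """Mean of the k nearest training targets for each query point."""
    d = np.linalg.norm(Xtr[:, None, :] - Xq[None, :, :], axis=-1)
    nearest = np.argsort(d, axis=0)[:k]
    return ytr[nearest].mean(axis=0)

def kfold_rmse(X, y, k, K=10, seed=0):
    """Average test RMSE over K folds for a given hyper-parameter k."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(y)), K)
    mses = []
    for i in range(K):
        test = folds[i]
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred = knn_predict(X[train], y[train], X[test], k)
        mses.append(np.mean((pred - y[test]) ** 2))
    return float(np.sqrt(np.mean(mses)))

# hypothetical stand-ins for composite coordinates (X, Y, Z) and Fe grades
rng = np.random.default_rng(2)
X = rng.uniform(0, 100, (300, 3))
y = 45 + 3 * np.sin(X[:, 0] / 15) + rng.normal(0, 0.5, 300)

grid = {"k": [1, 3, 5, 10]}            # hypothetical one-parameter search grid
scores = {c: kfold_rmse(X, y, *c) for c in itertools.product(*grid.values())}
best = min(scores, key=scores.get)     # lowest cross-validated RMSE wins
```

With more parameters, `itertools.product` enumerates the full Cartesian grid, which is what makes grid search exhaustive but computationally intensive.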
Unlike the classical kriging method, most machine learning algorithms do not provide a statistical measure of the reliability of the estimation results; in kriging, even though it has limitations, the estimation error variance is used to assess the uncertainty of the estimates, and XGBoost is no exception among ML algorithms in lacking such a tool. In this study, estimation results are therefore assessed using swath plots of the XGBoost estimates and composites with respect to the X, Y, and Z directions. Visual checks of the swath plots show that XGBoost was able to capture both random and structural variation, which also indicates that the XGBoost estimates captured the anisotropy. All estimations are performed with Python 3.10 code written and run on a computer with 64 GB RAM and a 24-core CPU.

Complexity analysis is a formal technique for assessing the resource consumption of an algorithm. It characterizes the relationship between input size and execution time, independent of the specific hardware platform, programming language, or compiler used, and allows different algorithms solving the same problem to be compared in terms of their relative efficiency as the input size grows. In the seminal work proposing the XGBoost algorithm [48], the complexity of training the original algorithm is O(K d ||X||0 log n), where ||X||0 is the number of non-missing entries in the training data, d is the maximum depth of a tree, and K is the total number of trees. In the prediction step, the complexity is O(K d). On the current computer configuration, estimation of 26,291 blocks took only 3.99 × 10−3 s. To assess the performance of the algorithm further, 2 million artificial blocks were generated and estimated with the fitted model, which took 0.12 s. This means that, with the current computer configuration, XGBoost is able to handle large numbers of blocks in estimation.

3.2.2 Estimation with Ordinary Kriging

To estimate the variogram in 3D, vertical and horizontal experimental variograms are calculated from the composite values using Netpromine software. In the horizontal direction, experimental variograms are calculated at azimuths of 0°, 45°, 90°, and 135° to detect possible anisotropy, using a lag distance of 50 m, a lag tolerance of 25 m, and a tolerance angle of 22.5°. All experimental variograms showed similar behavior in the different directions. Due to this isotropic behavior, an omnidirectional horizontal variogram is calculated to fit the underlying variogram model. Variogram modelling continued with the vertical experimental values, calculated with a lag distance of 1 m and a tolerance angle of 5°. The model variogram is selected as a nested model consisting of a nugget effect and one spherical structure (Table 2).

Table 2 Variogram model used in estimation

To assess the acceptability of the fitted variogram model, cross-validation is used. In grade estimation with kriging, leave-one-out cross-validation (LOOCV) is the standard technique for assessing the acceptability of the estimation results. In LOOCV, each datum and its location are removed one at a time, and the grade at that location is estimated with the remaining data. From another perspective, LOOCV can be considered a specialized form of K-fold cross-validation with K set to the number of composites in the dataset. Cross-validation ends when all locations have been visited and the grades at those locations have been estimated with the selected model variogram and search ellipsoid parameters. In kriging, only neighboring data are used, to avoid excess smoothing; the neighboring data are those falling within an ellipsoid centered at the estimation point. The axis lengths of the ellipsoid are usually chosen slightly larger than the variogram ranges in the horizontal and vertical directions. When all data points have been visited, two values are available at each location: the estimated and the actual value. The difference between the measured and estimated values is called the residual and is calculated as follows:

$${x}_{residual}= x-{x}^{{\prime }}$$
(7)

where \(x\) is the data value and \({x}^{{\prime }}\) is the value estimated during cross-validation. The statistics of these residuals (kriging errors) provide insight into the acceptability of the underlying variogram model. The mean of the residuals should be as close to zero as possible, while the percentage of errors within two standard deviations (PEWTSD), which measures the spread of the residuals about their mean, should be higher than 95%. PEWTSD is calculated as:

$$\text{P}\text{E}\text{W}\text{T}\text{S}\text{D}= \frac{n}{Number\;of\;data}*100$$
(8)

where n is the number of residuals lying between the mean of the residuals ± 2 × the standard deviation of the kriging errors.
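Eq. (8) is straightforward to compute; a short sketch with hypothetical cross-validation residuals:

```python
import numpy as np

def pewtsd(residuals):
    """Percentage of residuals within two standard deviations of their mean."""
    m, s = residuals.mean(), residuals.std()
    n = np.count_nonzero(np.abs(residuals - m) <= 2 * s)
    return n / len(residuals) * 100

rng = np.random.default_rng(3)
res = rng.normal(0.0, 0.7, 1058)   # hypothetical cross-validation residuals
p = pewtsd(res)                    # roughly 95% for near-normal residuals
```

For approximately normal residuals the value lands near 95%, which is why the 95% threshold is a natural acceptance criterion.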

LOOCV is performed to assess the acceptability of the variogram model given in Table 2. A search ellipsoid with radii of 250 m and 30 m in the horizontal and vertical directions is selected. Summary statistics of the residuals are given in Table 3.

Table 3 Summary statistics of kriging cross-validation residuals

As seen from Table 3, the mean of the kriging errors is close to zero, which means that the estimations are unbiased. Moreover, 99.90% of the errors lie within two standard deviations of the mean of the residuals. The cross-validation results therefore show that the proposed variogram model can be used in the estimations.
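The LOOCV loop described above can be sketched with a minimal hand-rolled ordinary kriging solver. The spherical covariance model, synthetic data, and parameter values below are all hypothetical, and real workflows would use dedicated software as done in this study:

```python
import numpy as np

def spherical_cov(h, nugget, sill, rang):
    """Covariance implied by a nugget + spherical variogram model."""
    c = np.where(h < rang, sill * (1 - 1.5 * h / rang + 0.5 * (h / rang) ** 3), 0.0)
    return np.where(h == 0.0, nugget + sill, c)

def ok_estimate(coords, values, target, nugget, sill, rang):
    """Ordinary kriging estimate at one target point (all data as neighbours)."""
    n = len(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = spherical_cov(d, nugget, sill, rang)
    A[n, n] = 0.0                                 # unbiasedness constraint
    b = np.ones(n + 1)
    b[:n] = spherical_cov(np.linalg.norm(coords - target, axis=-1),
                          nugget, sill, rang)
    w = np.linalg.solve(A, b)[:n]                 # ordinary kriging weights
    return float(w @ values)

def loocv_residuals(coords, values, **model):
    """Remove each datum in turn, re-estimate it, return measured - estimated."""
    idx = np.arange(len(values))
    est = [ok_estimate(coords[idx != i], values[idx != i], coords[i], **model)
           for i in idx]
    return values - np.array(est)

rng = np.random.default_rng(4)
coords = rng.uniform(0, 100, (80, 2))
values = 45 + np.sin(coords[:, 0] / 20) + rng.normal(0, 0.2, 80)
res = loocv_residuals(coords, values, nugget=0.04, sill=0.5, rang=40.0)
```

With a reasonable variogram model the residual mean sits near zero and the residual spread is well below the raw data spread, which is exactly what the Table 3 checks look for.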

After the fitting and cross-validation steps, estimations are performed with a moving neighborhood defined by an ellipsoid with radii of 250 m and 30 m in the horizontal and vertical directions, respectively. A maximum of the 16 closest data are used in each estimation to mitigate over-smoothing. The geometry of the search ellipsoid and the conditioning data were sufficient to perform the estimations in a single pass.

4 Results and Discussion

In spatial estimation of ore grade, the statistics of the estimation results are expected to be as close as possible to the statistics of the composites. However, the estimates are generally smoother, a phenomenon well known in geostatistical estimation [68]. In other words, the highest estimated value is expected to be lower than the highest composite value, and the lowest estimated value is expected to be higher than the lowest composite value. As a result of smoothing, the variance of the estimates is lower than the variance of the composites. Due to the unbiasedness property of the estimators, the averages of the estimates and of the composite values are expected to be close to each other.

Results of the estimations can be assessed by using summary statistics and cross-plots [1, 69]. Summary statistics of the composites, kriging, and XGBoost are given in Table 4.

Table 4 Summary statistics of composites, kriging, and XGBoost estimates

As seen from Table 4, kriging and XGBoost both produced estimation averages similar to the composite mean, which shows that both estimation methods produced unbiased results. For both methods, the absolute deviation of the estimation mean from the composite mean was less than 0.2%, a negligible deviation. The estimation range of XGBoost is 13% higher than that of the kriging estimates, where the range percentage is calculated as follows:

$$Estimation\;range\;\%=\frac{\left(max\left(XGBoost\right)-min\left(XGBoost\right)\right)-\left(max\left(OK\right)-min\left(OK\right)\right)}{max\left(OK\right)-min\left(OK\right)}\ast100$$
(9)

As expected, the well-known smoothing effect is observed in both estimation results: both methods produced smoother estimates, with variances notably lower than the variance of the composites. XGBoost produced estimates with higher variation than OK, consistent with its higher estimation range. The comparison of the estimates continues with a cross-plot of the kriging estimates against the XGBoost estimates (Fig. 3).
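Eq. (9) amounts to a one-line computation; the estimate values below are hypothetical, chosen simply to reproduce a 13% widening:

```python
def estimation_range_pct(xgb_est, ok_est):
    """Relative widening of the XGBoost estimate range over the kriging range."""
    r_xgb = max(xgb_est) - min(xgb_est)
    r_ok = max(ok_est) - min(ok_est)
    return (r_xgb - r_ok) / r_ok * 100

ok = [44.0, 46.2, 49.0]       # hypothetical kriging estimates, range 5.0
xgb = [43.7, 46.5, 49.35]     # hypothetical XGBoost estimates, range 5.65
pct = estimation_range_pct(xgb, ok)
```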

Fig. 3
figure 3

Cross-plot of kriging estimates vs. XGBoost estimates (dotted black line represents linear regression line, for clarity only randomly selected 10% of the estimation data are shown)

As seen from Fig. 3, the estimation results show a moderately positive linear association, with a Pearson correlation coefficient of 0.63. One reason for the moderate correlation is that the estimation range of XGBoost is higher than that of kriging: at the extreme points of the XGBoost estimates, kriging produced smoother estimates that resemble each other, which weakens the linear relation between the two sets of estimates. Another reason is that the variance of the XGBoost estimates is higher, as seen from Table 4: even though the mean and median values of both sets of estimates are similar, the XGBoost estimates are more variable in a spatial sense, while kriging produced less variable results. To continue the assessment, residuals are calculated by subtracting the XGBoost estimates from the kriging estimates. The histogram and summary statistics of the residuals are shown in Fig. 4.
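The correlation and residual comparison can be reproduced generically as below; the two estimate arrays are synthetic stand-ins, not the actual block estimates of this study:

```python
import numpy as np

rng = np.random.default_rng(5)
ok_est = 45 + rng.normal(0.0, 0.8, 26291)        # stand-in kriging estimates
xgb_est = ok_est + rng.normal(0.0, 0.7, 26291)   # stand-in XGBoost estimates

pearson_r = np.corrcoef(ok_est, xgb_est)[0, 1]   # strength of linear association
residuals = ok_est - xgb_est                     # kriging minus XGBoost
within_1 = np.mean(np.abs(residuals) <= 1) * 100 # % of residuals in [-1, 1] Fe (%)
```

The larger the extra variability of one estimate set relative to the other, the lower the Pearson coefficient and the heavier the residual tails, mirroring the moderate correlation observed here.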

Fig. 4
figure 4

Residuals of XGBoost estimates

As seen from Fig. 4, the residuals follow a normal distribution with an average close to zero, and maximum and minimum values of 4.88 Fe (%) and −5.25 Fe (%), respectively. As expected for a normal distribution, the histogram of the residuals is light-tailed: the number of data in the tails is much lower than at the center of the histogram, and 80.3% of the residuals lie between −1 and 1 Fe (%). This is the expected result given that the correlation between the estimates is moderate. Among the 26,291 block estimates, only two residuals in the upper tail were higher than 4 Fe (%) and three in the lower tail were lower than −4 Fe (%), representing only 0.02% of the data.

Kriging is an industry standard that requires steps such as variographic analysis, determination of possible anisotropy, and model variogram fitting in grade estimation. An expert is needed to perform these time-consuming and troublesome steps, and the estimation results generally rely on the knowledge and experience of that expert. In grade estimation, the XGBoost algorithm can be a good alternative to industry-standard kriging, since estimations can be performed with only input-output data pairs, consisting of X, Y, and Z coordinates and the grade measurements at those locations, respectively. In XGBoost estimation, grade continuity is captured implicitly by the algorithm, mainly through its hyper-parameters. The results of the estimations therefore depend on selecting parameter values appropriate to the given dataset.

5 Conclusions

In this paper, the XGBoost algorithm was used to estimate the Fe grade of a deposit. This study is the first in the literature in which spatial estimation of Fe grade is made using the XGBoost algorithm. For comparison, the kriging method was used. The results show that XGBoost can be used in grade estimation as an alternative to kriging. XGBoost produced a higher estimation range than kriging, which is desirable in grade estimation for an ore deposit. Consistent with the higher estimation range, the standard deviation of the XGBoost estimates was higher than that of the kriging estimates. The XGBoost estimates were moderately correlated with the kriging estimates. Like kriging, XGBoost suffers from smoothing: the standard deviation of its estimates was significantly lower than that of the composites. XGBoost requires hyper-parameter tuning to reach an acceptable level of estimation, since the default parameters are not the best option for estimating the grade distribution. For hyper-parameter tuning, grid search can be used, though it can demand substantial computing power.

The current work considers only the composite values of the samples, with X, Y, and Z coordinates as input variables and Fe grade as the target variable. Other attributes, such as alteration degree, possible minor faults, and rock type, are ignored. Further studies assessing the effect of these variables on estimation with the XGBoost algorithm should be conducted. As is well known, XGBoost, like all other ML algorithms, is data hungry and requires a large amount of data to establish the relation between input and output. It is not possible to know in advance how much data is required to make a reliable grade estimation with an XGBoost model. When the available data are insufficient for reliable estimation with the XGBoost algorithm, additional data must be collected from the field, which requires additional drilling. Such drilling involves costly and time-consuming steps, such as planning drillhole locations, depths, and inclinations, sampling, and chemical analysis of the samples, which stands as a disadvantage of using the XGBoost algorithm in grade estimation.