1 Introduction

In mining, accurate ore grade estimation is critical as it influences mine planning [1], logistics, and product reliability. In stratified ore deposits, such as banded iron formation (BIF) hosted iron ore deposits, accurate boundary estimation is a prerequisite for high-fidelity ore grade estimation. Poor boundary modelling can result in the inclusion of ore in a waste region or vice versa. This has a deleterious effect on ore grade estimates, due to either lower ore recovery or ore dilution, and is equivalent to including bad or incorrect data in the model. Reduced fidelity can result from either poor boundary position estimates or too coarse a tessellation of the model surface. The coarseness of the tessellation introduces a trade-off between computational cost and model fidelity.

While much work has been done on implicit modelling methods for geological boundaries [2, 3], considerably less has been done on probabilistic methods for modelling boundaries. Neves et al. [4] considered using geochemical data rapidly obtained from portable XRF devices to update potentially out-of-date grade estimates, in what is termed real-time mining. Their method models the uncertainty of XRF measurements by considering their conditional distribution given confident laboratory assays (hard and sparse data) derived from exploration holes. A distinguishing feature of our paper is that we focus on the location of geological boundaries rather than grade estimation per se; additionally, our approach is based on Gaussian Processes rather than stochastic simulation.

This study builds on previous work on the creation of probabilistic boundaries from multiple data types, each having a different level of noise. To create boundaries that are easily updated as more data becomes available, it builds on the Gaussian Process (GP) probabilistic boundary estimation framework described in [5, 6] and proposes several changes to address issues that arise from a tile-based implementation. The main issues to be resolved are the tendency of the GP to overfit and tiling artefacts. The former produces contorted surfaces (unreasonable boundary estimates), particularly in regions devoid of input or labelled data. This paper demonstrates that supplying a priori data that properly constrains the solution space (e.g. conservatively indicating where a boundary should not occur) can alleviate boundary distortion. The latter is a manifestation of boundary effects, which occur when local inference regions have different rotations due to variations in the overall directional trend of the surface and there is no smoothness guarantee between adjacent tile-regions. There are many possible solutions to this problem; one is to introduce a form of weighted transition between adjacent regions. While we focus on one boundary estimation process in this work, some of the processes defined below can be applied to other boundary estimation and update models [7, 8]. These contributions are demonstrated on the banded iron formation-hosted iron ore deposits in the Hamersley Province of Western Australia.

The specific contributions of this work are as follows: (1) The inclusion of a priori data, allowing for the incorporation of domain expertise into the boundary modelling process, preventing the generation of some surface artefacts. (2) A heuristic for the labelling of unassayed production holes, improving boundary modelling accuracy. This further increases the incorporation of domain expertise into the modelling process through the augmentation of available data. (3) The conversion of a set of local rotation calculations defined in [5] to a global rotation model. This allows for the interpolation and extrapolation of rotation transformations across locally modelled sub-regions, providing correlation between sub-region rotations, and preventing artefacts from being introduced by the modelling process.

2 Geology

The data used in this study is from two typical Brockman style BIF hosted iron ore deposits from the Hamersley Region in Western Australia. The Brockman Iron Formation contains two sequences of interbedded BIF and shale bands, the Joffre and Dales Gorge Members, as well as two sequences dominated by shale, chert, and/or carbonate bands, the Yandicoogina and Whaleback Shale Members [9, 10]. In some localised areas the BIF in the Joffre and/or Dales Gorge Members has been enriched to form a high grade iron ore [9, 11, 12]. These deposits contain two distinct types of boundaries, stratigraphic and mineralisation. The stratigraphic boundaries are those that follow the bedding of the sequences, either between two members or internal boundaries between sub-units within a member. These are often designated as occurring at specific shale bands. These boundaries define regions with different source rock, which controls the type of ore produced and therefore some of its physical properties. The other type of boundary is related to the mineralisation. These boundaries indicate the areas that have been impacted by geological events after the source rocks were deposited. Examples include where sections of the BIF have been enriched to form iron ore, or where ore quality has been reduced by a hydration overprint. In this study, the stratigraphic boundary is an internal boundary within the Dales Gorge Member, and the mineralisation boundary is located at the base of the ore where it transitions into unenriched BIF.

The data available consisted of exploration drill holes and production blast holes. The exploration holes are spaced \(\sim \)50 m apart, and are labelled in 2 m intervals. These labels were added manually by geologists based on the chemical assays and geophysical logging. The labels provided information on both the stratigraphy and the mineralisation, and therefore these holes were used for both boundaries. The blast holes are much more closely spaced, 5–10 m apart, and 10–12 m deep. Each blast hole was given a grade label based on a single chemical assay. This only provided information for the mineralisation boundary, not the stratigraphic boundary. As there was only a single label, it was assigned to the midpoint of the hole.

3 Gaussian Processes

In this work, Gaussian Processes are used as a probabilistic non-parametric regression technique. Formally, GPs are a collection of random variables, any finite number of which have a joint Gaussian distribution [13]. A GP is completely defined by its mean function, \(m(\textbf{x})\), and covariance function, \(k(\textbf{x},\textbf{x}^\prime )\), of a real process \(f(\textbf{x})\) as

$$\begin{aligned} m(\textbf{x})&= \mathbb {E}[f(\textbf{x})], \\ k(\textbf{x},\textbf{x}^\prime )&= \mathbb {E}[(f(\textbf{x}) - m(\textbf{x}))(f(\textbf{x}^\prime ) - m(\textbf{x}^\prime ))^T], \end{aligned}$$

allowing for the GP to be written as

$$\begin{aligned} f(\textbf{x}) \sim \mathcal{G}\mathcal{P}(m(\textbf{x}),k(\textbf{x},\textbf{x}^\prime )). \end{aligned}$$

In this implementation, GPs are used to compute the mean and variance at each point within a regular 3D mesh of points that covers the region of interest. Details of how GPs are applied in this work are covered in the relevant sections. We encourage the interested reader to refer to [6, 13] for a deeper explanation of GPs.
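As an illustration of how a mean and variance can be obtained at each mesh point, the following is a minimal sketch of zero-mean GP regression in NumPy. It assumes a squared-exponential covariance with fixed hyperparameters and labels of 1 (above the boundary) and 0 (below); the function names, the kernel choice and the parameter values are assumptions made for this example and are not taken from [5, 6].

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between the rows of a (n, d) and b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return signal_var * np.exp(-0.5 * sq_dist / length_scale ** 2)

def gp_predict(x_train, y_train, x_query, noise_var=1e-2, **kernel_args):
    """Posterior mean and variance of a zero-mean GP at each row of x_query."""
    k_tt = sq_exp_kernel(x_train, x_train, **kernel_args)
    # noise_var may be a scalar or one value per training point
    k_tt[np.diag_indices(len(x_train))] += noise_var
    k_qt = sq_exp_kernel(x_query, x_train, **kernel_args)
    k_qq_diag = np.diag(sq_exp_kernel(x_query, x_query, **kernel_args))
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_qt.T)
    mean = k_qt @ alpha
    var = k_qq_diag - np.sum(v ** 2, axis=0)
    return mean, var

# Three labelled ENU points (assumed labels: 1 = above, 0 = below the boundary)
x_train = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
y_train = np.array([1.0, 1.0, 0.0])
mesh = np.array([[0.5, 0.5, 0.0], [50.0, 50.0, 50.0]])
mean, var = gp_predict(x_train, y_train, mesh, noise_var=1e-6)
```

Far from the training points the predicted mean returns to the prior mean of zero and the variance returns to the prior signal variance, which is the behaviour that motivates the a priori data discussed in Sect. 4.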

4 A Priori Data

Automated boundary estimation models are generally data driven [14] and make minimal use of geological domain expertise. Examples of domain expertise include an understanding of the relationship between different surfaces, especially stratigraphic surfaces, and of the trend of surfaces outside the data range [15, 16]. By including a priori data, that is, data that encapsulates geological expertise, it is possible to incorporate domain knowledge about the underlying surface being modelled and its relationship to other surfaces. This knowledge takes the form of a constraint defining which regions are above the surface and which are below it, rather than where the surface is. Perez et al. [17] define high-order training images and present a way of evaluating them against the observed data. Here, we instead derive constraint data positioned above or below the data that defines the transition from below to above the surface. The inclusion of this data informs the boundary modelling process and assists in generating a boundary estimate that has an appropriate trend in the absence of data.

Further to this, the presence of a priori data also prevents surface artefacts that can result from GP-based boundary estimates [5]. These artefacts arise when inferring model values at data-sparse locations, due to the tendency of the GP mean to trend towards zero there. An example of such an artefact and how it is managed is shown in Fig. 1. In Fig. 1a, the boundary has been estimated without any a priori data. Marching cubes is used to approximate a surface separating the estimates above 0.5 (considered above the surface) from those below 0.5 (below the surface). With a GP, estimates at points far from the data drift back to the prior mean, in this case 0. This can introduce a fictitious isosurface sufficiently far from the data, where the estimate drifts back below the 0.5 contour level. Such false surfaces are a serious problem, as they incorrectly indicate the location of the boundary and can lead to erroneous interpretations. Including the a priori data prevents the GP mean from tending towards zero at some distance from the provided data, which prevents the generation of these artefacts, as shown in Fig. 1b.

These a priori points are not included when training a model, only when inferencing from the model across the region of interest. The a priori points represent geological expertise, not actual data, and so are useful for indicating where a surface is not (which complements the spatial region where the surface may be). By only including the a priori data in the inferencing step, deference is given to physical observations of the region through exploration hole (or other) data. When modelling, the a priori data is placed ‘sufficiently far’ from the actual data, so as to only guide the generation of a boundary estimate rather than specify it. The a priori data is dithered to reduce ripple-like effects, and its associated noise is set higher than that of the measured observations.
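The effect of a priori points on the GP mean can be illustrated in one dimension, along a single vertical column. The sketch below assumes labels of 0 (below) and 1 (above), a squared-exponential covariance with a fixed length scale of 1 (so the a priori points play no part in hyperparameter training), and hypothetical positions, dither range and noise variances chosen only for the example.

```python
import numpy as np

def kernel(a, b, length_scale=1.0):
    """Squared-exponential covariance between 1D coordinate arrays."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length_scale ** 2)

def posterior_mean(z_obs, y_obs, noise_var, z_query):
    """Zero-mean GP posterior mean with one noise variance per observation."""
    k_oo = kernel(z_obs, z_obs) + np.diag(noise_var)
    return kernel(z_query, z_obs) @ np.linalg.solve(k_oo, y_obs)

# Labelled data down one column: 0 = below the boundary, 1 = above it
z_data = np.array([-1.0, 0.0, 1.0, 2.0])
y_data = np.array([0.0, 0.0, 1.0, 1.0])
noise_data = np.full(4, 0.01)

# Hypothetical a priori points placed well above the data, all labelled
# "above", dithered slightly and given a larger noise variance
rng = np.random.default_rng(0)
z_prior = np.array([4.0, 6.0, 8.0, 10.0]) + rng.uniform(-0.2, 0.2, size=4)
y_prior = np.ones(4)
noise_prior = np.full(4, 0.1)

z_query = np.linspace(-1.0, 10.0, 111)
mean_without = posterior_mean(z_data, y_data, noise_data, z_query)
mean_with = posterior_mean(
    np.concatenate([z_data, z_prior]),
    np.concatenate([y_data, y_prior]),
    np.concatenate([noise_data, noise_prior]),
    z_query,
)
```

Without the a priori points, the mean falls back below 0.5 a short distance above the highest labelled point, which would produce a second, false crossing of the 0.5 contour. With them, the mean stays above 0.5 in the upper part of the column.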

The a priori data can be defined either through a computational policy mechanism or by utilizing a geological estimate of the boundary. With the computational policy used for the results shown in this paper, the introduced data covers gaps above or below the real labelled data, so that the amount by which a surface can diverge is constrained. Where a pre-existing surface is used, points substantially above or below that surface are used to influence mesh estimates where there is no nearby actual data. In either case, this data serves as a guide between which a surface is approximated. How close the a priori data needs to be is a function of the length scales learnt in each modelled sub-region.

The examples shown in this paper contain auto-generated data, where the distance from the actual data is based on the length scales after the section rotation has been applied. However, the distance between the a priori and actual data, and the density, shape, and regularity of the a priori point cloud are all parameters tunable by a geologist, based on the type of information that they wish to encode into the modelling process. An example set of a priori data is shown in Fig. 2. This is a subset of the a priori data that was used to generate the improved surface shown in Fig. 1b.

Fig. 1

A comparison of two boundary estimates. a without using a priori data. b when a priori data is included (see Fig. 2). Blue points indicate data labelled as being above the boundary, while red points are labelled as below the boundary

Fig. 2

Example a priori point clouds can be seen above (blue) and below (red) the available exploration hole data. The a priori is extended to the upper limit of the modelled section and beyond the easting and northing limits of the region to reduce artifacts near the edges

5 Model Building

5.1 Spatial Rotations

In the boundary modelling method proposed by Ball et al. [5], a global region (with coordinates in ENU) is divided into an overlapping set of local sub-regions. Each local sub-region has an approximate trend direction for the boundary computed for that region. That local mine-space region is rotated into an estimation space so that the nominal trend direction of the surface within that estimation space is horizontal. Each estimation space is a separate local GP model. A rotation matrix for each sub-region was obtained via principal component analysis (PCA). These matrices were calculated from the ore-to-waste transition points provided in exploration hole data within some defined neighbourhood of each local sub-region.
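The PCA step can be sketched as follows. This is a minimal illustration rather than the implementation used in [5]: the function name is hypothetical, and the sign conventions (normal pointing upwards, determinant of +1) are assumptions made so that the result is a proper rotation.

```python
import numpy as np

def pca_rotation(points):
    """Rotation from mine space (ENU) into an estimation space in which the
    best-fit plane through `points` (shape (n, 3), n >= 3) is horizontal."""
    pts = np.asarray(points, dtype=float)
    centred = pts - pts.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance;
    # the last row is the direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    rot = vt.copy()
    if rot[2, 2] < 0:            # assumed convention: normal points upwards
        rot[2] *= -1.0
    if np.linalg.det(rot) < 0:   # assumed convention: proper rotation
        rot[0] *= -1.0
    return rot

# Transition points lying on a plane that dips along the easting direction
east, north = np.meshgrid(np.arange(5.0), np.arange(4.0))
pts = np.column_stack([east.ravel(), north.ravel(), 0.5 * east.ravel()])
rot = pca_rotation(pts)
rotated = (pts - pts.mean(axis=0)) @ rot.T   # estimation-space coordinates
```

After rotation the third coordinate of every point is zero, so the nominal trend of the surface is horizontal in the estimation space.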

Learning local rotations in the above manner can lead to significant changes between regions due to the inclusion or exclusion of a small number of data points. PCA cannot produce localized rotations from the entire set of data, so each rotation depends on a hard neighbourhood cut-off: including or excluding a single transition point introduces a stepwise change in the computed rotation, and that change can be significant. In comparison, a GP can model rotations where the dependence on the data is a function of the distance from the point of interest and the learnt parameters, i.e. all data is considered but the local data has a greater impact. To address this, the PCA-based method presented in [5] is replaced with a GP model for rotations, in which the deflection of the normal from vertical is modelled using the normals computed for the surface at the transition points. This allows a continuously varying estimate of the normal across the mining space being modelled, see Fig. 3.

While PCA rotations can change significantly with the inclusion or exclusion of data, a GP model of the rotations may also produce significant differences in rotation direction. Those effects are due to extrapolation of the data (rather than interpolation in the inner regions), the overlap of the regions, and the function used to produce the values for each of the mesh points from the overlapping regions, discussed in Sect. 5.2.

Fig. 3

Proposed change to the boundary modelling process presented in [5]. Steps listed outside of the blue box are performed once for the entire modelling space. The steps listed inside the blue box are performed once per local sub-region. We propose to move the calculation of local rotation transformations from the local sub-region level to the global model level

5.1.1 Rotational Model Construction

The observation points for the model, in ENU (Easting, Northing, Up) coordinates, are the boundary transition points down each exploration hole. At each down-hole transition point, \(t_i\), the closest n transition points were used to calculate a rotation matrix, \(R_i\), using PCA. The transition point is rotated into the estimation space, an upward unit vector is added, and the result is transformed back into the mining space. The difference between \(t_i\) and this back-transformed point yields an approximate normal in the mining space:

$$\begin{aligned} t_i - R_i^{-1}(R_i t_i + (0,0,1)) \end{aligned}$$

Note that as n increases, the smoothness of this rotation space also increases. The x-, y-, and z-components of the unit surface normal were then used as observation values for three different GP models.
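Because \(R_i\) is linear and orthogonal (so that \(R_i^{-1} = R_i^{\top }\)), the expression for the approximate normal given above can be simplified. The short derivation below shows that the result depends only on \(R_i\) and not on the position \(t_i\):

$$\begin{aligned} t_i - R_i^{-1}(R_i t_i + (0,0,1))&= t_i - t_i - R_i^{-1}(0,0,1) \\&= -R_i^{\top }(0,0,1). \end{aligned}$$

This is a unit vector equal, up to sign, to the third row of \(R_i\). Whether the normals are taken to point upwards or downwards is a convention; we assume here that it is applied consistently across all transition points before the components are used as GP observations.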

A GP model was fitted to each of the Easting, Northing and Up components of the unit normals at the transition points in the mining space (ENU). This allows a normal to be estimated at any point within the mining space. The estimated Easting, Northing and Up components are normalized to ensure a unit normal. Given a normal vector \((e_j,n_j,u_j)\), a rotation matrix is computed such that \((e_j,n_j,u_j)=R_j (0,0,1)\), i.e. the vertical unit normal in the modelling space, when rotated back into the mining space, matches the mining space normal. Modelling unit normal rotations using polar coordinates was discounted, as angles are multi-valued and this introduces training ambiguities near multiples of \(2\pi \).
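The condition \((e_j,n_j,u_j)=R_j (0,0,1)\) does not determine \(R_j\) uniquely, since any additional rotation about the normal also satisfies it. One possible choice, sketched below, is the smallest rotation that takes the vertical onto the normal, obtained from Rodrigues' formula. The function name and this choice of rotation are assumptions for illustration and are not stated in the text.

```python
import numpy as np

def rotation_from_normal(normal):
    """Smallest rotation R such that R @ (0, 0, 1) equals the unit normal."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)            # renormalise the GP component estimates
    up = np.array([0.0, 0.0, 1.0])
    cos_angle = float(up @ n)
    if np.isclose(cos_angle, -1.0):
        # Normal points straight down: rotate by 180 degrees about the easting axis
        return np.diag([1.0, -1.0, -1.0])
    axis = np.cross(up, n)               # rotation axis scaled by sin(angle)
    skew = np.array([
        [0.0, -axis[2], axis[1]],
        [axis[2], 0.0, -axis[0]],
        [-axis[1], axis[0], 0.0],
    ])
    return np.eye(3) + skew + skew @ skew / (1.0 + cos_angle)

# Example: unnormalised component estimates from the three GP models
estimate = np.array([0.3, -0.2, 0.9])
rot_j = rotation_from_normal(estimate)
```

Working with the three Cartesian components and renormalising avoids the wrap-around ambiguity of angular representations, at the cost of fitting three GP models instead of two.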

5.2 Region Overlap

Each sub-region is nominally of fixed size and mesh resolution. A region is computed with an overlapping set of transition and a priori data and a 3D mesh of estimation points (in ENU mining space). However, the rotations applied in adjacent regions may differ. Furthermore, the length scales learnt for each region's model from the transition data may vary. Estimates at 3D mesh points that lie in the overlap between regions are merged into a single value at each of those points. There are several approaches to computing the merged value. The methods explored are forms of weighted averaging, where the weights are based on:

  • Scaled distance from the start to the edge of the overlap, i.e. moving outwards from the centre of a region, the first point of overlap is given a weighting close to 1 and the furthest point of overlap is given a weighting close to 0.

  • Inverse distance between an estimation point for a sub-region and the centre of that estimation's sub-region.

  • Inverse Manhattan distance between an estimation point for a sub-region and the centre of that estimation's sub-region.

  • Inverse variance computed for an estimation point within a sub-region by the GP modelling that sub-region.

Other methods are possible. The first method is generally the most robust and is much less susceptible to dramatic variation between adjacent sections.
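The first weighting scheme can be sketched in one dimension for two adjacent regions that overlap along the easting axis. The linear ramp, the function names and the example coordinates are assumptions for illustration; the text only specifies that the weight is close to 1 at the start of the overlap and close to 0 at its far edge.

```python
def ramp_weight(x, overlap_start, overlap_end):
    """Weight falling linearly from 1 at overlap_start (the overlap edge
    nearest the region's centre) to 0 at overlap_end, clamped to [0, 1]."""
    t = (x - overlap_start) / (overlap_end - overlap_start)
    return min(1.0, max(0.0, 1.0 - t))

def merge_estimates(x, value_a, value_b, overlap_lo, overlap_hi):
    """Weighted average of two regional estimates at coordinate x.
    Region A's centre lies below overlap_lo; region B's lies above overlap_hi."""
    w_a = ramp_weight(x, overlap_lo, overlap_hi)
    w_b = ramp_weight(x, overlap_hi, overlap_lo)
    return (w_a * value_a + w_b * value_b) / (w_a + w_b)

# Region A covers eastings 0-60 m and region B covers 40-100 m (hypothetical)
merged_mid = merge_estimates(50.0, value_a=0.8, value_b=0.2,
                             overlap_lo=40.0, overlap_hi=60.0)
```

With this scheme each region's estimate dominates near its own interior and the merged value changes continuously across the overlap, which is consistent with the robustness noted above.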

5.3 Mesh Resolution

The surface is modelled indirectly by a set of points, where each estimation point has a value representing which side of the surface that point likely lies on, see [5]. The marching cubes algorithm of Lewiner et al. [18] is then used to find the surface that cuts between the two sides.

Mesh granularity affects the accuracy of the surface produced: the finer the mesh, the more accurate the surface. Surface fidelity is affected both by the accuracy of the estimates and by the coarseness of the tessellation that marching cubes produces from them. As the mesh becomes coarser, accuracy declines and additional artefacts become noticeable, in particular an apparent stair-casing, which can be seen in the centre sections of Fig. 4c, d. Consider, for instance, a mesh resolution of one full bench height: at this resolution, surfaces that run at a shallow angle to the bench will run horizontally for a while, then step up or down to the next bench and run horizontally again. This can be controlled by varying the resolution of the mesh (at the cost of computation). It is also worth noting that in a geological model the space is often divided into blocks of varying sizes. The minimum viable block size is related to the physical characteristics of the diggers employed at a site and the amount of material movement normally seen during blasting. While a finer mesh may look better, beyond a certain point it has little or no practical value.

5.4 Model Evaluation

Results from the use of a global rotation model on a mineralised boundary and a stratified boundary are compared to boundaries generated using the original local rotation calculations (Fig. 4). The two different boundary types are modelled in two spatially different locations within the Pilbara region of Western Australia.

By having a global rotation model, the rotation transformations for local sub-regions are now spatially correlated. From Fig. 4 we can see that this ensures a level of similarity between neighbouring sub-regions and less dramatic artefacts between them, as can be seen when comparing Fig. 4c, d.

To compare model performance, we calculated reconciliation values for tonnes for two relevant portions of the mine. Reconciliation values compare what a particular model predicted with what was extracted. In this case we produced three GP estimation models for the area. The first model (Fusion) was created using the boundaries produced by this work, including both the a priori data and the GP rotation. The second model (Warping) was created using boundaries produced by Bayesian surface warping, which reduces inaccuracies in a modelled boundary with respect to new assay observations via displacement likelihood estimation [7]. The third model (Exploration) was created using the original exploration-based boundary surfaces. The reconciliation values were calculated by comparing estimates from the GP models to the values calculated for the same region using the production hole data. More details on the reconciliation procedure and the differences between bench-within and bench-below prediction are given in [7]. A value closer to 0 indicates a better prediction. At the bench-within level, where the composition of the lowest bench containing blast hole data is predicted, the proposed model outperforms the other models on all grade block categories (Tables 1 and 2). When predicting on the bench below the available data, our model consistently outperforms the original model based on the exploration holes and has comparable performance to the surface warping model (Tables 3 and 4).

Fig. 4

Comparison of boundary estimates from modelling pipelines that use two different methods for determining local region rotational transformations. This comparison was performed on two different underlying surfaces in two spatially different regions. Panels a and c present boundary estimates where each local region's rotation was calculated through PCA. Circled are regions where the surface looks ‘step-like’ as a result of neighbouring local regions having drastically different rotation functions. Panels b and d present boundary estimates with a trained GP model for inferring the rotation function across the global space. We can see that the boundary estimate is smoother and the ‘steps’ from a are not present

Table 1 Bench within reconciliation results for site 1
Table 2 Bench within reconciliation results for site 2
Table 3 Bench below reconciliation results for site 1
Table 4 Bench below reconciliation results for site 2

6 Unassayed Production Holes

In an operational open-pit mine, production holes are holes drilled into mining benches in preparation for their blasting. Samples from production hole drillings are routinely collected for assaying [19], allowing for updated boundary models to be produced from the new data [7, 8]. In mining scenarios where ‘ore’ and ‘waste’ are visually differentiable, production holes in ‘waste’ regions will not be assayed for temporal and financial reasons. This has a deleterious effect on boundary models generated through automated processes as the absence of observation data on the waste side of the ore/waste boundary will reduce the accuracy of the resulting boundary estimate.

It is therefore desirable to have some estimation method for determining whether an unassayed production hole is a waste hole omitted from the assaying process, or is unassayed for some other reason. Some other reasons for not assaying a production hole are: every n-th hole is assayed (for temporal and financial reasons), for quality control (i.e. the values in the assay are not considered correct), or the hole was drilled for blast control reasons, removing the need for assaying. The omission of only waste hole observations biases predictive models, resulting in poor model performance. This makes them non-ideal in deciding whether an unassayed production hole should be re-introduced into the boundary modelling process.

We therefore present a heuristic for the labelling of unassayed holes from the assayed production holes with ‘ore’ and ‘waste’ labels. This heuristic is only used for consideration as to whether an unassayed hole should have a ‘waste’ label. While we do not assume that all unassayed production holes are waste holes, we do assume that ‘ore’ regions—regions of interest—are sufficiently represented.

In constructing the heuristic to determine whether unassayed production holes should be labelled as ‘waste’, we assume that a spread of production holes (a combination of assayed and unassayed) is provided, with the assayed production holes having an associated ‘ore’ or ‘waste’ label. From here, a series of criteria for labelling an unassayed production hole with a ‘waste’ label can be formulated. For this paper, the criteria were:

  • The production hole in question must be sufficiently far from all ‘ore’ labelled production holes. The nearby presence of a ‘waste’ label should not prevent, or be required for, the labelling of a production hole as waste. In our scenario, the distance threshold was 10 m.

  • The production hole must not be one drilled for the purpose of blast control in the bench. This can be determined by ensuring that the hole is ‘sufficiently’ vertical and has a depth typical of production holes drilled on site. For our example, this depth range was 9–12 m. Some mining operations also assign particular codes to production holes drilled for the purpose of blast control, which can also be incorporated into the heuristic.
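The two criteria above can be sketched as a simple filter. The 10 m distance threshold and the 9–12 m depth range are the values quoted for our scenario; the data structure, the field names, the 5° tilt tolerance used for ‘sufficiently vertical’ and the blast-control flag are hypothetical and chosen only for this example.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class ProductionHole:
    easting: float                 # collar position, metres
    northing: float
    depth: float                   # metres
    tilt_deg: float                # deviation from vertical, degrees
    label: Optional[str] = None    # 'ore', 'waste', or None if unassayed
    blast_control: bool = False    # hypothetical site code flag

def qualifies_as_waste(hole: ProductionHole,
                       assayed_holes: Sequence[ProductionHole],
                       min_ore_distance: float = 10.0,
                       depth_range=(9.0, 12.0),
                       max_tilt_deg: float = 5.0) -> bool:
    """Return True if an unassayed hole may be given a 'waste' label."""
    if hole.label is not None or hole.blast_control:
        return False
    # Exclude holes that do not look like ordinary production holes
    if not depth_range[0] <= hole.depth <= depth_range[1]:
        return False
    if hole.tilt_deg > max_tilt_deg:
        return False
    # Only 'ore' holes matter; nearby 'waste' holes neither help nor hinder
    for other in assayed_holes:
        if other.label != "ore":
            continue
        distance = math.hypot(hole.easting - other.easting,
                              hole.northing - other.northing)
        if distance < min_ore_distance:
            return False
    return True

assayed = [ProductionHole(0.0, 0.0, 11.0, 1.0, "ore"),
           ProductionHole(30.0, 0.0, 11.0, 1.0, "waste")]
candidate = ProductionHole(28.0, 0.0, 11.0, 0.5)
is_waste = qualifies_as_waste(candidate, assayed)
```

Lowering `min_ore_distance` makes more holes eligible for a ‘waste’ label, at the risk of mislabelling holes that sit close to ore.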

6.1 Results

Figure 5a presents an area which has several production holes that we would like to assign a label to. For the purposes of demonstration, the production holes that we are attempting to label have been assayed and have a total assay percentage of 98–99%. These are production holes that have not been included in the original modelling process, but have sufficient data to be used in heuristic validation. When the proposed heuristic was applied, a total of 25 production holes were eligible to be labelled as a waste hole. Comparing these holes against their recorded assay values yielded a labelling accuracy of 100%. The newly labelled production holes can be seen in Fig. 5b.

We can see that there is a trade-off when setting heuristic parameters such as the minimum distance to a non-waste hole. Reducing this distance will mean that more unassayed holes will be labelled, but the likelihood of these production holes being incorrectly labelled increases. If the threshold in our presented example was reduced to 9 m, 1 out of 34 holes would be incorrectly labelled (97.06% accuracy), while if the threshold was 8 m, 1 out of 44 holes would be incorrectly labelled (97.73% accuracy).

Fig. 5

Example of production holes. The red (ore) and blue (waste) dots represent the location of assayed production holes ready for inclusion in a modelling process. In a the hollow magenta dots represent unlabelled production holes, while in b the black dots represent unassayed production holes that will be assigned a waste label

The boundary estimates resulting from the production hole data in Fig. 5 and the exploration hole data are shown in Fig. 6. Both boundary estimates are shown with the augmented production hole data. When the newly labelled production holes are omitted from the modelling process (Fig. 6a), the boundary estimate passes under these holes (centre-right). When these holes are included in the modelling process (Fig. 6b), the boundary estimate rises to put these holes in the waste region.

Fig. 6

Comparison of mineralisation boundary estimates for a grade block. The first figure shows the boundary estimated using the original production hole labelling, while the second figure shows a boundary estimated using the augmented production hole labels. In both images, the augmented production hole labels are shown (red = ore, blue = waste, grey = unlabelled). This allows us to see the change in the boundary estimate near the centre right of the data (circled), where the boundary estimate from the augmented data has lifted to cover the newly labelled holes

7 Discussion and Conclusions

The inclusion of a priori data and previously unlabelled waste production holes both provide a mechanism for incorporating expert domain knowledge into a boundary modelling process, improving boundary estimation models. The inclusion of a priori data also contributes to the generation of boundary estimates that better align with geological expectations in data sparse regions, and can prevent the generation of some artefacts, including false isosurfaces, in the boundary estimates. By using a global, continuous GP model to determine local sub-region rotation transformations for a stitched large-scale GP boundary modelling process, a further improvement in boundary estimates can be realised.

Although the contributions presented in this paper improved the boundary modelling process in the demonstrated regions, some modification of these steps would be required for application to a different region due to modelling parameters being scenario specific. This is something of interest that we hope to explore in the future. Further to this, there is room for exploration into alternate representations for surface normals for the generation of local sub-region rotation models. Some examples here include: (1) estimating surface normals from a Delaunay mapping based on exploration hole transition point data, (2) having a more complex inferencing method for each local sub-region, such as taking the average of the transformation functions from the corners of the sub-region, rather than just using the inferred rotation from the centre of the sub-region. As accurate modelling is not typically performed in waste regions, quantitative assessment of the proxy production hole labels was not possible in this study. In the future, we plan to work with industry to acquire suitable data for these metrics.

In conclusion, this paper has presented three application-based solutions to challenges in the automated generation and updating of boundary estimates: (1) The inclusion of a priori data, allowing for the incorporation of domain expertise into the boundary modelling process. (2) A heuristic for the labelling of unassayed production holes, improving boundary modelling accuracy. This further increases the incorporation of domain expertise into the modelling process through the augmentation of available data. (3) The incorporation of a global modelling process for the calculation of rotation transformations for local sub-regions that are used in the generation of a large-scale probabilistic boundary estimate. For the two sites considered in this work, the boundary fusion method improved the predicted tonnages for both bench within and bench below reconciliations when compared to the exploration based boundary surfaces. When compared to the surface warping model, boundary fusion had comparable results on the bench below and improved results on the bench within. These solutions allow for the generation of more accurate boundary estimations which in turn improves the fidelity of ore-grade estimation models. Demonstration of these solutions has been presented on both stratigraphic and mineralisation boundaries in the Hamersley Province of Western Australia.