High-Order Data-Driven Spatial Simulation of Categorical Variables

Modern approaches for the spatial simulation of categorical variables are largely based on multi-point statistical methods, where a training image is used to derive complex spatial relationships using relevant patterns. In these approaches, simulated realizations are driven by the training image utilized, while the spatial statistics of the actual sample data are ignored. This paper presents a data-driven, high-order simulation approach based on the approximation of high-order spatial indicator moments. The high-order spatial statistics are expressed as functions of spatial distances that are similar to variogram models for two-point methods, while higher-order statistics are connected with lower-orders via boundary conditions. Using an advanced recursive B-spline approximation algorithm, the high-order statistics are reconstructed from the available data and are subsequently used for the construction of conditional distributions using Bayes’ rule. Random values are subsequently simulated for all unsampled grid nodes. The main advantages of the proposed technique are its ability to (a) simulate without a training image to reproduce the high-order statistics of the data, and (b) adapt the model’s complexity to the information available in the data. The practical intricacies and effectiveness of the proposed approach are demonstrated through applications at two copper deposits.


Introduction
Geostatistical simulations are often required in reservoir modeling, as well as in the quantification of geological uncertainty, pollutants in contaminated areas, and other spatially dependent geological and environmental phenomena. During the past few decades, categorical variables, such as geological units with complex spatial geometries in mineral deposits and petroleum reservoirs, have largely been simulated within the framework of multiple-point spatial simulation (MPS) methods, which were introduced in the 1990s and have been further developed since then (Guardiano and Srivastava 1993; Journel 1993; Strebelle 2002, 2021; Journel 2003; Zhang et al. 2006; Chugunova and Hu 2008; Remy et al. 2009; Mariethoz and Renard 2010; Straubhaar 2011; Stien and Kolbjørnsen 2011; Toftaker and Tjelmeland 2013; Strebelle and Cavelius 2014; Zhang et al. 2017; Gómez-Hernández and Srivastava 2021, and others). The MPS framework is based on the use of training images (TIs), or analogues of the attributes of interest being modeled, which contain additional information about the complex spatial relations of the attributes to be simulated; however, the TIs are not conditioned to the available data and their spatial statistics. To retrieve and use the pertinent information from a TI, the similarity between the local neighborhood of an unsampled location to be simulated and the TI is calculated in an explicit or implicit form. Based on this similarity measure, the value of a node from the TI with the most similar neighborhood is assigned to the unsampled location being simulated. In general, most multi-point simulation techniques amount to some form of Monte Carlo sampling of values from the TI. No spatial models are used and, importantly, no spatial information is retrieved from the available sample data. As a result, simulated realizations of attributes of interest reflect the TI and its spatial aspects.
In cases where there are relatively large data sets, conflict between the available data and the TI statistics is observed, and the resulting simulated realizations do not reproduce the spatial statistics of the data (Goodfellow et al. 2012; Osterholt and Dimitrakopoulos 2007; Dimitrakopoulos et al. 2010; Pyrcz and Deutsch 2014). Several attempts have been made to incorporate more information from the available data. Some authors suggest using replicates from the data in addition to the TI (Mariethoz and Renard 2010); however, in practice, it is difficult to find any replicates for three-point relations when data are sparse. Others (Mariethoz and Kelly 2011) apply affine transformations to better condition to the data; however, a TI remains the main source of information. Dimitrakopoulos et al. (2010) and Mustapha and Dimitrakopoulos (2010a, b) propose the use of high-order spatial cumulants to capture complex multi-point relations during the simulation of non-Gaussian random fields. The proposed high-order simulation approach estimates the third- and fourth-order spatial statistics from data and complements them with higher-order statistics from the TI. Further developments in algorithmic performance (Yao et al. 2018, 2020), generalization using splines (Minniakhmetov et al. 2018), a high-order decorrelation method (Minniakhmetov and Dimitrakopoulos 2017a), efficient block simulations (de Carvalho et al. 2019), and training-image-free simulations (Yao et al. 2021) have made the approach more practical. These approaches are based on the approximation of a conditional distribution using Legendre polynomials, which are smooth functions and cannot adequately approximate the discrete distribution of categorical variables.
The topic of describing complex multi-point relations of categorical variables is addressed in Vargas-Guzman (2011) and Vargas-Guzman and Qassab (2006), who use high-order indicator statistics to characterize spatially distributed rock units, while Minniakhmetov and Dimitrakopoulos (2017b) develop this further by introducing the connection between different orders to the related mathematical model for the two-dimensional case. For example, consider a third-order spatial indicator moment of a stationary random field, which is a function of two lags. When one of the lags is equal to zero, the third-order indicator moment becomes the second-order indicator moment. In addition, instead of exponential functions, B-spline functions are used to estimate high-order spatial indicator moments. It is known that B-splines provide an optimal (in terms of accuracy) estimation of equi-continuous functions defined on compact sets (Evans et al. 2009; Babenko 1986). Based on the above, a new recursive algorithm is proposed for the better approximation of high-order spatial statistics with nested boundary conditions of lower-level relations. Then, the conditional distribution for the given neighborhood is calculated from high-order indicator moments and the related category is simulated. In addition to extending the previous developments mentioned above, the present study also explores practical aspects of the proposed method through applications at two major, real-world copper deposits: Olympic Dam, Australia, and Escondida, Chile (the world's largest copper mine). Furthermore, the paper highlights the importance of high-order spatial statistics as a useful tool for the analysis of spatial contact relations between categories.
In contrast to indicator variograms (Journel and Alabert 1990; Goovaerts 1997), which provide information about pair-wise relationships between categories, the third- and fourth-order spatial indicator moments reflect three- and four-wise relations between multiple categories in space. It should be noted that, as shown in the subsequent sections, the proposed method works without a TI; however, additional information from a TI can be incorporated as a secondary condition, ensuring that the high-order spatial indicator moments are driven by the available data.
The paper is organized as follows. First, high-order spatial indicator moments are introduced as a function of distances between points for two-point and multi-point cases. Then, a mathematical model for recursive approximation of high-order spatial indicator moments is presented, followed by the proposed high-order, data-driven, categorical simulation method. Subsequently, the proposed simulation algorithm is applied in two case studies to simulate the geological units of copper deposits. Discussion and conclusions follow.

High-Order Spatial Indicator Simulation
Let (Ω, F, P) be a probability space. Consider a stationary ergodic random vector $\mathbf{Z} = (Z_1, Z_2, \dots, Z_N)^T$, $\mathbf{Z}: \Omega \to S^N$, defined on a regular grid $D = \{x_1, x_2, \dots, x_N\}$, $x_i \in \mathbb{R}^n$, $n = 2, 3$, where Ω is a space of all possible outcomes, F contains all combinations of Ω, $S^N$ is a set of states represented by categories $S = \{s_1, s_2, \dots, s_K\}$, and P is the probability measure, or probability. For example, the probability of $Z_1$ being at a state $s_k$ is defined as
$$P(Z_1 = s_k) = P(\{\omega \in \Omega : Z_1(\omega) = s_k\}). \tag{1}$$
Without loss of generality, assume that $s_k = k$, $k = 1 \dots K$. Let $d_n = \{z_\alpha, \alpha = 1 \dots n\}$ be a given set of conditioning data, where lowercase z stands for outcomes of the random variable Z. The focus of high-order simulation techniques is to simulate a realization of the random vector $\mathbf{Z}$ for all nodes of the grid D, given the set of conditioning data $d_n$.
Similarly to Minniakhmetov and Dimitrakopoulos (2017b), the high-order categorical simulation method is based on the concept of sequential simulation (Journel and Alabert 1990; Journel 1993), where the joint probability distribution $P(Z_1 = k_1, Z_2 = k_2, \dots, Z_N = k_N \mid d_n)$ of the random vector $\mathbf{Z}$ can be decomposed into the product of conditional univariate distributions
$$P(Z_1 = k_1, \dots, Z_N = k_N \mid d_n) = \prod_{i=1}^{N} P(Z_i = k_i \mid Z_1 = k_1, \dots, Z_{i-1} = k_{i-1}, d_n). \tag{2}$$
According to Eq. (2), simulation of categorical variables can be done sequentially by visiting one grid node at a time and calculating $P(Z_i = k_i \mid Z_1, \dots, Z_{i-1}, d_n)$. However, in practice (Dimitrakopoulos and Luo 2004), instead of considering all previously simulated nodes and data $\{Z_1, \dots, Z_{i-1}, d_n\}$, only those in the local neighborhood of $Z_i$ are considered, i.e.
$$P(Z_i = k_i \mid Z_1, \dots, Z_{i-1}, d_n) \approx P(Z_i = k_i \mid \Lambda_i), \tag{3}$$
where $\Lambda_i$ is the set of previously simulated nodes and data within the local neighborhood of $Z_i$.
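Similarly to Mustapha and Dimitrakopoulos (2010a, b), the conditional distribution in Eq. (3) can be calculated from the joint distribution. Without loss of generality, consider the conditional distribution $P(Z_0 = k_0 \mid Z_1 = k_1, \dots, Z_n = k_n)$; it can then be calculated using Bayes' rule (Ripley 1987)
$$P(Z_0 = k_0 \mid Z_1 = k_1, \dots, Z_n = k_n) = \frac{P(Z_0 = k_0, Z_1 = k_1, \dots, Z_n = k_n)}{P(Z_1 = k_1, \dots, Z_n = k_n)}, \tag{4}$$
where $P(Z_1 = k_1, \dots, Z_n = k_n)$ can be considered as a normalization coefficient due to the relation
$$P(Z_1 = k_1, \dots, Z_n = k_n) = \sum_{k=1}^{K} P(Z_0 = k, Z_1 = k_1, \dots, Z_n = k_n). \tag{5}$$
It can be shown that the probability $P(Z_0 = k_0, Z_1 = k_1, \dots, Z_n = k_n)$ is equivalent to a spatial indicator moment (Vargas-Guzman 2011)
$$P(Z_0 = k_0, Z_1 = k_1, \dots, Z_n = k_n) = E\left[I_{k_0}(Z_0)\, I_{k_1}(Z_1) \cdots I_{k_n}(Z_n)\right], \tag{6}$$
where E is the expected value operator and $I_k(Z_i)$ is an indicator function
$$I_k(Z_i) = \begin{cases} 1, & Z_i = k, \\ 0, & \text{otherwise.} \end{cases} \tag{7}$$
From here on, indicator moments are denoted as
$$M_{k_0 k_1 \dots k_n}(Z_0, Z_1, \dots, Z_n) = E\left[I_{k_0}(Z_0)\, I_{k_1}(Z_1) \cdots I_{k_n}(Z_n)\right]. \tag{8}$$
The high-order categorical simulation algorithm (Algorithm A1) proceeds as follows.
1. Define a random path visiting all the unsampled nodes of the grid.
2. For each node $x_{i_0}$ along the path:
a. Find the closest data samples $x_{i_1}, x_{i_2}, \dots, x_{i_n}$. The categories at these nodes are denoted by $k_{i_1}, k_{i_2}, \dots, k_{i_n}$.
b. Calculate the high-order spatial indicator moments for all categories $k = 1 \dots K$
$$M_{k\, k_{i_1} \dots k_{i_n}}(Z_{i_0}, Z_{i_1}, \dots, Z_{i_n}). \tag{9}$$
c. Calculate the conditional distribution
$$P(Z_{i_0} = k \mid Z_{i_1} = k_{i_1}, \dots, Z_{i_n} = k_{i_n}) = \frac{M_{k\, k_{i_1} \dots k_{i_n}}(Z_{i_0}, Z_{i_1}, \dots, Z_{i_n})}{\sum_{k'=1}^{K} M_{k' k_{i_1} \dots k_{i_n}}(Z_{i_0}, Z_{i_1}, \dots, Z_{i_n})}. \tag{10}$$
d. Draw a random value $z_{i_0}$ from this conditional distribution (10) and assign it to the unsampled location $x_{i_0}$.
e. Add $z_{i_0}$ to the set of sample hard data and the previously simulated values.
f. Repeat Steps 2a-e for all the points along the random path defined in Step 1.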
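As a minimal sketch of the Bayes' rule normalization and of the drawing of a category in the algorithm above, the following Python fragment normalizes joint indicator moments over the category of the central node and samples from the result. The function names and the numerical moment values are hypothetical and serve illustration only; in the method itself the moments come from the approximation described in the following sections.

```python
import random


def conditional_distribution(joint_moments):
    """Normalize joint moments M_{k, k1..kn} over k (Bayes' rule)."""
    total = sum(joint_moments)
    if total <= 0.0:
        raise ValueError("joint moments must have a positive sum")
    return [m / total for m in joint_moments]


def draw_category(probabilities, rng):
    """Draw a category index by inverting the cumulative distribution."""
    u = rng.random()
    cumulative = 0.0
    for k, p in enumerate(probabilities):
        cumulative += p
        if u < cumulative:
            return k
    return len(probabilities) - 1  # guard against round-off in the sum


# Hypothetical joint moments P(Z0 = k, Z1 = k1, Z2 = k2) for k = 0, 1, 2,
# with the neighbor categories k1 and k2 held fixed.
joint = [0.02, 0.06, 0.12]
cond = conditional_distribution(joint)
category = draw_category(cond, random.Random(0))
```

The denominator never has to be estimated separately in this sketch, because it is the sum of the joint moments over the categories of the central node.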
The next section presents a new method to calculate high-order spatial indicator moments for Eqs. (9-10).

High-Order Spatial Indicator Moments
Similarly to second-order statistics such as variograms, high-order spatial moments are calculated by discretizing distances between data samples into lags and finding all pairs, triplets, or multiplets separated by the same lags. To define lags in the two-dimensional case, the local neighborhood of each data sample is divided into eight sectors ($o = 1 \dots 8$) and concentric circles with radii $r = \{r_1, r_2, \dots, r_{max}\}$ that increase logarithmically (Fig. 1). The choice of logarithmically increasing lags is driven mainly by computational resource limits. For example, to cover extents of 400 m (with a typical continuity range of 200 m for the deposits under consideration), a constant lag division with a resolution of about 20 m at the first lags requires 20 lags, which correspond to $20^4 = 160{,}000$ bins in a fourth-order map for each possible combination of categories in four points, whereas logarithmically increasing lags {20, 30, 40, 70, 160, 400} require $6^4 = 1296$ bins, i.e. about 123 times fewer calculations. In Fig. 1, data samples are denoted by $x_0$ (central point), $x_1$, $x_2$, and $x_3$. Data sample $x_1$ is located in octant $o = 7$ and lag $h = r_3$, $x_2$ is in octant $o = 1$ and lag $h = r_2$, and $x_3$ is in octant $o = 5$ and lag $h = r_2$. Any point located in the central circle belongs to all octants and lag 0. Thus, high-order moments can be expressed as functions of octant indices and lags, e.g.
$$M_{k_0 k_1 k_2}(Z_0, Z_1, Z_2) = M_{k_0 k_1 k_2}(o_1, o_2; h_1, h_2), \tag{11}$$
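where $Z_0, Z_1, Z_2$ are random variables at locations $x_0, x_1, x_2$. From here on, calculated indicator moments are denoted as $\hat{M}_{k_0 k_1 \dots k_n}(\mathbf{o}; \mathbf{h})$ and are calculated using the sampling average
$$\hat{M}_{k_0 k_1 \dots k_n}(\mathbf{o}; \mathbf{h}) = \frac{1}{N_{\mathbf{o},\mathbf{h}}} \sum_{j=1}^{N_{\mathbf{o},\mathbf{h}}} \prod_{p=0}^{n} I_{k_p}\left(z_p^{(j)}\right), \tag{12}$$
where ∧ denotes statistics calculated from data samples, i.e. sampling statistics, and $N_{\mathbf{o},\mathbf{h}}$ is the number of all replicates of data samples $(z_0^{(j)}, z_1^{(j)}, \dots, z_n^{(j)})$ found in octants $\mathbf{o}$ at lags $\mathbf{h}$. The data samples in Fig. 1 contribute to experimental high-order spatial statistics from order 2 to 4: second-order indicator moments for each pair formed by $x_0$ and one of the neighbors, third-order indicator moments for each triplet formed by $x_0$ and two of the neighbors, and the fourth-order indicator moment for the quadruplet $(x_0, x_1, x_2, x_3)$. It should be noted that during simulation Step 2a, only one data sample per octant is used as conditional data to avoid the calculation of high-order moments with repeated random variables. This is quite similar to the octant search approach for second-order statistics methods (Remy et al. 2009). If several data samples fall in the same octant, the choice is made randomly.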
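The assignment of a neighboring sample to an octant and a logarithmic lag can be sketched as follows. The octant numbering (counter-clockwise from the positive X axis), the radius of the central circle, and the function name are assumptions made for illustration; the numbering used in Fig. 1 may differ. The lag radii are those quoted in the text, and the last lines repeat the bin-count comparison.

```python
import math
from bisect import bisect_left

LAG_RADII = [20.0, 30.0, 40.0, 70.0, 160.0, 400.0]  # metres, from the text
CENTRAL_RADIUS = 10.0  # hypothetical radius of the central circle


def octant_and_lag(x0, x):
    """Return (octant 1..8, lag index) of sample x relative to the centre x0.

    Lag index p (1-based) means the distance lies in (r_{p-1}, r_p].
    Points in the central circle get (None, 0): all octants, lag 0.
    Points beyond r_max get (None, None): outside the neighborhood.
    """
    dx, dy = x[0] - x0[0], x[1] - x0[1]
    distance = math.hypot(dx, dy)
    if distance <= CENTRAL_RADIUS:
        return None, 0
    if distance > LAG_RADII[-1]:
        return None, None
    angle = math.atan2(dy, dx) % (2.0 * math.pi)
    octant = min(int(angle // (math.pi / 4.0)) + 1, 8)  # clamp round-off at 2*pi
    lag = bisect_left(LAG_RADII, distance) + 1
    return octant, lag


# Bin counts of a fourth-order map: constant 20 m lags vs. logarithmic lags.
constant_bins = 20 ** 4
log_bins = len(LAG_RADII) ** 4
reduction = constant_bins / log_bins  # about 123
```

Because the search works on point coordinates rather than grid indices, the same function applies to irregularly located drill-hole samples.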
Following the logic of fitting variogram models to experimental variograms (Journel and Huijbregts 1978), the sampling statistics (12) are not used directly to calculate joint distributions; instead, they are used to model high-order indicator moments.
The quadraginta octant division is the critical part of Step 2b of Algorithm A1 above, which entails the calculation of the high-order spatial indicator moment $M(Z_{i_0}, Z_{i_1}, \dots, Z_{i_n})$. For the sake of simplicity, consider three possible categories $k \in \{0, 1, 2\}$ and three points: the central point with random value $Z_0$ to be simulated at location $x_0$, and two neighborhood data samples $z_1$ and $z_2$ in arbitrary directions at locations $x_1$ and $x_2$. First, the octants $(o_1, o_2)$ and lag distances $(h_1, h_2)$ are calculated from the positions of the two data samples relative to the central point. Then, the third-order indicator moments $M_{k_0 k_1 k_2}(o_1, o_2; h_1, h_2)$, $k_0 \in \{0, 1, 2\}$, are approximated using all available data and the nested algorithm presented in Sect. 3.1. Note that $(k_1, k_2)$ and $(o_1, o_2)$ are fixed and known from the neighborhood data at $x_1$ and $x_2$. It should be noted that using the quadraginta octant search in Algorithm A1 is quite different from the search in classical MPS methods, such as SNESIM (Strebelle and Cavelius 2014). Firstly, MPS methods reduce the number of neighborhood data when no exact replicates are found for a particular spatial configuration, whereas the octant search provides all possible replicates at different lags with fixed angles; for example, for the third-order moments (3-point statistics) with fixed octants $(o_1, o_2)$ defined by neighborhood data, all replicates at different lag distances $(h_1, h_2)$ are found and used in the simulation of a value in a node. Secondly, the octant search does not look for exact spatial replicates, but for replicates within tolerances of lags and angles, which dramatically increases the number of replicates used in the simulation process. Lastly, there is no restriction on the "regularity" of the data sample locations or TI (if used), as the octant search works on points rather than on a grid.

Approximating High-Order Indicator Moments
Data available in some applications can be dense in space and give the impression that the high-order spatial statistics in Eq. (10) can be directly calculated from samples. However, the dimension of space in high-order spatial statistics grows exponentially as the order increases, such that even dense drilling is insufficient for the direct calculation of high-order statistics from the data. For example, for the fourth-order indicator moments, there are 17,296 possible combinations of directions (all possible choices of three neighbor directions from the 48 directions of the quadraginta octant division) and $2^4$ possible combinations of indicators for the case with only two categories, which results in $2^4 \times 17{,}296 = 276{,}736$ fourth-order spatial indicator moments. Figure 3 and Table 1 show the histogram and percentiles, respectively, of the number of replicates for fourth- and fifth-order spatial indicator moments found in the Olympic Dam data set described in Sect. 4.1. Replicates have been calculated using the quadraginta octant division and a 20 m lag tolerance. Overall, 80% of all possible spatial configurations from the drill-hole data had fewer than eight samples to calculate the fourth-order indicator moment and fewer than six samples for the fifth-order indicator moments. Such a small number of replicates is not sufficient to provide a robust calculation of high-order moments for Eq. (10) of Algorithm A1.
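The combinatorial growth described above can be reproduced with a few lines of Python. This is only an arithmetic sketch: the variable names are hypothetical, and the counts assume unordered choices of neighbor directions among 48 directions and two categories at every point of the template. Under these assumptions the product $2^4 \times 17{,}296$ evaluates to 276,736.

```python
from math import comb

N_DIRECTIONS = 48   # directions of the quadraginta octant division
N_CATEGORIES = 2    # two-category case

# Unordered choices of neighbor directions around the central node.
third_order_directions = comb(N_DIRECTIONS, 2)   # two neighbors
fourth_order_directions = comb(N_DIRECTIONS, 3)  # three neighbors

# Each template carries one category per point (central node included).
third_order_maps = N_CATEGORIES ** 3 * third_order_directions
fourth_order_maps = N_CATEGORIES ** 4 * fourth_order_directions
```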
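To increase the amount of information and, consequently, the quality of approximation of high-order moments, Minniakhmetov and Dimitrakopoulos (2017a, b) show that high-order indicator moments are bounded by low-order moments
$$M_{\mathbf{k}}(\mathbf{o}; \mathbf{h})\Big|_{h_p = 0} = \delta_{k_0 k_p}\, M_{\mathbf{k} \setminus k_p}(\mathbf{o} \setminus o_p; \mathbf{h} \setminus h_p), \tag{13}$$
where $\mathbf{h} \setminus h_p$ denotes all the lags excluding the lag $h_p$, $\mathbf{k} \setminus k_p$ denotes all the categories excluding the category $k_p$, $\mathbf{o} \setminus o_p$ denotes all the octants excluding the octant $o_p$, and $\delta$ is the Kronecker delta. If the directions are close to orthogonal, then the additional boundary conditions of Eq. (14) are valid, where $M_k$ is the first-order statistic, i.e. the proportion of category k in the data. Using the boundary conditions in Eqs. (13-14) and the sampling statistics (12), high-order indicator moments are approximated by
$$M_{\mathbf{k}}(\mathbf{o}; \mathbf{h}) = M^0_{\mathbf{k}}(\mathbf{o}; \mathbf{h}) + \Delta M_{\mathbf{k}}(\mathbf{o}; \mathbf{h}), \tag{15}$$
where $M^0(\mathbf{o}; \mathbf{h})$ is a trend defined just by boundary conditions, and $\Delta M(\mathbf{o}; \mathbf{h})$ is the B-spline regression of the mismatch between the sampling statistics and the trend $M^0(\mathbf{o}; \mathbf{h})$.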
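The reduction of a high-order indicator moment to a lower-order one when a lag is zero can be checked numerically. The sketch below is a simplification under stated assumptions: it uses a synthetic one-dimensional sequence of categories and integer lags instead of octants and lag distances, and the function name is hypothetical. With the lag of the second neighbor set to zero, the third-order sampling moment equals the second-order moment when the two coincident categories agree, and vanishes otherwise.

```python
import random


def indicator_moment(seq, cats, lags):
    """Sampling average of prod_p I_{k_p}(Z(x + lag_p)) along a 1D sequence.

    cats[p] is the category required at offset lags[p]; lags[0] is 0
    for the central point.
    """
    n = len(seq) - max(lags)
    hits = sum(
        all(seq[i + h] == k for k, h in zip(cats, lags)) for i in range(n)
    )
    return hits / n


rng = random.Random(1)
seq = [rng.choice([0, 1, 2]) for _ in range(500)]  # synthetic categories

second = indicator_moment(seq, (1, 2), (0, 3))
third_same = indicator_moment(seq, (1, 2, 1), (0, 3, 0))  # k2 == k0, h2 = 0
third_diff = indicator_moment(seq, (1, 2, 0), (0, 3, 0))  # k2 != k0, h2 = 0
```

The identity holds exactly here because an indicator multiplied by itself is unchanged, while indicators of two different categories at the same location cannot both equal one.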
The trend $M^0(\mathbf{o}; \mathbf{h})$ connects lower-order moments with the high-order moment through the recursive formula of Eq. (16), in which $a = c / r_{max}$, $r_{max}$ is the radius that defines the local neighborhood, and c is a user-defined parameter that controls the influence of boundary conditions, i.e. small values of c force the approximation to use lower-order statistics close to boundaries ($h_p = 0, r_{max}$) and sampling statistics in the area far from boundaries, whereas large values of c increase the influence of low-order statistics on high-order statistics. For data-rich environments, such as the mining of mineral deposits, the use of smaller values of c is suggested, and for sparse data, larger values.
As the term $M^0(\mathbf{o}; \mathbf{h})$ incorporates the connection between low and high orders, the residual term $\Delta M(\mathbf{o}; \mathbf{h})$ is only responsible for the approximation of sampling statistics with zero boundary conditions. The approximation of the multidimensional function $\Delta M(\mathbf{o}; \mathbf{h})$ is a classical linear regression problem with constraints. In the present work, multidimensional cardinal B-spline regression is used (Friedman et al. 2001), where $\Delta M(\mathbf{o}; \mathbf{h})$ is approximated using a linear combination of B-splines defined on uniform intervals
$$\Delta M(\mathbf{o}; \mathbf{h}) = \sum_{i_1, \dots, i_n} \alpha_{i_1, \dots, i_n}\, B_{i_1, r}(h_1) \cdots B_{i_n, r}(h_n), \tag{17}$$
where $\alpha_{i_1, \dots, i_n}$ are the coefficients of the B-spline approximation, and $B_{i,r}(t)$ is the i-th B-spline of order r on the uniformly divided knot sequence $\{0, dr, 2dr, \dots, r_{max}\}$. The knots are separated by a step dr that increases with the order of the moment $M(\mathbf{o}; \mathbf{h})$, thus providing a more regularized approximation for higher orders and a more detailed approximation for lower orders. In practice, only orders up to 5 can be adequately calculated from the data; therefore, $dr = r_{max}/6$, $dr = r_{max}/4$, and $dr = r_{max}/2$ are used for moments of order 3, 4, and 5, respectively. For second-order moments, standard variogram modeling is utilized (David 1988). The coefficients $\alpha_{i_1, \dots, i_n}$ are found using a least-squares algorithm to fit the points that are the residuals between the high-order moments calculated from the data samples and the trend $M^0(\mathbf{o}; \mathbf{h})$ from Eq. (16), under zero boundary constraints
$$\Delta M(\mathbf{o}; \mathbf{h}_d) = \hat{M}(\mathbf{o}; \mathbf{h}_d) - M^0(\mathbf{o}; \mathbf{h}_d), \tag{18}$$
where $\mathbf{h}_d$ are the centers of the lags used to calculate high-order statistics from the data (12).
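The following sketch illustrates the building blocks of such a regression in plain Python: a uniform-knot B-spline basis evaluated with the Cox-de Boor recursion, and a tensor-product linear combination for a function of two lags. Several choices here are assumptions rather than details given in the text: the quadratic degree, the extension of the knot sequence beyond 0 and $r_{max}$, the value $r_{max} = 400$ m, and all function names. The least-squares estimation of the coefficients and the zero boundary constraints are not shown.

```python
def uniform_knots(r_max, n_intervals, degree):
    """Knots {0, dr, ..., r_max}, extended by `degree` knots on each side."""
    dr = r_max / n_intervals
    return [(j - degree) * dr for j in range(n_intervals + 2 * degree + 1)]


def bspline_basis(i, degree, knots, t):
    """Cox-de Boor recursion for the i-th B-spline of the given degree."""
    if degree == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = (t - knots[i]) / (knots[i + degree] - knots[i])
    right = (knots[i + degree + 1] - t) / (knots[i + degree + 1] - knots[i + 1])
    return (left * bspline_basis(i, degree - 1, knots, t)
            + right * bspline_basis(i + 1, degree - 1, knots, t))


def tensor_spline_value(coeffs, degree, knots, h1, h2):
    """Linear combination of products B_i(h1) * B_j(h2), as for two lags."""
    return sum(
        coeffs[i][j]
        * bspline_basis(i, degree, knots, h1)
        * bspline_basis(j, degree, knots, h2)
        for i in range(len(coeffs))
        for j in range(len(coeffs[i]))
    )


R_MAX = 400.0                            # assumed neighborhood radius, metres
INTERVALS_BY_ORDER = {3: 6, 4: 4, 5: 2}  # dr = r_max/6, r_max/4, r_max/2
DEGREE = 2                               # assumed quadratic B-splines

knots = uniform_knots(R_MAX, INTERVALS_BY_ORDER[3], DEGREE)
n_basis = len(knots) - DEGREE - 1
ones = [[1.0] * n_basis for _ in range(n_basis)]
# With all coefficients equal to one, the basis sums to one inside [0, r_max].
unit = tensor_spline_value(ones, DEGREE, knots, 130.0, 250.0)
```

A coarser knot step leaves fewer coefficients to estimate, which is the regularizing effect attributed to the higher orders in the text.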
Using all the above, the high-order moments are recursively constructed by starting from the second-order indicator moments. Figure 4 illustrates the approximation process. First, second-order indicator moments $M_{k_0 k_p}(o_p; h_p)$, $p = 1 \dots n$, depicted by red solid lines in Fig. 4(a), are calculated from the basic variogram model. Then, the trend $M^0_{k_0 k_1 k_2}(\mathbf{o}; \mathbf{h})$, which is the surface in Fig. 4(a), is calculated using these boundary conditions and Eq. (16). Next, the residuals $\Delta M_{k_0 k_1 k_2}(\mathbf{o}; \mathbf{h}_d)$, which are the black points in Fig. 4(b), are estimated by subtracting the trend $M^0_{k_0 k_1 k_2}(\mathbf{o}; \mathbf{h}_d)$ from the sampling statistics $\hat{M}_{k_0 k_1 k_2}(\mathbf{o}; \mathbf{h}_d)$ using Eq. (18). Then, the residual spatial moments $\Delta M_{k_0 k_1 k_2}(\mathbf{o}; \mathbf{h})$, depicted as the surface in Fig. 4(c), are approximated from the residuals $\Delta M_{k_0 k_1 k_2}(\mathbf{o}; \mathbf{h}_d)$ and zero boundary conditions (red lines in Fig. 4c), using the B-spline regression in Eq. (17). Finally, the third-order spatial indicator moments, shown as the surface in Fig. 4(d), are retrieved by adding the residual spatial moments $\Delta M_{k_0 k_1 k_2}(\mathbf{o}; \mathbf{h})$ to the trend $M^0_{k_0 k_1 k_2}(\mathbf{o}; \mathbf{h})$ using Eq. (15).
The calculated third-order spatial indicator moments $M_{k_0 k_1 k_2}(\mathbf{o}; \mathbf{h})$ are used as the boundary conditions for the fourth-order spatial indicator moments $M_{k_0 k_1 k_2 k_3}(\mathbf{o}; \mathbf{h})$ (Fig. 5). Note that the fourth-order moment is a three-dimensional function that requires eight boundary conditions. The same procedure is recursively repeated for the fourth and fifth orders.

Olympic Dam Copper Deposit, Australia
Olympic Dam, located in South Australia, is the fourth largest copper deposit in the world. A part of the Olympic Dam deposit covering an area of 1 km by 1 km is used here to demonstrate the application of the proposed simulation method in a case study. A grade cut-off of 1% Cu is applied to the available drill-hole data to define the two categories to be simulated; the results are then validated using high-order spatial indicator moments. Data are available in 515 exploration drill-holes, shown in Fig. 6. Three-dimensional simulated realizations are generated on a regular grid with 100 × 100 × 50 grid nodes and a block size of 10 × 10 × 10 m$^3$. Figure 7 depicts vertical sections from a simulated realization of the deposit. The figure shows that the proposed method generates spatially complex geometric structures that honor the drill-hole data. The spatial statistics and contacts are validated using high-order spatial indicator moments. It should be noted that the simulations are performed using all possible directions from the quadraginta octant division. For the third-order indicator moments, there are 1,128 possible combinations of directions, and each of the combinations has $2^3$ possible combinations of categories 0 and 1. Thus, the total number of indicator maps is 9,024. For the fourth-order indicator moments, there are 17,296 possible combinations of directions, each with $2^4$ possible combinations of categories, which results in 276,736 fourth-order indicator maps. The information in these indicator maps is largely redundant, and the complex high-order relations can be expressed using a smaller number of moments. Therefore, only the orthogonal directions, the so-called L-shape templates, are used for validation purposes. The third-order indicator moment maps $M_{111}(\mathbf{h}_1, \mathbf{h}_2)$ are calculated using the L-shape template (Mustapha and Dimitrakopoulos 2010a) with lags $\mathbf{h}_1 = (i\,dx, 0)$ and $\mathbf{h}_2 = (0, j\,dy)$, indexed by $i = 0 \dots 8$, $j = 0 \dots 8$, where $dx = dy = 50$ m.
The moment map $M_{111}(\mathbf{h}_1, \mathbf{h}_2)$ shows the probability of having category 1 at three points separated from each other by lags along X and Y (L-shape). Note that the validation of high-order moments is done with regular steps and two directions, as in Mustapha and Dimitrakopoulos (2010a, b), in contrast to the octant approach with logarithmically increasing lags in Sect. 3. The logarithmically increasing lags with small steps close to the origin help to better inform the approximation of high-order moments close to the central node, whereas for validation it is important to visualize both short-range and long-range connectivity. The moment map shows the probability of having copper grades above 1% at three points separated by lags along X and Y, that is, the continuity of high grades. As can be seen in Fig. 8, the simulated realization generated by the proposed method (Fig. 8b) reproduces the red and yellow areas in the moment map of the hard data (Fig. 8a), that is, the continuity ranges. As shown above, values along the axes are indicator moments of order 2, that is, the conventional indicator covariances. The third-order indicator moment maps $M_{101}(\mathbf{h}_1, \mathbf{h}_2)$ are shown in Fig. 9. The moment map shows the probability of having copper grades above 1% at two points in the Y direction and a copper grade below 1% at one point in the X direction. $M_{101}(\mathbf{h}_1, \mathbf{h}_2)$ reflects the contact plane between the copper grades above and below 1%, respectively, in the east-west direction. As can be seen in Fig. 9, the simulation using the proposed method (Fig. 9b) reproduces the spatial characteristics of the data between copper grades above and below 1% (Fig. 9a), represented by the yellow-red cone in the lower right part of the moment maps in Fig. 9.
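A third-order moment map on an L-shape template can be sketched for a gridded realization as follows. This is an illustrative simplification under stated assumptions: the realization is a small two-dimensional list of categories, lags are counted in grid cells rather than in metres, and the function name and the toy grid are hypothetical.

```python
def l_shape_moment_map(grid, cats, n_lags):
    """Map M_{k0 k1 k2}(i, j): frequency of k0 at (x, y), k1 at (x + i, y)
    and k2 at (x, y + j), for lags i, j = 0 .. n_lags - 1 in grid cells."""
    k0, k1, k2 = cats
    ny, nx = len(grid), len(grid[0])
    result = [[0.0] * n_lags for _ in range(n_lags)]
    for i in range(n_lags):
        for j in range(n_lags):
            hits = count = 0
            for y in range(ny - j):
                for x in range(nx - i):
                    count += 1
                    if (grid[y][x] == k0 and grid[y][x + i] == k1
                            and grid[y + j][x] == k2):
                        hits += 1
            result[i][j] = hits / count if count else float("nan")
    return result


# Toy realization: category 1 in the western half, category 0 in the east.
grid = [[1, 1, 1, 0, 0, 0] for _ in range(6)]

m111 = l_shape_moment_map(grid, (1, 1, 1), n_lags=3)  # continuity of 1
m101 = l_shape_moment_map(grid, (1, 0, 1), n_lags=3)  # contact along X
```

In this toy grid, `m111[0][0]` is the proportion of category 1, the entries with one zero lag reduce to second-order moments, and `m101` is non-zero only where the X lag crosses the contact between the two categories.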
The third-order indicator moment maps $M_{110}(\mathbf{h}_1, \mathbf{h}_2)$ are shown in Fig. 10. Similarly to the above, the moment map shows the probability of having copper grades above 1% at two points in the X direction and a copper grade below 1% at one point in the Y direction. $M_{110}(\mathbf{h}_1, \mathbf{h}_2)$ reflects the contact plane between copper grades above and below 1%, respectively, in the north-south direction. As can be seen in Fig. 10, the realization generated using the proposed method (Fig. 10b) reproduces the contact between copper grades above and below 1% as found in the data (Fig. 10a), represented by the yellow-red cone in the upper left part of the moment maps in Fig. 10.
Fig. 8 The third-order indicator moment maps $M_{111}(\mathbf{h}_1, \mathbf{h}_2)$ of (a) the available data and (b) a simulated realization generated with the proposed method
To highlight the importance of high-order calculations, a realization from the sequential indicator simulation method (SISIM) is analyzed in terms of third-order spatial indicator moments $M_{k_1 k_2 k_3}(\mathbf{h}_1, \mathbf{h}_2)$. A section of the realization, shown in Fig. 12(a), exhibits low nonlinear connectivity and a high number of small disconnected shapes. This is confirmed by the third-order indicator moments $M_{111}(\mathbf{h}_1, \mathbf{h}_2)$, $M_{101}(\mathbf{h}_1, \mathbf{h}_2)$, and $M_{110}(\mathbf{h}_1, \mathbf{h}_2)$ in Fig. 12b-d, respectively. In contrast to Figs. 8, 9, and 10, the third-order indicator moments from the SISIM method have low values for non-zero lags $(\mathbf{h}_1, \mathbf{h}_2)$ and a high contrast between values along the axes, i.e. second-order statistics, and values away from the axes. This indicates that the realization from the SISIM method does not reproduce the complex relations of the data and exhibits lower nonlinear connectivity of related categories.
Fig. 9 The third-order indicator moment maps $M_{101}(\mathbf{h}_1, \mathbf{h}_2)$ of (a) the data and (b) a simulation using the proposed method
Fig. 10 The third-order indicator moment maps $M_{110}(\mathbf{h}_1, \mathbf{h}_2)$ of (a) the data and (b) a simulation using the proposed method

Escondida Norte Copper Deposit, Chile
Escondida is a large porphyry copper deposit in Chile consisting of two open-pit mines, Escondida and Escondida Norte. A part of Escondida Norte, 2.5 km by 2.5 km by 0.5 km, is used in this section to present a case study. Four mineralization zones are simulated using the proposed approach: oxides, sulfides, mix of oxides and sulfides, and waste. Complex geometrical shapes of mineralization zones and geological contacts are validated here using high-order spatial indicator moments. High-order spatial indicator moments make it possible to analyze cross-categorical relations and to take into account geological aspects of mineral deposits, such as which category is always embedded within another, which categories cannot be in contact, and so on. The drill-holes available are shown in Fig. 13. In general, mineralization zones are quite variable, and the uncertainty of the related contacts needs to be quantified. The three-dimensional simulated realizations are defined on a 115 × 107 × 55 grid of blocks of 25 × 25 × 15 m$^3$ in size. Sulfides are predominantly located in the bottom part and are covered by layers of mix and oxide zones. The upper part of the deposit consists mostly of waste materials. Vertical and horizontal sections of two simulated realizations using the proposed method are shown in Figs. 14 and 15, respectively. The simulations honor the layered structure of the mineralization zones and demonstrate the higher variability of the oxide and mix mineralization zones. The third-order indicator moment maps $M_{OSS}(\mathbf{h}_1, \mathbf{h}_2)$ (O stands for oxides, S for sulfides, M for mix, and W for waste) are calculated using lags $\mathbf{h}_1 = (i\,dx, 0)$ and $\mathbf{h}_2 = (0, j\,dy)$, indexed by $i = 0 \dots 8$, $j = 0 \dots 8$, where $dx = dy = 50$ m.
Fig. 11 The fourth-order indicator moment maps $M_{1000}(\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3)$ of (a) the data and (b) a simulation using the proposed method
The moment map shows the probability of having oxide separated from sulfides by lags along X and Y, that is, a complex contact between oxides and sulfides. As can be seen in Fig. 16, the simulated realization generated with the proposed method (Fig. 16b) reproduces the red and yellow areas in the moment map of the data (Fig. 16a). The third-order indicator moment maps $M_{SSW}(\mathbf{h}_1, \mathbf{h}_2)$ are shown in Fig. 17. The moment map shows the probability of having sulfides at two points in the X direction and waste in the Y direction. $M_{SSW}(\mathbf{h}_1, \mathbf{h}_2)$ reflects the contact plane between sulfides and waste with a north to north-south direction. As can be seen in Fig. 17, the simulation using the proposed method (Fig. 17b) reproduces the corresponding spatial relations found in the data (Fig. 17a).
Similarly to the above, all other spatial indicator moments, that is, $M_{k_1 k_2 k_3}(\mathbf{h}_1, \mathbf{h}_2)$ and $M_{k_1 k_2 k_3 k_4}(\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3)$, $\forall k_1, k_2, k_3, k_4 \in \{S, O, M, W\}$, of the simulated realizations were analyzed and found consistent with the spatial indicator moments from the drill-hole data.
Fig. 16 The third-order indicator moment maps $M_{OSS}(\mathbf{h}_1, \mathbf{h}_2)$ of (a) the data and (b) a simulated realization generated with the proposed method
Fig. 17 The third-order indicator moment maps $M_{SSW}(\mathbf{h}_1, \mathbf{h}_2)$ of (a) the hard data and (b) the simulation using the proposed method

Conclusions
This paper presents a new data-driven, high-order sequential method for the simulation of categorical random fields. The sequential algorithm is based on the B-spline approximation of high-order spatial indicator moments that are consistent with each other. The main distinction from commonly used MPS methods is that in the proposed approach, conditional distributions are constructed using high-order spatial indicator moments as functions of distances based on hard data. Thus, simulated realizations can be generated without a TI. Note that in applications with relatively large data sets, such as the simulation of mineral deposits, the higher-order statistics are deduced from hard data. However, the option of adding a TI to a data set remains available when only sparse data sets are available, as is the case with petroleum reservoirs.
The basic concept of the method presented is to use recursive approximation models with enclosed boundary conditions, which are derived from the nested nature of high-order spatial indicator moments, as presented herein. To provide a robust estimation, the regularized B-splines are used. An additional critical aspect of the proposed approach is that different amounts of information can be retrieved for different levels of relations. Each order of spatial statistics is approximated using the appropriate number of B-splines to provide robustness to the approach and to avoid overfitting. Thus, lower-order statistics are estimated with a higher resolution than the higher-order statistics.
The simulation algorithm presented was tested at two real copper deposits, without using TIs. The results of the applications demonstrate that the proposed method reproduces complex spatial patterns and preserves the high-order spatial statistics in the data. While the proposed technique is fully data-driven, information from a TI can be incorporated with the proposed model as a trend to capture high-frequency features when available in the TI. Further research may consider improving the approximation methods.
Fig. 18 The fourth-order indicator moment maps $M_{MOWM}(\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3)$ of (a) the hard data and (b) the simulation using the proposed method