Geographically Optimal Similarity

Understanding geographical characteristics of distribution patterns and spatial association is essential for spatial statistical inference such as factor exploration and spatial prediction. The geographical similarity principle was recently developed to explain the association between geographical variables. It describes the comprehensive degree of approximation of a geographical structure instead of explicit relationships between variables. However, there are still challenges for geographical similarity-based methods. For instance, all samples are used for prediction, leading to increased calculation burden and reduced prediction accuracy due to the noise and unrelated data in large spatial data sets. This study develops a geographically optimal similarity (GOS) model for accurate and reliable spatial prediction based on the geographical similarity principle. In GOS, the geographical configurations are first characterized, and similarities between unknown and known observation locations are assessed. Next, an optimal threshold is determined to select a small number of observations with optimal similarities for the prediction at each unknown location. Finally, a reliable uncertainty assessment approach is developed to assess and map uncertainties of GOS predictions. A new R package “geosimilarity” is developed to conduct GOS models. In this study, GOS is implemented in predicting spatial distributions of trace elements in a mining region in Australia. Results show that GOS can use a small number of observations to derive more accurate and reliable spatial predictions than linear regression and basic configuration similarity models. In addition, pattern characteristics of predictions can be improved by GOS by eliminating the phenomenon wherein predictions are clustered near mean values and contain striped textures. Therefore, GOS demonstrates greater potential for implementing the geographical similarity principle in spatial predictions by bringing information from samples with relatively high similarities at any location across space for more accurate and effective predictions in broader fields and practice.


Introduction
Spatial statistical inference is the basis of spatial analysis such as factor exploration and spatial prediction (Jacquez 1999;Møller 2013;Song and Wu 2021). Understanding distribution patterns and spatial association of geographical attributes is an essential approach for spatial statistical inference in geographical information science and mathematical geosciences (Hackeloeer et al. 2014). Current methods based on geographical characteristics for spatial statistical inference can be classified into five categories.
The first category involves assessing the spatial overlap relationships of geographical variables. Spatial overlap analysis is essential for understanding geographical characteristics, patterns, and associations (Zanni et al. 2021;Guo et al. 2010). It can be used to identify the location-based relations of multiple geographical variables to assess the structure of overlap maps and evaluate the overlay error propagation process (Shi et al. 2004).
The second category describes spatial dependence characteristics of variables for modeling spatial patterns and association. The spatial dependence principle assumes that values of attributes at near locations are more closely related than those at distant locations (Tobler 1970). The spatial dependence is usually quantified as spatial neighbor relations, lagged effects, or space-weighted matrix . The commonly used measures include spatial autocorrelation models (Moran 1950;Anselin 1995), spatial Bayesian hierarchical models (Haining and Haining 2003), geographically weighted regression and improved models (Brunsdon et al. 1996;Fotheringham et al. 2003), and singularity and anomaly models (Cheng 2007;Zuo et al. 2009;Chen and Cheng 2016). For instance, the widely used geographically weighted regression model explores the spatial non-stationarity of geographical attributes through the locally varied association between response and explanatory variables (Brunsdon et al. 1996).
The third category of methods aims at investigating the inherent unequal characteristics of geographical phenomena (i.e., spatial heterogeneity) (Hartshorne 1939). Geostatistical models (i.e., kriging family models) are commonly used methods in this category (Krige 1951;Goovaerts 1997). In kriging family models, hybrid models have been developed by integrating multiple types of geographical characteristics. The typical hybrid kriging models include regression kriging (Hengl et al. 2004), universal kriging (Zimmerman et al. 1999), machine learning-based kriging (Hengl et al. 2018), and geo-additive models (Kammann and Wand 2003). Another essential method is the spatial stratified heterogeneity models . For instance, the geographical detector model describes spatial stratified heterogeneity by comparing variance within strata and that of the whole study area (Wang et al. , 2016. Meanwhile, advanced models have been developed for spatial statistical inference based on the spatial stratified heterogeneity theory, including optimal parameters-based geographical detectors (OPGD) (Song et al. 2020), interactive detector for spatial associations (IDSA) , spatial association detector (SPADE) (Cang and Luo 2018), geographically optimal zones-based heterogeneity (GOZH) (Luo et al. 2022), and robust geographical detector (RGD) (Zhang et al. 2022) models.
The fourth category of methods is the second-dimension spatial association (SDA), which explores spatial association using geospatial data outside sampling locations that contain essential geographically local characteristics (Song 2022). SDA models implement more information from spatial data than the first-dimension spatial association (FDA) models that quantify spatial association only using data at sampling locations, such as statistical, machine learning, and most geospatial models. Studies demonstrate that SDA models provide more smooth, effective, and low-uncertainty spatial predictions than FDA models (Song 2022).
The last category is methods of the geographical similarity principle (i.e., the Third Law of Geography), which assumes that similar values of a geographical attribute at different locations are associated with their similar geographical configurations (Zhu et al. 2018;Zhu and Turner 2022). The geographical similarity principle includes three basic concepts. First, a specific target variable's geographical configuration reveals the target variable's geographical structure around study locations. The geographical configuration is defined by a set of geographical explanatory variables to the target variable. Second, the similarity of geographical configurations is used to describe the comprehensive degree of approximation of the geographical structure at a location compared with the geographical structure at all other locations. Thus, the geographical similarity of a target variable at an unknown location is determined by explanatory variables at all locations across the study area. Finally, the similarity of geographical configurations is a comparative relationship of a target variable at study locations instead of an explicit relationship between the target variable and explanatory variables commonly used in regression models. Under the geographical similarity principle, geographical attributes can be predicted regarding the geographical configurations of samples and unknown locations (Zhu et al. 2018;Zhu and Turner 2022).
However, there are still challenges for geographical similarity-based models in improving modeling accuracy and effectiveness by describing various geographical characteristics (Zhu and Turner 2022). For instance, the basic configuration similarity (BCS) model, which is the individual sample-based predictive soil mapping method (iPSM), is an innovative method based on the geographical similarity principle (Zhu et al. 2015). In the BCS model, all observations will be used for the prediction at each unknown location since the numbers of observations are relatively small in most of the BCS-based case studies. For a large data set, the computation time will be critically increased, and noise and unrelated data will be included with increased observations in BCS. Therefore, observations should be selected and reduced to improve the prediction accuracy and efficiency.
This study proposed a geographically optimal similarity (GOS) model for accurate and reliable spatial prediction based on the geographical similarity principle. The primary processes of GOS include characterizing geographical configurations using explanatory variables, assessing similarities between unknown and known observation locations, estimating the optimal percentage threshold parameter for selecting observations with the optimal similarities for each unknown location, predicting spatial distributions, and assessing prediction uncertainty. In this study, GOS is implemented in predicting spatial distributions of trace elements, including Cu, Zn, and Pb, in a mining region in Australia.
The remainder of this article is structured as follows. Section 2 describes the steps for calculating the GOS model. Section 3 presents the case study of implementing GOS in trace element prediction, which includes a study area, data, and experiment design. Section 4 shows the results of the GOS-based case study, including outcomes of GOS steps, model validation, spatial prediction and evaluation, and a discussion of the contributions of this study and future recommendations for relevant research. Finally, the study is concluded in Sect. 5.

Concepts
The geographical similarity assumes that similar values of a geographical attribute at different locations are associated with their similar geographical configurations (Zhu et al. 2018;Zhu and Turner 2022). Based on this principle, the proposed GOS model employs the similarity information at sampling locations with relatively higher similarities with an unknown location than that at other sample locations, which is the optimal similarity, instead of using the similarity information at all sample locations in the BCS model, for more effectively using the geographical similarity and more accurate and reliable spatial prediction. The details of the GOS model stages are presented in the following section.

Method
A schematic overview of the GOS model is shown in Fig. 1. The GOS model contains the following stages for spatial prediction: characterizing geographical configurations using explanatory variables, assessing similarities between unknown and observation locations, selecting samples with optimal similarities, and analyzing prediction uncertainty. The stages of the GOS model are explained as follows. A new R package "geosimilarity" is developed to conduct GOS models (https://cran.r-project.org/web/ packages/geosimilarity/index.html).

Characterizing Geographical Configurations
In the first stage, the geographical configurations are characterized using explanatory variables (i.e., covariates). Geographical configurations reveal the geographical structure and variation of a targeted variable (Zhu and Turner 2022). They are characterized by geographical explanatory variables closely related to the spatial variation of the targeted variable at both sample and unknown locations (Zhu et al. 2015). Thus, selecting effective explanatory variables for characterizing geographical configurations is essen- Fig. 1 Schematic overview of geographically optimal similarity (GOS) model tial. The primary criteria for selecting explanatory variables are the availability of data and the ability of variables to describe spatial variations of the response variable Z .
In GOS, the variable selection consists of the following steps. First, correlation analysis is used to select potential explanatory variables. Variables significantly correlated with the response variable are used in the subsequent steps. In the second step, the multicollinearity among explanatory variables is examined using a variance inflation factor (VIF). In general, a VIF higher than 10 or 4 will be considered as high multicollinearity, where 4 is a conservative threshold (Song et al. 2021a, b). The determination of the VIF threshold is related to the data size and aim of variable selection. Thus, in this study, 4 is used as the VIF threshold, and variables with VIF values higher than 4 will be removed until the VIF values of all the selected explanatory variables are lower than 4. Finally, the selected variables are used to characterize geographical configurations. The geographical configuration at a given location is presented as a vector of explanatory variables where e i (i = 1, ..., p) is the value of the explanatory variable X i (i = 1, ..., p) at a given location.

Assessing Similarity
The second stage is the assessment of similarities between unknown and observation locations based on geographical configurations. Consider the problem of predicting values of a response variable Z at unknown locations v using observations at locations u. The similarity between an unknown location v β (β = 1, ..., n) and an observation location u α (α = 1, ..., m) is computed as where S(u α , v β ) is the similarity, E i is the function for computing the similarity between locations u α and v β based on the ith covariate, and P is the function for determining the similarity between u α and v β by comparing the covariate-scale similarities of all covariates. In GOS, the minimum operator is used as the P function, as it was found to be effective in determining similarity in previous studies (Zhu et al. 1997(Zhu et al. , 2015. For the continuous variable X i , the function E i is defined as where σ is the standard deviation of explanatory variable X i , including all unknown and observation locations, and δ(u, v) is the square root of the mean deviation of X i at all unknown locations v β from that at the observation location u α Further, the similarity between an unknown location v β and all observation locations u is presented as a vector of similarities between u α and v β Thus, the value of the response variable at an unknown location can be predicted as a similarity-weighted mean valuê whereẐ (v β ) is equivalent to the BCS-based prediction at the unknown location v β .

Determining Optimal Similarity
The third stage is the selection of observation locations with optimal similarities with an unknown location by identifying a threshold for determining the optimal similarity. Identifying the optimal similarity threshold is performed using a cross-validationbased prediction error assessment approach. This process includes the following steps. First, the observation data are randomly divided into training and testing data sets. For instance, 50% of the data are training data, and 50% of the data are testing data. Next, similarities between training and testing locations are computed, and values of the response variable at testing locations are predicted using Eq. 6. The training and testing process is repeated multiple times, for example 10 times, for the more reliable threshold parameter identification. Third, a series of optional percentage threshold κ (κ ∈ (0, 1]) values are used to select observation locations with relatively high similarities with unknown locations where τ is the probability of the quantiles of similarity values. The quantile approach is effective for identifying optimal parameters in models for indicating geographical characteristics and spatial association (Song et al. 2020;Song and Wu 2021;Luo et al. 2021;Song 2022). If κ = 1 (i.e., τ = 0), all observations are used to predict values at each unknown location. Fourth, prediction accuracy is assessed using the crossvalidation root-mean-square error (RMSE) under each optional percentage threshold κ whereẐ j and Z j are the prediction and observation values at the jth testing location, respectively. Finally, the process of the above steps is repeated N times, and a relationship can be found between RMSE values and corresponding κ values; the optimal threshold is the percentage threshold that enables the minimum cross-validation RMSE where λ is the selected threshold that enables the optimal observation locations for computing similarity. The corresponding similarity threshold is S λ , and RMSE(κ) is the RMSE value under the percentage threshold κ. Thus, the similarity vector at the unknown location v β shown in Eq. 5 is converted as

Spatial Prediction and Uncertainty Analysis
The GOS-based spatial prediction is calculated with the optimal similarity threshold and similarity vectors derived from the above stage using the following equation whereẐ (v β ) is the prediction at the unknown location v β , and Z λ (u α ) is the observation at location u α with the optimal similarity. In GOS, the prediction uncertainty is inversely associated with the similarities between unknown and observation locations (Zhu and Turner 2022). The GOS uncertainty is calculated as where (v β ) is the uncertainty at the unknown location v β , Q is the quantile operator, and ζ is a probability value which is used to identify the similarity values that have a critical impact on prediction results. In GOS, the recommended ζ values are 0.9, 0.95, 0.99, 0.995, 0.999, and 1, where ζ = 1 means that Eq. 12 is equivalent to

Study Area and Trace Element Data
In this study, the GOS model is implemented in the spatial prediction of trace elements in the northwestern Shire of Leonora, a local government area (LGA) and a primarily mining region in Western Australia. The study area is 11,582.6 km 2 . Figure 2 shows the location of the study area in Australia, the geological (a) and geographical environments (b), and sampling locations of trace elements in the study area. The geological information includes the locations of faults and spatial distributions of lithology, including regolith, sedimentary siliciclastic, high-grade metamorphic rock, igneous felsic intrusive, igneous mafic intrusive, igneous mafic volcanic, igneous volcanic, and meta-igneous mafic/ultramafic volcanic. The spatial data for faults and lithology are sourced from the Surface Geology of Australia 1:1 million scale data set 2012 edition (Raymond et al. 2012). The geographical information consists of multiple types of natural and social environments, such as locations of mining sites, state and local roads, rivers, and lakes. The trace element samples are sourced from the Geological Survey of Western Australia (GSWA) Geochemistry data set, which contains geochemical element samples across Western Australia and is stored in the Western Australian Geochemistry or WACHEM database (Department of Mines, Industry Regulation and Safety, Government of Western Australia 2022). The samples of geochemical elements in the The GSWA Geochemistry data set has been implemented in a series of geological and mining analyses in Western Australia, including regional-scale regolith geochemistry (Morris et al. 1998), proterozoic mineralization identification (Morris et al. 2003), and mineral footprint assessment (Wells et al. 2016). However, studies have indicated that the GSWA Geochemistry data are "significantly underutilized" (Morin-Ka et al. 2019). Thus, in this study, the GSWA Geochemistry data are used to predict spatial distributions of trace elements in a mining region in Western Australia. In addition,

Experiment Design
The GOS-based spatial prediction of trace elements in the study area is designed as shown in Fig. 3. The case study predicts spatial distributions of trace elements  3 The process of GOS-based spatial prediction of trace elements at 1-km resolution. The case study process consists of five steps: preprocessing of trace element data, processing and selection of geological explanatory variables data, collection and processing of geographical explanatory variables data, GOS modeling and model validation, and spatial prediction and evaluation. The details of the steps are explained as follows. The first step is the preprocessing of trace element data. The trace element data were transformed using a logarithm function to avoid impacts on data distributions since trace element data are skewed distributed as shown in Table 1. The potential outliers of trace element data will not be removed if they exist in the logarithm-transformed data, as the high values may indicate the clusters of mineral deposits, which are essential for mining analysis (Cheng et al. 1994;Chen and Cheng 2018).
The second step is the processing and selection of faults and lithology types associated with trace elements. Faults and lithology are essential variables demonstrating geological conditions that are primary determinants of mineral deposits and linked with the spatial distributions of trace elements (Yuce et al. 2009;Song et al. 2012). Thus, this study employs the faults and lithology data sourced from the Surface Geology of Australia 1:1 million scale data set 2012 edition (Raymond et al. 2012) for describing geological conditions. The study area contains 202 parts of faults and eight types of lithology, as shown in Fig. 2. Methods for selecting trace element-related faults and lithology types and deriving faults and lithology-related variables are as follows. First, the mean value (μ m (F h , d)) of a trace element m at samples within a given distance d from a fault F h is compared with the mean value (μ m ) of this trace element across the study area. If μ m (F h , d) > μ m , the fault F h is selected as a trace element m-related fault. When all the trace element-related faults are selected, the distance to the selected faults is calculated as the faults-related variable for the spatial prediction. In this process, the preliminary analysis is required with consideration of the deposit process of trace elements and the spatial distributions of trace elements, faults, and lithology data. According to the preliminary statistical analysis, the parameter d is set as 10 km, as the trace element values within 10 km from faults are closely associated with the distance to faults. In addition, the mean value (μ m (L k )) of a trace element m in a certain type of lithology L k is compared with the mean value (μ m ) of trace element m in the whole study area. If μ m (L k ) > μ m , this type is selected as a trace element m-related lithology. When all the trace element-related lithologies are selected, the distance to the selected lithologies is calculated as the lithology-related variable for spatial prediction.
The third step is the collection and processing of data on potential geographical explanatory variables related to natural and social environments. Although variables of natural and social environments may not link with the causes of trace elements, they are still closely associated with trace element distributions and can be used for the prediction. In the study, nine geographical variables were collected, including elevation, slope, and aspect of terrain; distance to water, including rivers and lakes; vegetation coverage represented using the normalized difference vegetation index (NDVI); soil organic carbon and pH; distance to roads, including state and local roads; and distance to mining sites. The elevation, slope, and aspect of terrain were computed using the Australian The explanatory variable data were computed at both observation sample locations and the 1-km grids for prediction to characterize the geographical configurations. For the grid data, the raster data of elevation, slope, aspect, NDVI, organic carbon, and pH were then converted to grid data with a spatial resolution of 1 km. Distances between 1-km grids and water, roads, and mining sites were calculated using the collected spatial data for rivers and lakes, state and local roads, and mining sites.
The fourth step is GOS modeling and model validation. The prediction accuracy of GOS was evaluated by comparing it with a few commonly used spatial prediction models and previous geographical similarity-based models. The models used for comparing modeling accuracy with GOS include ordinary kriging (OK), multivariate linear regression (MLR), regression kriging (RK), and basic configuration similarity (BCS) models. In the cross-validation, observation data were randomly divided into two parts, where 50% of the data were training data, and 50% of the data were testing data. This process was performed 50 times, and the mean cross-validation indicators were used to evaluate model accuracy. The cross-validation indicators included RMSE as shown in Eq. 8 and mean absolute error (MAE) computed as The last step is the GOS-based spatial prediction of trace elements and prediction evaluation. The GOS-based spatial prediction and corresponding uncertainty estimation were performed according to the descriptions of the GOS model explained in Sect. 2. The spatial predictions derived from the above five models and prediction uncertainty derived from GOS and BCS were compared to evaluate modeling accuracy. The prediction uncertainty is evaluated using Eq. 12. In addition, this study compares the relationships between similarity and distance of the selected observations used for prediction to demonstrate that the observations with high similarity can be near or remote samples, which is one of the advantages of GOS. Finally, anomalies are identified from the GOS-based spatial predictions of trace elements to examine the probability of mineral deposits using a window-based local singularity analysis (Cheng 1999(Cheng , 2007Chen and Cheng 2016;Zuo et al. 2009). First, a series of windows is set with variable sizes at any locations in the study area, where the window sizes are recorded as r . In this study, the window sizes are set as 1 km, 2 km, 3 km, ..., and 10 km. Next, the mean values are calculated for a trace element within different window sizes at any location. The mean values are recorded as b(r ). Finally, the relationship between b(r ) and r is estimated using Eq. 14 where A is the singularity index, and c is a constant coefficient. The estimation of A at all locations in the study area can generate a map of local singularity. On the singularity map, A = 2 indicates the potential anomalies (i.e., extreme values from normal and lognormal distributions) (Cheng 2007).  Figure 4 shows the statistical summary of the logarithm-transformed trace element data for Cu, Zn, and Pb, respectively. As shown in the summary of the original sample data in Table 1, the trace element data have skewed distributions. The logarithm-transformed data of the three trace elements Cu, Zn, and Pb are close to normal distributions. The data of logarithm-transformed Cu contain three outliers with relatively high values, which are retained in the data set as they may be related to the essential information of mineral deposits.

Selection of Geological Variables
The geological conditions are closely associated with trace element distributions. Figure 5 shows the processes and results of selecting geological variables, including faults and lithology-related variables. Figure 5a shows the selected lithology types closely associated with trace elements. The mean values of trace elements in the areas of selected lithologies are higher than the overall mean values. Figure 5b shows the selected trace element-related faults and the comparisons of spatial distributions of trace element observations and the selected faults and lithology types. Results demonstrate that the relatively high values of trace elements are generally located surrounding the selected faults and lithologies. Figure 5c shows the relationships between trace elements and the geological variables for spatial prediction, including the distance to the selected faults and distance to the selected lithologies. The results show negative correlations between trace elements and distance to trace element-related faults or lithologies. Figure 6 shows the maps of potential explanatory variables used in the study to characterize geographical configurations and predict trace elements. The explanatory variables demonstrate the geological, natural, and social environments across space. The explanatory variables will be selected in GOS. Table 2 shows the selected variables for trace elements Cu, Zn, and Pb. The selected variables are significantly correlated with the logarithm-transformed trace element data. The variables with the highest correlation coefficients with Cu, Zn, and Pb are the distance to Cu-related lithology (R = −0.432), the distance to Zn-related lithology  Figure 7 shows the processes and results of determining the optimal percentage thresholds λ used to identify observations with optimal similarities with unknown locations based on geographical configurations. Results show that the cross-validation RMSE tends to be decreased at the beginning and increase after a certain point with the growth of κ, the percentage of data used for prediction. This means that if we use fewer observations, the predictions are not reliable. Suppose we use more observations to calculate similarities and perform predictions. In that case, the prediction error will be increased because observations with low similarities with the unknown locations will bring errors, noise, and unrelated information to the prediction process. Thus, the phenomenon also confirms that selecting a reasonable number of observations is essential to assess the similarity of geographical configurations. Results show that the optimal percentage thresholds λ values for the predictions of Cu, Zn, and Pb are  Process of determining thresholds of optimal similarities of geographical configurations for Cu, Zn, and Pb. κ: optional percentages of data with higher similarities than others; and λ: Threshold for determining the optimal similarity 0.05, 0.04, and 0.05, respectively. The λ values indicate that only about 4 to 5% of observation data are required for the spatial predictions at each unknown location. Figure 8 shows a summary of cross-validation MAE, and RMSE values of spatial predictions performed using OK, MLR, RK, BCS, and GOS for Cu, Zn, and Pb. Table 3 shows the summary of mean values, cross-validation MAE and RMSE, and error reductions by GOS compared with other models. Results show that compared with kriging and linear regression models, spatial prediction errors can be critically reduced by the geographical similarity-based models, including BCS and GOS. Further, compared with BCS, GOS can effectively reduce prediction errors by using the optimal similarity. Compared with OK, the MAE and RMSE values can be reduced by 8.6 to 14.4% and 4.3 to 10.6% by GOS models for Cu, Zn, and Pb predictions, respectively. Compared with BCS, the MAE and RMSE values can be further reduced by 2.3 to 3.4% and 0.9 to 2.4% by GOS models for predicting different types of trace elements,  Figure 9 shows the spatial predictions of trace elements across the study area using GOS, OK, MLR, RK, and BCS models. The distributions of predictions can indicate the following findings. First, compared with geographical similarity-based models, (i.e., BCS and GOS), OK underestimates the relatively high and low values, and MLR overestimates and underestimates regions with relatively high and low values, respectively. For instance, in the MLR-based Cu predictions, map values in the northwestern and southwestern regions were much lower than in maps of BCS-and GOS-based predictions. This means that the predictions in these regions may be underestimated.

Spatial Prediction
In the MLR-based Zn and Pb prediction maps, Zn in the central and northern regions and Pb in the central and southwestern areas are much higher than that predicted by BCS and GOS-based predictions and may be overestimated by MLR. In addition, BCS-based predictions contain more values close to the mean values of trace elements in the study area than that predicted by other models, and predictions have striped texture. The phenomenon is also demonstrated by the statistical density distributions (Allegre and Lewin 1995) of trace element predictions shown in Fig. 10. This phenomenon also existed in the BCS-based spatial predictions in previous studies, such as BCS-based elevation prediction and reconstruction (Zhu et al. 1997), soil moisture prediction maps (Zhu et al. 2015), flea index of transmitting plague (Du et al.  . Thus, the GOS models can effectively address the issue. From a geochemical perspective, GOS contributes to more accurate, reasonable, and locally detailed predictions. Thus, GOS brings more potential to implement the geographical similarity principle in spatial predictions in broader fields and address practical issues. Figure 11 shows statistics and spatial distributions of uncertainties of predictions derived from BCS and GOS models. Results of the three types of trace elements have similar patterns, and the modeling of Cu is used as an example to show the analysis of prediction uncertainty. Table 4 shows a summary of prediction uncertainties under different probability values (i.e., ζ = 0.9, 0.95, 0.99, 0.995, 0.999, 1), in quantiles for estimating uncertainties of predicting Cu, Zn, and Pb. The results of prediction uncertainties can be explained from the following aspects. First, the uncertainties of GOS-based predictions are generally lower than that of BCS-based predictions when ζ = 1. When ζ = 1, the mean uncertainties of the three trace elements derived from GOS range from 0.031 to 0.083, but that derived from BCS range from 0.043 to 0.283. The mean uncertainty can be reduced by 28.5 to 74.1% by GOS compared with BCS models when the ζ value ranges from 0.9 to 0.999. This means that the prediction uncertainty can be reduced compared to previous geographical similaritybased models. In addition, when ζ = 1, BCS and GOS models will generate identical uncertainties across the study area, although their predictions are critically different. In this scenario, the uncertainty is only related to the maximum similarity at unknown  locations. However, similarity values and observation samples used for the prediction at any unknown locations are different in BCS and GOS models. Therefore, it is recommended to consider uncertainty scenarios under different probability (ζ ) values to ensure the reliable assessment of geographical similarity-based spatial predictions. Figure 12 shows an analysis of the relationship between similarity and distance to sample observations with optimal similarities at any unknown locations across space. The analysis of Cu predictions is used as an example to demonstrate the relationships, and the results of Zn and Pb are similar to that of Cu predictions. In the Cu predictions, 5% of data (i.e., 47 samples), are used for the prediction at an unknown location.  Figure 12a and b show the mean (μ) and standard deviation (σ ) of the distances between the unknown location and the locations of the 5% of sample data used for predicting the trace element value at this unknown location. The maps indicate that data with high similarities can be samples at near, long-distance, and even remote locations. In Fig. 12c and d show four types of distances between unknown locations and their similar samples: H μ − H σ (red), H μ − L σ (orange), L μ − H σ (purple), and L μ − L σ (green), where H and L mean high and low, respectively. This means that in the central regions (green), most of the samples used for prediction are located at short distances, and the variations of the distances are limited, but it does not work in the other three types of regions. For instance, in the eastern region (red), most of the samples used for prediction are located in long-distance areas, and the distances are critically varied. Therefore, the analysis demonstrates that GOS can effectively implement data with high similarities and at critically varied distances to unknown locations for spatial prediction. Figure 13 shows the maps of local singularities of Cu, Zn, and Pb and their statistical a plots. Both the maps and density plots demonstrate that the singularities in most of

Contributions and Future Recommendations
In summary, the case study of the GOS-based spatial prediction of trace elements demonstrates the following advantages of GOS in spatial modeling based on geographical similarity. First, in GOS, a small number of observations with the optimal similarities for each unknown location (e.g., 4 to 5% of observations) are identified for prediction instead of using all observations in previous geographical similarity-based models. This means that the prediction at an unknown location is only associated with a small number of more similar observations than others instead of all observations. Secondly, by using fewer observations for the prediction, GOS can provide more accurate predictions regarding the cross-validation of prediction accuracy and the comparison with OK, MLR, RK, and BCS models. The RMSE can be reduced by 4.3 to 10.6% by GOS compared with OK and reduced by 0.9 to 2.4% by GOS compared with BCS. Third, the analysis of prediction distributions indicates that GOS can effectively address the issue that many prediction values are close to the mean value and have striped texture in previous geographical similarity-based models. GOS can bring more potential to implement the geographical similarity principle in spatial predictions in broader fields and practice. Finally, GOS includes a new and reliable uncertainty assessment approach (Eq. 12) for geographical similarity-based spatial prediction. The prediction uncertainty assessment indicates that the mean uncertainty in the study area can be reduced by 28.5 to 74.1% by GOS compared with BCS.
Future studies can be developed from the following aspects based on the GOS approach. First, GOS can be integrated with spatial models developed based on spatial dependence and heterogeneity theories for spatial prediction. Each type of spatial prediction method has its advantages, and previous massive studies indicate that integrating different methods can significantly improve prediction accuracy. For instance, GOS can replace the regression part in the regression kriging model to integrate GOS with kriging for prediction. Thus, the GOS-kriging model can reveal the geographical similarity and heterogeneity characteristics in geographical attributes. In this aspect, models that have the potentials to be integrated with GOS include kriging models, geographically weighted regression, geographical detectors, and machine learning algorithms. In addition, it would be helpful to evaluate the model performance of GOS and compare it with other models (e.g., kriging models, geographically weighted regression, and second-dimension spatial association (Song 2022)) under various scenarios, such as different sample sizes and spatial scales. Finally, more applications of GOS in broader fields are recommended to evaluate the effectiveness of the geographical similarity principle in spatial statistical inference. In addition to the definition in this study, the spatial similarity may also be defined by the local and surrounding geographical conditions, structures, and texture at the sample and unknown locations.

Conclusion
This study proposed a GOS model for the geographical similarity principle-based spatial prediction. The detailed steps of GOS and the implementation of GOS in practice were provided in the study. Compared with previous geographical similarity-based approaches, GOS contains the following two new components for modeling. First, an optimal percentage threshold determination approach was developed to select a small number of observations with optimal similarities for the prediction at each unknown location. In addition, a reliable uncertainty assessment approach was developed for assessing and mapping uncertainties of GOS predictions. In this study, GOS was implemented in predicting spatial distributions of trace elements, including Cu, Zn, and Pb, in a mining region in Australia. Results show that GOS can effectively explain geographical configuration information of trace elements and can use a small number of observations to derive more accurate and reliable spatial predictions than linear regression and basic configuration similarity models. In addition, pattern characteristics of predictions can be improved by GOS compared with previous geographical similarity-based models by eliminating the phenomenon that predictions are clustered near mean values and containing striped textures. Therefore, GOS can bring more potential to implement the geographical similarity principle in spatial predictions in broader fields and practice.