In sub-competition 1, we discuss our results in terms of the RMSEs, since our primary focus was on making predictions using covariance tapering. Applying covariance tapering required some initial decisions. First, we chose an appropriate taper function for each field. For this purpose, we estimated the smoothness parameter with an educated guess based on the empirical covariogram and the criteria described in Sect. 1. Comparing the chosen taper functions with the true smoothness parameters (cf. Table 1 in Huang et al. 2021), we misspecified the taper function only for fields containing a nugget effect. Since the true smoothness parameters lie near the margins of the decision criteria between the mentioned taper functions, the misspecified taper functions did not seem to affect the RMSE. Second, we chose an appropriate taper range. The idea was to use the effective range estimated from the empirical covariogram. Because the density of the tapered covariance matrix increases with the taper range, we imposed an upper limit of 0.3 for computational efficiency. We started analyzing each data set with a random subsample of at least 10’000 locations and then increased the subsample size or adjusted the taper range until the estimates became stable or computational limits were exceeded. As the estimates of the tapered Matérn are biased estimates of the Matérn covariance function, we anticipated poor performance with respect to the MMOM and MLOE criteria; cf. Table S2 in Huang et al. (2021). It is natural but also crucial to use the same model for parameter estimation and prediction. Hence, we used the tapered Matérn covariance function with the estimates from 1a for prediction at the 10’000 locations. To this end, we included all available observations and solved one massive linear system for each data set.
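To make the tapering workflow concrete, the following is a minimal sketch, not our actual implementation: a Matérn covariance (here with smoothness fixed at 3/2, so a closed form applies) is multiplied by a compactly supported Wendland taper, the resulting sparse system is solved once, and simple-kriging predictions follow. All sizes, parameter values, and the small stabilizing nugget are illustrative assumptions.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import spsolve

def matern_32(h, sigma2=1.0, rho=0.1):
    """Matern covariance with smoothness nu = 3/2 (closed-form case)."""
    d = np.sqrt(3.0) * h / rho
    return sigma2 * (1.0 + d) * np.exp(-d)

def wendland1_taper(h, theta=0.3):
    """Wendland_1 taper, compactly supported on [0, theta]."""
    x = h / theta
    return np.where(x < 1.0, (1.0 - x) ** 4 * (4.0 * x + 1.0), 0.0)

rng = np.random.default_rng(0)
obs = rng.uniform(size=(500, 2))    # observation locations (illustrative size)
pred = rng.uniform(size=(10, 2))    # prediction locations
z = rng.standard_normal(500)        # placeholder data, zero-mean field

theta = 0.3                         # taper range (upper limit used in the text)
H = cdist(obs, obs)
# Tapered covariance: exact zeros beyond theta make the matrix sparse;
# a tiny nugget is added purely for numerical stability in this sketch.
C_tap = csr_matrix(matern_32(H) * wendland1_taper(H, theta) + 1e-8 * np.eye(500))

H0 = cdist(pred, obs)
c0 = matern_32(H0) * wendland1_taper(H0, theta)
weights = spsolve(C_tap.tocsc(), z)   # one sparse linear solve
z_hat = c0 @ weights                  # simple-kriging predictions (zero mean)
```

In the competition setting the same tapered model is used for estimation and prediction; only the single solve is repeated per data set.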
Huang et al. (2021) observed for other submissions a distinct separation of RMSEs between fields with and without nugget effects. This separation, evident in Fig. 1, arises in our results from simple kriging, which is essentially weighted spatial smoothing: fields with a lower signal-to-noise ratio can be interpolated with smaller errors. This inherent smoothing also explains the higher RMSEs for fields with a relatively small true smoothness \(\nu = 0.6\); cf. Fig. 1, where our respective RMSEs all lie above the average linear trend. Consistent with taper theory, the RMSE of the predictions decreases with increasing taper range. On the other hand, increasing the number of random subsamples seems to affect the RMSE positively in the right panel of Fig. 1. This positive effect may be tempered by the lower-quality predictions for data set 3 (rank 10), which seems to have high leverage on it. Moreover, fields with a small true effective range \(\beta_{\text{eff}}\) seem to have a higher RMSE and distort the effect of increasing the subsamples. Additionally, the distribution of the ranks in the \(x\)-direction in both panels of Fig. 1 indicates no tendency with respect to increasing taper range or subsample size. It seems that, for these fields, the taper range should be chosen as large as the available memory allows to obtain high-quality predictions. This is supported by the fact that the number of neighbors is higher for a large taper range with a small subsample than for a large subsample. Noteworthy in this context is that for data sets 4, 5, 12 and 13, the MLOE and MMOM are very large due to wide range estimates. Despite this, the predictions are still of high quality, since we used the tapered covariance estimates that were jointly optimized. Hence, the RMSE is reasonably low, and for these data sets we rank 4th, 7th, 8th and 2nd in 1b.
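The memory trade-off behind the taper-range choice can be sketched numerically: for uniformly scattered locations, the fraction of retained (nonzero) covariance entries grows roughly quadratically with the taper range, which is why a cap such as 0.3 keeps the sparse system tractable. The point count below is an arbitrary illustrative choice.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(1)
locs = rng.uniform(size=(2000, 2))   # uniform locations on the unit square
h = pdist(locs)                      # all pairwise distances

# Fraction of off-diagonal covariance entries that survive the taper,
# i.e. the density of the tapered covariance matrix, for several ranges.
for theta in (0.05, 0.1, 0.2, 0.3):
    density = np.mean(h < theta)
    print(f"taper range {theta:.2f}: ~{100 * density:.1f}% nonzero off-diagonal")
```

The printed densities grow quickly with the range, mirroring the statement that matrix density, and hence memory use, increases with the taper range.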
In sub-competition 2, we first chose the type of compactly supported Wendland covariance function based on the approximate likelihood. It turned out that a smoothness equal to one, i.e. \(\text{Wendland}_1\), suited all four data sets best. Once this parameter was established, sill and nugget were estimated based on random subsamples of size 5’000. Thereby, we set the range to 0.2 for 2a and 0.015 for 2b. For computational efficiency, we used a narrower range in 2b, which has 900’000 observations. Given that the fields in sub-competition 2a contain no nugget effect and are of the same size as the fields in 1b, we would expect only slightly higher RMSEs. However, the RMSEs in 2a are noticeably higher than the average RMSEs from 1b, likely due to insufficient local taper ranges and too few random subsamples. In 2b, where we used narrower taper ranges, the RMSEs are much smaller and we rank higher among the competitors. Since the total data sizes are considerably larger in 2b, the kriging predictions are correspondingly more accurate.
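The estimation step can be sketched as follows, again as an illustration rather than our actual code: the \(\text{Wendland}_1\) covariance with sill and nugget defines a Gaussian likelihood, which is evaluated on a random subsample (here only 300 locations instead of 5’000, and all parameter values are assumptions for the example).

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.linalg import cho_factor, cho_solve, cholesky

def wendland1(h, sill=1.0, taper_range=0.2):
    """Wendland_1 covariance, compactly supported on [0, taper_range]."""
    x = h / taper_range
    return sill * np.where(x < 1.0, (1.0 - x) ** 4 * (4.0 * x + 1.0), 0.0)

def neg_loglik(sill, nugget, locs, z):
    """Exact Gaussian negative log-likelihood on a (sub)sample of locations."""
    C = wendland1(squareform(pdist(locs)), sill) + nugget * np.eye(len(z))
    L, lower = cho_factor(C, lower=True)
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    quad = z @ cho_solve((L, lower), z)
    return 0.5 * (logdet + quad + len(z) * np.log(2.0 * np.pi))

# Simulate a small field from the model, then compare candidate parameters.
rng = np.random.default_rng(2)
locs = rng.uniform(size=(300, 2))
C_true = wendland1(squareform(pdist(locs)), sill=1.0) + 0.5 * np.eye(300)
z = cholesky(C_true, lower=True) @ rng.standard_normal(300)

nll_true = neg_loglik(1.0, 0.5, locs, z)   # likelihood at the simulating values
nll_wrong = neg_loglik(1.0, 5.0, locs, z)  # likelihood at a badly inflated nugget
```

In practice one would pass `neg_loglik` to a numerical optimizer over sill and nugget; subsampling keeps the dense Cholesky factorization affordable.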