Geostatistical modeling customarily measures inter-point distances with a Euclidean metric (i.e., as the crow flies) when characterizing spatial variation. In many real-world settings, however, a non-Euclidean distance is more appropriate, for example, in complex bodies of water. If such a distance is used with existing semivariogram functions, the resulting spatial covariance matrices are no longer guaranteed to be positive-definite. Previous attempts to address this issue for geostatistical prediction (i.e., kriging) transform the non-Euclidean distances into a Euclidean space, such as through multi-dimensional scaling (MDS). However, these approaches estimate spatial covariances only after the distances are scaled. An alternative method is proposed that re-estimates a spatial covariance structure originally based on a non-Euclidean distance metric so that it is valid. This method is compared with the standard use of Euclidean distance and with a previously used MDS method. All methods are evaluated by cross-validation on both simulated and real-world data. Results reveal a high level of bias in prediction variance for the MDS method that has not previously been highlighted. In contrast, the proposed method offers a preferable tradeoff between prediction accuracy and prediction variance and at times outperforms the existing methods on both sets of metrics. Overall, the results indicate that the proposed method can improve geostatistical predictions while ensuring valid results when the use of non-Euclidean distances is warranted.
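The loss of positive-definiteness described above can be seen in a small numerical sketch. The example below is illustrative only and is not the method proposed in the paper: it assumes a Gaussian covariance model with a hypothetical range of 3, uses city-block distance as a stand-in for a non-Euclidean metric such as water distance, and repairs the matrix by simple eigenvalue clipping. All function names are hypothetical.

```python
import numpy as np


def gaussian_covariance(dist, sill=1.0, range_=3.0):
    """Gaussian covariance model evaluated on a matrix of distances."""
    return sill * np.exp(-(dist / range_) ** 2)


def min_eigenvalue(mat):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(mat).min())


def clip_to_positive_definite(mat, floor=1e-8):
    """Raise eigenvalues below `floor` up to `floor` (a simple stand-in repair)."""
    sym = (mat + mat.T) / 2.0
    vals, vecs = np.linalg.eigh(sym)
    clipped = np.clip(vals, floor, None)
    # Rebuild the matrix from its eigenvectors and the clipped eigenvalues.
    repaired = (vecs * clipped) @ vecs.T
    return (repaired + repaired.T) / 2.0


# Four sites at the corners of a unit square (illustrative data only).
sites = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
diff = sites[:, None, :] - sites[None, :, :]
euclidean = np.sqrt((diff ** 2).sum(axis=2))  # straight-line distance
cityblock = np.abs(diff).sum(axis=2)          # non-Euclidean stand-in

cov_euclid = gaussian_covariance(euclidean)
cov_block = gaussian_covariance(cityblock)
cov_fixed = clip_to_positive_definite(cov_block)

print("min eigenvalue, Euclidean: ", min_eigenvalue(cov_euclid))
print("min eigenvalue, city-block:", min_eigenvalue(cov_block))
print("min eigenvalue, repaired:  ", min_eigenvalue(cov_fixed))
```

With these assumed settings, the Euclidean matrix has strictly positive eigenvalues, while the city-block matrix has a negative one and is therefore not a valid covariance matrix. Eigenvalue clipping restores validity but alters the implied covariances, which is one reason the effect of any such repair on prediction variance deserves attention.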
This work was supported by the National Institute of Allergy and Infectious Diseases [Grant No. 1R01AI123931-01A1 to F.C.C. (principal investigator)]. Additional support for B.J.K.D. was provided in part by the Johns Hopkins Environment, Energy, Sustainability & Health Institute Fellowship and the Center for a Livable Future-Lerner Fellowship, as well as the National Science Foundation’s Water, Climate, and Health Integrative Education and Research traineeship (Grant No. 1069213). The authors would like to thank Tim Shields for helping to develop the schematic maps displayed in this paper.