Abstract
Gaussian conditional realizations are routinely used for risk assessment and planning in a variety of Earth sciences applications. Assuming a Gaussian random field, conditional realizations can be obtained by first creating unconditional realizations that are then post-conditioned by kriging. Many efficient algorithms are available for the first step, so the bottleneck lies in the second step. Instead of performing the conditional simulations with the desired covariance (F approach) or with a tapered covariance (T approach), we propose to use the tapered covariance only in the conditioning step (half-taper or HT approach). This speeds up the computations and reduces the memory requirements of the conditioning step while preserving the correct short-scale variations in the realizations. A criterion based on the mean square error of the simulation is derived to help anticipate the similarity of HT to F. Moreover, an index is used to predict the sparsity of the kriging matrix for the conditioning step. Some guidelines for the choice of the taper function are discussed. The distributions of a series of 1D, 2D and 3D scalar response functions are compared for the F, T and HT approaches. The distributions obtained indicate that HT is much closer to F than T is.
References
Bevilacqua M, Faouzi T, Furrer R, Porcu E (2016) Estimation and prediction using generalized Wendland covariance functions under fixed domain asymptotics. ArXiv 1607.06921v1:1–36. arXiv:1607.06921
Bochner S (1933) Monotone Funktionen, Stieltjessche Integrale und harmonische Analyse. Math Ann 108:378–410
Bohman H (1960) Approximate Fourier analysis of distribution functions. Ark Mat 4:99–157
Bolin D, Lindgren F (2013) A comparison between Markov approximations and other methods for large spatial data sets. Comput Stat Data Anal 61:7–21
Bolin D, Wallin J (2016) Spatially adaptive covariance tapering. Spat Stat 18–Part A:163–178
Chan G, Wood ATA (1999) Simulation of stationary Gaussian vector fields. Stat Comput 9(4):265–268
Chen Y, Davis TA, Hager WW, Rajamanickam S (2008) Algorithm 887: CHOLMOD, supernodal sparse Cholesky factorization and update/downdate. ACM Trans Math Softw 35(3):1–14
Chilès J, Delfiner P (2012) Geostatistics: modeling spatial uncertainty, 2nd edn. Wiley, London
Cressie N, Johannesson G (2008) Fixed rank kriging for very large spatial data sets. J R Stat Soc Ser B 70:209–226
Davis TA (2006) Direct methods for sparse linear systems. SIAM Series Fundamentals of Algorithms, Philadelphia
Deltheil R (1926) Probabilités géométriques (Tome II, Fascicule II of: E. Borel, Traité du calcul des probabilités et de ses applications). Gauthier-Villars, Paris, 1–123
Dijkstra E (1959) A note on two problems in connexion with graphs. Numer Math 1:269–271
Emery X (2004) Testing the correctness of the sequential algorithm for simulating Gaussian random fields. Stoch Env Res Risk Assess 18:401–413
Emery X (2008) Statistical tests for validating geostatistical simulation algorithms. Comput Geosci 34(11):1610–1620
Emery X, Lantuéjoul C (2006) TBSIM: a computer program for conditional simulation of three-dimensional Gaussian random fields via the turning bands method. Comput Geosci 32(10):1615–1628
Emery X, Arroyo D, Porcu E (2015) An improved spectral turning-bands algorithm for simulating stationary vector Gaussian random fields. Stoch Env Res Risk Assess 30(7):1863–1873. doi:10.1007/s00477-015-1151-0
Furrer R, Genton MG, Nychka D (2006) Covariance tapering for interpolation of large spatial datasets. J Comput Graph Stat 15(2):502–523
Gneiting T (2002) Compactly supported correlation functions. J Multivar Anal 83(2):493–508
Gneuss P, Schmid W, Schwarze R (2013) Efficient approximation of the spatial covariance function for large datasets—analysis of atmospheric \(\text{CO}_2\) concentrations. In: Discussion paper series recap15
Hovadik J, Larue D (2007) Static characterizations of reservoirs: refining the concepts of connectivity and continuity. Pet Geosci 13:195–211
Lantuéjoul C (2002) Geostatistical simulation. Springer, Berlin
Lim T, Teo P (2008) Gaussian fields and Gaussian sheets with generalized Cauchy covariance structure. arXiv:0807.0022v1
Lindgren F, Rue H, Lindstrom J (2011) An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. J R Stat Soc B 73:423–498
Lu TT, Shiou SH (2002) Inverses of \(2 \times 2\) block matrices. Comput Math Appl 43:119–129
Marcotte D (2015) TASC3D: a program to test the admissibility in 3D of non-linear models of coregionalization. Comput Geosci 83:168–175
Marcotte D (2016) Spatial turning bands simulation of anisotropic non linear models of coregionalization with symmetric cross-covariances. Comput Geosci 89:232–238
Matheron G (1965) Les variables régionalisées et leur estimation. Ph.D. thesis, Faculté des Sciences, Université de Paris, Masson
Matheron G (1971) The theory of regionalized variables and its applications. École nationale supérieure des mines 5:1–211
Paravarzar S, Emery X, Madani N (2015) Comparing sequential Gaussian and turning bands algorithms for cosimulating grades in multi-element deposits. C R Geosci 347:84–93
Philip J (1991) The probability distribution of the distance between two random points in a box. Department of Mathematics, Royal Institute of Technology, pp 1–13. https://people.kth.se/~johanph/habc.pdf
Porcu E, Daley DJ, Buhmann M, Bevilacqua M (2013) Radial basis functions for multivariate geostatistics. Stoch Env Res Risk Assess 27(4):909–922
Renard P, Allard D (2013) Connectivity metrics for subsurface flow and transport. Adv Water Resour 51:168–196
Safikhani M, Asghari O, Emery X (2016) Assessing the accuracy of sequential Gaussian simulation through statistical testing. Stoch Environ Res Risk Assess. doi:10.1007/s00477-016-1255-1
Sang H, Huang J (2012) A full scale approximation of covariance functions for large spatial data sets. J R Stat Soc B 74:111–132
Shinozuka M, Jan CM (1972) Digital simulation of random processes and its applications. J Sound Vib 25:111–128
Sneddon I (1951) Fourier transforms. McGraw-Hill, New York
Stein M (1993) A simple condition for asymptotic optimality of linear predictions of random fields. Stat Probab Lett 17:399–404
Stein M (1999) Interpolation of spatial data: some theory for kriging. Springer, Berlin
Stein M (2013) Statistical properties of covariance tapers. J Comput Graph Stat 22:866–885
Wackernagel H (2003) Multivariate geostatistics: an introduction with applications, 3rd edn. Springer, Berlin
Wendland H (1995) Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree. Adv Comput Math 4(1):389–396
Wu Z (1995) Compactly supported positive definite radial functions. Adv Comput Math 4(1):283–292
Zhang H, Du J (2008) Covariance tapering in spatial statistics. In: Mateu J, Porcu E (eds) Positive definite functions: from Schoenberg to space-time challenges. Universidad Jaume I., Castellon (Spain)
Acknowledgements
We are indebted to one anonymous reviewer for his attentive and detailed review and for his numerous constructive comments. We thank Prof. Emilio Porcu from Universidad Técnica Federico Santa María in Valparaiso (Chile) for fruitful discussions and for providing us with working material on generalized Wendland covariance functions and their use under fixed domain asymptotics. This research was financed in part by National Science Research Council of Canada (Grant RGPIN105603-05).
Appendix: Proof of Proposition 2
We first establish two lemmas:
Lemma 1
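(Adapted from Lu and Shiou 2002) Consider the symmetric block matrix
\[\begin{pmatrix} {\mathbf{K}}_0^{-1} & \mathbf{D}_T \\ \mathbf{D}_T & {\mathbf{K}}_1 \end{pmatrix},\]
where \(\mathbf{D}_T\) is diagonal and \({\mathbf{K}}_0\) and \({\mathbf{K}}_1\) are symmetric non-singular matrices such that \({\mathbf{K}}_1 - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T\) and \({\mathbf{K}}_0^{-1} - \mathbf{D}_T {\mathbf{K}}_1^{-1} \mathbf{D}_T\) are non-singular. Then,
\[\left( {\mathbf{K}}_0^{-1} - \mathbf{D}_T {\mathbf{K}}_1^{-1} \mathbf{D}_T \right)^{-1} = {\mathbf{K}}_0 + {\mathbf{K}}_0 \mathbf{D}_T \left( {\mathbf{K}}_1 - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T \right)^{-1} \mathbf{D}_T {\mathbf{K}}_0.\]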
Proof
This lemma is a direct consequence of Theorem 2 in Lu and Shiou (2002). \(\square\)
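As a numerical illustration, the inversion identity used from this lemma, \(({\mathbf{K}}_0^{-1} - \mathbf{D}_T {\mathbf{K}}_1^{-1} \mathbf{D}_T)^{-1} = {\mathbf{K}}_0 + {\mathbf{K}}_0 \mathbf{D}_T ({\mathbf{K}}_1 - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T)^{-1} \mathbf{D}_T {\mathbf{K}}_0\), can be checked on a small example. The sketch below is only a sanity check under assumed inputs: the 1D sample locations, the exponential covariance, the spherical taper and all function names are hypothetical choices, not taken from the paper.

```python
import numpy as np

def exponential_cov(h, a=3.0):
    """Exponential covariance with unit sill (assumed model for K0)."""
    return np.exp(-np.abs(h) / a)

def spherical_taper(h, a=5.0):
    """Spherical taper with compact support of range a (assumed taper)."""
    r = np.minimum(np.abs(h) / a, 1.0)
    return 1.0 - 1.5 * r + 0.5 * r**3

def lemma1_sides(points, x_target):
    """Return both sides of the inversion identity for 1D sample points."""
    h = points[:, None] - points[None, :]
    K0 = exponential_cov(h)
    K1 = K0 * spherical_taper(h)                      # Hadamard product K0 ⊙ KT
    D_T = np.diag(spherical_taper(points - x_target)) # diag(k_T)
    # Left-hand side: inverse of K0^{-1} - D_T K1^{-1} D_T.
    lhs = np.linalg.inv(np.linalg.inv(K0) - D_T @ np.linalg.inv(K1) @ D_T)
    # Right-hand side: K0 + K0 D_T (K1 - D_T K0 D_T)^{-1} D_T K0.
    rhs = K0 + K0 @ D_T @ np.linalg.inv(K1 - D_T @ K0 @ D_T) @ D_T @ K0
    return lhs, rhs

# Hypothetical sample locations and target point.
points = np.array([0.5, 2.0, 3.1, 5.5, 7.0, 9.4])
lhs, rhs = lemma1_sides(points, x_target=4.2)
max_abs_diff = np.max(np.abs(lhs - rhs))
print("largest absolute difference between both sides:", max_abs_diff)
```

Explicit inverses are used here only to mirror the formula; in practice one would rely on linear solves or sparse factorizations.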
Lemma 2
Let \({\mathbf{x}}_1,\dots , {\mathbf{x}}_n\) be n sample points in a finite domain D and let \(C({\mathbf{h}})\) be a covariance function on D with \(C(\mathbf{0})=1\). Let Z be a zero mean Gaussian random field with covariance function \(C({\mathbf{h}})\) on D. Let further \({\mathbf{K}}\) be the \(n \times n\) matrix with elements \([{\mathbf{K}}]_{ij} = C({\mathbf{x}}_i-{\mathbf{x}}_j)\), for \(1 \le i,j \le n\) and \({\mathbf{k}}_{\mathbf{x}}\) be the n vector with elements \([{\mathbf{k}}_{\mathbf{x}}]_i = C({\mathbf{x}}_i-{\mathbf{x}})\), for \({\mathbf{x}}\in D\) and \(1 \le i \le n\). Then, the matrix
\[{\mathbf{K}} - {\mathbf{k}}_{\mathbf{x}} {\mathbf{k}}_{\mathbf{x}}^{\prime }\]
is positive semi-definite for any \({\mathbf{x}}\in D\).
Proof
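In order to show this, we need to show that for any vector \(\pmb {\lambda }= (\lambda _1,\dots ,\lambda _n)^{\prime } \in {\mathfrak{R}}^n\), it holds that
\[\sum _{i=1}^n \sum _{j=1}^n \lambda _i \lambda _j [{\mathbf{K}}]_{ij} - \left( \sum _{i=1}^n \lambda _i [{\mathbf{k}}_{\mathbf{x}}]_i \right) ^2 \ge 0. \tag{21}\]
Let us denote \(S = \sum _{i=1}^n \lambda _i Z({\mathbf{x}}_i)\). Then, since \(\hbox {Var}\{S\} = \sum _{i=1}^n \sum _{j=1}^n \lambda _i \lambda _j [{\mathbf{K}}]_{ij}\) and \(\hbox {Cov}\{S,Z({\mathbf{x}})\} = \sum _{i=1}^n \lambda _i [{\mathbf{k}}_{\mathbf{x}}]_i\), Eq. (21) is equivalent to
\[\hbox {Var}\{S\} - \hbox {Cov}\{S,Z({\mathbf{x}})\}^2 \ge 0.\]
Using the Cauchy-Schwarz inequality \(\hbox {Cov}\{S,Z({\mathbf{x}})\}^2 \le \hbox {Var}\{S\} \hbox {Var}\{Z({\mathbf{x}})\}\) and \(\hbox {Var}\{Z({\mathbf{x}})\} = 1\), we get
\[\hbox {Var}\{S\} - \hbox {Cov}\{S,Z({\mathbf{x}})\}^2 \ge \hbox {Var}\{S\} - \hbox {Var}\{S\} \hbox {Var}\{Z({\mathbf{x}})\} = 0,\]
which finishes the proof. \(\square\)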
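The statement of Lemma 2 can be illustrated numerically by checking that the smallest eigenvalue of \({\mathbf{K}} - {\mathbf{k}}_{\mathbf{x}} {\mathbf{k}}_{\mathbf{x}}^{\prime }\) is nonnegative, up to rounding, for a range of target locations. This is a minimal sketch under assumptions: the exponential covariance, the 1D sample locations, the tolerance and the helper names are hypothetical and serve only as an example.

```python
import numpy as np

def exponential_cov(h, a=3.0):
    """Exponential covariance with C(0) = 1 (an illustrative choice)."""
    return np.exp(-np.abs(h) / a)

def lemma2_min_eigenvalue(points, x, cov=exponential_cov):
    """Smallest eigenvalue of K - k_x k_x' for 1D sample points and target x."""
    points = np.asarray(points, dtype=float)
    K = cov(points[:, None] - points[None, :])   # [K]_ij = C(x_i - x_j)
    k_x = cov(points - x)                        # [k_x]_i = C(x_i - x)
    return np.linalg.eigvalsh(K - np.outer(k_x, k_x)).min()

# Hypothetical sample locations; some targets coincide with sample points,
# where the matrix is singular and the smallest eigenvalue is zero up to rounding.
points = [0.5, 2.0, 3.1, 5.5, 7.0, 9.4]
targets = np.linspace(0.0, 10.0, 41)
min_eigs = np.array([lemma2_min_eigenvalue(points, x) for x in targets])

# A small negative tolerance absorbs floating-point rounding.
all_psd = bool(np.all(min_eigs >= -1e-12))
print("positive semi-definite at all targets:", all_psd)
```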
We are now ready to provide the proof of Proposition 2. We must show that \(\sigma ^2_{k,C_1}({\mathbf{x}}) \ge \sigma ^2_{k,C_0}({\mathbf{x}})\) for all \({\mathbf{x}}\in D\). As usual, we drop the dependency on \({\mathbf{x}}\) for the sake of conciseness. Since \(\sigma ^2_{k,C_1} = \sigma ^2_0 - {\mathbf{k}}^{\prime }_1 {\mathbf{K}}_1^{-1} {\mathbf{k}}_1\) and \(\sigma ^2_{k,C_0} = \sigma ^2_0 - {\mathbf{k}}^{\prime }_0 {\mathbf{K}}_0^{-1} {\mathbf{k}}_0\), we need to prove that:
\[{\mathbf{k}}^{\prime }_0 {\mathbf{K}}_0^{-1} {\mathbf{k}}_0 - {\mathbf{k}}^{\prime }_1 {\mathbf{K}}_1^{-1} {\mathbf{k}}_1 \ge 0. \tag{22}\]
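Since \({\mathbf{k}}_1 = {\mathbf{k}}_0 \odot {\mathbf{k}}_T\) and \({\mathbf{K}}_1 = {\mathbf{K}}_0 \odot {\mathbf{K}}_T\), Eq. (22) is equivalent to
\[{\mathbf{k}}^{\prime }_0 {\mathbf{K}}_0^{-1} {\mathbf{k}}_0 - ({\mathbf{k}}_0 \odot {\mathbf{k}}_T)^{\prime } \{{\mathbf{K}}_0 \odot {\mathbf{K}}_T\}^{-1} ({\mathbf{k}}_0 \odot {\mathbf{k}}_T) \ge 0. \tag{23}\]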
To show that this expression is always nonnegative, we will show that the matrix \({\mathbf{M}}\) with elements \([{\mathbf{M}}]_{ij} = [{\mathbf{K}}_0^{-1}]_{ij} - [{\mathbf{k}}_T]_i \, [\{{\mathbf{K}}_0 \odot {\mathbf{K}}_T\}^{-1}]_{ij} \, [{\mathbf{k}}_T]_j\), for \(1 \le i,j \le n\), is positive definite (p.d.) except for the trivial case \({\mathbf{K}}_0={\mathbf{K}}_1, {\mathbf{k}}_0={\mathbf{k}}_1\), corresponding to a taper with infinite range, where \({\mathbf{M}}={\mathbf{0}}\) and the left-hand side of Eq. (22) equals zero. Introducing the diagonal matrix \(\mathbf{D}_T = \hbox {diag}({\mathbf{k}}_T)\), this matrix can also be written
\[{\mathbf{M}} = {\mathbf{K}}_0^{-1} - \mathbf{D}_T \{{\mathbf{K}}_0 \odot {\mathbf{K}}_T\}^{-1} \mathbf{D}_T. \tag{24}\]
Since \({\mathbf{M}}\) is invertible, it is p.d. if and only if \({\mathbf{M}}^{-1}\) is p.d. Using Lemma 1, its inverse is
\[{\mathbf{M}}^{-1} = {\mathbf{K}}_0 + {\mathbf{K}}_0 \mathbf{D}_T \{{\mathbf{K}}_0 \odot {\mathbf{K}}_T - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T\}^{-1} \mathbf{D}_T {\mathbf{K}}_0. \tag{25}\]
Using Lemma 2, one has that \({\mathbf{K}}_T - {\mathbf{k}}_T {\mathbf{k}}_T^{\prime }\) is p.d. Hence, using Schur’s product theorem, \({\mathbf{K}}_0 \odot ({\mathbf{K}}_T - {\mathbf{k}}_T {\mathbf{k}}_T^{\prime }) = {\mathbf{K}}_0 \odot {\mathbf{K}}_T - {\mathbf{K}}_0 \odot {\mathbf{k}}_T {\mathbf{k}}_T^{\prime }= {\mathbf{K}}_0 \odot {\mathbf{K}}_T - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T\) is p.d. and so is its inverse. The second term of Eq. (25) is thus a congruence transformation of a p.d. matrix and is therefore positive semi-definite. Adding the p.d. matrix \({\mathbf{K}}_0\), we can conclude that \({\mathbf{M}}^{-1}\) in Eq. (25) is also p.d., which completes the proof. \(\square\)
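The inequality \(\sigma ^2_{k,C_1} \ge \sigma ^2_{k,C_0}\) can also be checked numerically on a toy configuration. The sketch below computes simple kriging variances \(\sigma ^2_0 - {\mathbf{k}}^{\prime } {\mathbf{K}}^{-1} {\mathbf{k}}\) under a covariance \(C_0\) and under its tapered version \(C_1 = C_0 C_T\). The exponential model for \(C_0\), the spherical taper for \(C_T\), the 1D sample locations and every function name are assumptions made for illustration only; they are not the settings used in the paper.

```python
import numpy as np

def exponential_cov(h, a=3.0):
    """Assumed covariance C0: exponential model with unit sill."""
    return np.exp(-np.abs(h) / a)

def spherical_taper(h, a=5.0):
    """Assumed taper C_T: spherical model, zero beyond range a."""
    r = np.minimum(np.abs(h) / a, 1.0)
    return 1.0 - 1.5 * r + 0.5 * r**3

def tapered_cov(h):
    """Tapered covariance C1 = C0 * C_T."""
    return exponential_cov(h) * spherical_taper(h)

def sk_variance(points, x, cov):
    """Simple kriging variance sigma_0^2 - k' K^{-1} k at target x."""
    K = cov(points[:, None] - points[None, :])
    k = cov(points - x)
    return cov(0.0) - k @ np.linalg.solve(K, k)

# Hypothetical 1D sample locations and target grid.
points = np.array([0.5, 2.0, 3.1, 5.5, 7.0, 9.4])
targets = np.linspace(0.0, 10.0, 41)

var_full = np.array([sk_variance(points, x, exponential_cov) for x in targets])
var_tapered = np.array([sk_variance(points, x, tapered_cov) for x in targets])

# Proposition 2 predicts a nonnegative gap; the tolerance absorbs rounding
# at targets that coincide with sample points, where both variances vanish.
gap = var_tapered - var_full
inequality_holds = bool(np.all(gap >= -1e-10))
print("tapered variance >= full variance at all targets:", inequality_holds)
```

A dense solve is used for brevity. With a compactly supported taper the matrix \({\mathbf{K}}_1\) is sparse, so a sparse Cholesky factorization would be the natural choice for large datasets.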
Marcotte, D., Allard, D. Half-tapering strategy for conditional simulation with large datasets. Stoch Environ Res Risk Assess 32, 279–294 (2018). https://doi.org/10.1007/s00477-017-1386-z
DOI: https://doi.org/10.1007/s00477-017-1386-z