Half-tapering strategy for conditional simulation with large datasets

  • Original Paper
  • Stochastic Environmental Research and Risk Assessment

Abstract

Gaussian conditional realizations are routinely used for risk assessment and planning in a variety of Earth sciences applications. Assuming a Gaussian random field, conditional realizations can be obtained by first creating unconditional realizations that are then post-conditioned by kriging. Many efficient algorithms are available for the first step, so the bottleneck resides in the second step. Instead of doing the conditional simulations with the desired covariance (F approach) or with a tapered covariance (T approach), we propose to use the tapered covariance only in the conditioning step (half-taper or HT approach). This speeds up the computations and reduces the memory requirements of the conditioning step, while preserving the correct short-scale variations in the realizations. A criterion based on the mean square error of the simulation is derived to help anticipate the similarity of HT to F. Moreover, an index is used to predict the sparsity of the kriging matrix for the conditioning step. Some guidelines for the choice of the taper function are discussed. The distributions of a series of 1D, 2D and 3D scalar response functions are compared for the F, T and HT approaches. The distributions obtained indicate a much closer similarity to F with HT than with T.
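As an illustration of the two-step procedure described above, the following is a minimal 1D sketch of the HT idea: the unconditional realization uses the full covariance, and only the kriging-based conditioning uses the tapered covariance. The exponential covariance, the spherical taper, their ranges, the function names and the data values are all assumptions made for this example; they are not taken from the paper. Dense linear algebra is used for brevity, whereas the practical benefit of tapering comes from sparse solvers.

```python
import numpy as np

def exp_cov(h, a=10.0):
    """Exponential covariance with unit sill (illustrative choice for C0)."""
    return np.exp(-np.abs(h) / a)

def spherical_taper(h, r=15.0):
    """Spherical correlation, equal to zero for |h| >= r (illustrative taper)."""
    u = np.minimum(np.abs(h) / r, 1.0)
    return 1.0 - 1.5 * u + 0.5 * u**3

def half_taper_realization(x_data, z_data, x_grid, rng):
    """One conditional realization at the data points followed by the grid points."""
    x_all = np.concatenate([x_data, x_grid])
    h = x_all[:, None] - x_all[None, :]
    n = len(x_data)

    # Step 1: unconditional realization with the full covariance C0
    # (dense Cholesky here; any unconditional simulator could be used instead).
    c0 = exp_cov(h)
    chol = np.linalg.cholesky(c0 + 1e-10 * np.eye(len(x_all)))
    z_u = chol @ rng.standard_normal(len(x_all))

    # Step 2: post-conditioning by simple kriging (zero mean) with the
    # tapered covariance C1 = C0 * taper.
    c1 = c0 * spherical_taper(h)
    weights = np.linalg.solve(c1[:n, :n], z_data - z_u[:n])
    return z_u + c1[:, :n] @ weights

rng = np.random.default_rng(0)
x_data = np.array([2.0, 11.0, 25.0, 40.0])   # hypothetical sample locations
z_data = np.array([0.5, -1.0, 0.3, 1.2])     # hypothetical sample values
x_grid = np.linspace(0.5, 49.5, 50)
z_c = half_taper_realization(x_data, z_data, x_grid, rng)
```

The first four entries of `z_c` reproduce `z_data`, since conditioning by kriging is exact at the data points whichever covariance is used for the weights.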

References

  • Bevilacqua M, Faouzi T, Furrer R, Porcu E (2016) Estimation and prediction using generalized Wendland covariance functions under fixed domain asymptotics. ArXiv 1607.06921v1:1–36. arXiv:1607.06921

  • Bochner S (1933) Monotone Funktionen, Stieltjessche Integrale und harmonische Analyse. Math Ann 108:378–410

  • Bohman H (1960) Approximate Fourier analysis of distribution functions. Ark Mat 4:99–157

  • Bolin D, Lindgren F (2013) A comparison between Markov approximations and other methods for large spatial data sets. Comput Stat Data Anal 61:7–21

  • Bolin D, Wallin J (2016) Spatially adaptive covariance tapering. Spat Stat 18–Part A:163–178

  • Chan G, Wood ATA (1999) Simulation of stationary Gaussian vector fields. Stat Comput 9(4):265–268

  • Chen Y, Davis TA, Hager WW, Rajamanickam S (2008) Algorithm 887: CHOLMOD, supernodal sparse Cholesky factorization and update/downdate. ACM Trans Math Softw 35(3):1–14

  • Chilès J, Delfiner P (2012) Geostatistics: modeling spatial uncertainty, 2nd edn. Wiley, London

  • Cressie N, Johannesson G (2008) Fixed rank kriging for very large spatial data sets. J R Stat Soc Ser B 70:209–226

  • Davis TA (2006) Direct methods for sparse linear systems. SIAM Series Fundamentals of Algorithms, Philadelphia

  • Deltheil R (1926) Probabilités géométriques (Tome II, Fascicule II of : E. Borel, Traité du calcul des probabilités et de ses applications). Gauthier-Villars, Paris, 1–123

  • Dijkstra E (1959) A note on two problems in connexion with graphs. Numer Math 1:269–271

  • Emery X (2004) Testing the correctness of the sequential algorithm for simulating Gaussian random fields. Stoch Env Res Risk Assess 18:401–413

  • Emery X (2008) Statistical tests for validating geostatistical simulation algorithms. Comput Geosci 34(11):1610–1620

  • Emery X, Lantuéjoul C (2006) TBSIM: a computer program for conditional simulation of three-dimensional Gaussian random fields via the turning bands method. Comput Geosci 32(10):1615–1628

  • Emery X, Arroyo D, Porcu E (2015) An improved spectral turning-bands algorithm for simulating stationary vector Gaussian random fields. Stoch Env Res Risk Assess 30(7):1863–1873. doi:10.1007/s00477-015-1151-0

  • Furrer R, Genton MG, Nychka D (2006) Covariance tapering for interpolation of large spatial datasets. J Comput Graph Stat 15(2):502–523

  • Gneiting T (2002) Compactly supported correlation functions. J Multivar Anal 83(2):493–508

  • Gneuss P, Schmid W, Schwarze R (2013) Efficient approximation of the spatial covariance function for large datasets—analysis of atmospheric \(\text{CO}_2\) concentrations. In: Discussion paper series recap15

  • Hovadik J, Larue D (2007) Static characterizations of reservoirs: refining the concepts of connectivity and continuity. Pet Geosci 13:195–211

  • Lantuéjoul C (2002) Geostatistical simulation. Springer, Berlin

  • Lim T, Teo P (2008) Gaussian fields and Gaussian sheets with generalized Cauchy covariance structure. arXiv:0807.0022v1

  • Lindgren F, Rue H, Lindstrom J (2011) An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. J R Stat Soc B 73:423–498

  • Lu TT, Shiou SH (2002) Inverses of \(2 \times 2\) block matrices. Comput Math Appl 43:119–129

  • Marcotte D (2015) TASC3D: a program to test the admissibility in 3D of non-linear models of coregionalization. Comput Geosci 83:168–175

  • Marcotte D (2016) Spatial turning bands simulation of anisotropic non linear models of coregionalization with symmetric cross-covariances. Comput Geosci 89:232–238

  • Matheron G (1965) Les variables régionalisées et leur estimation. Ph.D. thesis, Faculté des Sciences, Université de Paris, Masson

  • Matheron G (1971) The theory of regionalized variables and its applications. École nationale supérieure des mines 5:1–211

  • Paravarzar S, Emery X, Madani N (2015) Comparing sequential Gaussian and turning bands algorithms for cosimulating grades in multi-element deposits. C R Geosci 347:84–93

  • Philip J (1991) The probability distribution of the distance between two random points in a box. Department of Mathematics, Royal Institute of Technology, pp 1–13. https://people.kth.se/~johanph/habc.pdf

  • Porcu E, Daley DJ, Buhmann M, Bevilacqua M (2013) Radial basis functions for multivariate geostatistics. Stoch Env Res Risk Assess 27(4):909–922

  • Renard P, Allard D (2013) Connectivity metrics for subsurface flow and transport. Adv Water Resour 51:168–196

  • Safikhani M, Asghari O, Emery X (2016) Assessing the accuracy of sequential Gaussian simulation through statistical testing. Stoch Environ Res Risk Assess. doi:10.1007/s00477-016-1255-1

  • Sang H, Huang J (2012) A full scale approximation of covariance functions for large spatial data sets. J R Stat Soc B 74:111–132

  • Shinozuka M, Jan CM (1972) Digital simulation of random processes and its applications. J Sound Vib 25:111–128

  • Sneddon I (1951) Fourier transforms. McGraw-Hill, New York

  • Stein M (1993) A simple condition for asymptotic optimality of linear predictions of random fields. Stat Probab Lett 17:399–404

  • Stein M (1999) Interpolation of spatial data: some theory for kriging. Springer, Berlin

  • Stein M (2013) Statistical properties of covariance tapers. J Comput Graph Stat 22:866–885

  • Wackernagel H (2003) Multivariate geostatistics: an introduction with applications, 3rd edn. Springer, Berlin

  • Wendland H (1995) Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree. Adv Comput Math 4(1):389–396

  • Wu Z (1995) Compactly supported positive definite radial functions. Adv Comput Math 4(1):283–292

  • Zhang H, Du J (2008) Covariance tapering in spatial statistics. In: Mateu J, Porcu E (eds) Positive definite functions: from Schoenberg to space-time challenges. Universidad Jaume I., Castellon (Spain)

Acknowledgements

We are indebted to one anonymous reviewer for his attentive and detailed review and for his numerous constructive comments. We thank Prof. Emilio Porcu from Universidad Técnica Federico Santa María in Valparaiso (Chile) for fruitful discussions and for providing us with working material on generalized Wendland covariance functions and their use under fixed domain asymptotics. This research was financed in part by National Science Research Council of Canada (Grant RGPIN105603-05).

Author information

Corresponding author

Correspondence to D. Marcotte.

Appendix: Proof of Proposition 2

We first establish two lemmas:

Lemma 1

(Adapted from Lu and Shiou 2002) Consider the symmetric block matrix

$$\left( \begin{array}{ll} {\mathbf{K}}_0^{-1} & \mathbf{D}_T \\ \mathbf{D}_T & {\mathbf{K}}_1\end{array}\right),$$

where \(\mathbf{D}_T\) is diagonal and \({\mathbf{K}}_0\) and \({\mathbf{K}}_1\) are symmetric non singular matrices such that \({\mathbf{K}}_1 - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T\) and \({\mathbf{K}}_0^{-1} - \mathbf{D}_T {\mathbf{K}}_1^{-1} \mathbf{D}_T\) are non singular. Then,

$$({\mathbf{K}}_0^{-1} - \mathbf{D}_T {\mathbf{K}}_1^{-1} \mathbf{D}_T)^{-1} = {\mathbf{K}}_0 + {\mathbf{K}}_0 \mathbf{D}_T ({\mathbf{K}}_1 - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T)^{-1} \mathbf{D}_T {\mathbf{K}}_0.$$

Proof

This lemma is a direct consequence of Theorem 2 in Lu and Shiou (2002). \(\square\)
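The identity in the form used later in Eq. (25), namely \(({\mathbf{K}}_0^{-1} - \mathbf{D}_T {\mathbf{K}}_1^{-1} \mathbf{D}_T)^{-1} = {\mathbf{K}}_0 + {\mathbf{K}}_0 \mathbf{D}_T ({\mathbf{K}}_1 - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T)^{-1} \mathbf{D}_T {\mathbf{K}}_0\), can be spot-checked numerically. The sketch below is only a sanity check on randomly generated matrices, not a proof. The helper names and the eigenvalue ranges are assumptions chosen so that every matrix to be inverted is well conditioned.

```python
import numpy as np

def spd_with_spectrum(n, lo, hi, rng):
    """Random symmetric matrix whose eigenvalues are drawn in [lo, hi]."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q @ np.diag(rng.uniform(lo, hi, n)) @ q.T

def lemma1_max_error(n=5, seed=0):
    """Largest absolute difference between the two sides of the identity."""
    rng = np.random.default_rng(seed)
    k0 = spd_with_spectrum(n, 1.0, 2.0, rng)
    k1 = spd_with_spectrum(n, 1.0, 2.0, rng)
    # Diagonal entries at most 0.6 keep both Schur complements positive definite
    # for the eigenvalue ranges chosen above.
    d = np.diag(rng.uniform(0.3, 0.6, n))

    lhs = np.linalg.inv(np.linalg.inv(k0) - d @ np.linalg.inv(k1) @ d)
    rhs = k0 + k0 @ d @ np.linalg.inv(k1 - d @ k0 @ d) @ d @ k0
    return np.abs(lhs - rhs).max()

max_err = lemma1_max_error()
```

With these choices `max_err` is expected to be at the level of floating-point rounding.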

Lemma 2

Let \({\mathbf{x}}_1,\dots , {\mathbf{x}}_n\) be n sample points in a finite domain D and let \(C({\mathbf{h}})\) be a covariance function on D with \(C(\mathbf{0})=1\). Let Z be a zero mean Gaussian random field with covariance function \(C({\mathbf{h}})\) on D. Let further \({\mathbf{K}}\) be the \(n \times n\) matrix with elements \([{\mathbf{K}}]_{ij} = C({\mathbf{x}}_i-{\mathbf{x}}_j)\), for \(1 \le i,j \le n\) and \({\mathbf{k}}_{\mathbf{x}}\) be the n vector with elements \([{\mathbf{k}}_{\mathbf{x}}]_i = C({\mathbf{x}}_i-{\mathbf{x}})\), for \({\mathbf{x}}\in D\) and \(1 \le i \le n\). Then, the matrix

$${\mathbf{K}}- {\mathbf{k}}_{\mathbf{x}}{\mathbf{k}}^{\prime }_{\mathbf{x}}$$

is positive semi-definite for any \({\mathbf{x}}\in D\).

Proof

In order to show this, we need to show that for any vector \(\pmb {\lambda }= (\lambda _1,\dots ,\lambda _n)^{\prime } \in {\mathfrak{R}}^n\), it holds that

$$Q = \sum _{i=1}^n \sum _{j=1}^n \lambda _i \lambda _j \left( [{\mathbf{K}}]_{ij} - [{\mathbf{k}}_{\mathbf{x}}]_{i} [{\mathbf{k}}_{\mathbf{x}}]_{j} \right) \ge 0.$$
(21)

Let us denote \(S = \sum _{i=1}^n \lambda _i Z({\mathbf{x}}_i)\). Then, since \(\hbox {Var}\{S\} = \sum _{i=1}^n \sum _{j=1}^n \lambda _i \lambda _j [{\mathbf{K}}]_{ij}\) and \(\hbox {Cov}\{S,Z({\mathbf{x}})\} = \sum _{i=1}^n \lambda _i [{\mathbf{k}}_{\mathbf{x}}]_i\), Eq. (21) is equivalent to

$$Q = \hbox{Var}\{S\} -\hbox {Cov}\{S,Z({\mathbf{x}})\}^2.$$

Using the Cauchy-Schwarz inequality \(\hbox {Cov}\{S,Z({\mathbf{x}})\}^2 \le \hbox {Var}\{S\} \hbox {Var}\{Z({\mathbf{x}})\}\) and \(\hbox {Var}\{Z({\mathbf{x}})\} = C(\mathbf{0}) = 1\), we get

$$Q \ge \hbox {Var}\{S\} -\hbox {Var}\{S\} \hbox {Var}\{Z({\mathbf{x}})\} = 0,$$

which finishes the proof. \(\square\)
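Lemma 2 can also be illustrated numerically by checking that the smallest eigenvalue of \({\mathbf{K}}- {\mathbf{k}}_{\mathbf{x}}{\mathbf{k}}^{\prime }_{\mathbf{x}}\) is nonnegative up to rounding error. The sketch below assumes an isotropic exponential covariance with unit sill on the unit square; the covariance model, its range and the function name are illustrative assumptions, not choices made in the paper.

```python
import numpy as np

def lemma2_min_eigenvalue(n=20, seed=0, a=0.3):
    """Smallest eigenvalue of K - k_x k_x' for random points in the unit square."""
    rng = np.random.default_rng(seed)
    pts = rng.uniform(0.0, 1.0, size=(n, 2))   # sample points x_1, ..., x_n
    x = rng.uniform(0.0, 1.0, size=2)          # target point x

    # Exponential covariance with C(0) = 1 (assumed model).
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    big_k = np.exp(-dist / a)
    small_k = np.exp(-np.linalg.norm(pts - x, axis=1) / a)

    # eigvalsh is appropriate because the matrix is symmetric.
    return np.linalg.eigvalsh(big_k - np.outer(small_k, small_k)).min()

min_eig = lemma2_min_eigenvalue()
```

A value of `min_eig` above a small negative tolerance (for instance `-1e-10`) is consistent with positive semi-definiteness.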

We are now ready to provide the proof of Proposition 2. We must show that \(\sigma ^2_{k,C_1}({\mathbf{x}}) \ge \sigma ^2_{k,C_0}({\mathbf{x}})\) for all \({\mathbf{x}}\in D\). As usual, we drop the dependency on \({\mathbf{x}}\) for the sake of conciseness. Since \(\sigma ^2_{k,C_1} = \sigma ^2_0 - {\mathbf{k}}^{\prime }_1 {\mathbf{K}}_1^{-1} {\mathbf{k}}_1\) and \(\sigma ^2_{k,C_0} = \sigma ^2_0 - {\mathbf{k}}^{\prime }_0 {\mathbf{K}}_0^{-1} {\mathbf{k}}_0\), we need to prove that:

$${\mathbf{k}}^{\prime }_0 {\mathbf{K}}_0^{-1} {\mathbf{k}}_0 - {\mathbf{k}}^{\prime }_1 {\mathbf{K}}_1^{-1} {\mathbf{k}}_1 \ge 0.$$
(22)

Since \({\mathbf{k}}_1 = {\mathbf{k}}_0 \odot {\mathbf{k}}_T\) and \({\mathbf{K}}_1 = {\mathbf{K}}_0 \odot {\mathbf{K}}_T\), where \(\odot\) denotes the element-wise (Hadamard) product, Eq. (22) is equivalent to

$$\sum _{i=1}^n \sum _{j=1}^n [{\mathbf{k}}_0]_i \left( [{\mathbf{K}}_0^{-1}]_{ij} - [{\mathbf{k}}_T]_i \, [\{{\mathbf{K}}_0 \odot {\mathbf{K}}_T\}^{-1}]_{ij} \, [{\mathbf{k}}_T]_j\right) [{\mathbf{k}}_0]_j \ge 0.$$
(23)

To show that this expression is always nonnegative, we will show that the matrix \({\mathbf{M}}\) with elements \([{\mathbf{M}}]_{ij} = [{\mathbf{K}}_0^{-1}]_{ij} - [{\mathbf{k}}_T]_i \, [\{{\mathbf{K}}_0 \odot {\mathbf{K}}_T\}^{-1}]_{ij} \, [{\mathbf{k}}_T]_j\), for \(1 \le i,j \le n\), is positive definite (p.d.) except for the trivial case \({\mathbf{K}}_0={\mathbf{K}}_1, {\mathbf{k}}_0={\mathbf{k}}_1\), corresponding to a taper with infinite range, where \({\mathbf{M}}={\mathbf{0}}\) and the left-hand side of Eq. (22) equals zero. Introducing the diagonal matrix \(\mathbf{D}_T = \hbox {diag}({\mathbf{k}}_T)\), this matrix can also be written

$${\mathbf{M}}={\mathbf{K}}_0^{-1} - \mathbf{D}_T\, \{{\mathbf{K}}_0 \odot {\mathbf{K}}_T\}^{-1} \mathbf{D}_T.$$
(24)

Since \({\mathbf{M}}\) is invertible, it is p.d. if and only if \({\mathbf{M}}^{-1}\) is p.d. Using Lemma 1, its inverse is

$${\mathbf{M}}^{-1} = {\mathbf{K}}_0 + {\mathbf{K}}_0 \mathbf{D}_T\, \{{\mathbf{K}}_0 \odot {\mathbf{K}}_T - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T\}^{-1} \mathbf{D}_T {\mathbf{K}}_0.$$
(25)

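Using Lemma 2, one has that \({\mathbf{K}}_T - {\mathbf{k}}_T {\mathbf{k}}_T^{\prime }\) is positive semi-definite. Hence, using Schur’s product theorem, \({\mathbf{K}}_0 \odot ({\mathbf{K}}_T - {\mathbf{k}}_T {\mathbf{k}}_T^{\prime }) = {\mathbf{K}}_0 \odot {\mathbf{K}}_T - {\mathbf{K}}_0 \odot {\mathbf{k}}_T {\mathbf{k}}_T^{\prime }= {\mathbf{K}}_0 \odot {\mathbf{K}}_T - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T\) is positive semi-definite. Being non singular by the assumption of Lemma 1, it is p.d. and so is its inverse. The second term of Eq. (25) is then a congruence transform \({\mathbf{A}}^{\prime } {\mathbf{B}} {\mathbf{A}}\) of a p.d. matrix \({\mathbf{B}}\), with \({\mathbf{A}} = \mathbf{D}_T {\mathbf{K}}_0\), and is therefore positive semi-definite. As the sum of a p.d. matrix and a positive semi-definite matrix is p.d., we can conclude that \({\mathbf{M}}^{-1}\) in Eq. (25) is also p.d., which completes the proof. \(\square\)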
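The inequality \(\sigma ^2_{k,C_1}({\mathbf{x}}) \ge \sigma ^2_{k,C_0}({\mathbf{x}})\) can be illustrated with a small 1D experiment that compares the simple kriging variance computed with a full covariance and with its tapered version. The exponential covariance, the spherical taper, their ranges, the data locations and the function names below are assumptions chosen for illustration only.

```python
import numpy as np

def cov_full(h, a=10.0):
    """Exponential covariance C0 with unit sill (assumed model)."""
    return np.exp(-np.abs(h) / a)

def cov_tapered(h, a=10.0, r=15.0):
    """C1 = C0 times a spherical taper of range r (assumed taper)."""
    u = np.minimum(np.abs(h) / r, 1.0)
    return cov_full(h, a) * (1.0 - 1.5 * u + 0.5 * u**3)

def kriging_variance(cov, x_data, x_target):
    """Simple kriging variance sigma_0^2 - k' K^{-1} k with sigma_0^2 = 1."""
    big_k = cov(x_data[:, None] - x_data[None, :])
    small_k = cov(x_data[:, None] - x_target[None, :])
    return 1.0 - np.sum(small_k * np.linalg.solve(big_k, small_k), axis=0)

x_data = np.array([1.0, 4.0, 9.0, 15.0, 22.0, 30.0])   # hypothetical data locations
x_target = np.linspace(0.0, 32.0, 65)

var_full = kriging_variance(cov_full, x_data, x_target)
var_tapered = kriging_variance(cov_tapered, x_data, x_target)
gap = var_tapered - var_full   # expected to be nonnegative up to rounding
```

In this setting `gap` should never fall below zero by more than rounding error, and it vanishes at targets that coincide with a data point, where both variances are zero.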

About this article

Cite this article

Marcotte, D., Allard, D. Half-tapering strategy for conditional simulation with large datasets. Stoch Environ Res Risk Assess 32, 279–294 (2018). https://doi.org/10.1007/s00477-017-1386-z

  • DOI: https://doi.org/10.1007/s00477-017-1386-z
