Half-tapering strategy for conditional simulation with large datasets

Marcotte, D.; Allard, D.

doi:10.1007/s00477-017-1386-z

Original Paper
Published: 06 February 2017

Volume 32, pages 279–294, (2018)

Stochastic Environmental Research and Risk Assessment


Abstract

Gaussian conditional realizations are routinely used for risk assessment and planning in a variety of Earth sciences applications. Assuming a Gaussian random field, conditional realizations can be obtained by first creating unconditional realizations that are then post-conditioned by kriging. Many efficient algorithms are available for the first step, so the bottleneck resides in the second step. Instead of doing the conditional simulations with the desired covariance (F approach) or with a tapered covariance (T approach), we propose to use the tapered covariance only in the conditioning step (half-taper or HT approach). This speeds up the computations and reduces the memory requirements of the conditioning step while preserving the correct short-scale variations in the realizations. A criterion based on the mean square error of the simulation is derived to help anticipate the similarity of HT to F. Moreover, an index is used to predict the sparsity of the kriging matrix for the conditioning step. Some guidelines for the choice of the taper function are discussed. The distributions of a series of 1D, 2D and 3D scalar response functions are compared for the F, T and HT approaches. The distributions obtained indicate a much better similarity to F with HT than with T.
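The two-step procedure described in the abstract can be illustrated with a minimal 1D sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the exponential covariance, the spherical taper, the ranges `a` and `r`, the hypothetical function name `half_taper_conditional_sim`, and the dense Cholesky factorization used for the unconditional step (a large problem would rather rely on one of the efficient simulation algorithms the abstract alludes to, and would assemble the sparse matrices directly). The unconditional realization uses the full covariance, whereas the simple kriging of the residuals uses the tapered covariance only.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve


def exp_cov(h, a=10.0):
    """Exponential covariance C0 with unit sill (illustrative choice)."""
    return np.exp(-np.abs(h) / a)


def spherical_taper(h, r=15.0):
    """Spherical taper CT, compactly supported: zero for |h| >= r."""
    u = np.minimum(np.abs(h) / r, 1.0)
    return 1.0 - 1.5 * u + 0.5 * u**3


def half_taper_conditional_sim(x_grid, data_idx, z_data, rng, a=10.0, r=15.0):
    """Hypothetical helper: HT conditional realization on a 1D grid.

    The data are assumed to sit on the grid nodes listed in data_idx.
    """
    # Step 1: unconditional realization with the full covariance C0.
    h = x_grid[:, None] - x_grid[None, :]
    chol = np.linalg.cholesky(exp_cov(h, a) + 1e-10 * np.eye(x_grid.size))
    z_u = chol @ rng.standard_normal(x_grid.size)

    # Step 2: post-conditioning by simple kriging of the residuals,
    # using the tapered covariance C0 * CT only (sparse system).
    x_data = x_grid[data_idx]
    h_dd = x_data[:, None] - x_data[None, :]
    h_gd = x_grid[:, None] - x_data[None, :]
    K1 = sparse.csc_matrix(exp_cov(h_dd, a) * spherical_taper(h_dd, r))
    k1 = sparse.csr_matrix(exp_cov(h_gd, a) * spherical_taper(h_gd, r))
    weights = spsolve(K1, z_data - z_u[data_idx])
    return z_u + k1 @ weights


rng = np.random.default_rng(0)
x_grid = np.arange(0.0, 100.0, 1.0)
data_idx = np.array([5, 20, 21, 50, 90])
z_data = np.array([0.3, -1.0, -0.8, 1.2, 0.0])
z_cond = half_taper_conditional_sim(x_grid, data_idx, z_data, rng)
```

Because simple kriging is an exact interpolator for any valid covariance, the realization still honours the data at the conditioning nodes even though the kriging weights come from the tapered model.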


References

Bevilacqua M, Faouzi T, Furrer R, Porcu E (2016) Estimation and prediction using generalized Wendland covariance functions under fixed domain asymptotics. ArXiv 1607.06921v1:1–36. arXiv:1607.06921
Bochner S (1933) Monotone Funktionen, Stieltjessche Integrale und harmonische Analyse. Math Ann 108:378–410
Bohman H (1960) Approximate Fourier analysis of distribution functions. Ark Mat 4:99–157
Bolin D, Lindgren F (2013) A comparison between Markov approximations and other methods for large spatial data sets. Comput Stat Data Anal 61:7–21
Bolin D, Wallin J (2016) Spatially adaptive covariance tapering. Spat Stat 18–Part A:163–178
Chan G, Wood ATA (1999) Simulation of stationary Gaussian vector fields. Stat Comput 9(4):265–268
Chen Y, Davis TA, Hager WW, Rajamanickam S (2008) Algorithm 887: CHOLMOD, supernodal sparse Cholesky factorization and update/downdate. ACM Trans Math Softw 35(3):1–14
Chilès J, Delfiner P (2012) Geostatistics: modeling spatial uncertainty, 2nd edn. Wiley, London
Cressie N, Johannesson G (2008) Fixed rank kriging for very large spatial data sets. J R Stat Soc Ser B 70:209–226
Davis TA (2006) Direct methods for sparse linear systems. SIAM Series Fundamentals of Algorithms, Philadelphia
Deltheil R (1926) Probabilités géométriques (Tome II, Fascicule II of : E. Borel, Traité du calcul des probabilités et de ses applications). Gauthier-Villars, Paris, 1–123
Dijkstra E (1959) A note on two problems in connexion with graphs. Numer Math 1:269–271
Emery X (2004) Testing the correctness of the sequential algorithm for simulating Gaussian random fields. Stoch Env Res Risk Assess 18:401–413
Emery X (2008) Statistical tests for validating geostatistical simulation algorithms. Comput Geosci 34(11):1610–1620
Emery X, Lantuéjoul C (2006) TBSIM: a computer program for conditional simulation of three-dimensional Gaussian random fields via the turning bands method. Comput Geosci 32(10):1615–1628
Emery X, Arroyo D, Porcu E (2015) An improved spectral turning-bands algorithm for simulating stationary vector Gaussian random fields. Stoch Env Res Risk Assess 30(7):1863–1873. doi:10.1007/s00477-015-1151-0
Furrer R, Genton MG, Nychka D (2006) Covariance tapering for interpolation of large spatial datasets. J Comput Graph Stat 15(2):502–523
Gneiting T (2002) Compactly supported correlation functions. J Multivar Anal 83(2):493–508
Gneuss P, Schmid W, Schwarze R (2013) Efficient approximation of the spatial covariance function for large datasets—analysis of atmospheric $\text{CO}_2$ concentrations. In: Discussion paper series recap15
Hovadik J, Larue D (2007) Static characterizations of reservoirs: refining the concepts of connectivity and continuity. Pet Geosci 13:195–211
Lantuéjoul C (2002) Geostatistical simulation. Springer, Berlin
Lim T, Teo P (2008) Gaussian fields and Gaussian sheets with generalized Cauchy covariance structure. arXiv:0807.0022v1
Lindgren F, Rue H, Lindstrom J (2011) An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. J R Stat Soc B 73:423–498
Lu TT, Shiou SH (2002) Inverses of $2 \times 2$ block matrices. Comput Math Appl 43:119–129
Marcotte D (2015) TASC3D: a program to test the admissibility in 3D of non-linear models of coregionalization. Comput Geosci 83:168–175
Marcotte D (2016) Spatial turning bands simulation of anisotropic non linear models of coregionalization with symmetric cross-covariances. Comput Geosci 89:232–238
Matheron G (1965) Les variables régionalisées et leur estimation. Ph.D. thesis, Faculté des Sciences, Université de Paris, Masson
Matheron G (1971) The theory of regionalized variables and its applications. École nationale supérieure des mines 5:1–211
Paravarzar S, Emery X, Madani N (2015) Comparing sequential Gaussian and turning bands algorithms for cosimulating grades in multi-element deposits. C R Geosci 347:84–93
Philip J (1991) The probability distribution of the distance between two random points in a box. Department of Mathematics, Royal Institute of Technology, pp 1–13. https://people.kth.se/~johanph/habc.pdf
Porcu E, Daley DJ, Buhmann M, Bevilacqua M (2013) Radial basis functions for multivariate geostatistics. Stoch Env Res Risk Assess 27(4):909–922
Renard P, Allard D (2013) Connectivity metrics for subsurface flow and transport. Adv Water Resour 51:168–196
Safikhani M, Asghari O, Emery X (2016) Assessing the accuracy of sequential Gaussian simulation through statistical testing. Stoch Environ Res Risk Assess. doi:10.1007/s00477-016-1255-1
Sang H, Huang J (2012) A full scale approximation of covariance functions for large spatial data sets. J R Stat Soc B 74:111–132
Shinozuka M, Jan CM (1972) Digital simulation of random processes and its applications. J Sound Vib 25:111–128
Sneddon I (1951) Fourier transforms. McGraw-Hill, New York
Stein M (1993) A simple condition for asymptotic optimality of linear predictions of random fields. Stat Probab Lett 17:399–404
Stein M (1999) Interpolation of spatial data: some theory for kriging. Springer, Berlin
Stein M (2013) Statistical properties of covariance tapers. J Comput Graph Stat 22:866–885
Wackernagel H (2003) Multivariate geostatistics: an introduction with applications, 3rd edn. Springer, Berlin
Wendland H (1995) Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree. Adv Comput Math 4(1):389–396
Wu Z (1995) Compactly supported positive definite radial functions. Adv Comput Math 4(1):283–292
Zhang H, Du J (2008) Covariance tapering in spatial statistics. In: Mateu J, Porcu E (eds) Positive definite functions: from Schoenberg to space-time challenges. Universidad Jaume I., Castellon (Spain)

Acknowledgements

We are indebted to one anonymous reviewer for his attentive and detailed review and for his numerous constructive comments. We thank Pr. Emilio Porcu from Universidad Técnica Federico Santa María in Valparaiso (Chile) for fruitful discussions and for providing us with working material on generalized Wendland covariance functions and their use under fixed domain asymptotics. This research was financed in part by the National Science Research Council of Canada (Grant RGPIN105603-05).

Author information

Authors and Affiliations

Département des génies civil, géologique et des mines, Polytechnique Montréal, C.P. 6079 Succ. Centre-ville, Montréal, QC, H3C 3A7, Canada
D. Marcotte
Biostatistique et Processus Spatiaux (BioSP), INRA, 84914, Avignon, France
D. Allard PhD

Corresponding author

Correspondence to D. Marcotte.

Appendix: Proof of Proposition 2

We first establish two lemmas:

Lemma 1

(Adapted from Lu and Shiou 2002) Consider the symmetric block matrix

$$\left( \begin{array}{ll} {\mathbf{K}}_0^{-1} & \mathbf{D}_T \\ \mathbf{D}_T & {\mathbf{K}}_1\end{array}\right),$$

where $\mathbf{D}_T$ is diagonal and ${\mathbf{K}}_0$ and ${\mathbf{K}}_1$ are symmetric non singular matrices such that ${\mathbf{K}}_1 - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T$ and ${\mathbf{K}}_0^{-1} - \mathbf{D}_T {\mathbf{K}}_1^{-1} \mathbf{D}_T$ are non singular. Then,

$$({\mathbf{K}}_0^{-1} - \mathbf{D}_T {\mathbf{K}}_1^{-1} \mathbf{D}_T)^{-1} = {\mathbf{K}}_0 + {\mathbf{K}}_0 \mathbf{D}_T ({\mathbf{K}}_1 - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T)^{-1} \mathbf{D}_T {\mathbf{K}}_0.$$

Proof

This lemma is a direct consequence of Theorem 2 in Lu and Shiou (2002). $\square$

Lemma 2

Let ${\mathbf{x}}_1,\dots , {\mathbf{x}}_n$ be $n$ sample points in a finite domain $D$ and let $C({\mathbf{h}})$ be a covariance function on $D$ with $C(\mathbf{0})=1$. Let $Z$ be a zero mean Gaussian random field with covariance function $C({\mathbf{h}})$ on $D$. Let further ${\mathbf{K}}$ be the $n \times n$ matrix with elements $[{\mathbf{K}}]_{ij} = C({\mathbf{x}}_i-{\mathbf{x}}_j)$, for $1 \le i,j \le n$, and ${\mathbf{k}}_{\mathbf{x}}$ be the $n$-vector with elements $[{\mathbf{k}}_{\mathbf{x}}]_i = C({\mathbf{x}}_i-{\mathbf{x}})$, for ${\mathbf{x}}\in D$ and $1 \le i \le n$. Then, the matrix

$${\mathbf{K}}- {\mathbf{k}}_{\mathbf{x}}{\mathbf{k}}^{\prime }_{\mathbf{x}}$$

is positive semi-definite for any ${\mathbf{x}}\in D$.

Proof

In order to show this, we need to show that for any vector $\pmb {\lambda }= (\lambda _1,\dots ,\lambda _n)^{\prime } \in {\mathfrak{R}}^n$, it holds that

$$Q = \sum _{i=1}^n \sum _{j=1}^n \lambda _i \lambda _j \left( [{\mathbf{K}}]_{ij} - [{\mathbf{k}}_{\mathbf{x}}]_{i} [{\mathbf{k}}_{\mathbf{x}}]_{j} \right) \ge 0.$$

(21)

Let us denote $S = \sum _{i=1}^n \lambda _i Z({\mathbf{x}}_i)$. Then, since $\hbox {Var}\{S\} = \sum _{i=1}^n \sum _{j=1}^n \lambda _i \lambda _j [{\mathbf{K}}]_{ij}$ and $\hbox {Cov}\{S,Z({\mathbf{x}})\} = \sum _{i=1}^n \lambda _i [{\mathbf{k}}_{\mathbf{x}}]_i$, Eq. (21) is equivalent to

$$Q = \hbox{Var}\{S\} -\hbox {Cov}\{S,Z({\mathbf{x}})\}^2.$$

Using the Cauchy-Schwarz inequality $\hbox {Cov}\{S,Z({\mathbf{x}})\}^2 \le \hbox {Var}\{S\} \hbox {Var}\{Z({\mathbf{x}})\}$ and $\hbox {Var}\{Z({\mathbf{x}})\} = 1$, we obtain

$$Q \ge \hbox {Var}\{S\} -\hbox {Var}\{S\} \hbox {Var}\{Z({\mathbf{x}})\} = 0,$$

which finishes the proof. $\square$
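Lemma 2 can be checked numerically on a small example. The sketch below is only an illustration under assumed settings that do not come from the paper (an exponential covariance with range parameter `a`, 30 random points on $[0,1]$, a hypothetical helper named `lemma2_min_eigenvalue`). It relies on the fact that ${\mathbf{K}}- {\mathbf{k}}_{\mathbf{x}}{\mathbf{k}}^{\prime }_{\mathbf{x}}$ is the covariance matrix of the residuals $Z({\mathbf{x}}_i) - [{\mathbf{k}}_{\mathbf{x}}]_i Z({\mathbf{x}})$, so its smallest eigenvalue should be nonnegative up to rounding error.

```python
import numpy as np


def exp_cov(h, a=0.3):
    """Exponential covariance with C(0) = 1 (illustrative choice)."""
    return np.exp(-np.abs(h) / a)


def lemma2_min_eigenvalue(x_pts, x0, cov):
    """Smallest eigenvalue of K - k_x k_x' for sample points x_pts and target x0."""
    K = cov(x_pts[:, None] - x_pts[None, :])
    k = cov(x_pts - x0)
    return np.linalg.eigvalsh(K - np.outer(k, k)).min()


rng = np.random.default_rng(1)
x_pts = rng.uniform(0.0, 1.0, 30)
targets = np.linspace(0.0, 1.0, 11)
min_eigs = np.array([lemma2_min_eigenvalue(x_pts, x0, exp_cov) for x0 in targets])
# Each entry should be >= 0 up to floating-point rounding.
```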

We are now ready to provide the proof of Proposition 2. We must show that $\sigma ^2_{k,C_1}({\mathbf{x}}) \ge \sigma ^2_{k,C_0}({\mathbf{x}})$ for all ${\mathbf{x}}\in D$. As usual, we drop the dependency on ${\mathbf{x}}$ for the sake of conciseness. Since $\sigma ^2_{k,C_1} = \sigma ^2_0 - {\mathbf{k}}^{\prime }_1 {\mathbf{K}}_1^{-1} {\mathbf{k}}_1$ and $\sigma ^2_{k,C_0} = \sigma ^2_0 - {\mathbf{k}}^{\prime }_0 {\mathbf{K}}_0^{-1} {\mathbf{k}}_0$, we need to prove that:

$${\mathbf{k}}^{\prime }_0 {\mathbf{K}}_0^{-1} {\mathbf{k}}_0 - {\mathbf{k}}^{\prime }_1 {\mathbf{K}}_1^{-1} {\mathbf{k}}_1 \ge 0.$$

(22)

Since ${\mathbf{k}}_1 = {\mathbf{k}}_0 \odot {\mathbf{k}}_T$ and ${\mathbf{K}}_1 = {\mathbf{K}}_0 \odot {\mathbf{K}}_T$, where $\odot$ denotes the element-wise (Hadamard) product, Eq. (22) is equivalent to

$$\sum _{i=1}^n \sum _{j=1}^n [{\mathbf{k}}_0]_i \left( [{\mathbf{K}}_0^{-1}]_{ij} - [{\mathbf{k}}_T]_i \, [\{{\mathbf{K}}_0 \odot {\mathbf{K}}_T\}^{-1}]_{ij} \, [{\mathbf{k}}_T]_j\right) [{\mathbf{k}}_0]_j \ge 0.$$

(23)

To show that this expression is always nonnegative, we will show that the matrix ${\mathbf{M}}$ with elements $[{\mathbf{M}}]_{ij} = [{\mathbf{K}}_0^{-1}]_{ij} - [{\mathbf{k}}_T]_i \, [\{{\mathbf{K}}_0 \odot {\mathbf{K}}_T\}^{-1}]_{ij} \, [{\mathbf{k}}_T]_j$, for $1 \le i,j \le n$, is positive definite (p.d.) except for the trivial case ${\mathbf{K}}_0={\mathbf{K}}_1, {\mathbf{k}}_0={\mathbf{k}}_1$, corresponding to a taper with infinite range, where ${\mathbf{M}}={\mathbf{0}}$ and the left-hand side of Eq. (22) equals zero. Introducing the diagonal matrix $\mathbf{D}_T = \hbox {diag}({\mathbf{k}}_T)$, this matrix can also be written

$${\mathbf{M}}={\mathbf{K}}_0^{-1} - \mathbf{D}_T\, \{{\mathbf{K}}_0 \odot {\mathbf{K}}_T\}^{-1} \mathbf{D}_T.$$

(24)

Since ${\mathbf{M}}$ is invertible, it is p.d. if and only if ${\mathbf{M}}^{-1}$ is p.d. Using Lemma 1, its inverse is

$${\mathbf{M}}^{-1} = {\mathbf{K}}_0 + {\mathbf{K}}_0 \mathbf{D}_T\, \{{\mathbf{K}}_0 \odot {\mathbf{K}}_T - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T\}^{-1} \mathbf{D}_T {\mathbf{K}}_0.$$

(25)
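As a sanity check, Eqs. (24) and (25) can be compared numerically: the product of the two right-hand sides should be the identity matrix, and ${\mathbf{M}}$ should have positive eigenvalues. The sketch below does this for one small 1D configuration. The covariance model, the taper, their ranges, the sample points and the function name `m_and_inverse` are all illustrative assumptions, not settings taken from the paper.

```python
import numpy as np


def exp_cov(h, a=0.3):
    """Exponential covariance C0 with unit sill (illustrative choice)."""
    return np.exp(-np.abs(h) / a)


def spherical_taper(h, r=0.5):
    """Spherical taper CT, zero for |h| >= r (illustrative choice)."""
    u = np.minimum(np.abs(h) / r, 1.0)
    return 1.0 - 1.5 * u + 0.5 * u**3


def m_and_inverse(x_pts, x0):
    """Return M from Eq. (24) and its claimed inverse from Eq. (25)."""
    H = x_pts[:, None] - x_pts[None, :]
    K0 = exp_cov(H)
    K1 = K0 * spherical_taper(H)              # K0 (Hadamard) KT
    D = np.diag(spherical_taper(x_pts - x0))  # D_T = diag(k_T)
    M = np.linalg.inv(K0) - D @ np.linalg.inv(K1) @ D
    M_inv = K0 + K0 @ D @ np.linalg.inv(K1 - D @ K0 @ D) @ D @ K0
    return M, M_inv


x_pts = np.linspace(0.0, 1.0, 8)
M, M_inv = m_and_inverse(x_pts, x0=0.37)
identity_error = np.abs(M @ M_inv - np.eye(x_pts.size)).max()
min_eig_M = np.linalg.eigvalsh(0.5 * (M + M.T)).min()
```

Here `identity_error` is expected to be at rounding level and `min_eig_M` to be strictly positive. The target `x0` is deliberately kept away from the sample points, since ${\mathbf{K}}_0 \odot {\mathbf{K}}_T - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T$ becomes singular when ${\mathbf{x}}$ coincides with one of them.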

By Lemma 2, ${\mathbf{K}}_T - {\mathbf{k}}_T {\mathbf{k}}_T^{\prime }$ is positive semi-definite. Hence, by Schur’s product theorem, ${\mathbf{K}}_0 \odot ({\mathbf{K}}_T - {\mathbf{k}}_T {\mathbf{k}}_T^{\prime }) = {\mathbf{K}}_0 \odot {\mathbf{K}}_T - {\mathbf{K}}_0 \odot {\mathbf{k}}_T {\mathbf{k}}_T^{\prime }= {\mathbf{K}}_0 \odot {\mathbf{K}}_T - \mathbf{D}_T {\mathbf{K}}_0 \mathbf{D}_T$ is positive semi-definite. Since it is also non singular, as required by Lemma 1, it is p.d., and so is its inverse. The second term on the right-hand side of Eq. (25) is a congruence transformation of this inverse by $\mathbf{D}_T {\mathbf{K}}_0$ and is therefore positive semi-definite. As the sum of a p.d. matrix and a positive semi-definite matrix is p.d., we conclude that ${\mathbf{M}}^{-1}$ in Eq. (25) is p.d., which completes the proof. $\square$
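The inequality established above, $\sigma ^2_{k,C_1}({\mathbf{x}}) \ge \sigma ^2_{k,C_0}({\mathbf{x}})$, can also be observed numerically. The following sketch computes the simple kriging variance with an untapered and a tapered covariance at a set of target points. The exponential model, the spherical taper, their ranges, the sample configuration and the helper name `sk_variance` are assumptions chosen for illustration only.

```python
import numpy as np


def exp_cov(h, a=0.3):
    """Exponential covariance C0 with unit sill (illustrative choice)."""
    return np.exp(-np.abs(h) / a)


def spherical_taper(h, r=0.5):
    """Spherical taper CT, zero for |h| >= r (illustrative choice)."""
    u = np.minimum(np.abs(h) / r, 1.0)
    return 1.0 - 1.5 * u + 0.5 * u**3


def tapered_cov(h):
    """Tapered covariance C1 = C0 * CT."""
    return exp_cov(h) * spherical_taper(h)


def sk_variance(x_pts, x0, cov):
    """Simple kriging variance sigma_0^2 - k' K^{-1} k at target x0."""
    K = cov(x_pts[:, None] - x_pts[None, :])
    k = cov(x_pts - x0)
    return cov(0.0) - k @ np.linalg.solve(K, k)


x_pts = np.linspace(0.0, 1.0, 8)
targets = np.linspace(0.0, 1.0, 25)
var_full = np.array([sk_variance(x_pts, x0, exp_cov) for x0 in targets])
var_tapered = np.array([sk_variance(x_pts, x0, tapered_cov) for x0 in targets])
# Proposition 2 predicts var_tapered >= var_full at every target point.
```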

About this article

Cite this article

Marcotte, D., Allard, D. Half-tapering strategy for conditional simulation with large datasets. Stoch Environ Res Risk Assess 32, 279–294 (2018). https://doi.org/10.1007/s00477-017-1386-z

Published: 06 February 2017
Issue Date: January 2018
DOI: https://doi.org/10.1007/s00477-017-1386-z
