Abstract
Indicator kriging is widely used for mapping spatial binary variables and for estimating the global and local spatial distributions of variables in geosciences. For continuous random variables, indicator kriging gives an estimate of the cumulative distribution function at a given threshold, which is an estimate of a probability. Like any other kriging procedure, indicator kriging provides an estimation variance that, although seldom used in applications, should be taken into account because it assesses the uncertainty of the estimate. This paper proposes an alternative approach to indicator estimation in which the complete probability density function of the indicator estimate is evaluated. The procedure is described in a Bayesian framework: a multivariate Gaussian likelihood and a prior distribution are combined according to Bayes' theorem to obtain a posterior distribution for the indicator estimate. From this posterior distribution, point estimates, interval estimates and uncertainty measures can be obtained. Among the point estimates, the median of the posterior distribution is the maximum entropy estimate because there is a fifty-fifty chance of the unknown value being larger or smaller than the median; that is, there is maximum uncertainty in the choice between two alternatives. Thus, in some sense, the median is an indicator estimator, alternative to the kriging estimator, that includes its own uncertainty. On the other hand, the mode of the posterior distribution, assuming a uniform prior, coincides with the simple kriging estimator. Additionally, because the indicator estimate can be considered a two-part composition whose domain of definition is the simplex, the method is extended to compositional Bayesian indicator estimation.
Bayesian indicator estimation and compositional Bayesian indicator estimation are illustrated with an environmental case study in which the quantity of interest is the probability that the content of a geochemical element in soil exceeds a particular threshold. The computer codes and their user guides are in the public domain and freely available.
References
Aitchison J (1982) The statistical analysis of compositional data (with discussion). J R Stat Soc Ser B 44(2):139–177
Barancourt C, Creutin JD, Rivoirard J (1992) A method for delineating and estimating rainfall fields. Water Resour Res 28(4):1133–1144
Brus DJ, Gruijter JJ, Walvoort DJJ, de Vries F, Bronswijk JJB, Römkens PFAM, de Vries W (2002) Mapping the probability of exceeding critical thresholds for cadmium concentrations in soils in the Netherlands. J Environ Qual 31:1875–1884
D’Or D, Demouget-Renard H, Garcia M (2008) Geostatistics for contaminated sites and soils: some pending questions. geoENV VI—geostatistics for environmental applications. Springer, New York, pp 409–420
De Oliveira V (2004) A simple model for spatial rainfall fields. Stoch Environ Res Risk A 18:131–140
Deutsch CV, Journel AG (1998) GSLIB: geostatistical software library and user’s guide, 2nd edn. Oxford University Press, New York
Evans M, Hastings N, Peacock B (1993) Statistical distributions. Wiley, Hoboken, 170 pp
Fabbri P (2001) Probabilistic assessment of temperature in the Euganean geothermal area (Veneto Region, NE Italy). Math Geol 33(6):745–760
Goovaerts P (1997) Geostatistics for natural resources evaluation. Oxford University Press, New York
Goovaerts P (2001) Geostatistical approaches for incorporating elevation into the spatial interpolation of rainfall. J Hydrol 228:113–129
Goovaerts P, AvRuskin G, Meliker J, Slotnick M, Jacquez G, Nriagu J (2005) Geostatistical modeling of the spatial variability of arsenic in groundwater of southeast Michigan. Water Resour Res 41(7):W07013
Graybill FA (1976) Theory and application of the linear model. Duxbury Press, Boston
Hoeksema RJ, Kitanidis PK (1985) Analysis of the spatial structure of properties of selected aquifers. Water Resour Res 21(4):563–572
Jones RM, Miller KS (1966) On the multivariate lognormal distribution. J Ind Math Soc 16(2):63–76
Journel A (1983) Nonparametric estimation of spatial distributions. Math Geol 15(3):445–468
Journel A, Isaaks EH (1984) Conditional indicator simulation: application to a Saskatchewan uranium deposit. Math Geol 16(7):685–718
Juang KW, Chen YS, Lee DY (2004) Using sequential indicator simulation to assess the uncertainty of delineating heavy-metal contaminated soils. Environ Pollut 127:229–238
Kitanidis PK (1983) Statistical estimation of polynomial generalized covariance functions and hydrologic applications. Water Resour Res 19(2):909–921
Kitanidis PK (1987) Parametric estimation of covariances of regionalized variables. Water Resour Bull 23(4):671–680
Kitanidis PK, Lane RW (1985) Maximum likelihood parameter estimation of hydrologic spatial processes by the Gauss–Newton method. J Hydrol 79(1–2):53–71
Mardia KV, Marshall RJ (1984) Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika 71(1):135–146
Mardia KV, Watkins AJ (1989) On multimodality of the likelihood in the spatial linear model. Biometrika 76(2):289–295
Minamide N (1985) An extension of the matrix inversion lemma. SIAM J Algebraic Discret Methods 6(3):371–377
Pannatier Y (1996) VARIOWIN: software for spatial data analysis in 2D. Springer, New York
Pardo-Igúzquiza E (1998) Inference of spatial indicator covariance parameters by maximum likelihood using MLREML. Comput Geosci 24(5):453–464
Pawlowsky-Glahn V, Egozcue JJ (2001) Geometric approach to statistical analysis on the simplex. Stoch Environ Res Risk A 15(5):384–398
Pawlowsky-Glahn V, Egozcue JJ (2002) BLU estimators and compositional data. Math Geol 34(3):259–274
Reis AP, da Silva EF, Sousa AJ, Patinha C, Fonseca EC (2007) Spatial patterns of dispersion and pollution sources for arsenic at Lousal mine, Portugal. Int J Environ Health Res 17(5):335–349
Solow AR (1986) Mapping by simple indicator kriging. Math Geol 18(3):335–352
Tolosana-Delgado R (2005) Geostatistics for constrained variables: positive data, proportions and probabilities. PhD Thesis, University of Girona, Spain, 198 p
Tolosana-Delgado R, Pawlowsky-Glahn V, Egozcue JJ (2008) Indicator kriging without order relation violations. Math Geosci 40:327–347
Yang Q, Jung HB, Culbertson CW, Marvinney RG, Loiselle MC, Locke DB, Cheek H, Thibodeau H, Zheng Y (2009) Spatial pattern of groundwater arsenic occurrence and association with bedrock geology in Greater Augusta, Maine. Environ Sci Technol 43(8):2714–2719
Acknowledgments
We are grateful to Inés Iribarren, from the Spanish Geological Survey, for discussions and for providing data. Sample data were collected under contracts between the Spanish Geological Survey and the Local Government of Navarra, Spain (Comunidad Foral de Navarra) to determine the background levels of heavy metals in soils and to perform the geochemical cartography of soils and sediments. The content of this paper does not necessarily represent the official views of these agencies. We would like to thank the reviewers for their constructive criticism.
Appendices
Appendix A. Equivalence of the maximum likelihood indicator estimator and simple indicator kriging estimator
The starting point is the likelihood given by Eq. (5), repeated here:
The maximum likelihood estimator is obtained by maximizing Eq. 17 or, equivalently and more conveniently, by minimizing the negative log-likelihood:
Thus the maximum likelihood estimator is given by the solution to the equation:
Thus:
Using the previous notation:
with
\( \Upsigma_{0I} = \Upsigma_{I0}^{T} \): a \( (1 \times n) \) matrix, i.e. a row vector of \( n \) elements.
\( ({\mathbf{I}} - {\varvec{\mu}}_{I})^{T} \): a \( (1 \times n) \) matrix, i.e. a row vector of \( n \) elements.
\( \Upsigma_{00} \): a \( (1 \times 1) \) matrix, i.e. a scalar.
\( {\varvec{\Upsigma}}_{I} \): an \( (n \times n) \) matrix.
\( \widetilde{{\varvec{\Upsigma}}}_{I} = \left[ \begin{array}{cc} \Upsigma_{00} & \Upsigma_{0I} \\ \Upsigma_{I0} & {\varvec{\Upsigma}}_{I} \end{array} \right] \): the \( (n+1) \times (n+1) \) partitioned covariance matrix.
A well-known relation that is needed is the inverse of a partitioned matrix (Graybill 1976):
Then
can be expanded as:
With \( A = I({\mathbf{u}}_{0} ) - m_{I} \quad ( 1\times 1)\quad {\text{matrix}},\,{\text{i}}.{\text{e}}.\,{\text{an}}\,{\text{scalar}} \) \( B = ({\mathbf{I}} - {\varvec{\mu}}_{I} )\quad (n \times 1)\,{\text{matrix}},\,{\text{i}}.{\text{e}}.\,{\text{a}}\,{\text{vector}}\,{\text{of}}\,n\,{\text{elements}} \) \( C = \left( {\Upsigma_{00} - \Upsigma_{0I} {\varvec{\Upsigma}}_{I}^{ - 1} \Upsigma_{0I}^{T} } \right)^{ - 1} \quad ( 1\times 1)\,{\text{matrix}},\,{\text{i}}.{\text{e}}.\,{\text{an}}\,{\text{scalar}} \) \( D = - \left( {\Upsigma_{00} - \Upsigma_{0I} {\varvec{\Upsigma}}_{I}^{ - 1} \Upsigma_{0I}^{T} } \right)^{ - 1} \Upsigma_{0I} {\varvec{\Upsigma}}_{I}^{ - 1} \quad ( 1\times n){\text{ matrix}},\,{\text{i}}.{\text{e}}.\,{\text{a}}\,{\text{vector}}\,{\text{of}}\,n\,{\text{elements}} \) \( E = - \left( {\Upsigma_{I} - \Upsigma_{0I}^{T} {\varvec{\Upsigma}}_{00}^{ - 1} \Upsigma_{0I} } \right)^{ - 1} \Upsigma_{0I}^{T} {\varvec{\Upsigma}}_{00}^{ - 1} \quad (n \times 1)\,{\text{matrix}},\,{\text{i}}.{\text{e}}.\,{\text{a}}\,{\text{vector}}\,{\text{of}}\,n\,{\text{elements}} \) \( F = \left( {\Upsigma_{I} - \Upsigma_{0I}^{T} {\varvec{\Upsigma}}_{00}^{ - 1} \Upsigma_{0I} } \right)^{ - 1} \quad (n \times n){\text{ matrix}} \)
Thus
Pre-multiplying Eq. 29 by \( \left( \Upsigma_{00} - \Upsigma_{0I} {\varvec{\Upsigma}}_{I}^{-1} \Upsigma_{0I}^{T} \right) \), a \( (1 \times 1) \) matrix, i.e. a scalar,
one reaches the expression:
A needed result is the matrix inversion lemma (Minamide 1985):
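In the notation of this appendix, the lemma is presumably applied to the block \( F \), in the form \( \left( {\varvec{\Upsigma}}_{I} - \Upsigma_{0I}^{T} \Upsigma_{00}^{-1} \Upsigma_{0I} \right)^{-1} = {\varvec{\Upsigma}}_{I}^{-1} + {\varvec{\Upsigma}}_{I}^{-1} \Upsigma_{0I}^{T} \left( \Upsigma_{00} - \Upsigma_{0I} {\varvec{\Upsigma}}_{I}^{-1} \Upsigma_{0I}^{T} \right)^{-1} \Upsigma_{0I} {\varvec{\Upsigma}}_{I}^{-1} \). This identity can be checked numerically with the short sketch below; the synthetic matrix and the variable names are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5  # number of data locations (illustrative)

# Synthetic positive-definite partitioned covariance (not from the paper).
G = rng.normal(size=(n + 1, n + 1))
S_aug = G @ G.T + (n + 1) * np.eye(n + 1)
S00, S0I, SI = S_aug[:1, :1], S_aug[:1, 1:], S_aug[1:, 1:]
SI_inv = np.linalg.inv(SI)

# Left-hand side: direct inverse of the (n x n) rank-one-corrected matrix.
lhs = np.linalg.inv(SI - S0I.T @ np.linalg.inv(S00) @ S0I)

# Right-hand side: only SI^{-1} and a (1 x 1) inverse are required.
schur = S00 - S0I @ SI_inv @ S0I.T
rhs = SI_inv + SI_inv @ S0I.T @ np.linalg.inv(schur) @ S0I @ SI_inv

lemma_gap = np.max(np.abs(lhs - rhs))
print("largest discrepancy between both sides:", lemma_gap)
```

The practical value of the lemma here is that every term is written with \( {\varvec{\Upsigma}}_{I}^{-1} \), the same inverse that appears in the simple kriging system.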
Inserting (32) into (31) one obtains:
Next, expanding the second term of the left-hand side of Eq. 33:
After expanding the third term of the left-hand side of Eq. 34 and simplifying:
Because \( \frac{1}{2}({\mathbf{I}} - {\varvec{\mu}}_{I} )^{T} {\varvec{\Upsigma}}_{I}^{-1} \Upsigma_{0I}^{T} = \frac{1}{2}\Upsigma_{0I} {\varvec{\Upsigma}}_{I}^{-1} ({\mathbf{I}} - {\varvec{\mu}}_{I} ) \) (a scalar equals its transpose and \( {\varvec{\Upsigma}}_{I}^{-1} \) is symmetric), (35) can be expressed as:
Expanding the second term of (36), one reaches the expression:
After simplification, and because \( \Upsigma_{0I} {\varvec{\Upsigma}}_{I}^{-1} \Upsigma_{0I}^{T} \) is a \( (1 \times 1) \) matrix, i.e. a scalar that can change places within a product, one has:
The third and fourth terms of the left-hand side cancel, giving:
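and one finally has \( I({\mathbf{u}}_{0}) = m_{I} + \Upsigma_{0I} {\varvec{\Upsigma}}_{I}^{-1} ({\mathbf{I}} - {\varvec{\mu}}_{I}) \).
It may be seen elsewhere (e.g. Goovaerts 1997) that this final expression is exactly the simple indicator kriging estimator,
where the set of weights is given, in matrix form, by \( {\varvec{\lambda}} = {\varvec{\Upsigma}}_{I}^{-1} \Upsigma_{0I}^{T} \).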
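The equivalence established in this appendix can also be illustrated numerically. The sketch below assumes a hypothetical one-dimensional configuration with four indicator data, a known indicator mean of 0.5 and an exponential covariance model; none of these choices comes from the case study. It computes the simple kriging estimate \( m_{I} + \Upsigma_{0I} {\varvec{\Upsigma}}_{I}^{-1} ({\mathbf{I}} - {\varvec{\mu}}_{I}) \) and compares it with the value of \( I({\mathbf{u}}_{0}) \) that minimizes the Gaussian negative log-likelihood on a fine grid.

```python
import numpy as np

def exp_cov(h, sill, corr_range):
    """Exponential covariance model (an illustrative choice, not the paper's)."""
    return sill * np.exp(-3.0 * h / corr_range)

# Hypothetical 1-D configuration: data at x, target location at x0.
x = np.array([0.0, 1.0, 2.5, 4.0])
ind = np.array([1.0, 1.0, 0.0, 0.0])  # indicator data I(u_i)
x0 = 1.8
m = 0.5              # assumed known indicator mean m_I
sill = m * (1 - m)   # variance of a binary variable with mean m
corr_range = 3.0     # assumed practical range

# Partitioned covariance with the target location first, then the data.
pts = np.concatenate(([x0], x))
S_aug = exp_cov(np.abs(pts[:, None] - pts[None, :]), sill, corr_range)
S0I, SI = S_aug[:1, 1:], S_aug[1:, 1:]

# Simple kriging: weights solve SI @ weights = S0I^T.
weights = np.linalg.solve(SI, S0I.ravel())
sk_estimate = m + weights @ (ind - m)

# Negative log-likelihood as a function of the unknown I(u0),
# dropping terms that do not depend on it.
S_aug_inv = np.linalg.inv(S_aug)

def neg_loglik(i0):
    r = np.concatenate(([i0], ind)) - m
    return 0.5 * r @ S_aug_inv @ r

grid = np.linspace(0.0, 1.0, 10001)
nll = np.array([neg_loglik(v) for v in grid])
ml_estimate = grid[np.argmin(nll)]

print("simple kriging estimate:", sk_estimate)
print("grid maximum likelihood estimate:", ml_estimate)
```

The two values agree to within the grid spacing. The grid search is used only to keep the comparison independent of the closed-form result; in practice the closed form would be used directly.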
Appendix B. Trapezoidal prior distribution
The trapezoidal pdf provides a flexible prior distribution for coding, at each unsampled location, prior information derived from secondary information. The trapezoidal prior has been implemented in the provided software and is defined by four parameters, shown in Fig. 11. These four parameters (a, b, c, d) are indicator values and thus must lie in the interval [0,1] in non-decreasing order, \( 0 \le a \le b \le c \le d \le 1 \). With these four parameters, the trapezoidal pdf (Fig. 11a) can be reduced to a uniform pdf on a given interval (Fig. 11e), triangular (Fig. 11b), left right-angled triangular (Fig. 11c), etc. This is an efficient way of coding fuzzy prior information into a prior distribution. Additionally, it does not depend on the parameters (mean, standard deviation, shape parameter, …) of a parametric distribution. For example, the trapezoidal pdf of Fig. 11a states that the prior value of the indicator is most likely in the interval [b,c], less likely in the intervals [a,b] and [c,d], and has zero probability of lying in the intervals [0,a] or [d,1]. The uniform prior is specified with \( a = b = 0 \) and \( c = d = 1 \) (Fig. 11e).
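A minimal sketch of such a prior is given below. It is not the provided software: the function name `trapezoidal_pdf`, the evaluation grid, the parameter values (0.2, 0.4, 0.6, 0.9) and the Gaussian likelihood with mean 0.47 and standard deviation 0.20 are all hypothetical choices made for illustration. The plateau height is set to \( 2/[(d - a) + (c - b)] \) so that the density integrates to one, and the prior is then combined with the likelihood on a grid to obtain a posterior mode and median.

```python
import numpy as np

def trapezoidal_pdf(x, a, b, c, d):
    """Trapezoidal density on [a, d] with a plateau on [b, c]."""
    if not (0.0 <= a <= b <= c <= d <= 1.0):
        raise ValueError("parameters must satisfy 0 <= a <= b <= c <= d <= 1")
    if d == a:
        raise ValueError("a == d is a point mass, not a density")
    x = np.atleast_1d(np.asarray(x, dtype=float))
    height = 2.0 / ((d - a) + (c - b))  # plateau height giving unit area
    pdf = np.zeros_like(x)
    rising = (x >= a) & (x < b)
    flat = (x >= b) & (x <= c)
    falling = (x > c) & (x <= d)
    if b > a:
        pdf[rising] = height * (x[rising] - a) / (b - a)
    pdf[flat] = height
    if d > c:
        pdf[falling] = height * (d - x[falling]) / (d - c)
    return pdf

grid = np.linspace(0.0, 1.0, 2001)
dx = grid[1] - grid[0]

def area_under(density):
    """Trapezoidal-rule integral of a density sampled on the grid."""
    return np.sum(0.5 * (density[:-1] + density[1:])) * dx

# Uniform prior (a = b = 0, c = d = 1) and a hypothetical informative prior.
uniform_prior = trapezoidal_pdf(grid, 0.0, 0.0, 1.0, 1.0)
informative_prior = trapezoidal_pdf(grid, 0.2, 0.4, 0.6, 0.9)
prior_area = area_under(informative_prior)

# Hypothetical Gaussian likelihood for the indicator at the target location.
likelihood = np.exp(-0.5 * ((grid - 0.47) / 0.20) ** 2)

post_uniform = uniform_prior * likelihood
post_uniform /= area_under(post_uniform)
post_informative = informative_prior * likelihood
post_informative /= area_under(post_informative)

# With a uniform prior the posterior mode is the likelihood maximum.
mode_uniform = grid[np.argmax(post_uniform)]

# Posterior median under the informative prior, from the cumulative sum.
cdf = np.cumsum(post_informative) * dx
median_informative = grid[np.searchsorted(cdf, 0.5)]

print("posterior mode, uniform prior:", mode_uniform)
print("posterior median, trapezoidal prior:", median_informative)
```

Evaluating everything on a grid over [0,1] keeps the sketch independent of any closed-form posterior and makes it straightforward to read off interval estimates from the same cumulative sum.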
Cite this article
Guardiola-Albert, C., Pardo-Igúzquiza, E. Compositional Bayesian indicator estimation. Stoch Environ Res Risk Assess 25, 835–849 (2011). https://doi.org/10.1007/s00477-011-0455-y