A Segmentation Approach for Stochastic Geological Modeling Using Hidden Markov Random Fields

Wang, Hui; Wellmann, J. Florian; Li, Zhao; Wang, Xiangrong; Liang, Robert Y.

doi:10.1007/s11004-016-9663-9

Published: 09 November 2016

Volume 49, pages 145–177, (2017)

Mathematical Geosciences

Hui Wang¹,
J. Florian Wellmann¹,
Zhao Li²,
Xiangrong Wang³ &
Robert Y. Liang²

Abstract

Stochastic modeling methods and uncertainty quantification are important tools for gaining insight into the geological variability of subsurface structures. Previous attempts at geologic inversion and interpretation can be broadly categorized into geostatistics and process-based modeling. The choice of a suitable modeling technique depends directly on the modeling application and the available input data. Modern geophysical techniques provide regional data sets in two- or three-dimensional space with high resolution, either directly from sensors or indirectly from geophysical inversion. Existing methods suffer from certain drawbacks in producing accurate and precise (with quantified uncertainty) geological models from these data sets. In this work, a stochastic modeling framework is proposed to extract the subsurface heterogeneity from multiple and complementary types of data. Subsurface heterogeneity is considered as the “hidden link” between multiple spatial data sets. Hidden Markov random field models are employed to perform three-dimensional segmentation, which is the representation of the “hidden link”. Finite Gaussian mixture models are adopted to characterize the statistical parameters of multiple data sets. The uncertainties are simulated via a Gibbs sampling process within a Bayesian inference framework. The proposed modeling method is validated and demonstrated using numerical examples. It is shown that the proposed stochastic modeling framework is a promising tool for three-dimensional segmentation in the field of geological modeling and geophysics.

References

Attias H (2000) A variational Bayesian framework for graphical models. Adv Neural Inf Process Syst 12:209–215
Auerbach S, Schaeben H (1990) Computer-aided geometric design of geologic surfaces and bodies. Math Geol 22:957–987
Babak O, Deutsch CV (2009) An intrinsic model of coregionalization that solves variance inflation in collocated cokriging. Comput Geosci UK 35:603–614
Besag J (1974) Spatial interaction and the statistical analysis of lattice systems. J R Stat Soc Ser B (Methodol) 36:192–236
Besag J (1986) On the statistical analysis of dirty pictures. J R Stat Soc 48:259–302
Biernacki C, Celeux G, Govaert G (2000) Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans Pattern Anal Mach Intell 22:719–725
Blanchin R, Chilès J-P (1993) The Channel Tunnel: Geostatistical prediction of the geological conditions and its validation by the reality. Math Geol 25:963–974
Caers J (2011) Modeling uncertainty in the earth sciences. Wiley, Chichester
Caers J, Zhang T (2004) Multiple-point geostatistics: a quantitative vehicle for integrating geologic analogs into multiple reservoir models. In: Grammer GM, Harris PM “Mitch”, Eberli GP (eds) Integration of outcrop and modern analogs in reservoir modeling. AAPG Memoir 80:383–394
Celeux G, Forbes F, Peyrard N (2003) EM procedures using mean field-like approximations for Markov model-based image segmentation. Pattern Recognit 36:131–144
Celeux G, Govaert G (1995) Gaussian parsimonious clustering models. Pattern Recognit 28:781–793
Chugunova TL, Hu LY (2008) Multiple-point simulations constrained by continuous auxiliary data. Math Geosci 40:133–146
Cline HE, Lorensen WE, Kikinis R, Jolesz F (1990) Three-dimensional segmentation of MR images of the head using probability and connectivity. J Comput Assist Tomogr 14:1037–1045
Cross GR, Jain AK (1983) Markov random field texture models. IEEE Trans Pattern Anal Mach Intell 5:25–39
Daly C (2005) Higher order models using entropy, Markov random fields and sequential simulation, Geostatistics Banff 2004. Springer, New York, pp 215–224
de Vries LM, Carrera J, Falivene O, Gratacós O, Slooten LJ (2009) Application of multiple point geostatistics to non-stationary images. Math Geosci 41:29–42
Elkateb T, Chalaturnyk R, Robertson PK (2003) An overview of soil heterogeneity: quantification and implications on geotechnical field problems. Can Geotech J 40:1–15
Figueiredo MA, Jain AK (2002) Unsupervised learning of finite mixture models. IEEE Trans Pattern Anal Mach Intell 24:381–396
Fjortoft R, Delignon Y, Pieczynski W, Sigelle M, Tupin F (2003) Unsupervised classification of radar images using hidden Markov chains and hidden Markov random fields. IEEE Trans Geosci Remote Sens 41:675–686
Forbes F, Peyrard N (2003) Hidden Markov random field model selection criteria based on mean field-like approximations. IEEE Trans Pattern Anal Mach Intell 25:1089–1101
Fraley C, Raftery AE (1998) How many clusters? Which clustering method? Answers via model-based cluster analysis. Comput J 41:578–588
Fraley C, Raftery AE (2002) Model-based clustering, discriminant analysis, and density estimation. J Am Stat Assoc 97:611–631
Gao D (2003) Volume texture extraction for 3D seismic visualization and interpretation. Geophysics 68:1294–1302
Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell 6:721–741
Gonzalez J, Low Y, Gretton A, Guestrin C (2011) Parallel Gibbs sampling: From colored fields to thin junction trees, International Conference on Artificial Intelligence and Statistics, pp 324–332
Ising E (1925) Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik A Hadrons Nuclei 31:253–258
Jessell MW, Valenta RK (1996) Structural geophysics: integrated structural and geophysical modelling. Comput Methods Geosci 15:303–324
Kindermann R, Snell JL (1980) Markov random fields and their applications. American Mathematical Society, Providence, RI
Koch J, He X, Jensen KH, Refsgaard JC (2014) Challenges in conditioning a stochastic geological model of a heterogeneous glacial aquifer to a comprehensive soft data set. Hydrol Earth Syst Sci 18:2907–2923
Koller D, Friedman N (2009) Probabilistic graphical models: principles and techniques. MIT Press, Cambridge
Koltermann CE, Gorelick SM (1996) Heterogeneity in sedimentary deposits: A review of structure-imitating, process-imitating, and descriptive approaches. Water Resour Res 32:2617–2658
Lajaunie C, Courrioux G, Manuel L (1997) Foliation fields and 3D cartography in geology: principles of a method based on potential interpolation. Math Geol 29:571–584
Li Z, Wang X, Wang H, Liang RY (2016) Quantifying stratigraphic uncertainties by stochastic simulation techniques based on Markov random field. Eng Geol 201:106–122
Mallet J-L (1989) Discrete smooth interpolation. ACM Trans Gr 8:121–144
Mallet J-L (2002) Geomodeling. Oxford University Press Inc, Oxford
Mann CJ (1993) Uncertainty in geology. Computers in Geology—25 Years of Progress. Oxford University Press, Oxford, pp 241–254
Mariethoz G, Caers J (2014) Multiple-point geostatistics: stochastic modeling with training images. Wiley, New York
Mariethoz G, Renard P, Cornaton F, Jaquet O (2009) Truncated plurigaussian simulations to characterize aquifer heterogeneity. Ground Water 47:13–24
McKenna SA, Poeter EP (1995) Field example of data fusion in site characterization. Water Resour Res 31:3229–3240
McLachlan G, Peel D (2004) Finite mixture models. Wiley, Hoboken, NJ
McLachlan GJ, Basford KE (1988) Mixture models. Inference and applications to clustering. Statistics: Textbooks and Monographs. Dekker, New York, p 1
McLachlan GJ, Krishnan T (2007) The EM algorithm and extensions. Wiley-Interscience, New York
Norberg T, Rosén L, Baran A, Baran S (2002) On modelling discrete geological structures as Markov random fields. Math Geol 34:63–77
Pham DL, Xu C, Prince JL (2000) Current methods in medical image segmentation. Ann Rev Biomed Eng 2:315–337
Reitberger J, Schnörr C, Krzystek P, Stilla U (2009) 3D segmentation of single trees exploiting full waveform LIDAR data. ISPRS J Photogramm Remote Sens 64:561–574
Rubin Y, Chen X, Murakami H, Hahn M (2010) A Bayesian approach for inverse modeling, data assimilation, and conditional simulation of spatial random fields. Water Resour Res 46:W10523
Rue H, Held L (2005) Gaussian Markov random fields: theory and applications. CRC Press, Boca Raton
Solberg AHS, Taxt T, Jain AK (1996) A Markov random field model for classification of multisource satellite imagery. IEEE Trans Geosci Remote Sens 34:100–113
Strebelle S (2002) Conditional simulation of complex geological structures using multiple-point statistics. Math Geol 34:1–21
Thornton C (1998) Separability is a learner’s best friend, 4th Neural Computation and Psychology Workshop, 9–11 April 1997. Springer, London, pp 40–46
Toftaker H, Tjelmeland H (2013) Construction of binary multi-grid Markov random field prior models from training images. Math Geosci 45:383–409
Tolpekin VA, Stein A (2009) Quantification of the effects of land-cover-class spectral separability on the accuracy of Markov-random-field-based superresolution mapping. IEEE Trans Geosci Remote Sens 47:3283–3297
Wang X, Li Z, Wang H, Rong Q, Liang RY (2016) Probabilistic analysis of shield-driven tunnel in multiple strata considering stratigraphic uncertainty. Struct Saf 62:88–100
Wellmann JF (2013) Information theory for correlation analysis and estimation of uncertainty reduction in maps and models. Entropy 15:1464–1485
Wellmann JF, Regenauer-Lieb K (2012) Uncertainties have a meaning: Information entropy as a quality measure for 3-D geological models. Tectonophysics 526:207–216
Wellmann JF, Thiele ST, Lindsay MD, Jessell MW (2016) pynoddy 1.0: an experimental platform for automated 3-D kinematic and potential field modelling. Geosci Model Dev 9:1019–1035
Xie H, Pierce LE, Ulaby FT (2002) SAR speckle reduction using wavelet denoising and Markov random field modeling. IEEE Trans Geosci Remote Sens 40:2196–2212
Yuen KV, Mu HQ (2011) Peak ground acceleration estimation by linear and nonlinear models with reduced order Monte Carlo simulation. Comput Aided Civil Infrastruct Eng 26:30–47
Zhang Y, Brady M, Smith S (2001) Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Trans Med Imaging 20:45–57
Zhu H, Zhang L (2013) Characterizing geotechnical anisotropic spatial variations using random field theory. Can Geotech J 50:723–734

Acknowledgements

Hui Wang and Florian Wellmann would like to acknowledge the support from the German Research Foundation (DFG) through the Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University. The authors would like to thank the anonymous reviewers for their constructive comments, which have helped to improve the paper significantly.

Author information

Authors and Affiliations

The Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University, Schinkelstraße 2, 52062, Aachen, Germany
Hui Wang & J. Florian Wellmann
Department of Civil Engineering, The University of Akron, Akron, OH, 44325-3905, USA
Zhao Li & Robert Y. Liang
College of Engineering, Peking University, Beijing, 100871, China
Xiangrong Wang

Corresponding author

Correspondence to Hui Wang.

Appendices

Appendix A: MRF Energy and Likelihood Energy

According to Bayes' theorem,

$$\begin{aligned} p(\mathbf{x}|\mathbf{y},\Phi )\propto p(\mathbf{x})p(\mathbf{y}|\mathbf{x},\Phi )\propto \exp (-U(\mathbf{x})+\sum _{j\in V} {\log f_{x_j } (y_j ;\theta _{x_j } )} ). \end{aligned}$$

(26)

Although it is not possible to sample a posteriori realizations of $\mathbf{x}$ directly from Eq. (26), the conditional random field $p(\mathbf{x}|\mathbf{y},\Phi )$ is still a Gibbs field. This can be seen by substituting $U^{\prime }(\mathbf{x})=U(\mathbf{x})-\sum \nolimits _{j\in V} {\log f_{x_j } (y_j ;\theta _{x_j } )}$ into Eq. (26), which gives $p(\mathbf{x}|\mathbf{y},\Phi )\propto \exp (-U^{\prime }(\mathbf{x}))$. Assuming the emission distribution is Gaussian (i.e., $\theta _{x_j } =(\mu _{x_j },\Sigma _{x_j } )$), the corresponding local conditional distribution is

$$\begin{aligned} p(x_j |y_j,\mathbf{x}_{\partial _j },\theta _{x_j } )\propto \exp \left[ -U(x_j,\mathbf{x}_{\partial _j } )-\frac{1}{2}(y_j -\mu _{x_j } )^{T}\Sigma _{x_j } ^{-1}(y_j -\mu _{x_j } )-\frac{1}{2}\log \left| {\Sigma _{x_j }} \right| \right] \end{aligned}$$

(27)

which can be rewritten as

$$\begin{aligned} p(x_j |y_j,\mathbf{x}_{\partial _j },\theta _{x_j } )\propto \exp [-(U(x_j,\mathbf{x}_{\partial _j } )+U(y_j |x_j,\theta _{x_j } ))] \end{aligned}$$

(28)

where $U(x_j,\mathbf{x}_{\partial _j } )$ is the MRF energy and the likelihood energy is

$$\begin{aligned} U(y_j |x_j,\theta _{x_j } )=\frac{1}{2}(y_j -\mu _{x_j } )^{T}\Sigma _{x_j } ^{-1}(y_j -\mu _{x_j } )+\frac{1}{2}\log \left| {\Sigma _{x_j }} \right| . \end{aligned}$$

(29)
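
The local conditional distribution of Eq. (28) can be illustrated with a minimal Python sketch of a single-voxel Gibbs update. The likelihood energy follows the Gaussian form of Eq. (29), with the quadratic form evaluated at the observation $y_j$. The MRF energy is not specified in this appendix, so the sketch assumes a simple Potts-type energy (a weight `beta` times the number of neighbors with a different label); this choice, the function names, and the example values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def likelihood_energy(y, mu, sigma):
    """Gaussian likelihood energy of Eq. (29) for one observation and one label."""
    diff = np.asarray(y, dtype=float) - mu
    _, logdet = np.linalg.slogdet(sigma)
    return 0.5 * diff @ np.linalg.solve(sigma, diff) + 0.5 * logdet

def potts_energy(label, neighbor_labels, beta=1.0):
    """ASSUMED MRF energy: beta times the number of disagreeing neighbors."""
    return beta * np.sum(np.asarray(neighbor_labels) != label)

def local_conditional(y, neighbor_labels, mus, sigmas, beta=1.0):
    """Normalized label probabilities at one voxel, following Eq. (28)."""
    energies = np.array([
        potts_energy(l, neighbor_labels, beta) + likelihood_energy(y, mus[l], sigmas[l])
        for l in range(len(mus))
    ])
    # Subtract the minimum energy before exponentiating for numerical stability.
    weights = np.exp(-(energies - energies.min()))
    return weights / weights.sum()

# Hypothetical two-label example with two-dimensional features.
rng = np.random.default_rng(0)
mus = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
sigmas = [np.eye(2), np.eye(2)]
probs = local_conditional(np.array([0.2, -0.1]), [0, 0, 1, 0], mus, sigmas)
new_label = rng.choice(len(mus), p=probs)  # one Gibbs draw for this voxel
```

Working with energies and normalizing only at the end avoids evaluating the Gaussian densities explicitly, which helps when the energies are large.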

Appendix B: Chromatic Sampler

The chromatic sampler applies a classic graph coloring technique to parallel job scheduling, so that a direct parallelization of the sequential scan Gibbs sampler can be achieved. More specifically, the entire set of voxels is decomposed into k subsets such that adjacent vertices in the corresponding graph have different colors. The k-coloring of the MRF ensures that, within a given subset, all vertices are conditionally independent given the configuration of the vertices in the remaining colors. Therefore, all vertices with the same color can be sampled independently and in parallel. According to Gonzalez et al. (2011), given p processors and a k-coloring of an MRF with n vertices, the parallel chromatic sampler is ergodic and generates a single joint sample in running time $O(n/p+k)$, which reduces the mixing time by a factor of p. Given sufficient parallel resources, the running time is dominated by the number of colors k. A simple example is provided here: a three-dimensional grid system (Fig. 10) equipped with the neighborhood system defined in Sect. 2 is a graph with 8 colors ($k=8$).
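
As a hedged illustration of the coloring step, the sketch below builds an 8-coloring of a regular three-dimensional grid from the parity of the voxel indices and checks that no voxel shares a color with a neighbor. The neighborhood system of Sect. 2 is not reproduced here, so the sketch assumes the full 26-voxel neighborhood (faces, edges, and corners), which is consistent with $k=8$; the function names and grid size are illustrative.

```python
import itertools
import numpy as np

def chromatic_colors(shape):
    """Assign one of 8 colors to each voxel from the parity of its indices."""
    i, j, k = np.indices(shape)
    return (i % 2) + 2 * (j % 2) + 4 * (k % 2)

def is_valid_coloring(colors):
    """Check that no voxel shares a color with any of its 26 neighbors (assumed neighborhood)."""
    nx, ny, nz = colors.shape
    offsets = [d for d in itertools.product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    for di, dj, dk in offsets:
        # Two overlapping sub-blocks pair every voxel with its neighbor at this offset.
        a = colors[max(di, 0):nx + min(di, 0),
                   max(dj, 0):ny + min(dj, 0),
                   max(dk, 0):nz + min(dk, 0)]
        b = colors[max(-di, 0):nx + min(-di, 0),
                   max(-dj, 0):ny + min(-dj, 0),
                   max(-dk, 0):nz + min(-dk, 0)]
        if np.any(a == b):
            return False
    return True

colors = chromatic_colors((4, 5, 6))
# One sweep visits the colors in turn; voxels within a batch can be updated in parallel.
batches = [np.argwhere(colors == c) for c in range(8)]
```

Each batch contains only mutually non-adjacent voxels, so their local conditional distributions can be sampled simultaneously before moving on to the next color.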

Appendix C: Calculating Information Entropy

First, calculate the probability $P_l (X_i )$ of assigning a certain label $l\in L$ to a given voxel $i\in V$ using the following expression

$$\begin{aligned} P_l (X_i )=\frac{1}{n}\sum _{k=1}^n {I_l \left( x_i^{(k)} \right) }, \end{aligned}$$

(30)

where n is a predefined number of realizations after the burn-in period and $I_l (\cdot )$ is an indicator function which is defined as

$$\begin{aligned} I_l (x_i )=\left\{ {\begin{array}{ll} 1&{}\quad \hbox { for }x_i =l \\ 0&{}\quad \hbox { for }x_i \ne l \\ \end{array}} \right. . \end{aligned}$$

(31)

Second, for voxel i, the information entropy reads

$$\begin{aligned} H(X_i )=-\sum _{l\in L} {P_l (X_i )\log P_l (X_i )}. \end{aligned}$$

(32)

Based on Eq. (32), the average information entropy for the entire physical domain can be calculated as

$$\begin{aligned} H_T (\mathbf{X})=\frac{1}{\left| V \right| }\sum _{i\in V} {H(X_i )}, \end{aligned}$$

(33)

where $\left| V \right| $ denotes the cardinality of the set V. The average information entropy is used to quantify the uncertainties of the entire physical domain with a single number.
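
The entropy calculation can be sketched in a few lines of NumPy. The sketch assumes that the post-burn-in realizations are stored as an integer array whose first axis indexes the realization, that labels are coded as 0, 1, ..., and that $0\log 0$ is taken as 0; the domain average is computed as the plain mean of the per-voxel entropies, so it is non-negative. Function names and the toy data are illustrative.

```python
import numpy as np

def label_probabilities(samples, n_labels):
    """Eq. (30): relative frequency of each label per voxel.

    samples has shape (n, ...) with integer labels; the result has shape (n_labels, ...).
    """
    samples = np.asarray(samples)
    return np.stack([(samples == l).mean(axis=0) for l in range(n_labels)])

def voxel_entropy(probs):
    """Eq. (32): per-voxel information entropy, using the convention 0 log 0 = 0."""
    safe = np.where(probs > 0, probs, 1.0)  # log(1) = 0 wherever the probability is zero
    return -(probs * np.log(safe)).sum(axis=0)

def average_entropy(probs):
    """Mean of the per-voxel entropies over the whole domain."""
    return voxel_entropy(probs).mean()

# Toy example: 4 realizations of a 2-voxel domain with 2 labels.
samples = np.array([[0, 1], [0, 0], [0, 1], [0, 0]])
probs = label_probabilities(samples, n_labels=2)
H = voxel_entropy(probs)
H_T = average_entropy(probs)
```

In this toy example the first voxel always receives the same label and therefore carries no uncertainty, whereas the second voxel is split evenly between two labels and attains the two-label maximum of $\log 2$.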

Appendix D: Geometric Separability Index (GSI)

According to Thornton (1998), the Geometric Separability Index (GSI) for a two-cluster complete data set (i.e., both cluster labels x and observed features y are known) is defined as

$$\begin{aligned} \mathrm{GSI}=\frac{\sum _{i=1}^n {(f(x_i )+f(x_i^{\prime } )+1)\bmod 2}}{n}. \end{aligned}$$

(34)

Here $f(\cdot )$ is a binary target function that returns 0 or 1 according to the input label, $x_i \in L$ is the label at site i, $x_i^{\prime } \in L$ is the label of site i’s nearest neighbor, and n is the total number of data points. The nearest neighbor is defined using the Euclidean distance in the feature space, so each term of the sum equals 1 when a point and its nearest neighbor share the same target value and 0 otherwise.

For cases with multiple clusters, simply define the binary target function linked to the specific label $l\in L$

$$\begin{aligned} f_l (x_i )=\left\{ {{\begin{array}{ll} 1&{}\quad x_i =l \\ 0&{}\quad x_i \ne l \\ \end{array}}} \right. . \end{aligned}$$

(35)

Then the average GSI is defined as the arithmetic mean of $\mathrm{GSI}_l$ over all labels $l\in L$

$$\begin{aligned} \overline{\mathrm{GSI}} =\frac{\sum \nolimits _{l\in L} {\mathrm{GSI}_l }}{\left| L \right| }, \end{aligned}$$

(36)

where $\left| L \right| $ is the cardinality of the label set L. The average GSI is used to provide a measure of the overall separability in a single number.

The average GSI intuitively quantifies the degree to which data points mix together and hence indicates how “difficult” the segmentation problem is. If the centroids of the clusters almost coincide or the observed data points are uniformly distributed in the feature space (i.e., highly overlapped), the GSI will be close to 0.5; in contrast, if there is almost no overlap among clusters, the GSI will be close to 1.0.
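
A minimal sketch of the GSI computation is given below. It uses a brute-force pairwise distance matrix, which needs memory quadratic in the number of points and is therefore only suitable for small data sets; the function names and toy data are illustrative assumptions, and ties between equidistant neighbors are broken by the lowest index.

```python
import numpy as np

def gsi(features, targets):
    """Eq. (34): fraction of points whose nearest neighbor has the same binary target."""
    features = np.asarray(features, dtype=float)
    targets = np.asarray(targets)
    # Squared Euclidean distances between all pairs of points in feature space.
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is not its own nearest neighbor
    nearest = d2.argmin(axis=1)
    return np.mean((targets + targets[nearest] + 1) % 2)

def average_gsi(features, labels):
    """Eqs. (35)-(36): mean of the one-vs-rest GSI over all labels present."""
    labels = np.asarray(labels)
    return np.mean([gsi(features, (labels == l).astype(int)) for l in np.unique(labels)])

# Two well-separated clusters: every nearest neighbor carries the same label.
separated = average_gsi([[0, 0], [0, 1], [10, 10], [10, 11]], [0, 0, 1, 1])
# Alternating labels on a line: every nearest neighbor carries the other label.
interleaved = average_gsi([[0], [1], [2], [3]], [0, 1, 0, 1])
```

For larger data sets, a spatial index such as a k-d tree would be a natural replacement for the pairwise distance matrix.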

About this article

Cite this article

Wang, H., Wellmann, J.F., Li, Z. et al. A Segmentation Approach for Stochastic Geological Modeling Using Hidden Markov Random Fields. Math Geosci 49, 145–177 (2017). https://doi.org/10.1007/s11004-016-9663-9

Received: 28 March 2016
Accepted: 19 October 2016
Published: 09 November 2016
Issue Date: February 2017
DOI: https://doi.org/10.1007/s11004-016-9663-9