
Bayesian Inversion in Hidden Markov Models with Varying Marginal Proportions


Abstract

Knowledge of the sub-surface characteristics is crucial in many engineering activities. Sub-surface soil classes must, for example, be predicted from indirect measurements in narrow drill holes together with geological experience. In this study, the inversion is made in a Bayesian framework by defining a hidden Markov chain. The likelihood model for the observations is assumed to be in factorial form. The new feature is that the prior Markov model is specified by vertical class proportion profiles and one reference class transition matrix. A criterion for selecting the associated non-stationary prior Markov model is introduced, and an algorithm for assessing the set of class transition matrices is defined. The methodology is demonstrated on one synthetic example and on one case study for offshore foundation of windmills. It is concluded that important experience from the geologist can be captured by the new prior model and that the associated posterior model is therefore improved.



Acknowledgements

This research was carried out as part of a Ph.D. study at the School of Mathematical and Statistical Sciences, Hawassa University, Ethiopia. The funding is provided by the Ethiopian Department of Education and the Norwegian Agency for Development Cooperation. Thanks also to Ivan Depina, SINTEF, Norway, for providing and supporting the real geotechnical data.

Funding

Funding was provided by the Ethiopian Department of Education, Ethiopia, and the Norwegian Agency for Development Cooperation, Norway.

Author information


Corresponding author

Correspondence to Selamawit Serka Moja.

Appendices

Appendix A: Markov Property of Posterior Model

Consider the posterior model

$$\begin{aligned} \begin{aligned} p(\varvec{\mathbf {\kappa }}|\varvec{\mathbf {d}})&=\text{ const }\times p(d_1|\kappa _1)p(\kappa _1)\prod _{t\in {\mathcal {T}}_{-1}} p(d_t|\kappa _t)p(\kappa _t|\kappa _{t-1}) \\&=p(\kappa _1|\varvec{\mathbf {d}})\prod _{t\in {\mathcal {T}}_{-1}} p(\kappa _t|\kappa _{t-1},\varvec{\mathbf {d}}). \end{aligned} \end{aligned}$$

Rewrite \(p(\varvec{\mathbf {\kappa }} |\varvec{\mathbf {d}})\) using the general conditioning decomposition

$$\begin{aligned} p(\varvec{\mathbf {\kappa }}|\varvec{\mathbf {d}}) = p(\kappa _1|\varvec{\mathbf {d}})\prod _{t\in {\mathcal {T}}_{-1}} p(\kappa _t|\kappa _{t-1},\ldots , \kappa _1,\varvec{\mathbf {d}}). \end{aligned}$$

By demonstrating that for \(t\in {\mathcal {T}}_{-1}\)

$$\begin{aligned} p(\kappa _t|\kappa _{t-1},\ldots ,\kappa _1,\varvec{\mathbf {d}})=\frac{p(\kappa _t,\ldots ,\kappa _1,\varvec{\mathbf {d}})}{p(\kappa _{t-1},\ldots ,\kappa _1,\varvec{\mathbf {d}})}=p(\kappa _t|\kappa _{t-1},\varvec{\mathbf {d}}) \end{aligned}$$

the Markov property of the posterior model is proven.

Use notation \(\varvec{\mathbf {\kappa }}_{1:s}=(\kappa _1,\ldots ,\kappa _s)\) and \(\varvec{\mathbf {d}}_{s:T}=(d_s,\ldots ,d_T)\) and define

$$\begin{aligned} \begin{aligned} p(\varvec{\mathbf {\kappa }}_{1:s}|\varvec{\mathbf {d}})&=\sum _{\kappa _{T} \in \varOmega _{\kappa }} \cdots \sum _{\kappa _{s+1} \in \varOmega _{\kappa }} p(\varvec{\mathbf {\kappa }}|\varvec{\mathbf {d}}) \\&= \text{ const } \times p(d_1|\kappa _1) p(\kappa _1) \prod _{i=2}^{s} p(d_i|\kappa _i) p(\kappa _i|\kappa _{i-1})\\&\quad \times \sum _{\kappa _{T} \in \varOmega _{\kappa }}\cdots \sum _{\kappa _{s+1} \in \varOmega _{\kappa }} \prod _{j=s+1}^{T} p(d_j|\kappa _j) p(\kappa _j|\kappa _{j-1}) \\&= \ \text{ const } \ \times p(d_1|\kappa _1) \ p(\kappa _1) \ \prod _{i=2}^{s} \ p(d_i|\kappa _i) \ p(\kappa _i|\kappa _{i-1}) \times v_s(\kappa _s,\varvec{\mathbf {d}}_{s+1:T}). \\ \end{aligned} \end{aligned}$$

The latter factor is only a function of \(\kappa _s\) and \(\varvec{\mathbf {d}}_{s+1:T}\) since \((\kappa _T,\ldots ,\kappa _{s+1} )\) is marginalized out. Hence, we may write

$$\begin{aligned} \begin{aligned} p(\kappa _t|\kappa _{t-1},\ldots ,\kappa _1,\varvec{\mathbf {d}})&=\frac{p(\varvec{\mathbf {\kappa }}_{1:t}|\varvec{\mathbf {d}})}{p(\varvec{\mathbf {\kappa }}_{1:t-1}|\varvec{\mathbf {d}})} \\&= \ p(d_t|\kappa _t) p(\kappa _t|\kappa _{t-1}) \ \ \frac{v_t(\kappa _t,\varvec{\mathbf {d}}_{t+1:T})}{v_{t-1}(\kappa _{t-1},\varvec{\mathbf {d}}_{t:T})}\\&= \ p(\kappa _t|\kappa _{t-1},\varvec{\mathbf {d}}_{t:T}) \ \ \ \text{ QED. }\\ \end{aligned} \end{aligned}$$

Note that the demonstration also holds when the prior model \(p(\varvec{\mathbf {\kappa }})\) is a non-stationary Markov chain.
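As an illustration only (not part of the paper), the Markov property derived above can be checked numerically by brute force on a tiny chain: the conditional \(p(\kappa _t|\kappa _{t-1},\ldots ,\kappa _1,\varvec{\mathbf {d}})\) computed from the full joint posterior should not depend on \(\kappa _{t-2},\ldots ,\kappa _1\). The sketch below uses arbitrary likelihood values and randomly drawn transition matrices.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
T, K = 4, 3
p1 = np.full(K, 1.0 / K)                              # p(kappa_1)
P = rng.dirichlet(np.ones(K), size=(T - 1, K))        # P[t-1][i, j] plays the role of p(kappa_t=j | kappa_{t-1}=i)
lik = rng.random((T, K))                              # stand-in for p(d_t | kappa_t)

# Unnormalized joint posterior p(kappa | d) over all K**T class sequences
joint = np.zeros((K,) * T)
for path in itertools.product(range(K), repeat=T):
    p = p1[path[0]] * lik[0, path[0]]
    for t in range(1, T):
        p *= P[t - 1][path[t - 1], path[t]] * lik[t, path[t]]
    joint[path] = p
joint /= joint.sum()

# p(kappa_3 | kappa_2, kappa_1, d) must coincide for every value of kappa_1
marg123 = joint.sum(axis=3)                           # p(kappa_1, kappa_2, kappa_3 | d)
cond = marg123 / marg123.sum(axis=2, keepdims=True)   # p(kappa_3 | kappa_1, kappa_2, d)
assert all(np.allclose(cond[0], cond[k]) for k in range(K))
```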

Appendix B: Forward–Backward Algorithm

Consider the posterior model

$$\begin{aligned} \begin{aligned} p(\varvec{\mathbf {\kappa }}|\varvec{\mathbf {d}})&=\text{ const }\times p(d_1|\kappa _1)p(\kappa _1)\prod _{t\in {\mathcal {T}}_{-1}} p(d_t|\kappa _t)p(\kappa _t|\kappa _{t-1}) \\&=p(\kappa _1|\varvec{\mathbf {d}})\prod _{t\in {\mathcal {T}}_{-1}} p(\kappa _t|\kappa _{t-1},\varvec{\mathbf {d}}). \end{aligned} \end{aligned}$$

Moreover, let \( \varvec{\mathbf {d}}_{1:t}=[d_1,\ldots ,d_t]\) be the subset of \(\varvec{\mathbf {d}}=\varvec{\mathbf {d}}_{1:T}\) up to time t.

Algorithm (figure a): forward–backward recursions.
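A minimal numerical sketch of these recursions is given below; it is illustrative only and not the authors' implementation. The inputs are hypothetical arrays: lik of shape (T, K) holding the likelihood factors \(p(d_t|\kappa _t)\), p1 holding the initial pdf \(p(\kappa _1)\), and P holding the \(T-1\) transition matrices of the (possibly non-stationary) prior chain. The recursions are scaled to avoid numerical underflow and return the posterior marginals \(p(\kappa _t|\varvec{\mathbf {d}})\).

```python
import numpy as np

def forward_backward(lik, p1, P):
    """Scaled forward-backward recursions; returns p(kappa_t | d) with shape (T, K)."""
    T, K = lik.shape
    alpha = np.zeros((T, K))                            # scaled forward probabilities
    c = np.zeros(T)                                     # scaling constants
    alpha[0] = p1 * lik[0]
    c[0] = alpha[0].sum()
    alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ P[t - 1]) * lik[t]   # P[t-1] maps step t-1 to step t
        c[t] = alpha[t].sum()
        alpha[t] /= c[t]
    beta = np.ones((T, K))                              # scaled backward probabilities
    for t in range(T - 2, -1, -1):
        beta[t] = P[t] @ (lik[t + 1] * beta[t + 1]) / c[t + 1]
    post = alpha * beta                                 # posterior marginals p(kappa_t | d)
    return post / post.sum(axis=1, keepdims=True)
```

The conditional factors \(p(\kappa _t|\kappa _{t-1},\varvec{\mathbf {d}})\) of the posterior chain in Appendix A can be assembled from the same forward and backward quantities.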

Appendix C: Viterbi Algorithm

Consider the maximum a posteriori (MAP) prediction

$$\begin{aligned} \begin{aligned} \hat{\varvec{\mathbf {\kappa }}}_\mathrm{MAP} = \text{ MAP }\left\{ \varvec{\mathbf {\kappa }}|\varvec{\mathbf {d}}\right\}&= \mathrm {arg}\ {\underset{\varvec{\mathbf {\kappa }}}{\mathrm {max}}} \ \left\{ p(\varvec{\mathbf {\kappa }}|\varvec{\mathbf {d}})\right\} \\&= \mathrm {arg}\ {\underset{\varvec{\mathbf {\kappa }}}{\mathrm {max}}} \ \left\{ p(\kappa _1|\varvec{\mathbf {d}})\prod _{t\in {\mathcal {T}}_{-1}} p(\kappa _t|\kappa _{t-1}, \ \varvec{\mathbf {d}}) \right\} \end{aligned} \end{aligned}$$
$$\begin{aligned} p_\mathrm{MAP}=p(\hat{\varvec{\mathbf {\kappa }}}_{MAP}|\varvec{\mathbf {d}}). \end{aligned}$$

Moreover, let \(\varvec{\mathbf {\kappa }}_{t}(\kappa )=[{\hat{\kappa }}_{1}^{'},\ldots ,{\hat{\kappa }}_{t-1}^{'},\kappa ]\) be the MAP trace up to t given \(\kappa _t = \kappa \), with associated MAP probability \(p^{M}_{t}(\kappa )\) for \(\kappa \in \varOmega _{\kappa },\) and let \(p^{M}_{t}(\kappa |\kappa ^{'})\) be the MAP probability for \(\kappa _t = \kappa \) given that \({\hat{\kappa }}^{'}_{t-1}=\kappa ^{'}\), for \(\kappa ,\kappa ^{'} \in \varOmega _{\kappa }\).

Algorithm (figure b): Viterbi recursions.
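A corresponding sketch of the Viterbi recursion is given below; it is again illustrative only, uses the same hypothetical inputs (lik, p1, P) as the forward–backward sketch in Appendix B, works in log-space for numerical stability, and assumes strictly positive probabilities.

```python
import numpy as np

def viterbi(lik, p1, P):
    """Return the MAP class sequence of length T (log-space recursion)."""
    T, K = lik.shape
    logdelta = np.log(p1) + np.log(lik[0])        # log p_1^M(kappa)
    backptr = np.zeros((T, K), dtype=int)         # arg max over the previous class
    for t in range(1, T):
        cand = logdelta[:, None] + np.log(P[t - 1]) + np.log(lik[t])[None, :]
        backptr[t] = cand.argmax(axis=0)
        logdelta = cand.max(axis=0)               # log p_t^M(kappa)
    path = np.zeros(T, dtype=int)
    path[-1] = logdelta.argmax()
    for t in range(T - 2, -1, -1):                # backtrack the MAP trace
        path[t] = backptr[t + 1, path[t + 1]]
    return path
```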

Appendix D: Trend Prior Model

Consider the spatial discretization \( {\mathcal {T}}: \left\{ 1, \ldots , T \right\} \) and the categorical variable \( \varvec{\mathbf {\kappa }} = [ \kappa _1, \ldots , \kappa _T ] ; \ \kappa _t \in \varOmega _\kappa : \left\{ 1, \ldots , K \right\} . \ \) Define the prior Markov chain model

$$\begin{aligned} p(\varvec{\mathbf {\kappa }}) \ = \ p(\kappa _1) \ \prod _{t \in {\mathcal {T}}_{-1}} p(\kappa _t|\kappa _{t-1}) \end{aligned}$$

parameterized by the initial pdf \( \ \varvec{\mathbf {p_1}} = [ p(\kappa _1) ]_{\kappa _1 \in \varOmega _\kappa } \) and the set of transition matrices \( \varvec{\mathbf {P}}_{t-1, \ t} \ = \ [ p(\kappa _t|\kappa _{t-1}) ]_{\kappa _{t-1}, \ \kappa _t \ \in \ \varOmega _\kappa } \ ; t \in {\mathcal {T}}_{-1}. \ \)
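As a small hypothetical illustration (not from the paper), one realization of such a non-stationary prior chain can be simulated directly from \( \varvec{\mathbf {p_1}} \) and the matrices \( \varvec{\mathbf {P}}_{t-1, \ t} \):

```python
import numpy as np

def simulate_prior(p1, P, seed=0):
    """Draw one realization kappa_1,...,kappa_T from the non-stationary prior chain."""
    rng = np.random.default_rng(seed)
    K, T = len(p1), len(P) + 1
    kappa = np.empty(T, dtype=int)
    kappa[0] = rng.choice(K, p=p1)                           # initial class from p(kappa_1)
    for t in range(1, T):
        kappa[t] = rng.choice(K, p=P[t - 1][kappa[t - 1]])   # row of P_{t-1,t} for the current class
    return kappa
```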

Consider a non-complete set of model parameters consisting of the set of marginal pdfs \( \varvec{\mathbf {p}}_{t}^{0} \ = \ [ p^{0}(\kappa _t) ]_{\kappa _t \in \varOmega _\kappa } \ ; t \in {\mathcal {T}}, \ \) and the reference transition matrix \( \varvec{\mathbf {P}}_{r} \ = \ [ p_{r}(\kappa |\kappa ^{'}) ]_{\kappa , \ \kappa ^{'} \ \in \ \varOmega _\kappa }. \ \) The challenge is to define a set of model parameters for the prior model \( \ p(\varvec{\mathbf {\kappa }}) \ \) such that the marginal pdfs reproduce \(\varvec{\mathbf {p}}^{0}_{t}; \ t \in {\mathcal {T}}\) in the non-complete set of model parameters, and such that the set of transition matrices does not deviate too much from \(\varvec{\mathbf {P}}_r\). The set of parameters for the prior Markov chain model is defined by the following set of constrained optimization problems.

Initial

$$\begin{aligned} \varvec{\mathbf {p}}_1 \ = \ \varvec{\mathbf {p}}_{1}^{0} \end{aligned}$$

\(\text{ for } \ \ t=2,\ldots ,T \ \ \text{ let }\)

$$\begin{aligned} \varvec{\mathbf {P}}_{t-1, \ t} = \mathrm {arg}\ {\underset{\varvec{\mathbf {P}}}{\mathrm {min}}} \ \left\{ \Arrowvert \varvec{\mathbf {P}} - \varvec{\mathbf {P}}_r \Arrowvert _{wL^{2}} \right\} \end{aligned}$$

\(\text{ with } \)

$$\begin{aligned} \Arrowvert \varvec{\mathbf {P}} - \varvec{\mathbf {P}}_r \Arrowvert _{wL^{2}} \ = \ \sum _{\kappa \in \varOmega _\kappa } \sum _{\kappa ^{'} \in \varOmega _\kappa } w_{\kappa \ \kappa ^{'}} \ [ p(\kappa |\kappa ^{'}) - p_r(\kappa |\kappa ^{'}) ]^{2}\\ \text{ and } \ w_{\kappa \ \kappa ^{'}} \ = \ [[1-p_r(\kappa |\kappa ^{'})] p_r(\kappa |\kappa ^{'})]^{-1} \end{aligned}$$

constrained by

$$\begin{aligned} \begin{aligned}&\varvec{\mathbf {p}}_{t}^{0} \ = \ \varvec{\mathbf {P}}^{'} \ \varvec{\mathbf {p}}_{t-1}^{0} \ \ \text{ providing } \ \ p^{0}_{t}(\kappa ) \ = \ \sum _{\kappa ^{'} \in \varOmega _\kappa } p(\kappa |\kappa ^{'}) \ p^{0}_{t-1}(\kappa ^{'});\ \ \ \kappa \in \varOmega _\kappa \\&\varvec{\mathbf {i}}_\kappa \ = \ \varvec{\mathbf {P}} \ \varvec{\mathbf {i}}_\kappa \ \ \text{ providing } \ \ 1 \ = \ \sum _{\kappa \in \varOmega _\kappa } p(\kappa |\kappa ^{'}); \ \ \ \kappa ^{'} \in \varOmega _\kappa \\&\varvec{\mathbf {0}} \ \le \ \varvec{\mathbf {P}} \ \ \text{ providing } \ \ 0 \ \le \ p(\kappa |\kappa ^{'}) ; \ \ \kappa , \ \kappa ^{'} \in \varOmega _\kappa \\ \end{aligned} \end{aligned}$$

end.

The solution to this optimization problem for a given \( t \), \( \varvec{\mathbf {P}}_{t-1, \ t} \), will be a valid transition matrix that reproduces the marginal pdf \( \varvec{\mathbf {p}}_t^{0} \). Moreover, \( \varvec{\mathbf {P}}_{t-1, \ t} \) will have minimum weighted deviation from the reference transition matrix \( \varvec{\mathbf {P}}_{r} \).

Since \( \ 1 \ = \ \varvec{\mathbf {p}}_{t}^{0}{'} \ \varvec{\mathbf {i}}_{\kappa } \ ; \ t \in {\mathcal {T}} \ \text{ and } \ \varvec{\mathbf {i}}_{\kappa } \ = \ \varvec{\mathbf {P}} \ \varvec{\mathbf {i}}_{\kappa }, \ \) the first set of equality constraints is linearly dependent given the second set, for each t. This linear dependence is removed by dropping the constraint for \( \ \kappa \ = \ K. \ \)

Parameterize \( \ \varvec{\mathbf {P}}_{t-1, \ t} \ \text{ and } \ \varvec{\mathbf {P}} \ \text{ by } \ \varvec{\mathbf {\alpha }}:\{\alpha _{\kappa ^{'} \kappa } = p(\kappa |\kappa ^{'}); \ \kappa , \kappa ^{'} \in \varOmega _\kappa \} \ \) and obtain for each t

$$\begin{aligned} \varvec{\mathbf {P}}_{t-1,\ t}(\varvec{\mathbf {\alpha }}) \ = \ \mathrm {arg}\ {\underset{\varvec{\mathbf {\alpha }}}{\mathrm {min}}} \ \left\{ \sum _{\kappa \in \varOmega _{\kappa }} \sum _{\kappa ^{'} \in \varOmega _{\kappa }} { \left[ \ \alpha _{\kappa ^{'}\kappa } - p_r(\kappa |\kappa ^{'}) \ \right] ^{2} \over \left( \ 1 - p_r(\kappa |\kappa ^{'}) \ \right) \ p_r(\kappa |\kappa ^{'}) } \right\} \end{aligned}$$

constrained by

$$\begin{aligned} \begin{array}{lll} &{} p_{t}^{0} (\kappa ) \ = \ \sum _{\kappa ^{'} \in \varOmega _{\kappa }} \alpha _{\kappa ^{'}\kappa } \ p_{t-1}^{0}(\kappa ^{'}) &{} \kappa \in \varOmega _{\kappa }\backslash K \\ &{} 1 \ = \ \sum _{\kappa \in \varOmega _{\kappa }} \alpha _{\kappa ^{'}\kappa } &{} \kappa ^{'}\in \varOmega _{\kappa } \\ &{} 0 \ \le \ \alpha _{\kappa ^{'}\kappa } &{} \kappa ^{'},\kappa \in \varOmega _{\kappa }. \\ \end{array} \end{aligned}$$

This constitutes an optimization problem with a quadratic objective function, two sets of linear equality constraints and one set of linear inequality constraints. No closed-form analytical solution to this optimization problem exists. Note, however, that the reference transition matrix \( \ \varvec{\mathbf {P}}_r \ \), at which the objective function is centered, obeys the latter two sets of constraints. Moreover, note that the weights \( \ w_{\kappa \kappa ^{'}} \ \) make deviations from elements of \( \ \varvec{\mathbf {P}}_r \ \) close to the border of the inequality constraints very costly. If \( \ \varvec{\mathbf {p}}^{0}_{t-1} \ \) and \( \ \varvec{\mathbf {p}}^{0}_{t} \ \) do not deviate dramatically, and \( \ \varvec{\mathbf {P}}_{r} \ \) is not chosen in conflict with these two marginal pdfs, the inequality constraints will most likely be inactive.
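One illustrative way to solve the per-step problem numerically is with a generic constrained solver; the sketch below is an assumed alternative using SciPy's SLSQP routine, not the authors' sequential Lagrange procedure described next, and it presumes the entries of \( \varvec{\mathbf {P}}_r \) lie strictly between 0 and 1 so that the weights are finite.

```python
import numpy as np
from scipy.optimize import minimize

def fit_transition(Pr, p_prev, p_next):
    """Transition matrix closest to Pr (weighted L2) with P' p_prev = p_next and rows summing to one."""
    K = Pr.shape[0]
    w = 1.0 / ((1.0 - Pr) * Pr)                   # weights w_{kappa kappa'}

    def objective(x):
        P = x.reshape(K, K)
        return np.sum(w * (P - Pr) ** 2)

    cons = [
        # marginal reproduction, with the redundant constraint for kappa = K dropped
        {"type": "eq", "fun": lambda x: (x.reshape(K, K).T @ p_prev - p_next)[:-1]},
        # each row of the transition matrix must sum to one
        {"type": "eq", "fun": lambda x: x.reshape(K, K).sum(axis=1) - 1.0},
    ]
    bounds = [(0.0, 1.0)] * (K * K)               # the inequality constraints 0 <= p(kappa|kappa')
    res = minimize(objective, Pr.ravel(), method="SLSQP", bounds=bounds, constraints=cons)
    return res.x.reshape(K, K)
```

Applied sequentially for \(t = 2, \ldots , T\), such a routine yields a full set of prior transition matrices reproducing the marginal proportion profile.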

For each \( \ t, \ \) the optimization is performed sequentially in two steps.

  1. Minimize the objective function subject to the two sets of equality constraints. This optimization can be done analytically by Lagrange optimization in dimension \( \ (K^2 + (K-1) + K) \ \), and a closed-form solution is identified. If \( \ \alpha _{\kappa ^{'}\kappa } \ \ge \ 0 \ \) for all \( \ \kappa ^{'}, \ \kappa \ \in \varOmega _{\kappa }, \ \) the solution to the optimization is identified. If \( \ \alpha _{\kappa ^{'}\kappa } \ < \ 0 \ \) for some \( \ \kappa ^{'}, \ \kappa \ \in \varOmega _{\kappa }, \ \) say \( \ \kappa ^{'}, \ \kappa \ \in \varOmega ^{-} \subset \varOmega _{\kappa }, \ \) go to Step 2.

  2. Set \( \ \alpha _{\kappa ^{'}\kappa }=0 \ \) for \( \ \kappa ^{'}, \ \kappa \ \in \varOmega ^{-}, \ \) and minimize the objective function subject to the two sets of equality constraints with respect to the remaining elements \( \ \alpha _{\kappa ^{'}\kappa }; \ \kappa ^{'}, \kappa \ \in \varOmega _{\kappa }\backslash \varOmega ^{-}. \ \) The optimization is done by Lagrange optimization in dimension \( \ ((K-\ell )^2 + (K-1) + K) \ \ \text{ with } \ \ \ell = | \varOmega ^{-} |, \ \) and a closed-form solution is identified. If \( \ \alpha _{\kappa ^{'}\kappa } \ \ge \ 0 \ \) for all \( \ \kappa ^{'}, \ \kappa \in \varOmega _{\kappa }\backslash \varOmega ^{-}, \ \) the solution of the optimization is identified; otherwise, iterate Step 2.

If the solution \( \ \varvec{\mathbf {P}}_{t-1, \ t} \ \) is identified in Step 1, the exact solution to the optimization problem is found and the inequality constraints are inactive. If, however, the solution \( \ \varvec{\mathbf {P}}_{t-1, \ t} \ \) is identified in Step 2, it is guaranteed to be a valid transition matrix reproducing the marginal pdfs, but it need not be the exact solution to the optimization problem. Lastly, no proof of the existence of a solution is currently available.


Cite this article

Moja, S.S., Asfaw, Z.G. & Omre, H. Bayesian Inversion in Hidden Markov Models with Varying Marginal Proportions. Math Geosci 51, 463–484 (2019). https://doi.org/10.1007/s11004-018-9752-z
