Information Gathering in Bayesian Networks Applied to Petroleum Prospecting

Mathematical Geosciences

Abstract

The optimal design of data acquisition is not obvious in Bayesian network models. The dependency structure may vary dramatically, which makes learning and information evaluation complicated and sometimes non-intuitive. The motivation for working on this topic is petroleum exploration, and the application of this paper is prospect selection in the North Sea. Here, the data gathering is often carried out during seasonal campaigns, and it is useful to plan the experimentation and to understand which data are likely to be most informative. Information measures are used to compare possible future observation sets. Four information measures are studied: Shannon Entropy, sum of variances, Node-wise Entropy and overall prediction error. The Shannon Entropy is commonly considered the standard measure of information, and the Node-wise Entropy measure can be interpreted as an approximation to the former. The variance measure links uncertainty and variance. The prediction error measure is tied to decision-making rules. The results lead to new insight about prospect selection. For example, the Node-wise Entropy and the variance measure behave similarly, and the optimal observation set of Shannon Entropy does not correspond to what one intuitively would consider as minimizing unknown information in this case.

Acknowledgments

This work is funded by Statistics for Innovation \((\text {sfi})^2\), one of the Norwegian Centres for Research-based Innovation. The authors thank Arne Bang Huseby for valuable comments.

Author information

Corresponding author

Correspondence to Marie Lilleborge.

Appendix: Proof of Theorem 1

Proof

Let \(\emptyset \subseteq A \subset B \subseteq L\).

For the Shannon Entropy measure, factorizing \(\mathbb {P}(X_{L {\setminus } A}|X_{A})\) with the chain rule gives

$$\begin{aligned} \mu _\mathrm{ShE}(A)&= - \mathbb {E}_{[X_{L}]} [ \log \mathbb {P}(X_{L {\setminus } A}|X_{A}) ] \\&= - \mathbb {E}_{[X_{L}]} [ \log ( \mathbb {P}(X_{L {\setminus } B}|X_{B})\mathbb {P}(X_{B {\setminus } A}|X_{A}) ) ] \\&= - \mathbb {E}_{[X_{L}]} [ \log \mathbb {P}(X_{L {\setminus } B}|X_{B}) ] - \mathbb {E}_{[X_{L}]} [ \log \mathbb {P}(X_{B {\setminus } A}|X_{A}) ] \\&\ge - \mathbb {E}_{[X_{L}]} [ \log \mathbb {P}(X_{L {\setminus } B}|X_{B}) ] \\&=\mu _\mathrm{ShE}(B), \end{aligned}$$

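where the inequality holds because the dropped term, \(- \mathbb {E}_{[X_{L}]} [ \log \mathbb {P}(X_{B {\setminus } A}|X_{A}) ]\), is the conditional entropy of \(X_{B {\setminus } A}\) given \(X_{A}\) and is therefore nonnegative, with equality if and only if the distribution \(\mathbb {P}(X_{B {\setminus } A}|X_{A})\) is trivial, that is, concentrated on a single outcome, for each assignment to \(X_{A}\).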
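
The monotonicity just shown can be checked numerically. The sketch below is illustrative only: the joint table is a randomly generated, hypothetical distribution over four binary nodes (not the North Sea network), and the helper names `entropy` and `mu_she` are introduced here. It relies on the identity \(\mu _\mathrm{ShE}(A) = H(X_{L}) - H(X_{A})\), which follows from the first line of the derivation.

```python
import itertools
import numpy as np

def entropy(p):
    """Shannon entropy (natural log) of a probability table, skipping zero cells."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def mu_she(joint, observed):
    """mu_ShE(A) = -E[log P(X_{L minus A} | X_A)] = H(X_L) - H(X_A)."""
    hidden = tuple(ax for ax in range(joint.ndim) if ax not in observed)
    marginal_a = joint.sum(axis=hidden) if hidden else joint
    return entropy(joint) - entropy(marginal_a)

# Hypothetical joint distribution over four binary nodes, L = {0, 1, 2, 3}.
rng = np.random.default_rng(0)
joint = rng.random((2, 2, 2, 2))
joint /= joint.sum()

nodes = range(joint.ndim)
subsets = [frozenset(c) for r in range(joint.ndim + 1)
           for c in itertools.combinations(nodes, r)]
values = {s: mu_she(joint, s) for s in subsets}

# Check mu_ShE(A) >= mu_ShE(B) for every strictly nested pair A within B.
monotone = all(values[a] >= values[b] - 1e-12
               for a in subsets for b in subsets if a < b)
print(monotone)
```

Observing nothing leaves the full joint entropy, and observing every node leaves zero, so the two extreme subsets bracket all other values.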

Observe that \(f_\mathrm{NwE}\) and \(f_\mathrm{Var}\) are strictly concave, since

$$\begin{aligned} f''_\mathrm{NwE}=\frac{-1}{p(1-p)} \quad \text { and }\quad f''_\mathrm{Var}=-2 \quad \forall p \in (0,1), \end{aligned}$$

while \(f_\mathrm{PrE}\) is concave on \(\left[ 0,1 \right] \) and linear on \(\left[ 0,1/2 \right] \) and on \(\left[ 1/2,1\right] \). Fix an \(i \in L\), and assume a measure term of the form

$$\begin{aligned} \mu ^{i}(B) = \mathbb {E}_{[X_{B}]} f\left( \mathbb {P}(X_{i}=1|X_{B}) \right) , \end{aligned}$$

for a concave function \(f:[0,1] \rightarrow \mathbb {R}\) with \(f^{-1}(0)= \{0,1\}\). For a given assignment \(X_{A}=x_{A}\) to the random variables in A, \(\mathbb {P}(X_{i}=1|X_{B})\) is a function of \(X_{B {\setminus } A}\) and thus a random variable, whose conditional expectation given \(X_{A}=x_{A}\) is \(\mathbb {P}(X_{i}=1|X_{A})\) by the law of total probability. By Jensen’s inequality,

$$\begin{aligned} f(\mathbb {P}(X_{i}=1|X_{A})) \ge \mathbb {E}_{[X_{B{\setminus } A}|X_{A}]}f(\mathbb {P}(X_{i}=1|X_{B})), \end{aligned}$$

with equality if and only if f is linear on

$$\begin{aligned} \left[ \min _{x_{B {\setminus } A}} \{\mathbb {P}(X_{i}=1|X_{B})\}, \max _{x_{B {\setminus } A}} \{\mathbb {P}(X_{i}=1|X_{B})\}\right] . \end{aligned}$$

If \(i \in B\), then \(\mathbb {P}(X_{i}=1|X_{B}) \in \{0,1\}\), so the right-hand side of the inequality is zero, and there is equality if and only if \(\mathbb {P}(X_{i}|X_{A})\) is trivial as well. If \(i \in L {\setminus } B\) and f is strictly concave, the inequality is strict unless

$$\begin{aligned} \mathbb {P}(X_{i}=1|X_{B}) \equiv \mathbb {P}(X_{i}=1|X_{A}). \end{aligned}$$
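
A small worked case may help show why strict concavity matters. The numbers are hypothetical and chosen only for illustration: suppose that, given \(X_{A}=x_{A}\), the probability \(\mathbb {P}(X_{i}=1|X_{B})\) takes the values 0.6 and 0.8 with probability 1/2 each, so that \(\mathbb {P}(X_{i}=1|X_{A})=0.7\). Assuming \(f_\mathrm{Var}(p)=p(1-p)\) and \(f_\mathrm{PrE}(p)=\min \{p,1-p\}\), forms that are consistent with the properties stated above but not restated in this appendix,

$$\begin{aligned} f_\mathrm{Var}(0.7)&= 0.21 > \tfrac{1}{2}(0.24 + 0.16) = 0.20, \\ f_\mathrm{PrE}(0.7)&= 0.30 = \tfrac{1}{2}(0.40 + 0.20) = 0.30. \end{aligned}$$

The variance term decreases strictly, while the prediction error term is unchanged because both values lie in \(\left[ 1/2,1\right] \), where \(f_\mathrm{PrE}\) is linear.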

Since the assignment \(X_{A}=x_{A}\) was arbitrary,

$$\begin{aligned} \mathbb {E}_{[X_{A}]} \left[ f(\mathbb {P}(X_{i}=1|X_{A})) \right] \ge \mathbb {E}_{[X_{B}]} \left[ f(\mathbb {P}(X_{i}=1|X_{B})) \right] , \end{aligned}$$

and the claims follow. \(\square \)
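
The node-wise part of the argument can also be checked numerically. The following is a minimal sketch under stated assumptions: the joint table is a hypothetical random distribution over four binary nodes, the names `f_nwe`, `f_var`, `f_pre` and `mu_term` are introduced here, and the three functions are assumed to be the binary entropy, \(p(1-p)\) and \(\min \{p,1-p\}\), which match the second derivatives and zeros used in the proof.

```python
import itertools
import math
import numpy as np

# Assumed forms of the three concave functions (not restated in this appendix);
# each satisfies f(0) = f(1) = 0 as required in the proof.
def f_nwe(p):
    return -sum(t * math.log(t) for t in (p, 1.0 - p) if t > 0)

def f_var(p):
    return p * (1.0 - p)

def f_pre(p):
    return min(p, 1.0 - p)

def mu_term(joint, i, observed, f):
    """mu^i(B) = E_{X_B}[ f(P(X_i = 1 | X_B)) ] for a joint table of binary nodes."""
    hidden = tuple(ax for ax in range(joint.ndim) if ax not in observed)
    # Zero out the X_i = 0 slice so that marginalising gives P(X_B, X_i = 1).
    shape = [1] * joint.ndim
    shape[i] = 2
    joint_i1 = joint * np.array([0.0, 1.0]).reshape(shape)
    marg = lambda table: table.sum(axis=hidden) if hidden else table
    p_b = np.atleast_1d(marg(joint)).ravel()
    p_b_i1 = np.atleast_1d(marg(joint_i1)).ravel()
    return sum(w * f(n / w) for w, n in zip(p_b, p_b_i1) if w > 0)

# Hypothetical joint distribution over four binary nodes, L = {0, 1, 2, 3}.
rng = np.random.default_rng(1)
joint = rng.random((2, 2, 2, 2))
joint /= joint.sum()

nodes = range(joint.ndim)
subsets = [frozenset(c) for r in range(joint.ndim + 1)
           for c in itertools.combinations(nodes, r)]

# Count nested pairs A within B for which mu^i(A) < mu^i(B); none are expected.
violations = 0
for f in (f_nwe, f_var, f_pre):
    for i in nodes:
        mu = {s: mu_term(joint, i, s, f) for s in subsets}
        violations += sum(mu[a] < mu[b] - 1e-12
                          for a in subsets for b in subsets if a < b)
print(violations)
```

Because the check runs per node \(i\), it mirrors the structure of the proof; any measure built as a sum of such terms over nodes would inherit the same monotonicity.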

About this article

Cite this article

Lilleborge, M., Hauge, R. & Eidsvik, J. Information Gathering in Bayesian Networks Applied to Petroleum Prospecting. Math Geosci 48, 233–257 (2016). https://doi.org/10.1007/s11004-015-9616-8

  • DOI: https://doi.org/10.1007/s11004-015-9616-8
