Semi-Supervised Multi-Facies Object Retrieval in Seismic Data


Abstract

Characterizing buried sedimentary structures through the use of seismic data is part of many geoscientific projects. The evolution of seismic acquisition and processing capabilities has made it possible to acquire ever-growing amounts of data, increasing the image resolution so that sedimentary objects (geobodies) can be imaged with greater precision within sedimentary layers. However, exploring and interpreting them in large datasets can be tedious work. Recent practice has shown the potential of automated methods to assist interpreters in this task. In this paper, a new semi-supervised methodology is presented for identifying multi-facies geobodies in three-dimensional seismic data, while preserving their internal facies variability and keeping track of the input uncertainty. The approach couples a nonlinear data-driven method with a novel supervised learning method. It requires a prior delineation of the geobodies on a few seismic images, along with an a priori confidence in that delineation. The methodology relies on learning an appropriate data representation, and propagates the prior confidence to posterior probabilities attached to the final delineation. The proposed methodology was applied to three-dimensional real data, showing consistently effective retrieval of the targeted multi-facies geobodies (mass-transport deposits in the present case).


Acknowledgements

The authors are grateful to the CGG Houston office for the provision of and permission to publish data, and to Karine Labat for proofreading the article.

Corresponding author

Correspondence to Pauline Le Bouteiller.

Appendix A: Definitions and Proofs

A.1 Definition of the GLCM Features and the Envelope-Based Feature Used for Initial Data Representation (Sect. 2.1.1)

The two-dimensional GLCM contrast, correlation, energy and homogeneity features are defined below

$$\begin{aligned} \hbox {contrast}&= \frac{\sum _{i,j} (i-j)^2 M_{ij}}{(n-1)^2},\\ \hbox {correlation}&= \frac{1}{2}\left( \sum _{i,j} \frac{(i - \mu _{i})(j - \mu _{j})M_{ij}}{\sigma _{i} \sigma _{j}} + 1\right) ,\\ \hbox {energy}&= \sum _{i,j} M_{ij}^2,\\ \hbox {homogeneity}&= \sum _{i,j} \frac{M_{ij}}{1 + |i-j|}, \end{aligned}$$

where \(M_{ij}\) is the \((i,j)\) component of the GLCM matrix M computed for a specific vector; the sums run over all pairs of gray levels (i, j), that is, over all entries of M; and n is the number of gray levels used (\(n^2\) is the number of elements in M).
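
As an illustration only (this is not the authors' implementation), these four quantities can be computed from a normalized co-occurrence matrix with a few lines of NumPy; the matrix M itself can be obtained, for instance, with skimage.feature.graycomatrix.

```python
# Illustrative sketch: the four GLCM features defined above, computed from a
# co-occurrence matrix M normalized so that its entries sum to 1.
import numpy as np

def glcm_features(M):
    """M: (n, n) normalized gray-level co-occurrence matrix for one GLCM vector."""
    n = M.shape[0]
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    mu_i, mu_j = (i * M).sum(), (j * M).sum()          # marginal means
    sigma_i = np.sqrt((((i - mu_i) ** 2) * M).sum())   # marginal standard deviations
    sigma_j = np.sqrt((((j - mu_j) ** 2) * M).sum())
    contrast = (((i - j) ** 2) * M).sum() / (n - 1) ** 2
    correlation = 0.5 * ((((i - mu_i) * (j - mu_j) * M).sum() / (sigma_i * sigma_j)) + 1)
    energy = (M ** 2).sum()
    homogeneity = (M / (1.0 + np.abs(i - j))).sum()
    return contrast, correlation, energy, homogeneity
```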

The envelope-based feature \(f_\mathrm{{env}}\) is computed from the envelope-seismic image \(S_\mathrm{{env}}\), in which every trace has been replaced by its envelope. Denoting by \(G^{\sigma = 3}_{2D}\) the two-dimensional Gaussian filtering with standard deviation \(\sigma = 3\) pixels, and by \(q_{95}\) the 0.95 quantile of a set of values, the computation of the envelope-based feature reduces to

$$\begin{aligned} f_\mathrm{{temp}}&= G^{\sigma = 3}_{2D}(S_\mathrm{{env}}),\\ \text { and }f_\mathrm{{env}}&= \frac{f_\mathrm{{temp}}}{q_{95}(f_\mathrm{{temp}})}. \end{aligned}$$

All features thus take values approximately in the range [0, 1].
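
A minimal sketch of this computation, assuming the envelope is taken as the modulus of the analytic signal of each trace (the SciPy-based implementation below is illustrative, not the authors' code):

```python
# Illustrative sketch: envelope-based feature f_env of Appendix A.1.
import numpy as np
from scipy.signal import hilbert
from scipy.ndimage import gaussian_filter

def envelope_feature(section):
    """section: 2D seismic image, time samples along axis 0 (one trace per column)."""
    s_env = np.abs(hilbert(section, axis=0))    # replace every trace by its envelope
    f_temp = gaussian_filter(s_env, sigma=3)    # 2D Gaussian smoothing, sigma = 3 pixels
    return f_temp / np.quantile(f_temp, 0.95)   # normalize by the 0.95 quantile
```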

A.2 Proof for Posterior Probability Assignment

This section gives a proof for the method presented in Sect. 2.2. Let us recall that here the goal is to minimize the error \(E_i\) associated with each cluster \(C_i\), thus obtaining \(y_i^\mathrm{{opt}}\), the optimized posterior probability associated with \(C_i\). With \(N_i\) as the number of points in \(C_i\) and \(p_{k,i} = p(s_{k}|C_{i})\) as the proportion of points in \(C_{i}\) whose prior probability value is \(s_{k}\), one can write

$$\begin{aligned} E_{i}&= N_i \sum _{k \in 1:n} f(s_{k},y_{i})\ p_{k,i},\\ \text { so that }\frac{E_{i}}{N_i}&= \sum _{k \in 1:n, y_{i} \le s_{k}}(s_{k} - y_{i})\ p_{k,i} + \alpha \sum _{k \in 1:n, y_{i} > s_{k}}(y_{i} - s_{k})\ p_{k,i}. \end{aligned}$$

Let i now be fixed; write E for \(\frac{E_{i}}{N_i}\) and omit the index i in the rest of this section. One may now study the variations of E with respect to y in order to find the value \(y^\mathrm{{opt}}\) which minimizes E.
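
For illustration, the normalized error E can be evaluated numerically as follows (a minimal sketch; the arrays s and p stand for the prior values \(s_k\) and the proportions \(p_k\) of the current cluster, and alpha for \(\alpha\)):

```python
# Illustrative sketch: normalized cluster error E(y) = E_i / N_i for a candidate
# posterior value y, given prior values s_1 <= ... <= s_n, within-cluster
# proportions p_1, ..., p_n and the asymmetry parameter alpha.
import numpy as np

def cluster_error(y, s, p, alpha):
    s, p = np.asarray(s, dtype=float), np.asarray(p, dtype=float)
    under = s >= y   # terms with y <= s_k (under-estimation, weight 1)
    over = ~under    # terms with y >  s_k (over-estimation, weight alpha)
    return ((s[under] - y) * p[under]).sum() + alpha * ((y - s[over]) * p[over]).sum()
```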

A.2.1 Variations of E Outside [0, 1]

If \(y < s_1 = 0\), then: \( E(y) = \sum _{k \in 1:n}(s_{k} - y)\ p_{k}\), and every term of this sum decreases when y increases up to \(s_1\); in other words: E is a strictly decreasing function of y for \(y < s_1\).

If \(y > s_n = 1\), then: \( E(y) = \alpha \sum _{k \in 1:n}(y - s_{k})\ p_{k}\), and every term of this sum decreases when y decreases down to \(s_n\); in other words: E is a strictly increasing function of y for \(y > s_n\).

Consequently (as expected), \(y^\mathrm{{opt}} \in [s_{1}, s_{n}] = [0, 1]\).

A.2.2 Variations of E Inside [0, 1]

Let \(k \in 1:n-1\), and let y and \(d > 0\) be such that \(s_{k} \le y < y + d \le s_{k+1}\). Then one can note that

$$\begin{aligned} E(y + d)&= \alpha \sum _{j \in 1:k} (y + d - s_{j})\ p_{j} + \sum _{j \in k+1:n}(s_{j} - y - d)\ p_{j}\\&= E(y) + \alpha \sum _{j \in 1:k} d\ p_{j} + \sum _{j \in k+1:n} (-d)\ p_{j}\\&= E(y) + d\left( \alpha \sum _{j \in 1:k}p_{j} - \sum _{j \in k+1:n}p_{j}\right) \\&= E(y) + d \left( \alpha \sum _{j \in 1:k}p_{j} - \sum _{j \in k+1:n}p_{j} - \sum _{j \in 1:k}p_{j} + \sum _{j \in 1:k}p_{j}\right) . \end{aligned}$$

As \(\sum _{j \in 1:n}p_{j} = 1\) (sum of proportions), one can then write (\(d > 0\)) that

$$\begin{aligned} \frac{E(y + d) - E(y)}{d} = (\alpha + 1) \sum _{j \in 1:k}p_{j} - 1. \end{aligned}$$

Thus, the variation of E does not depend on y on \([s_k, s_{k+1}]\): E is monotonic on this interval. More precisely

$$\begin{aligned} \begin{array}{ll} E \text { is strictly decreasing on } [s_{k}, s_{k+1}] &{} \quad \text {if and only if } \sum _{j \in 1:k}p_{j} < \frac{1}{\alpha + 1};\\ E \text { is strictly increasing on } [s_{k}, s_{k+1}] &{} \quad \text {if and only if } \sum _{j \in 1:k}p_{j} > \frac{1}{\alpha + 1};\\ E \text { is constant on } [s_{k}, s_{k+1}] &{} \quad \text {if and only if } \sum _{j \in 1:k}p_{j} = \frac{1}{\alpha + 1}. \end{array} \end{aligned}$$
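
For instance, with \(n = 2\) prior values \(s = (0, 1)\) and \(\alpha = 1\) (so that \(\frac{1}{\alpha + 1} = 0.5\)): if \(p_{1} = 0.3 < 0.5\), E is strictly decreasing on [0, 1], whereas if \(p_{1} = 0.7 > 0.5\), E is strictly increasing on [0, 1].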

A.2.3 Different Cases According to the Distribution of p

If \(p_{1} > \frac{1}{\alpha + 1}\), then E is strictly increasing on \([s_{1}, s_{2}]\), and, since the partial sums \(\sum _{j \in 1:k}p_{j}\) can only grow with k, on the whole of \([s_{1}, s_{n}]\). E being strictly decreasing for \(y < s_1\) (see above), E reaches its minimum at \(y^\mathrm{{opt}} = s_1\) (i.e., 0).

If \(p_{1} \le \frac{1}{\alpha + 1}\), then one can define q as follows

$$\begin{aligned} q = \max \left\{ k \in 1:n \text { such that } \sum _{j \in 1:k}p_{j} \le \frac{1}{\alpha + 1}\right\} . \end{aligned}$$

There are now two cases:

  • Case 1: \(\sum _{j \in 1:q}p_{j} = \frac{1}{\alpha + 1}\). Here, the sub-case \(q = n\) corresponds to the case \(\alpha = 0\). The optimal value is then \(y^\mathrm{{opt}} = 1\); note that this value would be obtained for every cluster considered, so taking \(\alpha = 0\) is not useful. In the sub-case \(q < n\), E is constant on \([s_{q}, s_{q+1}]\), and \(y^\mathrm{{opt}}\) can take any value in \([s_{q}, s_{q+1}]\); here it is taken as \(y^\mathrm{{opt}} = s_{q}\) (for consistency with the sub-case \(q = n\)).

  • Case 2: \(\sum _{j \in 1:q}p_{j} < \frac{1}{\alpha + 1}\). Here, the sub-case \(q = n\) corresponds to a function E strictly decreasing on \([s_{1}, s_{n}]\); E being strictly increasing for \(y > s_{n}\) (see above), E is minimum for \(y^\mathrm{{opt}} = s_{n} = 1\). In the sub-case \(q < n\), E is strictly decreasing on \([s_{q}, s_{q+1}]\) and strictly increasing on \([s_{q+1}, s_{q+2}]\) (by the definition of q), so E is minimum for \(y^\mathrm{{opt}} = s_{q+1}\). A short code sketch gathering these cases is given below.
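
The case analysis above can be gathered into a short function (an illustrative sketch only, using zero-based indexing and assuming s is sorted with \(s_{1} = 0\) and \(s_{n} = 1\)):

```python
# Illustrative sketch: posterior value y_opt minimizing the cluster error,
# following the case analysis of Appendix A.2.3 (zero-based indexing).
import numpy as np

def optimal_posterior(s, p, alpha):
    s, p = np.asarray(s, dtype=float), np.asarray(p, dtype=float)
    thresh = 1.0 / (alpha + 1.0)
    csum = np.cumsum(p)                        # partial sums of the proportions
    if csum[0] > thresh:                       # p_1 > 1/(alpha+1): E increasing on [s_1, s_n]
        return s[0]
    q = np.flatnonzero(csum <= thresh).max()   # q = max{k : sum_{1:k} p_j <= 1/(alpha+1)}
    if np.isclose(csum[q], thresh) or q == len(s) - 1:
        return s[q]                            # case 1, or E decreasing up to s_n
    return s[q + 1]                            # case 2 with q < n
```

For example, with prior values \(s = (0, 0.5, 1)\), proportions \(p = (0.2, 0.3, 0.5)\) and \(\alpha = 1\), the partial sums are (0.2, 0.5, 1), so \(q = 2\) with \(\sum _{j \in 1:2}p_{j} = \frac{1}{\alpha + 1}\) (case 1), and the function returns \(y^\mathrm{{opt}} = s_{2} = 0.5\).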

A.3 Results of the Dimension Reduction

Figure 10 shows the results of the FS dimension reduction and its relationship with the principal components space, as explained in Sect. 3.2.

The alignment of features of the same cluster along a line passing through the origin of the graph confirms that the clusters were formed according to the positive and negative correlations of features. It also ensures that at least one feature was selected from each group of correlated features.

Fig. 10 GLCM feature selection. a Case of the full-stack training dataset: GLCM features (blue X; c, d give the legend of the numbered labels) in the first PCs space. Blue thick lines show the groups of features created by the feature selection method in this study, with representative features of each group in red; b representative features selected in each scale for each dataset; c legend for feature numbers of scale 1; d legend for feature numbers of scale 2; e legend for orientation of GLCM vectors

For scale 1, there were four possible orientations; one feature per orientation was selected for all datasets except the far-offset dataset. For scale 2, there were eight possible orientations. For the full-stack and near-offset datasets, one feature per typical orientation was selected, except for orientations [0, 2] and [2, 0]; similarly for the mid-offset dataset, orientations [0, 2] and [2, 1] were not selected. For the far-offset dataset, two features of the same orientation were selected, so that three orientations were not represented in the selected feature set: [2, 0], [2, 2] and \([1,-2]\).

A.4 Detailed Results on EM

The results presented in Table 3 provide detailed support for Fig. 6c.

Table 3 Error metric EM compared between training images (TI) and validation images (VI) for several datasets, dimension reduction (DR) methods and \(\alpha \) values

Cite this article

Le Bouteiller, P., Charléty, J. Semi-Supervised Multi-Facies Object Retrieval in Seismic Data. Math Geosci 52, 817–846 (2020). https://doi.org/10.1007/s11004-019-09822-8
