Semi-Supervised Multi-Facies Object Retrieval in Seismic Data


Characterizing buried sedimentary structures through the use of seismic data is part of many geoscientific projects. The evolution of seismic acquisition and processing capabilities has made it possible to acquire ever-growing amounts of data, increasing the image resolution so that sedimentary objects (geobodies) can be imaged with greater precision within sedimentary layers. However, exploring and interpreting them in large datasets can be tedious work. Recent practice has shown the potential of automated methods to assist interpreters in this task. In this paper, a new semi-supervised methodology is presented for identifying multi-facies geobodies in three-dimensional seismic data, while preserving their internal facies variability and keeping track of the input uncertainty. The approach couples a nonlinear data-driven method with a novel supervised learning method. It requires a prior delineation of the geobodies on a few seismic images, along with an a priori confidence in that delineation. The methodology relies on learning an appropriate data representation, and propagates the prior confidence to posterior probabilities attached to the final delineation. The proposed methodology was applied to three-dimensional real data, showing consistently effective retrieval of the targeted multi-facies geobodies (mass-transport deposits in the present case).






The authors are grateful to the CGG Houston office for the provision of and permission to publish data, and to Karine Labat for proofreading the article.

Author information



Corresponding author

Correspondence to Pauline Le Bouteiller.

Appendix A: Definitions and Proofs


A.1 Definition of the GLCM Features and the Envelope-Based Feature Used for Initial Data Representation (Sect. 2.1.1)

The two-dimensional GLCM contrast, correlation, energy and homogeneity features are defined as follows:

$$\begin{aligned} \hbox {contrast}&= \frac{\sum _{i,j} (i-j)^2 M_{ij}}{(n-1)^2},\\ \hbox {correlation}&= \frac{1}{2}\left( \sum _{i,j} \frac{(i - \mu _{i})(j - \mu _{j})M_{ij}}{\sigma _{i} \sigma _{j}} + 1\right) ,\\ \hbox {energy}&= \sum _{i,j} M_{ij}^2,\\ \hbox {homogeneity}&= \sum _{i,j} \frac{M_{ij}}{1 + |i-j|}, \end{aligned}$$

where \(M_{ij}\) is the \((i, j)\) component of the GLCM M computed for a specific displacement vector; the sums run over all index pairs \((i, j)\) of M; and n is the number of gray levels used (\(n^2\) is the number of elements in M).
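As an illustration of the four definitions above, the following is a minimal NumPy sketch, not the authors' implementation. The function names `glcm` and `glcm_features` are hypothetical. Two points are assumptions, because the appendix does not state them: the matrix is normalized to sum to one with pairs counted in one direction only, and \(\mu\) and \(\sigma\) are taken as the means and standard deviations of the row and column marginals of M.

```python
import numpy as np


def glcm(image, offset, n_levels):
    """Normalized co-occurrence matrix M for one displacement vector.

    `image` holds integer gray levels in [0, n_levels); `offset` is
    (d_row, d_col). Assumption: pairs are counted in one direction only.
    """
    d_row, d_col = offset
    rows, cols = image.shape
    # Two cropped views: second[r, c] is the pixel located at first[r, c] + offset.
    first = image[max(0, -d_row):rows - max(0, d_row),
                  max(0, -d_col):cols - max(0, d_col)]
    second = image[max(0, d_row):rows - max(0, -d_row),
                   max(0, d_col):cols - max(0, -d_col)]
    m = np.zeros((n_levels, n_levels))
    np.add.at(m, (first.ravel(), second.ravel()), 1.0)  # count gray-level pairs
    return m / m.sum()


def glcm_features(m):
    """Contrast, correlation, energy and homogeneity as defined in A.1."""
    n = m.shape[0]
    i, j = np.indices(m.shape)
    # Assumption: means and standard deviations of the row/column marginals.
    mu_i, mu_j = (i * m).sum(), (j * m).sum()
    sigma_i = np.sqrt(((i - mu_i) ** 2 * m).sum())
    sigma_j = np.sqrt(((j - mu_j) ** 2 * m).sum())
    covariance = ((i - mu_i) * (j - mu_j) * m).sum()
    return {
        "contrast": ((i - j) ** 2 * m).sum() / (n - 1) ** 2,
        # Undefined (division by zero) for a constant image.
        "correlation": 0.5 * (covariance / (sigma_i * sigma_j) + 1.0),
        "energy": (m ** 2).sum(),
        "homogeneity": (m / (1.0 + np.abs(i - j))).sum(),
    }


# Usage on a 4 x 4 two-level checkerboard with a one-pixel horizontal offset.
checkerboard = np.indices((4, 4)).sum(axis=0) % 2
features = glcm_features(glcm(checkerboard, (0, 1), n_levels=2))
```

On this checkerboard every horizontal neighbor pair changes gray level, so the contrast is maximal and the rescaled correlation sits at the bottom of its range.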

The envelope-based feature \(f_\mathrm{{env}}\) is computed from the envelope seismic image \(S_\mathrm{{env}}\), in which every trace has been replaced by its envelope. Denoting \(G^{\sigma = 3}_{2D}\) as the two-dimensional Gaussian filtering with standard deviation \(\sigma = 3\) pixels, and \(q_{95}\) as the 0.95 quantile of a set of values, the computation of the envelope-based feature can be summarized as

$$\begin{aligned} f_\mathrm{{temp}}&= G^{\sigma = 3}_{2D}(S_\mathrm{{env}}),\\ \text { and }f_\mathrm{{env}}&= \frac{f_\mathrm{{temp}}}{q_{95}(f_\mathrm{{temp}})}. \end{aligned}$$

All features are in a range close to [0, 1].
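The envelope-based feature can be sketched in a few lines with SciPy. This is an illustrative sketch under stated assumptions rather than the authors' code: the envelope is taken as the magnitude of the analytic signal obtained with `scipy.signal.hilbert`, traces are assumed to run along axis 0 (the `trace_axis` parameter is hypothetical), and the Gaussian filter uses SciPy's default boundary handling, which the appendix does not specify.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.signal import hilbert


def envelope_feature(seismic, trace_axis=0, sigma=3.0):
    """f_env = G(S_env) / q95(G(S_env)) for a two-dimensional seismic image."""
    # Assumption: envelope = magnitude of the analytic signal of each trace.
    s_env = np.abs(hilbert(seismic, axis=trace_axis))
    # Two-dimensional Gaussian smoothing with a standard deviation of 3 pixels.
    f_temp = gaussian_filter(s_env, sigma=sigma)
    # Scale so that the 0.95 quantile of the feature equals 1.
    return f_temp / np.quantile(f_temp, 0.95)


# Usage on a synthetic image (random values standing in for seismic amplitudes).
rng = np.random.default_rng(0)
synthetic = rng.standard_normal((64, 32))
f_env = envelope_feature(synthetic)
```

Dividing by the 0.95 quantile rather than by the maximum keeps a few very strong amplitudes from compressing the rest of the feature, at the cost of letting about 5% of the values exceed 1.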

A.2 Proof for Posterior Probability Assignment

This section gives a proof for the method presented in Sect. 2.2. Let us recall that here the goal is to minimize the error \(E_i\) associated with each cluster \(C_i\), thus obtaining \(y_i^\mathrm{{opt}}\), the optimized posterior probability associated with \(C_i\). With \(N_i\) as the number of points in \(C_i\) and \(p_{k,i} = p(s_{k}|C_{i})\) as the proportion of points in \(C_{i}\) whose prior probability value is \(s_{k}\), one can write

$$\begin{aligned} E_{i}&= N_i \sum _{k \in 1:n} f(s_{k},y_{i})\ p_{k,i},\\ \text { so that }\frac{E_{i}}{N_i}&= \sum _{k \in 1:n, y_{i} \le s_{k}}(s_{k} - y_{i})\ p_{k,i} + \alpha \sum _{k \in 1:n, y_{i} > s_{k}}(y_{i} - s_{k})\ p_{k,i}. \end{aligned}$$

Let us assume i is fixed, write E in the following for \(\frac{E_{i}}{N_i}\), and omit the i index in the rest of this section. One may now study the variations of E with respect to y in order to find \(y^\mathrm{{opt}}\) which minimizes E.
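To make the normalized error concrete, here is a small pure-Python sketch of \(E(y)\) for a single cluster, scanned over a grid of candidate values. The function name `mean_cluster_error` and the example numbers are hypothetical: only \(s_1 = 0\) and \(s_n = 1\) come from the text, while the intermediate level 0.5, the proportions and \(\alpha = 2\) are chosen for illustration.

```python
def mean_cluster_error(y, levels, proportions, alpha):
    """E(y) = E_i / N_i for one cluster.

    `levels` are the prior probability values s_1 < ... < s_n and
    `proportions` the matching p_k, which sum to one.
    """
    total = 0.0
    for s, p in zip(levels, proportions):
        if y <= s:
            total += (s - y) * p          # y underestimates the prior s
        else:
            total += alpha * (y - s) * p  # y overestimates s, weighted by alpha
    return total


# Hypothetical cluster: three prior levels with their proportions.
levels = [0.0, 0.5, 1.0]
proportions = [0.2, 0.3, 0.5]
alpha = 2.0

# Brute-force scan of y, deliberately extending outside [0, 1].
candidates = [k / 100 for k in range(-50, 151)]
best_y = min(candidates,
             key=lambda y: mean_cluster_error(y, levels, proportions, alpha))
```

For these illustrative values the scan settles on one of the prior levels, which is what the analysis in the rest of this section establishes in general.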

A.2.1 Variations of E Outside [0, 1]

If \(y < s_1 = 0\), then \( E(y) = \sum _{k \in 1:n}(s_{k} - y)\ p_{k}\), and every term of this sum decreases as y increases up to \(s_1\); in other words, E is a strictly decreasing function of y for \(y < s_1\).

If \(y > s_n = 1\), then \( E(y) = \alpha \sum _{k \in 1:n}(y - s_{k})\ p_{k}\), and every term of this sum decreases as y decreases down to \(s_n\); in other words, E is a strictly increasing function of y for \(y > s_n\).

Consequently (as expected), \(y^\mathrm{{opt}} \in [s_{1}, s_{n}] = [0, 1]\).

A.2.2 Variations of E Inside [0, 1]

Let \(k \in 1:n-1\), and let y and \(d > 0\) be such that \(s_{k} \le y < y + d \le s_{k+1}\). Then one can note that

$$\begin{aligned} E(y + d)&= \alpha \sum _{j \in 1:k} (y + d - s_{j})\ p_{j} + \sum _{j \in k+1:n}(s_{j} - y - d)\ p_{j}\\&= E(y) + \alpha \sum _{j \in 1:k} d\ p_{j} + \sum _{j \in k+1:n} (-d)\ p_{j}\\&= E(y) + d\left( \alpha \sum _{j \in 1:k}p_{j} - \sum _{j \in k+1:n}p_{j}\right) \\&= E(y) + d \left( \alpha \sum _{j \in 1:k}p_{j} - \sum _{j \in k+1:n}p_{j} - \sum _{j \in 1:k}p_{j} + \sum _{j \in 1:k}p_{j}\right) . \end{aligned}$$

As \(\sum _{j \in 1:n}p_{j} = 1\) (sum of proportions), one can then write (\(d > 0\)) that

$$\begin{aligned} \frac{E(y + d) - E(y)}{d} = (\alpha + 1) \sum _{j \in 1:k}p_{j} - 1. \end{aligned}$$

Thus, the rate of change of E does not depend on y on \([s_k, s_{k+1}]\): E is affine, and therefore monotonic, on this interval. More precisely,

$$\begin{aligned} \begin{array}{ll} E \text { is strictly decreasing on } [s_{k}, s_{k+1}] &{} \quad \text {if and only if } \sum _{j \in 1:k}p_{j} < \frac{1}{\alpha + 1};\\ E \text { is strictly increasing on } [s_{k}, s_{k+1}] &{} \quad \text {if and only if } \sum _{j \in 1:k}p_{j} > \frac{1}{\alpha + 1};\\ E \text { is constant on } [s_{k}, s_{k+1}] &{} \quad \text {if and only if } \sum _{j \in 1:k}p_{j} = \frac{1}{\alpha + 1}. \end{array} \end{aligned}$$
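The three conditions above only involve the cumulative proportions, so the slope of E on each interval can be tabulated directly. The sketch below is illustrative; the function name `interval_slopes` and the example proportions are hypothetical.

```python
from itertools import accumulate


def interval_slopes(proportions, alpha):
    """Slope of E on each interval [s_k, s_{k+1}], for k = 1..n-1.

    Each slope equals (alpha + 1) * sum_{j <= k} p_j - 1, so its sign
    tells whether E decreases, stays constant or increases there.
    """
    cumulative = list(accumulate(proportions))
    # The last cumulative sum (k = n) has no interval to its right.
    return [(alpha + 1.0) * c - 1.0 for c in cumulative[:-1]]


# Hypothetical cluster with three prior levels and alpha = 2.
slopes = interval_slopes([0.2, 0.3, 0.5], alpha=2.0)
```

With these values the slope is close to -0.4 on the first interval and 0.5 on the second, so E decreases and then increases. Because the cumulative sums never decrease with k, the slopes can change sign at most once, from negative to positive.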

Different Cases According to the Distribution of p

If \(p_{1} > \frac{1}{\alpha + 1}\), then E is strictly increasing on \([s_{1}, s_{2}]\), and therefore also on \([s_{1}, s_{n}]\), since the cumulative sums \(\sum _{j \in 1:k}p_{j}\) cannot decrease with k. E being strictly decreasing for \(y < s_1\) (see above), E reaches its minimum at \(y^\mathrm{{opt}} = s_1\) (i.e. 0).

If \(p_{1} \le \frac{1}{\alpha + 1}\), then one can define q as follows

$$\begin{aligned} q = \max \left\{ k \in 1:n \text { such that } \sum _{j \in 1:k}p_{j} \le \frac{1}{\alpha + 1}\right\} . \end{aligned}$$

There are now two cases:

  • Case 1: \(\sum _{j \in 1:q}p_{j} = \frac{1}{\alpha + 1}\). The sub-case \(q = n\) occurs only when \(\alpha = 0\). The optimal value is then \(y^\mathrm{{opt}} = 1\); since this value is obtained for every cluster, whatever its content, taking \(\alpha = 0\) is of no practical interest. In the sub-case \(q < n\), E is constant on \([s_{q}, s_{q+1}]\), and \(y^\mathrm{{opt}}\) can take any value in \([s_{q}, s_{q+1}]\). Here it is taken as \(y^\mathrm{{opt}} = s_{q}\) (for consistency with the sub-case \(q = n\)).

  • Case 2: \(\sum _{j \in 1:q}p_{j} < \frac{1}{\alpha + 1}\). In the sub-case \(q = n\), E is strictly decreasing on \([s_{1}, s_{n}]\); E being strictly increasing for \(y > s_{n}\) (see above), E is minimum for \(y^\mathrm{{opt}} = s_{n} = 1\). In the sub-case \(q < n\), E is strictly decreasing on \([s_{q}, s_{q+1}]\) and strictly increasing on \([s_{q+1}, s_{q+2}]\) (from the definition of q), so E is minimum for \(y^\mathrm{{opt}} = s_{q+1}\).
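The case analysis of this section reduces to a short decision rule on the cumulative proportions. The following pure-Python sketch is one possible reading of that rule, not the authors' code. The function name `optimal_posterior` is hypothetical, and the tolerance `tol` used to detect the equality of Case 1 in floating-point arithmetic is an assumption that does not appear in the proof.

```python
from itertools import accumulate


def optimal_posterior(levels, proportions, alpha, tol=1e-12):
    """Return y_opt for one cluster.

    `levels` are the prior values s_1 = 0 < ... < s_n = 1 and `proportions`
    the matching p_k. `tol` is an assumed floating-point tolerance.
    """
    threshold = 1.0 / (alpha + 1.0)
    if proportions[0] > threshold + tol:
        return levels[0]  # E increases from s_1 onwards: y_opt = s_1
    cumulative = list(accumulate(proportions))
    # q: largest 1-based index whose cumulative proportion is <= threshold.
    q = max(k for k, c in enumerate(cumulative, start=1) if c <= threshold + tol)
    is_tie = abs(cumulative[q - 1] - threshold) <= tol
    if is_tie or q == len(levels):
        return levels[q - 1]  # Case 1, or sub-case q = n: y_opt = s_q
    return levels[q]          # Case 2 with q < n: y_opt = s_{q+1}


# Hypothetical cluster: mostly high prior confidence, alpha = 2.
y_opt = optimal_posterior([0.0, 0.5, 1.0], [0.2, 0.3, 0.5], alpha=2.0)
```

Larger values of \(\alpha\) lower the threshold \(1/(\alpha + 1)\), so the selected level moves toward \(s_1 = 0\): overestimating the prior is penalized more, and clusters need a larger share of high-confidence points to receive a high posterior probability.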

A.3 Results of the Dimension Reduction

Figure 10 shows the results of the FS dimension reduction and its relationship with the principal components space, as explained in Sect. 3.2.

The alignment of features of the same cluster along a line crossing the origin of the graph confirms that the clusters were formed according to the positive and negative correlations of features. It also ensures that at least one feature was selected among each group of correlated features.

Fig. 10

GLCM feature selection. a Case of the full-stack training dataset: GLCM features (blue X; c, d give the legend of the numbered labels) in the first PCs space. Blue thick lines show the groups of features created by the feature selection method in this study, with representative features of each group in red; b representative features selected in each scale for each dataset; c legend for feature numbers of scale 1; d legend for feature numbers of scale 2; e legend for orientation of GLCM vectors

For scale 1, there were four possible orientations; one feature per orientation was selected for all datasets except the far-offset dataset. For scale 2, there were eight possible orientations. For the full-stack and near-offset datasets, one feature per typical orientation was selected, except for orientations [0, 2] and [2, 0]; similarly for the mid-offset dataset, orientations [0, 2] and [2, 1] were not selected. For the far-offset dataset, two features of the same orientation were selected, so that three orientations were not represented in the selected feature set: [2, 0], [2, 2] and \([1,-2]\).

A.4 Detailed Results on EM

The results presented in Table 3 provide detailed support for Fig. 6c.

Table 3 Error metric EM compared between training images (TI) and validation images (VI) for several datasets, dimension reduction (DR) methods and \(\alpha \) values


Le Bouteiller, P., Charléty, J. Semi-Supervised Multi-Facies Object Retrieval in Seismic Data. Math Geosci 52, 817–846 (2020).



  • Seismic interpretation
  • Object recognition
  • Semi-supervised analysis
  • Multi-facies geobody