Abstract
Acoustic impedance is the product of the density of a material and the speed at which an acoustic wave travels through it. Understanding this relationship is essential because low acoustic impedance values are closely associated with high porosity, which facilitates the accumulation of hydrocarbons. In this study, we estimate acoustic impedance from nine seismic-attribute inputs, in addition to depth and two-way travel time, using three supervised machine learning models: extra tree regression (ETR), random forest regression, and multilayer perceptron regression, implemented with the scikit-learn library. Our results show that the R2 of multilayer perceptron regression is 0.85, close to what has been reported in recent studies. However, the ETR method outperformed the results reported in the literature in terms of mean absolute error, mean squared error, and root-mean-squared error. The novelty of this study lies in achieving more accurate predictions of acoustic impedance for exploration.
Introduction
A useful definition of acoustic impedance comes from medical physics, where it is described as the resistance to the propagation of ultrasound waves through tissues (Suzuki et al. 2019); the Earth's layers can be likened to these biological tissues. Within each layer, acoustic impedance is derived by multiplying the density of the material by its acoustic velocity (g/cm3 × m/s). Based on the lithology of the layer, porosity is typically high in sand and low in shale, and an increase in acoustic impedance implies a decrease in porosity (Agbadze et al. 2022). Since low acoustic impedance and high porosity are conducive to accumulating hydrocarbons in a layer, sands are more likely than shales to accumulate hydrocarbons (Ali and Al-Shuhail 2018). Recognizing this, there is a growing emphasis on developing methods to determine acoustic impedance as an intrinsic property of rock layers. Current methodologies for estimating acoustic impedance can be broadly categorized into non-machine-learning and machine learning (ML)-based techniques. Examples of non-machine-learning methods include direct, iterative, and nonlinear inversion methods (Liu et al. 2018).
A review of the latest ML techniques for acoustic impedance estimation was conducted by Zeng et al. (2021). They introduced new ML methods to predict reservoir parameters using both post- and prestack seismic attributes. Hampson et al. (2001) utilized multiattribute transforms and neural networks to predict log characteristics from seismic data. Cracknell and Reading (2013) analyzed aircraft and satellite data using random forest (RF) and support vector machines to identify lithologic contact zones. Harris and Grunsky (2015) employed geophysical and geochemical data to predict lithology using RF. Zhang et al. (2018) conducted an experiment using deep neural networks and convolutional neural networks (CNNs) to predict seismic lithology. Biswas et al. (2019) and Das and Mukerji (2020) advocated for pre- and poststack seismic inversion using CNNs. Priezzhev et al. (2019) compared various machine learning-supported regression models, including random forest, nearest neighbor, neural network, and adaptive classifier-ensemble models. According to a recent review by Zeng et al. (2021), the random forest method is deemed one of the most effective methods for addressing highly nonlinear problems. Our research builds on the methodology of Mardani and Thrust (2020), which employs a backpropagation-trained multilayer neural network to estimate acoustic impedance using six distinct seismic-attribute inputs: amplitude, second derivative, trace gradient, quadrature amplitude, instantaneous frequency, and gradient magnitude.
This study aims to estimate acoustic impedance at well log resolution using various machine learning methods. These methods leverage an extended set of seismic-attribute inputs to achieve greater accuracy. Several techniques can assess the accuracy of our predictions, one notable measure being the coefficient of determination (R2). This research aims to achieve R2 values near one, signifying that our predictions closely align with the actual acoustic impedance values. To meet these objectives, we face several challenges: How can we determine the acoustic impedance on a continuous well log scale using seismic attribute inputs that differ from those reported earlier in the literature? For instance, how does the extra trees regressor algorithm compare to the random forest and the multilayer perceptron regression algorithms? Moreover, how can we effectively apply these methods to real-world data for interpretation?
Methodology
This study primarily explored the application of machine learning in predicting acoustic impedance, contrasting it with the conventional band-limited impedance (BLIMP) inversion method. The groundwork data for the traditional and ML approaches were described in a prior paper (Mardani and Thrust 2020). In our study, we utilized BLIMP acoustic impedance solely for comparative analysis.
The implementation of acoustic impedance estimation based on recursive inversion of poststack time-migrated seismic data and well logs yields a band-limited inversion, with a surface seismic frequency band of approximately 10–50 Hz (Mardani and Thrust 2020; Russell 1988).
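Recursive inversion recovers impedance layer by layer from reflection coefficients. The following is an illustrative numpy sketch of that recursion only, not the authors' BLIMP implementation (which additionally restores the low-frequency trend from well logs):

```python
import numpy as np

def recursive_inversion(reflectivity, z0):
    """Recover an impedance profile from reflection coefficients.

    Standard recursion: Z[i+1] = Z[i] * (1 + r[i]) / (1 - r[i]).
    With band-limited seismic reflectivity, the result is band-limited too.
    """
    z = np.empty(len(reflectivity) + 1)
    z[0] = z0
    for i, r in enumerate(reflectivity):
        z[i + 1] = z[i] * (1.0 + r) / (1.0 - r)
    return z

# Round-trip check: reflectivity derived from a known impedance profile
# should invert back to that profile exactly.
true_z = np.array([5000.0, 6000.0, 5500.0, 7000.0])
r = (true_z[1:] - true_z[:-1]) / (true_z[1:] + true_z[:-1])
np.testing.assert_allclose(recursive_inversion(r, true_z[0]), true_z)
```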
Machine learning methods
Machine learning methods are utilized through the scikit-learn 1.2.1 library (Pedregosa et al. 2011). In regard to training and prediction, the input consists of depth, two-way travel time, and various seismic attributes. These attributes encompass amplitudes, their integrals, trace gradients, quadrature amplitudes, second derivatives, gradient magnitudes, instantaneous frequencies, phases, and cosine phases. The primary target for these procedures is the well log acoustic impedance, commonly referred to as the true AI. The output generated is the predicted acoustic impedance at the well log resolution.
Algorithms
In this study, three models were employed: multilayer perceptron regression (MLPR), random forest regression (RFR), and extra tree regression (ETR). A brief description of each is given below.
Multilayer perceptron regression (MLPR)
The multilayer perceptron (MLP) is encapsulated within the MLPRegressor class, which employs a backpropagation-trained multilayer perceptron. Given that the output consists of a series of continuous numbers, the squared error serves as the loss function (Pedregosa et al. 2011). The settings of the MLPR function include a single hidden layer of 100 neurons, the rectified linear unit (ReLU) activation function (which outputs the input directly if it is positive and zero otherwise), the Adam optimizer, an automatic batch size, a constant learning rate, and a maximum iteration count of 1000, among others.
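These settings map directly onto MLPRegressor's constructor arguments. A minimal sketch on synthetic stand-in data (the scaling step is standard practice for MLPs; the paper does not state whether the authors scaled their inputs):

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the eleven-feature input matrix.
X, y = make_regression(n_samples=2000, n_features=11, noise=5.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

mlp = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(100,),   # one hidden layer, 100 neurons
                 activation="relu",
                 solver="adam",
                 batch_size="auto",
                 learning_rate="constant",
                 max_iter=1000,
                 random_state=0),
)
mlp.fit(X_tr, y_tr)
print(f"test R^2 = {mlp.score(X_te, y_te):.3f}")
```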
Random forest regression (RFR)
A random forest operates as a meta-estimator aggregating numerous decision trees, each trained on a different subsample of the dataset. This aggregation aims to enhance the prediction accuracy and curtail overfitting (Pedregosa et al. 2011). By default, the estimator uses 100 trees (n_estimators = 100) and the squared error as its split criterion, among other settings.
The random forest regression procedure, as implemented in scikit-learn (Pedregosa et al. 2011), can be summarized step by step:

1. Initialize max_depth as the maximum depth of each tree and n_estimators as the number of trees in the forest.

2. For every tree in the forest, create a new dataset by randomly selecting a portion of the training data (with replacement), choose a random subset of the features to consider when splitting each tree node, and build a decision tree from the new dataset and selected features, with a maximum depth of max_depth.

3. To predict a new data point, obtain a prediction from each tree in the forest and average the predictions to form the final prediction.
The mathematical formulation of the random forest algorithm, as outlined by Pedregosa et al. (2011), is as follows:

1. Initialization
   - n_estimators: the number of trees in the forest.
   - max_depth: the maximum depth of each decision tree.

2. For each tree in the forest
   a. Dataset creation
      - Let D be the original dataset containing N samples.
      - D′, a new dataset of size N′ (where N′ ≤ N), is constructed by randomly selecting N′ samples from D with replacement.
   b. Feature selection
      - Let n be the total number of features in the dataset, and let n′ (with n′ ≤ n) be the number of features to consider when splitting each node in the tree.
      - Let F represent the set of all features in the dataset. F′, a new set of features, is constructed by randomly selecting n′ features from F without replacement.
   c. Decision tree construction
      - A decision tree, denoted T, is built using dataset D′ and feature set F′, ensuring that it does not exceed the specified maximum depth.

3. Prediction for a new data point
   a. Individual tree predictions
      - Let T1, T2, …, Tn be the n decision trees in the forest. For a new data point x, let yi be the prediction made by tree Ti for x.
   b. Final prediction
      - The final prediction y for x is the average of the predictions y1, y2, …, yn.
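The final averaging step can be verified directly against a fitted scikit-learn forest: predicting with each tree in `estimators_` and averaging reproduces the forest's own prediction. A short sketch on synthetic data:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=11, noise=5.0, random_state=0)
forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Individual tree predictions y_i = T_i(x) for one new point x ...
x_new = X[:1]
per_tree = np.array([tree.predict(x_new)[0] for tree in forest.estimators_])

# ... and the final prediction is their average.
manual = per_tree.mean()
np.testing.assert_allclose(manual, forest.predict(x_new)[0])
```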
Extra tree regressor
This class implements a meta-estimator that fits multiple randomized decision trees, often referred to as "extra trees," on various subsamples of the dataset and averages their predictions. This strategy aims to enhance the prediction accuracy and mitigate overfitting (Pedregosa et al. 2011) (Figs. 1, 2, 3, 4, 5).
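The scikit-learn class is ExtraTreesRegressor; a minimal sketch on the same kind of synthetic stand-in data follows. Extra trees draw split thresholds at random rather than searching for the best cut, which adds extra randomization on top of the forest's feature sampling:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=11, noise=5.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

etr = ExtraTreesRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(f"test R^2 = {etr.score(X_te, y_te):.3f}")
```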
Results and discussion
Data visualization and analysis
We utilized the dataset from Mardani and Thrust (2020) to illustrate our selected methods. The data are depicted in Fig. 6, which comprises the extended set of nine seismic-attribute inputs in addition to depth and TWTT. The details and definitions of each attribute can be found in Appendix 1. A plot of the zero-offset seismic trace is shown in the first track on the left. Seismic traces for all wells are used to extract the following training input attributes: the amplitude of the seismic trace, its integral, second derivative, quadrature amplitude, trace gradient, gradient magnitude, instantaneous frequency, phase, and cosine of the phase as nine of the eleven input features, while the acoustic impedance at well log resolution is the target (Mardani and Thrust 2020).
The relationships between the attributes are visualized in the cross-correlation matrix presented in Fig. 7. Within this matrix, values closer to 1 indicate a positive correlation, while those nearing −1 signify a negative correlation. When splitting the data, we designate the AI log (acoustic impedance derived from the well log) as the target and output; all attributes other than AI_HRS_inv are treated as inputs. AI_HRS_inv represents the band-limited inversion from prior research, generated using Hampson-Russell software (Mardani 2020). According to the matrix, the strongest input–output relationship exists between Quadr and AI_Log.
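A correlation matrix of this kind is one pandas call. A hedged sketch on toy columns (names mirror the figure; values are synthetic, with AI_Log deliberately constructed to correlate strongly with Quadr):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
quadr = rng.normal(size=300)
df = pd.DataFrame({
    "Quadr": quadr,
    "AI_Log": 0.9 * quadr + 0.1 * rng.normal(size=300),  # strong correlation
    "Depth": rng.normal(size=300),                        # unrelated column
})

corr = df.corr()  # Pearson correlation; entries lie in [-1, 1]
assert corr.loc["Quadr", "AI_Log"] > corr.loc["Depth", "AI_Log"]
```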
To evaluate the performance of the selected methods, we present the prediction errors for RFR, MLPR, and ETR using the default parameters in Table 1. The comparison reveals that ETR consistently exhibits lower MAE, MSE, and RMSE values than the other methods. To illustrate the performance of the three selected methods, we compare the R2 values between the actual and predicted impedance values in Fig. 8. Notably, the R2 value for ETR surpasses those for both RFR and MLPR; ETR achieves the highest R2, followed by RFR, which outperforms MLPR. These findings further reveal that, in comparison to Mardani and Thrust (2020), who achieved an R2 of 0.88 using the TensorFlow platform, both our ETR and RFR methods yield superior R2 values, while our MLPR produces a slightly inferior result. Such discrepancies can arise from variations in parameters such as the number of features and from the software platforms and libraries utilized. In our study, we employed the scikit-learn 1.2.1 library for computations, whereas Mardani and Thrust (2020) utilized TensorFlow and Keras. Furthermore, we compare histograms of the prediction errors of the selected methods in Fig. 9. The prediction error of the ETR method spikes sharply at zero, unlike those of the RFR and MLPR methods, which display a broader spread of nonzero errors. This observation leads to a ranking of prediction-error quality of ETR > RFR > MLPR.
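The error measures in Table 1 come straight from sklearn.metrics. A worked toy example (the impedance values are illustrative, not the study's):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([9000.0, 9500.0, 8800.0, 9200.0])
y_pred = np.array([9100.0, 9400.0, 8900.0, 9150.0])

mae = mean_absolute_error(y_true, y_pred)   # mean |y - y_hat| -> 87.5
mse = mean_squared_error(y_true, y_pred)    # mean (y - y_hat)^2 -> 8125.0
rmse = np.sqrt(mse)                         # same units as the impedance
r2 = r2_score(y_true, y_pred)               # 1.0 would be a perfect fit

print(mae, mse, rmse, r2)
```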
The learning characteristics of the three regression methods can be discerned from Fig. 10. The training and testing scores of the RFR method gravitate toward 1, whereas those of the MLPR hover around 0.6, although they continue to rise with increasing training size. The RFR method requires only a minimal training size, as its curve plateaus after a training size of approximately 4000; given the same training size, the learning efficacy of RFR therefore surpasses that of MLPR. Notably, the ETR, our most effective method, delivers consistently strong training outcomes from a training size of 500 through to the endpoint and ultimately yields the best test scores.
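Curves like those in Fig. 10 can be produced with scikit-learn's learning_curve utility. A sketch on synthetic stand-in data (the sizes and scores here are illustrative, not the study's):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import learning_curve

X, y = make_regression(n_samples=1000, n_features=11, noise=5.0, random_state=0)

# Train/test R^2 as a function of training-set size, with 5-fold CV.
sizes, train_scores, test_scores = learning_curve(
    RandomForestRegressor(n_estimators=50, random_state=0), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5, scoring="r2")

for n, tr, te in zip(sizes, train_scores.mean(axis=1), test_scores.mean(axis=1)):
    print(f"n={n:4d}  train R^2={tr:.3f}  test R^2={te:.3f}")
```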
The final estimations of the acoustic impedance are shown in Fig. 11. The band-limited AI was reproduced using the available data from Mardani and Thrust (2020) to investigate how the property changes with depth; the curves generally follow a similar trend but cannot match the localized variations (Mardani and Thrust 2020). The figure juxtaposes the acoustic impedance derived from well logs, the AI predicted using the extended inputs, and the band-limited inversion. It is evident that the AI predicted by the RFR method, represented by a red line, aligns more closely with the true AI, depicted by a blue line, than the AI predicted by the MLPR method, indicated by a yellow line. Overall, the most accurate prediction is rendered by the ETR method, highlighted by the green line. Furthermore, the predicted acoustic impedance suggests that hydrocarbons are likely to accumulate at depths of approximately 3100–3200 m and 3300–3400 m.
Conclusions
-
1.
The estimation of acoustic impedance, utilizing extended inputs of depth, two-way travel time and seismic attributes, employed regression methods, including multilayer perceptron regression, random forest regression, and extra tree regression with the scikit-learn library.
-
2.
Our RFR and ETR models exhibited improved coefficient of determination values, surpassing those reported in the literature, even though our dataset was larger owing to the additional features.
-
3.
Among the tested models, the extra tree regression model demonstrated superior performance in terms of the coefficient of determination and was particularly suitable for highly nonlinear well log scales.
-
4.
We conclude that, for hydrocarbon exploration based on the available acoustic impedance data, the ETR model with default parameters represents the optimal choice for more accurate predictions; other datasets may favor different choices.
-
5.
Future studies should consider adopting the ETR method with diverse real datasets, incorporating additional seismic attributes, and exploring alternative machine learning models. This comprehensive approach aims to identify an even more accurate machine learning model that can then be validated using real-world data.
Data availability
The research data and codes before modification are available at: https://github.com/mardani72/AI_ML_Seismic_Log/blob/master/AI_From_Seismic_Attributes_ML_final.ipynb
Abbreviations
- B: n_estimators, the number of decision trees in the forest
- f(x) = max(0, x): ReLU function
- f_b(x): predicted target variable for the input data point x by the b-th decision tree in the random forest
- w_hj: weights between the first (input) layer and the middle or hidden layer
- x_0 = +1: bias of the first layer
- X = {x_j; j = 1 → d}: features of the first layer
- y_i: output
- ŷ: predicted target variable for the input data point x
- Z = {z_j; j = 1 → d}: features of the middle layer
- η: learning rate, which is set to a value greater than zero
- v_ih: weights between the middle layer and the upper or output layer
- Δv_h: incremental weight between the hidden and output layers
- Δw_hj: incremental weight between the first and hidden layers
- AI: acoustic impedance
- ANN: artificial neural network
- MLPR: multilayer perceptron regression
- RFR: random forest regression
- ETR: extra tree regression
- MAE: mean absolute error
- MSE: mean squared error
- RMSE: root-mean-squared error
- ReLU: rectified linear unit
References
Agbadze OK, Qiang C, Jiaren Y (2022) Acoustic impedance and lithology-based reservoir porosity analysis using predictive machine learning algorithms. J Pet Sci Eng 208:109656
Ali A, Al-Shuhail AA (2018) Characterizing fluid contacts by joint inversion of seismic P-wave impedance and velocity. J Pet Explor Prod Technol 8:117–130
Barnes AE (2016) Handbook of poststack seismic attributes. Society of Exploration Geophysicists
Biswas R, Sen MK, Das V, Mukerji T (2019) Prestack and poststack inversion using a physics-guided convolutional neural network. Interpretation 7(3):SE161–SE174. https://doi.org/10.1190/INT-2018-0236.1
Chu Z, Yu J, Hamdulla A (2021) Throughput prediction based on Extra Tree for stream processing tasks. Comput Sci Inf Syst 18(1):1–22
Cracknell MJ, Reading AM (2013) The upside of uncertainty: Identification of lithology contact zones from airborne geophysics and satellite data using random forests and support vector machines. Geophysics 78(3):WB113–WB126
Das V, Mukerji T (2020) Petrophysical properties prediction from prestack seismic data using convolutional neural networks. Geophysics 85(5):N41–N55. https://doi.org/10.1190/geo2019-0650.1
Geurts P, Ernst D, Wehenkel L (2006) Extremely randomized trees. Mach Learn 63:3–42
Hampson DP, Schuelke JS, Quirein JA (2001) Use of multiattribute transforms to predict log properties from seismic data. Geophysics 66(1):220–236
Harris JR, Grunsky EC (2015) Predictive lithological mapping of Canada’s North using random forest classification applied to geophysical and geochemical data. Comput Geosci 80:9–25
Liu J, Zhang J, Huang Z (2018) Accurate estimation of acoustic impedance based on spectral inversion. Geophys Prospect 66(1):169–181
Mardani RA, Thrust GV (2020) Estimation of acoustic impedance from seismic data in well-log resolution using machine learning, neural network, and comparison with band-limited seismic inversion
Maurya SP, Singh NP (2018) Comparing pre- and post-stack seismic inversion methods: a case study from Scotian Shelf, Canada. J Ind Geophys Union 22(6):585–597
Maurya SP, Singh NP (2019) Estimating reservoir zone from seismic reflection data using maximum-likelihood sparse spike inversion technique: a case study from the Blackfoot field (Alberta, Canada). J Pet Explor Prod Technol 9:1907–1918
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
Petre I (2021) How to train a multilayer perceptron for regression. https://www.youtube.com/watch?v=Y-38j9pZ_QQ
Priezzhev II, Veeken PCH, Egorov SV, Strecker U (2019) Direct prediction of petrophysical and petroelastic reservoir properties from seismic and well-log data using nonlinear machine learning algorithms. Lead Edge 38(12):949–958
Russell BH (1988) Introduction to seismic inversion methods (Issue 2). SEG Books.
Suzuki S, Gerner P, Lirk P (2019) Local anesthetics. In: Pharmacology and physiology for anesthesia. Elsevier, pp 390–411
Zeng H, He Y, Zeng L (2021) Impact of sedimentary facies on machine learning of acoustic impedance from seismic data: lessons from a geologically realistic 3D model. Interpretation 9(3):T1009–T1024
Zhang G, Wang Z, Chen Y (2018) Deep learning for seismic lithology prediction. Geophys J Int 215(2):1368–1387
Acknowledgements
The authors thank the College of Petroleum Engineering and Geosciences at King Fahd University of Petroleum and Minerals for supporting this research.
Funding
This study was funded by the College of Petroleum Engineering and Geosciences at King Fahd University of Petroleum and Minerals.
Ethics declarations
Conflict of interest
The authors declare that there are no conflicts of interest.
Ethical approval
The authors followed the moral standards of publications.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix 1. Seismic Attributes (Barnes 2016)
Amp (amplitude)
A measure of the raw amplitude of seismic trace values
D2 (Second Derivative)
When raw amplitude profiles do not present continuity clearly enough, interpreters may find the second-derivative attribute helpful.
Int (Integral)
The integration of the raw trace amplitude over time
Quadr (quadrature amplitude)
Quadrature amplitude is the imaginary component of the analytical signal that is derived from the 90° phase of the original trace, through the Hilbert transform. When combined with the real part, they create the analytical signal.
Trace gradient
The gradient along the trace. The most significant gradient occurs at the most significant change.
Gradient magnitude
The magnitude of the instantaneous gradient in 3 dimensions utilizing adjacent traces.
Instantaneous frequency
The first time derivative of the instantaneous phase scaled to Hertz units.
Phase
The average value of a signal's phase spectrum; the relative position along a sinusoid.
CosPhase
Cosine of the phase.
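Several of the attributes above derive from the analytic signal via the Hilbert transform. A hedged sketch on a toy 30 Hz sinusoid standing in for a seismic trace (the sample interval is illustrative):

```python
import numpy as np
from scipy.signal import hilbert

dt = 0.002                                  # 2 ms sample interval (illustrative)
t = np.arange(0, 1, dt)
trace = np.sin(2 * np.pi * 30 * t)          # toy 30 Hz "seismic trace"

analytic = hilbert(trace)                   # real part = trace, imaginary = quadrature
quadr = analytic.imag                       # quadrature amplitude
phase = np.unwrap(np.angle(analytic))       # instantaneous phase (radians)
cos_phase = np.cos(phase)                   # CosPhase attribute
inst_freq = np.diff(phase) / (2 * np.pi * dt)  # instantaneous frequency in Hz

# For a pure 30 Hz sinusoid, the instantaneous frequency is ~30 Hz away
# from the trace edges.
print(float(np.median(inst_freq)))
```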
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Surachman, L.M., Abdulraheem, A., Al-Shuhail, A. et al. Acoustic impedance prediction based on extended seismic attributes using multilayer perceptron, random forest, and extra tree regressor algorithms. J Petrol Explor Prod Technol 14, 1923–1931 (2024). https://doi.org/10.1007/s13202-024-01795-7