Machine learning with model selection to predict TOC from mineralogical constituents: case study in the Sichuan Basin

  • Original Paper
  • International Journal of Environmental Science and Technology

Abstract

Total organic carbon (TOC) content measured on rock samples is a fundamental quantitative and qualitative indicator of the organic matter present in a reservoir. It is generally determined manually through the analysis of source rock samples. However, this procedure demands time and resources, since it depends on samples obtained from several intervals of wells in source rocks. Consequently, research efforts have sought to assist this task. Machine learning approaches have emerged as an alternative for estimating TOC from well-log data and stratigraphic analysis. In this context, the present paper proposes using machine learning techniques to automate TOC estimation. To give the model flexibility, a grid search procedure was combined with cross-validation to perform model selection. This computational approach identifies the models with the best generalization capacity. Three methods were applied: Support Vector Machines, Extreme Learning Machine, and Ridge Regression. The proposed methodology was validated on core samples from the YuDongNan area shale gas field, Sichuan Basin. The Support Vector Machine method outperformed the other methods on several of the metrics analyzed and produced accurate predictions, showing that the approach presented in this paper can be used as a surrogate model to assist geologists and petrologists in estimating TOC values.
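The model selection idea described in the abstract (a grid search scored by cross-validation) can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only ridge regression, uses NumPy alone, and runs on synthetic data standing in for mineralogical features. The function names (`ridge_fit`, `cv_rmse`, `grid_search`), the candidate grid `alphas`, and the fold count are all assumptions made for illustration.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression on centered data; returns (weights, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

def cv_rmse(X, y, alpha, n_folds=5, seed=0):
    """Mean RMSE over shuffled k-fold cross-validation for one alpha."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, n_folds)
    scores = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        w, b = ridge_fit(X[train], y[train], alpha)
        pred = X[test] @ w + b
        scores.append(np.sqrt(np.mean((pred - y[test]) ** 2)))
    return float(np.mean(scores))

def grid_search(X, y, alphas, n_folds=5):
    """Score every candidate alpha by CV and return (best_alpha, all_scores)."""
    results = {a: cv_rmse(X, y, a, n_folds) for a in alphas}
    best = min(results, key=results.get)  # lowest mean validation RMSE wins
    return best, results

# Synthetic stand-in data (assumption): 60 samples, 4 features, linear target plus noise.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(60, 4))
true_w = np.array([2.0, -1.0, 0.5, 0.0])
y = X @ true_w + 1.0 + rng.normal(0, 0.1, size=60)

alphas = [1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0]  # hypothetical hyperparameter grid
best_alpha, results = grid_search(X, y, alphas)
print("best alpha:", best_alpha, "CV RMSE:", results[best_alpha])
```

Every candidate is scored on the same folds, so the comparison between hyperparameter values is not confounded by a different data split. The same loop would apply to other estimators, such as support vector regression or an extreme learning machine, by swapping the fit and predict steps and widening the grid.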

Fig. 1. Modified from Zhao et al. (2016)

References

  • Alshakhs M, Rezaee MR (2017) A new method to estimate total organic carbon (TOC) content, an example from goldwyer shale formation, the canning basin. Open Pet Eng J 10:118–133

  • Asante-Okyere S, Ziggah YY, Marfo SA (2021) Improved total organic carbon convolutional neural network model based on mineralogy and geophysical well log data. Unconv Res 1:1–8

  • Bergstra J, Bengio Y (2012) Random search for hyper-parameter optimization. J Mach Learn Res 13(Feb):281–305

  • Chan SA, Hassan AM, Usman M, Humphrey JD, Alzayer Y, Duque F (2021) Total organic carbon (TOC) quantification using artificial neural networks: improved prediction by leveraging xrf data. J Pet Sci Eng. https://doi.org/10.1016/j.petrol.2021.109302

  • Chen S, Zhu Y, Qin Y, Wang H, Liu H, Fang J (2014) Reservoir evaluation of the lower silurian longmaxi formation shale gas in the southern sichuan basin of china. Mar Pet Geol 57:619–630. https://doi.org/10.1016/j.marpetgeo.2014.07.008

  • Chen Z, Hu K, Liu C (2013) Quantifying total organic carbon (TOC) from well logs using support vector regression. GeoConvention 2013: Canadian Society of Petroleum Geologists

  • Escobar-Briones E, García-Villalobos FJ (2009) Distribution of total organic carbon and total nitrogen in deep-sea sediments from the southwestern Gulf of Mexico. Bol Soc Geol Mex 61(1):73–86

  • Ge X, Wang Y, Fan Y, Fan Z, Deng S (2015) Determination of total organic carbon (TOC) in tight reservoir using empirical mode decomposition-support vector regression (EMD-SVR): a case study from XX-1 Basin, Western China. ASEG Ext Abstr 1:1–10

  • Goldberg K, Humayun M (2016) Geochemical paleoredox indicators in organic-rich shales of the Irati Formation, Permian of the Paraná Basin, southern Brazil. Braz J Geol 46:377–393

  • Goliatt L, Sulaiman SO, Khedher KM, Farooque AA, Yaseen ZM (2021) Estimation of natural streams longitudinal dispersion coefficient using hybrid evolutionary machine learning model. Eng Appl Comput Fluid Mech 15(1):1298–1320

  • Handhal AM, Al-Abadi AM, Chafeet HE, Ismail MJ (2020) Prediction of total organic carbon at rumaila oil field, southern iraq using conventional well logs and machine learning algorithms. Mar Pet Geol 116:104347

  • Harris CR, Millman KJ, van der Walt SJ, Gommers R, Virtanen P, Cournapeau D, Wieser E, Taylor J, Berg S, Smith NJ, Kern R, Picus M, Hoyer S, van Kerkwijk MH, Brett M, Haldane A, del Río JF, Wiebe M, Peterson P, Gérard-Marchant P, Sheppard K, Reddy T, Weckesser W, Abbasi H, Gohlke C, Oliphant TE (2020) Array programming with NumPy. Nature 585(7825):357–362. https://doi.org/10.1038/s41586-020-2649-2

  • Huang GB, Zhu QY, Siew CK (2004) Extreme learning machine: a new learning scheme of feedforward neural networks. In: Neural networks, 2004. Proceedings. 2004 IEEE international joint conference on, IEEE, vol 2, pp. 985–990

  • Li Y, Liu H, Zhang L, Lu Z, Li Q, Huang Y (2013) Lower limits of evaluation parameters for the lower paleozoic longmaxi shale gas in southern Sichuan province. Sci China Earth Sci 56(5):710–717

  • Lundberg SM, Lee SI (2017) A unified approach to interpreting model predictions. In: Proceedings of the 31st international conference on neural information processing systems, pp. 4768–4777

  • Mahmoud AA, Elkatatny S, Ali AZ, Abouelresh M, Abdulraheem A (2019) Evaluation of the total organic carbon (TOC) using different artificial intelligence techniques. Sustainability 11(20):5643

  • Mahmoud AA, Elkatatny S, Ali A, Abdulraheem A, Abouelresh M (2020) Estimation of the total organic carbon using functional neural networks and support vector machine. In: International petroleum technology conference, OnePetro

  • Mahmoud AAA, Elkatatny S, Mahmoud M, Abouelresh M, Abdulraheem A, Ali A (2017) Determination of the total organic carbon (TOC) based on conventional well logs using artificial neural network. Int J Coal Geol 179:72–80

  • Mazumdar A, Paropkari AL, Borole DV, Rao BR, Khadge NH, Karisiddaiah SM, Kocherla M, João HM (2007) Pore-water sulfate concentration profiles of sediment cores from Krishna-Godavari and Goa Basins, India. Geochem J 41(4):259–269. https://doi.org/10.2343/Geochemj.41.259

  • McDonald GC (2009) Ridge regression. Wiley Interdiscip Rev Comput Stat 1(1):93–100

  • McKinney W (2010) Data structures for statistical computing in Python. In: van der Walt S, Millman J (eds) Proceedings of the 9th python in science conference, pp. 56–61. https://doi.org/10.25080/Majora-92bf1922-00a

  • Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V et al (2011) Scikit-learn: machine learning in python. J Mach Learn Res 12:2825–2830

  • Potratz G, Arauco CS, Castro J, Potratz J, Aurélio M, Pacheco C (2021) Automatic lithofacies classification with t-SNE and K-nearest neighbors algorithm. Anu Inst Geociênc - UFRJ 44:1–14

  • Saporetti CM, Duarte GR, Fonseca TL, da Fonseca LG, Pereira E (2018) Extreme learning machine combined with a differential evolution algorithm for lithology identification. Rev Inf Teór Apl 25(4):43–56

  • Saporetti CM, da Fonseca LG, Pereira E, de Oliveira LC (2018) Machine learning approaches for petrographic classification of carbonate-siliciclastic rocks using well logs and textural information. J Appl Geophys 155:217–225

  • Saporetti CM, da Fonseca LG, Pereira E (2019) A lithology identification approach based on machine learning with evolutionary parameter tuning. IEEE Geosci Remote Sens Lett 16(12):1819–1823

  • Saporetti CM, Goliatt L, Pereira E (2021) Neural network boosted with differential evolution for lithology identification based on well logs information. Earth Sci Inf 14(1):133–140

  • Sattar AM, Gharabaghi B (2015) Gene expression models for prediction of longitudinal dispersion coefficient in streams. J Hydrol 524:587–596

  • Schultz RB (2004) Geochemical relationships of late paleozoic carbon-rich shales of the midcontinent, usa: a compendium of results advocating changeable geochemical conditions. Chem Geol 206(3):347–372

  • Sfidari E, Kadkhodaie-Ilkhchi A, Najjari S (2012) Comparison of intelligent and statistical clustering approaches to predicting total organic carbon using intelligent systems. J Pet Sci Eng 86–87:190–205

  • Shalaby MR, Jumat N, Lai D, Malik O (2019) Integrated TOC prediction and source rock characterization using machine learning, well logs and geochemical analysis: case study from the jurassic source rocks in shams field, NW Desert, Egypt. J Pet Sci Eng 176:369–380

  • Shi X, Wang J, Liu G, Yang L, Ge X, Jiang S (2016) Application of extreme learning machine and neural networks in total organic carbon content prediction in organic shale with wire line logs. J Nat Gas Sci Eng 33:687–702

  • Smola AJ, Schölkopf B (2004) A tutorial on support vector regression. Stat Comput 14(3):199–222

  • Sun S, Sun Y, Sun C, Liu Z, Dong N (2013) Methods of calculating total organic carbon from well logs and its application on rock’s properties analysis. GeoConvention 2013: Integration.

  • Tan M, Song X, Yang X, Wu Q (2015) Support-vector-regression machine technology for total organic carbon content prediction from wireline logs in organic shale: A comparative study. J Nat Gas Sci Eng 26:792–802

  • Tariq Z, Mahmoud M, Abouelresh M, Abdulraheem A (2020) Data-driven approaches to predict thermal maturity indices of organic matter using artificial neural networks. ACS Omega 5(40):26169–26181

  • Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, Burovski E, Peterson P, Weckesser W, Bright J, van der Walt SJ, Brett M, Wilson J, Millman KJ, Mayorov N, Nelson ARJ, Jones E, Kern R, Larson E, Carey CJ, Polat I, Feng Y, Moore EW, VanderPlas J, Laxalde D, Perktold J, Cimrman R, Henriksen I, Quintero EA, Harris CR, Archibald AM, Ribeiro AH, Pedregosa F, van Mulbregt P, SciPy 1.0 Contributors (2020) SciPy 1.0: fundamental algorithms for scientific computing in python. Nat Methods 17:261–272

  • Wang P, Chen Z, Pang X, Hu K, Sun M, Chen X (2016) Revised models for determining TOC in shale play: example from devonian duvernay shale, western canada sedimentary basin. Mar Pet Geol 70:304–319

  • Wang X, Xie R, Wang T, Liu R, Shao L (2021) Total organic carbon content prediction of source rocks with conventional well log data based on regression committee machine. Arab J Geosci 14(15):1547. https://doi.org/10.1007/s12517-021-08053-8

  • Zhang J, Liu S, Xu H, Yu Z, Lai S, Zhang H, Geng G, Chen J (1998) Riverine sources and estuarine fates of particulate organic carbon from north china in late summer. Estuar Coast Shelf Sci 46(3):439–448

  • Zhao P, Mao Z, Huang Z, Zhang C (2016) A new method for estimating total organic carbon content from well logs. AAPG Bull 100(8):1311–1327

Acknowledgements

The authors wish to thank all who assisted in conducting this work.

Author information

Corresponding author

Correspondence to L. Goliatt.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Editorial responsibility: Parveen Fatemeh Rupani.

About this article

Cite this article

Saporetti, C.M., Fonseca, D.L., Oliveira, L.C. et al. Machine learning with model selection to predict TOC from mineralogical constituents: case study in the Sichuan Basin. Int. J. Environ. Sci. Technol. 20, 1585–1596 (2023). https://doi.org/10.1007/s13762-022-04081-3

  • DOI: https://doi.org/10.1007/s13762-022-04081-3
