Abstract
The total organic carbon (TOC) content of rock samples is the fundamental quantitative and qualitative indicator of the organic matter present in a reservoir. It is generally determined manually through the analysis of source-rock samples. However, this procedure demands time and resources, since it depends on samples obtained from several well intervals in source rocks. Consequently, research efforts have sought to assist this task. Machine learning approaches have emerged as an alternative for estimating total organic carbon from well-log data and stratigraphic analysis. In this context, the present paper proposes using machine learning techniques to automate total organic carbon estimation. To make the model flexible, a grid search procedure was combined with cross-validation to perform model selection. This computational approach identifies the models with the best generalization capacity. Three methods were applied: Support Vector Machines, Extreme Learning Machine, and Ridge Regression. The proposed methodology was validated on core samples from the shale gas field in the YuDongNan area, Sichuan Basin. The Support Vector Machine method outperformed the other methods on several of the metrics analyzed and produced accurate predictions. These results show that the approach presented in this paper can be used as a surrogate model to assist geologists and petrologists in estimating total organic carbon values.
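The combination of grid search and cross-validation described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the data are synthetic, the single predictor stands in for a hypothetical mineralogical feature, the penalty grid is arbitrary, and only a one-variable ridge regression is tuned. All function and variable names are assumptions made for illustration.

```python
import random
from statistics import mean


def fit_ridge_1d(xs, ys, alpha):
    """Fit y ~ intercept + slope * x with an L2 penalty `alpha` on the slope."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # closed-form ridge solution for one predictor
    return y_bar - slope * x_bar, slope


def rmse(model, xs, ys):
    """Root mean squared error of a fitted (intercept, slope) pair."""
    intercept, slope = model
    return mean((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) ** 0.5


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def grid_search_cv(xs, ys, alphas, k=5, seed=0):
    """Return the penalty with the lowest mean held-out RMSE, plus all scores."""
    folds = k_fold_indices(len(xs), k, seed)
    scores = {}
    for alpha in alphas:
        fold_errors = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(len(xs)) if i not in held]
            model = fit_ridge_1d([xs[i] for i in train],
                                 [ys[i] for i in train], alpha)
            fold_errors.append(rmse(model,
                                    [xs[i] for i in held_out],
                                    [ys[i] for i in held_out]))
        scores[alpha] = mean(fold_errors)
    best_alpha = min(scores, key=scores.get)
    return best_alpha, scores


# Synthetic data only: a hypothetical feature and a noisy linear "TOC" response.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(60)]
ys = [0.5 * x + 1.0 + rng.gauss(0, 0.3) for x in xs]

alphas = [0.0, 0.1, 1.0, 10.0, 100.0]  # arbitrary illustrative grid
best_alpha, cv_scores = grid_search_cv(xs, ys, alphas)

# Refit on all samples with the selected penalty.
final_model = fit_ridge_1d(xs, ys, best_alpha)
print("selected alpha:", best_alpha)
print("mean CV RMSE per alpha:", cv_scores)
```

Each candidate penalty is scored only on samples withheld from fitting, so the selected value reflects generalization rather than fit to the training data. In practice, a library routine such as scikit-learn's `GridSearchCV` performs the same search over richer models, for example a support vector regressor.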
Acknowledgements
The authors wish to thank all who assisted in conducting this work.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Editorial responsibility: Parveen Fatemeh Rupani.
About this article
Cite this article
Saporetti, C.M., Fonseca, D.L., Oliveira, L.C. et al. Machine learning with model selection to predict TOC from mineralogical constituents: case study in the Sichuan Basin. Int. J. Environ. Sci. Technol. 20, 1585–1596 (2023). https://doi.org/10.1007/s13762-022-04081-3
DOI: https://doi.org/10.1007/s13762-022-04081-3