Lithology identification based on interpretability integration learning

  • RESEARCH
Earth Science Informatics

Abstract

An interpretable model for intelligent lithology identification is proposed that combines stacking ensemble learning with two explanation techniques: Permutation Importance (PI) and Local Interpretable Model-agnostic Explanations (LIME). The method aims to provide more accurate geological information and scientific support for oil and gas exploration. Experiments were run on two public well-logging datasets. Support vector machine (SVM), random forest (RF), and naive Bayes (NB) served as base learners, and an SVM served as the meta-learner that combines their outputs in the stacking algorithm. Model performance was assessed with Area Under the Curve (AUC), precision, recall, and F1-score, and PI and LIME were used to explain the resulting lithology identification model. The stacking algorithm achieved the best evaluation metrics and the highest prediction accuracy. At the global level, PHIND, GR, and RT had the greatest influence on lithology identification in a natural gas protection area in the United States, while DEN, CAL, and PEF were the most influential variables for the Daqing Oilfield in China. At the level of a single sample, LIME provides a quantitative prediction probability and the degree of influence of each feature variable.
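The global interpretation step named in the abstract, Permutation Importance, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, a hypothetical one-threshold rule stands in for the stacked classifier, and the data, the `NOISE` feature, and the cutoff of 75 are invented for illustration. The idea is to shuffle one feature column at a time and record how much the accuracy falls; the larger the fall, the more the model depends on that feature.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only; every other column stays in place.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy data: a GR-like log value plus an unrelated NOISE column.
FEATURES = ["GR", "NOISE"]
X = [[20.0 + i, float((i * 7) % 10)] for i in range(100)]
y = ["shale" if row[0] > 75.0 else "sandstone" for row in X]


def toy_model(row):
    """Stand-in classifier (assumption): thresholds the first feature only."""
    return "shale" if row[0] > 75.0 else "sandstone"


importances = permutation_importance(toy_model, X, y, n_repeats=5, seed=0)
for name, score in zip(FEATURES, importances):
    print(f"{name}: mean accuracy drop = {score:.3f}")
```

Because the stand-in rule never reads the `NOISE` column, shuffling it leaves every prediction unchanged and its importance is exactly zero, while shuffling `GR` lowers the accuracy. With a real stacked model, the same loop would be run on held-out data against the fitted ensemble's prediction function.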

Data availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Author information

Contributions

All authors contributed to the conception and design of this study. Data collection was carried out by Xiaochun Lin. Xiaochun Lin and Shitao Yin constructed the experimental models and developed and tested the proposed methods. The manuscript was written by Xiaochun Lin, and all authors provided feedback and comments on it. All authors have reviewed and approved the final version of the manuscript.

Corresponding author

Correspondence to Shitao Yin.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Communicated by: H. Babaie

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lin, X., Yin, S. Lithology identification based on interpretability integration learning. Earth Sci Inform 16, 2211–2222 (2023). https://doi.org/10.1007/s12145-023-01024-5

  • DOI: https://doi.org/10.1007/s12145-023-01024-5
