A Novel Hybrid Technique of Integrating Gradient-Boosted Machine and Clustering Algorithms for Lithology Classification
Much recent research on lithology identification has emphasized improving classification performance with hybrid machine learning methods. To the best of our knowledge, however, no hybrid lithology classification model that integrates clustering results from well log data has been developed. This study therefore incorporates the results of clustering well log data into two and three groups, using K-means and Gaussian mixture models (GMM), to construct a more accurate gradient-boosted machine (GBM) lithology model. The K-means-based GBM classifiers achieved improved classification accuracy. In addition, the GMM-based GBM showed enhanced performance when the developed classifiers were tested on the entire dataset. A close examination of the confusion matrices generated by the classifiers further revealed that the performance gain of the clustering-based hybrid GBM models is attributable to improved recognition of mudstone and siltstone, the main lithofacies found in the southern basin of the South Yellow Sea. These findings demonstrate that a clustering-based hybrid GBM model can handle the classification of new, independent lithofacies data better than GBM alone.
Keywords: K-means, Gaussian mixture models, Gradient-boosted machine, Lithology
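The abstract does not say exactly how the clustering results are fed into the GBM, so the following is only a minimal sketch under one plausible assumption: the cluster label from K-means or GMM (with 2 or 3 groups) is appended to the well log features as an extra input column before the GBM is trained. The data are synthetic, the helper `add_cluster_feature` is a hypothetical name, and scikit-learn's `GradientBoostingClassifier` stands in for whichever GBM implementation the authors used.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def add_cluster_feature(X_train, X_test, clusterer):
    """Fit a clusterer on scaled training logs and append its labels as a column.

    Assumption: the hybrid model uses the cluster label as one extra feature.
    """
    scaler = StandardScaler().fit(X_train)
    clusterer.fit(scaler.transform(X_train))
    train_labels = clusterer.predict(scaler.transform(X_train))
    test_labels = clusterer.predict(scaler.transform(X_test))
    return (np.column_stack([X_train, train_labels]),
            np.column_stack([X_test, test_labels]))


# Synthetic stand-in for well log data (hypothetical: 4 log curves, 3 lithologies).
rng = np.random.default_rng(0)
centers = rng.normal(scale=3.0, size=(3, 4))
y = rng.integers(0, 3, size=600)
X = centers[y] + rng.normal(size=(600, 4))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Baseline: GBM trained on the raw log features only.
scores = {}
baseline = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
scores["GBM"] = baseline.score(X_test, y_test)

# Hybrid variants: K-means or GMM labels (2 and 3 groups) added as a feature.
for k in (2, 3):
    clusterers = (
        ("K-means", KMeans(n_clusters=k, n_init=10, random_state=0)),
        ("GMM", GaussianMixture(n_components=k, random_state=0)),
    )
    for name, clusterer in clusterers:
        Xa_train, Xa_test = add_cluster_feature(X_train, X_test, clusterer)
        model = GradientBoostingClassifier(random_state=0).fit(Xa_train, y_train)
        scores[f"{name}(k={k})+GBM"] = model.score(Xa_test, y_test)

for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

The clusterers are fitted on the training split only, so the test split stays independent. On synthetic data like this the hybrid variants need not outperform the baseline; the sketch illustrates the workflow, not the paper's reported results.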
This work was supported by the Major National Science and Technology Programs in the “Thirteenth Five-Year” Plan period (Nos. 2016ZX05024-002-005, 2017ZX05032-002-004), the Outstanding Youth Funding of Natural Science Foundation of Hubei Province (No. 2016CFA055), the Program of Introducing Talents of Discipline to Universities (No. B14031), and the Fundamental Research Fund for the Central Universities, China University of Geosciences (Wuhan, No. CUGCJ1820).