Abstract
Accurate estimation of the shear strength (SS) parameters of soil is an important criterion in geotechnical design. This research evaluates the performance of five ensemble models in predicting the shear strength of soil (SSS): (i) random forest (RF), (ii) extra trees regressor (ETR), (iii) light gradient boosting machine (LGBM), (iv) AdaBoost regressor (ABR), and (v) gradient boosting regressor (GBR). A common approach to predicting SSS with these models was built on a dataset of 249 soil samples and 12 influencing factors. The acquired data were divided into training and testing sets for model development and evaluation. The models were verified and assessed using the coefficient of determination (R2), root mean square log error (RMSLE), mean absolute error (MAE), mean squared error (MSE), mean absolute percentage error (MAPE), and root mean square error (RMSE). A sensitivity analysis was also carried out to determine the significance of the influencing parameters in the predicted outcome. After all five models had been developed and checked, the best-performing model was selected. Validation results showed that the ETR model outperformed the other models, with MAE = 0.0289, MSE = 0.0015, RMSE = 0.0378, R2 = 0.7934, RMSLE = 0.0285, and MAPE = 0.0900. Hence, the ETR model produced the better results, with higher accuracy and lower error, for the prediction of SSS. The behaviour of the best model was further examined with a model-based residual plot, a prediction error plot, a feature importance plot, a Cook’s distance outlier detection plot, and a learning curve.
The most influential features in the best model's prediction of SSS are the liquidity index (LI), water content (ω), void ratio (e), bulk density (ρb), and sample depth.
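The six statistics named above can be computed directly from paired observed and predicted values. The following is a minimal sketch using only the Python standard library, not the authors' implementation. It assumes that MAPE is expressed as a fraction rather than a percentage and that RMSLE uses the log(1 + x) convention. The function name `regression_metrics` and the sample values are hypothetical and do not come from the study's dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, R2, RMSLE and MAPE for paired values.

    Assumptions (not taken from the paper): MAPE is a fraction,
    RMSLE uses log(1 + x), and all targets are non-zero.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a target value is zero")

    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # Mean absolute percentage error, kept as a fraction of the target.
    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # Root mean square log error on log(1 + x); inputs must exceed -1.
    rmsle = math.sqrt(
        sum((math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)) / n
    )

    # Coefficient of determination: 1 - residual SS / total SS.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all target values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2, "RMSLE": rmsle, "MAPE": mape}


if __name__ == "__main__":
    # Hypothetical shear strength values, for illustration only.
    observed = [0.30, 0.40, 0.50, 0.60]
    predicted = [0.32, 0.38, 0.53, 0.57]
    for name, value in regression_metrics(observed, predicted).items():
        print(f"{name}: {value:.4f}")
```

When such statistics are averaged over cross-validation folds, the mean of the per-fold RMSE values need not equal the square root of the mean MSE, so the two reported figures should not be expected to match exactly in that setting.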
Data availability
The datasets generated and/or analysed during the current study are available from Cao et al. (2020), https://doi.org/10.1007/s00366-020-01116-6.
References
Akkurt, S., Ozdemir, S., Tayfur, G., & Akyol, B. (2003). The use of GA–ANNs in the modelling of compressive strength of cement mortar. Cement and Concrete Research, 33(7), 973–979. https://doi.org/10.1016/S0008-8846(03)00006-1
Alavi, A. H., Gandomi, A. H., Sahab, M. G., & Gandomi, M. (2010). Multi expression programming: a new approach to formulation of soil classification. Engineering with Computers, 26, 111–118. https://doi.org/10.1007/s00366-009-0140-7
Alizadeh, M. J., Nodoushan, E. J., Kalarestaghi, N., & Chau, K. W. (2017). Toward multi-day-ahead forecasting of suspended sediment concentration using ensemble models. Environmental Science and Pollution Research, 24(36), 28017–28025. https://doi.org/10.1007/s11356-017-0405-4
Asteris, P. G., Armaghani, D. J., Hatzigeorgiou, G. D., Karayannis, C. G., & Pilakoutas, K. (2019). Predicting the shear strength of reinforced concrete beams using artificial neural networks. Computers and Concrete, 24(5), 469–488. https://doi.org/10.12989/cac.2019.24.5.469
Baba, A. M., Midi, H., Adam, M. B., & Abd Rahman, N. H. (2021). Detection of influential observations in spatial regression model based on outliers and bad leverage classification. Symmetry, 13(11), 2030. https://doi.org/10.3390/sym13112030
Biau, G., & Scornet, E. (2016). A random forest guided tour. TEST, 25(2), 197–227. https://doi.org/10.1007/s11749-016-0481-7
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324
Bui, D. T., Hoang, N. D., & Nhu, V. H. (2019). A swarm intelligence-based machine learning approach for predicting soil shear strength for road construction: a case study at Trung Luong National Expressway Project (Vietnam). Engineering with Computers, 35, 955–965. https://doi.org/10.1007/s00366-018-0643-1
Cao, M. T., Hoang, N. D., Nhu, V. H., & Bui, D. T. (2020). An advanced meta-learner based on artificial electric field algorithm optimized stacking ensemble techniques for enhancing prediction accuracy of soil shear strength. Engineering with Computers, 38, 2185–2207. https://doi.org/10.1007/s00366-020-01116-6
Chen, W., Panahi, M., & Pourghasemi, H. R. (2017). Performance evaluation of GIS-based new ensemble data mining techniques of adaptive neuro-fuzzy inference system (ANFIS) with genetic algorithm (GA), differential evolution (DE), and particle swarm optimization (PSO) for landslide spatial modelling. CATENA, 157, 310–324. https://doi.org/10.1016/j.catena.2017.05.034
Chen, H., Asteris, P. G., Armaghani, D. J., Gordan, B., & Pham, B. T. (2019). Assessing dynamic conditions of the retaining wall: developing two hybrid intelligent models. Applied Sciences, 9(6), 1042. https://doi.org/10.3390/app9061042
Chen, W., Qu, S., Lin, L., Luo, Q., & Wang, T. (2023). Ensemble learning methods for shear strength prediction of fly ash-amended soils with lignin reinforcement. Journal of Materials in Civil Engineering, 35(4), 04023022. https://doi.org/10.1061/(ASCE)MT.1943-5533.0004676
Chen, T., & Guestrin, C. (2016). XGBoost: a scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785–794). ACM. https://doi.org/10.1145/2939672.2939785
Ching, J., & Phoon, K. K. (2018). Constructing site-specific multivariate probability distribution model using bayesian machine learning. Journal of Engineering Mechanics, 145(1), 04018126. https://doi.org/10.1061/(ASCE)EM.1943-7889.0001537
Chongzhi, W., Lin, W., & Zhang, W. (2021). Assessment of undrained shear strength using ensemble learning based on Bayesian hyperparameter optimization. Modeling in Geotechnical Engineering. https://doi.org/10.1016/B978-0-12-821205-9.00014-9
Choobbasti, A. J., Farrokhzad, F., & Barari, A. (2009). Prediction of slope stability using artificial neural network (Case study: Noabad, Mazandaran, Iran). Arabian Journal of Geosciences, 2, 311–319. https://doi.org/10.1007/s12517-009-0035-3
D’Ignazio, M., Phoon, K. K., Tan, S. A., & Lansivaara, T. (2016). Correlations for undrained shear strength of Finnish soft clays. Canadian Geotechnical Journal, 53(10), 1628–1645. https://doi.org/10.1139/cgj-2016-0037
Eyo, E. U., Abbey, S. J., Lawrence, T. T., & Tetteh, F. K. (2021). Improved prediction of clay soil expansion using machine learning algorithms and meta-heuristic dichotomous ensemble classifiers. Geoscience Frontiers, 13(1), 101296. https://doi.org/10.1016/j.gsf.2021.101296
Fan, J., Ma, X., Wu, L., Zhang, F., Yu, X., & Zeng, W. (2019). Light gradient boosting machine: an efficient soft computing model for estimating daily reference evapotranspiration with local and external meteorological data. Agricultural Water Management, 225, 105758. https://doi.org/10.1016/j.agwat.2019.105758
Farrokhzad, F., & Choobbasti, A.J. (2010). Artificial neural network model for prediction of liquefaction potential in soil deposits. Proceedings, Fifth International Conference on Recent Advances in Geotechnical Earthquake Engineering and Soil Dynamics, San Diego, CA, US.
Friedman, J. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29(5), 1189–1232. https://doi.org/10.1214/aos/1013203451
Gao, W., Wu, H., Siddiqui, M. K., & Baig, A. Q. (2018). Study of biological networks using graph theory. Saudi Journal of Biological Sciences, 25(6), 1212–1219. https://doi.org/10.1016/j.sjbs.2017.11.022
Gong, M., Bai, Y., Qin, J., Wang, J., Yang, P., & Wang, S. (2020). Gradient boosting machine for predicting return temperature of district heating system: a case study for residential buildings in Tianjin. Journal of Building Engineering, 27, 100950. https://doi.org/10.1016/j.jobe.2019.100950
Guo, C. Y., & Chang, K. H. (2022). A novel algorithm to estimate the significance level of a feature interaction using the extreme gradient boosting machine. International Journal of Environmental Research and Public Health, 19(4), 2338. https://doi.org/10.3390/ijerph19042338
Han, T., Siddique, A., Khayat, K., Huang, J., & Kumar, A. (2020). An ensemble machine learning approach for prediction and optimization of modulus of elasticity of recycled aggregate concrete. Construction and Building Materials, 244, 118271. https://doi.org/10.1016/j.conbuildmat.2020.118271
Huang, L., Asteris, P. G., Koopialipoor, M., Armaghani, D. J., & Tahir, M. M. (2019). Invasive weed optimization technique-based ANN to the prediction of rock tensile strength. Applied Sciences, 9(24), 5372. https://doi.org/10.3390/app9245372
Jin, X., Li, S., Zhang, W., Zhu, J., & Sun, J. (2020). Prediction of soil-available potassium content with visible near-infrared ray spectroscopy of different pretreatment transformations by the boosting algorithms. Applied Sciences, 10, 1520. https://doi.org/10.3390/app10041520
Karunanithi, N., Grenney, W. J., Whitley, D., & Bovee, K. (1994). Neural networks for river flow prediction. Journal of Computing in Civil Engineering, 8(2), 201–220. https://doi.org/10.1061/(ASCE)0887-3801(1994)8:2(201)
Kasabov, N. K. (1998). Foundations of neural networks fuzzy systems and knowledge engineering. MIT Press.
Kaveh, A., & Iranmanesh, A. (1998). Comparative study of backpropagation and improved counterpropagation neural nets in structural analysis and optimization. International Journal of Space Structures, 13(4), 177–185. https://doi.org/10.1177/026635119801300401
Kaveh, A., & Khalegi, H. A. (2000). Prediction of strength for concrete specimens using artificial neural network. Asian Journal of Civil Engineering, 2(2), 1–12.
Kaveh, A., & Servati, H. (2001). Design of double layer grids using backpropagation neural networks. Computers and Structures, 79(17), 1561–1568. https://doi.org/10.1016/S0045-7949(01)00034-7
Kaveh, A., Gholipour, Y., & Rahami, H. (2008). Optimal design of transmission towers using genetic algorithm and neural networks. International Journal of Space Structures, 23(1), 1–19. https://doi.org/10.1260/026635108785342073
Khan, N., Kamaruddin, M. A., Sheikh, U., Zawawi, M. H., Yusup, Y., Bakht, M. P., & Noor, M. (2022). Prediction of oil palm yield using machine learning in the perspective of fluctuating weather and soil moisture conditions: evaluation of a generic workflow. Plants, 11(13), 1697. https://doi.org/10.3390/plants11131697
Kiran, S., Lal, B., & Tripathy, S. (2016). Shear strength prediction of soil based on probabilistic neural network. Indian Journal of Science and Technology, 9(41), 1–6. https://doi.org/10.17485/ijst/2016/v9i41/99188
Lessmann, S., Baesens, B., Seow, H. V., & Thomas, L. C. (2015). Benchmarking state-of-the-art classification algorithms for credit scoring: an update of research. European Journal of Operational Research, 247(1), 124–136. https://doi.org/10.1016/j.ejor.2015.05.030
Liang, W., Luo, S., Zhao, G., & Wu, H. (2020). Predicting hard rock pillar stability using GBDT, XGBoost, and LightGBM algorithms. Mathematics, 8(5), 765. https://doi.org/10.3390/math8050765
Lin, S., Zheng, H., Han, C., Han, B., & Li, W. (2021). Evaluation and prediction of slope stability using machine learning approaches. Frontiers of Structural and Civil Engineering, 15(4), 821–833. https://doi.org/10.1007/s11709-021-0742-8
Liu, W. F., Leung, Y. F., & Lo, M. K. (2017). Integrated framework for characterization of spatial variability of geological profiles. Canadian Geotechnical Journal, 54, 47–58. https://doi.org/10.1139/cgj-2016-0189
Ly, H.-B., Nguyen, T.-A., & Pham, B. T. (2021). Estimation of soil cohesion using machine learning method: a random forest approach. Advances in Civil Engineering, 8873993, 1–14. https://doi.org/10.1155/2021/8873993
Moayedi, H., Gör, M., Khari, M., Foong, L. K., Bahiraei, M., & Bui, D. T. (2020). Hybridizing four wise neural metaheuristic paradigms in predicting soil shear strength. Measurement, 156, 107576. https://doi.org/10.1016/j.measurement.2020.107576
Mohammadzadeh, D., Bazaz, J. B., & Alavi, A. H. (2014). An evolutionary computational approach for formulation of compression index of fine-grained soils. Engineering Applications of Artificial Intelligence, 33, 58–68. https://doi.org/10.1016/j.engappai.2014.03.012
Mollahasani, A., Alavi, A. H., Gandomi, A. H., & Rashed, A. (2011). Nonlinear neural-based modeling of soil cohesion intercept. KSCE Journal of Civil Engineering, 15(5), 831–840. https://doi.org/10.1007/s12205-011-1154-4
Mousavi, S. M., Alavi, A. H., Gandomi, A. H., & Mollahasani, A. (2012). Nonlinear genetic-based simulation of soil shear strength parameters. Journal of Earth System Science, 120, 1001–1022. https://doi.org/10.1007/s12040-011-0119-9
Nanni, L., & Lumini, A. (2009). An experimental comparison of ensemble of classifiers for bankruptcy prediction and credit scoring. Expert Systems with Applications, 36(2), 3028–3033. https://doi.org/10.1016/j.eswa.2008.01.018
Nhu, V. H., Hoang, N. D., Duong, V. B., Vu, H. D., & Bui, D. T. (2020). A hybrid computational intelligence approach for predicting soil shear strength for urban housing construction: a case study at Vinhomes Imperia project, Hai Phong city (Vietnam). Engineering with Computers, 36, 603–616. https://doi.org/10.1007/s00366-019-00718-z
Panigrahi, R., Kuanar, S. K., & Kumar, L. (2021). An empirical study for method-level refactoring prediction by ensemble technique and SMOTE to improve its efficiency. International Journal of Open Source Software and Processes. https://doi.org/10.4018/IJOSSP.287612
Papadopoulos, S., Azar, E., Woon, W.-L., & Kontokosta, C. E. (2017). Evaluation of tree-based ensemble learning algorithms for building energy performance estimation. Journal of Building Performance Simulation, 11(3), 322–332. https://doi.org/10.1080/19401493.2017.1354919
Pham, B. T., Qi, C., Ho, L. S., Nguyen-Thoi, T., Al-Ansari, N., Nguyen, M. D., Nguyen, H. D., Ly, H.-B., Le, H. V., & Prakash, I. (2020a). A novel hybrid soft computing model using random forest and particle swarm optimization for estimation of undrained shear strength of soil. Sustainability, 12(06), 2218. https://doi.org/10.3390/su12062218
Pham, T. A., Ly, H.-B., Tran, V. Q., Giap, L. V., Vu, H.-L.T., & Duong, H.-A.T. (2020b). Prediction of pile axial bearing capacity using artificial neural network and random forest. Applied Sciences, 10(05), 1871. https://doi.org/10.3390/app10051871
Pham, B. T., Nguyen-Thoi, T., Ly, H.-B., Nguyen, M. D., Al-Ansari, N., Tran, V.-Q., & Le, T.-T. (2020c). Extreme learning machine based prediction of soil shear strength: a sensitivity analysis using Monte Carlo simulations and feature backward elimination. Sustainability, 12(06), 2339. https://doi.org/10.3390/su12062339
Rabbani, A., Samui, P., & Kumari, S. (2022). A novel hybrid model of augmented grey wolf optimizer and artificial neural network for predicting shear strength of soil. Modeling Earth Systems and Environment. https://doi.org/10.1007/s40808-022-01610-4
Raschka, S., & Mirjalili, V. (2019). Python machine learning (3rd ed.). Packt Publishing Ltd.
Rathakrishnan, V., Beddu, B. T., & Ahmed, A. N. (2022). Predicting compressive strength of high-performance concrete with high volume ground granulated blast-furnace slag replacement using boosting machine learning algorithms. Scientific Reports, 12, 9539. https://doi.org/10.1038/s41598-022-12890-2
Refaeilzadeh, P., Tang, L., & Liu, H. (2009). Cross-validation. In M. T. Özsu (Ed.), Encyclopedia of database systems. (pp. 532–538). Boston, MA: Springer.
Rodriguez, J. D., Perez, A., & Lozano, J. A. (2009). Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3), 569–575. https://doi.org/10.1109/TPAMI.2009.187
Salmasi, F., Nouri, M., Sihag, P., & Abraham, J. (2021). Application of SVM, ANN, GRNN, RF, GP and RT models for predicting discharge coefficients of oblique sluice gates using experimental data. Water Supply, 21(1), 232–248. https://doi.org/10.2166/ws.2020.226
Savargiv, M., Masoumi, B., & Keyvanpour, M. R. (2021). A new random forest algorithm based on learning automata. Computational Intelligence and Neuroscience, 2021, 5572781. https://doi.org/10.1155/2021/5572781
Seireg, H. R., Omar, Y. M. K., El-Samie, F. E. A., El-Fishawy, A. S., & Elmahalawy, A. (2022). Ensemble machine learning techniques using computer simulation data for wild blueberry yield prediction. IEEE Access, 10, 64671–64687. https://doi.org/10.1109/ACCESS.2022.3181970
Shamshirband, S., Nodoushan, E. J., Adolf, J. E., Manaf, A. A., Mosavi, A., & Chau, K.-W. (2019). Ensemble models with uncertainty analysis for multi-day ahead forecasting of chlorophyll a concentration in coastal waters. Engineering Applications of Computational Fluid Mechanics, 13(1), 91–101. https://doi.org/10.1080/19942060.2018.1553742
Sharafati, A., Asadollah, S. B. H. S., & Hosseinzadeh, M. (2020). The potential of new ensemble machine learning models for effluent quality parameters prediction and related uncertainty. Process Safety and Environmental Protection, 140, 68–78. https://doi.org/10.1016/j.psep.2020.04.045
Smith, G. N. (1986). Probability and statistics in civil engineering. Collins.
Song, Y., Zhou, H., Wang, P., & Yang, M. (2019). Prediction of clathrate hydrate phase equilibria using gradient boosted regression trees and deep neural networks. The Journal of Chemical Thermodynamics, 135, 86–96. https://doi.org/10.1016/j.jct.2019.03.030
Vanapalli, S. K., & Fredlund, D. G. (2000). Comparison of different procedures to predict unsaturated soil shear strength. Advances in Unsaturated Geotechnics, 99, 195–209. https://doi.org/10.1061/40510(287)13
Xia, Y., Liu, C., Li, Y., & Liu, N. (2017). A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring. Expert Systems with Applications, 78, 225–241. https://doi.org/10.1016/j.eswa.2017.02.017
Zhang, C. B., Chen, L. H., Liu, Y. P., Ji, X. D., & Liu, X. P. (2010). Triaxial compression test of soil–root composites to evaluate influence of roots on soil shear strength. Ecological Engineering, 36, 19–26. https://doi.org/10.1016/j.ecoleng.2009.09.005
Zhang, W., Wu, C., Li, Y., Wang, L., & Samui, P. (2019). Assessment of pile drivability using random forest regression and multivariate adaptive regression splines. Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards. https://doi.org/10.1080/17499518.2019.1674340
Zhang, W., Wu, C., Zhong, H., Li, Y., & Wang, L. (2020a). Prediction of undrained shear strength using extreme gradient boosting and random forest based on Bayesian optimization. Geoscience Frontiers, 12(1), 469–477. https://doi.org/10.1016/j.gsf.2020.03.007
Zhang, R. H., Wu, C. Z., Goh, A. T. C., Thomas, B., & Zhang, W. G. (2020b). Estimation of diaphragm wall deflections for deep braced excavation in anisotropic clays using ensemble learning. Geoscience Frontiers, 12(1), 365–373. https://doi.org/10.1016/j.gsf.2020.03.003
Zhang, W., Li, H., Han, L., Chen, L., & Wang, L. (2022). Slope stability prediction using ensemble learning techniques: a case study in Yunyang County, Chongqing, China. Journal of Rock Mechanics and Geotechnical Engineering, 14(4), 1089–1099. https://doi.org/10.1016/j.jrmge.2021.12.011
Zhou, J., Shi, X., Du, K., Qiu, X., Li, X., & Mitri, H. S. (2016). Feasibility of random-forest approach for prediction of ground settlements induced by the construction of a shield-driven tunnel. International Journal of Geomechanics, 17(6), 04016129. https://doi.org/10.1061/(ASCE)GM.1943-5622.0000817
Zhou, J., Li, E., Wei, H., Li, C., Qiao, Q., & Armaghani, D. J. (2019). Random forests and cubist algorithms for predicting shear strengths of rockfill materials. Applied Sciences, 9(08), 1621. https://doi.org/10.3390/app9081621
Zhou, J., Qiu, Y., Khandelwal, M., Zhu, S., & Zhang, X. (2021). Developing a hybrid model of Jaya algorithm-based extreme gradient boosting machine to estimate blast-induced ground vibrations. International Journal of Rock Mechanics and Mining Sciences, 145, 104856. https://doi.org/10.1016/j.ijrmms.2021.104856
Funding
There is no funding associated with this research article.
Author information
Contributions
AR: conceptualization and formulation, analysis and investigation, drafting of the paper. PS: supervision, final correction, assistance in preparing the draft. SK: supervision, final correction, assistance in preparing the draft. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Consent to participate
Not applicable.
Consent to publish
All authors consent to the publication of the manuscript.
About this article
Cite this article
Rabbani, A., Samui, P. & Kumari, S. Implementing ensemble learning models for the prediction of shear strength of soil. Asian J Civ Eng 24, 2103–2119 (2023). https://doi.org/10.1007/s42107-023-00629-x
DOI: https://doi.org/10.1007/s42107-023-00629-x