Abstract
This research uses relevance vector machine (RVM) models to predict the unconfined compressive strength (UCS) of cohesive virgin (fine-grained) soil. Linear, polynomial, Gaussian, and Laplacian kernel functions have been implemented in the RVM models. Two types of RVM models have been developed: (i) single kernel function-based (denoted SRVM) and (ii) dual kernel function-based (denoted DRVM). Each model has been optimized with both the genetic algorithm (GA) and the particle swarm optimization (PSO) algorithm. Eighty-five data points (75 for training and 10 for testing) have been collected from the literature to train and test the SRVM and DRVM models. The data proportionality method has been used to create six training databases (50%, 60%, 70%, 80%, 90%, and 100%) to determine the effect of the quality and quantity of the training database on the performance, accuracy, and overfitting of the soft computing models. Model performance has been measured with ten conventional and three new performance parameters, the new ones being the a20 index, the index of agreement (IOA), and the index of scatter (IOS). The research concludes that (i) a strongly correlated pair of data points affects the performance and accuracy of the model; (ii) the GA-optimized SRVM model MD119 has outperformed the other SRVM and DRVM models with a20 = 100, IOA = 0.9947, and IOS = 0.0272; (iii) the k-fold cross-validation test (k = 10) validates the capabilities of the SRVM and DRVM models; (iv) model MD119 has predicted UCS better than the GPR model MD11 developed in part 1 of this research; (v) highly correlated data points increase the overfitting of the model; and (vi) model MD119 has predicted the UCS of lab-tested soil with a confidence interval of ± 4.0%.
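The abstract names the a20 index, IOA, and IOS but does not give their formulas. The following is a minimal sketch that assumes commonly used definitions: a20 as the percentage of samples whose predicted-to-observed ratio lies between 0.8 and 1.2, IOA as Willmott's index of agreement, and IOS as RMSE divided by the mean observed value. The function names and the sample values are hypothetical and are not taken from the article, whose exact definitions may differ.

```python
import math


def a20_index(observed, predicted):
    """Percentage of samples with predicted/observed ratio in [0.8, 1.2] (assumed definition)."""
    hits = sum(1 for o, p in zip(observed, predicted) if 0.8 <= p / o <= 1.2)
    return 100.0 * hits / len(observed)


def index_of_agreement(observed, predicted):
    """Willmott's index of agreement (assumed definition); 1.0 means perfect agreement."""
    mean_o = sum(observed) / len(observed)
    num = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2
              for o, p in zip(observed, predicted))
    return 1.0 - num / den


def index_of_scatter(observed, predicted):
    """RMSE normalised by the mean observed value (assumed definition); 0.0 is ideal."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmse / (sum(observed) / n)


# Hypothetical UCS values (kPa), for illustration only.
ucs_observed = [100.0, 200.0, 300.0, 400.0]
ucs_predicted = [110.0, 190.0, 300.0, 500.0]

print(a20_index(ucs_observed, ucs_predicted))
print(index_of_agreement(ucs_observed, ucs_predicted))
print(index_of_scatter(ucs_observed, ucs_predicted))
```

Under these assumed definitions, a model is better when a20 approaches 100, IOA approaches 1, and IOS approaches 0, which is consistent with the values reported for model MD119.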
Data availability
All data, models, and code generated or used during the study appear in the submitted article. The database may be provided on request.
Abbreviations
- a20: A20-index
- a20A: A20-index accuracy
- AD: Anderson–Darling test
- AI: Artificial intelligence
- ANN: Artificial neural networks
- ANOVA: Analysis of variance
- AOC: Area over the curve
- BWOA: Black widow optimization algorithm
- Coeff.: Coefficient
- CSAFR: Ratio of free lime to SAF
- CV: Cross-validation model
- df: Degrees of freedom
- DMR: Maximum dry density to the optimum moisture content
- DRVM: Dual kernel function-based RVM
- DRVM-GA: GA-optimized DRVM
- DRVM-PSO: PSO-optimized DRVM
- DS: Degree of saturation
- DST: Direct shear test
- DUW: Dry unit weight
- F: F statistic value
- F crit: F critical value
- FC: Fine content
- GA: Genetic algorithm
- GAU: Gaussian kernel
- Gauss-GA: GA-optimized Gaussian
- Gauss-PSO: PSO-optimized Gaussian
- GPR: Gaussian process regression
- G-SRVM: Gaussian kernel function-based SRVM
- ICA: Imperialist competitive algorithm
- IOA: Index of agreement
- IOAA: Index of agreement accuracy
- IOS: Index of scatter
- IOSA: Index of scatter accuracy
- J: Number of models
- LAP: Laplacian kernel
- Lap-GA: GA-optimized Laplacian SRVM
- Lap-PSO: PSO-optimized Laplacian SRVM
- Lap-SRVM: Laplacian kernel function-based SRVM
- lb: Lower boundary
- LB: Lower bound
- LIN: Linear kernel
- Lin-GA: GA-optimized linear SRVM
- Lin-PSO: PSO-optimized linear SRVM
- LMI: Legates and McCabe's index
- L-SRVM: Linear kernel function-based SRVM
- MAE: Mean absolute error
- MAEA: Mean absolute error accuracy
- MAPE: Mean absolute percentage error
- MAPEA: Mean absolute percentage error accuracy
- MD: Model
- ME: Margin of error
- MOE: Mean of error
- MRI: Magnetic resonance imaging
- MS: Mean square
- MVO: Multi-verse optimization algorithm
- NMBE: Normalized mean bias error
- NS: Nash–Sutcliffe efficiency
- σ3: Confining pressure
- σd: Deviatoric stress
- P: Porosity
- POLY: Polynomial kernel
- Poly-GA: GA-optimized polynomial SRVM
- Poly-PSO: PSO-optimized polynomial SRVM
- PSO: Particle swarm optimization algorithm
- PSO-XGBoost: PSO-extreme gradient boosting
- P-SRVM: Polynomial kernel function-based SRVM
- P-value: Calculated significant value
- r, R: Correlation coefficient
- R²: Coefficient of determination
- RA: Correlation coefficient accuracy
- REC: Regression error characteristics curve
- RMSE: Root mean square error
- RMSEA: Root mean square error accuracy
- ROC: Receiver operating characteristic curve
- RSR: Root mean square error to observations' standard deviation ratio
- RSRA: Root mean square error to observations' standard deviation ratio accuracy
- RVM: Relevance vector machine
- SCA: Sine cosine algorithm
- SE: Standard error
- SG: Specific gravity
- SRVM: Single kernel function-based RVM
- SS: Sum of squares
- SSO: Social spider optimization algorithm
- Std Error: Standard error
- StDev: Standard deviation
- t state: t statistic
- TCS: Triaxial compressive strength
- UA95: Uncertainty analysis
- ub: Upper boundary
- UB: Upper bound
- UCS: Unconfined compressive strength
- VAF: Variance accounted for
- VAFA: Variance accounted for accuracy
- VIF: Variance inflation factor
- VR: Void ratio
- VST: Vane shear test
- WBC: Width of confidence bound
- WDC: Number of W-D cycles
- WMAPE: Weighted mean absolute percentage error
- WMAPEA: Weighted mean absolute percentage error accuracy
Funding
No funding was received in assisting the preparation of this manuscript.
Author information
Contributions
JK: main author, conceptualization, literature review, manuscript preparation, application of AI models, methodological development, statistical analysis, detailing, and overall analysis; KSG: conceptualization, overall analysis, manuscript finalization, detailed review, and editing.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Khatti, J., Grover, K.S. Prediction of UCS of fine-grained soil based on machine learning part 2: comparison between hybrid relevance vector machine and Gaussian process regression. Multiscale and Multidiscip. Model. Exp. and Des. 7, 123–163 (2024). https://doi.org/10.1007/s41939-023-00191-8
DOI: https://doi.org/10.1007/s41939-023-00191-8