1 Introduction

Road transportation networks facilitate the transfer of goods and provide door-to-door passenger services throughout the world. Road transport infrastructure therefore plays a major role in a country's economy. In this context, many new expressways and green highways are being constructed in India by the Ministry of Road Transport and Highways (MoRTH) through various infrastructure development plans. The pavement thickness design of these roads is based on the strength of the material used in the subgrade and subbase layers. Highway engineers therefore require that the subgrade material satisfy several engineering and technical criteria, such as swell behavior, plasticity properties, settlement conditions, modulus of subgrade reaction and bearing capacity. A method for determining the strength of such layers is thus of utmost importance in highway engineering.

In general, the California bearing ratio (CBR) test is adopted to measure the stiffness modulus and shear strength of subgrade material [1, 2]; it may be performed on re-compacted samples in the laboratory, on undisturbed samples cut from the field, or in situ on the surface of the subgrade formation [3]. The test is an indirect measure that compares the strength of the subgrade material (at known density and moisture content) with that of standard crushed rock [4, 5]. Both laboratory and in situ tests are based on the principle of penetrating a plunger of standard dimensions into a soil specimen at a deformation rate of 1.25 mm/min. Laboratory and field engineers encounter several difficulties in obtaining the CBR value in the laboratory. The laboratory soaked CBR test requires a large amount of material (almost 6 kg), considerable effort to prepare the test specimen and, lastly, a 96 h soaking period to simulate field conditions. These requirements make the CBR test tedious, laborious and time-consuming. Moreover, if the soil properties change over each small stretch of highway, collecting and storing such large quantities of soil and testing each of them in the laboratory becomes impractical. Laboratories are also often congested owing to long queues of material testing, which delays the tests and their reports and, ultimately, the design of construction projects. Furthermore, the test involves material transportation costs (from construction site to testing laboratory), testing charges and, finally, the disposal of tested material, all of which add to the final cost of the project.

Owing to the aforementioned problems, many researchers have considered that the CBR test needs to be replaced, either partially or entirely. Although CBR is not a fundamental material property, it has a long history in pavement design, and several investigators have correlated it with the index and engineering properties of soil. To the authors' knowledge, the first notable attempt to predict the CBR value was made by Kleyn [6], who first addressed the discrepancies in the CBR test and later prepared a chart, based on a nest of straight lines relating CBR to PI and the grading modulus, from over 1000 soaked CBR tests obtained from road and airport works throughout central and southern Africa. Black [7] suggested that the relationship between CBR and ultimate bearing capacity depends on the soil type and the compaction method, i.e., static or dynamic. Agarwal and Ghanekar [8] attempted to generate correlation equations, through statistical analysis, between CBR and the Atterberg limits for 48 soil samples collected from different parts of India; they could not find any significant correlation between these parameters, but when LL and OMC were added, they observed an improved correlation with adequate accuracy for the preliminary identification of materials. The National Cooperative Highway Research Program [9] developed a correlation equation for CBR from the index properties of clean, coarse-grained soil. Kin [10] developed correlation equations for the CBR value of fine-grained and coarse-grained soils from gradational properties. Taskiran [11] established correlations for 151 CBR test data of fine-grained soils, taken from 354 test samples, using ANN and gene expression programming (GEP); both techniques exhibited promising results. Using 124 datasets, Yildirim and Gunaydin [12] studied the estimation of CBR by regression and ANN approaches and observed that the ANN technique outperforms regression analysis. Erzin and Turkoz [13] predicted the CBR value of Aegean sands from mineralogical properties through ANN and regression approaches and likewise found ANN superior to regression. Farias, Araujo [14] used local polynomial regression (LPR) and radial basis network (RBN) techniques to develop predictive equations for the CBR of soil samples. Using 207 CBR test results of granular soils, Taha, Gabr [15] observed that correlations obtained through ANN show excellent accuracy and lower bias than regression analysis. A comparative study conducted by Tenpe and Patel [16] on 389 datasets collected from the City and Industrial Development Corporation, Maharashtra state, India, revealed that ANN and GEP are efficient in predicting the CBR value. In a later study, Tenpe and Patel [17] found that SVM predicts the CBR value better than GEP. Recently, Bardhan, Samui [18] predicted the soaked CBR value of 312 soil datasets through a particle swarm optimization (PSO) algorithm with adaptive and time-varying acceleration coefficients. Their comparative analysis of extreme learning machine (ELM)-based adaptive neuro swarm intelligence (ANSI) models, namely ELM coupled with modified PSO (ELM-MPSO), with time-varying acceleration coefficients PSO (ELM-TPSO) and with improved PSO (ELM-IPSO), revealed that the modified and improved versions of PSO achieve higher accuracy in early iterations than standard PSO.
In another investigation, Bardhan, Gokceoglu [19] observed that multivariate adaptive regression splines with piecewise-linear basis functions (MARS-L) demonstrate higher accuracy in predicting the soaked CBR than MARS with piecewise-cubic basis functions (MARS-C), Gaussian process regression and genetic programming. Hassan, Alshameri [20] predicted the CBR value of plastic fine-grained soils from their index properties and compaction parameters through multi-linear regression analysis (MLR); however, their study used the standard Proctor compaction energy level, whereas engineers prefer the modified Proctor compactive energy level for constructing highways and expressways. The literature reviewed above (also summarized in Table 1) shows that several artificial intelligence (AI)-based models have been used to predict the soaked CBR value, with accuracies ranging from 80 to 100% (R2 values of 0.8 to 1.0). However, several advanced computational approaches that have proven their competency in solving many civil engineering problems remain unexplored for this task. The literature studies also omitted statistical analysis of the obtained model results. A closer look at these studies further reveals that the range of geotechnical parameters and the size of the datasets are limited, whereas using a large dataset is always considered worthwhile from a generalization point of view [18, 19, 21,22,23,24].

Table 1 Brief overview of the literature study attempted to predict the CBR value of various soil types

1.1 Research Significance and Contributions

The main contribution of this study is the development of an efficient model for one of the challenging real-world problems faced by highway engineers, i.e., the estimation of the soil California bearing ratio. For this purpose, Kernel Ridge Regression (KRR), K-Nearest Neighbor (K-NN) and Gaussian Process Regression (GPR) algorithms were adopted. The study utilizes 1011 in situ samples of fine-grained plastic soils covering an extensive range of index and engineering properties. Numerous geotechnical parameters were extracted from laboratory experiments conducted according to Bureau of Indian Standards (BIS) specifications. Two data division approaches, viz. K-fold and FCM, were adopted to investigate the influence of the training data features on the predictive ability of the developed models. Furthermore, the influence of the employed machine learning algorithms, as well as of the data division approaches, on the predictive ability of the models was investigated. Lastly, models from the literature were validated against the present study's datasets.

2 Machine Learning Algorithms and Statistical Assessment Indices

The term “Machine Learning (ML)” denotes a subfield of Artificial Intelligence (AI) also referred to as predictive analytics or predictive modeling. ML is the development of computer systems that can learn and adapt without following explicit instructions, using algorithms and statistical models to analyze and draw inferences from patterns in data. In the recent past, numerous ML algorithms have been adopted by researchers to solve significant engineering problems [29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49]. This section briefly introduces the ML algorithms used here to develop models for predicting the soaked CBR value of fine-grained plastic soils, together with the indices used to measure their performance.

2.1 Applied ML Algorithms

2.1.1 Kernel Ridge Regression (KRR)

KRR is a nonlinear regression approach based on the “kernel trick,” in which the data are nonlinearly mapped into some high-dimensional (or even infinite-dimensional) feature space determined by kernel functions satisfying Mercer's theorem [50,51,52]. Consider a TR set \(\left({x}_{1}, {y}_{1}\right), \left({x}_{2}, {y}_{2}\right),\dots ,({x}_{N}, {y}_{N})\), where \(N\) is the number of TR samples, \(X=[{x}_{1}, {x}_{2}, \dots ,{x}_{N}]\) is the \(N\times d\) feature matrix and \(Y\) is the \(N\times 1\) vector of target values.

The KRR algorithm is generally based on ridge regression and the Ordinary Least Squares (OLS) method, a type of Linear Least Squares (LLS) [51, 53, 54]. OLS minimizes the squared loss function:

$$\underset{\beta }{\mathrm{min}}{\Vert Y-X\beta \Vert }^{2}$$
(1)

where \(\Vert .\Vert\) indicates the \({L}_{2}\) norm. In order to control the trade-off between bias and variance of the estimate, a shrinkage or ridge parameter \(\lambda\) is added to the above expression which is represented below:

$$\underset{\beta }{\mathrm{min}}{\Vert Y-X\beta \Vert }^{2}+\lambda {\Vert \beta \Vert }^{2}$$
(2)

Using the “kernel trick,” KRR extends linear regression into a nonlinear and high-dimensional space. Each data point \({x}_{i}\) in \(X\) is replaced with its feature vector \({x}_{i}\to \phi \left({x}_{i}\right)\) induced by the kernel, where \({K}_{ij}=k\left({x}_{i}, {x}_{j}\right)=\phi {\left({x}_{i}\right)}^{T}\phi \left({x}_{j}\right)\). The predicted label of a new example \(x\) is then represented as:

$${Y}^{T}{\left(K+\lambda I\right)}^{-1}k$$
(3)

where \(k={\left({k}_{1}, {k}_{2},\dots , {k}_{N}\right)}^{T}\) with \({k}_{n}=k\left({x}_{n}, x\right)\) for \(n=1, 2, \dots , N\).

In KRR, the kernel function increases the computational power by mapping the data into a high-dimensional feature space in which they become linearly separable, thereby increasing the stability, accuracy and generalization for both regression and classification problems. Kernel functions are generally categorized into local and global functions: a local function, such as the Gaussian kernel, is influenced only by points within a neighborhood, whereas a global function, such as the polynomial kernel, allows data points far from each other to influence the kernel value. Other kernel functions, such as the RBF, sigmoid and Laplacian kernels, can also be employed.
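As a minimal, hedged sketch (an illustration, not the authors' implementation), the formulation of Eqs. (1)–(3) is available off the shelf in scikit-learn's KernelRidge class; the data below are random placeholders standing in for the soil parameters:

```python
# Sketch: kernel ridge regression with an RBF kernel (scikit-learn).
# The data are random placeholders for {S, FC, PL, PI, MDD, OMC} -> CBR.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.uniform(size=(100, 6))          # N x d feature matrix
y = rng.uniform(1.0, 13.2, size=100)    # soaked CBR targets (placeholder)

# alpha is the ridge parameter lambda of Eq. (2); gamma controls the RBF width.
model = make_pipeline(StandardScaler(),
                      KernelRidge(kernel="rbf", alpha=1.0, gamma=0.5))
model.fit(X, y)
print(model.predict(X[:5]))
```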

2.1.2 K-Nearest Neighbor (K-NN)

K-NN is a nonparametric supervised ML algorithm that uses the k most similar outputs from the TR dataset [55,56,57,58]. K-NN is also known as a lazy learner because, instead of making an immediate prediction, it stores the dataset and acts on it only when a prediction is requested, categorizing new records by their similarity to stored ones. For regression and classification problems, K-NN predicts new records based on their Euclidean distances and the mean, median or mode of the neighboring output variable [59,60,61,62].

A classic K-NN algorithm for a regression problem can follow the below-given steps [63]:

Step 1: Consider a vector \(X\) of m independent variables as predictors, in this study \(X = \{S, FC, PL, PI, MDD, OMC\}\), and a dependent variable \(Y\), i.e., \(Y = \{CBR\}\).

Step 2: Consider a TR set consisting of n vectors \({X}_{n}=\{{S}_{n}, F{C}_{n}, P{L}_{n}, P{I}_{n}, MD{D}_{n}, OM{C}_{n}\}\), each with an associated dependent variable \({Y}_{n}=\{CB{R}_{n}\}\).

Step 3: The distance between the predictor vector and each of the n training vectors is calculated. The Euclidean distance (Ed) is most commonly used (see Eq. 4) to determine which of the k outputs in the TR dataset are most similar to the new input. However, other distances could also be adopted depending on the dataset: the Manhattan distance, which sums the absolute differences between real vectors; the Hamming distance, which measures the distance between binary vectors; and the Minkowski distance, a generalization of the Euclidean and Manhattan distances [58, 64].

$${E}_{d}\left(x,y\right)=\sqrt{{\left({X}_{1}-{X}_{1n}\right)}^{2}+{\left({X}_{2}-{X}_{2n}\right)}^{2}+.\dots \dots +{\left({X}_{m}-{X}_{mn}\right)}^{2}}$$
(4)

Step 4: Select the k training vectors with the least distance to the predictor vector.

Step 5: Calculate the kernel function (Eq. 5) for each of the k selected training vectors:

$${f}_{k}\left({\Delta }_{k}\right)=\frac{\frac{1}{{\Delta }_{k}}}{{\sum }_{k=1}^{K}\frac{1}{{\Delta }_{k}}}$$
(5)

Step 6: The predicted dependent variable is calculated as

$$Y=\sum_{k=1}^{K}{f}_{k}\left({\Delta }_{k}\right)\times {Y}_{nk}$$
(6)

where \({Y}_{nk}\) is the dependent variable of each selected neighbor.
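The six steps above amount to inverse-distance-weighted averaging over the k nearest neighbors. A minimal NumPy sketch follows (illustrative only; the variable names mirror the notation above and the data are placeholders):

```python
# Sketch of Steps 3-6: K-NN regression with the inverse-distance kernel of Eq. (5).
import numpy as np

def knn_predict(X_train, y_train, x_new, k=5):
    # Step 3: Euclidean distance (Eq. 4) to every training vector.
    d = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    # Step 4: indices of the k nearest training vectors.
    idx = np.argsort(d)[:k]
    # Step 5: inverse-distance kernel weights (Eq. 5); epsilon avoids division by zero.
    w = 1.0 / (d[idx] + 1e-12)
    w /= w.sum()
    # Step 6: weighted average of the neighbors' CBR values (Eq. 6).
    return (w * y_train[idx]).sum()

rng = np.random.default_rng(0)
X_tr = rng.uniform(size=(50, 6))         # placeholder {S, FC, PL, PI, MDD, OMC}
y_tr = rng.uniform(1.0, 13.2, size=50)   # placeholder CBR
print(knn_predict(X_tr, y_tr, X_tr[0]))
```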

2.1.3 Gaussian Process Regression (GPR)

GPR is a probabilistic, nonparametric Bayesian approach for generalizing nonlinear and complex regression and classification problems [65, 66]. Many supervised ML algorithms learn exact function values from the dataset, whereas GPR infers a probability distribution over all admissible functions that could reasonably fit the data [67]. GPR handles nonlinear data efficiently through the use of kernel functions. A GPR model makes predictions by incorporating prior knowledge through covariance (kernel) functions and provides uncertainty measures over its predictions [66, 68,69,70]. The algorithm has recently received considerable attention from researchers owing to its ability to solve complex engineering problems across various disciplines [66, 67, 69, 71,72,73,74].

For a given TR data set \(S=\{\left({x}_{i}, {y}_{i}\right) |i=1, 2,\dots ,n\}\), the inputs \({x}_{i}\) are collected into the design matrix \(X\) and \(Y\in {R}^{n}\) is the vector of desired outputs. The main assumption of GPR is that the output can be modeled as follows:

$$y=f(x)+\varepsilon$$
(7)

where \(\varepsilon\) is the noise term. GPR assumes that \(\varepsilon\) follows a Gaussian distribution with a mean of 0 and a variance of \({\sigma }_{n}^{2}\):

$$\varepsilon \sim N(0, {\sigma }_{n}^{2})$$
(8)

In the GPR methodology, the n observations of interest \(y=\{{y}_{1}, {y}_{2}, \dots , {y}_{n}\}\) are considered a single point sampled from a multivariate Gaussian distribution, which can moreover be assumed to have zero mean. The covariance function \(k\left(x, {x}^{\prime}\right)\) dictates the relation of one observation to another.

For a given TR data set, the ultimate goal of the learning process is to predict the output value \({y}_{*}\) for a new queried input pattern \({x}_{*}\). To achieve this, three covariance matrices are established as follows:

$$K=\left[\begin{array}{cccc}k\left({x}_{1}, {x}_{1}\right)& k\left({x}_{1}, {x}_{2}\right)& \dots & k\left({x}_{1}, {x}_{n}\right)\\ k\left({x}_{2}, {x}_{1}\right)& k\left({x}_{2}, {x}_{2}\right)& \dots & k\left({x}_{2}, {x}_{n}\right)\\ \vdots & \vdots & \ddots & \vdots \\ k\left({x}_{n}, {x}_{1}\right)& k\left({x}_{n}, {x}_{2}\right)& \dots & k\left({x}_{n}, {x}_{n}\right)\end{array}\right]$$
(9)
$${K}_{*}=\left[\begin{array}{cccc}k\left({x}_{*}, {x}_{1}\right)& k\left({x}_{*}, {x}_{2}\right)& \dots & k\left({x}_{*}, {x}_{n}\right)\end{array}\right]$$
(10)
$${K}_{**}=\left[k\left({x}_{*}, {x}_{*}\right)\right]$$
(11)

Due to the assumption that the data are sampled from a multivariate Gaussian distribution, we have the following expression:

$$\left[\begin{array}{c}y\\ {y}_{*}\end{array}\right] \sim N\left(0,\left[\begin{array}{cc}K& {K}_{*}^{T}\\ {K}_{*}& {K}_{**}\end{array}\right]\right)$$
(12)

Since \({y}_{*}|y\) also follows a multivariate Gaussian distribution, the mean and variance of the predicted output are given as:

$${\mu }_{*}={K}_{*}{K}^{-1}y$$
(13)
$${\sigma }_{*}^{2}={K}_{**}-{K}_{*}{K}^{-1}{K}_{*}^{T}$$
(14)

2.1.4 Hyperparameters Tuning Using a Grid Search

Several ML algorithms contain a set of parameters that control many aspects of the algorithm. These parameters, termed hyperparameters, are fixed before the learning process begins and are adjusted to enhance the quality of the model and its ability to correctly predict unseen data. In this study, the hyperparameters of the algorithms described above were optimized through a cross-validation (CV)-based grid search. In the grid search method, the dataset is divided into training and test sets using K-fold cross-validation [75], and the grid points are assigned for the CV calculation. Out of a total of five folds, one fold serves as an independent testing set and the remaining k-1 (four) folds are designated as training sets. The grid search method is considered superior to the random search method, in which only a few combinations are examined; its most significant advantage is that it provides better generalization performance for the respective model.

Initially, manifestly unreasonable values of the hyperparameters were excluded. After several trial-and-error runs, the candidate values for each hyperparameter were passed to the grid search method. The best-fit values obtained for the selected hyperparameters of the KRR, K-NN and GPR algorithms are given in Tables 2, 3 and 4.

Table 2 Combination of KRR algorithm hyperparameters
Table 3 Combination of K-NN algorithm hyperparameters
Table 4 Combination of GPR algorithm hyperparameters
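As an illustration of the CV-based grid search (the grids below are placeholders, not the tuned values reported in Tables 2–4):

```python
# Sketch: fivefold CV grid search over KRR hyperparameters with scikit-learn.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV, KFold

rng = np.random.default_rng(1)
X_train = rng.uniform(size=(200, 6))        # placeholder features
y_train = rng.uniform(1.0, 13.2, size=200)  # placeholder CBR targets

param_grid = {"alpha": [0.01, 0.1, 1.0, 10.0],
              "kernel": ["rbf", "laplacian", "polynomial"],
              "gamma": [0.01, 0.1, 1.0]}

cv = KFold(n_splits=5, shuffle=True, random_state=42)  # one fold tests, four train
search = GridSearchCV(KernelRidge(), param_grid, cv=cv,
                      scoring="neg_root_mean_squared_error")
search.fit(X_train, y_train)
print(search.best_params_)
```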

2.2 Statistical Performance Measurement Indices

The precision of all the models was assessed through several statistical performance indicators. The widely used indicators are the coefficient of determination (R2), adjusted R2 (adj. R2), coefficient of correlation (R), mean absolute error (MAE), mean absolute percentage error (MAPE), root-mean-square error (RMSE), variance accounted for (VAF), performance index (IP), Willmott's index of agreement (IOA), index of scattering (IOS), a20-index and performance strength (SP) [17, 76,77,78,79,80,81]. The mathematical expressions for these indicators are given in Eqs. (15) to (26) along with their ideal values. Employing a larger set of statistical indicators is useful for assessing predictive models from both error and trend points of view [19]. Relying on only a few parameters can be challenging, especially when comparing two or more models at once: the models often show little difference in a given index, and in such situations considering additional performance indices helps in selecting the best-fitted model.

The expressions and ideal values of these indicators are as follows:

$${R}^{2}=1-\frac{\sum_{i=1}^{N}{\left({y}_{i}(a)-{y}_{i}(p)\right)}^{2}}{\sum_{i=1}^{N}{\left({y}_{i}(a)-\overline{{y}_{i}(a)}\right)}^{2}}$$
(15) (ideal value: 1)

$$Adj.\,{R}^{2}=\left[1-\frac{N-1}{N-P-1}\left(1-{R}^{2}\right)\right]$$
(16) (ideal value: 1)

$$R=\frac{{\sum }_{i=1}^{N}\left({y}_{i}(a)-\overline{{y}_{i}(a)}\right)\left({y}_{i}(p)-\overline{{y}_{i}(p)}\right)}{\sqrt{{\sum }_{i=1}^{N}{\left({y}_{i}(a)-\overline{{y}_{i}(a)}\right)}^{2}{\sum }_{i=1}^{N}{\left({y}_{i}(p)-\overline{{y}_{i}(p)}\right)}^{2}}}$$
(17) (ideal value: 1)

$$MAE=\frac{1}{N}\sum_{i=1}^{N}\left|{y}_{i}(a)-{y}_{i}(p)\right|$$
(18) (ideal value: 0)

$$MAPE\,(\%)=\left[\frac{1}{N}\sum_{i=1}^{N}\left|\frac{{y}_{i}\left(p\right)-{y}_{i}(a)}{{y}_{i}(a)}\right|\right]\times 100$$
(19) (ideal value: 0)

$$RMSE=\sqrt{\frac{1}{N}\sum_{i=1}^{N}{\left({y}_{i}(a)-{y}_{i}(p)\right)}^{2}}$$
(20) (ideal value: 0)

$$VAF\,\left(\%\right)=\left[1-\frac{Var({y}_{i}(a)-{y}_{i}(p))}{Var({y}_{i}(a))}\right]\times 100$$
(21) (ideal value: 100)

$${I}_{P}=Adj.\,{R}^{2}+0.01\,VAF-RMSE$$
(22) (ideal value: 2)

$$IOA=1-\frac{\sum_{i=1}^{N}{\left({y}_{i}(a)-{y}_{i}(p)\right)}^{2}}{\sum_{i=1}^{N}{\left(\left|{y}_{i}(p)-\overline{{y}_{i}(a)}\right|+\left|{y}_{i}(a)-\overline{{y}_{i}(a)}\right|\right)}^{2}}$$
(23) (ideal value: 1)

$$IOS=\frac{\sqrt{\frac{1}{N}\sum_{i=1}^{N}{\left({y}_{i}(a)-{y}_{i}(p)\right)}^{2}}}{\overline{{y}_{i}(p)}}$$
(24) (ideal value: 0)

$$a20\text{-index}=\frac{n20}{N}\times 100$$
(25) (ideal value: 100)

$${S}_{P}=\frac{{\left(Adj.\,{R}^{2}\right)}_{total}+{\left(0.01\,VAF\right)}_{total}-{\left(RMSE\right)}_{total}}{{\left(\frac{Adj.\,{R}^{2}}{{R}^{2}}\right)}_{training}+{\left(\frac{Adj.\,{R}^{2}}{{R}^{2}}\right)}_{testing}}$$
(26) (ideal value: 1)

Fig. 1

Flowchart for predicting the soaked CBR of fine-grained plastic soils

where \({y}_{i}(a)\) = actual (laboratory-obtained) value; \({y}_{i}(p)\) = predicted value (obtained through the developed model); \(\overline{{y}_{i}(a)}\) = mean of the actual values; \(\overline{{y}_{i}(p)}\) = mean of the predicted values; \(N\) = number of observations; \(P\) = number of input parameters used to develop the model; \(n20\) = number of observations lying within the error range of ± 20%; and \(a20\) = percentage of observations having error ≤ 20%. The overall modeling workflow is shown in Fig. 1.
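Several of these indices reduce to one-liners; the sketch below is illustrative, with y_a and y_p standing for the actual and predicted arrays in the notation above:

```python
# Sketch of selected indices from Eqs. (15) and (18)-(21), (25).
import numpy as np

def performance_indices(y_a, y_p):
    err = y_a - y_p
    return {
        "R2":   1 - (err ** 2).sum() / ((y_a - y_a.mean()) ** 2).sum(),  # Eq. (15)
        "MAE":  np.abs(err).mean(),                                      # Eq. (18)
        "MAPE": np.abs(err / y_a).mean() * 100,                          # Eq. (19)
        "RMSE": np.sqrt((err ** 2).mean()),                              # Eq. (20)
        "VAF":  (1 - err.var() / y_a.var()) * 100,                       # Eq. (21)
        "a20":  (np.abs(err / y_a) <= 0.20).mean() * 100,                # Eq. (25)
    }
```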

3 Data Preparation and Analysis

3.1 Data Collection and Geographical Location

The Ministry of Road Transport and Highways (MoRTH) has decided to implement the Engineering, Procurement and Construction (EPC) mode for constructing National Highways (NH) in India. The Twelfth Five-Year Plan envisions the construction of 20,000 km of four-lane National Highway projects through the EPC mode in various states of India. In this study, the Varanasi-Gorakhpur section of NH-29, in the state of Uttar Pradesh (UP), India, was selected as the study area [42]. The geographical location of the study area is shown in Fig. 2.

Fig. 2

Geographical location of the study area

3.2 Laboratory Experiments

From the aforementioned project worksite, a total of 1011 soil samples were collected and brought to the laboratory for experimental investigation. The laboratory tests were conducted as per BIS specifications: IS 2720 (Part 4) [82], IS 2720 (Part 5) [83], IS 2720 (Part 8) [84] and IS 2720 (Part 16) [85] for the grain size distribution, Atterberg limits, modified Proctor compaction parameters and soaked CBR of fine-grained soil, respectively. From these tests, numerous geotechnical parameters were obtained, namely gravel content (G), sand content (S), silt and clay content termed fines content (FC), liquid limit (LL), plastic limit (PL), plasticity index (PI), maximum dry density (MDD), optimum moisture content (OMC) and the CBR value of the fine-grained soil. The resulting database was further classified into soil groups using the USCS soil classification system. Figure 3 presents a histogram of the different fine-grained soil groups; each vertical column represents the number of samples of a particular soil group present in the database.

Fig. 3

Histogram plot for the different soil groups of fine-grained soil

3.3 Statistical Visualization and Correlation Analysis

Table 5 presents descriptive statistics for all the fine-grained soil parameters. As seen from Table 5, the database covers an extensive range of CBR values, from 1.0 to 13.20%. The gravel content varies from 0 to 28%, sand ranges between 2 and 49%, and fines content lies in the range of 50 to 96%. Similarly, from the plasticity point of view, the database shows liquid limits from 24 to 85%, plastic limits from 11 to 50% and plasticity indices from 1 to 39%. The MDD and OMC of the fine-grained soil range from 1.455 to 1.959 g/cc and from 9 to 30%, respectively.

Table 5 Descriptive statistic details for all fine-grained soil parameters

Pearson's correlation coefficient (R) is one of the most commonly used measures of association between parameters. R varies from − 1 to 1, where ± 1 indicates a strong association between the parameters and 0 (zero) indicates no relationship; the positive or negative sign specifies whether the associated parameters increase together or one decreases as the other increases. The correlation matrix obtained for all the geotechnical parameters is presented in Fig. 4, which shows that CBR is positively correlated with S and MDD and negatively correlated with FC, LL, PL, PI and OMC. The final input parameters selected for developing the CBR prediction model are S, FC, PL, PI, MDD and OMC.

Fig. 4

Correlation matrix for all the geotechnical parameters

3.4 Data Divisional Approaches

Data division is the process of separating the complete dataset into TR and TS subsets. In this study, about 80% of the dataset was used to train the models and the remaining 20% was kept for testing. A basic problem in machine learning modeling is that we do not know how well a model will perform until it is tested on an unseen dataset. One can build a model with 100% accuracy (zero error) on the TR data that nevertheless fails to generalize to unseen data; such a model is not good, as it overfits the TR data. Machine learning is all about generalization, meaning that a model's performance can only be measured on data points that were never used during training. To address this, two data division approaches were adopted, as discussed below.

3.4.1 K-Fold Division Approach

The dataset is first shuffled randomly and split into K folds (see Fig. 5). The first fold is then used as the TS dataset and the remaining K-1 folds are used for training. The model with specific hyperparameters is trained on the TR data (K-1 folds) and tested on the held-out fold, and its performance is recorded. These steps are repeated until each of the K folds has been used once for testing.

Fig. 5

Data splitting process in K-fold cross-validation approach

Using five folds, the complete dataset was divided into TR and TS sets. The descriptive statistics obtained for the TR and TS datasets are shown in Tables 6 and 7, respectively.

Table 6 Descriptive statistics value for the selected input and output parameters of the TR dataset obtained through the K-fold approach
Table 7 Descriptive statistics value for the selected input and output parameters of TS dataset obtained through the K-fold approach
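A hedged sketch of this split using scikit-learn's KFold follows; the feature matrix is a placeholder for the 1011-sample database:

```python
# Sketch: fivefold division of the dataset into TR and TS sets.
import numpy as np
from sklearn.model_selection import KFold

X = np.zeros((1011, 6))  # placeholder for the 1011 x 6 feature matrix
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (tr_idx, ts_idx) in enumerate(kf.split(X), start=1):
    # Each fold serves once as the TS set (~20%); the rest is the TR set (~80%).
    print(f"fold {fold}: {len(tr_idx)} TR samples, {len(ts_idx)} TS samples")
```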

3.4.2 Fuzzy C-Means (FCM) Division Approach

Clustering is the process of separating a dataset into a number of groups such that observations within a group are homogeneous while dissimilar objects fall into different groups. In fuzzy clustering, each observation can belong to more than one cluster according to its membership values, and the total membership of any observation, distributed over all clusters, is 1.0. Brief information about this approach is given in Shi [86], Shahin, Maier [87] and Das [88].

Initially, two clusters were taken and the silhouette value was estimated. The number of clusters was then increased gradually, and the silhouette value for each number of clusters was calculated. The silhouette scores are depicted in Fig. 6, which clearly shows that the maximum score was obtained with two clusters. The datasets in the first and second clusters are denoted C1 and C2, respectively. The C1 dataset was separated into TR and TS sets through the K-fold approach (discussed in Sect. 3.4.1) and labeled Train1 and Test1, respectively; similarly, the C2 TR and TS datasets were designated Train2 and Test2. The final TR dataset was obtained by concatenating Train1 and Train2, and the final TS dataset by concatenating Test1 and Test2. Descriptive statistics for the final TR and TS datasets are given in Tables 8 and 9.

Fig. 6

Silhouette value obtained for each of the clusters

Table 8 Descriptive statistics value for the selected input and output parameters of the TR dataset obtained through the FCM approach
Table 9 Descriptive statistics value for the selected input and output parameters of TS dataset obtained through the FCM approach
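A sketch of the cluster-count selection follows, assuming the third-party scikit-fuzzy package for FCM (an illustrative choice, not the authors' stated tool); hard labels are taken from the maximum membership before silhouette scoring:

```python
# Sketch: choose the FCM cluster count by the silhouette score.
import numpy as np
import skfuzzy as fuzz                      # pip install scikit-fuzzy (assumed)
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))               # placeholder standardized features

for c in range(2, 7):
    # skfuzzy expects the data transposed: shape (features, samples).
    cntr, u, *_ = fuzz.cluster.cmeans(X.T, c=c, m=2.0, error=1e-5,
                                      maxiter=1000, seed=0)
    labels = np.argmax(u, axis=0)           # hard assignment per observation
    print(c, silhouette_score(X, labels))   # pick c with the highest score
```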

4 Results

4.1 Statistical Performance of Developed Models

Table 10 illustrates the performance of the trained KRR, K-NN and GPR models in terms of several performance measurement indices. In Table 10, KRRK and KRRF denote the KRR models developed through the K-fold and FCM approaches, respectively; the K-NN and GPR models are likewise denoted K-NNK, K-NNF, GPRK and GPRF. Table 10 shows that the R2 values obtained for the KRR algorithm with the K-fold and FCM approaches are 0.647 and 0.681, respectively, meaning that the KRRK and KRRF models explain 64.7% and 68.1% of the variability in the soaked CBR value of fine-grained plastic soils. Similarly, the variability explained by the K-NNK and K-NNF models is 71.5% and 72.7%, and by the GPRK and GPRF models 74.6% and 88.7%, respectively. The MAE values obtained for the KRRK and KRRF models are 0.515 and 0.512, for the K-NNK and K-NNF models 0.432 and 0.437, and for the GPRK and GPRF models 0.438 and 0.308, respectively. The a20-index results demonstrate that all the models predict almost 98% of observations within ± 20% variation.

Table 10 Statistical performance of KRR, K-NN and GPR model for TR dataset in K-Fold and FCM division approach

The performance of the developed models on the TS dataset is presented in Table 11. The R2 values obtained on the TS dataset with the K-fold approach are 0.680, 0.706 and 0.758 for the KRR, K-NN and GPR models, respectively, whereas those for the KRRF, K-NNF and GPRF models are 0.407, 0.645 and 0.700. A close inspection of all the statistical parameters for the training (Table 10) and testing (Table 11) datasets shows that the GPR algorithm achieved the highest prediction accuracy with both the K-fold and FCM approaches, followed by the K-NN and KRR algorithms.

Table 11 Statistical performance of KRR, K-NN and GPR model for TS dataset in K-Fold and FCM division approach

4.2 Visual Interpretation of Developed Models

Visual interpretation helps the viewer find insights into a model through graphical representations such as scatter plots, error plots and regression error characteristic curves.

4.2.1 Trend and Error Plot for the Developed Models

Figures 7 and 8 present the actual versus predicted CBR values of the TR and TS datasets, respectively, for the KRR, K-NN and GPR models with the K-fold and FCM approaches. The scatter plots show that most observations follow a specific trend along the line of equality. The closeness of the data points to the line of equality in the TR dataset (see Fig. 7) is greatest for the GPR models, while an acceptable alignment is obtained for the KRR and K-NN models. Moreover, the highest precision of the GPR model is obtained when it is trained through the FCM data division approach, followed by the K-fold approach. Similarly, the TS results (see Fig. 8) reveal that the GPR models are the most efficient at predicting the unseen dataset, followed by the K-NN and KRR models.

Fig. 7

Scatter plot for the TR dataset of KRR, K-NN and GPR models at K-Fold and FCM approaches

Fig. 8

Scatter plot for the TS dataset of KRR, K-NN and GPR models at K-Fold and FCM approaches

The error distribution and bar plots for the TR and TS datasets of the KRR, K-NN and GPR models with the K-fold and FCM data division approaches are shown in Figs. 9 and 11, respectively. The center horizontal line in the error distribution plot represents the zero-error line; points lying on this line have zero error, i.e., no difference between the actual and predicted CBR values, while the upper and lower lines mark the + 20% and − 20% variation band, respectively. Figures 9 and 11 show that some points lie below the zero-error line (negative error) and some above it (positive error); this random pattern of errors indicates that the developed models fit the dataset decently. Furthermore, the proportion of the dataset within the ± 20% variation band appears to be greatest for the GPRF model (see Fig. 11). This is confirmed by the comparative analysis of Figs. 10 and 12, which show the error frequency plots for the TR and TS datasets of the KRR, K-NN and GPR models with the K-fold and FCM approaches, respectively. As seen from Fig. 12, the GPRF model predicts almost 96% and 90% of the TR and TS observations, respectively, within ± 10%, and 100% and 99% within ± 20%, which is substantially better than the other models (Figs. 11, 12).

Fig. 9

Error distribution plot for the TR and TS dataset of KRR, K-NN and GPR models at K-Fold approach

Fig. 10

Error frequency plot for the TR and TS dataset of KRR, K-NN and GPR models at K-Fold approach

Fig. 11

Error distribution plot for the TR and TS dataset of KRR, K-NN and GPR models at FCM approach

Fig. 12

Error frequency plot for the TR and TS dataset of KRR, K-NN and GPR models at FCM approach

4.2.2 Regression Error Characteristics (REC) Curve

In regression problems, REC curves are the equivalent of the receiver operating characteristic (ROC) curves used in classification problems. The X-axis of the REC curve represents the error tolerance, while the Y-axis represents the accuracy, i.e., the percentage of points predicted within that tolerance [79, 80, 89]. An ideal model's curve passes through the upper left corner and therefore has an area under the curve (AUC) of 1. In general, an AUC of 0.5 suggests no discrimination, 0.7 to 0.8 is considered acceptable, 0.8 to 0.9 excellent, and more than 0.9 outstanding.
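An REC curve is straightforward to compute from the absolute errors; the sketch below normalizes the trapezoidal AUC over the plotted tolerance range, which is one common convention rather than necessarily the one used in the figures:

```python
# Sketch: REC curve (accuracy vs. error tolerance) and its normalized AUC.
import numpy as np

def rec_curve(y_a, y_p, n_points=100):
    abs_err = np.abs(y_a - y_p)
    tol = np.linspace(0.0, abs_err.max(), n_points)
    acc = np.array([(abs_err <= t).mean() for t in tol])
    # Trapezoidal area under the accuracy curve, scaled to [0, 1].
    auc = np.sum(np.diff(tol) * (acc[1:] + acc[:-1]) / 2) / tol[-1]
    return tol, acc, auc
```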

Figures 13, 14, 15 and 16 depict the REC curves obtained for the TR and TS datasets of the KRR, K-NN and GPR models with the K-fold and FCM data division approaches. As seen from these figures, the AUC values obtained for all the models on the TR and TS datasets are higher than 0.9, meaning that the developed models perform very well and can be considered reliable for predicting the soaked CBR value of fine-grained plastic soils. Furthermore, the comparative analysis of the models on the training and testing sets shows that the REC curve of the GPRF model lies closer to the upper left corner, and its AUC value is higher, than those of the other models.

Fig. 13

REC curve for the TR dataset of KRR, K-NN and GPR models at K-Fold approach

Fig. 14

REC curve for the TS dataset of KRR, K-NN and GPR models at K-Fold approach

Fig. 15

REC curve for the TR dataset of KRR, K-NN and GPR models at FCM approach

Fig. 16

REC curve for the TS dataset of KRR, K-NN and GPR models at FCM approach

4.2.3 Accuracy Analysis

The accuracy analysis is a novel assessment used to evaluate the efficiency of the models. It expresses the accuracy (%) of a model, obtained by comparing the values of the different performance measurement parameters with their ideal values (see Sect. 2.2), using Eqs. (27) and (28).

$${A}_{e}=\left|(1-\left|{m}_{e}\right|)\right|\times 100$$
(27)
$${A}_{t}=\frac{\left|{m}_{t}\right|}{{i}_{t}}\times 100$$
(28)

where \({A}_{e}\) and \({A}_{t}\) denote the accuracy of the error- and trend-measuring performance parameters, and \({m}_{e}\) and \({m}_{t}\) their measured values. The performance measurement parameters MAE, MAPE, RMSE and IOS belong to the error class, whereas R2, Adj. R2, R, VAF, IP, IOA and the a20-index belong to the trend class. \({i}_{t}\) represents the ideal value of the respective trend parameter.
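As a small worked sketch of Eqs. (27) and (28), with illustrative input values:

```python
# Sketch: accuracy of error- and trend-measuring parameters, Eqs. (27)-(28).
def accuracy_error(m_e):
    # Eq. (27): for MAE, MAPE, RMSE and IOS (ideal value 0).
    return abs(1 - abs(m_e)) * 100

def accuracy_trend(m_t, i_t):
    # Eq. (28): for R2, adj. R2, R, VAF, IP, IOA and a20-index (ideal value i_t).
    return abs(m_t) / i_t * 100

print(accuracy_error(0.05), accuracy_trend(88.7, 100))  # e.g., RMSE = 0.05, VAF = 88.7
```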

Tables 12 and 13 present the accuracy of all the developed models on the TR and TS datasets, respectively. As seen from these tables, the accuracy of the GPRF model across numerous statistical performance measurement indices is significantly higher than that of the other models.

Table 12 Accuracy of all the developed models for the TR dataset
Table 13 Accuracy of all the developed models for the TS dataset

4.3 Selection of the Best-Fitted CBR Prediction Model

Numerous statistical performance indices were adopted to evaluate the model performances, and conclusions can easily be drawn by comparing their values. However, the situation becomes more complicated when a model demonstrates adequate accuracy on the TR dataset but fails to reach comparable accuracy on the TS dataset, or when different statistical indices each favor a different model. In such situations, overfitting analysis and ranking analysis can be useful for identifying the best-fitted model, as they provide an overall view of all the parameters of a particular model.

4.3.1 Ranking Analysis (RA)

In RA, which has been used by many researchers in the past [79, 80, 89, 90], a maximum score of s (equal to the total number of models being compared) is assigned to the model with the best value of a particular performance index, the minimum score to the model with the worst value, and intermediate scores to the remaining models in ascending or descending order. Table 14 presents the RA results obtained for the KRR, K-NN and GPR models with the K-fold and FCM approaches.

Table 14 Rank analysis and overfitting ratio for all the developed models

4.3.2 Overfitting Ratio (OR)

The OR is the ratio of the RMSE of the TS dataset to that of the TR dataset, as shown in Eq. (29). A model with an OR value close to one is considered less prone to overfitting, which implicitly enhances its generalization capacity and makes it more realistic for real-world applications [16]. The OR values obtained for all the developed models are shown in Table 14.

$$\mathrm{OR}=\frac{{\mathrm{RMSE}}_{\mathrm{TS}}}{{\mathrm{RMSE}}_{\mathrm{TR}}}$$
(29)

Initially, a score was assigned to each developed model for each statistical performance measurement parameter, as shown in Table 14. The total score for a particular model was obtained by combining the scores for its TR and TS datasets. The overall accuracy of the developed models was then determined through the performance strength parameter (Eq. (26)); based on the SP value obtained for each model, a further score was assigned as per the ranking analysis method. The final score of each model was obtained by adding its total score and the score for its SP value, and ranks were assigned accordingly. As seen from Table 14, the maximum score was achieved by the GPRF model, which was therefore ranked first, followed by the GPRK, K-NNK, K-NNF, KRRK and KRRF models. The ranked models were then subjected to overfitting analysis. The results in Table 14 show that the GPRF model exhibits some overfitting, as its OR value is considerably higher than 1. The analysis was therefore repeated on the second-ranked model, GPRK, whose OR value is almost equal to one, meaning that this model generalizes better to the unseen dataset. Nevertheless, the GPRF model can also be used as a predictive model, as its R-value on the TS dataset is 0.83: according to Smith [91], Taskiran [11], Yildirim and Gunaydin [12], Verma [92] and Tenpe and Patel [17], if the R-value of a model is ≥ 0.8, the predicted and actual values are in good agreement and strongly correlated. Conclusively, the GPRF and GPRK models were selected as the best-fitted models for predicting the soaked CBR value of fine-grained plastic soils.

4.4 Influence of ML Algorithms and Data Division Approaches on the Model Performance

The overall influence of the adopted ML algorithms and data division approaches was assessed using the final score obtained for each model (Table 14). Figures 17 and 18 present the overall influence of the ML algorithms and the data division approaches, respectively, on the predictive ability of the developed models. Figure 17 shows that the GPR algorithm proves more proficient than the K-NN and KRR algorithms, as its combined final score is significantly higher. The combined final score obtained for the K-fold approach is slightly higher than that of the FCM approach; moreover, the K-fold approach was found to be substantially useful in reducing the overfitting of the models.

Fig. 17

Influence of ML algorithms on the model performance

Fig. 18

Influence of data division approaches on the model performance

4.5 Validation of Literature Study Models

For this purpose, only those literature models whose input parameters match the geotechnical parameters of the present study were selected. The models of Kin [10], Taskiran [11], Yildirim and Gunaydin [12] and Bardhan, Gokceoglu [19], developed for datasets from various countries (see Table 15), were validated on the present study's datasets. Datasets from the present study were selected according to the minimum and maximum values of each model's input and output parameters.

Table 15 Literature models validation on the present study dataset

Table 16 depicts the comparative performance of the literature models on the present study's datasets. The R2 values obtained for all the models are negative, which means that the selected models do not follow the trend of the dataset and fit worse than a horizontal line. The Bardhan, Gokceoglu [19] model can predict more than 65% of observations within ± 20% variation, which is considerably better than the Kin [10], Taskiran [11] and Yildirim and Gunaydin [12] models. This is because the datasets used for model development and validation come from geologically similar and nearby regions, i.e., India. Nagaraj and Suresh [93] likewise state that soils are likely to be quite variable depending on their geological locations.

Table 16 Comparative performance of literature models on the present study datasets

5 Discussion of Results

In the preceding subsections, the prediction of the soaked CBR of fine-grained plastic soils was assessed through several machine learning algorithms. The collected dataset comprises various soil groups, namely ML, CL-ML, CL, MH and CH. Pearson's correlation analysis revealed that S, FC, PL, PI, MDD and OMC are substantial input parameters for predicting the soaked CBR value of fine-grained plastic soils. Using these input parameters, six models were developed for the GPR, K-NN and KRR algorithms through the K-fold and FCM data division approaches. The quantitative performance of these models was assessed using twelve statistical measurement parameters. Based on the ranking analysis and overfitting ratio, the GPRK and GPRF models were considered the best-fitted models, followed by the K-NNK, K-NNF, KRRK and KRRF models. The OR values obtained for the GPRK, K-NNK and KRRK models were lower than those of the GPRF, K-NNF and KRRF models (Table 14), meaning that the RMSE values of the TR and TS datasets of the K-fold models are close to each other; the K-fold approach was therefore considered more effective in reducing the overfitting of the models. Furthermore, the comparison of ML algorithms shows the GPR algorithm to be the most proficient, followed by the K-NN and KRR algorithms. The validation of the literature models on the present study's dataset confirms that the predictive ability of any model in geotechnical/highway engineering is significantly influenced by the geological location of the dataset.

6 Conclusions

The current study offers novel applications of the KRR, K-NN and GPR algorithms for predicting the soaked CBR value of fine-grained plastic soils. The analysis was performed on in situ soil samples collected from an ongoing NHAI project worksite: a large dataset of 1011 soil samples was collected and tested in the laboratory as per BIS specifications. The prepared dataset was divided into TR and TS sets using the K-fold and FCM data division approaches. The competency of the models was compared through various statistical performance measurement indices, accuracy analysis and REC curve analysis; in addition, the final selection of the best-fitted model was accomplished through ranking analysis and overfitting analysis. The statistical performance indices reveal that, with S, FC, PL, PI, MDD and OMC as input parameters, the developed models explain between 64.7% and 88.7% of the variability in the CBR value of the TR dataset. The performance of the models on the TS dataset was considerably lower than on the TR dataset. Based on the overall analysis of the results, the GPRF and GPRK models were finally selected as the best-fitted models for predicting the soaked CBR value of fine-grained plastic soils; they predict 99% and 98% of observations, respectively, within ± 20% variation. Additionally, the comparative analysis of the adopted algorithms demonstrates that the GPR algorithm achieves the highest proficiency, followed by the K-NN and KRR algorithms. The results also indicate that both the K-fold and FCM approaches are suitable methods for data division. Eventually, the validation results establish that the predictive ability of any model is substantially influenced by the geological location of the soils/materials used for model development.