Introduction

Problematic soils with poor geotechnical properties, such as expansive clayey soils, are frequently encountered worldwide. In Syria, the southern and south-eastern provinces are largely covered with such soils [11]. Therefore, there is an urgent need to mitigate their undesirable properties, such as high plasticity, high compressibility, low strength, and high sensitivity to variations in water content, to make such soils suitable for construction and road projects [4, 12, 16]. Soil stabilization was introduced many years ago with the main aim of making soils capable of meeting the requirements of specific engineering projects. Several attempts have been made to stabilize these problematic soils using natural additives, such as lime and natural pozzolan (NP) [43, 46, 54, 62]. However, these additives were mostly used at the micro scale, and the use of nano-sized natural additives for soil improvement has rarely been investigated [11, 13].

A pavement consists mainly of a surface layer (generally a bituminous layer) constructed on base and sub-base layers, Fig. 1. These layers are usually laid on a compacted subgrade [81]. Sub-base materials are usually local aggregates. The subgrade, which may be considered the functional part of the pavement, should be properly compacted so that it can carry the vehicle loads and the weight of the upper layers. Because the pavement rests on this layer, it should be carefully examined and evaluated [5]. One of the most widely used tools for assessing the strength of the subgrade is the California bearing ratio (CBR) test, which is of great importance to pavement designers. A pavement resting on a subgrade with a lower CBR value must be thicker than one resting on a subgrade with a higher CBR value [47]. Generally, expansive clayey soils have very low CBR values. To be suitable as pavement construction materials, they should be improved using appropriate stabilization approaches [69]. One ecological and economic approach is the use of a combination of lime and NP, particularly when both are added at the nano scale.

Fig. 1 Schematic illustration of a pavement construction

The CBR test is costly and laborious, and it requires a large amount of soil mixture. A quick and reliable method for predicting the CBR value would therefore be beneficial, and developing such a method is the main goal of the current study. Many researchers have attempted to estimate CBR values using soft computing systems [7, 8, 9, 63, 71, 74, 77, 79]. Taskiran [74] compared artificial neural networks (ANN) with gene expression programming (GEP) for predicting the CBR of fine-grained soils. He concluded that both techniques performed best when seven input parameters were employed in the developed models, and emphasized that such predictive models could be a helpful tool for the preliminary identification of soil. Varghese et al. [77] estimated the soaked CBR of fine-grained soils based on four influential input parameters. Their ANN models showed that CBR can be accurately predicted. Such predictions allow a preliminary assessment of a soil for engineering projects with a limited budget and limited time. Yildirim and Gunaydin [79] compared ANN with multiple linear regression (MLR) models for predicting the CBR of 124 fine-grained soils in Turkey. Both the MLR and ANN models performed well, with correlation coefficients exceeding 0.9, whereas in Hariri's study [44] the ANN model performed better. Farias et al. [32] found, in a study conducted on a wide range of soils, that the plasticity index (PI) was the most influential factor in CBR prediction, while Al-Busultan et al. [8], based on a sensitivity analysis of an ANN model developed for CBR prediction, concluded that PI was the least important factor when compared with the other 15 input variables.

The plasticity index (PI) is considered a very important property of fine-grained soils. It is the range of moisture contents over which the soil exhibits plastic behavior, and it is computed as the numerical difference between the liquid limit (LL) and the plastic limit (PL), i.e. PI = LL − PL. In general, soils with high PI values tend to be clayey, whereas soils with low PI values contain little or no clay. Clayey soils are characterized by PI values above 10 (AASHTO T 89). Soils with a probably high volume change have PI values of 30 or more [37, 70].
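As a small illustration of the definition PI = LL − PL and of the two thresholds quoted in this paragraph, a minimal sketch is given below. The function names and the sample limits are illustrative assumptions and are not measured values from this study.

```python
def plasticity_index(liquid_limit: float, plastic_limit: float) -> float:
    """Return PI = LL - PL (both limits are moisture contents in %)."""
    return liquid_limit - plastic_limit


def describe_pi(pi: float) -> str:
    """Label a PI value using the thresholds quoted in the text."""
    if pi >= 30:
        return "probably high volume change"
    if pi > 10:
        return "clayey"
    return "little or no clay"


# Hypothetical limits chosen only for illustration.
pi = plasticity_index(liquid_limit=65.0, plastic_limit=32.0)
print(pi, describe_pi(pi))
```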

Unfortunately, no study has sufficiently assessed the CBR of problematic soils stabilized with a combination of nano natural pozzolan and nano lime. Akbari et al. [2] studied the effect of adding nanozeolite to stabilized soft soil; however, they added lime at the micro rather than the nano scale. Onyelowe [59] investigated the effect of adding nanostructured clay to a problematic soil; however, a fixed 2% of OPC was added to both the untreated and treated soil. Further, Abbasi and Mahdieh [1], Harichane et al. [42], Cheng et al. [29], Calik and Sadoglu [27], Shah et al. [66] and Rabab’ah et al. [62] studied the addition of both NP and lime for the stabilization of expansive soils; however, these additions were at the micro rather than the nano level.

Multiple linear regression (MLR) is the simplest method that has been used for predicting the geotechnical properties of soil [51, 57]. Because MLR often does not yield reliable predictions (due to its low flexibility), various machine learning methods have been widely utilized to estimate the geotechnical properties of soils more accurately. ANN is the most commonly used of these methods [8, 15, 16, 24, 63, 74, 79]. The popularity of ANN in civil engineering is attributed to its high adaptability in capturing the complicated relationships between the input and output variables, which results in more accurate predictions [10, 22, 30, 51, 65, 77].

Although many studies have focused on laboratory testing of problematic soils [1, 11, 43, 62], no work has specifically applied machine learning to predict the CBR and PI of clayey soils stabilized with a combination of nano natural pozzolan (NNP) and nano lime (NL). Moreover, to the authors’ knowledge, no predictive models have been reported in the literature that investigate the capability of fuzzy logic (FL) and ANN to predict such properties when the stabilizers are used at the nano level. To fill this gap, three machine-learning methods, MLR, ANN and FL, were employed to predict CBR and PI. In addition, a sensitivity analysis was carried out to measure the importance of each input variable in predicting the studied geotechnical properties.

This approach is considered very beneficial, as it takes into account key parameters such as NNP content, NL content, NNP fineness, the SiO2active content of NNP, ILL and IPL. In addition, the predicted properties may support proposals that assist pavement designers. Further, regions of similar geology, such as Jordan and KSA, may benefit from the current analytical study. Furthermore, the rebuilding stage in Syria will inevitably need such economic approaches.

Experimental dataset and the studied variables

Unfortunately, the authors could not find further data in the literature on the use of combinations of NNP & NL as soil stabilizers. Therefore, the analyzed dataset was built from experiments carried out by the authors on NNP-NL-treated clayey soils. One hundred and twenty soil mixtures were prepared with five NNP contents (0%, 0.5%, 1%, 1.5% and 2%), five NL contents (0%, 0.3%, 0.6%, 0.9% and 1.2%), three NNP sizes (50, 100 and 500 nm) and five SiO2active contents of NNP. Figure 2 shows some details of the NNP & NL used in the experimental part, and Fig. 3b shows the particle size distribution of both. It is worth mentioning that nanoparticles tend to agglomerate when used at high dosages (i.e. more than 3%) [13, 41]. Such agglomeration may prevent the nanoparticles from performing well in the soil mixtures, so the enhancement expected from the nano-additives is severely reduced at high dosages. Therefore, the nanoparticles need to be dispersed before they are used in the mixtures [61]. For this purpose, the nano-additives were mixed with water using an ultrasonic mixer for 3 min, which may effectively limit the agglomeration of the nanoparticles.

Fig. 2 Map of the investigated quarry with photograph of the NP quarry (a, b). NP aggregates as received with SEM micrograph showing their vesicular nature (c, d). XRD analysis of NP (e). The laboratory grinding machine (f). SEM micrographs of NNP of 100 and 500 nm MPS and NL (g, i, k), respectively, and AFM micrographs of NNP of 100 and 500 nm size and NL (h, j, l), respectively

Fig. 3 Particle size distribution of the investigated soils (a) and NNP & NL (b)

The investigated natural pozzolan was quarried in the southeast of Syria, from a quarry located in the northeast of the Harrat al-Shaam volcanic field, a basaltic province of about 50,000 km² covering parts of Syria, Jordan and KSA, Fig. 2a [10]. The main oxides of the investigated natural pozzolan are SiO2 (45%), Al2O3 (16%), Fe2O3 (10%), CaO (9%), MgO (8%) and alkali oxides (Na2O and K2O) (4%). Its mineralogical composition, as shown in Fig. 2e, consists of two phases, i.e. crystalline and glassy. The main minerals are faujasite, anorthite, forsterite, diopside and calcite. It is lighter than water; its bulk density is less than 0.7, which is due to its vesicular nature, as clearly seen in Fig. 2d. The natural pozzolan was ground to the studied sizes, namely 500 nm, 100 nm and 50 nm, using a laboratory centrifugal ball mill (Retsch, S100, Germany) for 275 min, 360 min and 425 min, respectively. The adopted NP dispatch/steel ball ratio was 1/5 at a revolution number of 300.

The investigated lime was quarried from Hama city, in one of the central provinces of Syria. It is a quick lime obtained by calcining raw lime at up to 950 °C to make it more active. It was ground to the nano scale in a similar way to the NP. Its grading is illustrated in Fig. 3b.

The nano-additives were scanned using (i) a nanoscope easyScan II atomic force microscope (AFM) [11] and (ii) a VEGA II TESCAN scanning electron microscope (SEM) fitted with an EDAX AMETEK energy dispersive X-ray spectroscopy (EDS) unit. The results of scanning the nano-natural additives can be clearly seen in Fig. 2d, g–l.

The studied problematic clayey soils were obtained from three sites located in the southern province of Syria. Their characteristics are given in Table 1 and Fig. 4, and their gradings are plotted in Fig. 3a. The main minerals in the studied soils, according to the XRD analysis shown in Fig. 5, are kaolinite, illite and montmorillonite as clay minerals, and calcite, quartz and feldspar as non-clay minerals. The XRD analysis was carried out using a SATOE STADI X-ray diffractometer with the following settings: CuKα radiation, 40 keV and 30 mA, scan range 5°–70°, speed 2°/min.

Table 1 Characteristics of the investigated soils

Fig. 4 Plasticity chart for the Unified/ASTM soil classification system, showing soils S1, S2 and S3

Fig. 5 XRD of the studied untreated clayey soils (Q: quartz, K: kaolinite, C: calcite, I: illite, M: montmorillonite, F: feldspar)

Atterberg limits (LL, PL) and CBR tests were carried out in accordance with ASTM D 4318 [19] and ASTM D1883 [21], respectively. Both the LL & PL tests were conducted at room temperature, and PI values were then determined from the measured limits. In the CBR test, three specimens from each untreated or treated soil mixture were tested after being soaked in water for 96 h. The specimens were compacted to the maximum dry density at the optimum moisture content determined by standard Proctor tests [20]. The soaked condition was adopted because it simulates the behavior of the pavement sub-layers under heavy rain. Some photographs of the conducted experiments appear in Fig. 6.

Fig. 6 Photographs of the experimental test set-up; Atterberg test (a), CBR test (b)

The six analyzed input variables are: (i) NNP content (0%, 0.5%, 1%, 1.5% and 2%); (ii) NL content (0%, 0.3%, 0.6%, 0.9% and 1.2%); (iii) median particle size (MPS) of the nano natural pozzolan (50 nm, 100 nm and 500 nm); (iv) SiO2active (ranging from 37.6 to 43.2); (v) ILL (ranging from 58.5 to 74.2); (vi) IPL (ranging from 29.7 to 36.8). The characteristics of the input and output variables are tabulated in Table 2.

Table 2 Characteristics of the independent and dependent variables

Prediction models

Multiple linear regression (MLR)

MLR is a statistical method whose general purpose is to establish a relationship between a dependent variable and several independent variables [50]. The prediction of CBR or PI from six independent variables can be expressed as follows:

$${\text{Y}}_{{\text{d}}} = {\text{b}}_{0} + {\text{ b}}_{{1}} {\text{X}}_{{1}} + {\text{ b}}_{{2}} {\text{X}}_{{2}} + {\text{ b}}_{{3}} {\text{X}}_{{3}} + {\text{ b}}_{{4}} {\text{X}}_{{4}} + {\text{ b}}_{{5}} {\text{X}}_{{5}} + {\text{b}}_{{6}} {\text{X}}_{{6}}$$
(1)

where Yd is the dependent variable, i.e. either CBR or PI, Xi are the independent variables, and bi are the regression weights, which are computed so as to minimize the sum of squared deviations.
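To make the least-squares idea behind Eq. (1) concrete, a minimal sketch is given below. It solves the normal equations with plain Gaussian elimination and uses only the Python standard library. The function names and the two-variable synthetic data are illustrative assumptions; the sketch is not the authors' regression workflow and does not use the experimental dataset.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_mlr(rows, targets):
    """Return [b0, b1, ..., bk] minimizing the sum of squared deviations."""
    design = [[1.0] + list(r) for r in rows]  # the leading 1 carries b0
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_mlr(coeffs, row):
    """Evaluate the linear form of Eq. (1) for one observation."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))


# Synthetic two-variable data generated from Y = 2 + 3*X1 - 1*X2.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
targets = [2.0 + 3.0 * x1 - 1.0 * x2 for x1, x2 in rows]
coeffs = fit_mlr(rows, targets)
print([round(b, 6) for b in coeffs])
```

With six input variables, each row would simply hold six values instead of two; the fitting code is unchanged.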

Artificial neural network

The behavior of soil is very complicated [67, 77]. Therefore, to build more precise predictive models, it is preferable to include the most influential variables. The nonlinear relations between the input and output variables can be successfully modeled by artificial neural networks (ANNs) [10, 40, 45]. Their flexibility and adaptability in generalizing from data are behind their wide application in many fields, including civil and geotechnical engineering [10, 15, 17, 22, 24, 26, 30, 49, 60]. In the current study, the dataset used to develop the ANN models for predicting the CBR and PI of the soil mixtures stabilized with combinations of NNP and NL was divided into two subsets, with 70% used for training and 30% for testing.

Figure 7 shows the architecture of the ANN models developed for the prediction of CBR & PI. Five and six neurons in the hidden layer were selected for predicting CBR & PI, respectively. A single hidden layer was selected, as it can be sufficient to learn and solve such problems [25, 39]. The learning rate (lr) was 0.7, while the momentum (m) was 0.3 and 0.2 for the CBR and PI networks, respectively. These values were selected after testing lr and m values from 0.1 to 0.9.

Fig. 7 Architecture of ANN models for prediction of PI (a) and CBR (b)

Feed-forward neural networks trained with the back-propagation learning algorithm were adopted in the current study [28, 72]. In the forward process, the input layer receives the inputs, which then propagate through the network layer by layer to the output layer, where the corresponding output values are produced. Reducing the error between the predicted and experimental results requires optimizing the network weights [78]. In the backward process, the predicted outputs of the network are compared with the experimental ones, the error is calculated, and the weights are adjusted accordingly.

In the present study, the logistic sigmoid activation function with a scaling range between 0 and 1.0 was employed in the constructed models [40]:

$$f(\alpha_{i}) = \frac{1}{1+\exp(-\alpha_{i})}$$
(2)

where α is a constant used to control the slope of the semi-linear region [65].
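The forward and backward processes described above can be sketched as a small single-hidden-layer network trained by back-propagation with momentum. The sketch uses the values reported in the text for the CBR network (five hidden neurons, lr = 0.7, m = 0.3) and the logistic sigmoid in its plain form. Everything else is an assumption made for illustration: the class name `TinyNetwork`, the weight initialization range, the number of epochs, and the synthetic two-input data. The authors' models were built with the MATLAB NN Tool, so this is not their implementation.

```python
import math
import random


def sigmoid(x):
    """Logistic sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


class TinyNetwork:
    """One hidden layer, sigmoid units, trained by back-propagation."""

    def __init__(self, n_inputs, n_hidden, lr=0.7, momentum=0.3, seed=0):
        rng = random.Random(seed)
        # Each row holds the weights of one neuron; the last entry is its bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
        # Previous weight changes, needed for the momentum term.
        self.dw_hidden = [[0.0] * (n_inputs + 1) for _ in range(n_hidden)]
        self.dw_out = [0.0] * (n_hidden + 1)
        self.lr, self.momentum = lr, momentum

    def forward(self, x):
        """Forward pass: inputs -> hidden activations -> single output."""
        xb = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb)))
                  for row in self.w_hidden]
        out = sigmoid(sum(w * v for w, v in zip(self.w_out, hidden + [1.0])))
        return hidden, out

    def train_one(self, x, target):
        """Backward pass: propagate the output error and update the weights."""
        hidden, out = self.forward(x)
        xb, hb = list(x) + [1.0], hidden + [1.0]
        delta_out = (target - out) * out * (1.0 - out)
        # Hidden deltas use the output weights as they were before the update.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, v in enumerate(hb):
            self.dw_out[j] = self.lr * delta_out * v + self.momentum * self.dw_out[j]
            self.w_out[j] += self.dw_out[j]
        for j, row in enumerate(self.w_hidden):
            for i, v in enumerate(xb):
                step = self.lr * delta_hidden[j] * v + self.momentum * self.dw_hidden[j][i]
                self.dw_hidden[j][i] = step
                row[i] += step

    def mse(self, data):
        """Mean squared error over (inputs, target) pairs."""
        return sum((t - self.forward(x)[1]) ** 2 for x, t in data) / len(data)


# Synthetic, already-normalized samples: target = 0.2 + 0.6 * x1 * x2.
data = [((a / 4, b / 4), 0.2 + 0.6 * (a / 4) * (b / 4))
        for a in range(5) for b in range(5)]
net = TinyNetwork(n_inputs=2, n_hidden=5)
mse_before = net.mse(data)
for _ in range(500):
    for x, t in data:
        net.train_one(x, t)
mse_after = net.mse(data)
print(mse_before, mse_after)
```

The momentum term reuses a fraction of the previous weight change, which is why the previous changes are stored alongside the weights.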

The data were normalized between 0 and 1 before being submitted to the ANN. Training was repeated until no marked improvement was noted, which gave the final output [56]. The predicted PI and CBR values were plotted against the experimental results. The ANN models were developed using the NN Tool in MATLAB.
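A minimal sketch of the normalization to the range 0 to 1 and of a 70/30 split into training and testing sets is given below. Min-max scaling and a random shuffle are assumptions, since the text states only the target range and the split proportions. The function names and the seed are hypothetical.

```python
import random


def min_max_normalize(values):
    """Scale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def train_test_split(records, train_fraction=0.7, seed=42):
    """Shuffle the records and split them into training and testing sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# NNP contents from the text, used here only to show the scaling.
nnp_content = [0.0, 0.5, 1.0, 1.5, 2.0]
print(min_max_normalize(nnp_content))

# 120 placeholder indices standing in for the 120 soil mixtures.
train, test = train_test_split(range(120))
print(len(train), len(test))
```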

Fuzzy logic

Fuzzy logic was built on the theory of fuzzy sets introduced by Zadeh [82]. It is considered a technique for formalizing approximate and inexact situations. Because it is easy to implement and does not require a mathematical model of the process, this logic has become increasingly common in several fields, such as civil and geotechnical engineering [14, 23, 38, 58, 68, 75]. In the theory of fuzzy sets, uncertain notions of natural language are modeled by mathematical formulas. The variables in fuzzy set theory are no longer binary (i.e. 0 or 1) but can take an infinite number of possible values between zero and one.

Fuzzy subsets are characterized by membership functions, which can theoretically take any form. In general, the most used membership functions are defined by geometric shapes, most often triangular and trapezoidal ones. The fuzzy rules are expressed in the form: IF (premise) THEN (conclusion). The premise may depend on several variables, which may or may not be linked to each other. The conclusion is obtained by implication of fuzzy propositions. The architecture of a Mamdani-based FL model consists of four essential parts, namely: (i) the fuzzifier, (ii) the knowledge base, (iii) the inference mechanism or rule evaluation and (iv) the defuzzifier. The fuzzifier turns real inputs into fuzzy linguistic variables, while the defuzzifier does the opposite.
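The two geometric shapes mentioned above can be written as short functions. The sketch below is illustrative only; the breakpoints used in the example calls are hypothetical and are not the membership functions adopted in this study.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], flat on [b, c], falls on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Hypothetical fuzzy sets for an NL content of 0.45%, for illustration only.
print(triangular(0.45, 0.0, 0.6, 1.2))
print(trapezoidal(0.45, 0.0, 0.3, 0.9, 1.2))
```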

A dataset of 120 records with six input variables, obtained from the experimental part carried out by the authors, was employed to develop the FL models for the prediction of CBR & PI values. The FL models were constructed using the fuzzy logic toolbox in MATLAB. 88 and 72 Mamdani-based rules [55], expressed in the IF–THEN form, were written for the prediction of CBR & PI, respectively. These rules relate the input variables to the output ones and were written as verbal statements in a way that resembles human reasoning. The “min” inference operator was used to find the output sets, while the “centroid” method was employed for defuzzification [3]. The triangular membership functions were constructed based on the experience gained, as shown in Fig. 8 [75]. The main idea in FL theory is that an element can belong to several subsets of the universal set at the same time. For instance, a CBR value of 70 belongs to both the high and very high subsets, with membership degrees of 0.80 and 0.20, respectively, Fig. 9a.
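A minimal sketch of Mamdani-type inference with the “min” operator and centroid defuzzification is given below. It uses only two inputs and four rules, whereas the models in this study use six inputs and 88 or 72 rules. All breakpoints and rules are hypothetical; the CBR output sets were chosen so that a CBR value of 70 reproduces the membership degrees of 0.80 and 0.20 quoted above. The centroid is approximated on a regular grid, which is an implementation choice of this sketch.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 outside (a, c), 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# All breakpoints below are hypothetical, chosen for illustration only.
NNP_SETS = {"low": (-2.0, 0.0, 2.0), "high": (0.0, 2.0, 4.0)}
NL_SETS = {"low": (-1.2, 0.0, 1.2), "high": (0.0, 1.2, 2.4)}
CBR_SETS = {"low": (0.0, 20.0, 45.0), "high": (45.0, 65.0, 90.0),
            "very high": (65.0, 90.0, 115.0)}

# Each rule reads: IF NNP is <1st> AND NL is <2nd> THEN CBR is <3rd>.
RULES = [("low", "low", "low"), ("high", "low", "high"),
         ("low", "high", "high"), ("high", "high", "very high")]


def centroid(membership, lo, hi, steps=1000):
    """Approximate the centroid of a membership curve on [lo, hi]."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    weights = [membership(x) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no rule fired")
    return sum(x * w for x, w in zip(xs, weights)) / total


def infer_cbr(nnp, nl):
    """Mamdani inference: min for AND and implication, max to aggregate."""
    strengths = {}
    for nnp_label, nl_label, cbr_label in RULES:
        fire = min(triangular(nnp, *NNP_SETS[nnp_label]),
                   triangular(nl, *NL_SETS[nl_label]))
        strengths[cbr_label] = max(strengths.get(cbr_label, 0.0), fire)

    def aggregated(y):
        # Clip each output set at its rule strength, then take the envelope.
        return max(min(s, triangular(y, *CBR_SETS[label]))
                   for label, s in strengths.items())

    return centroid(aggregated, 0.0, 115.0)


print(triangular(70.0, *CBR_SETS["high"]), triangular(70.0, *CBR_SETS["very high"]))
print(infer_cbr(1.0, 0.6))
```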

Fig. 8 Triangular membership functions used in the fuzzy model for the six input variables; a NNP content, b NL content, c median particle size of NNP, d percentage of SiO2active, e ILL, f IPL

Fig. 9 Triangular membership functions used in the fuzzy model for output variables; a CBR, b PI

Validation of the developed models

The validation of the constructed models was assessed using the following ten different criteria:

  i. Root mean squared error (RMSE). This criterion can be computed by the following formula:

    $$\mathrm{RMSE}=\sqrt{\frac{1}{n}{\sum }_{i=1}^{n}{(Error)}^{2}}$$
    (3)

    The smaller the RMSE value, the better the constructed model.

  ii. Relative root mean squared error (RRMSE). It can be computed using the following formula:

    $$RRMSE=\frac{RMSE}{\left|\overline{Exper }\right|}$$
    (4)

  iii. Mean absolute error (MAE). It can be calculated by the following formula:

    $$MAE=\frac{1}{n}{\sum }_{i=1}^{n}\left|Error\right|$$
    (5)

  iv. Mean absolute percentage error (MAPE). It is given by the following formula:

    $$\mathrm{MAPE}=\left(\frac{1}{n}\sum_{i=1}^{n}\left|\frac{Error}{Exper}\right|\right)\times 100\%$$
    (6)

  v. Coefficient of determination (R2). It can be calculated by the following formula:

    $${\mathrm{R}}^{2}= 1- \frac{{\sum }_{i=1}^{n}{Error}^{2}}{{\sum }_{i=1}^{n}{(Exper- \overline{Exper })}^{2}}$$
    (7)

    The closer R2 is to one, the closer the relationship between the experimental and predicted values.

  vi. Se/Sy, where Se is the standard error of the predicted values and Sy is the standard deviation of the experimental values. Lower Se/Sy values indicate more accurate models. Models with Se/Sy ≤ 0.35 are graded excellent [30]:

    $$\frac{{S}_{e}}{{S}_{y}}=\sqrt{\frac{n*\left[{\sum }_{i=1}^{n}{Error}^{2}\right]}{\left(n-p\right)*\left[{\sum }_{i=1}^{n}{\left(Exper-\overline{Exper }\right)}^{2}\right]}}$$
    (8)

  vii. Correlation coefficient (r). It can be calculated by the following equation [83]:

    $$r=\frac{\sum_{i=1}^{n}(Pred-\overline{Pred })(Exper-\overline{Exper })}{\sqrt{\sum_{i=1}^{n}{(Pred-\overline{Pred })}^{2}}\sqrt{\sum_{i=1}^{n}{(Exper-\overline{Exper })}^{2}}}$$
    (9)

  viii. Performance index (Pi). It can be expressed as follows [83]:

    $${P}_{i}=\frac{RRMSE}{1+r}$$
    (10)

    Higher correlation coefficient values and lower RRMSE values result in lower performance index values. Values of Pi closer to zero (e.g. Pi < 0.2) indicate a more accurate model [34, 48].

  ix. Adjusted coefficient of efficiency (CE) [52]:

    $$\mathrm{CE }= 1-\frac{{\sum }_{i=1}^{n}\left|Error\right|}{{\sum }_{i=1}^{n}\left|Exper-\overline{Exper }\right|}$$
    (11)

    This criterion can supplement the assessment of the predictive models. It reports the differences between the experimental and predicted values relative to the inherent variability of the experimental values. CE values closer to one indicate a more precise model.

  x. Durbin–Watson statistic (DW). It is an important statistical criterion used to check for autocorrelation in the residuals of a model. DW values vary between zero and four, and values in the range of 1.5 to 2.5 are considered acceptable.

In Eqs. (3)–(11), n is the total number of analyzed data points, Exper & Pred are the experimental and predicted values, respectively, Error is the difference between them, \(\overline{Exper }\) and \(\overline{Pred }\) are the mean experimental and predicted values, respectively, and p is the number of model parameters.
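The criteria listed above can be sketched in a few lines of Python. The sketch assumes that Error means Exper − Pred, and it uses the standard Durbin–Watson formula, i.e. the sum of squared differences between successive errors divided by the sum of squared errors, which is not written out in the text. The function name, the sample values and the choice p = 2 are illustrative assumptions.

```python
import math


def validation_metrics(exper, pred, n_params):
    """Compute the criteria of Eqs. (3)-(11) plus the Durbin-Watson statistic."""
    n = len(exper)
    errors = [e - p for e, p in zip(exper, pred)]  # assumed: Error = Exper - Pred
    mean_e = sum(exper) / n
    mean_p = sum(pred) / n
    sse = sum(err ** 2 for err in errors)
    sst = sum((e - mean_e) ** 2 for e in exper)
    sae = sum(abs(err) for err in errors)
    rmse = math.sqrt(sse / n)
    rrmse = rmse / abs(mean_e)
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(pred, exper))
    r = cov / math.sqrt(sum((p - mean_p) ** 2 for p in pred) * sst)
    return {
        "RMSE": rmse,
        "RRMSE": rrmse,
        "MAE": sae / n,
        "MAPE": 100.0 * sum(abs(err / e) for err, e in zip(errors, exper)) / n,
        "R2": 1.0 - sse / sst,
        "Se/Sy": math.sqrt(n * sse / ((n - n_params) * sst)),
        "r": r,
        "Pi": rrmse / (1.0 + r),
        "CE": 1.0 - sae / sum(abs(e - mean_e) for e in exper),
        "DW": sum((errors[i] - errors[i - 1]) ** 2 for i in range(1, n)) / sse,
    }


# Hypothetical experimental and predicted values, for illustration only.
metrics = validation_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 37.0], n_params=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that MAPE divides by the experimental values, so it is undefined when any experimental value is zero.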

Results and discussion

Plasticity indices of the treated soils

Figure 10 plots the PI results for all clayey soil mixtures (S1, S2, S3) stabilized with combinations of NNP & NL. The results of the control clayey soils, i.e. with zero NNP & NL, are also plotted for comparison. It is clearly seen that the untreated clayey soils have the highest PI values, i.e. 30 or even more. However, when the nano-natural additives were incorporated, a significant decrease in PI was observed in all studied soils. Soil workability improves as PI decreases. The decrease in PI was more pronounced when the NL content was increased, and the best performance, in terms of PI, was achieved when 2% NNP and 1.2% NL were added together. PI values of less than 3 can be noted in all stabilized soils, irrespective of NNP fineness. This result can be explained as follows: (i) adding NNP may reduce the plasticity of the soil; (ii) adding NL to a plastic soil causes a colloidal reaction, which includes the replacement of the cations naturally carried on the clay surface by Ca2+, an increase in pH, and a reduction in the double layer. This promotes the flocculation and aggregation of the colloidal clay particles, making them less plastic [13, 64]; (iii) the pozzolanic reactions occurring between the hydrated lime and the active silica and alumina further reduce the plasticity of the soil [13].

Fig. 10 PI values of all clayey soil types treated with combinations of NNP & NL; S1-NNP100 nm (a), S1-NNP500 nm (b), S2-NNP100 nm (c), S2-NNP500 nm (d), S3-NNP100 nm (e), S3-NNP500 nm (f), respectively

Further grinding of NP to a median particle size of about 100 nm or 50 nm may effectively accelerate such pozzolanic reactions. This can be seen in Fig. 11, where much lower PI values were obtained when NNP of 100 nm or 50 nm size was added to the treated soils than when NNP of 500 nm size was added. In addition, Fig. 11 shows that grinding NNP from 100 to 50 nm did not produce a significant further reduction in PI. Understanding this behavior needs further investigation. However, from the authors’ point of view, some agglomeration caused by the van der Waals forces that develop, particularly when the nanoparticles are ground to finer sizes, may explain this behavior at present.

Fig. 11 PI values of all clayey soil types treated with combinations of NNP of different MPS & NL of 0.6% content; S1-NNP-NL0.6 (a), S2-NNP-NL0.6 (b), S3-NNP-NL0.6 (c), respectively

CBR

The CBR test is commonly used to assess soil strength and is widely considered a reliable method for pavement design [18, 33]. Figures 12 & 13 plot the CBR results for the soil mixtures stabilized with combinations of NNP & NL. From Fig. 12, it can be noted that CBR values increase as the NNP & NL contents increase. The highest CBR values, which exceed 70% in almost all treated soil mixtures irrespective of NNP size, were obtained when NNP & NL were added together at 2% and 1.2%, respectively. A subgrade with a soaked CBR value of more than 20% can be rated very good for pavement construction, and when this value reaches 50% or more, the layer would act as a very good sub-base. Therefore, as seen in Fig. 12, all soil mixtures stabilized with combinations of NNP ≥ 0.5% (ground to 100 nm size or less) & NL ≥ 0.3% can be recommended for sub-base construction. The improvement in the CBR values can be attributed to the gradual formation of cementitious compounds, such as C–S–H and C–A–S–H, when such combinations were used. The formation of cementitious compounds when NP & lime are added together to problematic clayey soils has been frequently reported in the literature [6, 11, 46, 53]. It can be attributed to the reactions between the CaO present in lime and the glassy phase, such as active silica (SiO2) and active alumina (Al2O3), in NP and the treated soil. These reactions are frequently referred to as “pozzolanic reactions”. Further, the prolonged grinding of NP may form a highly reactive material on the surface of the mineral particles, so that the nanoparticles interact effectively with other compounds in the treated soil [11, 36]. The formation of such cementitious compounds and the change in soil morphology were also confirmed by a microstructural analysis carried out on two clayey specimens, namely the untreated soil and the soil treated with a combination of NNP = 1.5% and NL = 0.9% and cured for 7 days, as shown in Fig. 14. Furthermore, the formation of more cementitious compounds can be enhanced by the 4-day soaking procedure specified in the CBR test, which acts as a curing time for the treated soil.

Fig. 12 CBR values of all clayey soil types treated with combinations of NNP & NL; S1-NNP100 nm (a), S1-NNP500 nm (b), S2-NNP100 nm (c), S2-NNP500 nm (d), S3-NNP100 nm (e), S3-NNP500 nm (f), respectively

Fig. 13 CBR values of all clayey soil types treated with combinations of NNP of different MPS & NL of 0.6% content; S1-NNP-NL0.6 (a), S2-NNP-NL0.6 (b), S3-NNP-NL0.6 (c), respectively

Fig. 14 Microstructural analysis & EDX of clayey soil mixture specimens cured for 7 days; S1-NNP0-NL0 (a, c), S1-NNP1%-NL0.9% (b, d), respectively

Further, as seen in Fig. 13, the treated soils containing NNP of 100 nm or 50 nm size have higher CBR values than those containing NNP of 500 nm size. However, it is worth noting that grinding NNP from 100 to 50 nm did not produce a significant further increase in CBR. Understanding this behavior needs further investigation. Meanwhile, from the authors’ point of view, some agglomeration caused by the van der Waals forces that develop, particularly when the nanoparticles are ground to a finer size, may explain this behavior at present.

Relationships between the investigated properties

The relationship between the PI and CBR of the studied soil mixtures is plotted in Fig. 15. This relationship was calculated for all obtained results. As shown in Fig. 15, the correlation between PI and CBR can be labelled excellent (R2 ≥ 0.913), irrespective of the median particle size of NNP and the soil type. Therefore, the CBR value can be estimated from the PI value, which saves time and money. Such a strong correlation between CBR & PI can be ascribed to the improvement of the expansive clay soils offered by the nano natural additives: the soils moved towards lower PI values & higher CBR values as the dosage of nano-additives increased and the MPS of NNP decreased.
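How such a relationship and its R2 can be obtained is sketched below with a simple least-squares line. The linear form is an assumption, as the text does not state the fitted function, and the (PI, CBR) pairs are hypothetical values invented purely for illustration; they are not results from this study.

```python
def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept, returned with its R2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical (PI, CBR) pairs: CBR rises as PI falls.
pi_values = [32.0, 25.0, 18.0, 10.0, 3.0]
cbr_values = [4.0, 20.0, 38.0, 55.0, 72.0]
slope, intercept, r2 = fit_line(pi_values, cbr_values)
print(round(slope, 3), round(intercept, 3), round(r2, 3))
```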

Fig. 15 Relationship between the investigated properties of soil mixtures prepared using combinations of NNP & NL

It is worth mentioning that further considerable improvements in the treated soil properties, such as significant reductions in free swell and swelling pressure, an important increase in unconfined compressive strength (UCS) and shear strength, and a considerable reduction in linear shrinkage, can also be obtained when nano-additives are used as soil stabilizers [13]. In addition, because of the lower affinity of nano natural pozzolan for water compared with the original expansive soil, no further water retention is expected, as the water molecules are adsorbed on the surface of the clay minerals [11].

Prediction of CBR & PI

To predict the CBR and PI of soil mixtures stabilized with combinations of NNP and NL, several models were constructed based on the dataset obtained from the experimental work conducted by the authors. Six input variables were adopted, i.e. NNP content, NL content, MPS, SiO2active, ILL and IPL. The models were developed using three different tools: regression, ANN & FL. The developed models were validated using the criteria given in Eqs. (3)–(11), together with the DW statistic.

Performance of CBR models

The performance criteria for the models constructed to predict the CBR of the stabilized soil mixtures are tabulated in Table 3. The results demonstrate that the CBR models constructed using ANN or FL outperformed the MLR model. They showed very low RMSE, low MAPE, very low Se/Sy and very high R2 values: RMSE values of less than 3.4, MAPE values of less than 11.4, Se/Sy values of less than 0.16 and R2 values higher than 0.975 were recorded for the ANN & FL models. In contrast, an RMSE of 7.9, a MAPE of more than 35, an Se/Sy of 0.37 and an R2 of 0.87 were recorded for the MLR model. In addition, the performance index (Pi) and the coefficient of efficiency (CE) of the MLR model were far from those of the ANN & FL models; its higher Pi (0.073) and lower CE (0.63) indicate a less accurate model. Further, the results in Table 3 show that ANN performs slightly better than FL as far as the prediction of CBR is concerned. Furthermore, the DW values obtained for the ANN, FL & MLR models were in the acceptable range (i.e. between 1.5 and 2.5). Despite the lower performance of MLR, CBR can still be predicted with a reasonable level of certainty using this method. Figure 16a clearly shows that the goodness-of-fit of the ANN & FL models is superior to that of the MLR model.

Table 3 The performance criteria for the models developed to predict CBR & PI of soils stabilized by combinations of NNP & NL
Fig. 16 The predicted versus the experimentally obtained values for all developed models: a CBR; b PI

Performance of PI models

The performance criteria for the models constructed to predict PI are tabulated in Table 3. A trend similar to that of the CBR prediction can be noted for the PI prediction. The results demonstrate that the PI models constructed using ANN or FL performed better than the MLR model. They provided very low RMSE, very low Se/Sy and very high R2 values: RMSE values of less than 1.75, MAPE values of less than 23.1, Se/Sy values of less than 0.22 and R2 values higher than 0.954 were recorded for the ANN & FL models. In contrast, an RMSE of 4.38, a MAPE of more than 54.88, an Se/Sy of 0.55 and an R2 of 0.71 were recorded for the MLR model. In addition, the higher performance index (Pi = 0.25) and the lower coefficient of efficiency (CE = 0.5) values recorded for the MLR model make it less accurate than the ANN & FL models. Therefore, the prediction of PI using the MLR technique would not be reliable.

The predicted versus experimental values are depicted in Fig. 16b. It is worth noting that the trend line fits the results of both the FL & ANN models very well, while the results of MLR lie far from this line. This indicates that desirable results can be obtained using both the ANN & FL approaches, whereas PI cannot be predicted with a reasonable level of certainty using the MLR method. As no investigation on the prediction of CBR or PI of soils stabilized with combinations of NNP and NL was found in the literature, further studies taking more variables into account are highly recommended. Nevertheless, similar results can be traced in the literature, particularly where CBR values for micro-sized NP were predicted using ANN [8, 73, 74, 79].

It can be noted in Fig. 17a that a lower MPS of NNP leads to higher CBR values, particularly when a higher content of NL is added. In addition, the combination of 2% NNP and 1.2% NL gave the highest CBR values, particularly as the MPS of NNP decreases, as shown in Fig. 17b. Therefore, adding finer NNP along with NL is recommended to obtain the best results.

Fig. 17 Some input variables versus CBR surface. a The combined effect of NL and NNP on CBR; b the combined effect of NL content and IPL on CBR

The relationships between the output (CBR or PI) and the input variables obtained using MLR could be written as follows:

-For the CBR predicted value,

$$\begin{aligned} {\text{CBR}} = {} & 5.012 + 13.858 \times {\text{NNP}} + 37.968 \times {\text{NL}} - 0.011 \times {\text{MPS}} + 1.344 \times {\text{SiO}}_{\text{2active}} \\ & - 0.824 \times {\text{ILL}} + 0.433 \times {\text{IPL}} \quad \left( P\text{-value} \le 0.05 \text{ for all inputs except for IPL} \right) \end{aligned}$$
(12)

-For the PI predicted value,

$$\begin{aligned} {\text{PI}} = {} & 31.794 - 4.022 \times {\text{NNP}} - 13.647 \times {\text{NL}} + 0.006 \times {\text{MPS}} - 0.433 \times {\text{SiO}}_{\text{2active}} \\ & + 0.136 \times {\text{ILL}} - 0.056 \times {\text{IPL}} \quad \left( P\text{-value} \le 0.05 \text{ for all inputs except for SiO}_{\text{2active}}\text{, ILL and IPL} \right) \end{aligned}$$
(13)

It is interesting to note from Eqs. (12) and (13) that the input variables related to the contents of NL & NNP and to the extremely small particle size of NNP (MPS) are statistically significant (P-value ≤ 0.05) for both CBR & PI. Thus, CBR & PI can be predicted based on these input variables with a confidence level of 95% or more. Therefore, the addition of NP nanoparticles of extremely small size would offer a promising approach to pavement engineers.
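For readers who wish to evaluate Eqs. (12) and (13) directly, the two regression equations can be sketched in Python as follows. The function names `cbr_mlr` and `pi_mlr` are hypothetical, the units are assumed to be those used for the dataset (contents in %, MPS in nm), and the silica term of Eq. (13) is read here as SiO2active, the input variable listed in the text.

```python
def cbr_mlr(nnp, nl, mps, sio2_active, ill, ipl):
    """CBR predicted by the MLR model, Eq. (12)."""
    return (5.012 + 13.858 * nnp + 37.968 * nl - 0.011 * mps
            + 1.344 * sio2_active - 0.824 * ill + 0.433 * ipl)


def pi_mlr(nnp, nl, mps, sio2_active, ill, ipl):
    """PI predicted by the MLR model, Eq. (13).

    Assumption: the SiO2 term refers to SiO2active, as in Eq. (12).
    """
    return (31.794 - 4.022 * nnp - 13.647 * nl + 0.006 * mps
            - 0.433 * sio2_active + 0.136 * ill - 0.056 * ipl)
```

The signs of the coefficients summarize the trends discussed above: higher NNP and NL contents raise the predicted CBR and lower the predicted PI, while a larger MPS has the opposite effect. Given the lower accuracy reported for MLR, particularly for PI, such values should be treated as rough estimates only.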

Sensitivity analysis of ANN models

The equation proposed by Garson [35] was employed to assess the relative importance of the input variables. It can be expressed as follows:

$$I_{j} = \frac{\sum_{m=1}^{N_h}\left(\frac{\left|w_{jm}^{ih}\right|}{\sum_{k=1}^{N_i}\left|w_{km}^{ih}\right|}\times \left|w_{mn}^{ho}\right|\right)}{\sum_{k=1}^{N_i}\left\{\sum_{m=1}^{N_h}\left(\frac{\left|w_{km}^{ih}\right|}{\sum_{k=1}^{N_i}\left|w_{km}^{ih}\right|}\times \left|w_{mn}^{ho}\right|\right)\right\}}$$
(14)

where Ij is the relative importance of the jth input variable on the output; Ni and Nh are the numbers of input and hidden neurons, respectively; w denotes the connection weights; the superscripts i, h, and o refer to the input, hidden, and output layers, respectively; and the subscripts k, m, and n refer to input, hidden, and output neurons, respectively.
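Eq. (14) can be sketched for a network with a single hidden layer and one output neuron as follows. This is a minimal NumPy illustration rather than the code used in the study; the function name `garson_importance` and the array layout (inputs in rows, hidden neurons in columns) are assumptions.

```python
import numpy as np


def garson_importance(w_ih, w_ho):
    """Relative importance of each input for one output neuron, Eq. (14).

    w_ih : array of shape (Ni, Nh), input-to-hidden weights.
    w_ho : array of shape (Nh,), hidden-to-output weights of output n.
    Assumes every hidden neuron has at least one non-zero input weight.
    """
    w_ih = np.abs(np.asarray(w_ih, dtype=float))
    w_ho = np.abs(np.asarray(w_ho, dtype=float))
    # Share of each input in every hidden neuron: |w_jm| / sum_k |w_km|
    share = w_ih / w_ih.sum(axis=0, keepdims=True)
    # Weight each share by |w_mn| and sum over hidden neurons (numerator)
    contribution = (share * w_ho).sum(axis=1)
    # Divide by the total over all inputs (denominator)
    return contribution / contribution.sum()
```

The returned values sum to one, so multiplying them by 100 gives relative importance percentages of the kind reported for each input variable.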

The results of the sensitivity analysis based on the Garson equation [35] are shown in Fig. 18. The relative importance of the input variables (NNP content, NL content, MPS of NNP, SiO2active, ILL and IPL) for the prediction of either CBR or PI can be clearly seen in this figure. Figure 18a shows that all variables have an effect on CBR. However, NL content was found to be the most influential variable, with a relative importance of more than 50%. The higher relative importance of NL content can be attributed to the significant effect of this variable on the CBR, in terms of the enhanced strength of clayey soils stabilized with combinations of NNP and NL. Other variables such as NNP and MPS also have significant effects on CBR, with relative importance values of 21% and 10%, respectively.

Fig. 18 The relative importance values of input variables for the investigated outputs (a CBR and b PI)

A similar trend was noted in the sensitivity analysis of the PI model. It can be seen in Fig. 18b that the most influential variable in terms of PI is also NL. A relative importance of about 40% was recorded for this variable alone, while other variables such as NNP, MPS and IPL have relative importance values of about 27%, 11% and 13%, respectively. This result confirms the critical role of NL and NNP in improving the soil properties, particularly those related to plasticity and strength. The formation of more cementitious compounds such as C–S–H and C–A–S–H can be expected when using such nano-additives. These cementitious compounds may modify the soil morphology, and thus a clayey soil mixture of higher workability and strength can be achieved.

Another sensitivity analysis process confirmed the results obtained by the equation proposed by Garson [35]. This process is based on the evaluation of possible combinations of input variables [31, 80]. The performance of each possible group in terms of MSE & R was evaluated using the developed ANN model. Six groups of one variable, fifteen groups of two variables, twenty groups of three variables, ten groups of four variables, four groups of five variables and one group of six variables were tested by the constructed ANN models. The results obtained for the different possible groups are tabulated in Table 4. As observed in Table 4, NL was the most influential variable: among the single-variable groups, it had the lowest MSE (144.54) and the highest correlation coefficient (R) of 0.845. When NNP was used together with NL, the MSE decreased significantly to 37.11 and R increased to 0.965. This was the best of all possible groups of two variables. In addition, the best group of three variables was obtained when the variable ILL was added to the former combination of NL & NNP; an MSE of 21.24 and an R of 0.973 were recorded for this group. Further, the group consisting of NL, NNP, MPS and ILL achieved the best performance among the groups of four variables, with an MSE of 18.08 and an R of 0.991. Similar performance was also obtained when the SiO2active variable was incorporated in the former group of four variables. Furthermore, it can be concluded that, for every group size (1, 2, 3, 4, 5 and 6 variables), the best-performing group contained NL.
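The group-wise procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: the enumeration is exhaustive (63 non-empty groups of six inputs), which may not coincide with the set of groups actually tested; a least-squares fit (`linear_stand_in`) is used as a stand-in for the ANN, which is not reproduced here; and the criteria are computed on the fitting data. All function names are hypothetical.

```python
from itertools import combinations

import numpy as np

INPUTS = ["NNP", "NL", "MPS", "SiO2active", "ILL", "IPL"]


def linear_stand_in(X, y):
    """Placeholder for the ANN: least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef  # predictions on the fitting data


def rank_subsets(X, y, names=INPUTS, fit_predict=linear_stand_in):
    """Return (group, MSE, R) for every non-empty group of inputs."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    results = []
    for size in range(1, len(names) + 1):
        for cols in combinations(range(len(names)), size):
            y_hat = fit_predict(X[:, list(cols)], y)
            mse = float(np.mean((y - y_hat) ** 2))
            r = float(np.corrcoef(y, y_hat)[0, 1])
            results.append((tuple(names[c] for c in cols), mse, r))
    return results


def best_per_size(results):
    """Pick the lowest-MSE group for each group size."""
    best = {}
    for group, mse, r in results:
        k = len(group)
        if k not in best or mse < best[k][1]:
            best[k] = (group, mse, r)
    return best
```

Passing a different `fit_predict` callable (for example, one that trains a neural network and returns its predictions) would reuse the same enumeration with another model.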

Table 4 Evaluation of possible combinations of input variables when the prediction of CBR is concerned

Conclusion

The present study, which is the first of its kind in Syria, is an attempt to investigate the applicability of some ML techniques for prediction of two important properties (i.e. CBR & PI) of expansive clay soils stabilized by nano natural additives. Nano-natural pozzolan (NNP) and nano-lime (NL) quarried from Syrian sites were added at different levels to achieve such a soil stabilization. The predictive models were developed using MLR, ANN & FL techniques. The dataset was obtained from locally conducted experiments. Six input variables were employed in the current study. These variables are NNP content (%), NL content (%), MPS of NNP (nm), SiO2active of NNP, ILL and IPL.

Based on the results obtained, the following conclusions can be drawn:

1. Based on the experimental results, the combination of 2% NNP & 1.2% NL appears to be the best dosage, giving the highest CBR values and the lowest PI values. Further investigations using more combinations of nano-natural additives are recommended to reach the ideal dosage.

2. The values predicted by the ANN & FL models are not far from the experimental results. Performance criteria such as RMSE, MAPE, R2, Pi, CE and DW have demonstrated that the ANN & FL models can accurately predict CBR & PI. However, the MLR models are far less accurate than the ANN & FL ones. The correlation coefficient (r) is very close to one and the error of prediction in both the ANN & FL models is lower than that obtained by MLR analysis.

3. Sensitivity analysis showed that all studied variables (i.e. NNP content, NL content, MPS, SiO2active, ILL and IPL) have considerable effects on the investigated properties. However, NL was found to be the most influential variable, with a relative importance of more than 50% and about 40% in terms of CBR and PI, respectively. In addition, NNP content and, to a lesser degree, MPS also have significant effects on the prediction of CBR & PI, with relative importance values of about 25% and 10%, respectively. On the other hand, SiO2active & ILL were found to be the least influential variables in both cases, with relative importance values not exceeding 7%.

4. Pavement designers may prefer the FL predictive model despite its slightly lower performance compared with the ANN models. This preference can be ascribed to the FL rules, which can be written in a way that resembles human reasoning, while the inner workings of the ANN predictive model are not visible to the user.

5. Determining the CBR & PI values by laboratory tests is time-consuming, costly and labor-intensive. Moreover, human errors and varying laboratory conditions introduce another element of uncertainty into laboratory results. Thus, a reliable estimation of such properties is extremely helpful in designing soil mixtures for pavement construction. Such an estimation can be made using more accurate techniques such as ANN & FL rather than conventional regression techniques, which have a lower reliability in predicting the investigated properties.

6. Further investigation with a larger dataset and the incorporation of further variables is highly recommended to confirm the results obtained by the developed predictive models. Variables such as the chemical composition of NNP, more geotechnical properties such as OMC & MDD, different fineness levels of NNP, NL…etc., can be further investigated.