Introduction

Information on subsurface soil layers obtained from samples retrieved from boreholes drilled in an area of interest can be used for several engineering purposes, including resolving geological, geotechnical, and environmental issues. However, both the number of boreholes drilled and the extent to which the exploration database can be expanded depend essentially on the importance, objective, scale, and budget of the project, which may require special laboratory and field procedures.

Soil type prediction and soil mapping are used by a wide range of individuals, for example farmers, town and country planners, conservationists, foresters, researchers, and engineers. Comparing cost, sample size, accuracy, and desired classification level can show the efficiency of different digital and conventional soil mapping approaches in producing categorical maps of soil types (e.g., Zeraatpisheh et al. 2017; Piikki and Söderström 2018; Camera et al. 2017). However, the small number of soil maps with adequate scale and the high costs associated with soil surveying and mapping have also been reported (Rizzo et al. 2016). In addition, characterizing the soil types of a particular landscape with a set of soil parameters is a difficult task because of the large spatial variation in soils (Santra et al. 2017a, b). As a result, a lack of adequate spatial coverage for planning applications and inconsistent soil mapping across countries can be observed in most of the soil mapping carried out in many countries, even in Europe (Freire et al. 2013; Tizpa et al. 2015; Piikki and Söderström 2018; Yiming et al. 2018; Carré and Girard 2002; Nagaraj 2000). Development of digital soil mapping through soil type prediction using field data integrated with geographic information systems (GIS), geostatistical, and soft computing techniques can help to close these gaps by providing new kinds of spatial soil information for more general and target-specific soil maps, a need that cannot be fully satisfied by legacy soil maps or formerly elaborated databases (Pásztor et al. 2017; Sindayihebura et al. 2017). The need to improve soil mapping methods has also been pointed out in earlier studies (e.g., Peterson 1991; Band and Moore 1995; Zhu and Mackay 2001).
In recent years, soft computing methods such as fuzzy logic and, in particular, artificial neural networks (ANNs) have been used to produce soil mapping systems from available datasets because of their ability to handle nonlinear problems (Dobos et al. 2006; Zhu et al. 2010; Pásztor et al. 2017). ANNs have also been applied successfully to soil classification (e.g., Olanloye 2014; Kurup and Griffin 2006; Bhattacharya and Solomatine 2006), soil mapping (e.g., Zhu et al. 2010; Arel 2012; Choobasti et al. 2015), and soil behavior modeling (Edincliler et al. 2013; Cevik et al. 2010). The data extracted from the resulting soil maps can be used to predict engineering properties, especially for preliminary design purposes (Cabalar et al. 2012; Jaksa 1995).

Among the various types of ANNs, the multi-layer perceptron (MLP) and the self-organizing feature map (SOFM) are the ones most commonly used in digital soil mapping because they use different classification approaches (Freire et al. 2013), and MLP results have been observed to be more accurate than those of SOFM (Albuquerque et al. 2009). However, when developing an MLP, problems such as computational effort, time, proneness to overfitting, and the empirical nature of the model should be considered (Kumar Gupta et al. 2017). Moreover, previous studies have demonstrated that an appropriate combination of ANNs and soil map data on known landscape features and the spatial variation of soils not only shows high accuracy (Sarmento et al. 2010) but can also be employed to predict soil types at locations where there are no current soil maps (e.g., Freire et al. 2013; Mcbratney et al. 2003; Tso and Mather 2001; Zhu 2000; Behrens et al. 2005; Carvalho Junior et al. 2011).

Because research on the performance of different ANN architectures and related aspects of soil spatial modeling at different scales and under different conditions is lacking (Freire et al. 2013), this study examines the applicability of a multi-output generalized feed forward neural network (GFNN) based model, developed using available piezocone penetration test (CPTu) data, for regional soil mapping in southwest Sweden. The soil behavior type (SBT) (Robertson et al. 1986) and the soil behavior type index (IC) (Jefferies and Davies 1993) predicted by the GFNN model were converted to 2D soil type distribution maps and then compared to known soil classification methods.

Applicability of CPTu information for soil type classification

The direct readings of CPTu, including depth, cone tip resistance (qc), sleeve friction (fs), and porewater pressure (u, denoted u2 when measured at the cone shoulder), can, after processing and correction, be used with soil type classification charts (e.g., Robertson et al. 1986; Robertson 1990; Jefferies and Been 2006). Compared with laboratory based procedures such as the Unified Soil Classification System (USCS), these charts require less detailed laboratory work (Abbaszadeh Shahri et al. 2015a; Arel 2012). The chart proposed by Robertson et al. (1986) consists of 12 zones that correspond to the USCS classes, whereas Robertson (1990) classifies the soils into nine zones. These charts use the processed and corrected CPTu data, including the cone tip resistance corrected for pore pressure acting on the cone shoulder (qt, where a is the net area ratio of the cone), friction ratio (Rf), normalized cone resistance (Qt), and normalized friction ratio (Fr), which can be obtained from Eqs. 1–4.

$$ {\mathrm{R}}_{\mathrm{f}}=\frac{{\mathrm{f}}_{\mathrm{s}}}{{\mathrm{q}}_{\mathrm{c}}}\times 100 $$
(1)
$$ {q}_t={q}_c+{u}_2\left(1-a\right) $$
(2)
$$ {\mathrm{Q}}_{\mathrm{t}}=\frac{{\mathrm{q}}_{\mathrm{t}}-{\upsigma}_{\mathrm{v}}}{\upsigma_{\mathrm{v}}^{\prime }} $$
(3)
$$ {\mathrm{F}}_{\mathrm{r}}=\frac{{\mathrm{f}}_{\mathrm{s}}}{{\mathrm{q}}_{\mathrm{t}}-{\upsigma}_{\mathrm{v}}}\times 100 $$
(4)

As indicated by Abbaszadeh Shahri et al. (2015a) with reference to Ku et al. (2010), the reliability of IC (Eq. 5) for classifying the mechanical behavior of soil has been confirmed. In this relation, the pore pressure ratio (Bq) is obtained from Eq. 6.

$$ {\mathrm{I}}_{\mathrm{C}}=\sqrt{\left[\left({\left(3-\log \left({\mathrm{Q}}_{\mathrm{t}}\left(1-{\mathrm{B}}_{\mathrm{q}}\right)\right)\right)}^2+{\left(1.5+1.3\log {\mathrm{F}}_{\mathrm{r}}\right)}^2\right)\right]} $$
(5)
$$ {\mathrm{B}}_{\mathrm{q}}=\frac{{\mathrm{u}}_2-{\mathrm{u}}_0}{{\mathrm{q}}_{\mathrm{t}}-{\upsigma}_{\mathrm{v}}} $$
(6)

where u0 is the in situ pore pressure and σv and σ′v are the total and effective overburden stresses, respectively.

Therefore, IC as an engineering concept can be implemented not only for soil classification using CPTu data but also to produce the soil profiles (Abbaszadeh Shahri et al. 2015a).
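As a rough illustration of how Eqs. 1–6 chain together for a single CPTu reading, the following Python sketch computes the derived parameters from the raw measurements. The function name, argument names, and the numerical values in the usage note are hypothetical. The sketch assumes that all stresses and pressures share one unit (e.g., kPa), that base-10 logarithms are used in Eq. 5, and that Fr is expressed in percent with the net cone resistance (qt − σv) in the denominator, following the Robertson (1990) convention.

```python
import math


def cptu_indices(qc, fs, u2, u0, sigma_v, sigma_v_eff, a):
    """Derive CPTu classification parameters for one reading.

    All stresses and pressures must be in the same unit (e.g. kPa);
    `a` is the dimensionless net area ratio of the cone.
    """
    rf = fs / qc * 100.0            # Eq. 1: friction ratio (%)
    qt = qc + u2 * (1.0 - a)        # Eq. 2: corrected cone resistance
    q_net = qt - sigma_v            # net cone resistance
    if q_net <= 0 or sigma_v_eff <= 0:
        raise ValueError("net cone resistance and effective stress must be positive")
    Qt = q_net / sigma_v_eff        # Eq. 3: normalized cone resistance
    Fr = fs / q_net * 100.0         # normalized friction ratio (%), assumed convention
    Bq = (u2 - u0) / q_net          # Eq. 6: pore pressure ratio
    arg = Qt * (1.0 - Bq)
    if arg <= 0 or Fr <= 0:
        raise ValueError("logarithm arguments in Eq. 5 must be positive")
    # Eq. 5: soil behavior type index (base-10 logarithms assumed)
    Ic = math.sqrt((3.0 - math.log10(arg)) ** 2 + (1.5 + 1.3 * math.log10(Fr)) ** 2)
    return {"Rf": rf, "qt": qt, "Qt": Qt, "Fr": Fr, "Bq": Bq, "Ic": Ic}
```

Calling `cptu_indices(qc=1000.0, fs=20.0, u2=300.0, u0=100.0, sigma_v=180.0, sigma_v_eff=80.0, a=0.8)` with these made-up values returns a dictionary whose `"Ic"` entry can be compared with the 2.6 threshold discussed later in the paper.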

The Robertson et al. (1986) soil profiling chart uses qt, Rf, and Bq, whereas Robertson (1990) uses Qt and Fr. A comparison of the two charts for identifying soil types using coded zones (SBT (Robertson et al. 1986) and SBTn (Robertson 1990)) is presented in Table 1, in which the boundaries between the identified zones represent the gradual transition from fine-grained to coarse-grained soils. These charts thus yield the soil profile, identifying the underlying soil types and their corresponding thicknesses.

Table 1 Proposed unification between 12 SBT zones (Robertson et al. 1986) and nine SBTn zones (Robertson 1990) and corresponding SBT index (IC)

Materials and datasets

The study area (Fig. 1a) lies along the Göta River in southwest Sweden and has been the subject of several geotechnical and geophysical investigations (Löfroth et al. 2011; Malehmir et al. 2013; Abbaszadeh Shahri et al. 2015a, b; Abbaszadeh Shahri 2016). The CPTu data provided by the Swedish Geotechnical Institute (SGI), together with some laboratory and field test results (e.g., Rannka et al. 2004; Löfroth et al. 2011; Millet 2011), were used in this paper. The topographic map of the study area was obtained from a digital elevation model (DEM), and the CPTu test points used, along with some nearby towns, were then located on it (Fig. 1b). To construct the database for further analyses, the CPTu soundings were divided by randomized selection into three sets of 55%, 25%, and 20% for training, testing, and validation, respectively (Fig. 1b).
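The randomized 55/25/20 partition can be sketched as follows. This is only an illustration: the sounding identifiers, the fixed seed, and the rounding rule are assumptions, and the paper does not state how fractional counts were resolved for the 58 soundings it reports.

```python
import random


def split_soundings(ids, fractions=(0.55, 0.25, 0.20), seed=0):
    """Randomly partition sounding IDs into training, testing, and validation sets."""
    ids = list(ids)
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(fractions[0] * len(ids))
    n_test = round(fractions[1] * len(ids))
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    valid = ids[n_train + n_test:]  # remainder goes to validation
    return train, test, valid


# Hypothetical identifiers for 58 soundings
train, test, valid = split_soundings([f"CPTu-{i:02d}" for i in range(1, 59)])
```

Splitting by whole sounding rather than by individual depth reading keeps all readings from one test point in the same set, which is how the text describes the partition.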

Fig. 1

(a) Location of studied area in southwest Sweden (www.vidiani.com/large-detailed-topographical-map-of-sweden/) and (b) location of randomized used CPTu test points on the provided topography map from DEM

Applied ANN model

ANNs are small-scale computational models inspired by the human brain that can be trained rapidly on different nonlinear problems using appropriate learning algorithms (e.g., Duda et al. 2001; Jordan and Bishop 2004; Theodoridis and Koutroumbas 2009; Arel 2012).

In this paper, a GFNN based model (Worden et al. 2007; Abbaszadeh Shahri et al. 2015b) is developed as a special class of MLP to predict the soil type classes. The GFNN is derived from the extended shunting inhibitory artificial neural networks (SIANNs) and includes two types of inputs: excitatory (equal to the number of shunting neurons) and inhibitory (Arulampalam and Bouzerdoum 2003). Shunting inhibition is a powerful computational mechanism that plays an important role in sensory information processing and provides more freedom in selecting the optimum network structure (Arulampalam and Bouzerdoum 2003). The GFNN classifier not only offers the mentioned advantage but also uses a generalized shunting neuron (GSN) model that enables the connections to jump over one or more layers and allows neurons to operate as adaptive nonlinear filters (Arulampalam and Bouzerdoum 2003; Abbaszadeh Shahri et al. 2015b). Moreover, the GSN is able to reduce both computational effort and memory requirements and can arrange the variables that should be used as excitatory inputs. In the GSN, all the excitatory inputs are summed and passed through an activation function, similar to a perceptron neuron (Eq. 7).

In the GFNN architecture, neurons in each layer receive inputs only from the preceding layer, calculate their outputs according to Eq. 7, and then transmit the resulting signals to the next layer. The hidden layers of the GFNN can consist of generalized shunting inhibitory neurons or perceptron-type neurons (Arulampalam and Bouzerdoum 2003; Worden et al. 2007; Abbaszadeh Shahri et al. 2015b). The role of the shunting inhibitory layers is to perform a nonlinear transformation on the inputs so that the results can easily be combined by the output neurons to provide the correct decision. Moreover, GFNNs are capable of forming complex, nonlinear decision boundaries and can solve problems much more efficiently than MLPs with the same number of processing elements (Abbaszadeh Shahri et al. 2015b).

$$ {x}_j=\frac{b_j+f\left(\sum \limits_i{w}_{ji}{I}_i+{w}_{jo}\right)}{a_j+g\left(\sum \limits_i{c}_{ji}{I}_i+{c}_{jo}\right)} $$
(7)

Where:

xj: output (activity) of the jth neuron; Ii: the ith input to the jth neuron; aj: passive decay rate of the neuron (positive constant); wji and cji: connection weights from the ith input to the jth neuron; bj, wjo, and cjo: constant biases; f and g: activation functions.
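To make Eq. 7 concrete, the forward pass of a single generalized shunting neuron can be sketched in a few lines of Python. This is a minimal sketch, not the implementation used in the study: the function name is hypothetical, the same input vector is assumed to feed both sums, and the choice of tanh for f and the exponential for g (which keeps the denominator positive when aj > 0) is an assumption made only for illustration.

```python
import math


def shunting_neuron(inputs, w, w0, c, c0, a, b, f=math.tanh, g=math.exp):
    """Output x_j of one generalized shunting neuron (Eq. 7).

    inputs : list of input values I_i
    w, w0  : excitatory weights w_ji and bias w_jo
    c, c0  : inhibitory weights c_ji and bias c_jo
    a, b   : passive decay rate a_j (> 0) and bias b_j
    f, g   : activation functions (tanh and exp are assumed defaults)
    """
    excitatory = f(sum(wi * xi for wi, xi in zip(w, inputs)) + w0)
    inhibitory = g(sum(ci * xi for ci, xi in zip(c, inputs)) + c0)
    denominator = a + inhibitory
    if denominator <= 0:
        raise ValueError("a_j + g(...) must stay positive to avoid division by zero")
    return (b + excitatory) / denominator


def shunting_layer(inputs, neurons):
    """Apply several neurons (each a dict of parameters) to the same inputs."""
    return [shunting_neuron(inputs, **params) for params in neurons]
```

The division by the inhibitory term is what distinguishes this neuron from a perceptron, where only the numerator-style weighted sum and activation would appear.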

To prevent network overfitting, finding an optimum model in both structure and size for evaluating the results is an important task that depends on the quality of the data used (Wang and Strong 1996). The GFNN model developed in this paper was found through trial and error using Matlab (2016a), an interactive computing environment with hundreds of functions. Moreover, its neural network toolbox includes the source code of various training algorithms, which can be adapted to the problem at hand.

Matlab can also be connected to an Excel spreadsheet, which allows Matlab commands to be issued from Excel. This feature greatly enhances data manipulation and sharing between programs (Juang et al. 2001; Arel 2012). The number of neurons, activation transfer functions, network arrays, and training algorithms were the variables used to find the optimized model. The weights that minimize the error function are then considered to be a solution to the learning problem, and can be found using the Delta-rule or gradient descent technique (Rojas 1996; Rumelhart et al. 1986). The training algorithms implemented were quick propagation, conjugate gradient descent, step, momentum, and Levenberg-Marquardt. The logistic, hyperbolic tangent, linear, softmax axon, and bias axon functions were used for activation of the hidden and output layers, and the sum of squares was employed as the output error function.
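The idea of finding weights that minimize a sum-of-squares error by gradient descent with a momentum term can be illustrated on a deliberately simple one-input linear model. The sketch below is in Python rather than Matlab and is not the toolbox routine used in the study; the learning rate, momentum coefficient, and toy data are assumptions chosen only so that the example converges.

```python
def train_momentum(xs, ys, lr=0.05, momentum=0.9, epochs=1000):
    """Fit y = w*x + b by gradient descent with momentum.

    Error function: E = 0.5 * sum((w*x + b - y)**2) (sum of squares).
    """
    w, b = 0.0, 0.0        # initial weights
    vw, vb = 0.0, 0.0      # velocity terms carrying the momentum
    for _ in range(epochs):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = sum(r * x for r, x in zip(residuals, xs))   # dE/dw
        grad_b = sum(residuals)                              # dE/db
        vw = momentum * vw - lr * grad_w
        vb = momentum * vb - lr * grad_b
        w += vw
        b += vb
    return w, b


# Toy data lying exactly on y = 2x + 1
w, b = train_momentum([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

The momentum term reuses part of the previous update, which damps oscillations along steep directions of the error surface and speeds progress along shallow ones.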

The information from 58 CPTu test points was used and randomly divided into 55%, 25%, and 20% to provide training, testing, and validation datasets (Fig. 3). Because the charts linked to cone parameters are widely accepted as efficient for determining soil stratigraphy and identifying soil type (Robertson et al. 1986; Robertson 1990) in practice-oriented projects (Abbaszadeh Shahri et al. 2015a), the SBT (Robertson et al. 1986) and IC were predicted as the outputs of the GFNN model in this paper. Using an iterative trial and error procedure, different GFNN structures were examined by changing the related components, including the number of neurons, layer arrangement, training algorithm, and activation function under different learning rates. For each tested structure, the controlling criteria, namely the network correlation (R2) and the minimum root mean square error (RMSE), were calculated. If these criteria were not achieved, the number of epochs, set to 1000 in this study, was employed as the termination criterion. Each structure was run three times. Among the 650 different tested models, a 4-8-6-2 structure (two hidden layers with 14 neurons in total) trained with the momentum algorithm and using the hyperbolic tangent activation function was found to be optimum. The information on inputs, outputs, training, and validation error as well as the calculated controlling criteria of the introduced model are given in Table 2, and the predicted soil types for one of the training test points are presented in Fig. 4.
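The two controlling criteria and the selection step of such a trial and error search can be sketched as below. The helper names and the tie-breaking rule are hypothetical, and R2 is computed here as the coefficient of determination 1 − SSres/SStot, which is an assumption because the paper does not give its formula.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def pick_best(candidates, actual):
    """Return the structure label whose predictions give the lowest RMSE.

    candidates maps a structure label (e.g. "4-8-6-2") to its predictions;
    ties are broken in favor of the higher R2.
    """
    return min(
        candidates,
        key=lambda k: (rmse(actual, candidates[k]), -r_squared(actual, candidates[k])),
    )
```

In a full search, each candidate structure would be trained (up to the epoch limit) before its predictions are scored this way, and repeated runs of one structure could be averaged before comparison.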

Table 2 Characteristics of optimum GFNN model using CPTu data

Mapping the predicted soil type classes

As presented in Table 2, the outputs of the GFNN, the predicted SBT and IC, were used to produce digitized soil type distribution maps at different depths. Because the predicted IC matched the field data more closely than the predicted SBT (94.6% versus 83.9%), IC was selected for soil mapping; one example is shown in Fig. 4. Moreover, Robertson (2016) showed that IC is more reliable for digitizing the soil classification in engineering applications and practice because the predicted values are closer to those calculated using conventional methods (Eq. 5). Furthermore, IC can delineate the clay-like or sand-like behavior of the underlying soils and can then be converted to SBT using Table 1. This is important for distinguishing areas prone to sliding or liquefaction, as illustrated in the next section. A series of 2D soil type distributions based on the IC predicted by the GFNN model at different depth intervals was therefore produced for a small part of the study area that includes the test points used (Fig. 3); these are shown in Fig. 5(a–i). The mapped area in Fig. 5 is marked by a rectangle in Figs. 2 and 3. Considering that IC > 2.6 can be assigned to susceptible soil layers, the sensitivity (St) of the clayey materials in this area increases with depth.
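One step that precedes such depth-wise maps is summarizing each sounding's predicted IC profile per depth interval and flagging intervals above the 2.6 threshold. The sketch below assumes 1.5 m intervals (the spacing named in the caption of Fig. 5) and a simple arithmetic mean within each interval. The averaging rule and the function name are assumptions, since the paper does not describe how values were aggregated or interpolated between test points.

```python
def ic_by_depth_interval(depths, ic_values, interval=1.5, threshold=2.6):
    """Average IC within fixed depth intervals and flag clay-like intervals.

    Returns a list of (top, bottom, mean_ic, is_clay_like) tuples,
    ordered from the shallowest interval downward.
    """
    bins = {}
    for z, ic in zip(depths, ic_values):
        k = int(z // interval)          # index of the interval containing depth z
        bins.setdefault(k, []).append(ic)
    summary = []
    for k in sorted(bins):
        mean_ic = sum(bins[k]) / len(bins[k])
        summary.append((k * interval, (k + 1) * interval, mean_ic, mean_ic > threshold))
    return summary
```

Each resulting tuple gives one value per sounding and depth interval, which could then be assigned to the sounding's coordinates and interpolated spatially by whatever method the mapping software provides.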

Fig. 2

The generic architecture of feed forward SIANN structure (Arulampalam and Bouzerdoum 2003)

Fig. 3

3D map of study area created from satellite image and location of randomized CPTu test points. The surrounded area in the white rectangle is used to represent the predicted 2D soil type map

Fig. 4

Comparison between actual and predicted soil types in one of the training test points based on (a) SBT and (b) IC values

Fig. 5

Predicted soil type distribution for 1.5 m depth intervals using the optimized GFNN model in this study (the boundary values of IC can be found in Table 1). The axes are based on decimal degrees. The geographical location in these maps has been delineated using a rectangle in Figs. 2 and 3

Fig. 6

Scatter of actual versus predicted IC and SBT values about the 1:1 line

Fig. 7

Comparison of (a and b) predicted IC and SBT according to the defined boundaries (Table 1) using the GFNN model and CPTu interpretations with available USCS data, (c) the soil classes obtained using the fuzzy approach (Zhang and Tumay 1999), and (d) the soil types classified according to Douglas and Olsen (1981). The methods and abbreviations used are described in the text

Discussion

The introduced GFNN model was evaluated using validation datasets not previously seen by the network. Compared with the interpreted CPTu data and conventional soil type charts, the predicted SBT and IC values in the validation datasets showed accuracies of 84.5% and 90.34%, respectively, whereas 83.9% and 94.6% were observed for all datasets. Moreover, the IC values are used to screen out layers susceptible to liquefaction (i.e., IC > 2.6) and also to differentiate between clay-like and sand-like soils (Robertson 2016). This is an important issue in Sweden and in other areas that suffer from frequent landslides, which mainly occur in clayey soils and in quick clay in particular. As shown in Fig. 5, an expansion of the zones of clay-like behavior with IC > 2.6 can be observed, which, according to the surveyed geological conditions and the landslides that have occurred in the area (Klingberg 2010), can be attributed to the presence of quick clay. The soil type distribution presented in Fig. 5 also showed a trend similar to that found in geophysical and geotechnical investigations as well as laboratory tests (Abbaszadeh Shahri et al. 2015a; Malehmir et al. 2013; Löfroth et al. 2011), which aimed to provide high resolution images of the subsurface layers and in particular to distinguish the quick clays in part of this area. Soil type mapping based on IC can thus play an important role in detecting possible hazardous areas and delineating areas of clay-like behavior (i.e., IC > 2.6).

The scatter of actual versus predicted SBT and IC values about the 1:1 line is shown in Fig. 6. IC takes a continuous range of values, whereas SBT takes integer values (Table 1). The clustering of data around the 1:1 line, which indicates good performance, is therefore more visible for IC than for SBT; data lying on the line indicate accurate prediction. The calculated coefficients of determination (R2) and the percentage of outputs classified correctly by the GFNN (Table 3) also showed better performance for IC than for SBT.
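A percentage of correctly classified outputs can be computed in more than one way, and the paper does not spell out its rule. The sketch below shows two plausible variants under stated assumptions: for the integer-valued SBT, a prediction counts as correct when it rounds to the actual zone number; for the continuous IC, a prediction counts as correct when it falls on the same side of the 2.6 threshold as the actual value. Both function names are hypothetical.

```python
def percent_correct_sbt(actual_zones, predicted_values):
    """Share of predictions (in %) that round to the actual integer SBT zone."""
    hits = sum(1 for a, p in zip(actual_zones, predicted_values) if round(p) == a)
    return 100.0 * hits / len(actual_zones)


def percent_same_behavior(actual_ic, predicted_ic, threshold=2.6):
    """Share of predictions (in %) on the same side of the IC threshold."""
    hits = sum(
        1 for a, p in zip(actual_ic, predicted_ic) if (a > threshold) == (p > threshold)
    )
    return 100.0 * hits / len(actual_ic)
```

A finer variant for IC would compare the Table 1 zone of each predicted value with the zone of the actual value, which requires the zone boundaries listed in that table.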

Table 3 Percentage of correctly classified predicted values using the GFNN model

The accuracy of the predicted IC in the soil type maps produced for each test point was also evaluated against other known soil classification procedures (Zhang and Tumay 1999; Douglas and Olsen 1981) as well as available field and laboratory tests in the study area. As an example, one of the investigated test points belonging to the validation datasets is presented (Fig. 7). For this test point, USCS data were also available (Fig. 7). The USCS classifies soils by texture and grain size into groups for engineering geology purposes and can be applied to most unconsolidated materials. Each group is expressed by a two-letter symbol: the first letter indicates the soil type and the second corresponds to the grading and plasticity conditions. M and C denote silt and clay, respectively, while L denotes low plasticity (Fig. 7). As can be seen from Fig. 7a, the IC predicted using the GFNN agreed well with both the USCS data and the conventional CPTu soil type charts. The chart by Robertson et al. (1986) is able to cover most of the classes defined in the USCS, whereas the method of Zhang and Tumay (1999) indicated highly clayey soils containing 5–6% sand and 43–44% silt, which can be classified as CL, ML, and CH in the USCS (Fig. 7c). The method of Douglas and Olsen (1981) also classified the soil types in the CL-CH group (Fig. 7d). Comparison of these analyses showed better agreement for the Robertson et al. (1986) and Zhang and Tumay (1999) methods. The results also showed that IC is not only more reliable but also provides a logical basis for soil classification in engineering applications and practice. The test point marked in Fig. 5, which belongs to the validation datasets, has been the subject of geophysical, geotechnical, and laboratory investigations (Abbaszadeh Shahri et al. 2015a; Malehmir et al. 2013; Löfroth et al. 2011).
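The two-letter structure of the USCS symbols mentioned above lends itself to a small lookup, sketched here in Python. Only M, C, and L are explained in the text; the remaining letters follow the standard USCS convention and are included as an assumption of this sketch. Special cases such as peat (Pt) and dual symbols such as CL-ML are deliberately not handled.

```python
# First letter: main soil type; second letter: grading or plasticity qualifier.
FIRST_LETTER = {"G": "gravel", "S": "sand", "M": "silt", "C": "clay", "O": "organic soil"}
SECOND_LETTER = {
    "W": "well graded",
    "P": "poorly graded",
    "M": "silty",
    "C": "clayey",
    "L": "low plasticity",
    "H": "high plasticity",
}


def describe_uscs(symbol):
    """Translate a two-letter USCS group symbol into a short description."""
    symbol = symbol.strip().upper()
    if len(symbol) != 2 or symbol[0] not in FIRST_LETTER or symbol[1] not in SECOND_LETTER:
        raise ValueError(f"unsupported USCS symbol: {symbol!r}")
    return f"{FIRST_LETTER[symbol[0]]}, {SECOND_LETTER[symbol[1]]}"
```

For the groups named in the text, `describe_uscs("CL")`, `describe_uscs("ML")`, and `describe_uscs("CH")` give low-plasticity clay, low-plasticity silt, and high-plasticity clay descriptions, respectively.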

Conclusions

In the current study, a multi-output GFNN based model was developed to predict the distribution map of soil types at various depth intervals using recorded CPTu data in southwest Sweden. Two outputs, SBT and IC, were used for the GFNN, and the results were compared to different conventional soil classification charts and a statistical fuzzy approach. The analyses showed that a four layer multi-output GFNN with a 4-8-6-2 structure and a 94.3% success rate can be considered the best model in this study. Comparison of the predicted GFNN outputs showed that the IC values performed better than SBT and can be used for soil distribution mapping and for distinguishing soil types. The performance of the GFNN model in estimating the complex soil profile plots was also checked using the coefficient of determination calculated for the soil type distribution maps from both predicted outputs (SBT and IC). Comparison of the soil type classes predicted using the GFNN with other known soil identification procedures showed acceptable accuracy in the estimated soil type classes and digital soil type maps. Moreover, at many test points the predictions followed a trend similar to, or even better than, that of the conventional methods. The studied area is prone to landslide hazard, and the generated soil mapping can therefore play an important role in construction and environmental issues. The integration of the GFNN with chart based soil classification methods thus provides a powerful tool for a reliable prediction process.

It should be noted that artificial intelligence based models can always be updated and modified to improve the results as new examples or more data become available, so that more complex and largely unpredictable situations and problems in geotechnical engineering may be handled with more confidence.