1 Introduction

Pile foundations are used to transmit the superstructure load to deeper strata when the subsurface soil is of inadequate strength. They are often subjected to both axial and lateral loads. Under the action of lateral loads and moments, some of the piles in a group may experience uplift displacement. In compressive loading, the tip resistance of the pile plays a major role in pile capacity. Under uplift, in contrast, the shaft resistance alone works against the uplift force on the pile. The tensile strength of soil is quite small in comparison to its shear strength, and it can be safely neglected for a conservative estimate of uplift pile displacement.

To estimate the load-bearing capacity and settlement of piles, one or more pile loading tests (PLT) and pile dynamic analysis (PDA) tests may be performed, depending on the importance of the project. Because of the high cost and time required for such tests, it is common practice for engineers to estimate the load-bearing capacity of piles from in situ tests, such as the cone penetration test (CPT), standard penetration test (SPT), dilatometer test and pressuremeter test, and then to apply a reasonable safety factor during design to achieve a stable foundation (Abu-Farsakh and Titi 2004). The cone penetration test is one of the most widely applicable penetration and investigation tests for the analysis and design of piles. Its simplicity, speed, cost effectiveness, continuous depth record, and ability to carry various types of probe sensors have contributed to its popularity. Furthermore, because the cone rod behaves similarly to a prototype pile, the results of cone penetration tests are more credible than those of other in situ tests in cases where piles are the preferred foundation system. In this test, cone resistance (qc) and shaft friction (fs) are recorded simultaneously as the tip penetrates the soil. Two main approaches, direct and indirect, are available for using cone penetration data in pile design. In the indirect method, the CPT data (i.e., qc and fs) are used primarily to estimate soil parameters such as undrained shear strength (Su), internal friction angle (ϕ), and elastic modulus (Es). These parameters are then used to obtain the settlement of a pile using equations obtained from semi-experimental/analytical methods (Eslami and Fellenius 1997). In the direct method, the cone penetration test is conducted in the soil stratum, and the cone resistance (qc) and shaft friction (fs) are measured.
These values are then used directly to predict the pile bearing capacity. Up to now, the direct method has been used only for the prediction of pile capacity (Cai et al. 2009, 2012; Baziar et al. 2012). In this study, for the first time, the direct method is applied to the prediction of uplift pile displacement. Considerable progress has been made over the past decades in developing procedures for estimating the load-settlement behavior of piles under compressive loading. However, only limited effort has been directed towards understanding pile behavior under uplift loads. Experimental studies in this direction have been carried out by researchers such as Ismael and Klym (1979), Rybnikov (1990) and Naggar et al. (2000). Such experimental studies need to be complemented with analytical studies to broaden their application to a variety of field and pile loading conditions.

2 Different Methods of Unit Shaft Resistance Estimation

In the literature, the uplift capacity of piles is not widely discussed. Most researchers have treated it as a percentage of the skin friction component of pile capacity. Some of the methods for estimating unit shaft resistance are described below.

2.1 Schmertmann and Nottingham Method

In this method, to estimate the unit shaft bearing capacity of the piles, two models have been introduced depending on the soil types (Schmertmann 1978; Nottingham 1975).

In the first method, applicable in both clay and sand, the shaft resistance of the pile (rs) is obtained from the cone shaft friction (fs) using the following equation:

$$r_{s} = K \cdot f_{s}$$
(1)

where K is a dimensionless coefficient depending on the type of soil, the shape and material of the pile, the type of cone penetrometer and the embedment ratio.

This value ranges from 0.8 to 2 for sandy soils and from 0.2 to 1.25 for clayey soils. In sands, the unit shaft resistance is assumed to vary linearly from zero at the ground surface to the value obtained from Eq. 1 at a depth equal to 8 pile diameters.

In the second method, applicable only in sands, the shaft resistance is estimated based on cone resistance (qc) as follows:

$$r_{s} = {\text{C}} \cdot q_{c}$$
(2)

where C is a dimensionless coefficient depending on the type of pile ranging from 0.8 to 1.8 %.
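As a rough illustration of Eqs. 1 and 2, the following Python sketch evaluates the unit shaft resistance from a single CPT reading. The function names, argument names and sample values are assumptions made for this example only; K and C must be selected by the engineer within the ranges quoted above.

```python
def shaft_resistance_from_fs(fs, K, depth=None, diameter=None, sand=False):
    """Eq. 1: r_s = K * f_s (result is in the units of fs).

    For sands, r_s is ramped linearly from zero at the ground surface
    to the full Eq. 1 value at a depth of 8 pile diameters.
    """
    rs = K * fs
    if sand and depth is not None and diameter is not None:
        ramp_depth = 8.0 * diameter
        if depth < ramp_depth:
            rs *= depth / ramp_depth
    return rs


def shaft_resistance_from_qc(qc, C):
    """Eq. 2 (sands only): r_s = C * q_c, with C given as a fraction."""
    return C * qc


# Illustrative values only (kPa and m); they are not taken from the database.
rs_eq1 = shaft_resistance_from_fs(fs=50.0, K=1.0, depth=2.0, diameter=0.5, sand=True)
rs_eq2 = shaft_resistance_from_qc(qc=8000.0, C=0.01)
print(rs_eq1, rs_eq2)
```

In this sketch a C of 1 % is passed as 0.01, so the quoted range of 0.8 to 1.8 % corresponds to 0.008 to 0.018.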

2.2 European Method

The European method, suggested by De Ruiter and Beringen (1979), was developed based on the experiences obtained from the constructions in the North Sea. This method also differentiates between the piles driven in sandy or clayey soils.

To calculate the shaft bearing capacity of the pile in sands, Eqs. 1 and 2, suggested by Schmertmann (1978) and Nottingham (1975), are used with different values of the dimensionless coefficients. K in Eq. 1 is assumed to be 1 and C in Eq. 2 is suggested to be 0.003.

To calculate the unit shaft friction in clayey soils using this method, the undrained shear strength is first calculated from the following equation:

$$S_{u} = \frac{{q_{c} }}{{N_{k} }}$$
(3)

where Nk is the cone dimensionless factor obtainable from the in situ experiments ranging from 15 to 20. Then the unit shaft friction of the pile is calculated using the following equation:

$$r_{s} = \alpha S_{u}$$
(4)

where α is the cohesion factor as suggested by the US Oil Agency (Ardalan et al. 2009). This value is 0.5 for overconsolidated clays and 1 for normally consolidated clays.
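The two-step calculation of Eqs. 3 and 4 can be sketched in Python as follows. The function name and the sample numbers are hypothetical; Nk and α are site-specific choices within the ranges given in the text.

```python
def unit_shaft_friction_clay(qc, Nk, alpha):
    """Eqs. 3 and 4: S_u = q_c / N_k, then r_s = alpha * S_u."""
    su = qc / Nk          # undrained shear strength, same units as qc
    return alpha * su     # unit shaft friction


# Illustrative reading: qc = 1500 kPa in a normally consolidated clay.
rs_clay = unit_shaft_friction_clay(qc=1500.0, Nk=15.0, alpha=1.0)
print(rs_clay)
```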

2.3 French Method (LCPC)

This method was proposed on the basis of 197 experimental loading tests (Bustamante and Gianeselli 1982). In this method, the shaft resistance of piles is obtained from the cone resistance, qc, as follows:

$$r_{s} = \frac{{q_{c} }}{{\alpha_{LCPC} }}$$
(5)

where αLCPC is a dimensionless coefficient of friction ranging from 30 to 200 depending on the type of soil, value of cone resistance, and the type and installation method of pile.

As can be observed, Eq. 5 is similar to Eq. 2, with the dimensionless coefficient C in the range of 0.5–3.0 %.

For all the methods mentioned above (Eqs. 1, 2, 3, 4, 5), an upper bound of 120 kPa is imposed on the unit shaft friction of the pile.
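A minimal Python sketch of Eq. 5 together with the 120 kPa upper bound is given below. The function and constant names are assumptions for illustration, and qc is assumed to be supplied in kPa so that the cap can be applied directly.

```python
MAX_UNIT_SHAFT_FRICTION_KPA = 120.0  # upper bound stated in the text


def lcpc_shaft_resistance(qc_kpa, alpha_lcpc):
    """Eq. 5: r_s = q_c / alpha_LCPC, limited to 120 kPa."""
    rs = qc_kpa / alpha_lcpc
    return min(rs, MAX_UNIT_SHAFT_FRICTION_KPA)


# Illustrative readings; alpha_LCPC = 60 is an arbitrary value inside 30-200.
rs_uncapped = lcpc_shaft_resistance(qc_kpa=6000.0, alpha_lcpc=60.0)
rs_capped = lcpc_shaft_resistance(qc_kpa=12000.0, alpha_lcpc=60.0)
print(rs_uncapped, rs_capped)
```

The second call shows the cap taking effect: the raw value from Eq. 5 exceeds the bound, so the bound is returned instead.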

2.4 Eslami and Fellenius Method

The shortcoming of the above methods was the fact that the soil classification was not taken into account in their calculation. Eslami and Fellenius (1997), employing the CPT apparatus, capable of measuring the piezocone pore pressure, developed a new model considering the soil classification.

In their method, the soil classification is determined from a chart based on the effective cone resistance (qe) and the shaft friction of the cone (fs).

In this method, the effective cone resistance, qe, is used instead of the measured cone resistance, qc, to estimate pile bearing capacity. The value of qe is obtained by subtracting the pore pressure measured behind the cone, u2, from the corrected cone resistance (qt), and the unit shaft resistance is then determined by the following equation:

$$r_{s} = C_{s} q_{e}$$
(6)

where Cs is the dimensionless shaft correlation coefficient, which is a function of soil type. The behavioral profile chart mentioned above is used to determine the type of soil and, subsequently, the Cs coefficient. They suggested Cs values of 0.08 for soft soils with high sensitivity, 0.05 for clays, 0.025 for stiff clays or silty clays, 0.01 for silty sands, and 0.004 for clean sands.
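The lookup of Cs and the evaluation of Eq. 6 can be sketched in Python as below. The dictionary keys and function names are labels invented for this example, and the soil type is assumed to have been read from the profiling chart beforehand; the chart itself is not reproduced here.

```python
# Shaft correlation coefficients C_s quoted in the text.
# The keys are labels chosen for this sketch, not terms from the source chart.
CS_BY_SOIL = {
    "soft_sensitive": 0.08,
    "clay": 0.05,
    "stiff_clay_or_silty_clay": 0.025,
    "silty_sand": 0.01,
    "clean_sand": 0.004,
}


def effective_cone_resistance(qt, u2):
    """q_e = q_t - u_2 (pore pressure measured behind the cone)."""
    return qt - u2


def eslami_fellenius_shaft_resistance(qt, u2, soil_type):
    """Eq. 6: r_s = C_s * q_e for an already classified soil."""
    return CS_BY_SOIL[soil_type] * effective_cone_resistance(qt, u2)


# Illustrative reading in kPa, not taken from the database.
rs_ef = eslami_fellenius_shaft_resistance(qt=2100.0, u2=100.0, soil_type="clay")
print(rs_ef)
```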

3 Machine Learning Technique for Estimation of Bearing Capacity of Pile

Zadeh (1994), who coined the term soft computing, described it as follows: “Soft computing is a collection of methodologies that aim to exploit the tolerance for imprecision and uncertainty to achieve tractability, robustness, and low solution cost. Its principal constituents are fuzzy logic, neurocomputing, and probabilistic reasoning”. For complex problems where the relationship between the variables is unknown, a machine learning technique, for example an artificial neural network (ANN) or genetic programming (GP), is a powerful predictive tool, as long as it resembles the nature of the situation.

Other researchers have previously shown that complex phenomena such as liquefaction or pile capacity can be predicted more accurately by ANN than by conventional methods (Goh 1996; Baziar and Saeedi Azizkandi 2013; Baziar et al. 2012).

The role model for soft computing is the human mind. Soft computing can be seen as a collection of techniques that attempt to mimic natural creatures (plants, animals, human beings), which are soft, flexible, adaptive and clever. It can be described as a family of problem-solving methods that are analogous to biological reasoning and problem solving. It includes basic methods such as neural networks (NN), genetic algorithms (GA) and genetic programming (GP), which do not derive from classical theories. Soft computing can also be seen as a foundation for the growing field of computational intelligence (CI), an alternative to traditional artificial intelligence (AI), which is based on hard computing. In many ways, soft computing represents a significant paradigm shift in the aims of computing, one that reflects the fact that the human mind possesses a remarkable ability to store and process information which is pervasively imprecise, uncertain and lacking in categorization. Two soft computing approaches, based on neural networks and genetic programming, were implemented in the present study.

3.1 Artificial Neural Network Models

A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:

  • Knowledge is acquired by the network through a learning process.

  • Interneuron connection strengths known as synaptic weights are used to store the knowledge.

The procedure used to perform the learning process is called a learning algorithm, a function that modifies the synaptic weights of the network in an orderly fashion so as to attain a desired design objective.

In the general form of a neural network, the unit analogous to the biological neuron is referred to as a processing element (PE). The network consists of many of these elements, organized into a sequence of layers or slabs, with full or partial connections between successive layers. Figure 1 shows a simple two-layer network architecture. The neural network has an input buffer (not counted as a layer), through which data are presented to the network, and an output layer, which holds the response of the network to a given input. Layers between the input buffer and the output layer are called hidden layers. As shown in Fig. 1, a processing element (artificial neuron), usually excluding those in the input buffer, applies a summation (∑) and a transfer function (F) to determine the value of its output. The S-shaped sigmoid function is commonly used as the transfer function. Neural networks are typically of two types: (1) ‘feed-forward’ or non-recurrent, where the PE connections, and thus the information flow, run in one direction as shown in Fig. 1; and (2) ‘recurrent’, which has a more general structure and allows weighted feedback connections extending from one layer to another or to itself. The network used in this research is a feed-forward network. There are two main phases in the operation of a neural network: learning and recall. Learning is the process of adapting the connection weights in response to a number of examples (stimuli) presented at the input buffer and, optionally, at the output buffer. The task is to arrive at a set of weights that correctly associates every example pattern used in learning with its desired output pattern. A training algorithm specifies how the weights adapt in response to a learning example.

Fig. 1 Typical architecture of artificial neural network (Baziar and Ghorbani 2005)

The most frequently used neural network paradigm is the back-propagation learning algorithm. To improve the speed and general performance of back-propagation, techniques such as Delta-bar-Delta (momentum and adaptive learning rate) and Levenberg–Marquardt methods are commonly used.

In a typical learning session, learning examples are presented to the network many thousands of times (epochs) until a preset stopping criterion is met. One such criterion is to consider the network adequately trained when the error between the predicted and desired (target) outputs, accumulated over all learning examples (for example, as the sum-squared error), falls below a specified limit. Learning sessions often consume large amounts of computer time and can face serious problems, including network ‘paralysis’ and ‘local minima’. Training algorithms may use deterministic or statistical procedures for adjusting the network weights. Statistical procedures have been used to alleviate the local minima problem and where high network accuracy is desired despite the longer training time they require. If training is successful, the network represents a model that can be recalled by applying a set of inputs to the network. The model is then expected to produce outputs that are satisfactorily close to the desired outputs used in training.

In this study, the model consists of a hidden layer containing 5 neurons and the uplift pile displacement is considered as the only output.
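To make the recall phase concrete, the following Python sketch runs one forward pass through a network with five inputs, one hidden layer of five neurons and a single output. The weights are random placeholders rather than the trained values of this study, and the use of a sigmoid on the output neuron, the input scaling to [0, 1] and the input ordering are assumptions of the sketch.

```python
import math
import random


def sigmoid(x):
    """S-shaped transfer function F."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One recall pass: weighted summation followed by F at each neuron."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


random.seed(0)
n_inputs, n_hidden = 5, 5
# Placeholder weights; a trained network would supply these instead.
w_hidden = [[random.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
b_hidden = [random.uniform(-1, 1) for _ in range(n_hidden)]
w_out = [random.uniform(-1, 1) for _ in range(n_hidden)]
b_out = random.uniform(-1, 1)

# Hypothetical scaled inputs, e.g. (P_load, q_c, f_s, D_eq, L).
x = [0.4, 0.6, 0.3, 0.5, 0.7]
y = forward(x, w_hidden, b_hidden, w_out, b_out)
print(y)  # scaled output; it would need rescaling to a displacement
```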

3.2 Genetic Algorithm and Genetic Programming

As an optimization technique, the genetic algorithm (GA), which evolved from the principles of genetics and natural selection, searches for the minimum of a given function through a trial process. A genetic algorithm optimizes an array of input variables, or chromosomes, which may be encoded in different ways, such as binary strings (0, 1), real strings (0, 1, …, 9), and trees (computer programs). Koza (1990) developed a special genetic algorithm known as “genetic programming (GP)”, in which each chromosome in the population is a program composed of randomly chosen mathematical functions and terminals. A function set could contain basic mathematical operators (+, −, ×, /, etc.), Boolean logic functions (AND, OR, NOT, etc.), or any other user-defined function (Cevik et al. 2009; Jafarian et al. 2010; Baziar and Saeedi Azizkandi 2013).
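The idea of a chromosome as a program built from functions and terminals can be illustrated with a small Python sketch. It only generates and evaluates random expression trees; selection, crossover and mutation are omitted, and the function set, terminal set and tree depth are arbitrary choices for this example (division is left out to avoid division by zero).

```python
import operator
import random

FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["qc", "fs", 1.0, 2.0]  # variables and constants


def random_program(depth, rng):
    """Grow a random expression tree, i.e. one GP chromosome."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(FUNCTIONS))
    return (op, random_program(depth - 1, rng), random_program(depth - 1, rng))


def evaluate(program, variables):
    """Recursively evaluate a tree for the given variable values."""
    if isinstance(program, tuple):
        op, left, right = program
        return FUNCTIONS[op](evaluate(left, variables), evaluate(right, variables))
    if isinstance(program, str):
        return variables[program]
    return program  # numeric constant


rng = random.Random(1)
population = [random_program(3, rng) for _ in range(4)]
for prog in population:
    print(prog, "->", evaluate(prog, {"qc": 5.0, "fs": 0.1}))
```

In a full GP run, each program would be scored against the training data and the better programs recombined to form the next generation.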

4 Data Set

In order to use data mining methods such as neural networks and genetic programming, obtaining appropriate data is an essential task. Any data set should meet three requirements:

  1. Reliability: the data must be real and accurate.

  2. Adequacy: the quantity of data must be adequate considering the dimensions and complexity of the problem.

  3. Coverage: all aspects of the phenomenon should be covered in the resulting analysis.

The data set collected in this study consists of 157 experimental records from real pile tension loading tests.

This database includes information about the diameter and length of the piles, the type of soil, the in situ CPT results and the static tension loading test results. The data have been reported from 5 sites in 3 countries. The sources used for collecting this database, as well as the locations of the tests, are noted in Table 1. The piles are of different kinds and were driven in various districts. The minimum embedment length of the piles is 8.2 m and the maximum is 34.25 m. The pile diameter ranges from 270 to 800 mm, and the uplift bearing capacity of the piles varies from 485 to 3,250 kN.

Table 1 Database references

To construct a soft computing model, it is common practice to divide the available data into two subsets: a training set and an independent validation set used to estimate model performance in the deployed environment. In this study, 75 % of the data set (118 cases) was selected for training and the remaining 25 % (39 cases) was used to validate the ANN and GP models. Like all empirical models, soft computing methods are unable to extrapolate beyond the range of their training data. Consequently, to develop the best possible model from the available data, all patterns contained in the data need to be included in the training set. Similarly, since the testing set is used to determine when to stop training, it needs to be representative of the training set and should therefore also contain all of the patterns present in the available data. If all the available patterns are used to calibrate the model, then the most challenging evaluation of the generalization ability of the model requires that the validation data set also include all of the patterns. It is therefore essential that the data used for training and testing represent the same population. To achieve this in the present study, several random combinations of training and testing sets were tried until two statistically consistent data sets were obtained. The statistical parameters considered, namely the mean, standard deviation, minimum, maximum, and range, are presented in Table 2. Despite trying numerous random combinations, some slight differences remain between the statistical parameters of the training and testing sets. These can be attributed to the fact that the data contain singular, rare events that cannot be duplicated in both data sets. As a whole, however, the statistics are in good agreement, and the two data sets may be considered to represent the same population.

Table 2 ANN and GP input and output statistics
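The repeated random splitting described above can be sketched in Python as follows. The acceptance rule used here (training and testing means and standard deviations agreeing within 10 %) is an assumption of this sketch, since the exact consistency criterion is not stated, and the data are synthetic stand-ins for one variable of the database.

```python
import random
import statistics


def split_indices(n, train_fraction, rng):
    """Shuffle case indices and cut them into training and testing sets."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(n * train_fraction)
    return idx[:n_train], idx[n_train:]


def summary(values):
    return {"mean": statistics.mean(values), "std": statistics.stdev(values),
            "min": min(values), "max": max(values)}


def consistent_split(values, train_fraction=0.75, tol=0.10, max_tries=1000, seed=0):
    """Retry random splits until the two subsets have similar statistics."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        train, test = split_indices(len(values), train_fraction, rng)
        s_tr = summary([values[i] for i in train])
        s_te = summary([values[i] for i in test])
        if all(abs(s_tr[k] - s_te[k]) <= tol * abs(s_tr[k]) for k in ("mean", "std")):
            return train, test
    raise RuntimeError("no statistically consistent split found")


# Synthetic pile lengths in the reported range (8.2 to 34.25 m), 157 cases.
data_rng = random.Random(42)
values = [data_rng.uniform(8.2, 34.25) for _ in range(157)]
train, test = consistent_split(values)
print(len(train), len(test))
```

In practice the check would be applied to every input and output variable at once, not to a single column as in this sketch.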

The following statistical parameters were used in order to assess the credibility and accuracy of the proposed models:

The coefficient of determination (R2): while different equations are available for R2, the following equation was employed in this study because of its popularity.

$${\text{R}}^{2} = \left( {1 - \frac{{\mathop \sum \nolimits_{{{\text{i}} = 1}}^{\text{n}} ({\text{M}}_{\text{i}} - {\text{P}}_{\text{i}} )^{2} }}{{\mathop \sum \nolimits_{{{\text{i}} = 1}}^{\text{n}} ({\text{P}}_{\text{i}} )^{2} }}} \right) \times 100\%$$
(7)

Root mean squared error (RMSE):

$${\text{RMSE}} = \sqrt {\frac{{\mathop \sum \nolimits_{{{\text{i}} = 1}}^{\text{n}} \left( {{\text{M}}_{\text{i}} - {\text{P}}_{\text{i}} } \right)^{2} }}{\text{n}}}$$
(8)

where Mi and Pi are the measured and predicted values, respectively, and n is the number of data points.
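Eqs. 7 and 8 translate directly into a few lines of Python. The sketch below uses made-up measured and predicted values purely to exercise the functions; the function names are not from the source.

```python
import math


def r_squared_percent(measured, predicted):
    """Eq. 7: (1 - sum((M_i - P_i)^2) / sum(P_i^2)) * 100 %."""
    num = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    den = sum(p ** 2 for p in predicted)
    return (1.0 - num / den) * 100.0


def rmse(measured, predicted):
    """Eq. 8: square root of the mean squared difference."""
    num = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(num / len(measured))


# Made-up values for illustration only.
measured = [2.0, 4.0, 6.0]
predicted = [2.0, 5.0, 5.0]
r2 = r_squared_percent(measured, predicted)
err = rmse(measured, predicted)
print(r2, err)
```

Note that Eq. 7 as written normalizes by the sum of squared predicted values rather than by the variance of the measured values about their mean, so it will generally differ from the more common definition of R2.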

5 Results of ANN and GP Models

Table 3 shows the weights of the input-hidden and hidden-output layer connections of the ANN model for predicting the uplift displacement. The relative importance of the input parameters can be assessed by partitioning the hidden-output connection weights into components associated with each input variable (Table 4). As can be observed, the pile diameter and length have a significant effect on the uplift displacement. Figure 2 shows that the ANN model, with five input parameters, achieved high accuracy (R2 = 98.51 % and RMSE = 1.96 for the total data).

Table 3 Connection weights of ANN-model
Table 4 Relative importance of ANN input variables
Fig. 2 Estimated versus measured uplift displacement using ANN model
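The weight-partitioning idea behind Table 4 can be sketched in Python. The version below follows one common scheme, often attributed to Garson, in which each hidden-output weight is shared among the inputs in proportion to the absolute input-hidden weights; whether this exact variant was used in the study is an assumption, and the weights shown are made up rather than taken from Table 3.

```python
def relative_importance(w_input_hidden, w_hidden_output):
    """Return the relative importance of each input in percent.

    w_input_hidden[j][i]: weight from input i to hidden neuron j.
    w_hidden_output[j]:   weight from hidden neuron j to the output.
    """
    n_inputs = len(w_input_hidden[0])
    contrib = [0.0] * n_inputs
    for row, v in zip(w_input_hidden, w_hidden_output):
        row_total = sum(abs(w) for w in row)
        for i, w in enumerate(row):
            # Share of this hidden neuron's output weight assigned to input i.
            contrib[i] += abs(w) / row_total * abs(v)
    total = sum(contrib)
    return [100.0 * c / total for c in contrib]


# Made-up weights for a 2-input, 2-hidden-neuron network.
w_ih = [[1.0, 3.0], [2.0, 2.0]]
w_ho = [1.0, 1.0]
importance = relative_importance(w_ih, w_ho)
print(importance)
```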

The uplift load-displacement curve predicted by the ANN for one of the piles is compared with the measured values in Fig. 3.

Fig. 3 The predicted uplift load-displacement by ANN and its comparison with the measured values for one of the piles

In order to study the performance of the network regarding its consistency with the physics of the problem, a parametric study of the changes in uplift pile displacement has been conducted for the qc and fs parameters.

Figures 4 and 5 show the predicted displacement as a function of qc and fs, respectively, when the other parameters are held constant at their mean values. These figures indicate that the uplift displacement decreases with an increase in qc and fs.

Fig. 4 Variation of the predicted uplift displacement against qc, for the proposed ANN model

Fig. 5 Variation of the predicted uplift displacement against fs, for the proposed ANN model

The ANN model presented in this research clearly revealed the relative importance of the various parameters. Despite doing well in the assessments, the ANN model does not have the simplicity and applicability of the GP model.

The advantage of the ANN model was the ability to calculate the relative importance of each input parameter. The relative importance, obtained from the ANN model, reinforces the importance of the length and diameter of the pile when calculating the uplift displacement.

Many runs were executed with various initial settings using the GP method, and the performance of the resulting equations was benchmarked. Selection of the best model was based on statistical criteria, namely R2 and the root mean squared error (RMSE). The following relationship was finally selected as the best GP model for the prediction of uplift pile displacement:

$$\begin{aligned} {\text{S}} &= 89.49{\text{L}} - 0.05129{\text{P}}_{\text{load}} - 179{\text{D}}_{\text{eq}} \\ & + \frac{{6231{\text{D}}_{\text{eq}}^{2} }}{{{\text{q}}_{\text{c}} - 0.1958}} - \frac{44.69}{{{\text{q}}_{\text{c}} }} + \frac{{0.05129{\text{P}}_{\text{load}} + 0.5377{\text{q}}_{\text{c}} }}{{{\text{D}}_{\text{eq}} }} \\ & - \frac{{87.32q_{c} {\text{L}}}}{{{\text{q}}_{\text{c}} - {\text{D}}_{\text{eq}} }} + \frac{{0.06175{\text{P}}_{\text{load}} - 0.04166}}{{{\text{q}}_{\text{c}} \left( {{\text{f}}_{\text{s}} - {\text{D}}_{\text{eq}} } \right)}} - 34.57 \\ \end{aligned}$$
(9)
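For convenience, Eq. 9 can be transcribed term by term into a Python function, as in the sketch below. The function name is hypothetical, the units expected for each input are those of the source database and are not restated here, and the sample call uses arbitrary numbers only to exercise the code; it does not represent a real pile.

```python
def uplift_displacement_gp(L, P_load, D_eq, q_c, f_s):
    """Eq. 9, transcribed term by term (units as in the source database)."""
    denominators = (q_c - 0.1958, q_c, D_eq, q_c - D_eq, q_c * (f_s - D_eq))
    if any(abs(d) < 1e-12 for d in denominators):
        raise ValueError("input lies on a singularity of Eq. 9")
    return (
        89.49 * L
        - 0.05129 * P_load
        - 179.0 * D_eq
        + 6231.0 * D_eq ** 2 / (q_c - 0.1958)
        - 44.69 / q_c
        + (0.05129 * P_load + 0.5377 * q_c) / D_eq
        - 87.32 * q_c * L / (q_c - D_eq)
        + (0.06175 * P_load - 0.04166) / (q_c * (f_s - D_eq))
        - 34.57
    )


# Arbitrary inputs, NOT a real pile case and not in verified units.
s = uplift_displacement_gp(L=10.0, P_load=1000.0, D_eq=0.5, q_c=5.0, f_s=1.0)
print(s)
```

The guard on the denominators reflects the form of Eq. 9, which is undefined where, for example, qc equals Deq or fs equals Deq. Since empirical models of this kind cannot extrapolate, the equation should only be evaluated within the range of the training data.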

The precision of the developed equation is examined by plotting the measured versus predicted values of displacement for all the data, as shown in Fig. 6.

Fig. 6 Estimated displacement by GP model versus measured uplift displacement

In this study, the statistical parameters, R2 and RMSE are 83 % and 9.7 mm, respectively, for all data.

6 Conclusion

In this paper, neural network and genetic programming methods were used to predict the uplift pile displacement. A database of 157 cases from 5 sites located in Japan, USA, and Taiwan was used to develop the new models.

While the ANN model gave more accurate results than the GP model, the GP model is considered more applicable in practice because it is simple to use and yields an explicit equation. The ANN model was presented in this research to evaluate the effect of the various parameters on the uplift pile displacement and to provide an accurate model for estimating it. Despite its strong performance in the assessments and its predictive accuracy, the ANN model does not have the simplicity and applicability of the GP model.

As a result of the ANN analysis, the five parameters Pload, qc, fs, Deq and L were selected as input parameters to develop the GP-based model for predicting displacement. The proposed GP model showed reasonably good performance for all the data sets (R2 = 83 % and RMSE = 9.7 mm).

The excellent result of the ANN-2011 model (R2 and RMSE) indicates the effectiveness of input parameters in calculation of the uplift pile displacement. The GP model is very simple to use, understandable by engineers and applicable in practice.

The advantages of this equation over previous research for estimating uplift pile displacement are:

  • It estimates the displacement of tension piles based on real cone penetration test results.

  • Most previous research was based on compressive load-settlement equations, while the model presented in this study is based only on tension-displacement tests.

  • It provides a practical and user-friendly equation with a good degree of accuracy.