1 Introduction

Foundations are generally categorized into deep and shallow (spread) foundations. The latter are recommended on economic grounds when the subsurface conditions are good enough (Gibbens and Briaud, 1995). In either case, the bearing capacity and allowable settlement of foundations are of great concern to geotechnical engineers (Momeni et al., 2013). Bearing capacity is often defined as the maximum load that the soil can sustain before failure. A number of researchers (Terzaghi, 1943; Meyerhof, 1963; Vesic, 1973) formulated bearing capacity theory, which is well established and, for brevity, is not repeated here.

In recent years, several studies have highlighted the use of skirted shallow foundations, in other words thin-walled spread foundations. In this regard, Al-Aghbari and Mohamedzein (2004) as well as Eid et al. (2009) mentioned that providing skirts (thin walls) for a spread foundation forms an enclosure in which the soil is confined. They hypothesized that such walls transfer the foundation loads to the laterally confined soil and then to the deeper sand layers, which are more confined than the shallow layers owing to the higher overburden pressure. Providing thin walls for foundations can increase the total depth of failure and change the failure pattern of the soil, which may mobilize more shear strength.

Al-Aghbari and Mohamedzein (2004) reported that bearing capacity increases by a factor of 1.5 to 3.9 when skirted foundations are used instead of simple (surface) foundations. In another study, Al-Aghbari and Dutta (2008) reported an increase in bearing capacity of 11.2% to 70% due to the incorporation of structural skirts.

Eid (2013) mentioned that incorporation of skirts leads to significantly higher bearing capacity. According to his conclusion, depending on thin-wall length to footing width ratio (Lw/B), thin-walled foundations exhibited 1.4 to 3 times higher bearing capacity in comparison with simple foundations.

Nazir et al. (2013) introduced a specific thin-walled spread foundation (Fig. 1) suitable for industrialized building systems. Their numerical investigations showed the workability of this foundation. Momeni et al. (2015b) reported that a thin-walled spread foundation can almost double the bearing capacity of a surface foundation in both loose and dense sands.

Fig. 1 Thin-walled spread foundation: (a) isometric view; (b) bracing system; (c) bottom view; (d) cross sectional view. Reprinted from (Momeni et al., 2015b), Copyright 2015, with permission from ICE Publishing

The well-known bearing capacity equations were proposed for conventional spread foundations rather than thin-walled spread foundations, which encouraged the authors to develop a predictive model of bearing capacity for such footings. As highlighted in the next section, the use of an artificial neural network (ANN) in foundation engineering problems, e.g., bearing capacity, is recommended in many studies (Shahin, 2015).

However, the majority of the predictive models of bearing capacity are built using a conventional ANN which suffers from getting trapped in local minima and a slow rate of learning. In this regard, several studies reported the use of optimization algorithms, such as the genetic algorithm (GA) and particle swarm optimization (PSO) algorithm, for enhancing the ANN performance (Section 3.4).

In this study, an attempt was made to develop an improved ANN-based predictive model of bearing capacity for thin-walled spread foundations. For this purpose, four small scale footing load tests were conducted in the laboratory. The laboratory test results, together with a relatively large number of recorded footing load tests compiled from the literature, formed the dataset required for developing the predictive model proposed in this study.

It is worth mentioning that, as far as the authors are aware, there is no comprehensive and well-established ANN-based predictive model for this kind of foundation (thin-walled spread foundations). Thus, the study presented here differs from previously proposed predictive models of bearing capacity. In addition, the work presented here takes advantage of a relatively large dataset, i.e., 149 recorded cases, which significantly reduces the likelihood of model overfitting (Zorlu et al., 2008).

2 Related predictive models

Successful application of a conventional ANN in geotechnical engineering is addressed in many studies (Shahin et al., 2001; Jahed Armaghani et al., 2014; Tonnizam Mohamad et al., 2014). As tabulated in Table 1, numerous researchers in foundation engineering have shown the workability of ANNs for predicting either the settlement or the bearing capacity of foundations. Table 1 also gives the dataset size, the coefficient of determination values, and the major input parameters of the recently proposed predictive models. According to Table 1, footing geometry and soil properties are the influential parameters in ANN-based predictive models.

Table 1 Related works on the application of ANN in foundation engineering

3 Methods

3.1 Artificial neural network

The application of ANNs as a promising tool for solving geotechnical and rock engineering problems has drawn considerable attention (Meulenkamp and Alvarez Grima, 1999). In essence, ANNs are of interest as function approximation tools when the nature of the relationship between the influential input parameters and the model output(s) is unknown. In other words, when the underlying problem is difficult to model explicitly or a closed-form solution is difficult to find, the use of ANNs is advantageous (Garrett, 1994). A typical ANN consists of a number of interconnected layers (input, hidden, and output layers). Each layer comprises one or more processing units known as neurons or nodes. The nodes of different layers are connected to each other through adjustable connection weights. As reported by Hornik et al. (1989), one hidden layer is usually sufficient for approximating any continuous function.

The use of more hidden layers can increase the model complexity and the likelihood of model overfitting, which should be avoided in designing ANNs. Before interpreting new information, the network needs to be trained.

The back-propagation (BP) algorithm is the most commonly implemented training algorithm in ANNs (Dreyfus, 2005). In essence, the role of the BP algorithm is to optimize the connection weights so as to reach a desirable mean square error (MSE). The MSE is the mean of the squared differences between the predicted outputs and the target outputs. The best outputs are often predicted after several feedforward-backward passes. In the forward pass, the data presented to the input layer (denoted by Ii in Fig. 2, i=1, 2, …) are fed forward. In this step, each input node transmits signals to the hidden nodes. In other words, each node in the hidden layer (Ni in Fig. 2) receives the sum of the weighted input signals (input values multiplied by initially random connection weights, \(\sum {{W_{ij}}}\) in Fig. 2) as well as a threshold value known as the bias (Bi in Fig. 2). The output of each hidden neuron is subsequently obtained by applying a transfer function (usually a sigmoid function) to the net input of the neuron (Fig. 2). The same procedure is repeated for the next layers until the output (O in Fig. 2) is predicted. If the predicted outputs differ unacceptably from the known actual outputs, the error is propagated back through the network and the connection weights are updated. This process is known as the backward pass. These passes are repeated until the desired outputs are predicted. Readers are referred to fundamental artificial intelligence books for more information about ANNs (Fausett, 1994).

Fig. 2 Typical architecture of ANN. Reprinted from (Momeni et al., 2014), Copyright 2014, with permission from Elsevier
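The forward pass described above can be illustrated with a minimal sketch. It is not the authors' implementation: the linear output node, the random initial weights in (-1, 1), and all function and variable names are assumptions made for illustration, and the backward pass is omitted. The four inputs and seven hidden neurons mirror the sizes used later in this paper.

```python
import math
import random


def sigmoid(x):
    """Sigmoid transfer function applied to the net input of a neuron."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Feedforward pass: inputs -> one hidden layer -> single output.

    Each hidden neuron receives the weighted sum of the inputs plus its bias;
    the output node is assumed to be linear (an assumption of this sketch).
    """
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def mse(predictions, targets):
    """Mean of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


random.seed(0)
n_inputs, n_hidden = 4, 7

# Random initial connection weights and biases (hypothetical values).
w_hidden = [[random.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
b_hidden = [random.uniform(-1, 1) for _ in range(n_hidden)]
w_out = [random.uniform(-1, 1) for _ in range(n_hidden)]
b_out = random.uniform(-1, 1)

# Hypothetical normalized samples and targets, for demonstration only.
samples = [[random.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(5)]
targets = [random.uniform(-1, 1) for _ in range(5)]

predictions = [forward(s, w_hidden, b_hidden, w_out, b_out) for s in samples]
print("MSE with untrained weights:", mse(predictions, targets))
```

Training then amounts to adjusting `w_hidden`, `b_hidden`, `w_out`, and `b_out` so that the MSE decreases, whether by back-propagation or by another optimizer.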

3.2 Particle swarm optimization

Particle swarm optimization (PSO) is an evolutionary computation algorithm which implements a nonlinear procedure for solving a continuous global optimization problem. PSO was proposed by Kennedy and Eberhart (1995). In PSO, particles are candidate solutions of a problem. At first, a number of randomly generated particles form a population. Each particle is given a random position and velocity. Subsequently, an iterative procedure is implemented to find the best solution (often the global minimum). The particle positions, at this stage, are adjusted based on the experience of the particle as well as that of the swarm. To be more specific, each particle keeps track of its own best position as well as the global best position of the other particles. In PSO terminology, the best position which a particle has experienced and the global best position achieved by the other particles are referred to as pbest and gbest, respectively.

Each particle is then propelled towards its pbest and gbest. For this purpose, a new velocity term is determined for each particle based on its distance from its pbest and gbest. In the next step, these pbest and gbest velocity terms are randomly weighted to generate a new velocity value for the particle. As a consequence, the position of the particle changes in the next iteration (Eberhart and Shi, 2001). The relatively simple PSO velocity update and movement equations are given in the following lines. They are used for adjusting the velocity vector and determining the actual movement of a particle.

$$V_{\rm new} = V + r_1 c_1 \times (p_{\rm best} - p) + r_2 c_2 \times (g_{\rm best} - p), \tag{1}$$
$$p_{\rm new} = p + V_{\rm new}, \tag{2}$$

where c1 and c2 are pre-defined coefficients, r1 and r2 are random values in the range (0, 1) sampled from a uniform distribution, Vnew and V are the new and current velocity vectors, pnew and p are the new and current positions of particles, respectively.
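As an illustration of Eqs. (1) and (2), the following minimal sketch minimizes a simple test function. It is an assumption-laden example rather than the code used in this study: the velocity clamp `v_max`, the initialization range, the test function `sphere`, and all names are hypothetical. The clamp is added because Eq. (1) without any damping can let velocities grow without bound.

```python
import random


def pso_minimize(cost, dim, swarm_size=30, iterations=100,
                 c1=2.0, c2=2.0, v_max=1.0, seed=0):
    """Minimize `cost` over `dim` dimensions with the basic PSO update."""
    rng = random.Random(seed)
    # Each particle is given a random position and velocity.
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(swarm_size)]
    vel = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(swarm_size), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(iterations):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Eq. (1): randomly weighted pull towards pbest and gbest.
                v = (vel[i][d]
                     + r1 * c1 * (pbest[i][d] - pos[i][d])
                     + r2 * c2 * (gbest[d] - pos[i][d]))
                # Velocity clamp: an assumption of this sketch, not in Eq. (1).
                vel[i][d] = max(-v_max, min(v_max, v))
                # Eq. (2): move the particle.
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


def sphere(p):
    """Hypothetical test function with its global minimum at the origin."""
    return sum(x * x for x in p)


best, best_cost = pso_minimize(sphere, dim=3)
print("best position:", best, "cost:", best_cost)
```

Because gbest is only replaced when a lower cost is found, the returned cost can never be worse than that of the best particle in the initial swarm.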

3.3 Genetic algorithm

The genetic algorithm (GA), which was first introduced by Holland (1975), is one of the most popular evolutionary algorithms. Similar to PSO, in GA there are some candidate solutions that mature over time to reach an optimal solution. In GA terminology, the candidate solutions, which form an initial population, are called chromosomes. Chromosomes are in the form of a linear string comprising 0 and 1. Similar to other optimization algorithms, GA starts with defining the optimization parameters and cost function (usually MSE), and ends when the stopping criteria are met. Each iteration in GA is termed a generation. To produce the next generation, three genetic operators are applied: reproduction, crossover, and mutation.

As stated by Momeni et al. (2014), reproduction is a step in which the best chromosomes are selected according to their scaled values and the given fitness criteria, and are then passed to the next generation. Crossover, on the other hand, generates offspring (also called new individuals) by combining certain parts of the parents. In essence, as mentioned by Jadav and Panchal (2012), in the crossover process, two parents as well as an arbitrary crossover point are selected. Subsequently, the first offspring is generated by merging the left side genes of the first parent with the right side genes of the second parent. The second offspring is created by the inverse procedure. The last GA operator is mutation. This operator describes a sudden change which might appear in the allele of a chromosome. The small arbitrary changes applied to the elements of a chromosome help GA to search a broader space. A typical GA is described in the following lines:

  1. Forming a group of candidate solutions (initial population);

  2. Finding the cost of each chromosome:

    (1) Preferentially transferring the chromosomes with lower costs to the next generation;

    (2) Applying crossover and mutation functions for creating new chromosomes;

  3. Checking convergence (if the stopping criterion is not met, repeat the second step);

  4. Introducing the fittest chromosome as the solution.
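The steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the GA configuration of this study: the binary decoding to a real value in [-1, 1], the elitist selection scheme, the fixed number of generations used as the stopping criterion, the toy cost function, and all names and parameter values are hypothetical.

```python
import random


def decode(bits, lo=-1.0, hi=1.0):
    """Map a binary chromosome (list of 0/1) to a real value in [lo, hi]."""
    as_int = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * as_int / (2 ** len(bits) - 1)


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: swap the gene tails of two parents."""
    point = rng.randrange(1, len(parent_a))
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])


def mutate(bits, rate, rng):
    """Flip each gene independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def ga_minimize(cost, n_bits=16, pop_size=40, generations=60,
                n_elite=4, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # Step 1: initial population of random chromosomes.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    # Step 3: the stopping criterion here is a fixed number of generations.
    for _ in range(generations):
        # Step 2: rank the chromosomes by cost.
        population.sort(key=cost)
        # Step 2(1): the lowest-cost chromosomes pass on unchanged.
        next_gen = [c[:] for c in population[:n_elite]]
        parents = population[: pop_size // 2]
        # Step 2(2): fill the rest with crossover and mutation.
        while len(next_gen) < pop_size:
            a, b = rng.sample(parents, 2)
            for child in crossover(a, b, rng):
                next_gen.append(mutate(child, mutation_rate, rng))
        population = next_gen[:pop_size]
    # Step 4: the fittest chromosome is the solution.
    return min(population, key=cost)


def cost(bits):
    """Hypothetical cost: squared distance of the decoded value from 0.3."""
    return (decode(bits) - 0.3) ** 2


best = ga_minimize(cost)
print("decoded solution:", decode(best), "cost:", cost(best))
```

Copying the elite chromosomes unchanged guarantees that the best cost never worsens from one generation to the next.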

3.4 Hybrid artificial neural network

As mentioned above, the main reasons for using optimization algorithms are the tendency of ANNs to get trapped in local minima and their slow rate of learning (Lee et al., 1991). Recent literature has suggested several optimization algorithms (e.g., PSO and GA) for enhancing ANN performance. For example, Rashidian and Hassanlourad (2013) used a GA to enhance ANN performance and mitigate its drawbacks. Similar conclusions were highlighted in other recent studies (Majdi and Beiki, 2010; Jadav and Panchal, 2012; Liu et al., 2012). The beneficial effects of PSO were covered by Momeni et al. (2015c). PSO, as a global search algorithm, can be implemented to improve ANN performance by adjusting the weights and biases of the network (Shi and Eberhart, 1999; Mendes et al., 2002). In fact, PSO and GA, as global search algorithms, help the ANN, a local search algorithm, to avoid getting trapped in local minima and to find the global minimum.

It was discussed earlier that finding the minimum MSE is the main aim in ANNs. In the improved ANNs, the ANN connection weights are trained with the optimization algorithms rather than with a conventional back-propagation learning algorithm. The reason is to increase the chance of finding global minima (the least MSE).
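A minimal sketch of this idea is given below: every particle position is a flat vector holding all connection weights and biases of a one-hidden-layer network, and the cost of a particle is the MSE of the corresponding network on the training data. This is not the MATLAB code of this study. The parameter layout, linear output node, velocity clamp, small swarm, and synthetic training data are all assumptions chosen to keep the example short.

```python
import math
import random

N_IN, N_HID = 4, 7
# Layout of the flat vector: hidden weights, hidden biases,
# output weights, output bias (a hypothetical convention).
N_PARAMS = N_HID * N_IN + N_HID + N_HID + 1


def predict(params, x):
    """Feedforward pass of the network encoded by `params`."""
    out = params[-1]  # output bias
    for j in range(N_HID):
        net = sum(params[j * N_IN + i] * x[i] for i in range(N_IN))
        net += params[N_HID * N_IN + j]
        hidden = 0.5 * (1.0 + math.tanh(0.5 * net))  # overflow-safe sigmoid
        out += params[N_HID * N_IN + N_HID + j] * hidden
    return out


def mse(params, data):
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)


def train_pso(data, swarm_size=30, iterations=60,
              c1=2.0, c2=2.0, v_max=0.5, seed=0):
    """Search the weight space with PSO instead of back-propagation."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(N_PARAMS)]
           for _ in range(swarm_size)]
    vel = [[rng.uniform(-v_max, v_max) for _ in range(N_PARAMS)]
           for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_cost = [mse(p, data) for p in pos]
    g = min(range(swarm_size), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]
    for _ in range(iterations):
        for i in range(swarm_size):
            for d in range(N_PARAMS):
                v = (vel[i][d]
                     + rng.random() * c1 * (pbest[i][d] - pos[i][d])
                     + rng.random() * c2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))  # clamp (assumption)
                pos[i][d] += vel[i][d]
            c = mse(pos[i], data)
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
        history.append(gbest_cost)
    return gbest, history


# Synthetic training data (hypothetical, not the footing dataset).
data_rng = random.Random(1)
data = []
for _ in range(40):
    x = [data_rng.uniform(-1, 1) for _ in range(N_IN)]
    data.append((x, 0.5 * x[0] - 0.3 * x[1] * x[2] + 0.2 * x[3]))

weights, history = train_pso(data)
print("MSE before and after training:", history[0], history[-1])
```

A GA-based variant follows the same pattern, with chromosomes in place of particle positions and the same MSE as the cost function.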

4 Model dataset

To obtain a dataset of sufficient size for developing the predictive model of bearing capacity, an extensive literature review yielded a database comprising 145 recorded cases of thin-walled footing load tests (Villalobos, 2007; Al-Aghbari and Mohamedzein, 2004; Eid et al., 2009; Tripathy, 2013; Wakil, 2013; Momeni et al., 2015b). The results of four laboratory tests (to be discussed later) were also added to the dataset to enhance the diversity of the data. Table 2 summarizes the minimum, maximum, and average values of the dataset used in this study. The bearing capacity of thin-walled spread foundations in cohesionless soils is related to footing width B, soil internal friction angle φ, soil unit weight γ, and Lw/B. Therefore, these parameters were used as the input parameters of the predictive models. The bearing capacity (Qu) of the thin-walled spread foundations in cohesionless soils was set as the model output.

Table 2 Summarized dataset

5 Model development procedure

In the first step, a parametric study was conducted to determine suitable PSO-based ANN parameters, comprising swarm size, network architecture, and the number of iterations. For this purpose, a MATLAB code was prepared. Based on an educated guess, an ANN model with one hidden layer and seven hidden neurons was set as the initial model. Random sampling was used for dividing the dataset into two subsets: training and testing. The use of this procedure was suggested in several studies (Alvarez Grima and Babuška, 1999; Singh et al., 2001; Rabbani et al., 2012; Tonnizam Mohamad et al., 2014). Here, 80% of the data were used for network training and the other 20% were used for testing the prediction power of the model. The first step in the parametric study was to determine the number of iterations as well as the optimum swarm size. The former is often used as the termination criterion, i.e., the condition that, upon being met, ends the iterative procedure. Therefore, an effort was made to investigate the effect of the number of iterations on the network performance.

A model with a default of 800 iterations, a swarm size of 150 particles, and acceleration constants (c1 and c2) equal to 2 was built. Fig. 3 suggests there is no remarkable change in the MSE after 450 iterations. Hence, the maximum number of iterations was set to 450.

Fig. 3 Effect of iteration numbers on model performance

The best swarm size also needs to be determined. It is worth mentioning that enlarging the swarm size (number of particles) increases the model complexity as well as the training time, while small swarm size may negatively affect the performance of the model. Hence, a sensitivity analysis was performed to find the optimum number of particles. For this purpose, the effect of a wide range of swarm sizes (70–800 from small to large) on model performances was studied, and the MSE was determined in each analysis.

Other PSO parameters were not changed during this step. That is, to find the best PSO structure, the number of iterations was set to be 450 and acceleration constants (c1 and c2) equal to 2 were used as suggested by Shi and Eberhart (1998) and Mendes et al. (2002). Fig. 4 shows the values of MSE obtained for different swarm sizes.

Fig. 4 Effect of swarm size on model performance

As displayed in Fig. 4, the model built with 600 particles performed best with an MSE of 0.009. Therefore, the number of particles was set to 600. It should be mentioned that, before modelling, the data were normalized to values between -1 and 1.
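The data preparation described in this section (scaling each variable and randomly splitting the records into 80% training and 20% testing subsets) can be sketched as follows. This is only an illustration, not the authors' MATLAB code: it assumes min-max scaling to the range [-1, 1], and the function names, the seed, and the example footing widths are hypothetical.

```python
import random


def normalize(values, lo=-1.0, hi=1.0):
    """Min-max scale a list of values to the range [lo, hi]."""
    v_min, v_max = min(values), max(values)
    return [lo + (hi - lo) * (v - v_min) / (v_max - v_min) for v in values]


def split_dataset(records, train_fraction=0.8, seed=0):
    """Randomly split records into training and testing subsets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical footing widths B in mm, for demonstration only.
widths = [50.0, 80.0, 120.0, 200.0]
print(normalize(widths))

# 149 record indices stand in for the 149 cases of the dataset.
train, test = split_dataset(list(range(149)))
print(len(train), len(test))  # prints "119 30"
```

In practice the scaling limits would be taken from the training subset and reused for the testing subset, so that the test data do not influence the normalization.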

After determining the optimum PSO parameters, the network architecture was designed. To obtain the proper network design, four hybrid models were run with different numbers of hidden neurons in one hidden layer. The performance of each model was evaluated using the R values of the testing dataset.

Specifically, 4, 5, 6, and 7 neurons were trialed in the hidden layer. Table 3 shows the results of the analyses. The results in Table 3 indicate that the model built with seven neurons in the hidden layer (the fourth model) outperforms the other models. The correlation coefficient and MSE, equal to 0.98 and 0.005, respectively, for the testing dataset suggest the superiority of this model. Therefore, the architecture of this model was considered to be optimum for the predictive model of bearing capacity.

Table 3 Prediction performance of different PSO-based ANN models

In developing the GA-based ANN predictive model, a procedure similar to that of the PSO-based ANN was used to determine the GA parameters; in the interests of brevity, it is not repeated here. The parametric study showed that the best GA-based ANN for the problem of interest is obtained with the GA parameters presented in Table 4. After finding the suitable GA parameters, the optimum number of hidden neurons in the GA-based ANN model was obtained in the same way as for the PSO-based model. The results are tabulated in Table 5, where the performances of the various models are evaluated using the MSE values and correlation coefficients. As displayed in Table 5, the models consisted of 4, 5, 6, and 7 hidden nodes in one hidden layer. According to Table 5, the last model, i.e., Model 4, performs best, with a correlation coefficient and MSE of 0.87 and 0.010, respectively, for the testing dataset. The results of this predictive model will be discussed in Section 7.

Table 4 Optimum GA parameters
Table 5 Prediction performance of various GA-based ANN models

For comparison, the prediction performances of the hybrid models were compared with those of a conventional ANN. For this purpose, the bearing capacity of the thin-walled spread foundations was predicted using an ANN model constructed with seven hidden neurons in one hidden layer. The Levenberg-Marquardt (LM) learning algorithm was used for training the network. More details on this learning algorithm and its training efficiency are given by Hagan et al. (1996). As for the improved ANN models, random sampling was used in the conventional ANN model (80% of the data for training the model and the rest for testing).

6 Laboratory test procedure

This section describes the procedure followed in conducting the footing load tests. In this study, the load carrying capacity of a specific thin-walled foundation proposed by Nazir et al. (2013) (Fig. 1) is investigated. The footing is referred to as an industrialized building systems (IBS) footing because it was developed for use in IBS. Two IBS footings, as shown in Fig. 5, were loaded in Johor Bahru sand (in loose and dense states). The sand particles ranged from 0.063 mm to 1.18 mm. The mean grain size (D50), effective grain size (D10), and coefficient of uniformity Cu were 0.5 mm, 0.142 mm, and 3.52, respectively. The unit weight and internal friction angle were 14.26 kN/m³ and 29.23° for the loose sand, and 15.54 kN/m³ and 36.24° for the dense sand. Several studies reported that the scale effect is minimized if the ratio of footing width to soil mean grain size exceeds 100 (Habib, 1974; Taylor, 1995; Kalinli et al., 2011). Therefore, knowing the D50 of the sand, small scale IBS footings of width 80 mm (18.75 times smaller than the proposed full scale footing in Fig. 1) were prepared. Fig. 5 also shows the dimensions of the footings used in this study. As shown in Fig. 5, apart from Lw/B, all dimensions are the same for both footings.

Fig. 5 Thin-walled shallow foundations used in this study: (a) IBS footing with shorter wall; (b) IBS footing with longer wall
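The scale-effect check behind the choice of footing width is simple arithmetic, shown here as a short sketch. The variable names are hypothetical, and the full scale width is only implied by the reported scale factor rather than stated in the text.

```python
# Values reported in the text.
B_MODEL_MM = 80.0      # width of the small scale IBS footing
D50_MM = 0.5           # mean grain size of the sand
SCALE_FACTOR = 18.75   # model is 18.75 times smaller than full scale

# Ratio of footing width to mean grain size; the cited studies suggest
# the scale effect is minimized when this ratio exceeds 100.
width_to_grain_ratio = B_MODEL_MM / D50_MM
print("B / D50 =", width_to_grain_ratio)

# Full scale width implied by the scale factor (not stated explicitly).
implied_full_scale_mm = B_MODEL_MM * SCALE_FACTOR
print("implied full scale width (mm) =", implied_full_scale_mm)
```

With B/D50 = 160, the model footings satisfy the criterion with some margin.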

The footings were given a rough surface by gluing sand to their base and sides. All tests were performed in a test box with length, width, and height of 920, 620, and 620 mm, respectively (Fig. 6). The box was large enough to minimize boundary effects, and it was made of Plexiglas to provide a better view of the soil deformation. To reconstruct the sand at the desired relative densities, i.e., 30% (loose) and 75% (dense), a new mobile sand pluviator system (Fig. 6), invented by Khari et al. (2014), was used. The use of this technique for achieving the desired relative density has been highlighted by many researchers (Madabhushi et al., 2006). The footings were placed at the center of the box and were pushed into the sand. Eid et al. (2009) mentioned that this procedure does not lead to more than 4% overestimation of bearing capacity. The testing tank was then placed over a frame made of 6-inch U-type steel profile (1 inch=2.54 cm).

Fig. 6 Mobile sand pluviator and test tank

The load was applied slowly to the model footings by means of a pneumatic loading shaft in a continuous operation. The load was measured with a 20-kN load cell, with an accuracy of ±0.01%, resting between the footing and the load frame. The footing settlement was measured based on the readings of two linear variable displacement transducers (LVDTs) installed on a reference beam (Fig. 7). The accuracy of the LVDTs was ±0.01% of full range (100 mm).

Fig. 7 Laboratory footing load test

The load was increased once the rate of settlement fell below 0.003 mm/min over three consecutive minutes. The use of this procedure was reported in several studies (Adams and Collin, 1997; Briaud and Gibbens, 1999; Chen et al., 2007). The footings were loaded in relatively loose and dense sands until the settlement reached almost 25 mm.
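The load increment criterion can be expressed as a small check on the settlement readings. The sketch below assumes one LVDT reading per minute and interprets the criterion as every one of the last three one-minute rates being below the limit; the function name and both of these choices are assumptions, not details given in the text.

```python
def ready_for_next_increment(settlements_mm, rate_limit=0.003, window_min=3):
    """Return True if the settlement rate stayed below `rate_limit` (mm/min)
    over the last `window_min` consecutive minutes.

    `settlements_mm` holds readings taken at 1-minute intervals (assumption).
    """
    if len(settlements_mm) < window_min + 1:
        return False  # not enough readings to cover the window
    recent = settlements_mm[-(window_min + 1):]
    rates = [abs(b - a) for a, b in zip(recent, recent[1:])]  # mm per minute
    return all(rate < rate_limit for rate in rates)


# Hypothetical readings (mm) under one load step.
print(ready_for_next_increment([1.000, 1.001, 1.002, 1.003]))
print(ready_for_next_increment([1.000, 1.010, 1.011, 1.012]))
```

The first series has settled at about 0.001 mm/min and would allow the next increment; the second still includes a 0.010 mm jump inside the window and would not.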

7 Results and discussion

The superimposed load-displacement curves of the footing load tests in loose sand are shown in Fig. 8. Overall, the results are as expected: the longer the wall, the higher the bearing capacity. Fig. 8 suggests that the load carrying capacity of the IBS footing with shorter walls is 560 N, while that of the IBS footing with longer walls is almost 830 N.

Fig. 8 Load-displacement curve of thin-walled footings in loose sand

It is interesting to note that the results suggest that defining failure at a settlement beyond 10% of the footing width leads to overestimation of the bearing capacity. In fact, Fig. 8 suggests that soil failure is captured when the settlement reaches almost 8%B. Although the figure is self-explanatory, it should be highlighted that the recently proposed L1-L2 method (Akbas and Kulhawy, 2009) was used for interpreting the failure load. In this method, the initial point (L2) of the last part of the load-displacement curve is taken as the axial bearing capacity of the footing. As displayed in Fig. 8, the effect of the thin walls on the transition part of the load-displacement curves is obvious. The initial parts of the curves differ to some extent, which may generally be attributed to uncertainties in conducting the tests. For example, reconstructing the sand at exactly the same density using the sand raining technique is almost impossible, and the sand is always expected to be slightly denser or looser than intended. This effect is more pronounced in loose sand than in dense sand because the rate of sand raining is much higher when constructing the loose sand. Therefore, it is more difficult to achieve exactly the same relative density when the loose sand has to be reconstructed several times.

Similarly, the load-displacement curves of the thin-walled footings in dense sand are presented in Fig. 9. According to Fig. 9, the bearing capacity of the IBS footing with shorter walls (Lw/B=0.5) is 1990 N. For the IBS footing with longer walls (Lw/B=1.12), the load carrying capacity was found to be 2985 N.

Fig. 9 Load-displacement curve of thin-walled footings in dense sand

Overall, Fig. 9 suggests general shear failure of the sand, which is expected and may be attributed to the dense state of the soil. Additionally, no specific bulging was observed for the IBS footing with either the longer or the shorter wall (Fig. 10 as an example), which might be due to the incorporation of the thin walls. However, as stated by Wakil (2013), bearing capacity failure of such footings does not necessarily fall into the three categories proposed by Vesic (1973), i.e., general, local, and punching shear failure, mainly because these categories were suggested for unskirted footings.

Fig. 10 IBS footing in dense sand after soil failure

In thin-walled foundations, the soil located between the walls is confined; hence, the footing and the confined soil inside the walls act as one integrated system. Consequently, as the length of the walls increases, the embedment depth of the foundation effectively increases. Therefore, the bearing capacity of thin-walled foundations is enhanced as the wall length increases. Regarding the minor uncertainties in conducting the IBS footing load tests, a conclusion similar to that for loose sand can be drawn for dense sand, which in the interests of brevity is not repeated here. Overall, it was found that when Lw/B of the footings in both loose and dense sands is increased from 0.5 to 1.12, the bearing capacity increases by almost 0.5 times, i.e., by almost 50%.

This is in reasonable agreement with the findings of Wakil (2013), who reported an increase of almost 0.3 times (about 30%) in the bearing capacity of thin-walled foundations in dense sands when Lw/B is increased from 0.5 to 1.
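As a quick check on the improvement reported for the present tests, the ratios of the failure loads given above for Lw/B=1.12 and Lw/B=0.5 are

$$\frac{830}{560} \approx 1.48 \quad (\text{loose sand}), \qquad \frac{2985}{1990} = 1.50 \quad (\text{dense sand}),$$

which corresponds to an increase of roughly 48% and 50%, respectively.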

The results of different predictive models of bearing capacity described in the previous sections are also highlighted in this section. Fig. 11 shows the normalized predicted Qu using PSO-based ANN, GA-based ANN as well as conventional ANN models versus the normalized measured bearing capacities of thin-walled spread foundations for the training dataset.

Fig. 11 Prediction performance of PSO-based ANN (a), GA-based ANN (b), and conventional ANN models (c) (training dataset)

The correlation coefficient of PSO-based ANN model (R=0.91) suggests that the proposed PSO-based ANN model works better in predicting Qu compared with GA-based ANN and conventional ANN. The R values of GA-based ANN and conventional ANN are only 0.71 and 0.84, respectively.

The prediction performance of the proposed models for the testing dataset is shown in Fig. 12. The correlation coefficient R is equal to 0.98 in Fig. 12a, which also suggests the good reliability and relative superiority of the proposed PSO-based ANN model in predicting Qu. The correlation coefficients of the GA-based ANN and conventional ANN for the testing data are only 0.87 and 0.64, respectively. This is in good agreement with Nazir et al. (2014) as well as Marto et al. (2014), who suggested the feasibility of the PSO-based ANN model in foundation engineering problems. Overall, the PSO-based ANN model performs best. Therefore, this study suggests the use of the PSO-based ANN model in predicting the bearing capacity of thin-walled foundations.

Fig. 12 Prediction performance of PSO-based ANN (a), GA-based ANN (b), and conventional ANN models (c) (testing dataset)
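The two performance measures used throughout this section can be computed as in the sketch below. It assumes that R denotes the Pearson correlation coefficient between measured and predicted values, which is not stated explicitly in the text; the function names and the sample values are hypothetical and are not results of this study.

```python
import math


def correlation_coefficient(measured, predicted):
    """Pearson correlation coefficient R between two equal-length series."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


def mean_square_error(measured, predicted):
    """Mean of the squared differences between measured and predicted values."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)


# Hypothetical normalized bearing capacities, for demonstration only.
measured = [0.10, 0.40, 0.50, 0.90]
predicted = [0.15, 0.35, 0.55, 0.80]
print("R   =", correlation_coefficient(measured, predicted))
print("MSE =", mean_square_error(measured, predicted))
```

An R close to 1 together with a small MSE indicates that the predicted values follow the measured ones closely; R alone does not reveal a systematic offset, which is why both measures are reported.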

8 Conclusions

The close agreement between the measured bearing capacities and those predicted using the PSO-based ANN model revealed the applicability of hybrid ANNs as a feasible, practical, and quick tool for predicting the bearing capacity of thin-walled spread foundations in cohesionless soils. The correlation coefficient R and MSE, equal to 0.98 and 0.005, respectively, for the testing dataset indicated the superiority of the proposed predictive model, which was built with seven hidden nodes in one hidden layer. The correlation coefficients for GA-based ANN and conventional ANN were 0.80 and 0.65, respectively. The dataset comprised 145 recorded cases of thin-walled footing load tests in sandy soils compiled from the literature as well as the four laboratory tests conducted in this study. In particular, the laboratory tests confirmed the beneficial effect of wall length on the bearing capacity. It was found that the bearing capacity improves by almost 0.5 times (about 50%) when Lw/B is increased from 0.5 to 1.12.