Base resistance of super-large and long piles in soft soil: performance of artificial neural network model and field implications

This study examines the performance of an artificial neural network (ANN) model based on 1137 datasets of super-large (1.0–2.5 m in equivalent diameter) and long (40.2–99 m) piles collected from 37 real projects over the past 10 years in the Mekong Delta. Five key input parameters, namely the load, the displacement, the Standard Penetration Test value of the base soil, the distance between the loading point and the pile toe, and the axial stiffness, are identified by assessing the results of field load tests. The key innovations of this study are (i) the use of a large database to evaluate the effect that random selection of training and testing datasets can have on the predicted outcomes of ANN modelling, (ii) a simple approach using multiple learning rates to enhance the training process, (iii) clarification of the role that the selected input factors play in the base resistance, and (iv) new empirical relationships between the pile load and settlement. The results show that the random selection of training and testing datasets can significantly affect the predicted results; for example, the confidence of prediction can drop below 80% when an average $R^2$ > 0.85 is required. The analysis indicates the predominant role of the displacement in governing the base resistance of piles, which has significant implications for practical design.


Introduction
Deep pile foundations have become one of the most preferred options for high-rise buildings and transport infrastructure (e.g., bridges and heavy freight rail tracks) around the world as they can carry massive loads in a relatively limited area. While it is well understood that the axial bearing capacity of a pile is mainly contributed by its shaft friction and base (or toe) resistance, immense effort has gone into establishing different methods to estimate the bearing capacity of piles over the past years. Many solutions are based on complex mathematical derivations and numerical analysis [14,27,38,41], while others employ empirical equations derived from field tests such as the Standard Penetration Test (SPT) [19,30,49], the Cone Penetration Test (CPT) [22,36,67] and field static/dynamic load tests [10,20,30]. Despite these various solutions, a significant limitation that most conventional approaches share is their limited consideration of past experience and data. Indeed, empirical parameters, which are normally determined based on the experience of the designers, are often used to optimize predictions [11,20,23]; however, there is usually no systematic method to compute these parameters properly. For example, the effect of displacement on the ultimate bearing capacity of piles is normally evaluated through field load tests, which are usually time-consuming and highly resource-demanding, while empirical methods to ease this process have not been well established. There thus remains a need for novel approaches that can incorporate experience-based factors as well as the valuable existing data, and thereby generate user-friendly processes for enhanced practical designs.
Thanh T. Nguyen (thanh.nguyen-4@uts.edu.au)
In recent years, the use of data-based techniques such as machine learning (ML) to predict geotechnical issues has received increasing attention [26,64,68,72]. Of the numerous emerging ML techniques, the artificial neural network (ANN) is the most common approach for developing forecasting models of various geotechnical issues such as slope stability [40], soil properties [29,48], bearing capacity [7,28], deep excavation [2,71], mining [12,57], tunnelling [59] and jet grouting [58], among others. As ANN models are developed strictly from existing data, the quality and size of the data play a pivotal role in building them. Nevertheless, one of the major issues in past ANN modelling of pile foundations was the use of relatively limited data (i.e., the common number of trained data points was < 175) covering a variety of pile shapes, materials and construction methods (see Table 1). This can lead to low accuracy and/or reliability when applying these ANN models to real projects because new inputs can easily fall outside the trained range due to data scattering. The limited data also allow only one or a few subsets of training and testing data while establishing the model, thus hindering our understanding of the effect that data randomness can have on the confidence of predictions. In addition, Table 1 shows that past studies mainly addressed small and medium piles, i.e., diameters < 1.0 m and lengths < 45 m, whereas larger piles were hardly found. In fact, large and long piles are commonly employed in alluvial and coastal regions where soft soil is usually a significant barrier to infrastructure development [47]. Therefore, an ANN model established on a considerably large database of super-large and -deep piles in soft soil becomes essential.
A common issue across past ANN models of pile foundations was the use of CPT data to estimate the shaft and base resistances of piles for input parameters while modelling [4,6,43,45,51]. Because the CPTs are very different from the actual working condition of piles, especially for large and long piles (e.g., the ratio of diameter and displacement, penetration speed, the stress level along the shaft), the pile shaft and base resistances computed from the sleeve and tip values measured by CPTs can deviate significantly from their real values [63]. Further, the soil deformation based on CPTs is much larger than that of piles, while failures of soil based on CPT data do not represent very well the ultimate state of piles under loading, resulting in inaccurate understanding of the mobilizations of base and shaft resistances in pile foundation. For example, Fellenius (2015) [22] indicated that the pile toe resistance calculated from CPT results can considerably overestimate the actual value measured in field tests. Besides, CPTs are normally limited up to 50 m depth [21,62], whereas large bored piles can easily exceed this depth, causing a lack of method to estimate the shaft and base resistances for these long piles. These limitations of past ANN models require a more rigorous determination of influencing factors on the bearing capacity of piles that can be used as the key input parameters for establishing the model.
The major motive of this study is to overcome the above limitations by using a large database of pile foundations to examine and advance the application of ANN in predicting base resistance of super-large and -deep piles, which was not available in any past ANN models of pile foundation.
The key innovations are the random selection of training and testing datasets derived from 1137 data points (86 piles; Fig. 1). The equivalent pile diameter varied from 1.0 up to 2.5 m, while the embedded depth of these piles was from 40.2 to 99 m with the test load up to 10,910 tons, all of which were much larger than those used in previous ANN models for piles. A simple approach based on combining multiple learning rates is also proposed to enhance the training process. Further, the outcomes of the established ANN models are employed to determine the contribution that different factors make to the base resistance, which advances our knowledge of pile foundations. The study also provides empirical equations, which do not exist in the literature or the state of practice, to assist practical designs. It is noteworthy that although advanced ML combined with optimization techniques can be used, this study focuses on the most fundamental concept of the ANN model due to its simplicity and thus its wider range of practical implications.
2 An overview of super-large and long piles used in Mekong Delta

Geological condition
The Mekong Delta is a region of weak geology in Vietnam with a very thick mud deposit (Fig. 2a) from the Mekong River. Located at the margin of the Mekong Delta and downstream of the Sai Gon River, Ho Chi Minh City has extremely complicated geology with a very thick layer of soft to very soft clay, ranging from 6 to 30 m below the ground surface as described in Fig. 2b, resulting in considerable challenges for foundation design. This clayey soil has an average void ratio $e_0$ = 2.2, water content $w_n$ = 80% and plasticity index PI = 53 [32]. The liquid and plastic limits are around 89% and 36%, respectively; thus the soil is classified as CH type. The unconfined compressive strength ($q_u$) of this soil varies from 20.3 to 49.1 kPa, while the SPT value (N) is only around 0-2. Medium and dense sands are about 3 m thick and distributed widely from 10 to 60 m depth. The SPT value of this soil ranges from 8 to more than 25 blows/ft [33]. Underneath this layer is a 1.5-3 m-thick layer of stiff to very stiff clay where $q_u$ varies from 80 to 250 kPa and the SPT value is between 25 and 50 blows/ft. Below this layer is a dense fine sand with the SPT value ranging from 30 to greater than 70 blows/ft.

Characteristics of pile foundation in Ho Chi Minh city
Pile foundations for high-rise buildings in Ho Chi Minh City are often designed to carry heavy loads, i.e., from 800 to more than 3000 tons. The test load is usually required to be 2-3 times larger than the design load, resulting in a pile load in test cases of up to 9000 tons. For this very large magnitude, the bored piles are normally required to be installed down to the stiff soil layer, which is located from 60 to 100 m depth. In addition to this very deep installation, the cross sections of these piles must be large enough to meet the requirement of pile slenderness. Generally, two types of piles are commonly used in this region: bored piles with diameters from 1 to 2 m and rectangular barrette piles in the range 0.6 m × 2.4 m to 1.2 m × 2.8 m (equivalent to 2.5 m in diameter considering the same perimeter). Given this range of diameters plus an installation depth varying from 40.2 to 99 m, these piles can be classified as super-large and -long piles compared to the normal range, i.e., 0.8-1.3 m in diameter and 10-40 m in length, of large piles commonly used in past studies [1,4,8,18]. The static load test (SLT) and the O-cell load test (OLT) are currently the two most common field tests used to examine the design load and determine the ultimate load capacity of piles. The SLT needs a large space for setup, while its test load is normally limited to 4,500 tons. Therefore, for test loads exceeding 4,500 tons, the OLT is often adopted as it can be conducted in a narrow space for very long piles. To measure the development of the shaft and base resistances according to the loading point displacement, strain gauges are attached to steel reinforcement bars in the cages. The axial load distribution and mobilized base resistance of the pile at each loading level are presented in Fig. 3c and d, respectively.
The data show that the majority of the shaft friction is distributed along the upper half of the pile, i.e., around 45 m depth from the ground surface, whereas it drops markedly over the lower half of the pile. The larger the test load, the more non-uniform the axial load distribution along the pile. As the displacement reaches the largest level of 49.46 mm, the base resistance increases to 3853 kPa. These results mean that the contributions that the shaft friction and base resistance make to the total ultimate bearing capacity need to be estimated with respect to the test load and the displacement variations, especially for large-diameter and long piles.

3 Artificial neural network model and data collection

Artificial neural networks (ANNs)
A typical ANN model includes input, hidden and output layers that imitate the information processing of the nervous system in the human brain to identify and predict characteristics of a problem based on its given dataset [9]. In all these layers, the nodes (i.e., neurons) communicate with each other through weighted connections. Each neuron adopts an activation function that receives information from prior neurons, processes it and transmits signals to subsequent neurons or to the output of the network. The performance of an ANN model can be evaluated by comparing the predicted results with real data via several assessment factors such as the coefficient of determination ($R^2$), mean squared error and absolute errors [43,45,55]. Back-propagation is the most widely used algorithm for training ANN models [16] due to its simplicity and its high degree of accuracy in predicting data behaviour. The current study also adopted the back-propagation approach, with gradient descent employed to minimize the prediction error.
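The assessment factors named above can be sketched in a few lines. This is a minimal illustration, assuming $R^2$ is defined as 1 - SS_res/SS_tot (the text does not state which definition was used); the measured and predicted values are hypothetical.

```python
import math

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

# Hypothetical measured vs. predicted base resistance (kPa), illustration only.
measured = [500.0, 1200.0, 2100.0, 3853.0]
predicted = [620.0, 1100.0, 2250.0, 3700.0]
print(r_squared(measured, predicted), rmse(measured, predicted))
```

Both functions take plain lists, so they can be applied unchanged to the training and testing subsets.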

Data collection and features
The current ANN model was developed based on a database consisting of 1137 sets, which were collected from load tests of 86 super-large and long bored piles at various projects in the Mekong Delta, Vietnam. Basic information such as the location, size and loading parameters of these piles is summarized in Tables 2 and 3. The current database of piles is much more comprehensive and rigorous than those of most past studies using ANN models to predict the bearing capacity of pile foundations [4,6,42,43,51,54,61]. Moreover, all the piles range from large to super-large size and include only two types, i.e., 75% bored piles and 25% barrette piles (Fig. 5a). These unique features of the current database support a more robust and reliable development of the ANN model to predict the behaviour of pile foundations. The average SPT values at the pile toe were used to represent soil properties. It is important to consider the failure zone of soil below the pile toe to obtain the average SPT values. Meyerhof (1976) [41] recommended the failure zone to be 10D above and 4D below the pile toe (D is the equivalent diameter of the pile). According to [19], the failure zone under the pile toe of small-diameter piles was in the range of 2D to 8D depending on the physical properties of the soil layers above the pile toe. For larger-diameter piles, the failure zone extends to 1D below the pile toe as proposed by [5]. In this study, the average SPT value in the 1D zone below the pile toe was used because of the large-diameter bored piles. The range of SPT values is shown in Table 3.
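The two geometric steps mentioned so far, the equal-perimeter equivalent diameter of a barrette and the averaging of SPT values within 1D below the pile toe, can be sketched as follows. The borehole log, the toe depth and the function names are hypothetical and serve only as an illustration.

```python
import math

def equivalent_diameter(width_m, length_m):
    """Diameter of a circle with the same perimeter as a rectangular barrette."""
    perimeter = 2.0 * (width_m + length_m)
    return perimeter / math.pi

def average_spt_below_toe(spt_profile, toe_depth_m, diameter_m, zone_factor=1.0):
    """Mean SPT N of readings within zone_factor * D below the pile toe.

    spt_profile is a list of (depth_m, N) pairs from a borehole log.
    """
    bottom = toe_depth_m + zone_factor * diameter_m
    values = [n for depth, n in spt_profile if toe_depth_m <= depth <= bottom]
    if not values:
        raise ValueError("no SPT readings inside the averaging zone")
    return sum(values) / len(values)

# 1.2 m x 2.8 m barrette, the largest section quoted in the text.
d_eq = equivalent_diameter(1.2, 2.8)  # about 2.5 m

# Hypothetical borehole log: (depth in m, N in blows/ft).
log = [(78.0, 38), (80.0, 42), (81.0, 45), (82.0, 51), (84.0, 60)]
n_avg = average_spt_below_toe(log, toe_depth_m=80.0, diameter_m=d_eq)
print(round(d_eq, 2), n_avg)
```

The `zone_factor` argument makes it easy to compare the 1D zone adopted here with the deeper zones recommended for smaller piles.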

Model inputs and outputs
As the main objective of this study is to predict the mobilized base resistance of piles using the ANN approach, the input parameters are identified as those that directly influence the behaviour of the base resistance according to the load test data (e.g., Figs. 3 and 4). The base resistance mainly varies with five parameters: the applied load (P), the displacement of the loading point ($D_p$), the axial stiffness (AE), the SPT value (N) of the soil beneath the pile toe and the distance from the loading point to the pile toe ($L_p$). The pile test data (e.g., Figs. 3 and 4) show that increasing the applied load P and the corresponding displacement $D_p$ causes the base resistance to increase; thus they were considered as two primary input parameters in the current model. The stiffness of the soil beneath the pile toe plays an important role in governing the mobilized base resistance, i.e., the larger the SPT value, the larger the base resistance. In other words, the SPT value must be included among the influencing factors while establishing the ANN model. Furthermore, the distance between the loading point and the pile toe is a crucial parameter because varying this distance results in a different load distribution along the pile, thus affecting the base resistance. Figure 6 shows that the mobilized base resistance (i.e., $Q_b/(P/A)$) increases when the distance $L_p$ decreases. In particular, this ratio rises swiftly from around 10% to more than 30% when $L_p$ becomes less than 10 m. This observation corroborates previous data for large-diameter bored piles [1,10,17,38,44]. The longer the distance, the harder it is to mobilize the base resistance. Note also that for piles loaded at the pile head (static load tests), the distance between the loading point and the pile toe is the length of the pile, whereas in O-cell tests this distance is measured from the O-cell to the pile toe.
The distribution of $L_p$ in the current study is shown in Fig. 7.
It is important to note that the effect of pile size was included through the axial stiffness, which was computed based on Young's modulus of the pile material and the cross-sectional area [35]. In addition, the pile length is reflected via the distance between the loading point and the pile toe.

Data division and preprocessing
In the current model, 1137 datasets were used for developing and testing the ANN model. The available data were randomly divided into two subsets: a training dataset for model calibration and an independent testing dataset for model verification. Specifically, 966 data points (85%) were used for training and the remaining 171 data points (15%) were used for testing. The major innovation in the current data processing was that the training and testing process was repeated for 250 different random selections from the entire database, and a probability analysis was then applied to evaluate the results. Most previous studies assumed homogeneous data and randomly selected training and testing datasets only once or a few times [4,45,46,51,55], resulting in an incomplete understanding of how the random distribution of data can affect the prediction outcomes. Specifically, the questions of whether the established model would provide identical predictions over different subsets of training and testing data, and how this random division can affect the prediction confidence, have not been clarified. Indeed, for any data collected from real construction sites, the variation in data properties is usually significant, requiring an investigation of the probability distribution of outcomes while varying the data selection. The input and output variables are scaled to a unit range to eliminate their dimensions and ensure that all variables receive equal attention during training [54]. The scaling range is normally chosen with respect to the limits of the activation function used in the hidden layers; for instance, a scaling range between -1 and 1 is used with the tanh function, while the range from 0 to 1 is normally used for the sigmoid function.
While more advanced activation functions have been developed in recent times [60], our preliminary investigations on four common activation functions, namely Sigmoid, Tanh, ReLU and Leaky ReLU, showed that the Sigmoid function provided the most accurate and relevant outcomes (i.e., $R^2$ = 0.96 in the training phase), given the fundamental form of ANN. The Sigmoid function was hence adopted in this study with the scaling process given by:
$$x_n = \frac{x - x_{min}}{x_{max} - x_{min}} \qquad (1)$$
where $x_n$, $x_{min}$ and $x_{max}$ are the normalized (scaled), minimum and maximum values of variable $x$, respectively.
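The data division and scaling described above can be sketched as follows. The seed handling and function names are illustrative assumptions, and the text does not say whether the scaling limits were taken from the whole database or from the training subset only, so the limits can be passed in explicitly.

```python
import random

def split_indices(n_samples, train_fraction=0.85, seed=None):
    """Randomly divide sample indices into training and testing subsets."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]

def min_max_scale(values, v_min=None, v_max=None):
    """Scale values to the 0-1 range used with the sigmoid function."""
    v_min = min(values) if v_min is None else v_min
    v_max = max(values) if v_max is None else v_max
    return [(v - v_min) / (v_max - v_min) for v in values]

# One random division of the 1137 data points.
train_idx, test_idx = split_indices(1137, seed=0)
print(len(train_idx), len(test_idx))  # 966 171

# 250 different random divisions, one per seed (seed values are arbitrary).
divisions = [split_indices(1137, seed=s) for s in range(250)]

# Scaling a few pile lengths (m) between the database limits quoted in the text.
print(min_max_scale([40.2, 69.6, 99.0]))
```

Seeding each division separately keeps every one of the 250 cases reproducible, which helps when a poorly performing split needs to be examined afterwards.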

Determination of model architecture
The performance of an ANN model essentially depends on its network architecture; therefore, defining the optimum network architecture is vital to establishing an effective ANN model. The network architecture includes the number of hidden layers together with their nodes and optimum weights. Hornik, Stinchcombe, and White [31] indicated that an ANN model with one hidden layer can approximate any continuous function. In designing ANN architectures, Lawrence [37] recommended that increasing the number of hidden layers should be the last option, whereas optimizing the number of hidden nodes ($N_{hn}$) can enhance the performance of the ANN model more effectively. The optimal $N_{hn}$ depends on the numbers of input and output variables, as proposed in previous studies (Table 4). However, because the predictive performance of an ANN model is commonly evaluated through the coefficient of determination ($R^2$) as well as the root mean squared error (RMSE), the trial-and-error approach [54] can be used to determine the best predictive performance. In other words, the current study examined various network architectures with different numbers of hidden nodes using the same input and output data to find the best architecture. This optimum architecture was then applied to different random sets of data to investigate the probability distribution of predicted outcomes. In this work, an ANN model with one hidden layer was adopted for network construction as presented in Fig. 8, while $N_{hn}$ was varied from 5 to 40 nodes to determine the optimal network architecture. Several previous studies [2,53] selected the optimal network architecture based on the error of the training datasets; however, it is noteworthy that the error in the testing or validating datasets is always the main target of a predictive model.
An optimal model not only minimizes the error on the testing data by avoiding overfitting and underfitting, but also has the simplest structure (i.e., fewer input parameters and nodes) for reduced computational cost. Figure 9 presents the influence of $N_{hn}$ on the predictive performance of the ANN model. For the simple networks in which $N_{hn}$ is less than 10, the largest values of $R^2$ in the training and testing datasets were relatively small, i.e., only 0.889 and 0.805, respectively. When $N_{hn}$ increases to 30 and beyond, the value of $R^2$ in the training datasets rises considerably, to 0.969 and 0.983 for $N_{hn}$ = 30 and 40, respectively. Meanwhile, the value of $R^2$ in the testing datasets drops significantly from 0.899 to around 0.8 when $N_{hn}$ increases from 20 to 40, indicating over-fitting of the model when $N_{hn}$ is too large. This means that the network with 20 hidden nodes, where the values of $R^2$ in the training and testing data were large and close to each other, can be considered the best performing model. In fact, the time for training the 40-hidden-node system was also approximately double the time used for the 20-hidden-node computation; thus the network with 20 hidden nodes was selected for further analysis in this study. This choice, in fact, disagrees with the previous empirical equations shown in Table 4.
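The disagreement with Table 4 can be checked numerically for the present model, which has five inputs and one output. The sketch below evaluates the rules as read from Table 4; the expression attributed to [50] is reconstructed from a damaged table entry and is the least certain of them.

```python
import math

def hidden_node_rules(n_in, n_out):
    """Rule-of-thumb hidden-node counts, keyed by the reference numbers in Table 4."""
    return {
        "[13]": 2 * n_in + 1,
        "[52]": (n_out + n_in) / 2,
        "[39]": math.sqrt(n_in * n_out),
        # Reconstructed reading of the table entry; treat as an assumption.
        "[50]": (2 + n_out * n_in + 0.5 * n_out * (n_out ** 2 + n_in) - 3)
                / (n_out + n_in),
        "[66]": 2 * n_in / 3,
        "[34]": 2 * n_in,
    }

rules = hidden_node_rules(n_in=5, n_out=1)
for ref, value in rules.items():
    print(ref, round(value, 2))
```

Under this reading, every rule gives 11 nodes or fewer, which is well below the 20 hidden nodes selected by trial and error.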

Criteria of termination
During the training process, it is crucial to decide when to stop the training as this can critically affect the performance and reliability of an ANN model. A rigorous training strategy and supporting techniques are therefore required to avoid overfitting, which usually happens if model training is excessive. In contrast, insufficient model training leads to inaccurate outcomes and/or under-fitting. There are many techniques to determine when to stop the training process, as discussed in previous studies [56], while cross-validation techniques can be used to mitigate the overfitting issue [46]. In the current work, the coefficients of determination $R^2$ calculated from the actual and predicted values of pile base resistance in both the training and testing datasets were used to terminate the training process. The training dataset was used to modify the connection weights, while the testing dataset was used to evaluate the performance of the trained network. The values of $R^2$ at various loops in the training and testing data were obtained, and the training process was terminated only when the $R^2$ value of the testing data was unchanged or began to decrease.
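The termination rule described above can be sketched as a small check on the history of testing $R^2$ values. The `patience` and `tol` parameters are hypothetical additions to make "unchanged" concrete; the text does not give the exact thresholds used.

```python
def should_stop(test_r2_history, patience=1, tol=1e-4):
    """Return True when the testing R^2 is unchanged or decreasing.

    Training stops when the best of the latest `patience` checks does not
    exceed the best earlier value by more than `tol`.
    """
    if len(test_r2_history) <= patience:
        return False
    best_before = max(test_r2_history[:-patience])
    recent_best = max(test_r2_history[-patience:])
    return recent_best <= best_before + tol

print(should_stop([0.80, 0.85, 0.88]))  # still improving
print(should_stop([0.80, 0.88, 0.88]))  # unchanged
print(should_stop([0.80, 0.89, 0.87]))  # decreasing
```

A larger `patience` would tolerate short plateaus, at the cost of running more loops before stopping.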

Optimizing learning rate for enhanced training process
The initial connection weights of the model were randomly generated in the range from 0 to 1, and the optimal weights were obtained by training the network to reach the lowest prediction error. The number of training cycles needed to meet the termination criteria and reach the lowest prediction error depended on the learning rate (LR), which determines the size of the corrective steps that the model takes to reduce the error in each back-propagation. Shahin [54] showed that an LR of 0.2 is normally the minimum for an accepted error in predicting driven piles and drilled shafts. Alkroosh [3] indicated an optimum LR of 0.08 for minimal prediction errors after training their ANN model with LR values ranging from 0.05 to 0.6. A high LR can reduce the number of training iterations, but can make it difficult to reach the global minimum. On the other hand, a small LR can make the training time-consuming and leave it stuck in a local minimum. In this study, the trial-and-error approach was used to determine appropriate values of LR for the network training. LR values in the range from 0.005 to 0.05 were used for training the ANN models consisting of 20 hidden nodes, while the initial weight matrix was kept identical across different cases. The training results using different LR values are shown in Fig. 10. The network training did not converge when LR > 0.05. When LR was large, i.e., from 0.03 to 0.05, the convergence rate of training was high in the first cycles, resulting in an $R^2$ value of 0.942 within the first 2 million loops. However, the convergence rate dropped significantly in later cycles, leaving $R^2$ unchanged at 0.952 despite the training running through more than 4 million additional loops. For the lower values of LR, i.e., 0.005 and 0.01, the convergence rate in the first 2 million training cycles was moderate, achieving $R^2$ values of 0.931 and 0.907, respectively. This rate remained unchanged in the next 3 million training episodes with $R^2$ rising to 0.959 and 0.957, respectively. In the later loops, $R^2$ increased slowly and reached the largest value of 0.966 after around 10 million loops. It is noteworthy that the smaller the LR value, the larger the ultimate level of $R^2$. Figure 10 shows that the LR of 0.01 (blue line) was the optimal value for training the current network as it needed fewer training loops to reach the largest value of $R^2$.

Table 4 The recommended numbers of hidden nodes in past studies
References    Numbers of hidden nodes
[13]          $2N_i + 1$
[52]          $(N_o + N_i)/2$
[39]          $\sqrt{N_i \times N_o}$
[50]          $[2 + N_o \times N_i + 0.5 N_o \times (N_o^2 + N_i) - 3]/(N_o + N_i)$
[66]          $2N_i/3$
[34]          $2N_i$
$N_i$: number of input variables; $N_o$: number of output variables

Acta Geotechnica (2023) 18:2755-2775

Many past studies [3,45,46,54] used only one value of LR in the network training, which can lead to the aforementioned limitations such as excessive training time and local minima. A recent review study [70] shows that a fixed value of LR does not always work well. To overcome these problems, a combination of multiple LR values in the training process was investigated in this study. Specifically, high LR values (i.e., 0.05 and 0.03) were employed in the first training episodes to quickly approach the global minimum on the error surface. In the next cycles, lower LR values (i.e., 0.01 and 0.005) were used. This approach is more effective, with the slope of error reduction maintained at large corrective steps, and thus can be well considered while designing an ANN model.
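The staged use of learning rates can be sketched with a small one-hidden-layer sigmoid network trained by online back-propagation. This is a toy illustration under several assumptions: the data are synthetic, the network uses 5 hidden nodes and 100 epochs per stage so that it runs quickly, and the way epochs are shared between the four LR values is hypothetical, since the text does not state when each LR was switched.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(n_in, n_hidden, rng):
    # Initial weights drawn from [0, 1), as stated in the text.
    return {
        "w_ih": [[rng.random() for _ in range(n_in)] for _ in range(n_hidden)],
        "b_h": [rng.random() for _ in range(n_hidden)],
        "w_ho": [rng.random() for _ in range(n_hidden)],
        "b_o": rng.random(),
    }

def forward(net, x):
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w_ih"], net["b_h"])]
    out = sigmoid(sum(w * h for w, h in zip(net["w_ho"], hidden)) + net["b_o"])
    return hidden, out

def train_epoch(net, samples, lr):
    """One pass of online back-propagation; returns the mean squared error."""
    total = 0.0
    for x, target in samples:
        hidden, out = forward(net, x)
        err = out - target
        total += err * err
        delta_o = err * out * (1.0 - out)
        for j, h in enumerate(hidden):
            # Hidden-layer delta uses the output weight before it is updated.
            delta_h = delta_o * net["w_ho"][j] * h * (1.0 - h)
            net["w_ho"][j] -= lr * delta_o * h
            net["b_h"][j] -= lr * delta_h
            for i, xi in enumerate(x):
                net["w_ih"][j][i] -= lr * delta_h * xi
        net["b_o"] -= lr * delta_o
    return total / len(samples)

def train_staged(net, samples, schedule):
    """Apply each (learning_rate, n_epochs) stage in order."""
    history = []
    for lr, n_epochs in schedule:
        for _ in range(n_epochs):
            history.append(train_epoch(net, samples, lr))
    return history

rng = random.Random(1)
# Synthetic, already-scaled samples with five inputs and one output.
samples = []
for _ in range(30):
    x = [rng.random() for _ in range(5)]
    samples.append((x, 0.2 + 0.6 * x[1] * (1.0 - 0.5 * x[3])))

net = init_network(n_in=5, n_hidden=5, rng=rng)
schedule = [(0.05, 100), (0.03, 100), (0.01, 100), (0.005, 100)]
history = train_staged(net, samples, schedule)
print(history[0], history[-1])
```

The run is far shorter than the millions of loops reported in the study, so it shows only the mechanics of switching LR values, not the reported accuracy.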

Performance of ANN model
The reliability of the ANN models in predicting the mobilized base resistance of the current piles is presented in Fig. 11. The random division of the database into training and testing sets has a strong influence on the obtained results. Different cases of random data division resulted in different degrees of accuracy despite the same model features (e.g., model architecture, multiple learning rates of 0.05-0.03-0.01-0.005 and criteria of termination) being used. In the 250 investigated random cases, the optimal $R^2$ values of the training and testing datasets varied over a wide range, i.e., from 0.91 to 0.98 and from 0.80 to 0.96, respectively. This shows that investigating the prospective range of predicted outcomes that random data division can produce is vital to the confidence of using ANN models in practice. One cannot draw conclusions about the predicted results from only one or a few random cases; for example, $R^2$ in the testing cases can drop to a level as low as 0.8, which requires serious attention. Figure 12 shows the probability distribution of the computed $R^2$. In the 250 random cases, the most common values of $R^2$ in the training and testing datasets were around 0.96 and 0.89, respectively, which can be used as representative values to evaluate the ANN model performance. The obtained results indicate that the probability distribution of $R^2$ in the training and testing datasets resembles a left (negative)-skewed distribution [15,65], where the left side of the peak $R^2$ receives more of the distribution than the right side. This is understandable, as it becomes harder to achieve a precision higher than the peak level (e.g., 0.96 in training), especially for the present real and large database, resulting in a drop in distribution on the right side. The distribution of $R^2$ is more uniform in the testing data (Fig. 12b); in fact, most $R^2$ values lie around 0.89 (from 0.87 to 0.91).
However, the number of predictions with $R^2$ > 0.91 decreases more markedly than on the opposite side (i.e., $R^2$ < 0.87). Besides, the red solid lines in Fig. 12 show the cumulative probability of the $R^2$ value, which rises more steeply in training than in testing data. Overall, for a required $R^2$ > 0.85 with a confidence of 80% in testing cases, the current ANN model can be considered acceptable; however, for a higher degree of confidence, more effort is certainly required. Figure 13 shows representative predicted base resistance using the current ANN model (one hidden layer, $N_{hn}$ = 20
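The confidence figures discussed in this section can be read as the share of random data divisions whose testing $R^2$ exceeds a required level. A minimal sketch, using ten hypothetical $R^2$ values in place of the 250 cases of the study:

```python
def confidence(r2_values, threshold):
    """Fraction of random data divisions whose testing R^2 exceeds threshold."""
    return sum(1 for r in r2_values if r > threshold) / len(r2_values)

# Hypothetical testing R^2 values from repeated random divisions.
r2_runs = [0.80, 0.84, 0.86, 0.87, 0.88, 0.89, 0.89, 0.90, 0.91, 0.96]
print(confidence(r2_runs, 0.85))  # 0.8
```

Evaluating the same function over a range of thresholds gives the complement of a cumulative probability curve of the kind plotted in Fig. 12.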

Influencing factors to the base resistance of piles based on the prediction outcomes
Sensitivity analysis was carried out to identify the influence of individual input variables on the predicted results as well as on the performance of the ANN model. Garson [24] first proposed a simple technique, namely the weights method, to determine the relative importance (RI) of the input variables in a one-hidden-layer neural network by examining the connection weights of the trained network. The weights method fundamentally involves partitioning the hidden-output connection weights of each hidden neuron h into components associated with each input neuron i, as shown in Eq. (2). The detailed procedures and algorithm of this method can be found elsewhere [25]; in Eq. (2), o denotes the output neuron and w the connection weights. Figure 14 shows the results obtained from the sensitivity analysis. Interestingly, the displacement has the most dominant effect (i.e., 28.3% relative importance to the predicted outcome) on the mobilized base resistance, followed by the load-pile toe distance and the SPT value of the soil beneath the pile. The load P has the least significant influence, with only about 10.5% relative importance. This can be explained by the fact that the magnitude of displacement includes both the effects of the load P and the soil-pile interaction; thus its role was amplified, whereas the contribution of the load became less significant. The distance $L_p$ also has a great impact on the base resistance, which suggests the importance of considering this parameter when estimating the pile base resistance. The deeper the pile, the harder it is to mobilize the base resistance. The SPT value of the soil at the pile toe has a medium effect on the base resistance compared to other factors. This is understandable, as most of the large-diameter bored piles were founded on stiff soil layers where the SPT values were large and in a relatively similar range.
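A weights-based relative importance calculation can be sketched as below. This follows one commonly quoted formulation of Garson's method (absolute weight products partitioned per hidden neuron, then summed and normalized); it is an assumption that this matches Eq. (2) of the study, whose exact form is not reproduced here. The weights in the example are hypothetical.

```python
def garson_importance(w_ih, w_ho):
    """Relative importance (%) of each input from connection weights.

    w_ih[h][i] is the weight from input i to hidden neuron h;
    w_ho[h] is the weight from hidden neuron h to the single output.
    """
    n_in = len(w_ih[0])
    scores = [0.0] * n_in
    for row, w_out in zip(w_ih, w_ho):
        products = [abs(w * w_out) for w in row]
        total = sum(products)
        if total == 0.0:
            continue  # this hidden neuron contributes nothing to the output
        for i, p in enumerate(products):
            scores[i] += p / total
    grand_total = sum(scores)
    return [100.0 * s / grand_total for s in scores]

# Hypothetical trained weights: two inputs, two hidden neurons.
w_ih = [[1.0, 3.0], [2.0, 2.0]]
w_ho = [0.5, -1.0]
print(garson_importance(w_ih, w_ho))  # [37.5, 62.5]
```

Because only absolute values are used, the result ranks the strength of each input but says nothing about the direction of its effect.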

Empirical estimate of displacement and mobilized base resistance
To facilitate the practical application of the current model outcomes, all the data points representing the relationship between the mobilized base resistance and the loading point displacement were combined and their empirical relationships were developed. Specifically, empirical relationships between the loading point displacement $D_p$ (mm) and the pressure P/A (ton/m$^2$) were established separately for the static and O-cell load tests based on the measured data. The hyperbolic function, which is widely used to describe the nonlinear behaviour of the pile load-displacement curve [38,69], was adopted. The fitting curves are shown in Fig. 15a for static load tests and Fig. 15b for O-cell load tests. The corresponding expressions of the pile pressure (P/A) are given in Eqs. (3) and (4) for static load and O-cell load tests, respectively.
Although the results estimated by the proposed equations deviate to some extent from the field data, the equations offer a quick way for practising engineers to roughly estimate the pile load for a given pile displacement.
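As an illustration of how a hyperbolic load-displacement relationship of this kind might be fitted, the sketch below assumes the two-parameter form P/A = D_p / (a + b·D_p) and estimates a and b by ordinary least squares on the linearised form D_p / (P/A) = a + b·D_p. The functional form, the function names and the data are assumptions made for illustration; they are not the fitted Eqs. (3) and (4), and the data are synthetic rather than field measurements.

```python
from typing import List, Tuple


def hyperbola(disp: float, a: float, b: float) -> float:
    """Assumed hyperbolic model: pressure = disp / (a + b * disp)."""
    return disp / (a + b * disp)


def fit_hyperbola(disp: List[float], pressure: List[float]) -> Tuple[float, float]:
    """Least-squares fit of a and b on disp / pressure = a + b * disp.

    Points with zero pressure must be excluded by the caller, because the
    linearised ordinate disp / pressure is undefined there.
    """
    xs = disp
    ys = [d / q for d, q in zip(disp, pressure)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope of the linearised relationship
    a = mean_y - b * mean_x  # intercept of the linearised relationship
    return a, b


# Synthetic data generated from hypothetical parameters (not field data).
true_a, true_b = 0.02, 0.001
displacements = [2.0, 5.0, 10.0, 20.0, 30.0, 40.0, 50.0]  # mm
pressures = [hyperbola(d, true_a, true_b) for d in displacements]

a_fit, b_fit = fit_hyperbola(displacements, pressures)
asymptote = 1.0 / b_fit  # pressure approached at very large displacement
```

In this form, 1/a is the initial slope of the curve and 1/b is the asymptotic pressure, which makes the two fitted parameters easy to interpret against the measured curves.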
Consequently, the above empirical relationships were employed to estimate the vertical displacement of the loading point, which was then used to estimate the base resistance using the validated ANN model. The resulting charts, showing the variation of the base resistance with the loading point displacement and the distance L_p, are given in Figs. 16 and 17 for bored piles and barrette piles tested by the static and O-cell loading methods. These charts describe the common practical context of piles, in which three typical parameters, namely the embedded length, the cross section and the SPT value (N), are usually considered. For piles tested by static load, only displacements of less than 50 mm were used to investigate the mobilized base resistance with the trained network. The number of data points with displacement higher than 50 mm was relatively small (Fig. 15a), so predictions by the trained network for displacements larger than 50 mm would be less reliable. A similar consideration was applied to the piles tested by O-cell load, for which the vertical displacement of the loading point was limited to less than 20 mm (Fig. 15b).
The results in Figs. 16 and 17 indicate that the mobilized base resistance depends considerably on the displacement and embedded length of the piles. The pile base resistance varies nonlinearly with the pile head displacement. For piles tested by static load with a pile head displacement lower than 20 mm during the initial loading steps, the base resistance was hardly mobilized; in fact, its incremental rate only begins to increase when the displacement exceeds 20 mm. The mobilized base resistance increases rapidly as the displacement exceeds 25 mm. Besides, the longer the pile, the slower the mobilization of base resistance at the same magnitude of displacement. At a similar displacement of 40 mm, piles installed 10 m deeper would show 10%-20% lower mobilization of base resistance. For the piles tested by O-cell load, where the distance from the loading point to the pile toe was significantly shorter, the base resistance begins to increase significantly when the

Conclusions and practical implications
The current study used 1,137 datasets of super-large and long bored piles collected across 37 real projects to develop an ANN model to predict the base resistance of piles. The salient findings are as follows.
1. A relevant combination of multiple learning rates (LR), e.g., a descending (fast-to-slow) order from 0.05 to 0.005 in the training process, was shown to shorten the training time and to significantly reduce the local-minimum issue. For example, combining different values of LR can save 40% of the training time compared with a constant-LR approach.
2. The random selection of training and testing datasets can considerably affect the predicted results; for example, R² in the testing cases can drop from 0.95 to 0.8 using the same model features. Therefore, one cannot draw a sound conclusion about the confidence of the predicted results if only one or a few random cases are selected. The model results based on 250 random divisions of data showed that the probability distribution of R² was likely to follow a left (negative)-skewed form with much more dispersion in training processes.
3. The sensitivity analysis of the input parameters indicated that the displacement D_p has the most dominant effect on the mobilized base resistance (28.3%), followed by the distance between the loading point and the pile toe (L_p) and the SPT value (N) of the soil beneath the pile. The load P has the least impact, with only about 10.5% relative importance. These five input parameters can be considered relevant for developing an ANN model to predict the base resistance of super-large and long piles.
4. The mobilized base resistance at a displacement of 20 mm rose swiftly from around 10% to more than 30% when L_p became less than 10 m. The base resistance of the piles in O-cell tests began to develop much earlier than in static tests.
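The effect of random data division noted in finding 2 can be illustrated with a minimal sketch. A simple linear regression stands in for the ANN here, and the data are synthetic; the 80/20 split, the seeds, the noise level and all function names are assumptions made for illustration. Only the number of random divisions (250) and the R² threshold of 0.85 follow the values quoted in this paper.

```python
import random
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept (stand-in for the ANN)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true: List[float], y_pred: List[float]) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def r2_over_random_splits(xs, ys, n_splits, test_fraction, seed):
    """Testing R² for repeated random train/test divisions of one dataset."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_test = max(2, round(test_fraction * len(xs)))
    scores = []
    for _ in range(n_splits):
        rng.shuffle(indices)  # a new random division each time
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        slope, intercept = fit_line(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        y_test = [ys[i] for i in test_idx]
        y_pred = [slope * xs[i] + intercept for i in test_idx]
        scores.append(r_squared(y_test, y_pred))
    return scores


# Synthetic noisy data (hypothetical, not the pile database).
data_rng = random.Random(0)
x_data = [data_rng.uniform(0.0, 50.0) for _ in range(200)]
y_data = [2.0 * x + data_rng.gauss(0.0, 10.0) for x in x_data]

scores = r2_over_random_splits(
    x_data, y_data, n_splits=250, test_fraction=0.2, seed=1
)
threshold = 0.85
# Fraction of random divisions whose testing R² exceeds the threshold.
confidence = sum(s > threshold for s in scores) / len(scores)
```

Reporting the spread of `scores` and the value of `confidence`, rather than the R² of a single division, is the kind of summary that the finding above argues for.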
Empirical equations and design charts were also proposed for practising engineers to estimate the settlement and base resistance of piles based on the outcomes of the current ANN model. It is important to note that while the current findings were established from data on bored and barrette piles, they can bring significant value to a broader context of piling foundations, such as cement columns and driven piles, where the base resistance and bearing capacity share the same mechanism. For example, the five key parameters and their contributions to the base resistance of piles can be applied to different types of pile foundation, whereas the effect of data randomness should be considered in general applications of ANN techniques.
Acknowledgements The Authors acknowledge the support from various local companies and partners such as Hoang Binh Construction Group, FECON South JSC., Bachy Soletanche VN, among others around Mekong Delta while collecting and analysing field data for model development in this study.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Data availability The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.