1 Introduction

Bridges play an essential role in the socio-economic development of a country. The bridges are usually constructed across rivers, valleys, or other obstacles. The riverbeds are subjected to general scour because of aggradation and degradation of bed level due to sudden changes in velocity and flow depth. The beds are subjected to contraction and local scour when rivers flow through any obstruction. Local scouring takes place around the obstacles located in the flowing water. The failure of the river bridges is mainly due to the scouring action around the pier and abutments during flooding [1]. Therefore, considering scour depth around the pier and abutment is essential while designing the bridge foundations. The scour depth mainly depends on the flow, sediment, and abutment parameters. Flow parameters include velocity and depth as influencing factors. The sediment properties, like sediment size, sediment type, shear stress, and angle of repose, are essential factors affecting the scour intensity and scour depth. The shape, size, and skewness of the pier and abutments affect the local scour [2]. The scour mechanism and formation of the vortex depend on these parameters.

Several researchers have opted for experimental and numerical studies to understand the scour mechanism, patterns, and methods to measure scour depth considering different parameters. Most of the studies are concentrated on the local scour around the single pier and pier groups [3,4,5,6,7,8,9,10]. Many experimental studies are also conducted on abutment scour by considering different parameters [11,12,13,14,15,16,17,18,19]. Some numerical studies are also available that study the scour mechanism and depth around the bridge pier and abutments [32]. The data required for the numerical investigations are usually collected from experimental and field studies. Various researchers used different tools like HEC-RAS [20, 33], SSIIM [15], and RANS modeling [21] to simulate the scour pattern and calculate the scour depth. Accurate estimation of scour depth and simulation of the scouring process are very challenging in a laboratory-scale model due to the complexity of the scour phenomenon and flow field.

Soft computing tools trained on experimental data have become a viable approach to prediction, and several researchers have applied such models to predict scour depth around piers using different factors. Melville [22] synthesized the data on bridge scour from previous studies conducted in various countries. He applied an integrated approach to estimate the scour depth around piers and abutments using empirical relations, which were established by considering the flow, sediment, and foundation parameters. Similarly, Moradi et al. (2019) [23] applied the ANFIS model to assess the abutment scour depth using experimental data for clear water conditions. A standard ANFIS model was applied and compared with ANFIS-fuzzy C-means clustering and ANFIS-grid partitioning models. In this modeling, eleven input parameters were used in different combinations. It was found that the model developed using all the parameters performed better than the models built on other input combinations. In a similar way, Azimi et al. (2019) [24] hybridized the ANFIS model with the genetic algorithm (GA) and singular value decomposition (SVD) to obtain better scour depth predictions. The models were developed by considering factors that affect the scour depth. The sensitivity analysis showed that the Froude number and the ratio of flow depth to scour hole radius were the main factors affecting the scour depth near the abutment. Previously, Najafzadeh et al. [25] modeled abutment scour depth in cohesive soils using the GMDH technique. This study used montmorillonite and kaolinite clay soils with different soil properties as input parameters to build the proposed models. GMDH models were compared with ANFIS, radial basis function-neural network (RBF-NN) and empirical equations. The GMDH model outperformed the other models in scour depth prediction.
Indeed, the GMDH models predicted the scour depth around abutments in clear water and live bed conditions in another study by the same authors [26]. The dimensionless parameters from the previous works were used in the proposed GMDH models, and their performance was compared with the SVM model. An advanced soft computing technique, namely gradient tree boosting (GTB), was applied to predict the scour depth around the bridge piers [27]. The results of the GTB model were compared with the GMDH model results, and it was found that the GTB model gave good results in terms of all statistical parameters. The study by Hosseini et al. (2016) [34] compared three different models (Multiple Nonlinear Regression, Artificial Neural Networks and ANFIS) for predicting the time varying scour depth around abutments. The ANFIS model had the best results, with an R2 value of 0.98. Recently, Zhang and Zhao [35] found that the convolutional neural network (CNN) model was able to predict the local scour depth with a high degree of accuracy. The study used data collected from laboratory tests and field observations to train the CNN model. The data included information about the flow velocity, flow depth, sediment size, and pier shape.

In the present study, an attempt was made to employ machine learning tools for predicting scour depth around vertical wall, semicircular and 45° wing wall abutments. Different models with various input variables were tested in this study. The performance of four machine learning models, namely ANFIS, GTB, GMDH, and MARS, was compared.

2 Theoretical overview

2.1 Adaptive neuro-fuzzy inference system (ANFIS)

The Adaptive Neuro-Fuzzy Inference System (ANFIS) integrates fuzzy logic principles with artificial neural networks, making it a universal estimator. ANFIS benefits from the neural network approach for solving various function approximation problems. An ANFIS model uses a hybrid learning method for mapping the relationship between input and output data by establishing the optimal number of membership functions. Such a hybrid framework makes ANFIS modeling more systematic and less dependent on expert knowledge. A fuzzy inference system consists of five functional blocks, as explained in Jang [28]. A fuzzy system works on the basis of fuzzy IF–THEN rules (the rule base) and a database defining the membership functions (MF); together these are referred to as the knowledge base.

A decision-making unit forms an interface between fuzzification and defuzzification, thus transforming inputs into crisp outputs. The adaptive networks consist of feedforward neural networks with supervised learning capability. The learning rule for an adaptive network is based on minimizing an error measure computed from the outputs of the nodes. A hybrid learning rule, which combines gradient descent with least-squares estimation, speeds up this adaptation substantially. A general ANFIS model [29] consists of five layers, including input and output layers. First, the input layer receives the input variables. Next, the fuzzification layer fuzzifies the input variables, and the rule layer applies fuzzy rules to the fuzzified input variables. The inference layer calculates the output of the fuzzy rules, and lastly the defuzzification layer defuzzifies the output of the inference layer.
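The layer sequence described above can be illustrated with a minimal forward-pass sketch. It assumes a first-order Sugeno-type system with two inputs and Gaussian membership functions. The function names, the membership shape and all numerical parameters are hypothetical and are not taken from the models calibrated in this study.

```python
import math

def gaussian_mf(x, center, sigma):
    """Membership grade of x in a Gaussian fuzzy set (assumed MF shape)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

def anfis_forward(x1, x2, mf_params, consequents):
    """One forward pass through a two-input, first-order Sugeno ANFIS.

    mf_params:   {"x1": [(center, sigma), ...], "x2": [(center, sigma), ...]}
    consequents: one (p, q, r) tuple per rule; rule output = p*x1 + q*x2 + r
    """
    # Fuzzification layer: grade of each input in each of its fuzzy sets
    mu1 = [gaussian_mf(x1, c, s) for c, s in mf_params["x1"]]
    mu2 = [gaussian_mf(x2, c, s) for c, s in mf_params["x2"]]

    # Rule layer: firing strength of every (set of x1, set of x2) pair
    strengths = [a * b for a in mu1 for b in mu2]

    # Normalise the firing strengths so that they sum to one
    total = sum(strengths)
    weights = [w / total for w in strengths]

    # Inference layer: linear consequent of each rule
    rule_outputs = [p * x1 + q * x2 + r for p, q, r in consequents]

    # Defuzzification layer: weighted sum gives the crisp output
    return sum(w * f for w, f in zip(weights, rule_outputs))
```

During training, the membership parameters and the consequent coefficients would be the quantities adjusted by the learning rule. This sketch only evaluates a network whose parameters are already fixed.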

2.2 Gradient tree boosting (GTB)

Gradient tree boosting (GTB) combines multiple decision tree models into an ensemble to create a more robust and powerful machine learning technique for regression and classification problems. Other boosting methods reweight positive and negative samples, whereas GTB converges globally by following the direction of the negative gradient. Consider a dataset \({\{{x}_{i},{y}_{i}\} }_{i=1}^{n}\). Using a specific loss function, the gradient descent algorithm ensures the convergence of the GTB. The basic steps of GTB are as follows. Starting from an initial constant value of the model β, for each iteration m = 1, …, M (M is the number of iterations), the gradient directions of the residuals are calculated. Basic regressors are then fitted to the sample data by the least squares approach to obtain the model for that iteration. The loss function is minimized accordingly, a new step size is calculated, and the model is updated. Essential parameters in GTB models are the number of trees, the learning rate and the maximum depth of each estimator. This method can handle data sets with partially missing values and is preferred by many researchers for its accuracy and speed in handling complex and massive data sets. The GTB model reduces the loss at each successive stage, i.e., the errors diminish as more decision trees are added. GTB is sensitive to outliers; hence, using a mean absolute error loss reduces the effect of outliers. For further details on GTB, readers are referred to [36, 37].
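The boosting steps outlined above can be sketched in a few lines. The sketch makes several simplifying assumptions that are not taken from this study: a squared-error loss (so the negative gradient equals the residual), a single input feature, depth-one trees (stumps) as basic regressors, and a fixed learning rate in place of a step size recomputed at each iteration. All function names are hypothetical.

```python
def fit_stump(x, residuals):
    """Least-squares regression stump on one feature.

    Assumes x contains at least two distinct values.
    """
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= thr]
        right = [r for xi, r in zip(x, residuals) if xi > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda v: lm if v <= thr else rm

def gradient_boost(x, y, n_trees=50, learning_rate=0.1):
    """Fit a boosted ensemble of stumps and return a prediction function."""
    init = sum(y) / len(y)            # initial constant model
    trees = []
    pred = [init] * len(y)
    for _ in range(n_trees):
        # Negative gradient of the squared-error loss = current residuals
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        tree = fit_stump(x, residuals)
        trees.append(tree)
        # Update the model with a shrunken step along the fitted direction
        pred = [pi + learning_rate * tree(xi) for xi, pi in zip(x, pred)]

    def predict(v):
        return init + learning_rate * sum(t(v) for t in trees)

    return predict
```

The three arguments that control this sketch (number of trees, learning rate and, implicitly, a tree depth of one) correspond to the essential GTB parameters named in the text.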

2.3 Group method of data handling (GMDH)

Group method of data handling (GMDH) is a machine learning algorithm that approximates the relationship between inputs and the output by a nonlinear mapping composed of successive layers of neurons using polynomial transfer functions. It is a computer-based mathematical model that uses multi-parameters with a fully automatic model to optimize the parameters. It produces rules after each iteration and adds to the set of rules.

The regression problem can be stated as identifying a function \(\widehat{f}\) that approximates an unknown function \(f\), so that the prediction \(\hat{y}\) for the input \(X=\left({x}_{1},{x}_{2},{x}_{3},\dots ,{x}_{n}\right)\) is as close as possible to the expected output \(y\). To this end, \(M\) observations of multi-input, single-output data are considered as follows:

$${y}_{i} = f( {x}_{i1},{x}_{i2},{x}_{i3}\dots ,{x}_{in} ), (i=1, 2, 3, \dots M)$$

The GMDH network is trained on the input vector X to predict the \(\hat{y}\) values:

$$\widehat{{y}_{i}}= \widehat{f}( {x}_{i1},{x}_{i2},{x}_{i3}\dots ,{x}_{in} ), (i=1, 2, 3, \dots M)$$

GMDH iteratively sorts through polynomial models of increasing complexity to identify the best one. The selected model satisfies an external criterion and minimizes the sum of squared differences between the predicted and expected values. GMDH has been applied in many fields, such as prediction, data mining, complex systems modeling, knowledge discovery, optimization and pattern recognition. For further details on GMDH, readers are referred to [38, 39].
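A single GMDH layer can be sketched as follows. The sketch assumes the commonly used quadratic polynomial of two inputs as the neuron transfer function and the mean squared error on a held-out subset as the external criterion. Neither choice is stated in this paper, and the function names are hypothetical.

```python
from itertools import combinations

import numpy as np

def design(xi, xj):
    """Terms of the assumed quadratic polynomial for one pair of inputs."""
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])

def gmdh_layer(X_train, y_train, X_val, y_val, keep=2):
    """Fit one candidate neuron per input pair and keep the best ones.

    Coefficients are estimated by least squares on the training subset;
    neurons are ranked by their error on the validation subset, which
    plays the role of the external criterion.
    """
    candidates = []
    for i, j in combinations(range(X_train.shape[1]), 2):
        A = design(X_train[:, i], X_train[:, j])
        coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
        val_pred = design(X_val[:, i], X_val[:, j]) @ coef
        mse = float(np.mean((y_val - val_pred) ** 2))
        candidates.append((mse, (i, j), coef))
    candidates.sort(key=lambda c: c[0])
    return candidates[:keep]
```

In a full network, the outputs of the retained neurons would become the inputs of the next layer, and layers would be added until the external criterion stops improving.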

2.4 Multivariate adaptive regression splines (MARS)

Multivariate Adaptive Regression Splines (MARS) was formulated by Friedman (1991) [30]. The MARS model is an adaptive nonlinear regression model that uses numerous individual linear basis functions arranged in stepwise layers over a predictor variable space. It builds the relation between the variables entirely from the regression data, without assuming a functional form in advance. It builds a complex model to correlate multiple variables using a set of spline basis functions and other parameters. The piecewise linear functions are chosen to fit the data as closely as possible, while also avoiding overfitting. The MARS algorithm typically uses a greedy approach to select the number and location of knots, where the piecewise linear functions are joined. Generalized cross-validation (GCV) error is the measure that MARS typically uses to prune basis functions; a low GCV error indicates that the model is likely to generalize well. The generalized MARS model is summarized in Rezaie-balf et al. (2017) [31]. The MARS model has the advantages of automatically performing variable selection and transformation, capturing non-monotonic relationships, and handling high-dimensional data at high speed.
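The piecewise linear basis functions can be illustrated with the hinge form used in Friedman's formulation, max(0, x − t) and max(0, t − x), where t is the knot. The following sketch evaluates an already fitted one-variable model; the intercept, coefficients and knot are hypothetical values chosen for illustration and are not results of this study.

```python
def hinge(x, knot, direction=+1):
    """max(0, x - knot) for direction=+1, max(0, knot - x) for direction=-1."""
    return max(0.0, direction * (x - knot))

def mars_predict(x, intercept, terms):
    """Evaluate a one-variable MARS model.

    terms: list of (coefficient, knot, direction) tuples, one per basis function.
    """
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)

# Hypothetical model with a single knot at x = 10:
# slope 1.5 to the right of the knot, slope 0.5 to the left of it
terms = [(1.5, 10.0, +1), (-0.5, 10.0, -1)]
print(mars_predict(12.0, 2.0, terms))  # 5.0
print(mars_predict(6.0, 2.0, terms))   # 0.0
```

The change of slope at the knot is what lets a sum of such terms follow a non-monotonic relationship.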

3 Data analysis and model development

The study by Dey and Barbhuiya [11] provided valuable data on the time-dependent scour depth (ds) near different abutments. The study was conducted in a laboratory flume considering vertical wall, semicircular and 45° wing wall abutments. The laboratory model was a scaled representation of a real hydraulic flow situation. Scour depth analysis was done individually by considering different parameters. Here, the dimensional parameters of flow and sediment were considered to predict the scour depth. The scour depth (ds) was predicted using five input parameters: the length (L) and breadth (B) of the abutment, sediment size (d50), approaching flow depth (h) and average approaching flow velocity (U). Abutment lengths of 4 cm to 13 cm and breadths of 8 cm to 36 cm were considered for the modeling of scour depth prediction. Flow depths of 5 cm to 25 cm, sediment diameters of 0.03 mm to 0.31 mm, and velocities of 0.22 m/s to 0.67 m/s were also considered.

The entire data set (99 samples) was divided randomly into two sets, i.e., ≈ 70% for training and ≈ 30% for testing the model. The training and testing sets comprised 71 and 28 samples, respectively. The range and quality of the divided data sets for both input and output parameters are tabulated in Table 1. Descriptive statistics such as maximum, minimum, mean, kurtosis and standard deviation for all three abutment types are presented separately.
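A random split of this kind can be sketched as below. Only the sample counts (99, 71 and 28) come from the text. The function name and the seed are hypothetical, and the actual partition used in the study is not reproduced.

```python
import random

def split_indices(n_samples, n_train, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)   # seeded for a repeatable split
    return sorted(idx[:n_train]), sorted(idx[n_train:])

train_idx, test_idx = split_indices(99, 71)
print(len(train_idx), len(test_idx))  # 71 28
```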

Table 1 Descriptive statistics of abutment scour data

The scour depth was predicted using soft computing models from the available experimental data. ANFIS, GTB, GMDH, and MARS models were used after optimizing the model parameters. The performance of the ANFIS model depends on the number and type of membership function and the number of epochs. The GMDH model performs better by selecting the optimum number of neurons, layers, selection pressure and loops. By determining the optimum values of N_estimators, learning rate, loss, maximum depth, minimum sample split and random rate, the GTB model gives a good prediction of scour depth. The MARS model performance is based on model parameters such as maximum degree, penalty, midspan_alpha, endspan_alpha and endspan. The optimal model parameters obtained by automated grid search approach for all the soft computing models are tabulated in Table 2.
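An exhaustive grid search can be sketched as follows. The scoring function is a placeholder. In practice it would train a model with the given parameters and return a validation error. The grid values in the usage example are hypothetical and are not the values reported in Table 2.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination and return the lowest-scoring one.

    param_grid: dict mapping a parameter name to a list of candidate values
    score_fn:   callable taking a dict of parameters and returning an error
    """
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

For a GTB model, for example, the grid might span the number of estimators, the learning rate and the maximum depth, with the testing-phase RMSE as the score.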

Table 2 Optimal model parameters used for calibration of models

4 Performance evaluation metrics

The scour depths predicted by the soft computing models (ANFIS, GTB, GMDH and MARS) were compared with the experimental scour depth values. The efficiency and errors of the model predictions were evaluated using the following statistical indices:

Mean Absolute Error (MAE), \({\text{MAE}}=\frac{\sum_{i=1}^{N}\left|{P}_{i}-{O}_{i}\right|}{N}\)

Root Mean Square Error (RMSE), \({\text{RMSE}}=\sqrt{\frac{\sum_{i=1}^{N}{\left({O}_{i}-{P}_{i}\right)}^{2}}{N}}\)

Relative Root Mean Square Error (RRMSE), \({\text{RRMSE}}=\frac{RMSE}{{\sigma }_{obs}}, {\text{RRMSE}}\ge 0\)

Nash–Sutcliffe Efficiency (NSE), \({\text{NSE}}=1-\left(\frac{\sum_{i=1}^{N}{\left({O}_{i}-{P}_{i}\right)}^{2}}{\sum_{i=1}^{N}{\left({O}_{i}-\overline{O }\right)}^{2}}\right)\)

Normalized Nash–Sutcliffe Efficiency (NNSE), \({\text{NNSE}}=\frac{1}{2-NSE}, 0\le {\text{NNSE}}\le 1\)

Kling–Gupta Efficiency (KGE), \({\text{KGE}}=1-\sqrt{{\left(R-1\right)}^{2}+{\left(\beta -1\right)}^{2}+{\left(\gamma -1\right)}^{2}} , -\infty <{\text{KGE}}\le 1\)

Willmott Index (WI), \({\text{WI}}=1-\frac{\sum_{i=1}^{N}{\left|{O}_{i}-{P}_{i}\right|}^{j}}{\sum_{i=1}^{N}{\left(\left|{P}_{i}-\overline{O }\right|+\left|{O}_{i}-\overline{O }\right|\right)}^{j} }, 0\le {\text{WI}}\le 1\)

where,

Correlation coefficient, \(R=\left[\frac{\sum_{i=1}^{N}\left({O}_{i}-\overline{O }\right)\left({P}_{i}-\overline{P }\right)}{\sqrt{\sum_{i=1}^{N}{\left({O}_{i}-\overline{O }\right)}^{2}\sum_{i=1}^{N}{\left({P}_{i}-\overline{P }\right)}^{2}}}\right]\)

Bias ratio, \(\beta =\frac{\overline{P} }{\overline{O} }\)

Variability, \(\gamma = \frac{{CV_{p} }}{{CV_{o} }} = \frac{{\frac{{\sigma_{p} }}{{\overline{P}}}}}{{\frac{{\sigma_{o} }}{{\overline{O}}}}}\)

\({\sigma }_{obs}\) or \({\sigma }_{o}\) is the standard deviation of the experimental data; \({\sigma }_{p}\) is the standard deviation of the model predicted data; \(O\) denotes the experimental/measured values; \(P\) denotes the model predicted values; \(N\) is the total number of data points; \(\overline{O }\) is the mean of the measured data; \(\overline{P }\) is the mean of the model predicted data; \(j\) is an exponent term.
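The indices above can be computed with a short routine such as the following sketch. It assumes population standard deviations and j = 2, and it takes WI in Willmott's standard form (one minus the ratio of the two sums, so that WI = 1 indicates perfect agreement). The function names are hypothetical.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def _std(values):
    # Population standard deviation (assumed; the text does not specify the form)
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def scour_metrics(obs, pred, j=2):
    """Return the evaluation indices for observed and predicted values.

    Assumes obs and pred are not constant and have non-zero means.
    """
    n = len(obs)
    o_bar, p_bar = _mean(obs), _mean(pred)
    s_o, s_p = _std(obs), _std(pred)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_obs = sum((o - o_bar) ** 2 for o in obs)
    ss_pred = sum((p - p_bar) ** 2 for p in pred)

    mae = sum(abs(p - o) for o, p in zip(obs, pred)) / n
    rmse = math.sqrt(sse / n)
    nse = 1 - sse / ss_obs
    covariance_sum = sum((o - o_bar) * (p - p_bar) for o, p in zip(obs, pred))
    r = covariance_sum / math.sqrt(ss_obs * ss_pred)
    beta = p_bar / o_bar                       # bias ratio
    gamma = (s_p / p_bar) / (s_o / o_bar)      # variability ratio
    kge = 1 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
    wi_num = sum(abs(o - p) ** j for o, p in zip(obs, pred))
    wi_den = sum((abs(p - o_bar) + abs(o - o_bar)) ** j for o, p in zip(obs, pred))

    return {
        "MAE": mae,
        "RMSE": rmse,
        "RRMSE": rmse / s_o,
        "NSE": nse,
        "NNSE": 1 / (2 - nse),
        "KGE": kge,
        "WI": 1 - wi_num / wi_den,
    }
```

MAE and RMSE carry the units of the scour depth (cm in this study), whereas the remaining indices are dimensionless.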

5 Results and discussion

The results derived from the ANFIS, GTB, GMDH, and MARS models for predicting scour depth are discussed in this section. The results support the suitability of the tested machine learning models for predicting the time-dependent scour depth (ds) around bridge abutments of different shapes. The performance of all the machine learning models developed was assessed using the performance evaluation metrics described in Sect. 4. In addition, scatter plots, violin plots and Taylor diagrams were used for in-depth analysis of the results.
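For reference, a Taylor diagram can display three statistics in a single plot because they are linked geometrically. In Taylor's standard formulation (restated here as background, not derived in this study), the centred RMSD, i.e., the RMSD computed after removing the means of both series, satisfies a law-of-cosines relation with the two standard deviations and the correlation coefficient:

$${\text{RMSD}}^{2}={\sigma }_{o}^{2}+{\sigma }_{p}^{2}-2{\sigma }_{o}{\sigma }_{p}R$$

A model whose point lies close to the observation point on the diagram therefore has both a high R and a standard deviation close to \({\sigma }_{o}\).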

5.1 45° wing wall abutment

The performance of models tested for 45° wing wall abutment in terms of errors and accuracy is presented in Table 3. The GTB model performed better than all the other models tested for predicting scour depth. It performed better in both the training and testing phases and displayed an RMSE of 0.624 cm and a WI of 0.994 in the testing phase.

Table 3 Performance evaluation metrics

The scatter plot presented in Fig. 1 highlights the efficacy of the GTB model. The scour depths predicted by the GTB model were in close agreement with the observed values. The scatter plot analysis shows the highest R2 value (0.9792) for the GTB model among all the tested models. The MARS model also performed well, with an R2 value of 0.9781. The scatter plot clearly shows that the ANFIS model could not predict the extreme (smallest and largest) values of the scour depth. The GMDH model predicted the smaller values but could not predict the larger ones accurately. Violin plots are generally used to compare the distribution of numeric data between multiple groups. The violin plot presented in Fig. 2 reveals the ability of the GTB model to predict scour depth around the 45° wing wall abutment: the violin plots of the observed dataset and the GTB model's predictions matched each other in terms of symmetry and variability. The Taylor diagram (Fig. 3), plotted to evaluate the model performances in terms of RMSD, R, and standard deviation, also supported the superiority of the GTB model. It was also observed from the Taylor diagram that the performance of the GTB model and the MARS model was similar.

Fig. 1 Scatter plots of observed vs. predicted scour depth for the 45° wing wall abutment

Fig. 2 Violin plots showing the distribution of relative estimation error of different models for the 45° wing wall abutment

Fig. 3 Taylor diagram for comparative evaluation of the performance of individual models for the 45° wing wall abutment

5.2 Semicircular wall abutment

Table 3 presents the results for the semicircular wall abutment. It can be observed from the table that all the tested models exhibited performance similar to that for the 45° wing wall abutment. In the case of the semicircular wall abutment, the GTB model performed better than the ANFIS, GMDH, and MARS models. The GTB model was the best, with an RMSE of 0.580 cm and an NNSE of 0.987. It can also be noticed that the ANFIS model performed better in the training phase but could not reproduce the same results in the testing phase. In the testing phase, the GTB model offered the best fit between the measured and predicted scour values (Fig. 4), with an R2 value of 0.987. It can be further observed that the GTB model tested on the semicircular wall abutment data had the lowest RRMSE (0.108), indicating robust performance by the model.

Fig. 4 Scatter plots of observed vs. predicted scour depth for the semicircular abutment

In the case of the semicircular wall abutment, the GMDH and the ANFIS models demonstrated similar performance. These models performed well at predicting smaller scour values but could not precisely predict the larger scour depth values. The violin plot (Fig. 5) shows that the data distributions of the GTB, GMDH, and MARS predictions are similar to that of the observed scour values. The Taylor diagram (Fig. 6) also clearly indicates the superiority of the GTB model in estimating scour depth around a semicircular wall abutment.

Fig. 5 Violin plots showing the distribution of relative estimation error of different models for the semicircular abutment

Fig. 6 Taylor diagram for comparative evaluation of the performance of individual models for the semicircular abutment

5.3 Vertical wall abutment

The proposed models were also tested for predicting the scour depth around vertical wall abutments. The performance of the models is presented in Table 3. The results show that in the case of the vertical wall abutment, the MARS model (RMSE = 0.692 cm, WI = 0.996) performed better than the ANFIS, GTB, and GMDH models. The MARS model displayed good training as well as testing performance. The scatter plot (Fig. 7) indicates a good fit between the observed scour values and the values predicted by the MARS model. The violin plot (Fig. 8) shows a similar distribution of data for the MARS and GMDH models, which is consistent with the observed dataset. The Taylor diagram (Fig. 9) portrays the efficacy of the MARS model in predicting scour depths around the vertical wall abutment.

Fig. 7 Scatter plots of observed vs. predicted scour depth for the vertical wall abutment

Fig. 8 Violin plots showing the distribution of relative estimation error of different models for the vertical wall abutment

Fig. 9 Taylor diagram for comparative evaluation of the performance of individual models for the vertical wall abutment

The overall results of the study suggest that the GTB and MARS models are a good choice for predicting scour depth around abutments. At the same time, the performance of the ANFIS model was poor compared to other models.

6 Conclusions

The ANFIS, GTB, GMDH and MARS models were applied to predict the abutment scour depth around 45° wing wall, semicircular and vertical wall abutments. The predicted values from the models were compared with experimental/observed values and analyzed using statistical metrics. The following conclusions were drawn from the study. The GTB model performed well in predicting scour depths around the 45° wing wall and semicircular abutments compared to ANFIS, GMDH and MARS models. In the case of vertical wall abutments, the MARS model performed well for scour depth prediction compared to the other three models. Based on the statistical metrics, the model performance shows that both GTB and MARS models performed relatively better in scour depth prediction. The study suggests that machine learning models can be a useful tool for predicting scour depth around bridge abutments. The GTB and MARS models are good options, but the performance of the models varies depending on the type of abutment. It is important to choose the right model for the specific application.