An in-depth comparative analysis of data-driven and classic regression models for scour depth prediction around cylindrical bridge piers

The study focuses on the critical concern of designing secure and resilient bridge piers, especially regarding scour phenomena. Traditional equations for estimating scour depth are limited, often leading to inaccuracies. To address these shortcomings, modern data-driven models (DDMs) have emerged. This research conducts a comprehensive comparison involving DDMs, including support vector machine (SVM), gene expression programming (GEP), multilayer perceptron (MLP), gradient boosting trees (GBT), M5 model tree and multivariate adaptive regression spline (MARS) models, against two regression equations for predicting scour depth around cylindrical bridge piers. Evaluation employs statistical indices, such as root-mean-square error (RMSE), coefficient of determination (R²), mean absolute error (MAE) and the normalized discrepancy ratio (S_DDR), to assess their predictive performance. A total of 455 data points from previous research papers are employed for assessment. The dimensionless parameters Froude number $Fr = \frac{U}{\sqrt{gy}}$, pier Froude number $Fr_{P} = \frac{U}{\sqrt{g^{\prime}D}}$, and the ratio of flow depth to pier diameter $(y/D)$ are carefully selected as influential model inputs through
dimensional analysis and the gamma test. The results highlight the superior performance of the SVM model. In the training phase, it exhibits an RMSE of 0.1009, MAE of 0.0726, R² of 0.9401, and S_DDR of 2.9237. During testing, the SVM model shows an RMSE of 0.023, MAE of 0.017, R² of 0.984, and S_DDR of 5.301. Additionally, it has an average error of −0.065 and a total error of −20.642 in the training set, and an average error of −0.005 and a total error of −0.707 in the testing set. Conversely, the M5 model exhibits the lowest accuracy. The statistical metrics unequivocally establish the SVM model as significantly outperforming the empirical regression equations, placing it in a higher echelon of predictive accuracy.


Introduction
In the realm of river engineering, scour presents itself as a formidable challenge, instigating the gradual erosion and degradation of bridge structures. Numerous accounts substantiate erosion's role as a catalyst for the deterioration of bridges. Consequently, the precise estimation of bridge scour depth assumes paramount importance. Despite the multitude of diverse investigations into bridge pier scour depth, its inherent complexity underscores the significance of formulating precise simulators, a pursuit that commands considerable attention from researchers and engineers alike. Within the USA, the primary contributors to bridge damage have been identified as scouring and flooding, as attested by numerous sources (Wardhana and Hadipriono 2003). The Austrian Federal Railways (ÖBB) experienced substantial financial losses, amounting to approximately USD 113 million, due to flooding events coupled with bridge collapses (Kellermann et al. 2016). Additionally, the projected expenditure for mitigating scour risk across Europe from 2040 to 2070 is estimated to reach USD 611 million per annum (Nemry and Demirel 2012). Given its integral role, this critical infrastructure component underscores the prominence of research endeavors focusing on augmenting safety during the design phase and minimizing the likelihood of bridge failures. In this context, researchers have introduced a range of experimental equations, a selection of which is displayed in Table 1. The review of existing literature demonstrates that over the recent decades, a variety of mathematical equations have been proposed to forecast the scour depth around bridge piers. However, these equations, often rooted in empirical observations, are fraught with numerous limitations (Brandimarte et al. 2012). Furthermore, their efficacy is typically confined to specific experimental conditions (Bateni et al. 2007). Mueller and Wagner (2005) undertook an assessment of 22 mathematical equations using field data, revealing a consistent trend of overestimating scour hole dimensions in comparison with actual measurements. Similarly, Landers and Mueller (1996) conducted a comparative analysis of five empirical formulas for bridge pier scour prediction based on field data, concluding that none of the selected formulas yielded accurate estimations of scour depth. Gaudio et al. (2010) conducted a comparative study involving six design formulas for predicting scour depth, juxtaposing the results with field data. Their investigation disclosed that all utilized formulas generated predictions that were deemed unreasonably inaccurate. Multiple other scholars have documented the deficiencies inherent in experimental-based formulations when it comes to forecasting the depth of scour around bridge piers (Rahimi et al. 2020).

[Table 1 note: K_s = 1.0, 0.9, 0.75 and 0.7 for rectangular, semicircular, elliptic and lenticular nose shapes, respectively (Chitale 1962); σ: standard deviation of bed sediment particles.]

Table 2: A brief literature review on scour depth prediction using DDMs

Reference | MLM utilized | Results
Kumar et al. (2023) | Bagging regressor (BR), AdaBoost regressor (ABR), and support vector regression (SVR) | BR and ABR outperformed the SVR
Choudhary et al. (2023) | Adaptive neuro-fuzzy inference system (ANFIS) and gene expression programming (GEP) | The ANFIS model exhibited superior predictive precision
Tola et al. (2023) | A review of MLM applications on scour depth forecasting | Most of the time, the proposed MLMs improved scour depth prediction, which is the main target of designing new MLMs
Roshni (2023) | GEP, M5-tree, ANFIS, multivariate adaptive regression splines (MARS) | The ANFIS model outperformed the other selected data-driven models and conventional empirical equations
Baranwal and Das (2023) | SVM | Found to be more reliable and efficient in estimating scour depth around bridge piers
Rathod and Manekar (2023) | Artificial neural network (ANN), ANFIS, SVM, M5, GEP, group method of data handling (GMDH) | Quantitative and qualitative results indicated that SVM performed better
Pandey et al. (2023) | Categorical boosting (CatBoost) in conjunction with extra tree regression (ETR) and K-nearest neighbor (KNN) | The gradient boosting decision tree (GBDT) method selected features with higher importance
In response to these challenges, researchers have increasingly directed their efforts towards leveraging artificial intelligence (AI) techniques to enhance the accuracy of pier scour depth prediction. Within this context, machine learning methods (MLMs), which constitute a prominent subset of AI methodologies, have garnered significant interest among researchers in the realm of engineering prognostication. MLMs operate by scrutinizing datasets, with a specific emphasis on identifying interrelations among input, internal, and output variables, all while circumventing the need for explicit comprehension of the system's underlying physical mechanisms (Qaderi et al. 2020). Table 2 displays the compilation of a literature review encompassing diverse MLM techniques employed for the modeling of scour depth around bridge piers.
The primary impetus behind this research stems from the prowess and prospective applications of MLMs. In pursuit of this objective, the current study systematically employed extensive datasets derived from empirical experiments conducted within various laboratory flume settings. These datasets encompassed a diverse spectrum of sediment gradations and coarse material fractions. The resultant data points exhibit a substantial breadth of variability, thereby facilitating the utilization of the SVM, GEP, MLP, GBT, M5 and MARS models, alongside empirical equations, for the purpose of predicting scour depths in the vicinity of cylindrical bridge piers. To discern the optimal predictive models, a comprehensive analysis involving statistical indices has been undertaken.

Material and methods
The forthcoming research endeavor encompasses a systematic sequence of steps, characterized by the following methodological delineations: (i) data collection and analysis, (ii) dimensional analysis, (iii) sensitivity analysis, (iv) identification of key inputs, (v) implementation of the prescribed models, and (vi) output analysis. This structured framework encapsulates the logical progression of the research endeavor, designed to yield robust and substantiated findings. The methodology employed in the current investigation for the prediction of bridge pier scour depth is elucidated through the schematic representation depicted in Fig. 1.

Dataset compiled
The data employed in the current investigation were sourced from pre-existing studies documented in the literature, specifically conducted under clear-water conditions. These studies encompass a diverse array of laboratory flumes and field data, incorporating a wide spectrum of sediment compositions and hydraulic flow scenarios. In totality, a dataset comprising 455 dependable data points was curated and subsequently utilized as the foundation for the present research endeavors. It is noteworthy that among this total, 168 data points pertain to field data, while 287 pertain to laboratory data. Table 3 delineates the statistical metrics corresponding to the datasets associated with each respective reference. Within this tabular representation, the variables are defined as follows: D signifies the diameter of the pier, y pertains to the flow depth, U denotes the flow velocity, U_c represents the critical velocity, D_50 encapsulates the average size of sediment particles, and S embodies the scour depth.
The abbreviations Max, Min and S.D stand for the maximum value, minimum value, and standard deviation of the datasets, respectively. As elucidated in the detailed descriptions outlined in Table 3, the simulation of relative scour depth is achieved through a fusion of laboratory and field data. To mitigate the impact of numerical scale variations, all model inputs have been normalized to a standardized range between zero and one.
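As an illustration of this preprocessing step, min-max scaling to the [0, 1] range can be sketched in Python as follows (the pier-diameter values below are hypothetical, not rows from the compiled dataset):

```python
import numpy as np

def min_max_normalize(x):
    """Scale an array to the [0, 1] range, as done for the model inputs."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min)

# Hypothetical pier-diameter column (m) spanning lab and field scales.
D = np.array([0.05, 0.10, 0.30, 1.50, 2.40])
D_norm = min_max_normalize(D)
```

In practice, the x_min and x_max obtained from the training data should be reused when scaling the test data, so that both splits share the same scale.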

Dimensional analysis
The scour hole depth can be elucidated through the consideration of three primary categories: (i) flow conditions, (ii) sediment characteristics, and (iii) bridge pier geometry. The relationship can be formulated as follows:

S = f(g, ρ, ρ_s, U, U_c, y, μ, D, D_50)  (17)

In the equation provided, g represents the acceleration due to gravity, ρ denotes the density of water, ρ_s signifies the density of sediment, U stands for the mean velocity of the flow, U_c represents the critical velocity of sediment particles, y represents the flow depth, μ denotes the dynamic viscosity of water, D stands for the diameter of the bridge pier, and D_50 represents the mean diameter of sediment particles. The three parameters D, U, and ρ were opted as repeating variables to extract dimensionless parameters using the Buckingham Pi theorem. The outcome of the dimensional analysis can be articulated as follows:

S/D = f(y/D, Fr, Fr_D, U/U_c, D_50/D, Re)  (18)

Here U/√(gy) is the Froude number (Fr) of the flow. The Reynolds number (Re), arising from μ, was omitted, given the inclusion of Fr and Fr_D. Consequently, Eq. (18) is simplified to the following form:

S/D = f(y/D, Fr, Fr_D, U/U_c, D_50/D)  (19)
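A minimal sketch of how the selected dimensionless groups could be computed from raw quantities is given below. The definition g' = g(ρ_s − ρ)/ρ for the reduced gravity in the pier Froude number is an assumption, since the paper does not spell it out:

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def dimensionless_groups(U, y, D, rho=1000.0, rho_s=2650.0):
    """Form the dimensionless inputs used in the paper.

    g' = G * (rho_s - rho) / rho (reduced gravity) is an assumption for
    the pier Froude number; the paper does not state its exact form.
    """
    Fr = U / math.sqrt(G * y)           # flow Froude number
    g_prime = G * (rho_s - rho) / rho   # reduced gravity (assumed definition)
    Fr_D = U / math.sqrt(g_prime * D)   # pier (densimetric) Froude number
    return Fr, Fr_D, y / D

# Illustrative values: U = 0.5 m/s, y = 0.2 m, D = 0.1 m.
Fr, Fr_D, y_over_D = dimensionless_groups(U=0.5, y=0.2, D=0.1)
```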

Gamma test
As elucidated by Koncar (1997), the gamma test is a nonparametric statistical method employed to estimate an output by identifying the optimal set of input-output datasets based on the best mean square error values. This method is introduced as a suitable approach for determining the most effective combination of diverse input variables to accurately describe the output. In this method, the dataset is supposed as {(x_i, y_i), 1 ≤ i ≤ M}, where the input vectors x_i ∈ R^m are m-dimensional vectors and the corresponding outputs y_i ∈ R are scalars. The vectors x influence the output y. The association among the input and output variables is summarized by a regression line fitted to near-neighbour statistics, where G and Γ represent the gradient and the intercept of the regression line, respectively. Smaller values of G and Γ indicate that the corresponding input variables are more suitable. In addition to these two criteria, an indicator denoted as V-Ratio = Γ/σ²(y), where Γ represents the gamma statistic and σ²(y) is the output variance, is employed to identify the optimal input parameters. The values of the V-Ratio range from 0 to 1. A V-Ratio value closer to zero for a given input combination signifies the effectiveness of those particular inputs. The various combinations of input variables have been delineated following the Mask format (Malik et al. 2021). Since Eq. (19) incorporates five dimensionless parameters, the Mask representation employs five digits corresponding to the five parameters. In this representation, the digits '1' and '0' signify whether an input is included ('1') or not included ('0'); for example, '10100' indicates that only the first and third inputs are included. It is important to highlight that, owing to the distinct ranges of variation for each parameter, all analyses have been conducted using normalized data, as described by the following equation:

x_normal = (x_i − x_min)/(x_max − x_min)

Here x_min and x_max are the minimum and maximum values of variable x, and x_normal is the normalized value of x_i. Table 4 displays the outcomes of the gamma test based on Eq. (19). From the data presented in Table 4, it is evident that the sixth model, which includes the parameters y/D, Fr and Fr_D (Mask '11100'), demonstrates the most favorable test results, characterized by the lowest values of Γ (0.052), G (0.372), and V-Ratio (0.305). Table 5 presents brief statistical characteristics of the input and output parameters. An overall graphical view of the scour depth variation is illustrated in Fig. 2.
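The gamma test itself can be sketched with a nearest-neighbour implementation along the following lines (a simplified illustration of the δ-γ regression, not the tool used by the authors):

```python
import numpy as np

def gamma_test(X, y, k=10):
    """Gamma test sketch: regress near-neighbour gamma on delta statistics.

    delta(p) is the mean squared distance from each point to its p-th
    nearest neighbour in input space; gamma(p) is half the mean squared
    difference of the corresponding outputs. A straight line fitted to
    (delta, gamma) gives the gradient G and the intercept Gamma, and
    V-Ratio = Gamma / Var(y).
    """
    X, y = np.asarray(X, float), np.asarray(y, float)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)              # exclude self-distances
    order = np.argsort(d2, axis=1)[:, :k]     # k nearest neighbours per point
    rows = np.arange(len(y))[:, None]
    delta = d2[rows, order].mean(axis=0)      # delta(p), p = 1..k
    gamma = 0.5 * ((y[order] - y[:, None]) ** 2).mean(axis=0)
    G, Gamma = np.polyfit(delta, gamma, 1)    # slope, intercept
    return Gamma, G, Gamma / y.var()

# Noise-free synthetic surface: the intercept Gamma should be near zero.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (200, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2
Gamma, G, v_ratio = gamma_test(X, y)
```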

Overview of MLMs involved
A general view of the GEP

Proposed by Ferreira (2001), the GEP constitutes a genetic algorithm that operates by managing a populace of individuals. These individuals are selected based on their fitness and subsequently subjected to genetic diversity through the application of one or more genetic operators, as expounded upon by Mitchell (1996). The GEP amalgamates diverse components, encompassing mathematical and logical expressions, polynomial constructs, decision trees, and assorted operators. The programming of GEP entails the utilization of linear chromosomes, which are articulated through expression trees (ETs). The procedural depiction of GEP's operational sequence is delineated in Fig. 3, as illustrated by the GEP simulation flowchart.
The initial phase involves generating an inaugural population derived from equations that constitute random amalgamations of a predefined array of functions. This assemblage encompasses mathematical operators within equations, alongside terminating elements like problem variables and constants. Proceeding to the subsequent stage, each constituent of the population is evaluated based on established fitness criteria. Subsequently, the third stage encompasses the generation of a fresh population via the deployment of equations. Advancing to the fourth stage, the preceding procedure is reiterated iteratively with the aim of attaining the highest possible yield of outcomes.
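GEP proper evolves linear chromosomes that are decoded into expression trees; the deliberately simplified genetic-programming sketch below illustrates the same evolutionary loop (random initial population, fitness evaluation, elitist selection, mutation) on a toy target, and is not the paper's implementation:

```python
import math
import random

OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b, '*': lambda a, b: a * b}

def random_tree(depth=3):
    # A tree is ('op', left, right), the variable 'x', or a constant.
    if depth == 0 or random.random() < 0.3:
        return 'x' if random.random() < 0.5 else random.uniform(-2, 2)
    op = random.choice(list(OPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == 'x':
        return x
    if isinstance(tree, (int, float)):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def fitness(tree, xs, ys):
    # RMSE of the candidate expression against the target samples.
    return math.sqrt(sum((evaluate(tree, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

def mutate(tree, depth=2):
    if random.random() < 0.2:
        return random_tree(depth)       # replace a subtree at random
    if isinstance(tree, tuple):
        op, left, right = tree
        return (op, mutate(left, depth), mutate(right, depth))
    return tree

# Toy target y = x^2 + 1 as a stand-in for the scour-depth relation.
xs = [i / 10 for i in range(-10, 11)]
ys = [x * x + 1 for x in xs]
random.seed(42)
pop = [random_tree() for _ in range(200)]
for gen in range(30):
    pop.sort(key=lambda t: fitness(t, xs, ys))
    pop = pop[:50] + [mutate(random.choice(pop[:50])) for _ in range(150)]
best = min(pop, key=lambda t: fitness(t, xs, ys))
```

Because the top 50 individuals are carried over unchanged each generation, the best fitness is monotonically non-increasing over the run.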

A general view of the SVM
Conceived by Vapnik (1995), the SVM stands as a nonlinear search algorithm employed for classification purposes, grounded in the structural risk minimization principle derived from statistical learning theory, as elucidated by Qaderi et al. (2020). Originally introduced for classification tasks, this algorithm underwent subsequent development, leading to an extended version designed for non-parametric regression analysis, referred to as support vector regression (SVR). At its core, the SVM draws upon the foundation of statistical learning theory. Analogous to regression equations, the linkage between the dependent variable Y and the independent variables x_i is formalized as an algebraic equation, encompassing a noise component, as depicted below:

Y = Σ_i W_i φ_i(x) + b

where φ_i(x) is the kernel (basis) function, b is the bias term of the regression function, and W_i are the weight vectors. Table 6 documents distinct categories of kernel functions. Notably, empirical evidence stemming from multiple studies has substantiated the superior efficacy of the radial basis function (RBF) over alternative kernel functions, as demonstrated by Dibike et al. (2001). Within the RBF formulation, two pivotal tuning parameters, specifically the penalty parameter denoted as C and the epsilon parameter symbolized as ε, are identified and calibrated with the aim of optimizing performance outcomes.
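As a sketch of how such a model is configured in practice, the following uses scikit-learn's SVR on synthetic stand-in inputs (assuming scikit-learn is available); the C, ε and γ values are the ones reported for the tuned SVM later in this paper, not a general recommendation:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, (300, 3))   # stand-ins for normalized y/D, Fr, Fr_D
y = 1.2 * X[:, 0] + np.sin(4 * X[:, 1]) * X[:, 2] + rng.normal(0, 0.02, 300)

# RBF-kernel SVR with the hyperparameters reported in the Results section.
model = SVR(kernel='rbf', C=63, epsilon=0.5, gamma=0.2)
model.fit(X[:210], y[:210])       # 70/30 train/test split, as in the paper
pred = model.predict(X[210:])
```

In a real calibration, C, ε and γ would be selected by cross-validated search rather than fixed up front.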

A general view of the M5
Model trees (MTs) are employed as a strategic approach to address intricate problems by partitioning them into more manageable subproblems. This technique entails the division of the parameter space into distinct subspaces, subsequently constructing a specialized linear regression model for each subset, referred to as a terminal, node, or leaf. The M5 algorithm, designed for the creation of model trees, establishes a hierarchical tree structure, frequently binary in nature. This structure encompasses splitting rules at nonterminal nodes and expert models at the terminal leaves. The M5 algorithm employs a divide-and-conquer principle, as visually depicted in Fig. 4.
Within the M5 algorithm, the standard deviation (SD) functions as the designated criterion for performing splits based on class distinctions. Furthermore, it computes the projected reduction in error resulting from evaluating each variable at the designated node. The formulation employed to calculate this reduction, known as the SDR, is central to the construction of the M5 model tree, and can be expressed as follows:

SDR = SD(T) − Σ_i (|T_i|/|T|) × SD(T_i)

where T represents the set of examples that reach the node, T_i denotes the sets of examples that have the i-th outcome of the potential split, and SD represents the standard deviation.
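The SDR criterion can be computed directly; a minimal sketch:

```python
import numpy as np

def sdr(parent, subsets):
    """Standard-deviation reduction used by M5 to score a candidate split.

    SDR = SD(T) - sum_i |T_i|/|T| * SD(T_i)
    """
    parent = np.asarray(parent, float)
    return parent.std() - sum(
        len(s) / len(parent) * np.asarray(s, float).std() for s in subsets
    )

# A split that separates small and large targets reduces the spread a lot.
T = [0.1, 0.2, 0.15, 1.0, 1.1, 0.9]
good_split = sdr(T, [[0.1, 0.2, 0.15], [1.0, 1.1, 0.9]])
poor_split = sdr(T, [[0.1, 1.0, 0.15], [0.2, 1.1, 0.9]])
```

M5 evaluates candidate splits this way and keeps the one with the largest SDR.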

A general view of the GBT
Statistically expounded upon by Breiman et al. (1984), Hastie et al. (2001), and De'ath and Fabricius (2000), contemporary decision trees (DTs) employ a strategic methodology for partitioning the predictor space into distinct rectangles. This process involves the sequential application of rules to delineate regions characterized by the highest degree of homogeneity in their responses to predictor variables. Illustrated in Fig. 5, each of these regions is associated with a constant value. In the context of classification trees, this constant value represents the most probable class. Conversely, for regression trees, the constant value signifies the mean response of observations within that specific region. It is noteworthy that regression trees operate under the assumption of errors conforming to a normal distribution, as stipulated by Hastie et al. (2001).
To improve the precision of DTs, boosting methods have been developed, based on the idea that it is easier to find and average many rough rules of thumb than to find a single, highly accurate prediction rule (Schapire 2003). Gradient boosting is one of the most common boosting methods, in which a DT of fixed size is utilized as the base learner and each new base learner improves the fitting quality of the ensemble; the result is the so-called gradient boosting tree (GBT). In the GBT, each subsequent tree is trained primarily on data that have been erroneously predicted by the previous trees. This makes the model focus more on complex cases and less on cases that are easy to predict (Breiman 1984).
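The residual-fitting idea behind gradient boosting can be sketched with one-feature regression stumps as base learners (an illustrative toy, not the GBT configuration used in the study):

```python
import numpy as np

def fit_stump(x, residual):
    """Best single-split regression stump on one feature (the base learner)."""
    best = (np.inf, None)
    for t in np.unique(x)[:-1]:
        left, right = residual[x <= t], residual[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, (t, left.mean(), right.mean()))
    return best[1]

def gbt_fit_predict(x, y, n_trees=200, lr=0.1):
    """Gradient boosting for squared loss: each stump fits current residuals."""
    pred = np.full_like(y, y.mean())
    for _ in range(n_trees):
        t, left_val, right_val = fit_stump(x, y - pred)
        pred = pred + lr * np.where(x <= t, left_val, right_val)  # shrink and add
    return pred

x = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * x)
pred = gbt_fit_predict(x, y)
rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
```

The learning rate lr shrinks each stump's contribution, which is the mechanism that makes later trees concentrate on whatever the earlier trees left unexplained.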

A general view of the MARS
Developed by Friedman (1991) as a nonparametric regression model, the MARS is an algorithm with remarkable performance in estimating and simulating the interaction between the input and target parameters of a linear or nonlinear continuous dataset. The MARS system fits an adaptive nonlinear regression model using multiple piecewise-linear basis functions hierarchically ordered in consecutive splits over the predictor variable space. In other words, it is a high-precision technique for modeling systems based on the dataset. The generalized form of the MARS model can be expressed as follows:

y = c_0 + Σ_{n=1}^{N} c_n Π_{k=1}^{K_n} H_{kn}(x_{(k,n)})

where y is the output parameter, c_0 is the constant term, and N is the number of basis functions. H_{kn}(x_{(k,n)}) is the basis function, where x_{(k,n)} is the predictor in the k-th component of the n-th product.
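The hinge (piecewise-linear) basis functions at the heart of MARS can be sketched as follows; the knot and coefficients here are illustrative, not the fitted values from Table 11:

```python
import numpy as np

def hinge(x, knot, sign=+1):
    """Piecewise-linear MARS basis function max(0, sign * (x - knot))."""
    return np.maximum(0.0, sign * (x - knot))

# A toy MARS-style model y = c0 + c1*max(0, x - 0.4) + c2*max(0, 0.4 - x),
# with illustrative coefficients c0 = 0.2, c1 = 1.5, c2 = -0.8.
x = np.linspace(0, 1, 5)
y_hat = 0.2 + 1.5 * hinge(x, 0.4) - 0.8 * hinge(x, 0.4, sign=-1)
```

Each pair of mirrored hinges shares a knot, so the fitted surface is continuous but can change slope abruptly there.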

A general view of the MLP
MLPs represent a fundamental and versatile class of the ANN that have found widespread application in various fields. The MLPs are a type of feedforward artificial neural network characterized by their layered structure. Their architecture includes three main parts as follows: (i) input layer; (ii) hidden layer; and (iii) output layer. The input layer of an MLP receives the initial data or features and transmits them to the hidden layers. Each neuron in the input layer corresponds to a feature in the input data. MLPs can have one or more hidden layers between the input and output layers. These hidden layers contain neurons (or nodes) that apply weighted sums and activation functions to their inputs. The number of hidden layers and neurons in each layer is a crucial architectural choice. The output layer produces the final result or prediction of the network. The number of neurons in the output layer depends on the nature of the task. MLPs are trained using supervised learning, where they learn to map input data to target output values. The most common training algorithm for MLPs is backpropagation, coupled with gradient descent or its variants. This process involves adjusting the weights and biases of the neurons to minimize a predefined loss function, typically mean squared error for regression tasks and cross-entropy for classification tasks. Neurons in MLPs use activation functions to introduce nonlinearity into the model. Common activation functions include sigmoid, hyperbolic tangent (tanh), and rectified linear unit (ReLU). The choice of activation function can significantly impact training and model performance.
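A single-hidden-layer forward pass, the core computation of an MLP regressor, can be sketched as follows (random, untrained weights purely for illustration):

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One hidden layer with ReLU activation, linear output (regression head)."""
    h = np.maximum(0.0, x @ W1 + b1)  # hidden layer: weighted sum + ReLU
    return h @ W2 + b2                # output layer: linear combination

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 8)), np.zeros(8)  # 3 inputs -> 8 hidden units
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)  # 8 hidden -> 1 output
x = rng.uniform(0, 1, (5, 3))                  # five rows of normalized inputs
out = mlp_forward(x, W1, b1, W2, b2)
```

Training would repeat this pass, compare `out` against targets with a mean-squared-error loss, and adjust the weights by backpropagation.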

NLR models
Various empirical and experimental formulas have been proposed to estimate scour hole depth based on flow, sediment, and bridge pier characteristics. The formulas that are most compatible with the data collected in this research work are presented in Table 7 and are used to compare performance between the experimental models and the DDMs.

Analyzing performance through statistical metrics
The performance of the DDMs and empirical models is appraised using the root-mean-square error (RMSE), mean absolute error (MAE), and coefficient of determination (R²). These indices are defined as follows:

RMSE = √((1/N) Σ_{i=1}^{N} (O_i − P_i)²)

MAE = (1/N) Σ_{i=1}^{N} |O_i − P_i|

R² = 1 − Σ_{i=1}^{N} (O_i − P_i)² / Σ_{i=1}^{N} (O_i − Ō)²

Here O and P are the observed and predicted values of scour depth, respectively, and N is the total number of data points. The aforementioned indices represent average error values of the implemented models. To complement them, the developed discrepancy ratio (DDR) statistic is employed:

DDR = (predicted value / observed value) − 1

For better judgment and visualization, the Gaussian function of the DDR values should be illustrated in a standard normal distribution. To this end, the DDR values of the scour depths are first standardized, and then, using the Gaussian function, the normalized value of DDR (S_DDR) is calculated. Secondly, the values of S_DDR are plotted against the standardized values of scour depth (Z_DDR). In the Z_DDR vs. S_DDR graph, a stronger tendency of the error distribution towards the centerline and larger values of S_DDR indicate higher precision (Noori et al. 2010).
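These indices can be computed directly; the sketch below uses the conventional 1 − SSres/SStot form for R², together with DDR = predicted/observed − 1, on hypothetical observed/predicted pairs:

```python
import numpy as np

def scour_metrics(obs, pred):
    """RMSE, MAE, R^2 and the discrepancy ratio DDR = pred/obs - 1."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    rmse = np.sqrt(np.mean((obs - pred) ** 2))
    mae = np.mean(np.abs(obs - pred))
    ss_res = ((obs - pred) ** 2).sum()
    ss_tot = ((obs - obs.mean()) ** 2).sum()
    r2 = 1.0 - ss_res / ss_tot
    ddr = pred / obs - 1.0        # one value per data point
    return rmse, mae, r2, ddr

# Hypothetical relative scour depths (not rows from the study's dataset).
obs = np.array([0.10, 0.25, 0.40, 0.60])
pred = np.array([0.12, 0.24, 0.38, 0.63])
rmse, mae, r2, ddr = scour_metrics(obs, pred)
```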

Results and discussion
Using the performance evaluation metrics, the simulation accuracy of each DDM has been assessed and is presented in Table 8. This table provides a comprehensive overview of the performance of the DDMs during both their training and testing phases. The data have been partitioned into a 70% training set and a 30% testing set. In addition to the statistical metrics delineated in Table 8, an examination of the residual distribution and the alignment between observed and calculated data, as depicted by the compliance curve, has been employed to assess the fidelity of the model simulations. In this context, Table 9 showcases several key statistical properties pertaining to the residual errors associated with each model's output.
In reference to Table 8, the performance indices (RMSE, MAE, R², DDR_max) for the SVM model during the training and testing stages are (0.1009, 0.0726, 0.9401, 2.9237) and (0.023, 0.017, 0.984, 5.301), respectively. Furthermore, the associated hyperparameters for the SVM model, specifically the setting parameters C, ε and γ, are set to 63, 0.5, and 0.2, respectively. Additionally, the radial basis function (RBF) has been selected as the kernel function for the SVM model. The error variation range exhibits fluctuations within the span of −0.366 to 0.076 throughout the training phase, and a narrower range of −0.084 to 0.065 during the subsequent testing phase (Table 9). Furthermore, noteworthy is the substantial decrease in the mean error value, which falls sharply from −0.065 in the training phase to −0.005 in the test phase. This reduction is further underscored by the total error diminishing markedly, plummeting from −20.642 during the training phase to a mere −0.707 in the test phase. The tabulated data unequivocally establish that this model possesses the most favorable statistical indices related to errors, signifying its superiority over the other models. Figure 6 depicts the graphical representation of data fitting for the SVM model, illustrating its conformity with the observed data and the distribution of residuals. The salient observation and overarching inference gleaned from this graphical representation are that the model's precision is notably high for small scour values, whereas the model exhibits larger errors for larger data values. (Fig. 9 shows the output of the M5 model through the training and the testing phases.)
For the GEP model, the performance metrics are RMSE = 0.2229, MAE = 0.1674, R² = 0.7796 and DDR_max = 1.2109 for the training phase, and RMSE = 0.114, MAE = 0.071, R² = 0.872 and DDR_max = 1.553 for the testing phase. These performance indicators have been computed based on the setting parameter values specified in Table 10. The structural representation of the GEP model, including the functions utilized, is visually represented in the form of a tree expression in Fig. 7. Additionally, the specific values of the constants employed in Fig. 7 are as follows: G1C0 = 9.322418, G1C1 = 3.138397, G2C0 = −0.491638, G2C1 = 2.819794, G3C0 = 0.961823, G3C1 = 5.220947. This observation suggests a tendency for overestimation within the GEP model. Referring to Table 9, the error fluctuation range observed during the training period spans from −0.674 to 0.235, while in the subsequent test phase, it narrows to a range of −0.639 to 0.182. This marked reduction is vividly apparent in the average error values, diminishing notably from −0.142 in the training phase to −0.055 in the test phase. Such a discernible trend is further corroborated by the total error values. In the presented figures, the disparity between the observed and computed values becomes readily apparent, particularly for data points with substantial values. Notably, when assessing the comparative performance of the M5 and GBT models, it is evident that the latter yields outputs with a higher degree of relative accuracy. Moreover, a conspicuous tendency towards overestimation is unmistakably evident in both of these models.
The accuracy assessment of the MARS model outputs is predicated on several statistical indicators, including RMSE, MAE, R², and DDR_max. In the training phase, these indicators yield values of 0.2022, 0.1441, 0.8373, and 1.4324, respectively. In the test phase, the corresponding values are 0.090, 0.057, 0.917, and 1.883, respectively. The relative scour depth can be computed using a mathematical relationship built from the basis functions (BFs) detailed in Table 11. The values 0.156 and −0.716 denote the upper and lower bounds of the errors observed during the training period, while during the test phase the range contracts to between −0.389 and 0.130 (Table 9). A noteworthy decrease in the cumulative error is evident, with a reduction of nearly eightfold, transitioning from −41.629 during the training phase to −5.501 during testing, underscoring the model's improved performance. Figure 12 depicts the output of the MARS model. For the MLP model, the performance metrics yield values of (0.1545, 0.1116, 0.8781, 1.8315) for the training dataset and (0.058, 0.041, 0.964, 2.924) for the testing dataset. Furthermore, a visual representation of the MLP model's output is depicted in Fig. 13. Upon revisiting Table 9, it becomes evident that, in the case of the MLP model, the fluctuations in error values during the training period span from −0.513 to 0.137, while during the test period they range from −0.291 to 0.019. Furthermore, the mean error has witnessed a noteworthy reduction, declining from −0.097 in the training phase to −0.039 in the test phase. This performance enhancement is underscored by a 30% reduction in the total error index, as clearly indicated by the tabulated figures.
In our comprehensive comparison of the DDMs employed in this research, we leverage the distribution curve of observational and computational data plotted around the ideal 1:1 line, as illustrated in Fig.

Regression equations
In this section, we assess the outcomes derived from the regression equations indicated in Table 7. The efficacy of predicting relative scour depth is presented in Table 12, which portrays the quality of these predictions. Within Table 12, it is evident that the statistical performance evaluation metrics for both models exhibit remarkable proximity to each other. However, the most substantial disparity lies in the value of the DDR_max index, where the US DOT (2003) equation achieves a notably higher score of 0.9931, in contrast to the Aksoy and Eski (2016) equation, which yields a lower score of 0.6863. Additionally, as per the data provided in Table 13, the residual indicators for both models manifest nearly identical values. Figure 16 visually portrays the distribution of residuals for the experimental equations, revealing a pronounced non-compliance trend among data points with higher values. Moreover, Fig. 17 illustrates the distribution of data estimated by the empirical equations, with points closely clustered around the 1:1 line. This clustering, particularly evident in the predictions made by the US DOT (2003) equation, underscores its relative superiority. Lastly, Fig. 18 reinforces the notion of the US DOT (2003) equation's superior performance in comparison with the equation presented by Aksoy and Eski (2016), as evidenced by the higher peak value along the vertical axis.

Conclusion
Scour phenomena around bridge piers are inherently intricate, necessitating a comprehensive understanding of their underlying mechanisms in order to effectively assess and predict scour hazards. To date, the development of precise methods for estimating scour depth remains an ongoing challenge. In the contemporary context, machine learning techniques have emerged as potent tools for predicting scour depth, leveraging experimental data to enhance our predictive capabilities in this domain. This study undertakes a comprehensive comparative analysis to evaluate the efficacy of various DDMs, specifically the SVM, the GEP, the MLP, the GBT, the M5, the MARS and two experimental equations, in the computation of scour depth around circular bridge piers. The outcomes of this investigation affirm the capacity and potential of DDMs in forecasting the scour depth of bridge piers, with the SVM exhibiting notably enhanced relative precision in comparison to the alternative models. The MLP, the MARS, the GEP, the GBT and the M5 models rank behind the SVM, in that order. For the purpose of juxtaposing and assessing the relative accuracy of the results derived from the DDMs, empirical equations were also employed to assess the scour depth of bridge foundations. The precision of the outputs generated by this subset of equations places them in lower echelons when ranked against the DDMs. In a holistic appraisal, it can be posited that both categories, namely DDMs and empirical equations, exhibit proficiency in scour depth prediction. Nonetheless, the utilization of AI-based models yields more precise outcomes, as elucidated by the findings expounded by the researchers listed in Table 2, albeit predicated upon the availability of an extensive repository of recorded data encompassing both independent and dependent variables, which serves as a preliminary and indispensable prerequisite for the application of these models.
Study | Models compared | Key finding
Singh et al. (2022) | Multi-level ensemble machine learning (ML), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), random forest (RF) | The ML ensemble was superior to the standalone ensemble techniques
Choi and Choi (2022) | SVM, ANFIS, GEP, nonlinear regression (NLR) | The SVM predicted the maximum scour depths better than GEP, ANFIS and NLR
Pandey et al. (2022) | GBDT, cascaded forward neural network (CFNN), kernel ridge regression (KRR) | The GBDT model outperformed the CFNN and KRR models in both scenarios
Dang et al. (2021) | ANN-LM, ANN-PSO, ANN-FA (firefly algorithm) | The ANN-PSO hybrid model was the most superior
Qaderi et al. (2020) | SVM, GEP, ANFIS, ANN, GMDH | ANFIS was the superior model in terms of all statistical criteria in both the training and testing phases
Pandey et al. (2020) | Multiple linear regression (MLR), genetic algorithm (GA) | The GA led to more precise predictions of scour depth than the MLR approach
Adib et al. (2020) | Multilayer perceptron (MLP), ANFIS, SVM | MLP performed better than the others
Bateni et al. (2019) | GEP, MARS | Both models performed better than the regression-based empirical equations
Sreedhara et al. (2019) | Particle swarm optimization (PSO)-tuned SVM, ANFIS | The PSO-SVM model can be adopted as an accurate and efficient alternative for predicting pier scour depth
Goel (2019) | M5 tree | Results indicated improvement to a great extent
Parsaie et al. (2018) | GMDH, SVM, MLP | All developed models performed suitably; the SVM model was slightly more accurate
Ebtehaj et al. (2017) | Self-adaptive extreme learning machine (SAELM), ANN, SVM | SAELM performed better than the other models
Sharafi et al. (2016) | SVM, ANN, ANFIS, NLR | SVM-polynomial predicted scour depth with higher accuracy and lower error
Akib et al. (2014) | ANFIS, linear regression (LR) | ANFIS's results were highly accurate, precise and satisfactory
Pal et al. (2011) | ANN-BPNN, ANN-GRNN, SVR | SVR was the better scour depth predictor for the dataset used
Firat (2009) | ANN (RBNN), ANFIS, multiple linear regression (MLR) | The ANFIS model produced the best results in predicting scour depth

Fig. 1 Flowchart applied in the present paper to select the superior predictions

Fig. 2 A 3D view of scour depth mapping

Here Fr_D is the Froude number of the sediment particle, U/√(g′D) is the Froude number (Fr_P) of the bridge pier, and Uy/ν is the Reynolds number (Re) of the flow. Due to the presence of turbulent flow conditions, Re was excluded from the analysis. Additionally, the parameter U/U_c …

Fig. 3 Flowchart of the GEP
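The dimensionless inputs defined above can be computed directly from the flow and pier properties. The sketch below is illustrative only: it assumes g′ is the reduced gravity g(ρs − ρ)/ρ with typical sediment (ρs = 2650 kg/m³) and water (ρ = 1000 kg/m³) densities, and uses hypothetical flume values for U, y and D.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)

def froude(U, y):
    """Flow Froude number Fr = U / sqrt(g * y)."""
    return U / math.sqrt(G * y)

def pier_froude(U, D, rho_s=2650.0, rho=1000.0):
    """Pier Froude number Fr_P = U / sqrt(g' * D), with the reduced
    gravity g' = g * (rho_s - rho) / rho (an assumed definition)."""
    g_prime = G * (rho_s - rho) / rho
    return U / math.sqrt(g_prime * D)

# Hypothetical flume conditions: U = 0.5 m/s, depth y = 0.2 m, pier D = 0.05 m
print(froude(0.5, 0.2))       # ~0.357
print(pier_froude(0.5, 0.05)) # ~0.556
```

Together with the relative scour depth y/D, these two numbers form the input vector presented to each DDM.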

Fig. 6 Distribution of dataset and residuals for the SVM

Fig. 7 Tree expression of the GEP output
Fig. 8

Fig. 10 Distribution of dataset and residuals for the M5

Fig. 11 Distribution of dataset and residuals for the GBT

For the M5 model, the error indices amount to 0.284 (RMSE), 0.183 (MAE), 0.698 (R2) and 0.591 (DDRmax). In contrast, the GBT model demonstrates similar trends, with values of 0.3667 (RMSE), 0.2599 (MAE), 0.6708 (R2) and 0.7579 (DDRmax) during the training phase, and 0.125 (RMSE), 0.084 (MAE), 0.819 (R2) and 1.147 (DDRmax) in the testing phase. Figure 9 provides a visual representation of the output structure of the M5 model, illustrating its performance during both training and testing phases. For the M5 model, a notable contrast exists between the minimum and maximum error values, which range from −1.931 to 0.459 during training and from −1.611 to 0.157 in the test phase (Table 9). This discrepancy is further emphasized by the average error values, which stand at −0.286 in the training phase and −0.162 in the test phase. Remarkably, the cumulative errors of this model reach substantial magnitudes, amounting to −593.91 during training and −21.830 during testing. For the GBT model, the errors range from −1.220 to 0.365 during training and from −0.591 to 0.325 during testing. Notably, the total error decreases nearly tenfold, from −68.893 during training to −6.917 during testing. Figures 10 and 11 reveal a conspicuous lack of alignment between observed and computed data, underscored by the substantial residual errors of both models.
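The residual indicators quoted above (minimum, maximum, average and cumulative error) all derive from the elementwise residual e = observed − predicted. A minimal helper illustrating the bookkeeping, with hypothetical data rather than the study's own samples:

```python
import numpy as np

def residual_summary(obs, pred):
    """Summarize residuals e = obs - pred: min, max, mean and cumulative error."""
    e = np.asarray(obs, dtype=float) - np.asarray(pred, dtype=float)
    return {
        "min": float(e.min()),
        "max": float(e.max()),
        "mean": float(e.mean()),
        "total": float(e.sum()),  # cumulative error over all samples
    }

# Hypothetical observed/predicted relative scour depths
obs  = [1.0, 1.4, 0.9, 1.2]
pred = [1.2, 1.3, 1.1, 1.0]
print(residual_summary(obs, pred))
```

Note that a small total (cumulative) error can mask large individual residuals of opposite sign, which is why the min/max range is reported alongside it.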

Fig. 12 Distribution of dataset and residuals for the MARS

Fig. 13 Distribution of dataset and residuals for the MLP

Fig. 14 Scatter plot of observed vs. predicted values of relative scour depth for DDMs
Fig. 15

Points situated closer to the 1:1 line in Fig. 14 signify the relative superiority of a given model's output. Notably, the black filled dots in this figure represent the performance of the SVM model, which stands out clearly and unequivocally as the most superior among the models under consideration. Furthermore, as an additional metric for comparing the data-driven models, we analyze the graphical characteristics of the DDR index, depicted in Fig. 15. The compactness of the curve in proximity to the vertical axis and the heightened peak value along the vertical axis serve as indicators of superior model accuracy.
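The DDR curves of Fig. 15 can be constructed from the discrepancy ratio of each sample. The sketch below assumes one common definition, DDR = predicted/observed − 1, standardizes it to zero mean and unit variance, and uses hypothetical data; both the exact definition used in the paper and the values here are assumptions for illustration.

```python
import numpy as np

def ddr(obs, pred):
    """Discrepancy ratio: DDR = pred/obs - 1 (zero means perfect agreement)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return pred / obs - 1.0

def standardized_ddr(obs, pred):
    """Standardize DDR to zero mean / unit variance; a taller, narrower
    density of these values around zero indicates a better model."""
    d = ddr(obs, pred)
    return (d - d.mean()) / d.std()

# Hypothetical observed/predicted relative scour depths
obs  = [0.8, 1.1, 1.5, 0.9, 1.3]
pred = [0.7, 1.2, 1.4, 1.0, 1.2]
s = standardized_ddr(obs, pred)
print(s.mean(), s.std())  # ~0 and ~1 by construction
```

Plotting a density estimate of these standardized values produces the bell-shaped curves compared in Fig. 15, where a higher peak near the vertical axis marks the better predictor.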

Fig. 16 Distribution of dataset and residuals for empirical equations

Fig. 17 Scatter plot of observed vs. predicted values of relative scour depth for empirical equations
Fig. 18

Table 3 Statistical characteristics of collected datasets

Table 6 Types of kernel functions

Table 8 The outcome assessment metrics of the DDMs included

Table 10 Setting parameters of GEP to predict scour depth

Table 11 The BFs constructed by the MARS model with their corresponding coefficients

Table 12 The outcome assessment metrics of the empirical predictors