Hybrid model of support vector regression and innovative gunner optimization algorithm for estimating ski-jump spillway scour depth

The scour hole that forms downstream of hydraulic structures threatens their safety and stability. Scour around such structures is a complex and important hydraulic phenomenon; hence, accurate estimation of scour depth requires data-intensive research. Although many analytical models are available for scour depth estimation, they suffer from serious limitations. In this research, a support vector regression (SVR) model and a hybrid of SVR with the metaheuristic algorithm of innovative gunner (SVR-AIG) have been developed for accurate prediction of scour depth downstream of ski-jump spillways. Field measurements including head and discharge intensity are used for developing the models. The performances of the models are compared using root mean square error (RMSE), mean absolute error (MAE), and correlation coefficient (CC) criteria and some statistical plots. The results showed that the hybrid SVR-AIG estimations (with CC = 0.987, 0.991, RMSE = 2.839, 1.987, and MAE = 2.247, 1.201) are more accurate than the standalone SVR estimations (with CC = 0.942, 0.975, RMSE = 5.686, 4.040, and MAE = 4.114, 3.201) at the training and testing phases, respectively. This study is an important reference for analyzing the high capability of the AIG as an optimization tool in improving the scour estimations of a standalone model. Also, this algorithm eliminates the trial-and-error procedure for optimizing the internal parameters during model development.


Introduction
Spillways are major parts of dams for controlled and uncontrolled disposal of excess inflows to reservoirs, especially in flood conditions. Downstream of the spillways, at the point of contact of the jet with the riverbed, a hole is excavated into the soil and rocks due to the high energy of the water. Therefore, for high-flow discharges, spillways are equipped with energy dissipaters such as ski jumps to reduce the downstream scouring. There are many types of spillways, of which the ski-jump bucket type is more commonly used. The energy dissipation in such a spillway takes the form of a jet of water leaving the bucket lip into the air and then falling into the plunge pool formed at the point of impact on the tailwater. To reduce heavy soil erosion and thereby dam failures, and to evaluate the stability of the dam and other hydraulic structures, accurate estimation of downstream scouring is crucial.
To study erosion and scour downstream of hydraulic structures, physically based hydraulic models, such as the Navier-Stokes and associated equations solved by computational fluid dynamics (CFD) methods of finite element and finite volume, have been widely applied (Zhang et al. 2014). However, owing to their cost and their complexity in design and analysis, these models have become inefficient and time-consuming. Hence, researchers investigate other fast, easy, and accurate methods for scour estimation in hydraulic studies. Recently, hydraulic experts have employed soft computing techniques to estimate scouring in many studies (Muzzammil 2008; Guven et al. 2009; Adarsh 2010; Ebtehaj and Bonakdari 2013; Raikar et al. 2016; Najafzadeh et al. 2017; Parsaie et al. 2018; Abdollahpour et al. 2019). The main goal of prediction with AI techniques has been to follow recent developments in order to obtain the best model performance (Khatibi et al. 2017). In the last decade, artificial neural networks (ANN), as artificial intelligence (AI) techniques, have been widely employed in predicting hydraulic and other flow parameters (Muzzammil 2008; Emamgholizadeh 2012; Onen 2014; Raikar et al. 2016; Pourzangbar et al. 2017). However, due to many uncertainties in ANN modeling techniques, researchers have made many attempts to improve model efficiency by applying optimization algorithms or developing other AI methods. The support vector regression (SVR) model has been widely used in scour hole modeling in recent times (e.g., Goyal and Ojha 2011; Sharafi et al. 2016; Hoang et al. 2018; Sun et al. 2021). It is important to note that the dimensionality of the input space in the SVR model does not affect the computational complexity. Moreover, the model's prediction accuracy is insensitive to outliers in the datasets. On the other hand, the SVR model requires large datasets for model development, can suffer from dimensionality, and is computationally demanding (Awad and Khanna 2015).
The advantage of hybrid models over standalone models is that they combine the strengths of each component model and offset the weaknesses, which improves the overall performance of the developed models. Many researchers have taken advantage of hybrid models in hydrologic and hydraulic problems. Chou and Pham (2014) applied a hybrid smart artificial firefly colony algorithm (SAFCA)-based support vector regression (SAFCAS) model to scour depth near bridge piers. Their hybrid model integrates the firefly algorithm (FA), chaotic maps, adaptive inertia weight, Lévy flight, and the SVR model for scour depth modeling. The model's performance was compared and assessed against other numerical and empirical models. Salih et al. (2020) combined an enhanced binary particle swarm optimization (PSO) algorithm with the SVR model (tBPSO-SVR) for submerged weir scour modeling. The comparison of the developed hybrid model with other machine learning models shows that tBPSO-SVR outperforms the other selected models.
Based on the literature, the efficiency of the algorithm of innovative gunner (AIG) as an optimizer in soft computing hybrid models has been little explored in water-related problems; hence, further research is needed to assess its performance in the field of water resources. To the best of the authors' knowledge, no studies have explored the applicability of the hybrid SVR-AIG model to scour hole modeling. Hence, in this study, the effectiveness of the SVR-AIG hybrid model relative to the standalone SVR model in scour depth modeling has been studied. Taylor diagrams, scatter plots, and violin plots, along with other measurement indices, are used to check the improvement in SVR performance obtained by employing the AIG optimization algorithm.

Case study and data description
Conventionally, the scour depth downstream of ski-jump bucket spillways (Fig. 1) has been estimated using various empirical equations derived from experimental datasets. The Veronese equation is among these, in which scour can be estimated as (Yildiz and Uzucek 1994):

$$d_s = 1.90\, H_1^{0.225}\, q^{0.54} \qquad (1)$$

where $d_s$ is the vertical depth of scour below tailwater, $H_1$ is the effective energy of the jet entering tailwater, and $q$ represents specific discharge. Wu (1973) suggested another equation for estimating relative scour in ski-jump spillways as follows:

$$\frac{d_s}{H_1} = 2.11 \left(\frac{q}{\sqrt{g H_1^{3}}}\right)^{0.51} \qquad (2)$$

where $g$ is the gravitational acceleration. Martins (1975) also proposed an equation for the estimation of scour depth (Eq. 3).

In the present work, the data of previous experimental works, obtained from the mentioned and other traditional prediction formulae based only on $q$ and $H_1$, have been collected and compiled in Table 1 to investigate the usefulness of the SVR-AIG approach for predicting scour depth downstream of ski-jump spillways. To elaborate more, (
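As an illustration of how such empirical formulae are evaluated, the following minimal Python sketch computes scour depth from $q$ and $H_1$. The Martins coefficients are those of Eq. (3); the Veronese coefficients (1.90, 0.225, 0.54) are the commonly quoted ones and are an assumption here. The function names and the input values are hypothetical, and SI units (m, m²/s) are assumed.

```python
def veronese_scour(q: float, h1: float) -> float:
    """Scour depth below tailwater from the commonly quoted Veronese form
    d_s = 1.90 * H1^0.225 * q^0.54 (coefficients assumed, SI units)."""
    return 1.90 * h1 ** 0.225 * q ** 0.54


def martins_scour(q: float, h1: float) -> float:
    """Scour depth from the Martins form d_s = 1.5 * q^0.6 * H1^0.1 (Eq. 3)."""
    return 1.5 * q ** 0.6 * h1 ** 0.1


# Hypothetical inputs: specific discharge (m^2/s) and head (m).
q_example, h1_example = 10.0, 20.0
print("Veronese d_s:", round(veronese_scour(q_example, h1_example), 2))
print("Martins  d_s:", round(martins_scour(q_example, h1_example), 2))
```

Both formulae depend only on $q$ and $H_1$, which is the same input pair used later for the data-driven models.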

Support vector regression (SVR)
The SVR model was first developed by Vapnik (1995) for regression and classification problems. The SVR model approach is considered a nonparametric approach, as it mainly relies on kernel functions. This model is developed based on statistical learning theory for structural risk minimization (Safavi and Esmikhani 2013). Also, this aims to decrease the learning machine's confidence interval and empirical risk for attaining a strong generalization capacity (Raghavendra and Deka 2015).

$$d_s = 1.5\, q^{0.6}\, H_1^{0.1} \qquad (3)$$

Fig. 2 Flowchart of SVR-AIG model
A set of training data, $\{(x_i, d_i)\}_{i=1}^{N}$, is considered for model development, where $x_i$ is the input vector, $d_i$ is the target value, and $N$ is the number of observations. The regression function of SVR can be written as follows:

$$f(x) = \sum_{i} \omega_i\, \phi_i(x) + b$$

The variables are defined as: $\omega_i$ are the components of the weight vector, $b$ is a bias, and $\phi_i$ is a nonlinear transfer function that maps the input vectors into a high-dimensional feature space, where a simple linear regression can deal with the complex nonlinear regression of the input space.
The SVR model minimizes the ε-insensitive loss function by solving the following optimization problem:

$$\min_{\omega,\, b,\, \xi,\, \xi^*}\ \frac{1}{2}\lVert \omega \rVert^{2} + C \sum_{i=1}^{N} \left(\xi_i + \xi_i^{*}\right)$$

$$\text{subject to}\quad d_i - f(x_i) \le \varepsilon + \xi_i,\qquad f(x_i) - d_i \le \varepsilon + \xi_i^{*},\qquad \xi_i,\ \xi_i^{*} \ge 0$$

where $f(x_i)$ is the regression estimate for the input $x_i$; $\xi_i$ and $\xi_i^{*}$ represent the slack variables, which account for training errors that exceed the error tolerance $\varepsilon$; $C$ is a positive trade-off parameter that sets the weight of the empirical error in the optimization problem; and $d_i$ is the desired or target value (Suryanarayana et al. 2014). Mathematical manipulations are adopted for transforming the objective function into the dual formulation. The dual problem is then cast as a quadratic programming problem, the approach first employed to solve SVR, which guarantees a global minimum. During the training stage of the SVR model, the selection and application of the optimization algorithm are critical, as they determine the precision of the optimization variables, the training speed, and the memory requirements. As a result, the optimization expressed in Eq. 4 is utilized (Suryanarayana et al. 2014). That formulation applies to linear regression problems; for nonlinear regression cases, a modified version of Eq. (4) that incorporates the kernel function $k(x_i, x)$ is used (Isazadeh et al. 2017). The selection of suitable internal parameters is important for better prediction accuracy (Zounemat-Kermani et al. 2016). For the present study, a linear kernel is selected.
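The quantities described above can be made concrete with a short Python sketch. It is a textbook-style illustration rather than the authors' MATLAB implementation: it evaluates the ε-insensitive loss, the primal objective for a linear model, and a kernel-expansion prediction of the standard form $f(x) = \sum_i c_i\, k(x_i, x) + b$. The coefficients $c_i$ (the difference of the two dual multipliers in the standard dual solution) and all function names are assumptions introduced for illustration.

```python
def linear_kernel(u, v):
    """Linear kernel: the plain inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def eps_insensitive_loss(target, estimate, eps):
    """Zero inside the eps-tube around the target, linear outside it."""
    return max(0.0, abs(target - estimate) - eps)


def primal_objective(w, b, xs, ds, C, eps):
    """0.5*||w||^2 + C * sum of slacks for a linear model f(x) = w.x + b.
    For fixed (w, b) the smallest feasible slack of a sample equals its
    eps-insensitive loss, so the loss is summed directly."""
    regularizer = 0.5 * linear_kernel(w, w)
    slack = sum(eps_insensitive_loss(d, linear_kernel(w, x) + b, eps)
                for x, d in zip(xs, ds))
    return regularizer + C * slack


def kernel_predict(x, support_xs, coeffs, b, kernel=linear_kernel):
    """Kernel expansion f(x) = sum_i c_i * k(x_i, x) + b."""
    return sum(c * kernel(xi, x) for xi, c in zip(support_xs, coeffs)) + b


# Hypothetical two-sample example with a linear kernel.
xs, ds = [[1.0, 0.0], [2.0, 0.0]], [1.0, 4.0]
print(primal_objective([1.0, 0.0], 0.0, xs, ds, C=10.0, eps=0.5))
```

With a linear kernel, the kernel expansion reduces to an ordinary linear model whose weight vector is $\sum_i c_i x_i$, which is why the choice of kernel decides whether the fitted relation is linear or nonlinear in the inputs.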
Some advantages and disadvantages of the SVR model are: (i) It is robust to the outliers.

Algorithm of innovative gunner (AIG)
Pijarski and Kacejko (2019) developed the AIG, which is considered one of the fast metaheuristic optimization algorithms for solving many optimization problems. Its other advantages include a high convergence rate and a short optimization time with high accuracy. Previous literature reports that it is more efficient than other known swarm intelligence algorithms. Detailed information about the algorithm is given by Dehghani and Poudeh (2021a, b). The overall steps involved in developing the SVR-AIG model are detailed in the flowchart shown in Fig. 2. The pseudo-code of the AIG algorithm, found in Roshni et al. (2022), is also given in Fig. 3, which further elaborates the high-level working of the proposed scheme.
Both the SVR and hybrid SVR-AIG models were coded in MATLAB software.
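The general idea of a hybrid model, an optimizer that searches the SVR internal parameters instead of manual trial and error, can be sketched in Python as follows. This is not the AIG: plain random search is used as a stand-in, because the AIG update rules are not reproduced in this text. The linear SVR is fitted by simple subgradient descent, and the choice of tuned parameters ($C$ and $\varepsilon$), their search ranges, the toy data, and all function names are assumptions made for illustration only.

```python
import math
import random


def fit_linear_svr(xs, ds, C, eps, epochs=300, lr=0.05):
    """Subgradient descent on 0.5*||w||^2/(C*N) + mean eps-insensitive loss."""
    n, dim = len(xs), len(xs[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [wj / (C * n) for wj in w], 0.0
        for x, d in zip(xs, ds):
            residual = sum(wj * xj for wj, xj in zip(w, x)) + b - d
            if abs(residual) > eps:  # only samples outside the eps-tube count
                sign = 1.0 if residual > 0 else -1.0
                grad_w = [g + sign * xj / n for g, xj in zip(grad_w, x)]
                grad_b += sign / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rmse(xs, ds, w, b):
    """Root mean square error of the linear model on (xs, ds)."""
    sq = [(sum(wj * xj for wj, xj in zip(w, x)) + b - d) ** 2
          for x, d in zip(xs, ds)]
    return math.sqrt(sum(sq) / len(sq))


def tune(train, valid, n_candidates=20, seed=0):
    """Random-search stand-in for the optimizer: keep the (C, eps) pair
    with the lowest validation RMSE."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_candidates):
        C = 10 ** rng.uniform(-1, 2)   # assumed range: 0.1 to 100
        eps = rng.uniform(0.0, 0.2)    # assumed range on normalized targets
        w, b = fit_linear_svr(train[0], train[1], C, eps)
        score = rmse(valid[0], valid[1], w, b)
        if best is None or score < best[0]:
            best = (score, C, eps)
    return best


# Hypothetical normalized toy data with two inputs and one target.
xs = [[i / 10, (i % 3) / 2] for i in range(11)]
ds = [0.6 * x[0] + 0.1 * x[1] + 0.2 for x in xs]
train, valid = (xs[0::2], ds[0::2]), (xs[1::2], ds[1::2])
best_rmse, best_C, best_eps = tune(train, valid)
print("best validation RMSE:", best_rmse)
```

In a hybrid such as SVR-AIG, the random candidate generation in `tune` would be replaced by the metaheuristic's own rule for proposing and correcting candidate solutions, while the fitness evaluation (train, then score) stays the same.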

Performance criteria
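To validate the performance of the SVR-AIG model with respect to the standalone SVR model, three statistical score metrics were employed. These metrics can be described as:

I: Root mean square error (RMSE), expressed as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(O_i - P_i\right)^{2}}$$

II: Mean absolute error (MAE), expressed as:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|O_i - P_i\right|$$

III: Correlation coefficient (CC), expressed as:

$$\mathrm{CC} = \frac{\sum_{i=1}^{N}\left(O_i - \bar{O}\right)\left(P_i - \bar{P}\right)}{\sqrt{\sum_{i=1}^{N}\left(O_i - \bar{O}\right)^{2}\,\sum_{i=1}^{N}\left(P_i - \bar{P}\right)^{2}}}$$

where $O_i$ and $P_i$ are the measured and estimated ith values of the scour depth ($d_s$), $\bar{O}$ and $\bar{P}$ are the averages of the measured and estimated scour depth values, respectively, and $N$ is the number of data points.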
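The three criteria can be computed directly from paired observed and predicted values. The Python sketch below uses the standard definitions of RMSE, MAE, and the Pearson correlation coefficient; the function names and the sample values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def cc(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical scour depths (observed vs. predicted).
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.5, 2.5, 4.5]
print(rmse(obs, pred), mae(obs, pred), cc(obs, pred))
```

RMSE and MAE carry the units of the scour depth, with RMSE penalizing large errors more strongly, whereas CC is dimensionless and measures only how well the two series vary together.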

Results and discussion
In this research, the SVR model is used to predict the scour depth, and its performance is compared with that of the hybrid SVR-AIG model. The first stage in applying the models is to normalize the data and divide them into training (70% of the whole data) and testing (30% of the whole data) datasets. The experimental dataset, including the head and discharge intensity, provided the models' input variables, and the scour depth below the tailwater level was the output variable. The parameters considered for constructing and developing the models' structure are presented in Tables 2 and 3. The performance indices of root mean square error (RMSE), mean absolute error (MAE), and correlation coefficient (CC) are summarized in Table 4 for the training and testing phases. It is evident from Table 4 that applying the AIG algorithm in the hybrid SVR-AIG model (with CC = 0.987, 0.991, RMSE = 2.839, 1.987, and MAE = 2.247, 1.201) has significantly improved on the performance of the SVR model (with CC = 0.942, 0.975, RMSE = 5.686, 4.040, and MAE = 4.114, 3.201) at the training and testing phases, respectively, based on all the performance indices, roughly halving the RMSE in both phases. This demonstrates the high ability of the AIG algorithm in improving the standalone SVR model.
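The data preparation step can be sketched in Python as follows. The text does not state which normalization or which splitting rule was used, so min-max scaling to [0, 1] and a sequential 70/30 split are assumptions here, as are the function names and the sample records.

```python
def min_max_normalize(values):
    """Scale a sequence to [0, 1]; a constant sequence maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_train_test(rows, train_fraction=0.7):
    """Sequential split: the first 70% of rows for training, the rest for testing."""
    n_train = round(len(rows) * train_fraction)
    return rows[:n_train], rows[n_train:]


# Hypothetical records: (discharge intensity q, head H1, scour depth d_s).
records = [(float(i), 10.0 + 2 * i, 5.0 + i) for i in range(1, 11)]

# Normalize each column separately, then reassemble the rows.
columns = [min_max_normalize(col) for col in zip(*records)]
normalized = list(zip(*columns))

train_rows, test_rows = split_train_test(normalized)
print(len(train_rows), "training rows,", len(test_rows), "testing rows")
```

Normalizing each variable separately keeps the head and the discharge intensity on comparable scales, which matters for SVR because the ε tolerance and the penalty parameter C act on the scaled values.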
The visual presentation of the results for the SVR and SVR-AIG models during the training and testing phases is shown in Fig. 4. The time series plots show that the predicted values are well aligned with the observed values for both models. However, the SVR-AIG model evidently predicts scour depth more accurately than the standalone SVR model. The 1:1 correlation plots also show a significant correlation between the observed and the predicted values, with R² = 0.9742 for the training phase and R² = 0.9847 for the testing phase for the SVR-AIG model. Error plots were prepared for both the SVR and SVR-AIG models, and the results are shown in Fig. 5. The relative error plot of the testing phase indicates that the error percentage varies from −30% to +15% in the SVR model and from −30% to +10% in the SVR-AIG model. The relative errors of the SVR-AIG model were smaller than those of the SVR model in both the training and testing phases. It is also noteworthy that the SVR-AIG model reduces the maximum relative error value by about 30%.
Taylor diagrams (Fig. 6) visualize the variation of CC, standard deviation, and RMSE of the SVR and SVR-AIG outputs. The plot indicates that the hybrid SVR-AIG model outperforms the SVR model in both the training and testing stages. Figure 7 shows the violin plots of the models for the training and testing phases. As can be seen from this figure, the shape of the violin for the SVR-AIG model is more similar to the observed shape in both phases. This means that the scour depth values predicted by the SVR-AIG model are closer to the observed/measured values at both the training and testing stages.
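A Taylor diagram can place three statistics on one plot because they are linked geometrically: the centered (bias-removed) RMS difference $E'$, the two standard deviations, and the correlation satisfy $E'^2 = \sigma_O^2 + \sigma_P^2 - 2\,\sigma_O\,\sigma_P\,\mathrm{CC}$. The Python sketch below computes these quantities and checks the identity numerically; the function name and sample values are hypothetical, and population (divide-by-N) statistics are assumed.

```python
import math


def taylor_stats(observed, predicted):
    """Return (std_obs, std_pred, correlation, centered RMS difference)."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    dev_o = [o - mean_o for o in observed]
    dev_p = [p - mean_p for p in predicted]
    std_o = math.sqrt(sum(d * d for d in dev_o) / n)
    std_p = math.sqrt(sum(d * d for d in dev_p) / n)
    if std_o == 0.0 or std_p == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    corr = sum(a * b for a, b in zip(dev_o, dev_p)) / (n * std_o * std_p)
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_o, dev_p)) / n)
    return std_o, std_p, corr, crmsd


# Hypothetical observed and predicted scour depths.
std_o, std_p, corr, crmsd = taylor_stats([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 2.5, 4.5])
law_of_cosines = std_o ** 2 + std_p ** 2 - 2 * std_o * std_p * corr
print(abs(crmsd ** 2 - law_of_cosines) < 1e-12)
```

A model point that lies closer to the observed reference point on the diagram therefore has, at once, a higher correlation, a standard deviation nearer to the observed one, and a smaller centered RMS difference.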
In general, the statistical criteria and plots showed that the SVR-AIG model outperforms the SVR model in predicting the scour depth. This demonstrates the high ability of the AIG optimization algorithm in improving the SVR performance and estimating its optimal parameters. The results are in accordance with those of Dehghani and Poudeh (2021a, b), who found that hybridization with the AIG algorithm performs better than other optimization algorithms in improving ANN models for hydrological modeling.

Conclusions
The focus of this research was to assess the effectiveness of the SVR-AIG hybrid approach for scour depth estimation at ski-jump spillways. The performance of the SVR-AIG hybrid models and the standalone SVR models in predicting the scour depth has been evaluated using statistical indicators. The performance indices (RMSE, MAE, and CC) and graphical indicators clearly indicated that the SVR-AIG method is more accurate than the standalone SVR model. The improved performance of the SVR may be attributed to the AIG algorithm, which can solve complex nonlinear problems with greater accuracy than the standalone models. Therefore, to reduce heavy soil erosion and dam failures, and to evaluate the stability of the dam and other hydraulic structures, it is recommended to apply the hybrid SVR-AIG model to predict scour depth accurately. For future studies, it is recommended to develop other hybrid models that apply the AIG optimization algorithm and to investigate its efficiency in significantly improving the accuracy of standalone models in estimating scour depth.

Fig. 6 Taylor diagrams of the models at the training and testing periods

Availability of data and material
They are available from the corresponding author on reasonable request.

Conflict of interest
The authors declare that they have no conflict of interest.
Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.