Introduction

Rectangular and triangular channels can be regarded as special cases of channels with trapezoidal cross sections. Trapezoidal channels are widely used in irrigation. In general, side weirs are employed to regulate and measure flow in open channels. Side weirs along trapezoidal channels are also used in irrigation networks, land drainage and other hydraulic applications. Flow along a side weir is a spatially varied flow (SVF) with decreasing discharge.

On the one hand, several experimental, analytical and numerical studies have been carried out on the hydraulic characteristics of side weirs located on rectangular channels.

For example, Cheong (1991) conducted a laboratory study of the hydraulic behavior of flow over side weirs in trapezoidal channels. Analysis of the experimental results yielded a relationship for calculating the side weir discharge coefficient as a function of the Froude number (Fr):

$$C_{\text{d}} = 0.45 - 0.22Fr^{2}$$
(1)
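As a minimal sketch, Eq. (1) can be evaluated directly. The function name and the sample Froude numbers below are illustrative assumptions, not values from Cheong's experiments.

```python
def cheong_discharge_coefficient(froude: float) -> float:
    """Evaluate Eq. (1): C_d = 0.45 - 0.22 * Fr**2."""
    if froude < 0:
        raise ValueError("Froude number must be non-negative")
    return 0.45 - 0.22 * froude ** 2


# Illustrative (hypothetical) Froude numbers in the subcritical range.
for fr in (0.3, 0.5, 0.7):
    print(f"Fr = {fr:.1f} -> C_d = {cheong_discharge_coefficient(fr):.4f}")
```

The relationship depends on Fr only, so it cannot reflect changes in weir length or side wall slope.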

Additionally, Novak et al. (2013), Granata et al. (2013), Emiroglu et al. (2014), Azimi et al. (2015), Parvaneh et al. (2016), Azimi and Shabanlou (2017), Maranzoni et al. (2017) and Azimi and Shabanlou (2017) studied the hydraulics of side weirs on main channels.

On the other hand, soft computing and artificial intelligence techniques have been employed as efficient and useful tools to model the complex phenomena in nonlinear hydraulics, hydrology and water resource science (Mondal et al. 2012; Saba et al. 2017; Saghi et al. 2015; Khoshbin et al. 2016; Shaghaghi et al. 2017; Azimi et al. 2018). In addition, Bilhan et al. (2010) predicted the sharp-crested rectangular side weir discharge capacity as a function of the hydraulic geometry characteristics, main channel and hydraulic parameters using feedforward neural network (FFNN) and radial basis neural network (RBNN) algorithms. Kisi et al. (2012) estimated the discharge coefficient of a triangular labyrinth side weir located on a rectangular channel in subcritical flow condition using radial basis neural network (RBNN), generalized regression neural network (GRNN) and gene expression programming (GEP) models. Emiroglu and Kisi (2013) applied artificial neural networks (ANNs) and adaptive neuro-fuzzy inference systems (ANFISs) to predict trapezoidal labyrinth side weir discharge coefficient in subcritical flow regime. Zaji and Bonakdari (2014) modeled the triangular side weir discharge coefficient using multi-layer perceptron neural networks (MLPNNs), radial basis neural networks (RBNNs) and nonlinear particle swarm optimization (PSO). Ebtehaj et al. (2015a, b, c) used the neural network algorithm of group method of data handling (GMDH) to model the discharge capacity of rectangular side orifices on the side walls of main channels. Ebtehaj et al. (2015a, b, c) estimated the discharge coefficient of sharp-crested side weirs in subcritical flow condition. Ebtehaj et al. (2015a, b, c) utilized GEP to determine the discharge coefficient of side weirs located on the side walls of rectangular channels. Czibula et al. (2014) simulated the intelligent selection of data representations using support vector machines (SVMs). Also using the SVM, Anifowose et al. (2015) improved the prediction of petroleum reservoir characterization. Bonakdari et al. (2015) predicted the discharge coefficient of the triangular side weir using ANFIS.

Furthermore, Azimi et al. (2017a) simulated the discharge coefficient of the side weirs located on the trapezoidal channels using GEP. They presented an equation for computing the discharge coefficient of the side weirs. Additionally, Azimi et al. (2017b) surveyed the factors affecting the discharge coefficient of side weirs on trapezoidal channels using extreme learning machine (ELM). They identified the Froude number as the most effective input variable to model discharge coefficient of the side weirs.

Moreover, support vector machines and support vector regression have been extensively used in various fields (Hu and Zheng 2015; Martínez López et al. 2014; Zhou et al. 2015). A review of previous studies shows that little research has addressed the side weir discharge coefficient in trapezoidal channels. Therefore, in this study, the discharge coefficient of side weirs located on trapezoidal channels is modeled using support vector machines (SVMs). Several models are developed from combinations of the parameters that affect the discharge coefficient. Subsequently, the best model for predicting the discharge coefficient of this type of hydraulic structure is introduced.

Experimental apparatus

In this study, Cheong’s (1991) experimental model is used to estimate the discharge coefficient of a side weir located in a trapezoidal channel. The laboratory model consisted of a 10 m long main channel with a trapezoidal cross section. The slope of the main channel side walls was adjustable, and the side weir opening width could easily be adjusted to the required side wall slope using plywood of variable length. The main channel bottom width was 0.67 m. To tranquilize the inflow into the trapezoidal channel, two screens filled with sand aggregate were installed at the main channel entrance. In Cheong’s laboratory model, the side weir was installed on a side wall, two-thirds of the way from the main channel entrance. In the main channel, the upstream flow was measured by a weir and the downstream flow was regulated using a volumetric tank. The schematic plan of Cheong’s (1991) experimental model is illustrated in Fig. 1. The maximum, minimum, average, variance and standard deviation values of Cheong’s (1991) experimental model parameters are listed in Table 1. In this table, Fr, \(y_{1}\), \(b\), \(L\) and \(m\) are the Froude number, water depth upstream of the side weir, bottom width of the trapezoidal channel, side weir length and trapezoidal channel wall slope, respectively.

Fig. 1 Schematic plan of Cheong’s (1991) experimental model

Table 1 Maximum, minimum, average, variance and standard deviation values of Cheong’s (1991) experimental model parameters

Discharge coefficient of a side weir

Emiroglu et al. (2011) showed that the rectangular side weir discharge coefficient is a function of the Froude number upstream of the side weir (Fr), the ratio of side weir length to main channel width (L/b), the ratio of side weir length to flow depth upstream of the side weir (L/y1), the ratio of weir crest height to flow depth upstream of the side weir (P/y1), the deviation angle of flow \(\left( \psi \right)\) and the main channel slope \(\left( {S_{0} } \right)\):

$$C_{\text{d}} = f\left( {Fr = \frac{u}{{\sqrt {gD} }},\frac{L}{b},\frac{L}{{y_{1} }},\frac{P}{{y_{1} }},\psi ,S_{0} } \right)$$
(2)

El-Khashab (1975) noted that the effect of \(\psi\) on the side weir discharge coefficient is implicit in the parameter L/b, and the effect of \(\psi\) on the side weir discharge coefficient has not been studied previously. On the other hand, Borghei et al. (1999) stated that under subcritical flow conditions, the effect of the main channel slope is negligible. Thus, Eq. (2) reduces to:

$$C_{\text{d}} = f\left( {Fr = \frac{u}{{\sqrt {gD} }},\frac{L}{b},\frac{L}{{y_{1} }},\frac{P}{{y_{1} }}} \right)$$
(3)

Due to the wall slope of a trapezoidal channel, the effect of dimensionless parameter \(m\) on the side weir discharge coefficient is studied. As a result, Eq. (3) is written as follows:

$$C_{d} = f\left( {Fr = \frac{u}{{\sqrt {gD} }},\frac{L}{b},\frac{L}{{y_{1} }},\frac{P}{{y_{1} }},m} \right)$$
(4)

For training and testing the support vector machines (SVMs) that model the side weir discharge coefficient in trapezoidal channels, Cheong’s (1991) experimental data are used. In Cheong’s (1991) model, the crest height of the side weir was zero (P = 0), so the term P/y1 vanishes from Eq. (4). However, in order to evaluate all parameters affecting the discharge coefficient of a side weir located in a trapezoidal channel, the ratio of the flow depth upstream of the side weir to the trapezoidal channel bottom width (y1/b) is considered in the different model combinations. Therefore, the discharge coefficient of side weirs in trapezoidal channels can be written as follows:

$$C_{d} = f\left( {Fr = \frac{u}{{\sqrt {gD} }},\frac{L}{b},\frac{L}{{y_{1} }},m,\frac{{y_{1} }}{b}} \right)$$
(5)
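The inputs of Eq. (5) can be assembled from the raw measurements as in the following sketch. Two points are assumptions not stated in the text: D is taken as the hydraulic depth (flow area divided by top width), and m is taken as the horizontal run of the side wall per unit vertical rise. The function name and the sample values are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def dimensionless_inputs(u, y1, b, L, m):
    """Return (Fr, L/b, L/y1, m, y1/b) for a trapezoidal channel.

    u  : mean velocity upstream of the side weir (m/s)
    y1 : flow depth upstream of the side weir (m)
    b  : channel bottom width (m)
    L  : side weir length (m)
    m  : side wall slope, assumed horizontal : vertical
    """
    area = (b + m * y1) * y1           # trapezoidal flow area
    top_width = b + 2.0 * m * y1       # free-surface width
    hydraulic_depth = area / top_width  # assumed meaning of D in Eq. (5)
    froude = u / math.sqrt(G * hydraulic_depth)
    return froude, L / b, L / y1, m, y1 / b


# Hypothetical measurement; with m = 0 the section is rectangular and D = y1.
print(dimensionless_inputs(u=1.0, y1=0.2, b=0.5, L=1.0, m=0.0))
```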

Thus, by combining the parameters affecting the discharge coefficient derived from Eq. (5), six different models (SVM 1–SVM 6) to simulate the discharge coefficient \(\left( {C_{d} } \right)\) are introduced:

$${\text{SVM}}\left( 1 \right):\;C_{d} = f\left( {Fr,\frac{L}{b},\frac{L}{{y_{1} }},m,\frac{{y_{1} }}{b}} \right)$$
(6)
$${\text{SVM}}\left( 2 \right):\;C_{d} = f\left( {Fr,\frac{L}{b},\frac{L}{{y_{1} }},m} \right)$$
(7)
$${\text{SVM}}\left( 3 \right):\;C_{d} = f\left( {Fr,\frac{L}{b},\frac{L}{{y_{1} }},\frac{{y_{1} }}{b}} \right)$$
(8)
$${\text{SVM}}\left( 4 \right):\;C_{d} = f\left( {Fr,\frac{L}{b},m,\frac{{y_{1} }}{b}} \right)$$
(9)
$${\text{SVM}}\left( 5 \right):\;C_{d} = f\left( {Fr,\frac{L}{{y_{1} }},m,\frac{{y_{1} }}{b}} \right)$$
(10)
$${\text{SVM}}\left( 6 \right):\;C_{d} = f\left( {\frac{L}{b},\frac{L}{{y_{1} }},m,\frac{{y_{1} }}{b}} \right)$$
(11)
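The six input combinations of Eqs. (6)–(11) amount to the full set of five parameters plus the five subsets obtained by dropping one parameter at a time. A small sketch of that bookkeeping follows; the dictionary and function names are illustrative assumptions.

```python
FEATURES = ("Fr", "L/b", "L/y1", "m", "y1/b")

# Parameter omitted from each model, read off Eqs. (6)-(11).
DROPPED = {
    "SVM(1)": None,
    "SVM(2)": "y1/b",
    "SVM(3)": "m",
    "SVM(4)": "L/y1",
    "SVM(5)": "L/b",
    "SVM(6)": "Fr",
}

MODEL_INPUTS = {
    name: tuple(f for f in FEATURES if f != dropped)
    for name, dropped in DROPPED.items()
}


def select_inputs(sample, model):
    """Pick the input values a given model uses from a dict of all five."""
    return [sample[name] for name in MODEL_INPUTS[model]]


for name, inputs in MODEL_INPUTS.items():
    print(name, inputs)
```

Comparing each reduced model with SVM(1) then isolates the contribution of the omitted parameter.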

Supervised machine learning

The support vector machine (SVM) is a supervised machine learning method that belongs to the family of linear classifiers. It is formulated on the principle of structural risk minimization (SRM), which differs from the empirical risk minimization (ERM) widely applied in statistical learning procedures. SRM minimizes an upper bound on the generalization error, while ERM minimizes the error on the training data. This difference gives SVM a greater potential for generalization. In addition, classical methods such as neural networks may converge to a local optimum rather than the global optimum solution (GOS), whereas the SVM guarantees a GOS. SVM can be applied to a variety of tasks, such as regression and classification (Cortes and Vapnik 1995).

Feature space and kernel selection

The fundamental principle of SVM is to map the data nonlinearly into a dot product space (called the feature space). Because the feature space is high-dimensional, evaluating dot products in it directly requires considerable computational resources and time. However, in some cases the dot products can be evaluated with much simpler kernel formulations. Real-world problems are complex and require richer hypotheses than linear functions: linear learning machines have computational advantages but also limitations, because the target often cannot be expressed as a simple linear combination of the given features. One of the remarkable properties of linear learning machines is their dual representation. This means that the hypothesis can be expressed as a linear combination of the training points, so the decision rule can be evaluated using only inner products between the training and test points. A function that computes the inner product in the feature space directly from the original input points is called a kernel function and is denoted by K; a nonlinear learning machine can be built with such a function. Formally, a kernel is a function K such that, for all \(x,z \in X\), we have:

$$K\left( {x,z} \right) = \left\langle {\varphi \left( x \right) \cdot \varphi \left( z \right)} \right\rangle$$
(12)

A kernel function must satisfy two conditions. First, the function has to be symmetric (Eq. 13), and second, it must satisfy the Cauchy–Schwarz inequality (Eq. 14) (Suykens and Vandewalle 1999):

$$K(x,z) = \left\langle {\varphi \left( x \right) \cdot \varphi \left( z \right)} \right\rangle = \left\langle {\varphi \left( z \right) \cdot \varphi \left( x \right)} \right\rangle = K(z,x)$$
(13)
$$K(x,z)^{2} = \left\langle {\varphi \left( x \right) \cdot \varphi \left( z \right)} \right\rangle^{2} \le \left\| {\varphi \left( x \right)} \right\|^{2} \left\| {\varphi \left( z \right)} \right\|^{2}$$
(14)
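Both kernel conditions, symmetry and the Cauchy–Schwarz bound, can be checked numerically on a sample of points. The sketch below does so for an RBF kernel of the form used later in this study; the kernel width, the random points and the helper names are illustrative assumptions. Passing the check on a sample does not prove that a function is a valid kernel, but a failure shows that it is not.

```python
import math
import random


def rbf_kernel(x, z, sigma=1.0):
    """RBF kernel exp(-||x - z||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / sigma ** 2)


def kernel_conditions_hold(kernel, pts, tol=1e-12):
    """Check K(x,z) = K(z,x) and K(x,z)^2 <= K(x,x) K(z,z) on all pairs."""
    for x in pts:
        for z in pts:
            k_xz = kernel(x, z)
            if abs(k_xz - kernel(z, x)) > tol:
                return False  # symmetry violated
            if k_xz ** 2 > kernel(x, x) * kernel(z, z) + tol:
                return False  # Cauchy-Schwarz bound violated
    return True


random.seed(0)
points = [[random.gauss(0.0, 1.0) for _ in range(5)] for _ in range(20)]
print(kernel_conditions_hold(rbf_kernel, points))
```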

The conditions in the above equations are necessary but not sufficient to guarantee that a kernel function defines a feature space. However, kernel representations offer an alternative solution by projecting the data into a high-dimensional feature space to increase the computational power of linear learning machines. Among the kernel functions available for developing a specific model, nonlinear kernel functions perform better in capturing the relationships in real-world problems. Therefore, the radial basis function (RBF) is used as the kernel function in this study.

RBF kernel function

The flexibility of SVMs is attributed to the kernel function, which implicitly maps the data into a higher-dimensional feature space. A linear solution in the higher-dimensional feature space corresponds to a nonlinear solution in the original, lower-dimensional input space. This is why the SVM method is considered a good choice in hydrology and hydraulics, where problems are generally nonlinear. Several methods employ nonlinear kernel functions to solve regression problems. One method that uses the radial basis function (RBF) kernel is the least squares SVM (LS-SVM). The main advantage of LS-SVM is that it is computationally more efficient than the standard SVM: LS-SVM training only requires solving a set of linear equations instead of a time-consuming and difficult quadratic programming problem (Behzad et al. 2009a, b).

Compared with other kernel functions, the RBF kernel is more compactly supported, which shortens the training process and increases the LS-SVM calculation efficiency. Lin et al. (2006) carried out rainfall-runoff modeling using SVRs with various kernel functions and demonstrated that the RBF kernel function leads to better results than other kernel functions. In addition, many studies on hydrology and hydraulic modeling using SVR have demonstrated good RBF kernel function performance (Arun and Mahesh 2009; Zahrahtul and Ani 2012; Nourani and Andalib 2015).

Support vector regression (SVR)

ε-SVR is based on an ε-insensitive loss function (Masjedi et al. 2010). The purpose of SVR is to find a function that deviates by at most ε from the actual target values for all training data while being as flat as possible. Suppose that \(\left\{ {x_{i} ,y_{i} } \right\}_{i = 1}^{N}\) is a training set in SVR, where \(x_{i} \in R^{p}\) represents a p-dimensional input vector and \(y_{i} \in R\) is a scalar measured output. The purpose of modeling is to develop a function y = f(x) that relates the output yi to the input xi, which can be expressed as follows:

$$y = w^{\text{T}} \phi (x) + b$$
(15)

where w is the weight vector and b is the bias. The regression model is built using a nonlinear mapping function ϕ(·). Mapping the input data into a high-dimensional space turns the nonlinearly separable problem into a linearly separable one. The function \(\phi \left( \cdot \right):R^{p} \to R^{h}\) is generally nonlinear and maps the data into a higher-dimensional feature space. The optimization problem and its constraints are given by the following equations:

$$\hbox{min} J\left( {w,e} \right) = \frac{1}{2}w^{\text{T}} w + \gamma \frac{1}{2}\sum\limits_{i = 1}^{N} {e_{i}^{2} }$$
(16)

subject to

$$y_{i} = w^{\text{T}} \phi \left( {x_{i} } \right) + b + e_{i} \quad i = 1,2,3, \ldots N$$
(17)

where ei is the random error and \(\gamma \in R^{ + }\) is a regularization parameter that sets the trade-off between training error minimization and the degree of model complexity. The purpose of the present study is to find the optimal parameters that minimize the regression model error. The optimum model is the one that minimizes the objective function, and hence the errors ei. Because this formulation is posed in the feature space, whose dimension is high, it cannot be solved directly. To solve this problem, the Lagrange function can be expressed as follows:

$$L\left( {w,b,e,\alpha } \right) = J\left( {w,e} \right) - \sum\limits_{i = 1}^{N} {\alpha_{i} \left\{ {w^{\text{T}} \phi \left( {x_{i} } \right) + b + e_{i} - y_{i} } \right\}}$$
(18)

The above equation can be solved using partial differentiation with respect to w, b, e and α:

$$\frac{\partial L}{\partial w} = 0 \to w = \sum\limits_{i = 1}^{N} {\alpha_{i} \phi \left( {x_{i} } \right)}$$
(19)
$$\frac{\partial L}{\partial b} = 0 \to \sum\limits_{i = 1}^{N} {\alpha_{i} } = 0$$
(20)
$$\frac{\partial L}{{\partial e_{i} }} = 0 \to \alpha_{i} = \gamma e_{i} \quad i = 1,2,3, \ldots ,N$$
(21)
$$\frac{\partial L}{{\partial \alpha_{i} }} = 0 \to w^{\mathrm{T}} \phi \left( {x_{i} } \right) + b + e_{i} - y_{i} = 0\quad i = 1,2,3, \ldots ,N$$
(22)
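As a sketch of how these conditions combine, w and ei can be eliminated by substituting Eqs. (19) and (21) into Eq. (22). The matrix notation is an assumption introduced here for illustration: \(\Omega\) is the N × N kernel matrix with entries \(\Omega_{ij} = \phi \left( {x_{i} } \right)^{\text{T}} \phi \left( {x_{j} } \right) = K\left( {x_{i} ,x_{j} } \right)\), \(1_{N}\) is a vector of ones and \(I\) is the identity matrix. For each i this gives

$$\sum\limits_{j = 1}^{N} {\alpha_{j} K\left( {x_{i} ,x_{j} } \right)} + b + \frac{{\alpha_{i} }}{\gamma } = y_{i}$$

which, together with Eq. (20), can be collected as

$$\left[ {\begin{array}{*{20}c} 0 & {1_{N}^{\text{T}} } \\ {1_{N} } & {\Omega + \gamma^{ - 1} I} \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} b \\ \alpha \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} 0 \\ y \\ \end{array} } \right]$$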

The values of b and αi can be estimated by solving a linear system. Consequently, the LS-SVM is stated as follows:

$$y = f\left( x \right) = \sum\limits_{i = 1}^{N} {\hat{\alpha }_{i} K\left( {x,x_{i} } \right)} + \hat{b}$$
(23)

where K (x, xi) is a kernel function and a nonlinear RBF kernel function defined as follows is used in this study:

$$K\left( {x,x_{i} } \right) = \exp \left( { - \frac{1}{{\sigma^{2} }}\left\| {x - x_{i} } \right\|^{2} } \right)$$
(24)

where σ is the RBF kernel function parameter. The regularization parameter γ is also important in LS-SVM: it controls the trade-off between minimizing the fitting error and the smoothness of the estimated function. The LS-SVM parameters are determined by the user via trial and error. The optimum values are C = 4, ε = 0.0005 and γ = 0.01.
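The LS-SVM training step described above can be sketched with NumPy as follows. This is an illustrative implementation of the standard LS-SVM linear system, not the code used in this study: the function names, the synthetic data and the values of γ and σ are all assumptions chosen for the demonstration.

```python
import numpy as np


def rbf_gram(A, B, sigma):
    """Kernel matrix K[i, j] = exp(-||A_i - B_j||^2 / sigma^2)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / sigma ** 2)


def lssvm_fit(X, y, gamma, sigma):
    """Solve the bordered linear system for the bias and the multipliers."""
    n = X.shape[0]
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0                      # constraint: sum(alpha) = 0
    system[1:, 0] = 1.0                      # bias column
    system[1:, 1:] = rbf_gram(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(system, rhs)
    return solution[1:], solution[0]         # alpha, bias


def lssvm_predict(X_new, X_train, alpha, bias, sigma):
    """Evaluate f(x) = sum_i alpha_i K(x, x_i) + bias."""
    return rbf_gram(X_new, X_train, sigma) @ alpha + bias


# Synthetic demonstration data (hypothetical, not Cheong's measurements).
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(40, 2))
y_demo = np.sin(3.0 * X_demo[:, 0]) + 0.5 * X_demo[:, 1]

alpha, bias = lssvm_fit(X_demo, y_demo, gamma=100.0, sigma=0.5)
fitted = lssvm_predict(X_demo, X_demo, alpha, bias, sigma=0.5)
print("max training residual:", np.abs(y_demo - fitted).max())
```

Because training reduces to one call to a linear solver, the cost is dominated by the N × N kernel matrix, which is small for a data set of laboratory size.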

Results and discussion

In order to accurately evaluate SVM(1)–SVM(6) models, statistical indices including root mean square error (RMSE), mean absolute relative error (MARE) and correlation coefficient (R2) are used:

$${\text{RMSE}} = \sqrt {\frac{1}{n}\sum\nolimits_{i = 1}^{n} {\left( {C_{{d\left( {\text{Predicted}} \right)_{i} }} - C_{{d\left( {\text{Observed}} \right)_{i} }} } \right)^{2} } }$$
(25)
$${\text{MARE}} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {\frac{{\left| {C_{{{\text{d}}\left( {\text{Predicted}} \right)_{i} }} - C_{{{\text{d}}\left( {\text{Observed}} \right)_{i} }} } \right|}}{{C_{{{\text{d}}\left( {\text{Observed}} \right)_{i} }} }}} \right)}$$
(26)
$$R^{2} = \frac{{\left( {n\sum\nolimits_{i = 1}^{n} {C_{{{\text{d}}\left( {\text{Predicted}} \right)i}} C_{{{\text{d}}\left( {\text{Observed}} \right)i}} } - \sum\nolimits_{i = 1}^{n} {C_{{{\text{d}}\left( {\text{Predicted}} \right)i}} } \sum\nolimits_{i = 1}^{n} {C_{{{\text{d}}\left( {\text{Observed}} \right)i}} } } \right)^{2} }}{{\left( {n\sum\nolimits_{i = 1}^{n} {\left( {C_{{{\text{d}}\left( {\text{Predicted}} \right)i}} } \right)^{2} } - \left( {\sum\nolimits_{i = 1}^{n} {C_{{{\text{d}}\left( {\text{Predicted}} \right)i}} } } \right)^{2} } \right)\left( {n\sum\nolimits_{i = 1}^{n} {\left( {C_{{{\text{d}}\left( {\text{Observed}} \right)i}} } \right)^{2} } - \left( {\sum\nolimits_{i = 1}^{n} {C_{{{\text{d}}\left( {\text{Observed}} \right)i}} } } \right)^{2} } \right)}}$$
(27)

Here, \(C_{{{\text{d}}\left( {\text{Observed}} \right)i}}\), \(C_{{{\text{d}}\left( {\text{Predicted}} \right)i}}\) and \(n\) are the experimental discharge coefficient, the predicted discharge coefficient and the number of experiments, respectively. In this study, half of the laboratory measurements were used for training and the remaining data were used for model testing. Figure 2 presents the statistical indices of models SVM(1)–SVM(6) in training. According to the figure, SVM(1) and SVM(2) have the lowest errors and the greatest R2 values.
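The three indices of Eqs. (25)–(27) can be computed as in the sketch below. For R2 the code assumes the squared Pearson correlation form, in which the second sum in each denominator factor is squared after summation. The function names and sample values are illustrative assumptions.

```python
import math


def rmse(pred, obs):
    """Root mean square error, Eq. (25)."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def mare(pred, obs):
    """Mean absolute relative error, Eq. (26)."""
    return sum(abs(p - o) / o for p, o in zip(pred, obs)) / len(obs)


def r_squared(pred, obs):
    """Squared Pearson correlation between predictions and observations."""
    n = len(obs)
    sum_p, sum_o = sum(pred), sum(obs)
    sum_po = sum(p * o for p, o in zip(pred, obs))
    sum_pp = sum(p * p for p in pred)
    sum_oo = sum(o * o for o in obs)
    numerator = (n * sum_po - sum_p * sum_o) ** 2
    denominator = (n * sum_pp - sum_p ** 2) * (n * sum_oo - sum_o ** 2)
    return numerator / denominator


# Hypothetical discharge coefficients for illustration only.
observed = [0.40, 0.45, 0.50]
predicted = [0.44, 0.45, 0.55]
print(rmse(predicted, observed), mare(predicted, observed),
      r_squared(predicted, observed))
```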

Fig. 2 RMSE, MARE and R2 values for SVM(1)–SVM(6) in training

SVM(1) is a function of the Froude number upstream of the side weir \(\left( {\text{Fr}} \right)\), the ratio of side weir length to trapezoidal channel bottom width (L/b), the ratio of side weir length to flow depth upstream of the side weir (L/y1), the side wall slope of the trapezoidal channel \(\left( m \right)\) and the ratio of flow depth upstream of the side weir to trapezoidal channel bottom width (y1/b). Hence, all dimensionless parameters affecting the discharge coefficient are considered in SVM(1). The RMSE, MARE and R2 values calculated for SVM(1) are 0.0156, 0.0327 and 0.884, respectively. SVM(2) is a function of the Froude number upstream of the side weir, the ratio of side weir length to main channel bottom width, the ratio of side weir length to flow depth upstream of the side weir and the trapezoidal channel wall slope. For SVM(2), the RMSE is 0.0157, the MARE is 0.0329 and R2 is 0.882. After SVM(1) and SVM(2), SVM(3) and SVM(6) are the most accurate in simulating the discharge coefficient, and the two have similar accuracy. SVM(3) is a function of Fr, L/b, L/y1 and y1/b, and its RMSE, MARE and R2 are 0.0173, 0.0406 and 0.861, respectively. SVM(6) models the discharge coefficient as a function of the ratio of side weir length to main channel bottom width, the ratio of side weir length to flow depth upstream of the side weir, the trapezoidal channel wall slope and the ratio of flow depth upstream of the side weir to trapezoidal channel bottom width; its RMSE, MARE and R2 are 0.0183, 0.0389 and 0.840, respectively. SVM(4) and SVM(5) are the least accurate in discharge coefficient modeling. SVM(4) is a function of Fr, L/b, \(m\) and y1/b, with an RMSE of 0.020, a MARE of 0.0469 and an R2 of 0.810. SVM(5) produces the highest error value (RMSE = 0.209) and the lowest correlation coefficient (R2 = 0.787). This model predicts the discharge coefficient using \({\text{Fr}}\), L/y1, \(m\) and y1/b.

The RMSE, MARE and R2 results for SVM(1)–SVM(6) in testing are shown in Fig. 3. As in training, SVM(1) and SVM(2) have the lowest errors and the highest correlation coefficients, and the two models perform almost identically in testing: the RMSE for SVM(1) and SVM(2) is 0.0168 and 0.0167, respectively, while the R2 is 0.808 and 0.810, respectively. In test mode, after SVM(1) and SVM(2), the SVM(6) model has the highest correlation coefficient (R2 = 0.772) and the lowest errors (RMSE = 0.0187 and MARE = 0.0399).

Fig. 3 RMSE, MARE and R2 values for SVM(1)–SVM(6) in testing mode

Based on Fig. 3, SVM(3), SVM(4) and SVM(5) models perform similarly in the test mode. The root mean square error predicted for SVM(3), SVM(4) and SVM(5) models in the test mode is 0.0206, 0.0205 and 0.0207, respectively. The R2 values for these models are 0.715, 0.713 and 0.704, respectively.

The SVM(1)–SVM(6) results for predicting the side weir discharge coefficient in training mode are presented in Fig. 4. According to this figure, the correlation coefficient for SVM(1) is higher than other models (R2 = 0.884). As mentioned in the previous section, the SVM(1) model simulates the discharge coefficient as a function of \(Fr\), L/b, L/y1, \(m\) and y1/b parameters. Moreover, a comparison between the experimental and predicted discharge coefficients by SVM(5) in training mode indicates that this model has a low correlation (R2 = 0.787). Also, the SVM(5) is a function of \(Fr\), L/y1, \(m\) and y1/b, and in comparison with the SVM(1), only the L/b dimensionless parameter has been removed. Therefore, the ratio of side weir length to trapezoidal channel bottom width (L/b) for predicting the discharge coefficient of a side weir in a trapezoidal channel has a considerable effect. In other words, removing this parameter causes substantial calculation errors.

Fig. 4 Comparison between experimental and predicted discharge coefficient in training a SVM(1), b SVM(2), c SVM(3), d SVM(4), e SVM(5), f SVM(6)

The changes in discrepancy ratio (DR) against the side weir discharge coefficient for the SVM(1)–SVM(6) models in training mode are shown in Fig. 5. The DR is the ratio of the discharge coefficient predicted by the SVM method to the experimental discharge coefficient \(\left( {{\text{DR}} = C_{{{\text{d}}\left( {\text{Predicted}} \right)i}} /C_{{{\text{d}}\left( {\text{Observed}} \right)i}} } \right)\). The closer DR is to one, the better the performance of the numerical model. The maximum (DRmax), minimum (DRmin) and average (DRave) discrepancy ratios obtained for SVM(1) are 1.159, 0.919 and 1.001, respectively. After SVM(1), the SVM(2) model has the average discrepancy ratio closest to one (DRave = 1.002). The maximum and minimum DR for SVM(2) are 1.160 and 0.922, respectively. For SVM(3), the DRave value is 1.002, and the DRmax and DRmin for this model are 1.133 and 0.879, respectively. In terms of discrepancy ratio, the SVM(4), SVM(5) and SVM(6) models perform similarly: the average discrepancy ratio is 1.005 for SVM(4), 1.004 for SVM(5) and 1.007 for SVM(6).
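The discrepancy ratio statistics can be computed as in this short sketch; the function names and the sample values are hypothetical and serve only to illustrate the definition.

```python
def discrepancy_ratios(pred, obs):
    """DR_i = predicted C_d divided by observed C_d for each sample."""
    return [p / o for p, o in zip(pred, obs)]


def summarize_dr(pred, obs):
    """Return the maximum, minimum and average discrepancy ratio."""
    dr = discrepancy_ratios(pred, obs)
    return {"DRmax": max(dr), "DRmin": min(dr), "DRave": sum(dr) / len(dr)}


# Hypothetical values: one over-prediction, one exact, one under-prediction.
print(summarize_dr([0.44, 0.45, 0.45], [0.40, 0.45, 0.50]))
```

An average close to one can hide compensating over- and under-predictions, which is why DRmax and DRmin are reported alongside DRave.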

Fig. 5 Discrepancy ratio values of the discharge coefficient in training mode for models a SVM(1), b SVM(2), c SVM(3), d SVM(4), e SVM(5), f SVM(6)

Based on the analysis of the SVM results for predicting the discharge coefficient of side weirs located on trapezoidal channels, SVM(1) is introduced as the superior model. As noted above, this model is a function of \(Fr\), L/b, L/y1, \(m\) and y1/b. Therefore, the following equation is proposed to calculate the discharge coefficient of this hydraulic structure:

$$C_{\text{d}} = \sum\limits_{i = 1}^{{N_{\sup } }} {w_{i} p_{i} } + b$$
(28)

and

$$p_{i} = \exp \left( {\frac{{x_{i} \times x_{\sup }^{\text{T}} }}{{\sigma^{2} }}} \right)$$
(29)

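where xi is the matrix of input parameters of the ith sample \(\left( {Fr,L/b,L/y_{1} ,m,y_{1} /b} \right)\), σ = −50 and Nsup is the number of all samples. The weight vector w and the matrix xsup are listed in Eqs. (30) and (31), where rows A to E of xsup contain the values of Fr, L/b, L/y1, m and y1/b, respectively.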

$$w^{\mathrm{T}} = \left[ \begin{array}{lllllllllll} &+ 1.04&\, - 0.05&\, + 0.67&\, - .032&\, - 1.83&\, + 0.70&\, - 0.86&\, + 1.65&\, + 1.69&\, + 1.06 \hfill \\ \ldots&\,\, - 0.06&\, + 3.13&\, - 0.13&\, + 0.57&\, - 1.27&\, - 0.10&\, + 1.05&\, + 0.38&\, + 1.79&\, + 0.82 \hfill \\ \ldots&\,\, + 0.25&\, + 0.96&\, + 0.18&\, + 0.31&\, + 0.44&\, - 0.53&\, + 0.7&\, - 0.52&\, + 1.95&\, - 0.58 \hfill \\ \ldots&\,\, - 0.13&\, - 3.46&\, + 1.08&\, + 0.58&\, - 0.47&\, + 0.03&\, - 0.36&\, - 0.22&\, - 4.00&\, + 1.55 \hfill \\ \ldots&\,\, - 4.00&\, - 0.47&\, + 2.92&\, - 2.06&\, - 0.51&\, - 0.14&\, - 0.52&\, + 0.86&\, - 0.72&\, + 0.27 \hfill \\ \ldots&\,\, - 0.23&\, + 1.08&\, + 0.96&\, - 0.35&\, - 0.31&\, - 1.29&\, - 2.17&\, - 0.42&\, - 0.55&\, + 0.01 \hfill \\ \end{array} \right]$$
(30)
$$x_{\sup }^{\mathrm{T}} = \left[ \begin{array}{lllll} A_{1} &\quad A_{2} &\quad A_{3}&\quad \ldots&\quad A_{60} \\ B_{1} &\quad B_{2} &\quad B_{3} &\quad \ldots& \quad B_{60} \\ C_{1} &\quad C_{2} &\quad C_{3} &\quad \ldots& \quad C_{60} \\ D_{1} &\quad D_{2} &\quad D_{3} &\quad \ldots& \quad D_{60} \\ E_{1} &\quad E_{2} &\quad E_{3} &\quad \ldots&\quad E_{60} \\ \end{array} \right]_{5 \times 60}$$
(31)
$$\left[\begin{array}{lllll} A_{1} &\quad A_{2}& \quad A_{3}&\quad \ldots& A_{60}\end{array}\right]= \left[\begin{array}{lllllllllll} &\quad 0.29&\quad 0.32&\quad 0.35&\quad 0.37&\quad 0.46&\quad 0.53&\quad 0.47&\quad 0.49&\quad 0.70&\quad 0.72 \hfill \\ \ldots&\quad 0.70&\quad 0.73&\quad 0.24&\quad 0.31&\quad 0.26&\quad 0.30&\quad 0.50&\quad 0.51&\quad 0.68&\quad 0.69 \hfill \\ \ldots&\quad 0.69&\quad 0.68&\quad 0.38&\quad 0.40&\quad 0.41&\quad 0.41&\quad 0.51&\quad 0.52&\quad 0.55&\quad 0.54 \hfill \\ \ldots&\quad 0.54&\quad 0.50&\quad 0.60&\quad 0.57&\quad 0.42&\quad 0.55&\quad 0.60&\quad 0.64&\quad 0.99&\quad 0.54 \hfill \\ \ldots&\quad 0.48&\quad 0.74&\quad 0.86&\quad 0.86&\quad 0.87&\quad 0.84&\quad 0.79&\quad 0.74&\quad 0.66&\quad 0.61 \hfill \\ \ldots&\quad 0.45&\quad 0.58&\quad 0.51&\quad 0.66&\quad 0.77&\quad 0.89&\quad 0.92&\quad 0.93&\quad 0.90&\quad 0.85 \hfill \\ \end{array} \right]_{1 \times 60}$$
(31-1)
$$\left[\begin{array}{*{20}l} B_{1} &\quad B_{2} &\quad B_{3} &\quad \ldots&\quad B_{60} \end{array}\right] = \left[ \begin{array}{*{20}l} &\quad 0.50&\quad 0.50&\quad 0.50& 0.50&\quad 1.02&\quad 1.02&\quad 1.02&\quad 1.02&\quad 1.45&\quad 1.45 \hfill \\ \ldots&\quad 1.45&\quad 1.45&\quad 0.50&\quad 0.50&\quad 0.50&\quad 0.50&\quad 1.02&\quad 1.02&\quad 1.45&\quad 1.45 \hfill \\ \ldots&\quad 1.45&\quad 1.45&\quad 0.50&\quad 0.50&\quad 0.50&\quad 0.50&\quad 1.02&\quad 1.02&\quad 1.02&\quad 1.02 \hfill \\ \ldots&\quad 1.45&\quad 1.45&\quad 1.45&\quad 1.45&\quad 0.81&\quad 0.81&\quad 0.81&\quad 0.81&\quad 0.81&\quad 1.01 \hfill \\ \ldots&\quad 1.01&\quad 1.01&\quad 1.01&\quad 1.64&\quad 1.64&\quad 1.64&\quad 1.64&\quad 1.64&\quad 0.80&\quad 0.80 \hfill \\ \ldots&\quad 0.80&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.60&\quad 1.60&\quad 1.60&\quad 1.60\hfill \\ \end{array} \right]_{1 \times 60}$$
(31-2)
$$\left[\begin{array}{*{20}l} C_{1} &\quad C_{2} &\quad C_{3} &\quad \ldots &\quad C_{60} \end{array} \right] = \left[ \begin{array}{*{20}l} &\quad 1.51&\quad 1.82&\quad 1.74&\quad 1.68&\quad 5.03&\quad 4.88&\quad 4.14&\quad 4.65&\quad 8.36&\quad 8.15 \hfill \\ \ldots&\quad 7.49&\quad 8.50&\quad 1.54&\quad 1.76&\quad 1.57&\quad 1.80&\quad 5.78&\quad 5.25&\quad 8.22&\quad 7.82 \hfill \\ \ldots&\quad 7.46&\quad 7.94&\quad 2.64&\quad 2.48&\quad 2.36&\quad 2.28&\quad 5.19&\quad 4.89&\quad 5.71&\quad 5.27 \hfill \\ \ldots&\quad 7.64&\quad 7.81&\quad 7.92&\quad 7.87&\quad 4.26&\quad 3.55&\quad 3.38&\quad 9.89&\quad 7.29&\quad 5.48 \hfill \\ \ldots&\quad 5.95&\quad 8.02&\quad 7.67&\quad 8.55&\quad 8.83&\quad 8.97&\quad 9.11&\quad 9.93&\quad 3.41&\quad 3.59 \hfill \\ \ldots&\quad 4.31&\quad 5.56&\quad 6.03&\quad 5.00&\quad 8.14&\quad 7.78&\quad 8.62&\quad 8.89&\quad 9.03&\quad 9.18 \hfill \\ \end{array} \right]_{1 \times 60}$$
(31-3)
$$\left[\begin{array}{*{20}l} D_{1} &\quad D_{2} &\quad D_{3} &\quad \ldots &\quad D_{60} \end{array} \right] = \left[ \begin{array}{*{20}l} &\quad 0.50&\quad 0.50&\quad 0.50&\quad 0.50&\quad 0.50&\quad 0.50&\quad 0.50&\quad 0.50&\quad 0.50&\quad 0.50 \hfill \\ \ldots&\quad 0.50&\quad 0.50&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00 \hfill \\ \ldots&\quad 1.00&\quad 1.00&\quad 1.00&\quad 2.00&\quad 2.00&\quad 2.00&\quad 2.00&\quad 2.00&\quad 2.00&\quad 2.00 \hfill \\ \ldots&\quad 2.00&\quad 2.00&\quad 2.00&\quad 2.00&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00 \hfill \\ \ldots&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00&\quad 1.00&\quad 0.00&\quad 0.00 \hfill \\ \ldots&\quad 0.00&\quad 0.00&\quad 0.00&\quad 0.00&\quad 0.00&\quad 0.00&\quad 0.00&\quad 0.00&\quad 0.00&\quad 0.00 \hfill \\ \end{array} \right]_{1 \times 60}$$
(31-4)
$$\left[\begin{array}{*{20}l} E_{1} &\quad E_{2} &\quad E_{3} &\quad \ldots &\quad E_{60} \end{array}\right] = \left[ \begin{array}{*{20}l} &\quad 0.33&\quad 0.27&\quad 0.29&\quad 0.30&\quad 0.20&\quad 0.21&\quad 0.25&\quad 0.22&\quad 0.17&\quad 0.18 \hfill \\ \ldots&\quad 0.19&\quad 0.17&\quad 0.33&\quad 0.28&\quad 0.32&\quad 0.28&\quad 0.18&\quad 0.19&\quad 0.18&\quad 0.19 \hfill \\ \ldots&\quad 0.19&\quad 0.18&\quad 0.19&\quad 0.20&\quad 0.21&\quad 0.22&\quad 0.20&\quad 0.21&\quad 0.18&\quad 0.19 \hfill \\ \ldots&\quad 0.19&\quad 0.19&\quad 0.18&\quad 0.18&\quad 0.19&\quad 0.23&\quad 0.24&\quad 0.08&\quad 0.11&\quad 0.19 \hfill \\ \ldots&\quad 0.17&\quad 0.13&\quad 0.13&\quad 0.19&\quad 0.19&\quad 0.18&\quad 0.18&\quad 0.16&\quad 0.23&\quad 0.22 \hfill \\ \ldots&\quad 0.19&\quad 0.18&\quad 0.17&\quad 0.20&\quad 0.12&\quad 0.13&\quad 0.19&\quad 0.18&\quad 0.18&\quad 0.17 \hfill \\ \end{array} \right]_{1 \times 60}$$
(31-5)
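To show how Eqs. (28) and (29) would be applied to a new sample, the sketch below transcribes them literally: each support sample contributes exp(x · x_sup,i / σ²), weighted by w_i. The value of the bias b is not stated in the text, so it is left as an argument. The function name and the tiny two-sample arrays are hypothetical placeholders, not the 60-sample matrices listed above.

```python
import math


def predict_cd(x, w, x_sup, sigma, b):
    """Literal transcription of Eqs. (28)-(29).

    x     : the five inputs (Fr, L/b, L/y1, m, y1/b) of the new sample
    w     : weights, one per support sample
    x_sup : support samples, one row of five values per sample
    sigma : kernel parameter
    b     : bias term (its value is not given in the text)
    """
    total = b
    for w_i, support in zip(w, x_sup):
        dot = sum(a * s for a, s in zip(x, support))
        total += w_i * math.exp(dot / sigma ** 2)
    return total


# Placeholder data for illustration; replace with Eqs. (30)-(31) in practice.
w_demo = [1.0, -0.5]
x_sup_demo = [[0.3, 0.5, 1.5, 0.5, 0.33], [0.7, 1.45, 8.4, 0.5, 0.17]]
print(predict_cd([0.5, 1.0, 5.0, 1.0, 0.2], w_demo, x_sup_demo, -50.0, 0.1))
```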

Comparison of SVM with previous studies

Previous studies on the discharge coefficient of side weirs in trapezoidal channels include Cheong (1991) and Azimi et al. (2017a, b). Therefore, the error distribution of the superior SVM model is compared with those of these studies, as illustrated in Fig. 6. For instance, almost 75% of the SVM results have an error of less than 5%, whereas the corresponding share for Azimi et al. (2017a) and Cheong (1991) is 87.5% and 50%, respectively. Additionally, roughly 12% of the Azimi et al. (2017b) results have an error between 5% and 10%. Furthermore, the MARE for Azimi et al. (2017a), Azimi et al. (2017b) and Cheong (1991) is 0.029, 0.033 and 0.059, respectively. Based on the error distribution and the MARE index, the SVM model is more accurate than the Azimi et al. (2017b) and Cheong (1991) models.
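An error distribution of this kind can be tabulated as in the following sketch, which bins the relative errors into the ranges below 5%, 5% to 10% and 10% or more. The bin edges follow the ranges quoted in the text; the function names and the sample values are illustrative assumptions.

```python
def relative_errors_percent(pred, obs):
    """Absolute relative error of each prediction, in percent."""
    return [100.0 * abs(p - o) / o for p, o in zip(pred, obs)]


def error_distribution(pred, obs, edges=(5.0, 10.0)):
    """Share of samples (in percent) falling in each error range.

    With the default edges the ranges are [0, 5), [5, 10) and [10, inf).
    """
    errors = relative_errors_percent(pred, obs)
    counts = [0] * (len(edges) + 1)
    for err in errors:
        bin_index = sum(err >= edge for edge in edges)
        counts[bin_index] += 1
    return [100.0 * c / len(errors) for c in counts]


# Hypothetical predictions and observations for illustration only.
print(error_distribution([0.41, 0.48, 0.56, 0.50], [0.40, 0.45, 0.50, 0.50]))
```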

Fig. 6 Error distribution of SVM and previous studies

On the one hand, in terms of discharge coefficient modeling, the model of Azimi et al. (2017a) performs better than the other studies. They used an ELM model to simulate the discharge coefficient. They also presented a practical and simple matrix that estimates the discharge coefficient with reasonable accuracy, so it can easily be used by individuals without previous knowledge of artificial intelligence techniques. Furthermore, Azimi et al. (2017b) provided an explicit equation using a GEP model as a function of all input parameters affecting the discharge coefficient, and showed that the GEP model performed well compared with other studies.

On the other hand, although the SVM model simulates the discharge coefficient with acceptable accuracy, this technique has some drawbacks. The SVM is better only in terms of the MARE index, while its other indices are considerably worse than those of Azimi et al. (2017a, b). Nevertheless, the SVM model is more accurate than the empirical equation of Cheong (1991). However, the matrix form produced by the SVM model is not particularly easy to use.

Conclusion

In practice, side weirs located on trapezoidal channels regulate the flow surface in irrigation and drainage systems and are used in hydraulic engineering. In this study, the discharge coefficient of side weirs located on trapezoidal channels was modeled using support vector machines (SVMs). Using the dimensionless parameters affecting the discharge coefficient, six different models were defined to estimate the discharge coefficient of this type of diversion structure. Based on the modeling results, the superior model is a function of the Froude number (Fr), the ratio of side weir length to trapezoidal channel bottom width (L/b), the ratio of side weir length to flow depth upstream of the side weir (L/y1), the side wall slope of the trapezoidal channel (m) and the ratio of flow depth upstream of the side weir to trapezoidal channel bottom width (y1/b). The RMSE and MARE in training mode were 0.0156 and 0.0327, respectively, while the RMSE and R2 values calculated for the superior model in test mode were 0.0168 and 0.808, respectively. Furthermore, the maximum (DRmax), minimum (DRmin) and average (DRave) discrepancy ratios for the best model were 1.159, 0.919 and 1.001, respectively. According to the sensitivity analysis, the ratio of side weir length to trapezoidal channel bottom width (L/b) was identified as the most influential input variable for simulating the discharge coefficient. Since the SVM coefficients were determined by trial and error, and a poor choice leads to poor model performance, combining SVM with evolutionary algorithms is suggested for determining the coefficients.