# Shear force estimation in rough boundaries using SVR method

## Abstract

The accuracy of the support vector regression (SVR) procedure in modeling the percentage of shear force carried by the walls of a rectangular channel with rough boundaries was investigated. The SVR model was extended, and the most appropriate kernel function and input combination were studied. The SVR model with an exponential kernel function and three input parameters was selected as the best model, with the lowest error. The output of this model is presented as a program. The model was then compared with three equations presented by other researchers for rough and smooth channels. With the lowest error values (RMSE of 0.0565), the SVR model outperformed these equations.

## Keywords

Rectangular channel, Rough boundaries, Shear force, Support vector machine

## Introduction

Flow structure is affected by the shear stress distribution in open channels. The flow resistance and characteristics of sediment and deposition are influenced by the bed and wall shear stresses. These shear stresses play an important role in determining the percentage of shear force carried by the walls (%SF_{w}). Although experimental studies are widely adopted to measure the shear stress, the exhaustive and laborious procedures of experimental work have prompted alternative approaches, such as numerical, analytical and semianalytical methods, to calculate the shear stress distribution (Bonakdari et al. 2015; Hosseini et al. 2019; Sheikh Khozani and Bonakdari 2016; Sterling and Knight 2002; Yang et al. 2012). These methods are cost-efficient and avoid the difficulties, and in particular the errors, of experimental work. Methods based on the entropy approach, together with equations for calculating the shear stress distribution in open channels with circular and rectangular cross sections, have been investigated (Bonakdari et al. 2015; Sheikh Khozani and Bonakdari 2018; Sheikh and Bonakdari 2016). Numerous studies have used soft computing methods to predict different phenomena in hydrology and hydraulics (Azamathulla and Zahiri 2012; Bonakdari et al. 2018; Sheikh Khozani et al. 2017, 2018, 2019). Sheikh Khozani et al. (2016a, b) estimated %SF_{w} in channels with smooth and rough boundaries using a genetic algorithm artificial (GAA) neural network and genetic programming (GP). Besides genetic-based modeling techniques, the support vector machine (SVM) has also received attention in modeling reference evapotranspiration (Sheikh Khozani and Bonakdari 2018), the stage–discharge relation including the hysteresis effect (Sheikh Khozani et al. 2018), suspended sediment (Yilmaz et al. 2018) and shear stress in circular channels (Sheikh Khozani et al. 2017). Exploiting the advantages of SVR, namely a computational complexity that does not depend on the dimensionality of the input space and an excellent generalization capability with high prediction accuracy, this research investigates the performance of SVR in modeling the percentage of shear force in channels with rough boundaries.

The aim of this paper is to minimize errors in estimating the shear stress distribution. A more accurate prediction of %SF_{w} in turn allows more stable channels to be designed. To achieve this goal, the SVR model is extended and the effective parameters of %SF_{w} are determined. To find the most effective parameters, varying input combinations were considered and investigated. Since the kernel function is very important to the performance of an SVR model, different kernel functions were studied and the best kernel function for estimating %SF_{w} was selected. Finally, the performance of the most appropriate SVR model was assessed against three regression equations proposed by Knight (1981), Knight et al. (1984) and Knight et al. (1994).

## Materials and method

### Data used

%SF_{w} using the SVR model. The experiments were conducted in a flume 15 m long and 460 mm wide, on a constant bed slope of 9.58 × 10^{−4}. The wall and bed shear forces were measured using the Preston tube technique, whereby the %SF_{w} was measured at different flow depths. Based on the data, Knight then extracted a nonlinear regression formula for calculating the %SF_{w} as an exponential function as:

*α* and *β* are

*k*_{sb} and *k*_{sw} are the bed and wall roughness, respectively. Knight et al. (1984) analyzed their own data, including the data in Knight (1981), and plotted them on a log–log scale. By assuming a simple relationship between %SF_{w} and *B*/*h*, they derived the following equation

(%SF_{w}) such as the geometry of the channel (*b*), flow depth (*h*), bed and wall roughness (*k*_{sb}, *k*_{sw}), energy slope (*S*_{f}), flow velocity (*V*), fluid density (\( \rho \)), gravitational acceleration (*g*) and hydraulic radius (*R*). Thus, %SF_{w} can be considered as a function of

%SF_{w} can be written as

### Support vector machines

*O* is closer to *T*, the SVR model has higher performance. The inputs of this study are \( \frac{B}{h},{\text{Fr}},\frac{{k_{sb} }}{{k_{sw} }},{\text{Re}} \), and the output is the %SF_{w}. The linear regression that predicts the output using the inputs is defined as

*w* represents the weight vector and *b* is the bias. In the SVR method, the model is penalized when the errors exceed a defined constant, denoted as epsilon (*ε*). Penalization is done by the loss function defined as

*ζ*_{i} is a nonnegative slack variable, *t*_{i} is the observed output and *y*_{i} is the output of the regression process. In this equation, the loss function is equal to zero if the difference between the regression and observed outputs is less than *ε*. Otherwise, the loss function takes a nonzero value. By using Eq. (7), it can be concluded that \( - \varepsilon - \zeta_{i}^{ - } \le t_{i} - y_{i} \le + \varepsilon + \zeta_{i}^{ + } \) must hold. With this equation, the new shape of the loss function could be written as

(*R*_{emp}) is minimized, then the most accurate regression is obtained. The *R*_{emp} is defined as

*y*_{i} is the output of the regression model and *n* is the number of considered samples. However, in the process of finding the minimum value of *R*_{emp}, there is a probability that the size of the model would increase undesirably. Therefore, a complexity term should be added to Eq. (9) to mitigate the issue. The regularized risk function (*R*_{reg}) [Eq. (10)] is defined as the summation of *R*_{emp} and the norm of *w*, ||*w*||. Minimizing *R*_{reg} is a two-objective minimization which finds the most accurate regression with the smallest model size.

*R*_{reg} (Çimen 2008; Smola 1996) is obtained by using Eqs. (8) and (10) as

*C* is a positive constant that serves as a trade-off parameter to determine the degree of *R*_{emp}.

*S* is the set of support vectors and \( \alpha_{i}^{ + } \) and \( \alpha_{i}^{ - } \) are the Lagrange multipliers. *K*(*x*_{i}, *x*_{j}) represents the investigated kernel function. Selecting the appropriate kernel function directly affects SVR model performance. Nonetheless, there is no definitive rule to define kernel functions. Therefore, eight different kernel functions were employed in this study and their individual performance was compared. Details of the considered kernel functions are presented in Table 1.

Table 1 Kernel function equations used

Kernel name | Kernel equation | Kernel constant |
---|---|---|
Linear | \( k(x,x^{\prime}) = x^{T} x^{\prime} \) | – |
Polynomial | \( k(x,x^{\prime}) = (x^{T} x^{\prime} + 1)^{d} \) | d |
Gaussian | \( k(x,x^{\prime}) = \exp \left( - \frac{\left\Vert x - x^{\prime} \right\Vert^{2}}{2\sigma^{2}} \right) \) | σ |
Exponential | \( k(x,x^{\prime}) = \exp \left( - \frac{\left\Vert x - x^{\prime} \right\Vert}{2\sigma^{2}} \right) \) | σ |
Laplacian | \( k(x,x^{\prime}) = \exp \left( - \frac{\left\Vert x - x^{\prime} \right\Vert}{\sigma} \right) \) | σ |
Sigmoid | \( k(x,x^{\prime}) = \tanh (x^{T} x^{\prime} + d) \) | d |
Rational quadratic | \( k(x,x^{\prime}) = 1 - \frac{\left\Vert x - x^{\prime} \right\Vert^{2}}{\left\Vert x - x^{\prime} \right\Vert^{2} + d} \) | d |
Multiquadratic | \( k(x,x^{\prime}) = \sqrt{\left\Vert x - x^{\prime} \right\Vert^{2} + d^{2}} \) | d |
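The kernel equations of Table 1 translate almost line by line into code. The following is a minimal Python sketch (the study itself used MATLAB); the function names and the default values of the kernel constants are illustrative assumptions, not taken from the source.

```python
import math


def _dot(x, y):
    """Inner product x^T y of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def _dist(x, y):
    """Euclidean distance ||x - y||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def linear(x, y):
    return _dot(x, y)


def polynomial(x, y, d=2):
    return (_dot(x, y) + 1) ** d


def gaussian(x, y, sigma=1.0):
    return math.exp(-_dist(x, y) ** 2 / (2 * sigma ** 2))


def exponential(x, y, sigma=1.0):
    # Same as the Gaussian kernel, but with the unsquared distance
    return math.exp(-_dist(x, y) / (2 * sigma ** 2))


def laplacian(x, y, sigma=1.0):
    return math.exp(-_dist(x, y) / sigma)


def sigmoid(x, y, d=0.0):
    return math.tanh(_dot(x, y) + d)


def rational_quadratic(x, y, d=1.0):
    r2 = _dist(x, y) ** 2
    return 1 - r2 / (r2 + d)


def multiquadratic(x, y, d=1.0):
    return math.sqrt(_dist(x, y) ** 2 + d ** 2)
```

Written this way, it is easy to see that the multiquadratic kernel grows with the distance between two points, whereas the exponential, Laplacian and Gaussian kernels decay from 1 toward 0.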

The selection of the *C* and *ε* constants has a significant effect on SVR model performance. According to Table 1, every kernel function except the linear one has an additional constant that must be determined. Therefore, for each SVR model with the second to eighth kernel functions, three parameters must be determined: *C*, *ε* and the kernel constant. In this study, these parameters were determined by trial and error, with loops over the candidate values added to the main SVR program. Each constant was varied in 20 stages; therefore, for each of these kernel functions, 8000 (20 × 20 × 20) runs were performed.
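The trial-and-error loops described above amount to an exhaustive grid search. The Python sketch below shows the loop structure only: `toy_error` is a hypothetical stand-in for "train an SVR with these constants and return its test error", and the search ranges and logarithmic spacing are assumptions, since the source does not report them.

```python
import itertools
import math


def log_grid(low_exp, high_exp, n=20):
    """Return n values evenly spaced on a log10 scale from 10**low_exp to 10**high_exp."""
    step = (high_exp - low_exp) / (n - 1)
    return [10 ** (low_exp + i * step) for i in range(n)]


def grid_search(evaluate, c_values, eps_values, kernel_values):
    """Try every (C, epsilon, kernel constant) triple and keep the lowest error."""
    best_params, best_error, runs = None, math.inf, 0
    for params in itertools.product(c_values, eps_values, kernel_values):
        error = evaluate(*params)
        runs += 1
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error, runs


def toy_error(c, eps, sigma):
    """Hypothetical error surface; replace with real SVR training and testing."""
    return (math.log10(c) - 1) ** 2 + (math.log10(eps) + 2) ** 2 + math.log10(sigma) ** 2


# Assumed search ranges, 20 stages per constant
C_VALUES = log_grid(-1, 3)
EPS_VALUES = log_grid(-4, 0)
SIGMA_VALUES = log_grid(-2, 2)

best_params, best_error, runs = grid_search(toy_error, C_VALUES, EPS_VALUES, SIGMA_VALUES)
print(runs)  # 8000 runs for one kernel function
```

An exhaustive search like this is simple but grows cubically with the number of stages, which is why the run count reaches 8000 per kernel.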

## Results

### Goodness fit of model

(%*δ*). The RMSE measures the goodness of fit relevant to %SF_{w} values, and the MAE yields a more balanced perspective of the goodness of fit at moderate prediction values. The %*δ* is a parameter that compares the error between the experimental data and the SVR model results. These statistical parameters were utilized to identify the accuracy of different soft computing methods (Sheikh Khozani et al. 2016, 2017). The selected statistical parameters are defined as:

*x*_{im} and *x*_{ip} are the measured and predicted values of %SF_{w}, respectively.
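As a sketch of how these three indices can be computed, the Python functions below use the standard definitions of RMSE and MAE. The definition of %*δ* is an assumption: it is taken here as the summed absolute error divided by the summed predicted values, times 100. That reading is consistent with the multiquadratic results reported later, where an MAE of 0.2613 and a constant prediction of 0.4016 give roughly 65.06, but the source definition may differ.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def mae(measured, predicted):
    """Mean absolute error between measured and predicted values."""
    n = len(measured)
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / n


def percent_delta(measured, predicted):
    """Assumed %delta: total absolute error relative to total predicted value, in percent."""
    total_abs_error = sum(abs(m - p) for m, p in zip(measured, predicted))
    return 100.0 * total_abs_error / sum(predicted)


# Made-up values for illustration only
measured = [0.5, 0.4, 0.6]
predicted = [0.4, 0.4, 0.4]
print(rmse(measured, predicted), mae(measured, predicted), percent_delta(measured, predicted))
```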

### Selection of the most appropriate kernel function

*B*/*h*, Fr, Re and *k*_{sb}/*k*_{sw} were used as inputs. As shown in Table 2, the exponential kernel function (RMSE = 0.0916, MAE = 0.0772 and %*δ* = 14.9238) is the most appropriate kernel, and the Laplacian function, with an RMSE of 0.0925, showed the next-best results in predicting %SF_{w}. The multiquadratic kernel function gives the worst predictions of %SF_{w}, with the highest error values, i.e., RMSE = 0.3189, MAE = 0.2613 and %*δ* = 65.0621.

Table 2 Selection of the more appropriate kernel function

Kernel function | Training RMSE | Training MAE | Training %δ | Testing RMSE | Testing MAE | Testing %δ |
---|---|---|---|---|---|---|
Linear | 0.0662 | 0.0542 | 13.7508 | 0.1091 | 0.0854 | 16.4614 |
Polynomial | 0.0406 | 0.0371 | 9.7428 | 0.0994 | 0.0824 | 15.5218 |
Radial basis function (RBF) | 0.0445 | 0.0418 | 10.7680 | 0.1698 | 0.1334 | 27.5025 |
Exponential | 0.0427 | 0.0393 | 9.8857 | 0.0916 | 0.0772 | 14.9238 |
Laplacian | 0.0428 | 0.0394 | 9.9035 | 0.0925 | 0.0783 | 15.5444 |
Sigmoid | 0.1091 | 0.0795 | 20.7184 | 0.1751 | 0.1286 | 27.4261 |
Rational quadratic | 0.0421 | 0.0386 | 9.8336 | 0.1213 | 0.1029 | 20.2756 |
Multiquadratic | 0.2627 | 0.2139 | 53.2506 | 0.3189 | 0.2613 | 65.0621 |

(*R*^{2}) value shows how well the data fit the experimental results. The exponential kernel function, with an *R*^{2} of 0.9547, shows the closest agreement between the predicted %SF_{w} and its experimental value. For the test dataset, nearly all kernel functions underestimated %SF_{w}. The sigmoid and multiquadratic kernel functions give the least convincing results, with *R*^{2} of 0.7705 and 9 × 10^{−8}, respectively. When the multiquadratic kernel function was used, the model predicted a constant value of 0.4016 for %SF_{w} for all aspect ratios. The straight trend line obtained indicates that this model could not predict reasonable values for %SF_{w}. Therefore, the exponential kernel function was selected as the most accurate kernel function for the following step.

### Selection of the most appropriate input combination

Some of the input combinations contain only two variables because, as the number of input variables increases, few soft computing methods can handle the modeling procedure. These input combinations are shown in Table 3.

Table 3 The variables of each input combination

Number | Input combination |
---|---|
(i) | *B*/*h*, Fr, Re, *k*_{sb}/*k*_{sw} |
(ii) | *B*/*h*, Fr, Re |
(iii) | *B*/*h*, Fr, *k*_{sb}/*k*_{sw} |
(iv) | *B*/*h*, Re, *k*_{sb}/*k*_{sw} |
(v) | Fr, Re, *k*_{sb}/*k*_{sw} |
(vi) | *B*/*h*, *k*_{sb}/*k*_{sw} |
(vii) | *B*/*h*, Fr |
(viii) | Fr, *k*_{sb}/*k*_{sw} |

%*δ* = 12.6250. When the Re number was omitted, the error values increased markedly and the model did not give accurate results. Input combination (viii), which ignores both the aspect ratio and Re, produced the worst predictions, with RMSE = 0.3181, MAE = 0.256 and %*δ* = 64.5425. Therefore, Re and the aspect ratio are the most influential parameters for estimating accurate shear force percentage values.

Table 4 Statistical parameters for selection of the more appropriate input combination

Input combination | Training RMSE | Training MAE | Training %δ | Testing RMSE | Testing MAE | Testing %δ |
---|---|---|---|---|---|---|
(i) | 0.0427 | 0.0393 | 9.8857 | 0.0916 | 0.0772 | 14.9238 |
(ii) | 0.0392 | 0.0345 | 8.7039 | 0.0870 | 0.0674 | 12.6250 |
(iii) | 0.0449 | 0.0428 | 10.9996 | 0.1746 | 0.1475 | 32.2359 |
(iv) | 0.0437 | 0.0407 | 10.3011 | 0.1348 | 0.1174 | 23.6284 |
(v) | 0.0411 | 0.0373 | 9.4435 | 0.0860 | 0.0718 | 13.7927 |
(vi) | 0.0464 | 0.0446 | 11.5087 | 0.2029 | 0.1771 | 41.0301 |
(vii) | 0.0439 | 0.0411 | 10.4538 | 0.1746 | 0.1369 | 28.7079 |
(viii) | 0.0459 | 0.0439 | 11.1116 | 0.3181 | 0.2560 | 64.5425 |

%SF_{w}, and then, the input combination (ii) was selected as the most appropriate input combination. For input combination (viii), which omits the influence of the aspect ratio and Re, the model could not estimate accurate results, especially for the testing data, as supported by an *R*^{2} of 0.0926. Likewise, when the Re values were ignored as an input parameter in the modeling procedure, the model did not perform well in estimating the percentage of shear force carried by the walls.
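The comparison of input combinations can be made explicit by ranking the testing-phase statistics of the eight combinations on each index. The Python sketch below copies the testing RMSE, MAE and %δ values reported for combinations (i) to (viii); the dictionary layout and function name are illustrative assumptions.

```python
# Testing-phase (RMSE, MAE, %delta) for each input combination, as reported above
TEST_METRICS = {
    "(i)": (0.0916, 0.0772, 14.9238),
    "(ii)": (0.0870, 0.0674, 12.6250),
    "(iii)": (0.1746, 0.1475, 32.2359),
    "(iv)": (0.1348, 0.1174, 23.6284),
    "(v)": (0.0860, 0.0718, 13.7927),
    "(vi)": (0.2029, 0.1771, 41.0301),
    "(vii)": (0.1746, 0.1369, 28.7079),
    "(viii)": (0.3181, 0.2560, 64.5425),
}
METRIC_NAMES = ("RMSE", "MAE", "%delta")


def extreme_by_metric(table, pick=min):
    """For each metric, return the combination with the lowest (or highest) value."""
    result = {}
    for col, name in enumerate(METRIC_NAMES):
        result[name] = pick(table, key=lambda combo: table[combo][col])
    return result


best = extreme_by_metric(TEST_METRICS, min)
worst = extreme_by_metric(TEST_METRICS, max)
print(best)   # {'RMSE': '(v)', 'MAE': '(ii)', '%delta': '(ii)'}
print(worst)  # {'RMSE': '(viii)', 'MAE': '(viii)', '%delta': '(viii)'}
```

On these numbers, combination (ii) leads on MAE and %δ while combination (v) has a marginally lower RMSE, and combination (viii) is worst on all three indices.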

This program was prepared in MATLAB software, and the exponential kernel function and input combination (ii) were used in its structure. The program is simple, which is rarely the case in modeling with the SVR method. The three inputs of input combination (ii), namely *B*/*h*, Fr and Re, are taken from the user. After the exponential kernel function is applied in the calculations, the %SF_{w} is obtained.
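The prediction step of such a program can be sketched as a kernel expansion over the stored support vectors. The Python version below is not the authors' MATLAB code: the support vectors, coefficients, bias and σ are made-up placeholders, and it is assumed that the three inputs have been scaled to comparable ranges before use.

```python
import math


def exponential_kernel(x, y, sigma):
    """Exponential kernel exp(-||x - y|| / (2 sigma^2))."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-dist / (2 * sigma ** 2))


def svr_predict(x, support_vectors, dual_coefs, bias, sigma):
    """Evaluate sum_i coef_i * K(x_i, x) + bias, where coef_i = alpha_i^+ - alpha_i^-."""
    return bias + sum(
        coef * exponential_kernel(sv, x, sigma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )


# Placeholder model: each support vector is a scaled (B/h, Fr, Re) triple
SUPPORT_VECTORS = [(0.2, 0.5, 0.3), (0.8, 0.4, 0.7)]
DUAL_COEFS = [0.2, -0.1]
BIAS = 0.3
SIGMA = 1.0

# Hypothetical user input, already scaled
sf_w = svr_predict((0.5, 0.45, 0.5), SUPPORT_VECTORS, DUAL_COEFS, BIAS, SIGMA)
print(sf_w)
```

Once training has fixed the support vectors and coefficients, prediction reduces to this single weighted sum, which is what keeps the resulting program short.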

## Comparison between the proposed model and regression equations

%SF_{w}, then it can be deduced that the roughness parameter has a high influence on predicting %SF_{w}. On the other hand, although the roughness ratio was omitted in input combination (ii), this model could still predict accurate results for %SF_{w}.

%SF_{w} values were close to the measured values; the results were closer to the line of agreement in the scatter plot than those of the other equations. The trend line for the SVR model and each equation is shown in the scatter plots, where the gray solid line is for the SVR model and the black dashed line represents the other models. Interestingly, only the equation of Knight (1981) was able to give comparably close predictions; the other equations were not able to predict %SF_{w} values accurately. The performance of Eq. (2) is good for the data in the range of 1–20, but for the other data, the model overestimated %SF_{w}. As seen in Fig. 4, the values of %SF_{w} calculated using Eq. (3) lead to findings similar to those of Eq. (2) and show low performance in estimating %SF_{w}. The hydrographs show the residuals between the experimental data and the data predicted by the models. Evidently, the SVR model graph has a lower deviation from the straight line, whereas the other equations, which were presented for smooth channels [especially Eqs. (2) and (3)], demonstrated a higher deviation from the straight line.

## Conclusion

The SVR model was used to predict %SF_{w} in a rectangular channel with non-homogeneous boundary roughness. The SVR model was extended in two phases. First, the best kernel function was selected after investigating eight different kernel functions, whereby the exponential kernel function was found to be the most appropriate. Second, to study the influence of different parameters on the %SF_{w} values, four parameters were considered and eight input combinations were examined. The results showed that the influence of aspect ratio and relative roughness is higher in predicting the %SF_{w}. Input combination (ii), containing *B*/*h*, Fr and Re, was selected as the best input combination. Although the relative roughness was not included in input combination (ii), the proposed model was able to present accurate results in predicting %SF_{w}. A simple MATLAB code was presented as the output of the more appropriate SVR model. Finally, the best SVR model was compared with three equations presented by other researchers. It can be deduced that the equation for rough rectangular channels presented by Knight (1981) gave good predictions of %SF_{w}, but the SVR model, with an RMSE of 0.0565, performed better than Knight’s equation, with an RMSE of 0.0641. The equations for smooth channels derived by Knight et al. (1984) and Knight et al. (1994) showed the worst results for predicting %SF_{w}, with RMSE of 0.2413 and 0.4048, respectively. The smooth channel equations overestimated the values of %SF_{w} in channels with rough boundaries. As such, these equations are not applicable for predicting %SF_{w} in these channels. The SVR presents a high-performance model for predicting the hydraulic parameter %SF_{w} in rectangular channels with rough boundaries.

## Notes

### Acknowledgements

The first author acknowledges Universiti Kebangsaan Malaysia (Grant No. MI-2018-011) for financial support.

### Compliance with ethical standards

### Conflict of interest

The authors declare that they have no conflict of interest.

## References

- Azamathulla HM, Zahiri A (2012) Flow discharge prediction in compound channels using linear genetic programming. J Hydrol 454:203–207
- Bonakdari H, Sheikh Z, Tooshmalani M (2015a) Comparison between Shannon and Tsallis entropies for prediction of shear stress distribution in open channels. Stoch Environ Res Risk Assess 29:1–11. https://doi.org/10.1007/s00477-014-0959-3
- Bonakdari H, Tooshmalani M, Sheikh Z (2015b) Predicting shear stress distribution in rectangular channels using entropy concept. Int J Eng Trans A Basics 28:360–367. https://doi.org/10.5829/idosi.ije.2015.28.03c.04
- Bonakdari H, Sheikh Khozani Z, Zaji AH, Asadpour N (2018) Evaluating the apparent shear stress in prismatic compound channels using the genetic algorithm based on multi-layer perceptron: a comparative study. Appl Math Comput 338:400–411. https://doi.org/10.1016/j.amc.2018.06.016
- Çimen M (2008) Estimation of daily suspended sediments using support vector machines. Hydrol Sci J 53:656–666. https://doi.org/10.1623/hysj.53.3.656
- Hosseini D, Torabi M, Moghadam MA (2019) Preference assessment of energy and momentum equations over 2D-SKM method in compound channels. J Water Resour Eng Manag 6:24–34
- Knight DW (1981) Boundary shear in smooth and rough channels. J Hydraul Div 107:839–851
- Knight DW, Demetriou JD, Hamed ME (1984) Boundary shear stress in smooth rectangular channel. J Hydraul Eng 10(4):405–422
- Knight DW, Yuen KWH, Alhamid AAI (1994) Boundary shear stress distributions in open channel flow. In: Beven K, Chatwin PC, Millbark JJ (eds) Physical mechanisms of mixing and transport in the environment. Wiley, London
- Sheikh Z, Bonakdari H (2016) Prediction of boundary shear stress in circular and trapezoidal channels with entropy concept. Urban Water J 13:629–636. https://doi.org/10.1080/1573062x.2015.1011672
- Sheikh Khozani Z, Bonakdari H (2016) A comparison of five different models in predicting the shear stress distribution in straight compound channels. Sci Iran Trans A Civ Eng 23:2536–2545
- Sheikh Khozani Z, Bonakdari H (2018) Formulating the shear stress distribution in circular open channels based on the Renyi entropy. Phys A Stat Mech Appl 490:114–126. https://doi.org/10.1016/j.physa.2017.08.023
- Sheikh Khozani Z, Bonakdari H, Zaji AH (2016a) Application of a genetic algorithm in predicting the percentage of shear force carried by walls in smooth rectangular channels. Measurement 87:87–98
- Sheikh Khozani Z, Bonakdari H, Zaji AH (2016b) Application of a soft computing technique in predicting the percentage of shear force carried by walls in a rectangular channel with non-homogeneous roughness. Water Sci Technol 73:124–129. https://doi.org/10.2166/wst.2015.470
- Sheikh Khozani Z, Bonakdari H, Zaji AH (2017a) Estimating the shear stress distribution in circular channels based on the randomized neural network technique. Appl Soft Comput 58:441–448. https://doi.org/10.1016/j.asoc.2017.05.024
- Sheikh Khozani Z, Bonakdari H, Zaji AH (2017b) Estimating shear stress in a rectangular channel with rough boundaries using an optimized SVM method. Neural Comput Appl. https://doi.org/10.1007/s00521-016-2792-8
- Sheikh Khozani Z, Bonakdari H, Ebtehaj I (2018) An expert system for predicting shear stress distribution in circular open channels using gene expression programming. Water Sci Eng 11:167–176. https://doi.org/10.1016/j.wse.2018.07.001
- Sheikh Khozani Z, Khosravi K, Pham BT, Kløve B, Wan Mohtar WHM, Yaseen ZM (2019) Determination of compound channel apparent shear stress: application of novel data mining models. J Hydroinformatics. https://doi.org/10.2166/hydro.2019.037
- Smola AJ (1996) Regression estimation with support vector learning machines. Diplomarbeit, Technische Universität München
- Sterling M, Knight D (2002) An attempt at using the entropy approach to predict the transverse distribution of boundary shear stress in open channel flow. Stoch Environ Res Risk Assess 16:127–142
- Vapnik V (2000) The nature of statistical learning theory, 2nd edn. Springer, New York. ISBN-13: 978-0387987804
- Yang K, Nie R, Liu X, Cao S (2012) Modeling depth-averaged velocity and boundary shear stress in rectangular compound channels with secondary flows. J Hydraul Eng 139:76–83. https://doi.org/10.1061/(asce)hy.1943-7900.0000638
- Yilmaz B, Aras E, Nacar S, Kankal M (2018) Estimating suspended sediment load with multivariate adaptive regression spline, teaching-learning based optimization, and artificial bee colony models. Sci Total Environ 639:826–840. https://doi.org/10.1016/j.scitotenv.2018.05.153

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.