Gene Expression Programming (GEP) to predict coefficient of discharge for oblique side weir

In irrigation and drainage structures, side weirs are widely used to divert flow from main to branch channels. Side weirs are also used as discharge-measuring devices, so the discharge coefficient has been the main subject of many previous studies. The skew side weir has received little attention in the literature, so the present work studies the calculation of the discharge coefficient (Cd) for the skew side weir. Multiple Linear Regression (MLR) and Gene Expression Programming (GEP) were used to model Cd, and the results were compared with observed values. The mean absolute error between observed and calculated Cd did not exceed 5% for either MLR or GEP. The Cd values for the skew side weir ranged from 0.65 to 0.85, while the values for the straight vertical side weir from previous literature ranged from 0.45 to 0.65; this means the skew side weir can increase the discharge diverted to the branch channel at the same water levels by 27%. The Akaike information criterion (AIC), its small-sample correction (AICs), root-mean-square error (RMSE), mean absolute relative error (MARE) and scatter index (SI) are used in this study to measure the GEP model performance. The GEP model, with AIC = −216.51, AICs = −918.51, RMSE = 0.004653, MARE = 0.005234, R2 = 0.994 and SI = 0.006231, performed the best. According to these results, the new equation obtained through GEP can be adopted for calculating the discharge coefficient of skew side weirs.


y   Head of water (L)
v   Flow velocity (L/T)
g   Gravity acceleration (L/T^2)
P   Weir height (L)
q   Discharge per unit length (L^3/(T L))
S   Longitudinal slope
Cd  Coefficient of discharge
y1  Flow depth (L)
L   Weir length (L)
b   Channel width (

Introduction
A side weir is an overflow weir installed on the side of the main channel, which allows flow when the water rises above the crest. This type of flow is considered spatially varied flow. Side weirs are usually used as control structures and as head regulators in irrigation structures. Many studies deal with side weir hydraulics. Some deal with sharp-crested side weirs, such as El-Khashab and Smith (1976), Uyumaz and Smith (1991), Swamee et al. (1994), Hager (1987), Masoud (2003), Singh et al. (1994), Rao and Pillai (2008) and Delkash and Babak (2014); others deal with inclined and oblique side weirs, such as Mwafaq and Ahmed (2011), Honar and Javan (2007) and Amir et al. (2016). Numerical analysis of the inclined side weir was carried out by Ahmed (2011), Ahmed et al. (2013) and Ahmed (2015). Powerful tools have recently been used to solve complex nonlinear and multi-linear regression problems in hydraulic engineering, such as artificial neural networks (ANN), genetic programming (GP) and statistical analysis using the Monte Carlo method (Kisi et al. 2012; Ahmed 2018; Hayawi et al. 2019; Ahmed and Anna 2020).
In recent years, GEP and machine learning have been used to model the nonlinear problem of predicting the discharge coefficient of side weirs, for example by Isa et al. (2015) and Reza et al. (2020). The aim of this study is to develop an MLR equation and a GEP equation for calculating the coefficient of discharge of a skew side weir, to compare the two, and then to compare the resulting values with the Cd values estimated for the rectangular side weir.

Experimental methodology
According to Al-Talib (2012), the experimental work was carried out in a rectangular laboratory channel 10 m long, 0.3 m wide and 0.45 m deep, while the side channel was 0.15 m wide, 0.3 m deep and 2 m long. The discharge was measured using a standard sharp-crested weir with dimensions (0.15 × 0.3 × 0.01) m in the main channel. The side weirs were fixed at the entrance of the side channel at different angles, starting from 90° (perpendicular to the side channel) and decreasing to 30°. Five angles were tested (90°, 75°, 60°, 45° and 30°), inclined to the left of the flow direction (Fig. 1).
Five discharges were taken, ranging from 7.3 to 16.5 L/s. The actual discharge in the main channel was calculated from Eq. 1, where Qact = actual discharge and H = head over the standard weir.
Equation 1 can be established by trial and error from volumetric calculations. To obtain the actual discharge, the side channel was first closed and the depth of water over the standard weir at the end of the main channel was measured, which gave Q1 from Eq. 1. The side channel was then opened, the depth of water over the standard weir was measured again, and Eq. 1 gave the discharge for this case, Q2. The actual side channel discharge Q3 was then obtained from Eq. 2:
$$Q_3 = Q_1 - Q_2 \qquad (2)$$
where Q1 = actual discharge in the main channel when the side channel is closed, Q2 = actual discharge in the main channel when the side channel is open, and Q3 = actual side channel discharge, obtained by subtracting Q2 from Q1.

Theoretical methodology
According to the specific energy assumption (De Marchi 1934), the flow through the side weir depends on the head of water over the weir and on the flow velocity:
$$E = y + \frac{v^{2}}{2g} \qquad (3)$$
where E = specific energy, y = head of water, v = flow velocity and g = gravity acceleration.
In terms of the discharge Q, Eq. 3 can be rewritten as Eq. 4, where (b × y) = cross-sectional area and P = weir height. Following De Marchi, Eq. 4 leads to Eq. 5, where q = discharge per unit length, S = longitudinal slope and Cd = coefficient of discharge. Equation 5 holds for a rectangular channel with a side weir perpendicular to the channel bed. A skew side weir is not perpendicular to the channel bed, so the angle of the inclined side weir must be taken into account and Eq. 5 must be modified according to this angle (Fig. 1).
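For orientation, the classical textbook forms of the two relations that follow from the specific energy definition are sketched below. They are the standard De Marchi expressions for a rectangular channel, not a reproduction of the authors' Eqs. 4 and 5, which may differ (for example by including the slope S). The symbol x, the distance along the weir, is an assumption introduced here.

```latex
% Discharge in a rectangular channel of width b at constant specific energy E
Q = b\,y\,\sqrt{2g\,(E - y)}

% Classical De Marchi outflow per unit length of side weir (x = distance along the weir)
q = -\frac{dQ}{dx} = \frac{2}{3}\,C_d\,\sqrt{2g}\,\,(y - P)^{3/2}
```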

Dimensional analysis
Dimensional analysis is used to study the effect of the skew angle on the coefficient of discharge (Cd) relative to the standard side weir in a rectangular channel. The parameters involved in calculating Cd for a skew side weir are related by Eq. 6, where Cd = coefficient of discharge, v = flow velocity, y1 = flow depth, P = weir height, L = weir length, b = channel width, g = acceleration due to gravity, and the remaining parameter is the side weir angle.
Using the Buckingham Pi theorem, the parameters in Eq. 6 can be combined into the non-dimensional relation given as Eq. 7.
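A plausible form of the non-dimensional relation is sketched below. The groups are inferred from the terms that appear in the MLR model (Fr, P/y, L/b and the angle), so this is an assumption rather than the authors' Eq. 7. The symbol θ for the side weir angle and the use of y1 in the Froude number are also assumptions, since the original symbols were lost.

```latex
% Assumed dimensionless groups; \theta is a placeholder symbol for the side weir angle
C_d = f\!\left(Fr,\ \frac{P}{y_1},\ \frac{L}{b},\ \theta\right),
\qquad
Fr = \frac{v}{\sqrt{g\,y_1}}
```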

Modeling of skew side weir using MLR
Regression analysis has many applications, covering both linear and nonlinear analysis depending on the variables involved in the problem. To obtain a general equation for the skew side weir, several trial equation models were examined using the Statistical Package for the Social Sciences (SPSS user guide). According to Ahmed (2015), starting from Eq. 7 and testing several SPSS models, Eq. 8 can be developed as an MLR model with a coefficient of determination R2 of 0.958, where C1 to C5 = constants and the side weir angle is in radians.
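The fitting step behind a linear model of this shape can be sketched as ordinary least squares. The example below is a minimal illustration in plain Python, not the SPSS procedure used in the study. The data are synthetic and the coefficients in `true_c` are invented, so nothing here reproduces the paper's constants.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) c = X^T y."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 gives the intercept C1
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic, noise-free data. Predictor columns stand for Fr, P/y, L/b and
# the angle in radians; the coefficients are made up for illustration only.
random.seed(0)
true_c = [0.5, 0.2, -0.1, 0.05, 0.08]
rows = [[random.uniform(0.2, 0.8), random.uniform(0.3, 0.9),
         random.uniform(0.5, 1.5), random.uniform(0.5, 1.6)] for _ in range(25)]
targets = [true_c[0] + sum(c * x for c, x in zip(true_c[1:], r)) for r in rows]

coeffs = fit_mlr(rows, targets)  # recovers true_c up to rounding error
print([round(c, 6) for c in coeffs])
```

Because the synthetic targets contain no noise, the fitted coefficients match `true_c` to rounding error. With measured data the fit would be approximate, and its quality would be judged with statistics such as R2.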

GEP modeling for side weir
Gene Expression Programming (GEP) is an artificial evolutionary procedure that works with a genotype system. It was invented by Ferreira (2001, 2006) and is similar to genetic algorithms (GA) and genetic programming (GP). GA treats individuals as linear strings of fixed length (chromosomes), while GP treats individuals as nonlinear entities with different parse tree structures. In GEP, individuals are encoded as linear strings (chromosomes), which are then expressed as nonlinear entities. GEP has two main players: the chromosomes and the expression trees (ETs). Decoding the information in the chromosome is called translation, which implies a kind of code and a set of rules. The genetic code of GEP is simple: it is a relation between the symbols of the chromosome and the nodes they represent in the tree. The rules determine the arrangement of the nodes in the trees and the type of interaction between sub-ETs. GEP therefore relies on two languages: the language of the genes and the language of the expression trees. This bilingual notation is named Karva.
Figure 2 shows the expression tree (ET) for an example mathematical expression, (a × b) + √(c + d) (Mohd et al. 2015; Khalid and Negm 2008). When this ET is encoded in the Karva language, the result is called a K-expression. The ET is read starting from the root at the first (leftmost) position, then level by level from left to right, and each symbol is appended to the K-expression as it is read; the K-expression for the example above is therefore +×√ab+cd.
Figure 3 shows the steps of GEP. The process begins with the random generation of the chromosomes of the initial population. These chromosomes are then expressed as expression trees, which are executed to evaluate their fitness. Individuals are then selected according to their fitness to reproduce with modification, and the new individuals are subjected to the same development. This process is repeated until a good solution is found.
The basis of GEP is the structure of the GEP gene (Ferreira 2004). The simple structure of the genes allows any conceivable program to be encoded and allows its efficient evolution; owing to this versatile structural organization, a powerful set of genetic operators can be implemented to search for solutions efficiently (Ferreira 2002).
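The translation from a K-expression to an expression tree can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the software used in the study: `*` stands for ×, `Q` stands for the square root, and the function names `decode_kexpression`, `to_infix` and `evaluate` are hypothetical. It decodes the K-expression of the example expression by filling the tree level by level, left to right.

```python
import math

# Number of arguments of each function symbol; any other symbol is a terminal.
ARITY = {"+": 2, "-": 2, "*": 2, "/": 2, "Q": 1}  # "Q" denotes square root


def decode_kexpression(kexpr):
    """Build an expression tree from a K-expression, level by level."""
    symbols = list(kexpr)
    root = {"sym": symbols[0], "children": []}
    queue = [root]  # nodes still waiting for their arguments
    pos = 1         # next unread symbol of the K-expression
    while queue:
        node = queue.pop(0)
        for _ in range(ARITY.get(node["sym"], 0)):
            child = {"sym": symbols[pos], "children": []}
            pos += 1
            node["children"].append(child)
            queue.append(child)
    return root     # symbols after position pos are non-coding and ignored


def to_infix(node):
    """Render the tree as a conventional infix string."""
    args = [to_infix(c) for c in node["children"]]
    if len(args) == 2:
        return "(" + args[0] + " " + node["sym"] + " " + args[1] + ")"
    if len(args) == 1:
        return "sqrt(" + args[0] + ")"
    return node["sym"]


def evaluate(node, values):
    """Evaluate the tree for a dictionary of terminal values."""
    sym = node["sym"]
    args = [evaluate(c, values) for c in node["children"]]
    if sym == "+":
        return args[0] + args[1]
    if sym == "-":
        return args[0] - args[1]
    if sym == "*":
        return args[0] * args[1]
    if sym == "/":
        return args[0] / args[1]
    if sym == "Q":
        return math.sqrt(args[0])
    return values[sym]


tree = decode_kexpression("+*Qab+cd")
print(to_infix(tree))                                     # ((a * b) + sqrt((c + d)))
print(evaluate(tree, {"a": 2, "b": 3, "c": 5, "d": 4}))   # 9.0
```

The queue is what makes the decoding breadth-first: each function node takes its arguments from the next unread symbols, which mirrors the left-to-right, top-to-bottom reading of the tree.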
$$C_d = C_1 + C_2\,Fr + C_3\,\frac{P}{y} + C_4\,\frac{L}{b} + C_5\,(\text{side weir angle}) \qquad (8)$$
The equation obtained from GEP is given as Eq. 9. The corresponding expression tree for that equation is given in Fig. 4.

Results and discussion
The settings of the genetic operation parameters are presented in Table 1, while Table 2 gives the statistics obtained from GEP after testing more than 1000 equation models and running for more than 11 h. The GEP and MLR results of this study, together with the equations from previous studies listed in Table 3, are compared in terms of the coefficient of determination (R2), root-mean-square error (RMSE), Akaike information criterion (AIC), AICs (AIC with a correction for small sample sizes), mean absolute relative error (MARE) and scatter index (SI). These values are presented in Table 4, and the criteria are defined by Eq. 10 and the equations that follow it.
Table 4 presents a statistical comparison of the present work with previous studies. The GEP model has the highest value of R2 (0.994) and the lowest values of MARE and RMSE (0.00523 and 0.00465, respectively), as well as the best AIC (−216.51) of all the equations compared. All of this indicates that GEP performs best among the equations considered; overall, the values show good agreement for the equation of the present work compared with the MLR model of Ahmed (2015) and all other previous equations.
Figure 5 shows a comparison between the discharge coefficients estimated using the MLR and GEP models, while Figs. 6 and 7 compare the discharge coefficient estimated using the MLR model and the GEP model, respectively, with the observed coefficient of discharge. Figures 5, 6 and 7 show agreement between the coefficients of discharge computed from the different models and the observed values, with a relative error below 5%. Figure 8 presents the discharge coefficient estimated from the equations shown in Table 3, together with the value estimated from the present model and the observed value.
According to these results, the values estimated from previous equations range from 0.45 to 0.75 for the vertical side weir, while in the present work the values range between 0.65 and 0.85 for the skew side weir.
This means the discharge coefficient of the skew side weir is greater than that of vertical side weirs, and for the skew side weir it increases as the side angle increases.
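The performance criteria named in this section can be computed as sketched below. The formulas are common textbook definitions and are assumptions here, since the paper's own defining equations are not available in this text. In particular, AIC is taken in its least-squares form n·ln(RSS/n) + 2k and the small-sample correction as AIC + 2k(k+1)/(n−k−1), which may differ from the authors' forms. The sample values are hypothetical.

```python
import math


def performance_metrics(observed, predicted, n_params):
    """Return RMSE, MARE, SI, R2, AIC and corrected AIC for a fitted model."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    rss = sum(e * e for e in residuals)            # residual sum of squares
    mean_obs = sum(observed) / n
    tss = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(rss / n)
    aic = n * math.log(rss / n) + 2 * n_params     # assumed least-squares form
    return {
        "RMSE": rmse,
        "MARE": sum(abs(e) / abs(o) for e, o in zip(residuals, observed)) / n,
        "SI": rmse / mean_obs,                     # RMSE scaled by the mean observation
        "R2": 1.0 - rss / tss,
        "AIC": aic,
        "AICc": aic + 2 * n_params * (n_params + 1) / (n - n_params - 1),
    }


# Hypothetical observed and predicted Cd values, for illustration only.
observed = [0.66, 0.70, 0.74, 0.79, 0.84]
predicted = [0.665, 0.695, 0.745, 0.785, 0.835]
metrics = performance_metrics(observed, predicted, n_params=2)
for name, value in metrics.items():
    print(name, round(value, 5))
```

Lower RMSE, MARE, SI and AIC indicate a better model, while R2 closer to 1 indicates a better fit, which is how the comparison in Table 4 is read.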

Conclusion
In the present study, Gene Expression Programming (GEP) was used to derive an equation for calculating the coefficient of discharge of a skew side weir in a rectangular channel. This equation was compared with the equation obtained from Multiple Linear Regression (MLR), which was estimated with statistical tools. Both methods give good results compared with the observed values, with an absolute error not exceeding 5% for both methods, correlation coefficients of 0.9 and 0.996 for MLR and GEP, respectively, and root-mean-square errors (RMSE) of 0.0123 and 0.0046 for MLR and GEP, respectively. Compared with the other equations considered, the results show that the modeling and fitting accuracy of GEP is better than that of the other methods.
[Fig. 8 legend: Cd from Jalili and Borghei (1996), Borghei et al. (1999), Borghei et al. (2003), Ali S. et al. (2018) and the present work (GEP)]