Unit hydrograph identification based on fuzzy regression analysis

Evolving Systems

Abstract

A methodology has been developed to treat the uncertainties of the unit hydrograph rainfall-runoff transformation based on the fuzzy linear regression model. The components of the unit hydrograph are proposed to be symmetric triangular fuzzy numbers. The main idea is to develop a fuzzified version of the widely used unit hydrograph. The input of the model is the effective rainfall (a set of crisp numbers) and the output is the runoff (a set of fuzzy numbers). The problem of determining the fuzzy components of the unit hydrograph is transformed into a linear optimization problem based on the fuzzy linear regression of Tanaka. The condition applied is that the observed data must be included in the fuzzy band of discharge produced by the fuzzy unit hydrograph. Some modifications of the widely used fuzzy regression are proposed so that it is compatible with the unit hydrograph implementation. The model was tested on twenty real rainfall-runoff events, and the conclusion is that it works well in the case of an individual storm event. However, many storms need to be considered in order to increase the applicability of the produced fuzzy hydrograph to other storms, which in turn increases the fuzziness. Several measures of suitability are proposed in order to check the efficiency of the proposed method. Further comments are made on the comparison with other unit hydrograph approaches.

Author information

Corresponding author

Correspondence to Mike Spiliotis.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix I

The objective function of Tanaka et al. (1982) is adopted first in our method. For this study, it is equivalent to the newer formulation of the fuzzy linear regression, which minimizes the total spread of the fuzzy output (e.g. the formulation of Tanaka et al. (1989)):

$$J_{2} = \sum\limits_{j = 1}^{m} {\sum\limits_{{\text{i} = \text{1}}}^{\text{n}} {\text{c}_{\text{i}} \left| {r_{ij}^{\prime } } \right|} }$$
(A1)

[newer formulation by Tanaka (1987)].

By following the definition of Eq. (1), if we have the data of a complete rainfall-runoff episode, then the following holds:

$$J_{2} = \sum\limits_{j = 1}^{m} {\sum\limits_{{\text{i} = \text{1}}}^{\text{n}} {\text{c}_{\text{i}} \left| {r_{ij}^{\prime } } \right|} } = h_{effective} \sum\limits_{{\text{i} = \text{1}}}^{\text{n}} {\text{c}_{\text{i}} }$$
(A2)

where, for each column, the sum \(\left( {r_{i1}^{\prime } + ... + r_{im}^{\prime } } \right)\) is equal to the total amount of the effective rainfall, \(h_{effective}\), which is known. Therefore, the two objective functions are equivalent with respect to the optimization process.

The direct runoff \(Q_{j}\) over discrete time intervals can be calculated from the effective rainfall r' as follows, based on the conventional crisp equation:

$$Q_{j} = \sum\limits_{m = 1}^{j \le R} {r_{m}^{\prime } h_{j - m + 1} }$$
(A3)

where j is the discrete time step, which includes the j pulses of input. In the usual unit hydrograph theory, the number of unit hydrograph components is equal to M (number of data), whereas R is equal to the number of rainfall periods plus one.

For example, without loss of generality, when we have six time steps of direct runoff and three consecutive effective rainfall pulses (e.g., 1 h duration each), we should consider four components of the corresponding unit hydrograph, so that the above equation can be written as follows (Bras 1989):

$$\begin{aligned} r_{1} h_{1} &= Q_{1} \\ r_{2} h_{1} + r_{1} h_{2} &= Q_{2} \\ r_{3} h_{1} + r_{2} h_{2} + r_{1} h_{3} &= Q_{3} \\ r_{3} h_{2} + r_{2} h_{3} + r_{1} h_{4} &= Q_{4} \\ r_{3} h_{3} + r_{2} h_{4} &= Q_{5} \\ r_{3} h_{4} &= Q_{6} \end{aligned}$$

In the matrix form of the crisp Eq. (A3), the above system can be written as follows:

$${\mathbf{Q}} = {\mathbf{r}}_{{\mathbf{m}}}^{\prime } \, {\mathbf{h}}$$

To be more specific, in the previous example, the matrices are the following:

$${\mathbf{Q}} = \left[ \begin{gathered} Q_{1} \hfill \\ Q_{2} \hfill \\ Q_{3} \hfill \\ Q_{4} \hfill \\ Q_{5} \hfill \\ Q_{6} \hfill \\ \end{gathered} \right]\text{,}\,\,{\mathbf{h}} = \left[ \begin{gathered} h_{1} \hfill \\ h_{2} \hfill \\ h_{3} \hfill \\ h_{4} \hfill \\ \end{gathered} \right]\quad {\text{and}}\quad {\mathbf{r}}_{{\mathbf{m}}}^{\prime } = \left[ {\begin{array}{*{20}c} {r_{1} } & 0 & 0 & 0 \\ {r_{2} } & {r_{1} } & 0 & 0 \\ {\,r_{3} } & {r_{2} } & {r_{1} } & 0 \\ 0 & {\,r_{3} } & {r_{2} } & {r_{1} } \\ 0 & 0 & {\,r_{3} } & {r_{2} } \\ 0 & 0 & 0 & {\,r_{3} } \\ \end{array} } \right]$$
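The matrix form can be checked numerically. The following is a minimal Python sketch, assuming illustrative (not measured) values for the three rainfall pulses and the four unit hydrograph ordinates; the helper names `convolution_matrix` and `direct_runoff` are hypothetical and are not part of the original method.

```python
def convolution_matrix(r, n_h):
    """Build the (len(r) + n_h - 1) x n_h rainfall matrix r'_m of the example."""
    n_q = len(r) + n_h - 1
    rows = []
    for j in range(n_q):            # j: runoff time step (0-based)
        row = []
        for i in range(n_h):        # i: unit hydrograph ordinate (0-based)
            k = j - i               # index of the rainfall pulse that multiplies h_i
            row.append(r[k] if 0 <= k < len(r) else 0.0)
        rows.append(row)
    return rows


def direct_runoff(r, h):
    """Evaluate Q = r'_m h, i.e. the discrete convolution of Eq. (A3)."""
    matrix = convolution_matrix(r, len(h))
    return [sum(a * b for a, b in zip(row, h)) for row in matrix]


# Illustrative values only (assumed, not taken from the article).
r = [2.0, 3.0, 1.0]            # effective rainfall pulses r_1..r_3
h = [0.1, 0.4, 0.3, 0.2]       # crisp unit hydrograph ordinates h_1..h_4 (sum to one)
Q = direct_runoff(r, h)        # six direct runoff values Q_1..Q_6
```

With ordinates that sum to one, the total of `Q` equals the total effective rainfall, which is a quick sanity check on the matrix layout.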

In our case, instead of the crisp components of the crisp unit hydrograph, the fuzzy components should be defined, \({\tilde{\mathbf{h}}}\). As mentioned before, since the symmetric triangular fuzzy numbers are selected, the fuzzy components can be described by the centers, \({\mathbf{h}}\) and the widths \({\mathbf{c}}\) of the fuzzy components:

$$\,{\mathbf{h}} = \left[ \begin{gathered} h_{1} \hfill \\ h_{2} \hfill \\ h_{3} \hfill \\ h_{4} \hfill \\ \end{gathered} \right],\,\,{\mathbf{c}} = \left[ \begin{gathered} \text{c}_{\text{1}} \hfill \\ \text{c}_{\text{2}} \hfill \\ \text{c}_{\text{3}} \hfill \\ \text{c}_{\text{4}} \hfill \\ \end{gathered} \right]$$

If we select the newer objective function (Tanaka et al. 1989), that is, the sum of the spreads of the fuzzy output for all the data (here, m = 6), it holds:

$$J_{2} = \sum\limits_{j = 1}^{6} {\sum\limits_{{\text{i} = \text{1}}}^{4} {\text{c}_{\text{i}} \left| {r_{ij}^{\prime } } \right|} } = \left( {r_{1} + r_{2} + r_{3} } \right)\sum\limits_{{\text{i} = \text{1}}}^{4} {\text{c}_{\text{i}} } = h_{effective} \cdot \sum\limits_{{\text{i} = \text{1}}}^{4} {\text{c}_{\text{i}} } ,\,\,\,\,\,\,\,\,r_{1} + r_{2} + r_{3} = h_{effective}$$
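The equality between the double sum and the shortcut \(h_{effective} \cdot \sum c_{i}\) can be verified with a few lines of Python. This is only a sketch: the rainfall pulses and the semi-widths are assumed values chosen for illustration, and `total_spread` is a hypothetical helper name.

```python
def total_spread(r, c):
    """J_2 = sum over j and i of c_i * |r'_ij| for the convolution matrix r'_m."""
    n_q = len(r) + len(c) - 1
    total = 0.0
    for j in range(n_q):                 # rows: runoff time steps
        for i, c_i in enumerate(c):      # columns: unit hydrograph components
            k = j - i                    # rainfall pulse found in row j, column i
            if 0 <= k < len(r):
                total += c_i * abs(r[k])
    return total


# Illustrative values only (assumed, not taken from the article).
r = [2.0, 3.0, 1.0]                # effective rainfall pulses, h_effective = 6
c = [0.02, 0.05, 0.04, 0.03]       # semi-widths c_1..c_4 of the fuzzy ordinates
j2 = total_spread(r, c)
shortcut = sum(r) * sum(c)         # h_effective * (c_1 + ... + c_4)
```

Both `j2` and `shortcut` evaluate to the same number because every column of the matrix contains each rainfall pulse exactly once.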

Appendix II

The method proposed by Bárdossy et al. (2006) is primarily based on using the Gamma probability density function as a unit hydrograph with fuzzy parameters. The use of (crisp) probability density functions as unit hydrographs has been proposed by some researchers (Ghorbani et al. 2017), whilst the use of the Gamma probability density function can be interpreted through the Nash cascade model (Nash 1960) based on the instantaneous unit hydrograph. Hence, based on the Bárdossy et al. (2006) approach, the following fuzzified Gamma probability density function can be used:

$$h\left( {t,K,n} \right) = \frac{1}{{\tilde{K} \cdot \Gamma \left( {\tilde{n}} \right)}}\left( {\frac{t}{{\tilde{K}}}} \right)^{{\tilde{n} - 1}} e^{{\left( { - \frac{t}{{\tilde{K}}}} \right)}}$$
(A4)

where \(\tilde{K},\,\tilde{n}\) are selected to be symmetric triangular fuzzy numbers.
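To make Eq. (A4) concrete, the sketch below evaluates the crisp Gamma-shaped ordinate for one parameter pair and computes the α-cut interval of a symmetric triangular fuzzy number. The centers and semi-widths are assumed for illustration only, and the function names `gamma_uh` and `alpha_cut` are hypothetical.

```python
import math


def gamma_uh(t, K, n):
    """Crisp Gamma-shaped unit hydrograph ordinate, Eq. (A4) with crisp K and n."""
    return (1.0 / (K * math.gamma(n))) * (t / K) ** (n - 1.0) * math.exp(-t / K)


def alpha_cut(center, semi_width, alpha):
    """Interval [lower, upper] of a symmetric triangular fuzzy number at level alpha."""
    half = (1.0 - alpha) * semi_width
    return center - half, center + half


# Illustrative fuzzy parameters (assumed): K = (2.0, 0.5), n = (3.0, 0.4).
K_cut = alpha_cut(2.0, 0.5, 0.0)     # zero cut of K
n_cut = alpha_cut(3.0, 0.4, 0.0)     # zero cut of n

# Ordinates at unit time steps for the central (crisp) parameter values.
ordinates = [gamma_uh(t, 2.0, 3.0) for t in range(1, 25)]
```

Any crisp pair taken from `K_cut` and `n_cut` yields one admissible member of the fuzzy family of unit hydrographs.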

In order to select the fuzzy parameters of the distribution, a heuristic optimization is implemented, which enables us to use a simulation procedure. In this article, particle swarm optimization (PSO) was used as the heuristic optimization procedure, in which a swarm of possible solutions is randomly initialized. For each possible solution (member of the swarm), a simulation procedure can be implemented separately.

Hence, for each hybrid fuzzy probability density function, the final outflow is calculated for each time step. Since the unit hydrograph has fuzzy parameters, the output will be a fuzzy band. Separately, the boundaries of the α-cuts can be determined based on the extension principle of the fuzzy sets for each time step j:

$$\left( {q_{j} } \right)_{a}^{ - } = \min \left\{ {\sum\limits_{m = 1}^{j \le R} {r_{m}^{\prime } h_{j - m + 1} } \,\,\,\left| {\sum\limits_{m = 1}^{M} {h_{m} } } \right. = \,\,\,1,\,\,h_{m} = \frac{1}{{x_{1} \cdot \Gamma \left( {x_{2} } \right)}}\left( {\frac{m}{{x_{1} }}} \right)^{{x_{2} - 1}} e^{{\left( -{\frac{m}{{x_{1} }}} \right)}} \,,\,\,x_{1} \in \left[ K \right]_{a} ,\,\,x_{2} \in \left[ n \right]_{a} \,\,} \right\}$$
(A5)
$$\left( {q_{j} } \right)_{a}^{ + } = \max \left\{ {\sum\limits_{m = 1}^{j \le R} {r_{m}^{\prime } h_{j - m + 1} } \,\,\,\left| {\sum\limits_{m = 1}^{M} {h_{m} } } \right. = \,\,\,1,\,\,h_{m} = \frac{1}{{x_{1} \cdot \Gamma \left( {x_{2} } \right)}}\left( {\frac{m}{{x_{1} }}} \right)^{{x_{2} - 1}} e^{{\left( -{\frac{m}{{x_{1} }}} \right)}} \,,\,\,x_{1} \in \left[ K \right]_{a} ,\,\,x_{2} \in \left[ n \right]_{a} \,\,} \right\}$$
(A6)

where \(\left( {q_{j} } \right)_{a}^{ - } ,\,\,\left( {q_{j} } \right)_{a}^{ + }\) are the left-hand and right-hand boundaries of the fuzzy outflow band. In other words, for each time step, the sum of the used ordinates of the unit hydrograph must be equal to one and the values of the parameters n and K must lie within the corresponding α-cuts, \(\left[ n \right]_{a}\) and \(\left[ K \right]_{a}\), respectively. It should be clarified that a small allowable error is permitted regarding the equality constraint that the sum of the used ordinates of the unit hydrograph equals one (Bhattacharjya 2004).

Hence, this approach overcomes the difficulty that the sum of the used unit hydrograph ordinates must be equal to one, since it applies the extension principle step by step in order to determine the discharge at each time step. This means that at each time step, one maximization and one minimization problem are solved. The described procedure can be seen as an internal algorithm (loop) nested within the heuristic optimization problem.
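The inner loop can be illustrated with a rough Python sketch. It is not the article's procedure: the minimization and maximization of Eqs. (A5) and (A6) are approximated here by a coarse grid over the α-cuts, and the ordinates are simply rescaled to sum to one instead of imposing the equality constraint with a tolerance. All numerical values and function names are assumptions made for illustration.

```python
import math


def gamma_ordinates(K, n, n_ord):
    """Gamma-shaped ordinates at m = 1..n_ord, rescaled so that they sum to one."""
    raw = [(1.0 / (K * math.gamma(n))) * (m / K) ** (n - 1.0) * math.exp(-m / K)
           for m in range(1, n_ord + 1)]
    total = sum(raw)
    return [v / total for v in raw]


def runoff(r, h):
    """Discrete convolution of Eq. (A3)."""
    q = [0.0] * (len(r) + len(h) - 1)
    for m, r_m in enumerate(r):
        for i, h_i in enumerate(h):
            q[m + i] += r_m * h_i
    return q


def linspace(lo, hi, steps):
    """Evenly spaced values from lo to hi (both included)."""
    return [lo + (hi - lo) * s / (steps - 1) for s in range(steps)]


def fuzzy_band(r, K_cut, n_cut, n_ord, steps=11):
    """Approximate lower and upper discharge per time step over the given alpha-cuts."""
    lower = upper = None
    for K in linspace(K_cut[0], K_cut[1], steps):
        for n in linspace(n_cut[0], n_cut[1], steps):
            q = runoff(r, gamma_ordinates(K, n, n_ord))
            if lower is None:
                lower, upper = list(q), list(q)
            else:
                # The minimum and maximum are taken separately at every time step.
                lower = [min(a, b) for a, b in zip(lower, q)]
                upper = [max(a, b) for a, b in zip(upper, q)]
    return lower, upper


# Illustrative inputs (assumed): three rainfall pulses and zero cuts of K and n.
r = [2.0, 3.0, 1.0]
lower, upper = fuzzy_band(r, (1.5, 2.5), (2.6, 3.4), n_ord=12)
```

A finer grid or a proper optimizer would tighten the approximation; the grid is used only to keep the sketch short and dependency-free.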

After the end of the simulation procedure for all the time steps, a fitness function must be established. The fitness function will contain the objective function as well as the constraints which are not included in the simulation, as a penalty. The objective function aims at the minimum fuzzy band whilst the constraints force the solution to include all the observed data for a selected α-cut, an idea which comes from the fuzzy regression analysis of Tanaka.

For simplicity, in this article, the zero cut is used both to check the inclusion constraints and to formulate the objective function; that is, the magnitude of the total fuzzy band is based on the zero cut instead of the whole area of the membership function of the produced fuzzy discharge. The penalty terms are formulated accordingly. The final fitness function is the following:

$$M \cdot \mathop {\left\{ {\sum\limits_{j = 1}^{m} {a_{{R_{j} }} \left( {\,q_{j}^{observed} - q_{j}^{ + } } \right)^{2} } + \sum\limits_{j = 1}^{m} {a_{{L_{j} }} \left( {q_{j}^{ - } - q_{j}^{observed} } \right)^{2} } } \right\}}\limits_{penalty} + \mathop {\sum\limits_{j = 1}^{m} {\left( {\,q_{j}^{ + } - q_{j}^{ - } } \right)^{2} } }\limits_{objective\,function} \to \min$$
(A7)

where \(q_{j}^{ + } ,\,\,\,q_{j}^{ - }\) are the right and the left boundaries (regarding the zero cut) of the fuzzy discharge based on the extension principle and M is a large positive number, whilst all the other symbols remain the same, as they were explained in the main text.
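A possible reading of Eq. (A7) is sketched below in Python. Since the coefficients \(a_{R_{j}}\) and \(a_{L_{j}}\) are defined in the main text, which is not reproduced here, the sketch assumes that they act as indicators that equal one only when the observation falls outside the corresponding boundary; this is an assumption, as are the sample values and the name `fitness`.

```python
def fitness(q_obs, q_lower, q_upper, big_m=1.0e6):
    """Penalized fitness in the spirit of Eq. (A7), evaluated on the zero cut."""
    penalty = 0.0
    spread = 0.0
    for obs, lo, hi in zip(q_obs, q_lower, q_upper):
        # Assumed indicator coefficients: active only when inclusion is violated.
        a_right = 1.0 if obs > hi else 0.0
        a_left = 1.0 if obs < lo else 0.0
        penalty += a_right * (obs - hi) ** 2 + a_left * (lo - obs) ** 2
        spread += (hi - lo) ** 2          # objective: squared width of the band
    return big_m * penalty + spread


# Illustrative data (assumed): three observed discharges and two candidate bands.
q_obs = [0.2, 1.1, 1.9]
inside = fitness(q_obs, [0.1, 1.0, 1.5], [0.3, 1.3, 2.0])    # all data included
outside = fitness(q_obs, [0.1, 1.0, 1.5], [0.3, 1.3, 1.8])   # last point excluded
```

The narrower second band has a smaller spread, but the large multiplier makes it far worse overall because one observation lies outside it.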

Regarding the heuristic optimization method used (to calibrate the fuzzy parameters of the Gamma distribution), PSO is a stochastic global optimization method based on the simulation of a swarm. PSO can deal with nonlinear optimization problems in non-convex domains (Ostadrahimi 2012). Each candidate solution is called a particle, and the set of possible solutions in each iteration forms the ‘swarm’ (Spiliotis et al. 2016). The main idea is that each particle moves towards its best previous position and towards the best particle in the whole swarm (Eberhart et al. 1996; Parsopoulos and Vrahatis 2002). Based on their analysis of the initial PSO method, Clerc and Kennedy proposed the following equations for the new velocities and positions:

$$\left\{ \begin{gathered} \upsilon_{i} \left( {t + 1} \right) = \chi \left[ {\upsilon_{i} \left( t \right) + c_{1} \rho_{1} \left( {} \right) \cdot \left( {p_{i} - x_{i} \left( t \right)} \right) + c_{2} \rho_{2} \left( {} \right) \cdot \left( {p_{g} - x_{i} \left( t \right)} \right)} \right] \hfill \\ x_{i} \left( {t + 1} \right) = x_{i} \left( t \right) + \upsilon_{i} \left( {t + 1} \right) \hfill \\ \end{gathered} \right.$$
(A8)

where χ is a parameter called constriction coefficient or constriction factor:

$$\chi = \frac{2}{{\left| {2 - \varphi - \sqrt {\varphi^{2} - 4\varphi } } \right|}},\,\,\varphi = c_{1} + c_{2} ,\,\,\varphi > 4$$
(A9)

whilst, based on the literature, the following parameters are proposed (Clerc and Kennedy 2002; Eberhart and Kennedy 1995):

$$\chi = 0.729,\,\,c_{1} = c_{2} = 2.05$$
(A10)

In addition, ρ() is a vector of random numbers uniformly distributed in the open interval (0, 1) that is generated at each iteration and for each particle, \(p_{i}\) is the best previously visited position of the ith particle (partial optimum) and \(p_{g}\) is the global best previously visited position of all particles (global optimum). Furthermore, the term \(c_{1} \rho_{1} \left( {} \right) \cdot \left( {p_{i} - x_{i} \left( t \right)} \right)\), which associates the particle’s own experience with its current position, is weighted by the constant \(c_{1}\) and is called individuality (cognitive acceleration). The term \(c_{2} \rho_{2} \left( {} \right) \cdot \left( {p_{g} - x_{i} \left( t \right)} \right)\) is associated with the social interaction between the particles of the swarm, is weighted by the constant \(c_{2}\), and is called sociality (social acceleration).

In this study, the proposals of Clerc and Kennedy (2002) are adopted in the implementation of the PSO.
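The update rules of Eqs. (A8) to (A10) can be sketched in plain Python as follows. This is a generic constriction-type PSO written for illustration, not the implementation used in the study: the swarm size, the number of iterations, the zero initial velocities, the absence of bound handling and the sphere test function are all assumptions, and `random.random()` samples the half-open interval [0, 1) rather than the open interval.

```python
import math
import random


def constriction(c1, c2):
    """Constriction coefficient of Eq. (A9); requires c1 + c2 > 4."""
    phi = c1 + c2
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))


def pso(objective, bounds, n_particles=20, n_iter=200, c1=2.05, c2=2.05, seed=0):
    """Minimize `objective` with the constriction-type update of Eq. (A8)."""
    rng = random.Random(seed)
    chi = constriction(c1, c2)
    dim = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]      # assumed zero start velocity
    p_best = [list(p) for p in x]                       # best position per particle
    p_val = [objective(p) for p in x]
    g = min(range(n_particles), key=lambda i: p_val[i])
    g_best, g_val = list(p_best[g]), p_val[g]           # best position of the swarm
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                rho1, rho2 = rng.random(), rng.random()
                v[i][d] = chi * (v[i][d]
                                 + c1 * rho1 * (p_best[i][d] - x[i][d])   # individuality
                                 + c2 * rho2 * (g_best[d] - x[i][d]))     # sociality
                x[i][d] += v[i][d]
            val = objective(x[i])
            if val < p_val[i]:
                p_best[i], p_val[i] = list(x[i]), val
                if val < g_val:
                    g_best, g_val = list(x[i]), val
    return g_best, g_val


def sphere(p):
    """Simple convex test function with its minimum at the origin."""
    return sum(t * t for t in p)


best, value = pso(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

In the calibration described above, the objective passed to such a routine would be the penalized fitness of Eq. (A7), with each particle encoding the centers and semi-widths of the fuzzy parameters.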

Appendix III

The support vector machine algorithm was first developed to solve classification problems and was later extended to regression problems (support vector regression, SVR). Instead of a crisp value, ε-SVR provides a tube with constant semi-width ε (ε > 0) which contains all the observed data. Hence, the constraints are formulated as follows (Vapnik 1995; Samadianfard et al. 2019):

$$\begin{gathered} \hfill \\ \left| {y_{j} - \left\langle {w,x_{j} } \right\rangle - b} \right| \le \varepsilon \Leftrightarrow - \varepsilon \le y_{j} - \left\langle {w,x_{j} } \right\rangle - b \le \varepsilon \Leftrightarrow \hfill \\ \Leftrightarrow \left. \begin{gathered} y_{j} \le \left\langle {w,x_{j} } \right\rangle + \left( {b + \varepsilon } \right) \hfill \\ and \hfill \\ y_{j} \ge \left\langle {w,x_{j} } \right\rangle + \left( {b - \varepsilon } \right) \hfill \\ \end{gathered} \right\} \hfill \\ \end{gathered}$$
(A11)

Without loss of generality, let us consider one independent variable and, moreover, a symmetric fuzzy number with central value equal to b and semi-width equal to ε. If the following fuzzy linear regression model is adopted

$$\tilde{y}_{j} = \tilde{A}_{0} + A_{1} x_{j}$$
(A12)

that is, the fuzzy linear regression without uncertainty in the independent variable coefficient, then for h = 0, the inclusion constraint becomes:

$$\left. \begin{gathered} y_{j} \le A_{1} x_{j} + \left( {b + \varepsilon } \right) \hfill \\ and \hfill \\ y_{j} \ge A_{1} x_{j} + \left( {b - \varepsilon } \right) \hfill \\ \end{gathered} \right\} \Leftrightarrow - \varepsilon \le y_{j} - A_{1} x_{j} - b \le \varepsilon$$
(A13)

Therefore, by using fuzzy sets only in the constant term, the parameter ε can be seen as the semi-width of the constant term. Hence, in this case the inclusion constraints are equivalent.
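The equivalence can be checked on a few points with the short Python sketch below. The slope, the constant term, the value of ε and the data points are assumed for illustration, and both function names are hypothetical.

```python
def in_epsilon_tube(y, x, w, b, eps):
    """Epsilon-tube constraint of Eq. (A11) for one independent variable."""
    return abs(y - w * x - b) <= eps


def in_fuzzy_band(y, x, a1, center, semi_width, h=0.0):
    """Inclusion of y in the h-cut of A0 + a1 * x, with fuzzy A0 = (center, semi_width)."""
    half = (1.0 - h) * semi_width
    lower = a1 * x + center - half
    upper = a1 * x + center + half
    return lower <= y <= upper


# Illustrative data (assumed): (x, y) pairs, a crisp slope and a fuzzy constant term.
points = [(1.0, 2.4), (2.0, 4.1), (3.0, 7.5)]
w, b, eps = 2.0, 0.2, 0.5
tube = [in_epsilon_tube(y, x, w, b, eps) for x, y in points]
band = [in_fuzzy_band(y, x, w, b, eps) for x, y in points]
```

For h = 0 the two lists agree point by point, which is the content of the equivalence stated above.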

In the case of ε-tube regression, the parameter ε is selected a priori, whilst in the case of fuzzy regression the semi-width is produced by an optimization model; in other words, in fuzzy regression the parameter ε is a decision variable, as is the central value of the constant term.

The ε-SVR aims at a function that is as flat as possible, whilst this goal does not exist in the fuzzy regression methodology, which uses the total spread of the produced fuzzy band as the objective function. In addition, other versions of the ε-SVR allow some points to lie outside the produced ε-tube. Furthermore, to obtain a non-linear SVR, the use of a kernel function is proposed to map the input data into a multidimensional domain, where a linear hyperplane is produced (Kazemi et al. 2021; Li et al. 2014).

In general, the fuzzy regression model of Tanaka (1987) allows uncertainty not only in the constant term but also in the coefficient of the independent variable. Therefore, a fuzzy band of non-constant width (unlike the ε-SVR) is produced (Fig. 13). However, ideas from the ε-SVR could be used in the fuzzy regression model. For instance, Hao and Chiang (2008) proposed the modification of the objective function so as to use ideas from the ε-SVR. Chukhrova and Johannssen (2019) suggested that a negative spread is in conflict with the extension principle and, moreover, that adding auxiliary variables (two for each data point) lacks theoretical foundation and increases computational effort.

Fig. 13 The produced band in the case of (a) the ε-SVR and (b) the fuzzy linear regression model of Tanaka (general case)

In the examined examples, the tube regression model (ε-SVR) is unsuitable since the proposed fuzzy regression model has no constant term. This is due to physical considerations: a constant term in the components of the unit hydrograph would produce direct runoff without effective rainfall, which has no physical meaning.

Cite this article

Spiliotis, M., Garrote, L. Unit hydrograph identification based on fuzzy regression analysis. Evolving Systems 12, 701–722 (2021). https://doi.org/10.1007/s12530-021-09380-7

  • DOI: https://doi.org/10.1007/s12530-021-09380-7
