
River flow simulation using a multilayer perceptron-firefly algorithm model

  • Sabereh Darbandi (Email author)
  • Fatemeh Akhoni Pourhosseini
Open Access
Original Article

Abstract

River flow estimation from records of past time series is important in water resources engineering and management and is required in hydrologic studies. Over the past two decades, approaches based on artificial neural networks (ANN) have been developed. River flow modeling is a non-linear process and is highly affected by the inputs to the model. In this study, the best input combination of the models was identified using the Gamma test; a multilayer perceptron (MLP–ANN) and a hybrid multilayer perceptron–firefly algorithm model (MLP–FFA) were then used to forecast monthly river flow for a set of time intervals using observed data. Measurements from three gauging stations in the Ajichay watershed, East Azerbaijan, were used to train and test the models for the period from January 2004 to July 2016. Calibration and validation were performed within the same period for the MLP–ANN and MLP–FFA models after preparation of the required data. Two statistics, the root mean square error and the determination coefficient, were used to verify the outputs of the MLP–ANN and MLP–FFA models. The results show that the MLP–FFA model is satisfactory for monthly river flow simulation in the study area.

Keywords

Ajichay watershed; Estimation; Firefly algorithm; Multilayer perceptron; River flow

Introduction

River flow simulation is significant for the planning and management of catchment areas, the evaluation of risk, the control of droughts and floods, the development of water resources, the production of hydroelectric energy, navigation planning and the allocation of water for agriculture (Khatibi et al. 2012).

Simulation of river flow is also of great importance for the protection of marine ecosystems and for simulating changes in them. Different methods are used for river flow simulation, including time series analysis, fuzzy logic, neuro-fuzzy systems, genetic programming, artificial neural networks and, more recently, chaos theory. Since the 1990s, time series methods based on genetic programming, artificial neural networks and fuzzy logic have become viable, giving rise to many scientific studies.

Anmala et al. (2000) applied ANNs for river flow estimation in three watersheds in Kansas. Their simulations show that, without time-delayed inputs, ANN models do not provide a significant improvement over regression models. Chibanga et al. (2003) simulated river flow at the Kafue Hook Bridge in Zambia using ANNs. Chiang et al. (2004) made a systematic comparison of two types of ANN, static and dynamic. Wu et al. (2005) developed the use of ANNs for watershed runoff and river flow simulations. Sarkar et al. (2006) applied back-propagation (also called backward propagation of errors) ANN runoff models to estimate and predict daily runoff for a part of the Satluj river basin in India. Kisi (2007) compared different ANN models for short-term daily river flow estimation. Kalteh (2008) applied an ANN model for the estimation of streamflow and used a neural interpretation diagram, Garson's algorithm and a randomization approach to determine the relative significance of the inputs. Modarres (2009) simulated rainfall–runoff in the Plasjan watershed, in the western part of the Zayandehrud watershed, Iran, using an ANN model. Dorum et al. (2010) set up rainfall–runoff relationships using ANN and ANFIS models at hydrometric stations on seven sites in the Susurluk watershed. Ghorbani et al. (2016) investigated the applicability of MLP, RBF and SVM models for the estimation of river flow; their results show that the RBF and MLP models are better for estimating monthly river flow. Li et al. (2017) evaluated a hybrid evolutionary model based on SVR–FFA for water quality indicator (WQI) simulation and found it to be an acceptable and robust model for the estimation of WQI. Alweshah et al. (2014) used the firefly algorithm with an artificial neural network for time series problems and concluded from their experimental results that the proposed ANN–FFA model can effectively solve time series classification problems.

In this paper, a novel simulation approach based on an evolutionary optimization technique, called MLP–FFA, is adopted for the simulation process. The inputs of the models were selected using the Gamma test. The results of the proposed algorithm were verified by comparison with the MLP–ANN model.

Methodology

Study area, data and performance criteria

The study area is the Ajichay watershed. The Ajichay is one of the major rivers in the province of East Azerbaijan. Geographically, the watershed is located between latitudes 37°42′–38°30′ N and longitudes 45°40′–47°53′ E. The river rises at an elevation of about 3400 m on the southern and south-western slopes of Mount Sabalan, about 33 km northeast of the city of Sarab, passes north of the city of Tabriz and ends in Urmia Lake, west of Azarshahr, at an altitude of 1270 m. The total area of the catchment is about 12,790 km2. The river watershed is presented in Fig. 1. The river is also the largest water supply to Urmia Lake. Table 1 presents some of the important statistics for the time series used, and Fig. 2 shows the variations of the monthly data for the study stations.
Fig. 1

Location of study area (Ajichay)

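Table 1

Statistics of monthly river flow data from Ajichay river

| Station | Data set | Number of data | Maximum (m3/s) | Minimum (m3/s) | Mean (m3/s) | Standard deviation (m3/s) | Coefficient of variation |
|---|---|---|---|---|---|---|---|
| Vanyar | Training | 252 | 107.703 | 0.002 | 10.94 | 18.05 | 1.65 |
| Vanyar | Testing | 108 | 65.3 | 0.013 | 6.23 | 11.17 | 1.79 |
| Vanyar | Total | 360 | 107.703 | 0.002 | 9.52 | 16.62 | 1.72 |
| Markid | Training | 110 | 43.11 | 0 | 5.16 | 9.54 | 1.84 |
| Markid | Testing | 46 | 26.93 | 0 | 3.11 | 5.28 | 1.69 |
| Markid | Total | 156 | 43.11 | 0 | 4.55 | 8.54 | 1.87 |
| Arzang | Training | 110 | 32.42 | 0 | 2.73 | 5.61 | 2.05 |
| Arzang | Testing | 46 | 12.06 | 0 | 1.27 | 2.66 | 2.09 |
| Arzang | Total | 156 | 32.42 | 0 | 2.31 | 4.96 | 2.15 |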

Fig. 2

Monthly river flow time series at the Ajichay river

Mutual information

Finding optimum lag time

The mutual information (MI) technique has been widely used for estimating linear and nonlinear correlation and for selecting input lag times (Wang et al. 2010). For a given time series \( \{ x_{0} ,x_{1} ,x_{2} , \ldots ,x_{t} , \ldots ,x_{n} \} \), the mutual information measures the amount of information available about the state \( x_{t + \tau } \) if the state \( x_{t} \) is known. The average mutual information is given by:
$$ I(\tau ) = \sum\limits_{t = 1}^{N - \tau } {P(x_{t} ,x_{t + \tau } )} \log \left( {\frac{{P(x_{t} ,x_{t + \tau } )}}{{P(x_{t} )\,P(x_{t + \tau } )}}} \right), $$
(1)
where \( P(x_{t} ) \) is the probability density of \( x_{t} \) and \( P(x_{t} ,x_{t + \tau } ) \) is the joint probability density of \( x_{t} \) and \( x_{t + \tau } \). The first local minimum of \( I(\tau ) \) gives the optimal choice of lag time for input selection (De Domenico et al. 2013).
Input variable selection is one of the important problems in developing simulation models, because correlated lag times influence model accuracy. Thus, the mutual information function was used to specify the number of lags (Khatibi et al. 2011). The AMI shows a well-defined first minimum at a time lag of 3 months (Fig. 3). Thus, three input combinations were designed with lags of up to 3 months for 1-month lead time simulation of river flow using the MLP–ANN and MLP–FFA methods.
Fig. 3

Average mutual information (AMI) function of the study stations time series
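
The lag selection described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a simple two-dimensional histogram estimate of the probabilities in Eq. (1), and the function names, the number of bins and the synthetic seasonal series are hypothetical choices made only for the example.

```python
import numpy as np

def average_mutual_information(x, max_lag=12, bins=16):
    """Histogram estimate of I(tau) for tau = 1..max_lag (natural log)."""
    x = np.asarray(x, dtype=float)
    ami = []
    for tau in range(1, max_lag + 1):
        a, b = x[:-tau], x[tau:]                 # pairs (x_t, x_{t+tau})
        joint, _, _ = np.histogram2d(a, b, bins=bins)
        p_ab = joint / joint.sum()               # joint probability
        p_a = p_ab.sum(axis=1, keepdims=True)    # marginal of x_t
        p_b = p_ab.sum(axis=0, keepdims=True)    # marginal of x_{t+tau}
        mask = p_ab > 0                          # skip empty cells
        ratio = p_ab[mask] / (p_a @ p_b)[mask]
        ami.append(float(np.sum(p_ab[mask] * np.log(ratio))))
    return np.array(ami)

def first_local_minimum(ami):
    """Lag (1-based) of the first local minimum of the AMI curve, or None."""
    for k in range(1, len(ami) - 1):
        if ami[k] < ami[k - 1] and ami[k] <= ami[k + 1]:
            return k + 1
    return None

# Synthetic monthly series with a 12-month cycle (illustration only).
rng = np.random.default_rng(0)
months = np.arange(240)
flow = 10 + 8 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 1, 240)
ami = average_mutual_information(flow, max_lag=12)
print("first local minimum at lag:", first_local_minimum(ami))
```

The histogram estimate is sensitive to the number of bins, so in practice the position of the first minimum would need to be checked for a few bin counts.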

Development of rainfall–runoff simulation models

Two soft computing models, MLP–ANN and MLP–FFA, are used for river flow modeling. To evaluate the efficiency of the models for simulating monthly river flow, the data are divided into two groups, used separately in the training and testing periods. The models are developed with 70% of the data for training and the remaining 30% for testing, which is consistent with the sample counts in Table 1. The data for the MLP–ANN model are normalized so that the input values lie within the range 0–1.
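
The data preparation can be sketched in a few lines. The sketch below assumes a chronological split and min–max scaling whose bounds are taken from the training part only; the source states neither detail, and all function names are hypothetical.

```python
import numpy as np

def make_lagged(q, n_lags):
    """Build inputs [Q(t-1), ..., Q(t-n_lags)] and target Q(t)."""
    q = np.asarray(q, dtype=float)
    cols = [q[n_lags - k: len(q) - k] for k in range(1, n_lags + 1)]
    return np.column_stack(cols), q[n_lags:]

def chronological_split(X, y, train_fraction=0.7):
    """First part of the record for training, the rest for testing."""
    n_train = int(round(train_fraction * len(y)))
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]

def minmax_fit(a):
    """Column-wise minimum and maximum (assumed: from training data only)."""
    return a.min(axis=0), a.max(axis=0)

def minmax_apply(a, lo, hi):
    """Scale to 0-1; assumes hi > lo in every column."""
    return (a - lo) / (hi - lo)

# Toy series, used only to show the shapes involved.
q = np.arange(10.0)
X, y = make_lagged(q, n_lags=3)
X_tr, y_tr, X_te, y_te = chronological_split(X, y)
lo, hi = minmax_fit(X_tr)
X_tr_n, X_te_n = minmax_apply(X_tr, lo, hi), minmax_apply(X_te, lo, hi)
print(X.shape, X_tr.shape, X_te.shape)
```

With a 0.7 training fraction, a record of 360 values gives 252 training samples, which matches the Vanyar row of Table 1.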

Multilayer perceptron artificial neural networks (MLP–ANN)

An ANN is a parallel data-processing system. A neural network consists of a set of neurons (nodes) arranged in layers; each node forms a weighted sum of its inputs and passes it through a transfer function to produce its output. Each layer contains a pre-set number of neurons, and each network includes one or more such interconnected layers. Figure 4 shows a three-layer structure consisting of one input layer, one hidden layer and one output layer. The input layer accepts the data, the intermediate (hidden) layer processes them, and the output layer delivers the resulting model outputs. During training, the connection weights are corrected by comparing the model outputs with the recorded (observed) data. Further information on ANNs can be found in, e.g., Haykin (1999).
Fig. 4

Simple configuration of multilayer perceptron neural network (Nkuna and Odiyo 2011)
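
The three-layer structure can be written as a short forward pass. This is only a sketch: the sigmoid hidden layer and linear output node are assumptions (the source does not state the transfer functions), and the (3, 8, 1) shape is borrowed from Table 2 for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(X, W1, b1, W2, b2):
    """Input layer -> hidden layer (sigmoid) -> output layer (linear)."""
    hidden = sigmoid(X @ W1 + b1)   # weighted sums passed through the transfer function
    return hidden @ W2 + b2         # one output node: simulated Q(t)

# A (3, 8, 1) network: 3 lagged inputs, 8 hidden neurons, 1 output.
rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(3, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
X_demo = rng.uniform(0.0, 1.0, size=(5, 3))   # five normalized input rows
print(mlp_forward(X_demo, W1, b1, W2, b2).shape)
```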

Firefly algorithm

The FFA, developed by Yang (2010), is a swarm intelligence optimization technique based on the motion of fireflies. A candidate solution of the optimization problem is treated as an agent, i.e. a firefly, which glows in proportion to its quality. Each brighter firefly attracts its partners, irrespective of their sex, which makes exploration of the search space very efficient (Lukasik and Zak 2009). The whole swarm moves towards the brightest firefly, so the attractiveness of a firefly is directly related to its brightness. In addition, the brightness depends on the intensity of the agent (Kayarvizhy et al. 2014). The main issues in FFA development are the formulation of the objective function and the variation of the light intensity. The light intensity I(r), the attractiveness \( \beta (r) \) and the Cartesian distance between any two fireflies i and j can be written as:
$$ I(r) = I_{o} \exp ( - \gamma r^{2} ) $$
(2)
$$ \beta (r) = \beta_{o} \exp ( - \gamma r^{2} ) $$
(3)
$$ r_{ij} = \left\| {x_{i} - x_{j} } \right\| = \sqrt {\sum\limits_{k = 1}^{d} {(x_{i,k} - x_{j,k} )^{2} } } , $$
(4)
where \( \gamma \) is the light absorption factor; I(r) and \( I_{o} \) are the light intensity at distance r and the initial light intensity of a firefly; \( \beta (r) \) and \( \beta_{o} \) are the attractiveness at distance r and at r = 0; and d is the number of dimensions of the search space. The next movement of firefly i can be written as:
$$ x_{i}^{t + 1} = x_{i}^{t} + \Delta x_{i} $$
(5)
$$ \Delta x_{i} = \beta_{o} e^{{ - \gamma r^{2} }} (x_{j} - x_{i} ) + \alpha \varepsilon_{i} . $$
(6)

In Eq. (6), the first term represents attraction and the second term randomization, with \( \alpha \) a randomization factor in the range 0–1 and \( \varepsilon_{i} \) a vector of random numbers drawn from a Gaussian distribution (Sudheer et al. 2014).

In this research, optimal values of γ, ε and C for the model and, in addition, optimal values for the weights of the MLP model were computed.
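
The update rule in Eqs. (3)–(6) can be sketched as a small minimizer. The code below is an illustration under stated assumptions, not the authors' implementation: brightness is taken as the negative of the objective value, positions are clipped to fixed bounds, and the parameter values (swarm size, \( \beta_{o} \), \( \gamma \), \( \alpha \)) and the sphere test function are hypothetical.

```python
import numpy as np

def firefly_minimize(objective, dim, n_fireflies=20, n_iter=100,
                     beta0=1.0, gamma=0.1, alpha=0.2,
                     bounds=(-5.0, 5.0), seed=0):
    """Minimal firefly search: lower objective value = brighter firefly."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, size=(n_fireflies, dim))
    cost = np.array([objective(p) for p in x])
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:                        # j is brighter than i
                    r2 = np.sum((x[i] - x[j]) ** 2)          # squared distance, Eq. (4)
                    beta = beta0 * np.exp(-gamma * r2)       # attractiveness, Eq. (3)
                    step = beta * (x[j] - x[i]) + alpha * rng.normal(size=dim)  # Eq. (6)
                    x[i] = np.clip(x[i] + step, lo, hi)      # move, Eq. (5)
                    cost[i] = objective(x[i])
    best = int(np.argmin(cost))
    return x[best].copy(), float(cost[best])

def sphere(p):
    """Simple test function with its minimum at the origin."""
    return float(np.sum(p ** 2))

best_x, best_cost = firefly_minimize(sphere, dim=2)
print(best_x, best_cost)
```

In this variant the currently brightest firefly never moves, so the best objective value cannot get worse from one iteration to the next. For a hybrid model, the objective would be the training error of the network as a function of its weights.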

Performance evaluation criteria

To evaluate the river flow simulation performance of the developed models, two statistical indices are used: the determination coefficient (R2) and the root mean squared error (RMSE).
$$ {\text{RMSE}} = \sqrt {\frac{{\sum\limits_{i = 1}^{N} {(x_{i} - y_{i} )^{2} } }}{N}} $$
(7)
$$ R^{2} = \left[ {\frac{{\sum\limits_{i = 1}^{N} {(x_{i} - \bar{x})(y_{i} - \bar{y})} }}{{\sqrt {\sum\limits_{i = 1}^{N} {(x_{i} - \bar{x})^{2} } \sum\limits_{i = 1}^{N} {(y_{i} - \bar{y})^{2} } } }}} \right]^{2} , $$
(8)
where \( x_{i} \) and \( y_{i} \) are the observed flow and the flow simulated by the developed model, respectively; \( \bar{x} \) is the mean of the observed values; \( \bar{y} \) is the mean of the simulated values; and N is the number of observed data. R2 is used to compare the models; a high R2 implies good model performance. The RMSE measures estimation accuracy and, because the errors are squared, is always non-negative. A high value of R2 (up to one) and a low value of RMSE indicate high efficiency of the model (Najafzadeh et al. 2014).
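
Equations (7) and (8) translate directly into code. The sketch below uses hypothetical function names and a made-up four-value example; it takes R2 as the squared correlation between observed and simulated values, as in Eq. (8).

```python
import numpy as np

def rmse(obs, sim):
    """Root mean squared error, Eq. (7)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))

def r_squared(obs, sim):
    """Determination coefficient as the squared correlation, Eq. (8)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    dx, dy = obs - obs.mean(), sim - sim.mean()
    r = np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))
    return float(r ** 2)

# A simulation that is offset by a constant 0.5 m3/s from the observations.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.5, 2.5, 3.5, 4.5]
print(rmse(obs, sim), r_squared(obs, sim))
```

The example shows why both indices are reported: a constant bias leaves R2 at one while the RMSE is 0.5, so R2 alone would not reveal the offset.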

Analysis, results and discussion

Comparison of the models performance for river flow simulation

A three-layer MLP–ANN model with one hidden layer was used. The network was trained for 100 epochs with a learning rate of 0.0012 and a momentum coefficient of 0.84. The optimal number of neurons in the hidden layer was identified by the common trial-and-error procedure.

In this research, the MLP–FFA model was obtained by combining the multilayer perceptron model with the firefly algorithm. The results of the MLP–ANN and MLP–FFA models for river flow simulation based on three different input settings are presented in this section. The performance of the model structures has been evaluated using the root mean square error and the coefficient of determination.

Table 2 presents the simulation performance of the MLP–ANN and MLP–FFA models using three different input settings in the training (calibration) and testing (validation) periods.
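
One way to couple the two components is to treat all network weights as the position of a firefly and the training RMSE as the quantity to minimize. The sketch below shows only that objective function; the weight layout, the sigmoid hidden layer and the function names are assumptions, since the source does not describe how the coupling was implemented.

```python
import numpy as np

def n_weights(n_in, n_hidden, n_out=1):
    """Number of weights and biases in an (n_in, n_hidden, n_out) MLP."""
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out

def unpack(theta, n_in, n_hidden, n_out=1):
    """Split a flat parameter vector into the weight matrices and bias vectors."""
    theta = np.asarray(theta, dtype=float)
    i = 0
    W1 = theta[i:i + n_in * n_hidden].reshape(n_in, n_hidden)
    i += n_in * n_hidden
    b1 = theta[i:i + n_hidden]
    i += n_hidden
    W2 = theta[i:i + n_hidden * n_out].reshape(n_hidden, n_out)
    i += n_hidden * n_out
    b2 = theta[i:i + n_out]
    return W1, b1, W2, b2

def rmse_objective(theta, X, y, n_hidden):
    """Training RMSE of the MLP whose parameters are stored in theta."""
    W1, b1, W2, b2 = unpack(theta, X.shape[1], n_hidden)
    hidden = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))   # assumed sigmoid hidden layer
    pred = (hidden @ W2 + b2).ravel()               # assumed linear output
    return float(np.sqrt(np.mean((y - pred) ** 2)))

# A (3, 8, 1) structure has 41 parameters, so each firefly would be a point in 41 dimensions.
print(n_weights(3, 8))
```

A firefly search would then call `rmse_objective` for each candidate position, with `X` and `y` being the normalized training inputs and targets.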
Table 2

Statistical analysis of simulated values with MLP–ANN and MLP–FFA models

| Station | Model | Input combination | Output | Model structure | Training RMSE (m3/s) | Training R2 | Testing RMSE (m3/s) | Testing R2 |
|---|---|---|---|---|---|---|---|---|
| Vanyar | MLP 1 | Qt−1 | Qt | (1, 15, 1) | 10.77 | 0.706 | 11.065 | 0.668 |
| Vanyar | MLP 2 | Qt−1, Qt−2 | Qt | (2, 13, 1) | 7.596 | 0.827 | 8.07 | 0.805 |
| Vanyar | MLP 3 | Qt−1, Qt−2, Qt−3 | Qt | (3, 8, 1) | 6.333 | 0.878 | 6.462 | 0.811 |
| Markid | MLP 1 | Qt−1 | Qt | (1, 7, 1) | 2.879 | 0.852 | 2.389 | 0.799 |
| Markid | MLP 2 | Qt−1, Qt−2 | Qt | (2, 16, 1) | 2.872 | 0.867 | 2.159 | 0.815 |
| Markid | MLP 3 | Qt−1, Qt−2, Qt−3 | Qt | (3, 10, 1) | 2.777 | 0.919 | 2.849 | 0.829 |
| Arzang | MLP 1 | Qt−1 | Qt | (1, 13, 1) | 2.285 | 0.848 | 2.36 | 0.61 |
| Arzang | MLP 2 | Qt−1, Qt−2 | Qt | (2, 17, 1) | 1.9 | 0.904 | 2.075 | 0.797 |
| Arzang | MLP 3 | Qt−1, Qt−2, Qt−3 | Qt | (3, 10, 1) | 1.585 | 0.949 | 1.596 | 0.866 |
| Vanyar | MLP–FFA 1 | Qt−1 | Qt | (1, 15, 1) | 6.402 | 0.885 | 7.53 | 0.749 |
| Vanyar | MLP–FFA 2 | Qt−1, Qt−2 | Qt | (2, 13, 1) | 5.695 | 0.901 | 6.057 | 0.859 |
| Vanyar | MLP–FFA 3 | Qt−1, Qt−2, Qt−3 | Qt | (3, 8, 1) | 4.441 | 0.94 | 4.562 | 0.89 |
| Markid | MLP–FFA 1 | Qt−1 | Qt | (1, 7, 1) | 4.28 | 0.825 | 4.29 | 0.44 |
| Markid | MLP–FFA 2 | Qt−1, Qt−2 | Qt | (2, 16, 1) | 4.565 | 0.781 | 4.77 | 0.6 |
| Markid | MLP–FFA 3 | Qt−1, Qt−2, Qt−3 | Qt | (3, 10, 1) | 2.083 | 0.956 | 2.137 | 0.899 |
| Arzang | MLP–FFA 1 | Qt−1 | Qt | (1, 13, 1) | 1.714 | 0.911 | 1.771 | 0.725 |
| Arzang | MLP–FFA 2 | Qt−1, Qt−2 | Qt | (2, 17, 1) | 1.425 | 0.944 | 1.696 | 0.727 |
| Arzang | MLP–FFA 3 | Qt−1, Qt−2, Qt−3 | Qt | (3, 10, 1) | 1.189 | 0.972 | 1.197 | 0.928 |

According to Table 2, the MLP3 structure is the best for simulating flow at the Vanyar, Markid and Arzang stations on the Ajichay river, and it was selected as the optimum model for the training and testing data sets.

The statistical analysis for the Vanyar, Markid and Arzang stations indicates that the MLP–FFA3 model outperforms the MLP3 model for river flow modeling during the training period. In the testing period, the two developed models simulate river flow with broadly similar accuracy in terms of the statistical indices. In general, the RMSE is lowest for MLP–FFA3, and the MLP–FFA estimates are closer to the observed data. The coefficient of determination (R2) is highest for MLP–FFA3 in both the training and testing periods. Scatter plots of simulated and observed monthly river flow values during the testing period are shown in Fig. 5. The linear trend line of the MLP–FFA model is the closest to the 45° line. Similarly, Fig. 5 shows that the MLP–FFA3 model has the best accuracy for the estimation of monthly river flow in the Ajichay river basin during the testing period. The river flow time series estimated by the MLP–FFA3 model are compared with the observed time series during the testing period, and a good fit is found between the observed and simulated streamflow. The developed MLP–FFA3 model outperforms the ANN model developed in this research for simulating monthly river flow and is sufficient for modeling river flow. The results of this research show that the MLP–FFA3 model provides a good simulation of river flow in the study river.
Fig. 5

Comparison between time series plots of simulated and observed values; and scatter plot of observed and simulated values
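
The size of the difference between the two three-lag models can be read off Table 2 with a little arithmetic. The sketch below uses the testing-period RMSE values copied from Table 2; the percentage-reduction measure is an illustrative choice and is not a statistic reported by the authors.

```python
# Testing-period RMSE (m3/s) for the three-lag models, copied from Table 2.
rmse_test = {
    "Vanyar": {"MLP3": 6.462, "MLP-FFA3": 4.562},
    "Markid": {"MLP3": 2.849, "MLP-FFA3": 2.137},
    "Arzang": {"MLP3": 1.596, "MLP-FFA3": 1.197},
}

def rmse_reduction_percent(base, hybrid):
    """Relative RMSE reduction of the hybrid model with respect to the base model."""
    return 100.0 * (base - hybrid) / base

reductions = {
    station: rmse_reduction_percent(v["MLP3"], v["MLP-FFA3"])
    for station, v in rmse_test.items()
}
for station, value in reductions.items():
    print(f"{station}: {value:.1f}% lower testing RMSE with MLP-FFA3")
```

For these values the reduction is about 29% at Vanyar and about 25% at Markid and Arzang.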

The Taylor diagram

The Taylor diagram is used to give a visual impression of the performance measures. It plots a series of points on a polar plot for the two sets of modeling results, (1) three data points for MLP and (2) three data points for MLP–FFA, together with (3) the observed value. The Taylor diagram represents the normalized standard deviation (SD) of the simulated values relative to the observed values along the radial axis, and the R2 values as the direction angles. The observed values appear as a single point on the Taylor diagram, and the closer the point of a simulation lies to that of the observations, the better the model performance. Figure 6 shows that MLP–FFA brings an important improvement in model performance; in order of increasing performance, the modeling strategies can be ranked as MLP, MLP–FFA for both the training and testing phases.
Fig. 6

Taylor diagram: performance measures for training and testing phases: (a) three data points of MLP, (b) three data points of MLP–FFA
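
The quantities behind a point on a Taylor diagram can be computed as follows. This sketch follows the conventional construction, in which the angular coordinate is the correlation coefficient R and the distance to the observation point is the centered RMS difference; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def taylor_statistics(obs, sim):
    """Normalized SD, correlation and normalized centered RMS difference."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    sd_obs, sd_sim = obs.std(), sim.std()
    r = float(np.corrcoef(obs, sim)[0, 1])
    centered = (sim - sim.mean()) - (obs - obs.mean())
    crmsd = float(np.sqrt(np.mean(centered ** 2)))
    return {
        "normalized_sd": float(sd_sim / sd_obs),    # radial coordinate
        "correlation": r,                           # sets the angle
        "normalized_crmsd": crmsd / float(sd_obs),  # distance to the observed point
    }

# Synthetic observed and simulated series (illustration only).
rng = np.random.default_rng(2)
obs_demo = rng.gamma(shape=2.0, scale=3.0, size=50)
sim_demo = 0.8 * obs_demo + rng.normal(0.0, 1.0, size=50)
stats = taylor_statistics(obs_demo, sim_demo)
print(stats)
```

The three statistics are linked by the law of cosines, \( E^{\prime 2} = \sigma_{s}^{2} + \sigma_{o}^{2} - 2\sigma_{s} \sigma_{o} R \), which is what allows them to be shown as a single point; the observations themselves plot at normalized SD 1 and correlation 1.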

Summary and conclusion

In this study, MLP–ANN and MLP–FFA models were employed for modeling river flow using monthly data. Monthly river flow records from three stations on the Ajichay river were used, and the Ajichay watershed was selected as the case study for evaluating model performance and the effect of the input data. Three different input combinations were considered, built from runoff at lag times of up to 3 months (Qt−1, Qt−2 and Qt−3). Model performance was evaluated using two statistical indices of modeling error and the Taylor diagram. The results indicate that the third models, which use all three lags, are the best for river flow modeling. Inter-relationships among the variables cannot be distinguished clearly in the ANN and MLP–FFA models; to overcome this weakness, the MI method was used to pre-process the inputs. The results also show that the MLP–FFA3 model performs better than the MLP3 model at all three stations and is capable of modeling river flow efficiently.

Notes

Acknowledgements

The authors are grateful to the Local Water Organization of Tabriz (Iran) for making available the river flow observations.

References

  1. Alweshah M, Bin Ghazi PA, Balqa AL (2014) Firefly algorithm with artificial neural network for time series problems. Res J Appl Sci Eng Technol 7(19):3978–3982
  2. Anmala J, Zhang B, Govindaraju RS (2000) Comparison of ANNs and empirical approaches for predicting watershed runoff. J Water Resour Plan Manag 126(3):156–166 (American Society of Civil Engineers)
  3. Chiang Y, Chang L, Chang F (2004) Comparison of static-feed forward and dynamic-feedback neural networks for rainfall–runoff. J Hydrol 290:297–311
  4. Chibanga R, Berlamont J, Vandewalle J (2003) Modeling and forecasting of hydrological variables using artificial neural networks: the Kafue river sub-basin. Hydrol Sci J 48:363–379. https://doi.org/10.1623/hysj.48.3.363.45282
  5. De Domenico M, Ghorbani MA, Makarynskyy O, Makarynska D, Asadi H (2013) Chaos and reproduction in sea level. Appl Math Model 37:3687–3697. https://doi.org/10.1016/j.apm.2012.08.018
  6. Dorum A, Yarar A, Faik Sevimli M, Onüçyildiz M (2010) Modeling the rainfall–runoff data of Susurluk basin. Exp Syst Appl 37(9):6587–6593. https://doi.org/10.1016/j.eswa.2010.02.127
  7. Ghorbani MA, Zadeh HA, Isazadeh M, Terzi O (2016) A comparative study of artificial neural network (MLP, RBF) and support vector machine models for river flow simulation. Environ Earth Sci 6:476. https://doi.org/10.1007/s12665-015-5096-x
  8. Haykin S (1999) Neural networks. A comprehensive foundation, 2nd edn. Prentice Hall, Upper Saddle River
  9. Kalteh AM (2008) Rainfall–runoff using artificial neural networks (ANNs): modeling and understanding. Caspian J Environ Sci 6(1):53–58
  10. Kayarvizhy N, Kanmani S, Uthariaraj RV (2014) ANN models optimized using swarm intelligence algorithms. WSEAS Transact Comput 13:501–519
  11. Khatibi R, Ghorbani MA, Aalami MT, Kocak K, Makarynskyy O, Makarynska D, Aalinezhad M (2011) Dynamics of hourly sea level at Hillarys Boat Harbour, Western Australia: a chaos theory perspective. Ocean Dyn 61:1797–1807. https://doi.org/10.1007/s10236-011-0466-8
  12. Khatibi R, Sivakumar B, Ghorbani MA, Kisi O, Kocak K, FarsadiZadeh D (2012) Investigating chaos in river stage and discharge time series. J Hydrol 414:108–117. https://doi.org/10.1016/j.jhydrol.2011.10.026
  13. Kisi O (2007) River flow forecasting using different artificial neural network algorithms. J Hydrol Eng 12:532–539
  14. Li J, Abdulmohsin HA, Sami Hasan S, Kaiming L, Al-Khateeb B, Ismaeel Ghareb M, Mohammed MN (2017) Hybrid soft computing approach for determining water quality indicator: Euphrates River. Neural Comput Appl 1–11. https://doi.org/10.1007/s00521-017-3112-7
  15. Lukasik S, Zak S (2009) Firefly algorithm for continuous constrained optimization tasks. In: International Conference on Computational Collective Intelligence, pp 97–106
  16. Najafzadeh M, Barani GA, Azamathulla HM (2014) Simulation of pipeline scour depth in clear-water and live-bed conditions using group method of data handling. Neural Comput Appl 24:629–635. https://doi.org/10.1007/s00521-012-1258-x
  17. Nkuna TA, Odiyo JO (2011) Filling of missing rainfall data in Luvuvhu river catchment using artificial neural networks. Phys Chem Earth 36:830–835. https://doi.org/10.1016/j.pce.2011.07.041
  18. Sarkar A, Agarwal A, Singh RD (2006) Artificial neural network models for rainfall–runoff forecasting in a hilly catchment. J Indian Water Resour Soc 26:1–4. https://doi.org/10.4236/jwarp.2012.410105
  19. Sudheer Ch, Sohani SK, Kumar D, Malik A, Chahar BR, Nema AK, Panigrahi BK, Dhiman RC (2014) A support vector machine-firefly algorithm based forecasting model to determine malaria transmission. Neuro Comput 129:279–288. https://doi.org/10.1016/j.neucom.2013.09.030
  20. Wang X, Park T, Carriere K (2010) Variable selection via combined penalization for high-dimensional data analysis. Comput Stat Data Anal 54:2230–2243. https://doi.org/10.1016/j.csda.2010.03.026
  21. Wu S, Han PE, Annambhotla J, Bryant S (2005) Artificial neural networks for forecasting watershed runoff and stream flows. J Hydrol Eng 10:216–222
  22. Yang XS (2010) Firefly algorithm, stochastic test functions and design optimisation. Int J Bio-Insp Comput 2:78–84. https://doi.org/10.1504/ijbic.2010.032124

Copyright information

© The Author(s) 2018

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  • Sabereh Darbandi (1), Email author
  • Fatemeh Akhoni Pourhosseini (2)
  1. Water Engineering Department, University of Tabriz, Tabriz, Iran
  2. University of Tehran, Tehran, Iran
