Introduction

Streamflow is one of the most important processes in the hydrological cycle, and its prediction is vital for water resources management and planning (Kahya and Dracup 1993). Computer simulation models of watershed hydrology and artificial intelligence techniques are widely used for runoff simulation and forecasting. The use of watershed models is increasing in response to growing demand for improved estimates of runoff quantity.

Over the last decades, artificial intelligence techniques such as artificial neural networks (ANN) (Dawson and Wilby 2001; Bray and Han 2004; Nayak et al. 2007; Goel 2009; Banaei et al. 2012; Javan et al. 2015) and fuzzy inference systems (FIS) (Zadeh 1965; See and Openshaw 2000; Xiong et al. 2001; Nayak et al. 2007) have been introduced and widely applied in hydrological studies as powerful alternative modelling tools. In addition, Shamseldin (1997), Tokar and Johnson (1999), Kumar et al. (2005), Filho and dos Santos (2006), Chua et al. (2008) and Mutlu et al. (2008) compared ANNs with different input variables for runoff simulation. The comparisons show that ANN models with both rainfall and discharge as input variables give better results than models with rainfall inputs only. When the model uses only rainfall as the input variable, the simulated hydrographs do not match the measured hydrographs as well (see Halff et al. 1993; Kumar et al. 2005; Filho and dos Santos 2006; Mutlu et al. 2008). However, good fits between the simulated and measured hydrographs have been reported in other studies where additional variables such as temperature (Zhang and Govindaraju 2000), evaporation (Kingston et al. 2005) and soil moisture (Abrahart and See 2007) were included as inputs to the ANN model.

Hydrological Simulation Program-FORTRAN (HSPF) is a semi-distributed, conceptual model that combines spatially distributed physical attributes into hydrologic response units. In this model, surface runoff is simulated primarily as an infiltration-excess process. HSPF has been used for simulation of various hydrologic conditions (Srinivasan et al. 1998; Zarriello and Ries 2000), non-point source pollutants, including contaminated sediment (Fontaine and Jacomino 1997), and land use management and flood control scenarios (Donigian et al. 1997). Abdullah et al. (2009) used the HSPF model for runoff simulation of a watershed in Jordan. Their results showed that monthly calibration and verification produced a good fit, with correlation coefficients of 0.928 and 0.923, respectively, while daily simulation gave a lower correlation coefficient of 0.785. Al-Abed and Whiteley (2002) applied the HSPF model for runoff simulation in the Grand River watershed, located in southern Ontario, Canada, with a drainage area of about 6965 km². Their study reported very satisfactory results in the calibration step, and the percentage error between the simulated and observed yearly discharge ranged between 4 and 16%.

ANNs and HSPF have both been widely used for runoff simulation, and many studies have compared mathematical models with ANNs (Tokar and Markus 2000; Morid et al. 2002; Srivastava et al. 2006). In this study, ANN and HSPF are evaluated for the simulation of Gharehsoo River streamflow (Iran).

The study area

This study was conducted for the watershed in Ardebil province in north-western Iran, which lies between latitudes 37° and 38°N and longitudes 47° and 48°E (Fig. 1). The geographical information and the mean observed climate data for the seven main synoptic stations of the province for the baseline years 1951 to 2007 are given in Table 1. The mean annual precipitation in this watershed (Table 1) is very low in comparison with the world average of 800 mm. In recent years, water shortage in Ardebil city (the capital of the province), where water resources have been overused for agricultural and industrial consumption, has become a serious problem for the province. Land use is mainly open grass (former pasture and hay fields), mountain, agriculture, and rural and urban residential. The watershed slope varies between 0 and 60%, with the lowest slopes, between 0 and 2%, in the central part of the watershed. The satellite images are from the Landsat 5 satellite and were taken in 2006 (Fig. 2).

Fig. 1 Gharehsoo River watershed and its location in Iran with its topography, drainage network and climate stations

Table 1 The positions and the averages of the precipitation and temperature of seven climate stations

Fig. 2 Land use of Gharehsoo River watershed (Landsat 5 satellite, 2006)

Methods and materials

HSPF hydrological model

In this study, the Hydrological Simulation Program-FORTRAN (HSPF) was used for simulation of Gharehsoo River runoff. HSPF is a set of computer codes developed by the U.S. Environmental Protection Agency. It is based on the Stanford Watershed Model IV (Crawford and Linsley 1996). HSPF was created by combining the Stanford Watershed Model IV with the Agricultural Runoff Management Model (ARM) (Donigian and Davis 1987), the Non-point Source Runoff Model (NPS) (Donigian and Crawford 1976), and the Hydrological Simulation Program (HSP) (Hydrocomp Inc. 1977; Donigian and Huber 1991; Donigian et al. 1995). The model can simulate the hydrologic processes on permeable and impermeable land surfaces and in streams (Bicknell et al. 2005). It has been widely used in Asia and other parts of the world in climate change studies (Albek et al. 2004; Abdulla et al. 2009).

HSPF is a semi-distributed, deterministic, continuous and physically based model. PERLND, IMPLND and RCHRES are the three main modules of HSPF; they simulate permeable land segments, impermeable land segments and free-flow reaches, respectively. Detailed information about these modules can be found in the literature (Bicknell et al. 1993; Donigian and Crawford 1976; Donigian et al. 1984; Bicknell et al. 2005). Figure 3 shows the hydrological cycle processes in the HSPF model. The HSPF model uses a storage routing technique to route water in each reach. Infiltration in permeable land is calculated based on Richards' equation (Bicknell et al. 2005). Actual evapotranspiration (ET) is calculated by the Penman or Jensen formulas. Table 2 shows the key HSPF parameters, which should be calibrated during the calibration process. LZSN is the lower zone nominal capacity. It is the most important parameter for infiltration capacity, which is represented in HSPF by the INFILT parameter. AGWRC is defined as the rate of flow today divided by the rate of flow yesterday, and it depends on topography, climate, soil properties and land use. UZSN is influenced by LZSN (Albek et al. 2004). Other parameters that are not presented in Table 2 are estimated using the BASINS software based on topographic, soil property and land use data. The estimated parameters are then supplied to HSPF. BASINS (Better Assessment Science Integrating Point and Nonpoint Sources) was developed to promote better assessment and integration of point and nonpoint sources in watershed and water quality management. It integrates several key environmental data sets with improved analysis techniques. Several types of environmental programs can benefit from the use and application of such an integrated system in various stages of environmental management planning and decision making (Bicknell et al. 2005).

The data from 1998 to 2004 were used for HSPF model calibration, and the data from 2005 to 2007 were used as the validation dataset.
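The definition of AGWRC given above (flow today divided by flow yesterday) can be illustrated with a short sketch. The helper `estimate_agwrc` is hypothetical and is not part of HSPF or BASINS; averaging the daily ratios over a recession limb is an assumption made here for illustration only.

```python
from typing import Sequence


def estimate_agwrc(recession_flows: Sequence[float]) -> float:
    """Illustrative estimate of a recession constant from a recession limb.

    Each ratio is (flow today) / (flow yesterday); the mean of the ratios is
    returned. Averaging is an assumption of this sketch, not an HSPF rule.
    """
    ratios = [
        today / yesterday
        for yesterday, today in zip(recession_flows, recession_flows[1:])
        if yesterday > 0.0  # skip days where the ratio is undefined
    ]
    if not ratios:
        raise ValueError("need at least two days with positive flow")
    return sum(ratios) / len(ratios)


# Hypothetical recession limb in m3/s (not data from the study).
print(estimate_agwrc([8.0, 4.0, 2.0, 1.0]))
```

In practice such a value would only serve as a starting point, since the text states that AGWRC is adjusted during calibration.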

Fig. 3 HSPF conceptual hydrologic model (Mark et al. 2003)

Table 2 The parameters of HSPF model in simulation process (EPA 2001)

Artificial neural network (ANN)

The ANN technique has attracted a great deal of attention due to its pattern recognition capabilities. ANNs with one hidden layer are commonly used in hydrologic modeling (Dawson and Wilby 2001; de Vos and Rientjes 2005) since these networks are considered to provide enough complexity to accurately simulate the nonlinear properties of the hydrologic process. A feed-forward neural network (FFNN) consists of at least three layers: input, hidden and output. The first step in applying an ANN is the selection of the network architecture. After selecting the network architecture, the next step is to determine the training algorithm. The most common training algorithm for multi-layer feed-forward neural networks is the back-propagation algorithm (Hagan and Menhaj 1994; Hagan et al. 1996). The transfer function used in this study is the sigmoid function. The sigmoid transfer function allows non-linearity to be introduced in the neural network processing and is broadly used in ANN modeling (Shamseldin 1997).

The input signals presented to the system in the input layer are passed forward to the hidden layer. The input signal can be a single signal or an array of signals, whereas the output signal is typically single. The weighted sum of the input signals is transformed by a nonlinear activation function. The response of the network is compared with the actual observations and the network error is calculated. The network error is then propagated backwards through the system and the weight coefficients are updated (Fig. 4).

Fig. 4 A three-layered FFNN with a back-propagation calibrating algorithm (Chang et al. 2007)
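The forward pass and backward weight update described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the study: the class name `TinyFFNN`, the learning rate, the weight initialisation range and the use of plain gradient descent on a squared-error loss are all choices made here for clarity.

```python
import math
import random
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


class TinyFFNN:
    """Three-layer FFNN: inputs, one sigmoid hidden layer, one sigmoid output."""

    def __init__(self, n_in: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Each hidden neuron holds n_in weights plus a bias (last entry).
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        # The single output neuron holds n_hidden weights plus a bias.
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x: Sequence[float]) -> Tuple[List[float], float]:
        """Propagate the input forward; return hidden activations and output."""
        hidden = [sigmoid(sum(w * xi for w, xi in zip(ws[:-1], x)) + ws[-1])
                  for ws in self.w_hidden]
        out = sigmoid(sum(w * h for w, h in zip(self.w_out[:-1], hidden))
                      + self.w_out[-1])
        return hidden, out

    def predict(self, x: Sequence[float]) -> float:
        return self.forward(x)[1]

    def train_step(self, x: Sequence[float], target: float, lr: float = 0.5) -> float:
        """One back-propagation update for a single sample; returns its loss."""
        hidden, out = self.forward(x)
        error = out - target
        # Error signal at the output neuron (derivative of sigmoid is out*(1-out)).
        delta_out = error * out * (1.0 - out)
        # Propagate the error backwards before the output weights are changed.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden):
            self.w_out[j] -= lr * delta_out * h
        self.w_out[-1] -= lr * delta_out
        for j, ws in enumerate(self.w_hidden):
            for i, xi in enumerate(x):
                ws[i] -= lr * delta_hidden[j] * xi
            ws[-1] -= lr * delta_hidden[j]
        return 0.5 * error * error
```

Because the output neuron is a sigmoid, targets are expected to lie between 0 and 1, which matches the normalisation of inputs and outputs described in this section.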

The most common ANN is the feed-forward network, which uses the back-propagation algorithm for calibration (Bougadis et al. 2005). The numbers of neurons in the input and output layers are determined by the numbers of input and output variables of a given system. The size, or number of neurons, of a hidden layer is an important consideration when solving problems using multilayer feed-forward networks. If there are too few neurons within a hidden layer, the neural network may not have enough capacity to capture the intricate relationships between indicator parameters and the computed output parameters. Too many hidden layer neurons not only require a large computational time for accurate calibration, but may also result in overtraining. A neural network is said to be “overtrained” when the network focuses on the characteristics of individual data points rather than just capturing the general patterns present in the entire calibration set. Here, we used a three-layer FFNN with one hidden layer and the common trial-and-error method to select the number of hidden nodes.

Understanding the temporal relationship between climatic variables and runoff is fundamental to model development. To predict runoff one day ahead, different combinations of input variables were considered. The best multilayer perceptron (MLP) architecture was chosen according to the mean squared error (MSE) criterion. As a result, a total of six variables were identified as inputs (Eq. 1): four lagged precipitation values (P) and two lagged temperature values (T).

$$Q_{t} = f(P_{t - 1} ,P_{t - 2} ,P_{t - 3} ,P_{t - 4} ,T_{t - 1} ,T_{t - 2} )$$
(1)
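The input vector of Eq. 1 can be assembled from daily series as sketched below. The function name `build_lagged_samples` and the sample values are hypothetical, and P and T are assumed here to be daily precipitation and temperature.

```python
from typing import List, Sequence, Tuple


def build_lagged_samples(
    precip: Sequence[float],
    temp: Sequence[float],
    flow: Sequence[float],
    p_lags: Sequence[int] = (1, 2, 3, 4),
    t_lags: Sequence[int] = (1, 2),
) -> Tuple[List[List[float]], List[float]]:
    """Build (inputs, targets) following the lag structure of Eq. 1.

    Each input row is [P(t-1), P(t-2), P(t-3), P(t-4), T(t-1), T(t-2)]
    and the matching target is Q(t).
    """
    if not (len(precip) == len(temp) == len(flow)):
        raise ValueError("all series must have the same length")
    max_lag = max(max(p_lags), max(t_lags))
    inputs: List[List[float]] = []
    targets: List[float] = []
    # Start at the first day for which every lagged value exists.
    for t in range(max_lag, len(flow)):
        row = [precip[t - k] for k in p_lags] + [temp[t - k] for k in t_lags]
        inputs.append(row)
        targets.append(flow[t])
    return inputs, targets


# Hypothetical six-day series (not data from the study).
X, y = build_lagged_samples(
    precip=[0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    temp=[10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
    flow=[100.0, 101.0, 102.0, 103.0, 104.0, 105.0],
)
print(X[0], y[0])  # [3.0, 2.0, 1.0, 0.0, 13.0, 12.0] 104.0
```

The first four days are dropped because the longest lag is four days, so a series of N days yields N − 4 samples.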

After the appropriate input vector was identified, the network was trained to predict future data based on past and present data. In the present study, the input and output variables are first normalized linearly to the range 0 to 1. The normalization is done using the following equation:

$$\bar{X} = \frac{{X - X_{\hbox{min} } }}{{X_{\hbox{max} } - X_{\hbox{min} } }}$$
(2)

where \(\bar{X}\) is the standardized value of the input, X is the original value, and \(X_{\hbox{min} }\) and \(X_{\hbox{max} }\) are, respectively, the minimum and maximum of the actual values over all observations. The main reason for standardizing the data matrix is that the variables are usually measured in different units. By standardizing the variables and recasting them in dimensionless units, the arbitrary effect of the measurement units on the similarity between objects is removed.
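Equation 2 translates directly into code. The sketch below is illustrative; the function names are hypothetical, and the inverse transform is added only because model outputs on the 0 to 1 scale usually need to be mapped back to physical units.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(values: Sequence[float]) -> Tuple[List[float], float, float]:
    """Apply Eq. 2: (X - X_min) / (X_max - X_min); also return X_min and X_max."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("cannot normalize a constant series")
    scaled = [(x - x_min) / (x_max - x_min) for x in values]
    return scaled, x_min, x_max


def denormalize(scaled: Sequence[float], x_min: float, x_max: float) -> List[float]:
    """Invert Eq. 2 to recover values in the original units."""
    return [x_min + s * (x_max - x_min) for s in scaled]


scaled, lo, hi = min_max_normalize([2.0, 4.0, 6.0])
print(scaled)  # [0.0, 0.5, 1.0]
```

One common design choice, assumed here rather than stated in the text, is to take the minimum and maximum from the training period and reuse them for the testing period so that both are scaled consistently.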

Data

The discharge outputs of the models in the calibration and validation periods are evaluated against observed data. The training input dataset includes a total of 2557 data records between 1998 and 2004. The testing input dataset consists of a total of 729 data records, observed in the last 2 years (2005–2007).
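A chronological split of a daily record into training and testing parts can be sketched as follows. The helper names are hypothetical and the flow values are placeholders; only the date ranges come from the text.

```python
from datetime import date, timedelta
from typing import List, Tuple

Record = Tuple[date, float]


def daily_dates(start: date, end: date) -> List[date]:
    """All calendar days from start to end, inclusive."""
    n_days = (end - start).days + 1
    return [start + timedelta(days=i) for i in range(n_days)]


def split_by_year(
    records: List[Record], last_training_year: int
) -> Tuple[List[Record], List[Record]]:
    """Years up to last_training_year go to training, later years to testing."""
    train = [r for r in records if r[0].year <= last_training_year]
    test = [r for r in records if r[0].year > last_training_year]
    return train, test


# Placeholder flows (0.0); observed discharges would replace them.
records = [(d, 0.0) for d in daily_dates(date(1998, 1, 1), date(2007, 12, 31))]
train, test = split_by_year(records, last_training_year=2004)
print(len(train))  # 2557
```

With one record per day, the 1998 to 2004 window contains 2557 days, which agrees with the training set size stated above. The exact dates of the 729 testing records are not given in the text, so the testing window in this sketch simply covers everything after 2004.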

Evaluation criteria

Table 3 shows the evaluation criteria used in this study. The root-mean-square error (RMSE) evaluates how closely predictions match observations. Values may range from 0 (perfect fit) to \(+ \infty\) (no fit), depending on the relative range of the data. The coefficient of determination, \(R^{2}\), the square of the sample correlation coefficient, ranges from 0 to 1 and describes the proportion of observed variance explained by the model. A value of 0 implies no correlation, while a value of 1 suggests that the model can explain all of the observed variance. The Nash–Sutcliffe coefficient of efficiency, \(E_{NS}\), measures the model’s ability to predict values that differ from the mean and gives the proportion of the initial variance accounted for by the model (Nash and Sutcliffe 1970). The peak-weighted root-mean-square error (PWRMSE) is implicitly a measure of how well the magnitudes of the peaks, volumes and times of peak of the simulated and measured hydrographs compare.

Table 3 List of evaluation criteria
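The criteria described above can be computed as in the following sketch. The formulas are the commonly used definitions and are assumptions here, since the entries of Table 3 are not reproduced in this text. In particular, the sign convention of ME (simulated minus observed) and the peak-weighting term in `pwrmse` (the form used in HEC-HMS) may differ from those of the study.

```python
import math
from typing import Sequence


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Root-mean-square error."""
    return math.sqrt(_mean([(o - s) ** 2 for o, s in zip(obs, sim)]))


def mean_error(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Mean error; positive values mean overestimation (assumed convention)."""
    return _mean([s - o for o, s in zip(obs, sim)])


def r_squared(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Square of the sample (Pearson) correlation coefficient."""
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov * cov / (var_o * var_s)


def nash_sutcliffe(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mo = _mean(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mo) ** 2 for o in obs)


def pwrmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Peak-weighted RMSE: errors at above-average flows receive more weight."""
    mo = _mean(obs)
    terms = [(o - s) ** 2 * (o + mo) / (2.0 * mo) for o, s in zip(obs, sim)]
    return math.sqrt(_mean(terms))


# Hypothetical flows in m3/s (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
print(rmse(observed, simulated), nash_sutcliffe(observed, simulated))
```

In this example the simulation is offset by a constant 1 m3/s, so the correlation-based \(R^{2}\) is still 1 while the Nash–Sutcliffe efficiency drops to 0.2, which illustrates why the two criteria are reported together.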

Results and discussion

Daily streamflow simulation by HSPF model

The daily discharge data from 1998 to 2004 and from 2005 to 2007 were used for ‘calibration and training’ and ‘validation and testing’ of the models, respectively. Table 4 shows the values of the calibrated parameters in this study. For example, LZSN in Table 4 has an average value of 38.1 mm/h, which was estimated according to the Linsley equation (Linsley et al. 1988): LZSN = 100 + 0.25 × (yearly mean precipitation). For estimation of the other parameters, BASINS Technical Note 6 (EPA 2001) was used. Figures 5 and 6 show the observed and simulated hydrographs for the calibration and validation periods, respectively. These figures show good agreement between observed and simulated daily runoff in both periods. The correlation coefficients for the calibration and validation periods are 0.814 and 0.806, respectively, which implies that the HSPF simulation is acceptable. Moreover, the Nash–Sutcliffe coefficient (model efficiency) is 0.87 in the calibration period and 0.76 in the validation period. Nash–Sutcliffe efficiency values less than 0.5 are considered unacceptable, while values greater than 0.6 are considered good and values greater than 0.8 are considered excellent. Therefore, HSPF provided a good daily runoff simulation. The results show that the HSPF simulation of watershed discharge is acceptable in the calibration period and can be used in this research.
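The Nash–Sutcliffe rating thresholds quoted above can be written as a small lookup. The function name `rate_nse` is hypothetical, and the label for values between 0.5 and 0.6 is an assumption, because the text does not assign a rating to that interval.

```python
def rate_nse(ens: float) -> str:
    """Map a Nash-Sutcliffe efficiency to the qualitative rating used in the text."""
    if ens < 0.5:
        return "unacceptable"
    if ens > 0.8:
        return "excellent"
    if ens > 0.6:
        return "good"
    # The text gives no rating for 0.5 to 0.6; this label is a placeholder.
    return "unclassified"


print(rate_nse(0.87), rate_nse(0.76))  # excellent good
```

Applied to the HSPF values reported above, the calibration efficiency of 0.87 falls in the excellent class and the validation efficiency of 0.76 in the good class.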

Table 4 Values of parameters used in simulation in HSPF model

Fig. 5 Simulated and observed hydrographs for calibration period for HSPF model

Fig. 6 Simulated and observed hydrographs for validation period for HSPF model

Daily streamflow simulation by ANN model

Figures 7 and 8 show the discharge output of the ANN model against observed data in the calibration and validation periods, respectively. A very good match is obtained between the observed runoff values and those computed by the ANN model for the training data. The performance of the ANN model during high flows in summer, autumn and winter is very good and better than that of the HSPF model, but for low flows a slight overestimation is observed. This indicates the robustness of the ANN model and confirms its capability for runoff simulation within acceptable accuracy. The testing input dataset consists of a total of 729 data records, observed in the last 2 years (2005–2007). The \(R^{2}\) for the ANN model is 0.950 for the calibration period and 0.962 for the validation period.

Fig. 7 Analysis plot for daily flow in calibration period with ANN model (m³/s)

Fig. 8 Analysis plot for daily flow in validation period with ANN model (m³/s)

Comparison of HSPF and ANN models

In comparing the results, the terms ‘calibration’ and ‘validation’ of the HSPF model are used as equivalent to training and testing of the ANN model, respectively. To estimate the relative performance of the models in runoff simulation, the values of the evaluation criteria obtained from the ANN and HSPF models were compared. The evaluation criteria of the ANN model obtained during training were compared with the corresponding criteria obtained during HSPF calibration. Similarly, the testing results of the ANN model were compared with the validation results of the HSPF model. The values of the statistical evaluation criteria \(E_{NS}\), \(R^{2}\), ME, RMSE and PWRMSE are shown in Table 5.

Table 5 Performance of the HSPF and FFNN models

The \(E_{NS}\) for the HSPF model ranged from 0.565 to 0.963 for the calibration period and from 0.526 to 0.933 for the validation period. Similarly, the \(R^{2}\) for the HSPF model ranged from 0.821 to 0.98 for the calibration period and from 0.793 to 0.97 for the validation period. The ME ranged from −0.738 to 0.721 for the calibration period and from −1.153 to −0.927 for the validation period. The RMSE ranged from 1.399 to 2.84 for the calibration period and from 2.097 to 2.681 for the validation period. The PWRMSE for the HSPF model ranged from 1.519 to 3.196 for the calibration period and from 2.71 to 3.279 for the validation period.

The \(E_{NS}\) for the ANN model ranged from 0.836 to 0.982 for the calibration period and from 0.694 to 0.921 for the validation period. Similarly, the \(R^{2}\) for the ANN model ranged from 0.950 to 0.992 for the calibration period and from 0.961 to 0.963 for the validation period. The ME ranged from −0.457 to 0.551 for the calibration period and from 0.211 to 0.262 for the validation period. The RMSE ranged from 0.953 to 1.817 for the calibration period and from 1.724 to 2.289 for the validation period. The PWRMSE for the ANN model ranged from 1.536 to 2.224 for the calibration period and from 2.263 to 3.392 for the validation period.

The results indicate that both models were generally able to simulate streamflow well during both the calibration and validation periods. However, the runoff simulated by the ANN model was better than that simulated by HSPF during both periods, as revealed by the values of the evaluation criteria. There was a considerable difference between the values of \(E_{NS}\) obtained from the ANN and HSPF models for the year 2004 (Table 5). Similar results were obtained during the model validation period as well. In this study, the Nash–Sutcliffe coefficients of the HSPF model were found to be lower than those of the ANN model. This confirms that the ANN model is well capable of describing the non-linear relationship between the input and output.

Figures 9 and 10 show the scatter plots of observed and computed runoff values for the calibration and validation periods for the HSPF model. In the calibration period, the scatter is well spread about the ideal line for this watershed (Fig. 9). In the validation period, the plot is shifted towards one side. The shift from the ideal line indicates possible systematic errors. Similar scatter plots for the ANN model, shown in Figs. 11 and 12, lie closer to the ideal line, indicating good runoff simulation for the Gharehsoo River watershed. The scatter for the ANN model is clearly better than that of the HSPF model, as revealed by the relatively more symmetrical scatter in the figures. The ANN model was also found to be more successful than HSPF in forecasting peak flow. The results of this study, in general, show that ANNs can be powerful tools in runoff simulation.

Fig. 9 Scatter plot for daily flow in calibration period with HSPF model (m³/s)

Fig. 10 Scatter plot for daily flow in validation period with HSPF model (m³/s)

Fig. 11 Scatter plot for daily flow in calibration period with ANN model (m³/s)

Fig. 12 Scatter plot for daily flow in validation period with ANN model (m³/s)

Conclusion

This paper reports the results of a comparison of two different models for runoff simulation in the Gharehsoo River watershed, Iran, in the period 1998–2007. The performance of the models in the ‘calibration and training’ and ‘validation and testing’ stages is compared with the observed runoff values to identify the best-fitting forecasting model based on a number of selected performance criteria. The comparison shows that the ANN model performs better than HSPF in forecasting runoff. With a good training process and a suitable algorithm and number of nodes, the prediction is more accurate. Once the architecture of the network is defined, weights are calculated so as to represent the desired output through a learning process in which the ANN is trained to obtain the expected results. The neural network could predict runoff accurately, with better agreement between the observed and predicted values than the HSPF model. ANNs are capable of daily runoff simulation. However, for low flows, a slight overestimation is observed. Unlike hydrologic models, the ANN does not require watershed information and other physical parameters in the modeling process, which reduces the complexity of modeling the system. The time required to calibrate the ANN is much less than that required for HSPF. Calibrating the ANN model also requires less expertise and experience. In comparison to the HSPF model, less data is required for simulation using ANNs. If a number of scenarios are to be examined to investigate the response of the catchment, HSPF may prove advantageous in comparison to ANNs. One of the advantages of the HSPF model is that it can provide reliable runoff simulation at an ungauged site when climate and soil data are available. The results of this study are in good agreement with earlier studies comparing HSPF and ANN models in daily simulations, specifically with respect to runoff prediction performance.