Introduction

Runoff and sediment yield are key components of watershed modeling (Arnold and Fohrer 2005; Arnold et al. 1998; Borah and Bera 2003). Many empirical and physically based conceptual or distributed-parameter models have been developed to simulate runoff and sediment yield processes (Merritt et al. 2003; Picouet et al. 2001; Tokar and Johnson 1999). However, such models often fail to represent the nonlinear dynamics inherent in these processes. Physically based deterministic hydrological models have proved very useful for simulating processes related to water management, such as hydrodynamics, morphology, ecology, water quality, sediment yield and groundwater flow, but implementing and calibrating them is difficult: they require sophisticated mathematical tools, significant amounts of calibration data, and some degree of expertise and experience with the models (Cigizoglu and Alp 2006; Kisi 2005; Yapo et al. 1996, 1998; Yew Gan et al. 1997). There is therefore a need for alternative methods of predicting runoff and sediment yield. Soft computing techniques are one such alternative.

An artificial neural network (ANN) is a soft computing technique composed of densely interconnected processing nodes; through learning, it can extract and store information from a limited number of training patterns (data). Hydrologic applications of ANNs include rainfall–runoff forecasting, modeling of the sediment yield process and the snow–rainfall process, assessment of streams' ecological and hydrological responses to climate change, groundwater quality prediction and groundwater remediation (Alp and Cigizoglu 2007; Cigizoglu 2004; Cigizoglu and Alp 2006; Drago and Boxall 2002; Hsu et al. 1995; Kalteh 2013; Kerem Cigizoglu and Kisi 2006; Kisi 2005; Kişi 2009, 2010; Kisi and Cimen 2011, 2012; Partal and Kişi 2007; Partal and Küçük 2006).

A newer tool from the field of artificial intelligence, the support vector machine (SVM), has been applied to time series. It has been used successfully for financial time series (Cao and Tay 2001; Kim 2003; Tay and Cao 2001) as well as in hydrology (Asefa et al. 2006; He et al. 2014; Raghavendra and Deka 2014; Yu et al. 2004). The objective of the present paper is to develop SVM and ANN models for predicting both runoff and sediment yield from the Kankaimai watershed in eastern Nepal.

Artificial neural network (ANN)

The key characteristic of a neural network is its ability to learn. If a convenient mathematical model that describes a data set is already known, a neural network is unlikely to be needed, but when the rules that underlie the data are known only partially, or not at all, a neural network may discover interesting relationships as it rambles through the database. Neural networks are able to learn complex behavior and are highly adaptable; even a brief study shows them to be capable of undertaking a remarkable variety of tasks.

An artificial neural network is an information processing system that tries to replicate past behavior by learning from input–output data, and it is well suited to discharge and sediment modeling (Agarwal et al. 2006; Alp and Cigizoglu 2007; Cigizoglu 2004; Cigizoglu and Alp 2006; Hsu et al. 1995; Kalteh 2013; Kişi 2010; Singh et al. 2012). Artificial neural networks are able to handle imprecision and nonlinearities in the data (Peter 2003; Tayfur and Singh 2006). An ANN is a data-driven, self-adaptive method based on multivariate, nonlinear, nonparametric statistical analysis for complex real-world problems (Basheera and Hajmeer 2000; Mehdi Khashei 2010). Time series modeling is a key area of prediction in which past observations of the same variable are collected and analyzed to develop a model describing the underlying relationship. Learning from past observations is particularly useful where other data are unavailable (Zhang Guoqiang and Hu 1998). In this study the models were used to generalize the time series for future prediction.

The ANN model performs a nonlinear functional mapping from past observations to future values. It uses a logistic transfer function in the hidden layer, with the connection weights as the model parameters (Peter 2003). The basic Eqs. 1 and 2 were used for the time series analysis.

$$Q_{t} = f\left( {Q_{t - 1} ,Q_{t - 2} ,Q_{t - 3} , \ldots ,Q_{t - p} ,w} \right) + \varepsilon_{t}$$
(1)
$$S_{t} = f\left( {S_{t - 1} ,S_{t - 2} ,S_{t - 3} , \ldots ,S_{t - p} ,w} \right) + \varepsilon_{t}$$
(2)

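where \(Q\) is the discharge (m³/s), \(S\) is the sediment rate (tonnes/day), \(p\) is the time delay (number of lagged values), \(w\) are the weights and \(\varepsilon_{t}\) is the fluctuation at time \(t\).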
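The lagged structure of Eqs. 1 and 2 can be illustrated with a short sketch. It only shows how input–target pairs might be assembled from a single series before being passed to a model; the function name `make_lagged_pairs` and the sample discharge values are hypothetical and are not taken from the study.

```python
def make_lagged_pairs(series, p):
    """Build (inputs, targets) for a model of the form x[t] = f(x[t-1], ..., x[t-p]).

    Each input row holds the p previous values, most recent first,
    and the matching target is the value at time t.
    """
    if p < 1 or p >= len(series):
        raise ValueError("p must satisfy 1 <= p < len(series)")
    inputs, targets = [], []
    for t in range(p, len(series)):
        inputs.append([series[t - k] for k in range(1, p + 1)])
        targets.append(series[t])
    return inputs, targets


# Hypothetical daily discharge values in m3/s (illustration only)
discharge = [10.0, 12.0, 15.0, 11.0, 9.0]
X, y = make_lagged_pairs(discharge, p=2)
# X[0] holds (Q[t-1], Q[t-2]) for the first usable time step, y[0] holds Q[t]
```

The same helper applies to the sediment series of Eq. 2; only the input series changes.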

Support vector machines (SVM)

Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression. Viewing the input data as two sets of vectors in an n-dimensional space, an SVM constructs a separating hyperplane in that space, one which maximizes the margin between the two data sets (Fig. 1). To calculate the margin, two parallel hyperplanes are constructed, one on each side of the separating hyperplane, which are “pushed up against” the two data sets. Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the neighboring data points of both classes, since in general the larger the margin, the lower the generalization error of the classifier.

Fig. 1 Architecture of SV regression machine
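The description above concerns classification. For regression, which is the setting of this study, the commonly used epsilon-insensitive formulation is sketched below for orientation. It is the standard textbook form and is not reproduced from the paper; the symbols \(\phi\) (feature map), \(b\) (bias) and \(\xi_{i}, \xi_{i}^{*}\) (slack variables) are introduced here for illustration, while \(C\), \(\varepsilon\) and \(\gamma\) correspond to the parameters C, epsilon and gamma tuned in the Methodology.

$$\min_{w, b, \xi, \xi^{*}} \; \frac{1}{2}\lVert w \rVert^{2} + C\sum_{i = 1}^{n} \left( \xi_{i} + \xi_{i}^{*} \right)$$

$$\text{subject to}\quad y_{i} - w^{\top}\phi(x_{i}) - b \le \varepsilon + \xi_{i}, \qquad w^{\top}\phi(x_{i}) + b - y_{i} \le \varepsilon + \xi_{i}^{*}, \qquad \xi_{i}, \xi_{i}^{*} \ge 0$$

$$K(x_{i}, x_{j}) = \exp\left( - \gamma \lVert x_{i} - x_{j} \rVert^{2} \right)$$

Under this form, \(C\) trades off flatness of the fitted function against deviations larger than \(\varepsilon\), and \(\gamma\) sets the width of the RBF kernel \(K\).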

Study area

Rainfall in Nepal varies strongly across its wide range of elevations. Previous research has addressed hydrology and water resources in Nepal (Atreya et al. 2006; Chalise et al. 2003; Hannah et al. 2005; Kannel et al. 2007; Sharma and Shakya 2006). The Kankaimai watershed, situated in the Ilam district of eastern Nepal, has been studied using ANN (Sharma et al. 2009), but a comparison with SVM adds another dimension to the study.

The main river in the watershed is the Kankaimai, and its major tributaries are the Mai, Lodhiya Khola, Deumai Khola, Puwa Khola and Jogmai Khola. The total length of the river up to Mainachuli is about 90 km. The watershed covers 1180 km² and lies between 26°37′ and 27°05′ N latitude and 87°35′ and 88°10′ E longitude. The topography is undulating, with an average slope of 4 %, and the elevation ranges from 125 to 3636 m above mean sea level.

The watershed receives an average annual rainfall of 2300 mm, of which the monsoon season (June–September) contributes more than 79 %. The monthly mean temperature ranges from a maximum of 22.5 °C to a minimum of 12.6 °C. The monthly mean relative humidity varies from a minimum of 67 % in March to a maximum of 92 % in July. The climate varies from subtropical in the lower region to temperate in the upper region.

In terms of land resources, the Kankaimai watershed is covered with forest, cultivated land, tea gardens, settlements, water bodies, grazing land, sand bars, barren land and swampy areas. These are broadly categorized into five groups: forest land, cultivated land, grazing land, shrub land and others. The water resources of the Kankaimai basin are currently used mainly for irrigation, power generation, drinking water supply, water mills and religious purposes. The data listed in Tables 1 and 2 have been used to calibrate and validate the runoff–sediment model.

Table 1 Data available at various locations with duration for rainfall–runoff modeling
Table 2 Data available at various locations with duration for sediment yield modeling

Methodology

Autocorrelation and cross-correlation analyses were carried out for different lag times. As shown in Fig. 2, for this watershed rainfall and runoff at longer lags correlate poorly with current runoff. Therefore rainfall at time steps t, t − 1 and t − 2 and runoff at time steps t − 1 and t − 2 were used as inputs for developing the runoff prediction models.

Fig. 2 Auto correlation of runoff and cross correlation between runoff and rainfall
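The lag analysis described above can be sketched as follows. This is a minimal illustration that computes the Pearson correlation on the overlapping part of two series; it is an assumption about how such an analysis might be done, not the procedure used in the study, and the rainfall and runoff values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def lagged_correlation(driver, response, lag):
    """Correlation between driver[t - lag] and response[t].

    Passing the same series twice gives an autocorrelation estimate;
    passing rainfall and runoff gives a cross-correlation estimate.
    """
    if lag < 0 or lag >= len(response) - 1:
        raise ValueError("lag must satisfy 0 <= lag < len(response) - 1")
    if lag == 0:
        return pearson(driver, response)
    return pearson(driver[:-lag], response[lag:])


# Hypothetical daily values (illustration only)
rainfall = [0.0, 5.0, 20.0, 3.0, 0.0, 12.0, 1.0, 0.0]
runoff = [2.0, 3.0, 9.0, 14.0, 6.0, 4.0, 10.0, 5.0]

auto_by_lag = {lag: lagged_correlation(runoff, runoff, lag) for lag in range(1, 4)}
cross_by_lag = {lag: lagged_correlation(rainfall, runoff, lag) for lag in range(0, 4)}
```

Lags whose correlation falls close to zero would be dropped from the model inputs, which is the reasoning behind keeping only the first few lags.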

Daily rainfall and runoff data from 1995 to 1999 were selected for training and testing the models. The first four and a half years of data were used for training and the remaining 6 months for validation.

The records of suspended sediment yield measured in the Kankaimai River at Mainachuli have been adopted for this study. Sediment yield data are available for the wet seasons of 2001 to 2003. In total, 387 records were collected. Of these, about ten percent (38 records) were used for model testing and the remaining 349 for training.

The performance of a model can be evaluated in terms of accuracy, consistency and versatility. A versatile model is one that remains accurate and consistent when used for different applications. The numerical indicators used here are the root mean square error (RMSE), the R2 efficiency (Nash and Sutcliffe 1970), referred to below as the coefficient of efficiency (CE), and the coefficient of correlation (CC).

  (a) Root mean square error (RMSE). It measures the residual error as the square root of the mean squared error, expressed as:

    $$RMSE = \left[ {\sum\limits_{j = 1}^{n} {\left( {Y_{j} - \hat{Y}_{j} } \right)^{2} } /n} \right]^{1/2}$$
    (3)

    where \(Y\) and \(\hat{Y}\) are the estimated and observed values and n is the number of observations.

  (b) Correlation coefficient (CC). It is expressed as:

    $$CC = \frac{{\sum\limits_{j = 1}^{n} {\left( {\hat{Y}_{j} - \bar{\hat{Y}}} \right)\left( {Y_{j} - \bar{Y}} \right)} }}{{\left\{ {\sum\limits_{j = 1}^{n} {\left( {\hat{Y}_{j} - \bar{\hat{Y}}} \right)^{2} } \sum\limits_{j = 1}^{n} {\left( {Y_{j} - \bar{Y}} \right)^{2} } } \right\}^{1/2} }}$$
    (4)

    where \(\bar{Y}\) and \(\bar{\hat{Y}}\) are the means of the estimated and observed values.

  (c) Coefficient of efficiency (CE). Based on the standardization of the residual variance with the initial variance, it is expressed as:

    $$CE = \left\{ {1 - \frac{{\sum\limits_{j = 1}^{n} {\left( {\hat{Y}_{j} - Y_{j} } \right)^{2} } }}{{\sum\limits_{j = 1}^{n} {\left( {\hat{Y}_{j} - \bar{\hat{Y}}} \right)^{2} } }}} \right\} \times 100$$
    (5)
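The three indicators can be computed as in the sketch below. It follows the definitions in the text (estimated values \(Y\), observed values \(\hat{Y}\), CE expressed as a percentage) and uses the standard Pearson form for CC; the function names and the sample values are hypothetical.

```python
from math import sqrt


def rmse(estimated, observed):
    """Root mean square error, Eq. 3."""
    n = len(observed)
    return sqrt(sum((e - o) ** 2 for e, o in zip(estimated, observed)) / n)


def correlation_coefficient(estimated, observed):
    """Pearson correlation between observed and estimated values, Eq. 4."""
    n = len(observed)
    mean_est = sum(estimated) / n
    mean_obs = sum(observed) / n
    cov = sum((o - mean_obs) * (e - mean_est) for e, o in zip(estimated, observed))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_est = sum((e - mean_est) ** 2 for e in estimated)
    return cov / sqrt(var_obs * var_est)


def coefficient_of_efficiency(estimated, observed):
    """Nash-Sutcliffe coefficient of efficiency in percent, Eq. 5."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - e) ** 2 for e, o in zip(estimated, observed))
    initial = sum((o - mean_obs) ** 2 for o in observed)
    return (1.0 - residual / initial) * 100.0


# Hypothetical values (illustration only): every estimate is too high by 1
observed = [1.0, 2.0, 3.0, 4.0]
estimated = [2.0, 3.0, 4.0, 5.0]

error = rmse(estimated, observed)
cc = correlation_coefficient(estimated, observed)
ce = coefficient_of_efficiency(estimated, observed)
```

The example shows why the indicators are reported together: a constant bias leaves CC at its maximum while RMSE and CE both register the error.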

Sequential minimal optimization (SMO) was selected as the algorithm for training the SVMs because of its fast convergence (Cao et al. 2006; Catanzaro et al. 2008; Platt 1998). The RBF kernel has been used for rainfall–runoff prediction (Bürger et al. 2007; Han et al. 2007; Lin et al. 2006; Sivapragasam et al. 2001) and is used in this study as well. The accuracy of an SVM model depends on the selection of the model parameters; for regression with an RBF kernel these are the cost parameter C, epsilon and gamma (Friedrichs and Igel 2005; Staelin 2003; Wu et al. 2009). A grid search can be used to find a good value of epsilon (Lerman 1980; Lin et al. 2014) and has been adopted for this study. Once a good value of epsilon has been found, pairs of (C, gamma) are tried using a coarse grid search, and the pair with the lowest mean absolute error is picked. Trying exponentially growing sequences of C and gamma is a practical way to identify good parameters (for example C = \(2^{-5}, 2^{-3}, \ldots, 2^{15}\) and gamma = \(2^{-15}, 2^{-13}, \ldots, 2^{3}\)). After a “better” region of the grid has been identified, a finer grid search is conducted on that region.
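The coarse-to-fine search over exponentially spaced values can be sketched as follows. The sketch is independent of any SVM library: `toy_error` is a hypothetical stand-in for the cross-validated mean absolute error of a trained model, and the helper names and the refinement step of 0.25 are assumptions made for illustration.

```python
from itertools import product
from math import log2


def power_of_two_grid(start_exp, stop_exp, step):
    """Return 2**start_exp, ..., 2**stop_exp with exponents spaced by step."""
    values = []
    k = 0
    while start_exp + k * step <= stop_exp + 1e-9:
        values.append(2.0 ** (start_exp + k * step))
        k += 1
    return values


def grid_search(c_values, gamma_values, error_fn):
    """Return the (C, gamma) pair with the lowest error, and that error."""
    best = min(product(c_values, gamma_values), key=lambda pair: error_fn(*pair))
    return best, error_fn(*best)


def toy_error(c, gamma):
    # Hypothetical error surface with its minimum at C = 2**5, gamma = 2**-3.
    # In practice this would train a model and return its cross-validated MAE.
    return (log2(c) - 5.0) ** 2 + (log2(gamma) + 3.0) ** 2


# Coarse grid with the example ranges quoted in the text
coarse_c = power_of_two_grid(-5, 15, 2)
coarse_gamma = power_of_two_grid(-15, 3, 2)
(best_c, best_gamma), coarse_err = grid_search(coarse_c, coarse_gamma, toy_error)

# Finer grid centred on the best coarse pair (one exponent unit either side)
fine_c = power_of_two_grid(log2(best_c) - 1, log2(best_c) + 1, 0.25)
fine_gamma = power_of_two_grid(log2(best_gamma) - 1, log2(best_gamma) + 1, 0.25)
(fine_best_c, fine_best_gamma), fine_err = grid_search(fine_c, fine_gamma, toy_error)
```

Searching on exponents rather than raw values keeps the number of trials small while still covering several orders of magnitude.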

To avoid overfitting, 10-fold cross-validation is used. Extensive tests on numerous datasets, with different learning techniques, have shown that 10 is about the right number of folds to get the best estimate of error, and there is also some theoretical evidence to support this. In 10-fold cross-validation the data are divided randomly into ten parts, in each of which the class is represented in approximately the same proportions as in the full dataset. Each part is held out in turn and the learning scheme is trained on the remaining nine-tenths; its error rate is then calculated on the holdout set. The learning procedure is thus executed a total of 10 times on different training sets (which have a lot in common with one another). Finally, the ten error estimates are averaged to yield an overall error estimate.
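The procedure just described can be sketched in a few lines. This is a minimal illustration with a plain random split and no stratification; the helper names, the `fit_and_score` callback and the toy target values are hypothetical, and a mean predictor stands in for the actual learning scheme.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_error(n_samples, fit_and_score, k=10, seed=0):
    """Average the holdout error over k folds.

    fit_and_score(train_idx, test_idx) must train on the first index list
    and return the error measured on the second.
    """
    folds = k_fold_indices(n_samples, k, seed)
    errors = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        errors.append(fit_and_score(train_idx, test_idx))
    return sum(errors) / k


# Hypothetical target values (illustration only)
targets = [float(i % 7) for i in range(349)]


def mean_predictor_mae(train_idx, test_idx):
    # Stand-in learner: predict the training mean, score by mean absolute error
    mean_train = sum(targets[i] for i in train_idx) / len(train_idx)
    return sum(abs(targets[i] - mean_train) for i in test_idx) / len(test_idx)


cv_mae = cross_validated_error(len(targets), mean_predictor_mae)
```

Because every record is held out exactly once, the averaged error uses all of the data for both training and assessment.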

Results and discussions

Four and a half years of input data were used to optimize the model parameters C, γ and epsilon. To optimize epsilon, a coarse grid search was applied using 10-fold cross-validation. The values of C, γ and epsilon were tried in exponentially growing sequences, with C = \(2^{0}, 2^{2}, \ldots, 2^{22}\), γ = \(2^{-15}, 2^{-13}, \ldots, 2^{3}\) and epsilon = \(2^{-1}, 2^{-0.5}, 2^{0}, 2^{0.5}, 2^{1}\) and \(2^{2}\). Considering the number of support vectors and the mean absolute error (MAE), epsilon = \(2^{0.5}\) was selected. For this epsilon, the “better region” of the grid was found around C = \(2^{14}\) and γ = \(2^{1}\). A finer grid search in the neighborhood of (C = \(2^{14}\), γ = \(2^{1}\)) then gave the improved values C = \(2^{14}\), γ = \(2^{0.775}\) and epsilon = \(2^{1.3}\) (Figs. 3, 4).

Fig. 3 MAE of discharge for gamma < 0

Fig. 4 MAE of discharge for gamma > 1

Input variables for 6 months, i.e. 183 instances, were used for testing, and the temporal variation of runoff was computed using the optimized model parameters C = \(2^{14}\), γ = \(2^{0.775}\) and epsilon = \(2^{1.3}\) obtained through the training process. The computed runoff is compared with the observed runoff in Fig. 5, and the evaluation parameters are given in Table 3.

Fig. 5 Runoff prediction using three different models

Table 3 Runoff prediction model

Comparison of SVM, ANN and regression model

The prediction of the proposed model is also compared with the results of the ANN and regression models proposed by Sharma and Shakya (2006). The comparison is shown in Fig. 5 and the statistical parameters are given in Table 4.

Table 4 Comparison of SVM, ANN and regression models

The numerical performance indicators (RMSE, R2, CC) are self-explanatory. The SVM did not perform better than the ANN for runoff prediction, and calibrating the parameters of the SVM model requires a high-configuration computer. There therefore remains scope for improving the SMOreg model through the choice of kernel mapping function and epsilon.

Sediment yield prediction model

The records of suspended sediment yield measured in the Kankaimai River at Mainachuli have been adopted for the present study. Sediment data are available for the wet seasons of 2001 to 2003. In total, 387 records were collected. Of these, 10 % were used for model testing and the rest for training the model.

Training the data for optimizing the model parameters

A total of 349 records were used to optimize the model parameters C, epsilon and gamma. Applying the same procedure as discussed above gave the improved parameter values C = \(2^{25.5}\), γ = \(2^{-0.8}\) and epsilon = \(2^{11.85}\). Input variables for 38 records were used for testing, and the temporal variation of sediment yield at the Mainachuli station was computed using these optimized parameters. The observed and estimated suspended sediment yields are compared in the validation plot (Fig. 6), and the evaluation parameters are given in Table 5.

Fig. 6 Sediment prediction using three different models

Table 5 Sediment yield prediction model

The validation plot (Fig. 6) shows that the model closely estimates and follows the observed values. Some of the predicted sediment yields are negative, which is physically impossible; this is an artifact of the machine learning tool used. For practical purposes, negative values are taken as zero.
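The treatment of negative predictions can be stated in one line of code. This is a minimal sketch of the post-processing step described above; the function name and the sample values are hypothetical.

```python
def clamp_non_negative(predictions):
    """Replace physically impossible negative sediment yields with zero."""
    return [max(0.0, p) for p in predictions]


# Hypothetical raw model outputs in tonnes/day (illustration only)
raw = [-3.5, 0.0, 12.25]
cleaned = clamp_non_negative(raw)
```

Clamping changes the error statistics slightly, so it matters whether the indicators are computed before or after this step; the text does not say which was done.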

Comparison of SVM, ANN and regression model

Figure 6 plots the observed sediment yield and the sediment yield computed using the SVM, ANN and regression models as a function of time in days, marked serially from 1 to 38. The numerical performance indicators (root mean square error, coefficient of efficiency and correlation coefficient) presented in Table 6 show that the SVM model is superior to the ANN model and far superior to the regression model. This is also supported by the mass curve.

Table 6 Comparison of SVM, ANN and regression models

Conclusions

The Kankaimai watershed shows moderately high peak flows of short duration. It is characterized by homogeneous lithology with little influence of geological structure. The drainage density is high, which promotes a quick response of sediment yield and runoff. The number and length of streams have an exponential relation with stream order. The basin is predominantly covered by sparse and medium vegetation and has a moderately high rate of soil erosion. The land area covered by dense vegetation is comparatively small, which results in the formation of a larger number of streams.

The error analysis comparing the three approaches (SVM, ANN and regression) for sediment yield produced correlation coefficients of 0.979, 0.97 and 0.97, respectively, and Nash coefficients of 0.948, 0.93 and 0.85, respectively. These values suggest that SVM is somewhat superior in accuracy for sediment yield. For runoff prediction, the correlation coefficients were 0.85, 0.91 and 0.78 for SVM, ANN and regression, respectively, and the Nash coefficients were 0.68, 0.82 and 0.57. These results show that SVM performs better than regression in terms of accuracy and efficiency. Although the SVM results for runoff are satisfactory, they fall short of those of the ANN, because the SVM model does not adequately capture peak flows. There is ample scope for significant improvement through the use of a higher-configuration computer, other kernel functions and specifically chosen algorithms.