Abstract
While there are different methods and models that can be applied to estimate the qualitative and quantitative parameters of water resources, unfortunately, no comprehensive qualitative and quantitative data exist about water resources in Iran. The present study is to compare the performance of the artificial neural network (ANN) and the multivariate regression methods in simulating spring discharge in the Caspian Southern Watersheds. Multivariate regression method was used by using SPSS software. Springs average discharge was considered as the dependent variable and other affecting factors as independent variables. Two linear models were presented for estimating the alluvial and karst springs discharge. Then, the models’ performance was evaluated and confirmed. Also, the artificial neural network was applied to simulate the alluvial and karst springs discharge. ANN performance was evaluated through two parameters: median root of square of the error and Pearson’s R-squared statistics. The results showed that the most important factors of karst springs discharge were the porosity of aquifer formation and the site elevation; in case of the alluvial springs, the transmissivity of aquifer formation and the aquifer depth were the most important factors. Moreover, ANN efficiency in estimating springs discharge was higher than that of the multivariate regression method.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Introduction
Throughout the world, water is considered the most vital element of life. In addition, due to factors such as population increase, development in agricultural, industrial and other human activities, this vital role has become more prominent (Khaleghi and Varvani 2018a, b). The scarcity of appropriate surface water resources and the increasing water demands have caused greater use of groundwater resources (Jang 2015). Furthermore, due to industrial developments, increase in population, and lack of observing the ecological standards, the possibility of water resources pollution has increased. Springs are one of the most important water resources that provide a part of human required water. In the field of water resources management, conducting qualitative and quantitative studies such as the study of springs discharge is so important. On the other hand, the computerized models have provided the tool for water resources management (Gualbert and Essink 2001). Recently, it has become widespread to use the computerized mathematical models in groundwater management. Furthermore, different methods and techniques have been presented for hydrologic and hydrogeological parameters simulations. Lallahem et al. (2005) came to this conclusion that the artificial neural network (ANN) is efficient in groundwater modeling. To evaluate the groundwater level, Ioannis et al. (2005) applied the ANN and found that, when there is limited data at hand, this model can present acceptable groundwater estimation. Using ANN, Krishna et al. (2008) presented a model for groundwater in Kakinada, a coastal city in India. She stated that this model along with the BP (back propagation) network, and the LM algorithm can present the best simulation. For presenting models for groundwater salinity, Hall et al. (2001) conducted a study in Thailand and presented some models for salinity management and estimation. Coppola et al. (2005) found that the ANN technology can serve as a powerful and accurate simulation and management tool that minimizes the degradation of groundwater quality to the extent possible by identifying appropriate pumping policies under variable and/or changing climate conditions. Nichols and Verry (2001) developed regression equations relating yearly groundwater recharge and stream flows to the seasonal precipitation amounts in north central Minnesota. In addition, the previous studies have shown that the impact of climate changes on groundwater is site-specific (Brouyere et al. 2004). Alvis et al. (2005) investigated alluvial aquifer and its changes in the South Dakota. Their results showed that the water table depth and discharge values varied in different places. Auckenthaler et al. (2005) presented a linear model to simulate karst springs discharge in Switzerland, and their results showed the effect of aquifer formation on karst springs discharge. Worthington (1991) studied karst areas hydrology and his results showed that porosity volume of aquifer formation is an important factor on karst areas hydrologic conditions. Prohaska and Stevanovic (1993), Zhang et al. (1996), Dimitrov et al. (1997) and Bonacci (2001) presented some models to simulate karst springs discharge by using different methods. Their results showed the effects of formation type, precipitation, and water resources such as rivers on springs discharge. Generally, effective factors in alluvial springs discharge are the depth of groundwater, precipitation value and evaporation, topography and distance from water resources, and specifically, the porosity of aquifer formation and aquifer depth in karst springs (Brunner and Kinzelbach 2005; Zhang 2001). The purpose of this study is to use the ANN and multivariate regression to simulate alluvial and karst springs discharge and also to compare the performance of these methods in estimating the springs discharge in the Caspian Southern Watersheds.
Study area
Mazandaran province (study area) is located in the north of Iran. This province is located north of the Alborz Mountains, west of Gilan province, and east of Golestan province. This study was done on the southern Caspian watersheds, in the surface of central Alborz highlands (karst areas) and Caspian Southern coasts (Quaternary alluvial sediments). The studied area was located in the eastern longitude 50°30′–53°50′ and northern latitude 35°45′–36°45′. The location of the study area and karst and alluvial springs is presented in Fig. 1. The Caspian Southern coasts include the plain that are made up of Quaternary sediments; however, in the central Alborz, different geologic formations, elevations, and slopes changes are observed. Elevation maximum is 5670 m in Damavand crest. Karst springs are located in the central Alborz highlands (karst areas) and alluvial springs in the Caspian Southern coasts (alluvial sediments).
Materials and methods
Artificial neural network
Overview of the artificial neural networks
The artificial neural network is comprised of neurons network. It takes the cue from the neurons’ biological counterparts, so the neurons that are capable of learning can be trained to find the appropriate solutions, recognize patterns, classify data, and even forecast the future events. As a result, they have found a wide range of applications in simulating very complex relationships and modeling many hydrologic problems including rainfall–runoff modeling and streamflow forecasting (Hsu et al. 2002; Riad et al. 2004; Mazvimavi et al. 2005). Such a network usually consists of many layers arranged in serial orders while each layer contains a group of neurons with the same pattern of connections to the neurons in the other layer(s). The first and last layers are used as input and output variables, while the transitional layers are usually connoted as the hidden layer that, depending on the complexity of the problem, can be either one or more. The weights to a neuron are automatically adjusted by training the network according to learning rules until it properly simulates the past data or performs the desired task. Mathematical functions, known as neuron transfer functions, are used to transform the input to output, for each neuron.
The log-sigmoid transfer function is commonly used for hidden layer neurons, especially with the back-propagation algorithm. Two separate stages are involved in the hidden (transitional) and output layers in the transformation of their inputs to respective outputs. First, a neuron’s input is multiplied by its corresponding weight and the total sum of those input products a constant term yields the neuron net input. The second stage entails the transformation of the net input into output. The optimum network architecture (plant model) can then be used to predict future behavior of the plant. An optimization algorithm is used to select the control input that optimizes future performance. Figure 2 shows a typical example of a one hidden layer feedforward neural network architecture.
Feedforward back-propagation network (FFBP)
One of the basic types of a neural network is multi-layered perceptron (MLP), which consists of three layers of neurons, namely input, hidden, and output as well as the flows of information in the forward direction (Fig. 3). Using a simple back-propagation (BP) algorithm, such a feedforward network can be trained conveniently. The resulting network is called feedforward back propagation (FFBP) that involves the minimization of the error value between realized and target outputs, according to the steepest or gradient descent approach. In such an approach, the correction to a weight (W) is made proportional to the rate of global error change (E) with respect to that weight, i.e. (Eq. 1):
For FFBP, the number of hidden layers and the number of the nodes in the input and hidden layers were determined after trying various network structures. There are several methods to avoid overfitting in ANNs. These methods are summarized in Giustolisi and Laucelli (2005).
Analysis of the data
For this study, first, a 35-year time period data were provided from Iranian Water Research Institution (TAMAB). These data and statistics involved surface and groundwater quantitative experiments, type of aquifer formation, the mean depth of groundwater, and statistics related to precipitation and evaporation values from climatology stations in Mazandaran Province. The data of 57 rain gauge stations were also gathered. Eighty-two alluvial springs in the Caspian Southern coasts and 80 karst springs in the central Alborz highlands were studied. The used spring’s discharges data were used for the development of a model that considered the different combinations of inputs variables, e.g. precipitation, groundwater levels, and elevation sites. For both of the two models (karst and alluvial springs), available data were divided so that 80% of the data were used for training and the remaining 20% for validation processes. In this research, since the number of data points was more than the number of used parameters by the network (weights and biases), the additional analysis for the detection of network overtraining seemed unnecessary. In other words, cross-validation analysis was not required to stop overtraining (Salehi et al. 2000). Data usage by an ANN model typically requires data scaling. This may be due to the particular data behavior or limitations of the transfer functions. For example, as the outputs of the logistic transfer function are between 0 and 1, the data are generally scaled in the range 0.1–0.9 or 0.2–0.8, to avoid problems caused by the limits of the transfer function (Maier and Dandy 2000). In this study, the data were scaled in the range of − 1 to + 1, based on the following Eq. (2):
where p0 is the observed data, pn is the scaled data, and pmax and pmin are maximum and minimum observed data points, respectively. In order to provide consistency of the analysis, the above equation was used to scale the average air temperature, streamflow, and precipitation data; then the unit of the scaled pn would correspond to individual data series.
-
1.
Root-mean-squared error (RMSE)
In this equation, Obs refers to observed values, calc to values calculated by the network and the model, and n to the number of data in each step. The nearer is RMSE to zero, the nearer are the observed and calculated values to each other, and the more accurate is the simulation in each step.
$${\text{RMSE}} = \sqrt {\sum\limits_{i = 1}^{n} {\frac{{\left( {{\text{obs}} - {\text{calc}}} \right)}}{n}} }$$(3) -
2.
The Pearson’s R-squared statistics (RSqr)
$${\text{Rsqr}} = \left[ {\frac{{\sum\limits_{i = 1}^{n} {\left( {Qi - \overline{Qi} } \right) \cdot \left( {\hat{O}i - \tilde{O}i} \right)} }}{{\sqrt {\sum\limits_{i = 1}^{n} {\left( {Qi - \overline{Qi} } \right)^{2} \cdot \sum\limits_{i = 1}^{n} {\left( {\hat{O}i - \tilde{O}i} \right)^{2} } } } }}} \right]^{2}$$(4)where Qi is the observed value, Ôi is the modeled value, Ōi is the mean of the observed data, Õi is the mean of the modeled data, and n is the number of data in each stage of test and terrain (Eq. 4).
The purpose of network training is to have a network that can improve the relationship between the input and output of the model. Due to the lack of a single special value for planning the ANN, various structures were evaluated. Eighty percent of the data were used in the training stage and twenty percent in the test or validating stage. For training and testing a neural network in a model, the number and type of input parameters are very important. Eight input patterns planned for alluvial springs discharge simulation are as follows:
In the above structures:
-
Qspring is the mean spring discharge (m3/s).
-
D is the mean depth of grand water (m).
-
L is the distance from water resources (m).
-
H is the site elevation (m).
-
P is the mean annual precipitation (mm).
-
W is the water table (m).
-
T is the mean transmissivity formation of aquifer formation (m2/day).
Also, eight input patterns planned for karst springs discharge simulation are as follows:
In the above formulas:
-
Qspring is the spring average discharge (m3/s).
-
P is the porosity percentage of aquifer formation.
-
L is the distance from water resources (m).
-
H is the site elevation (m).
-
R is the annual mean precipitation (mm).
-
S is the site slope (degree).
Multivariate regression method
Statistical analysis was performed through SPSS software using a stepwise method. Spring average discharge was considered as independent variables. For karst springs, the aquifer formation porosity (%), aquifer depth, site elevation, annual precipitation, site slope, and distance from the water resources (lakes and rivers) were studied as independent variables. For alluvial springs, the transmissivity of aquifer formation (T), aquifer depth, water table depth, annual precipitation, site elevation, and distance from the water resources were considered as independent variables. Eighty-two alluvial springs in the Caspian Southern coasts and 80 karst springs in the central Alborz highlands were studied. Eighty percent of the sample springs were applied to propose the models, and 20 percent were used to validate the models’ efficiencies. Through this method, two linear models were presented to simulate alluvial and karst springs discharge (Formulas 5, 6). In the next step, efficiencies of the presented linear models for simulation of springs discharge were evaluated. In this stage (validation), the presented models were applied for simulating the spring’s discharges that were not used in the modeling stage.
Results
As it has been mentioned, in this study, multivariate regression has been applied to present two models for estimating springs discharges. The Correlations between each of the springs discharge factors and springs average discharge are presented in Table 1. The presented linear models for karst and alluvial springs were significant (Tables 2, 3), and R values for karst and alluvial springs models were 0.73 and 0.7 (R2 = 0.53 and 0.5), respectively. The obtained results for karst springs discharge are given in the following linear model (Eq. 5):
That Qspring is the mean spring discharge (l/s), P is the aquifer formation porosity (%), and H is the site elevation (m).
The obtained results for alluvial springs discharge are presented in the following linear model (Eq. 6):
That Qspring is the mean spring discharge (l/s). T is transmissivity of the aquifer formation (m2/day). HQ is the water table depth (m).
The presented models were used for validating the performance of those sites (springs) that were not used for modeling. The performances of these models were proved by comparing the values estimated by the model with the recorded values obtained from the previous qualitative water experiments (Fig. 4). Comparison of the models estimated values and recorded values provided by TAMAB (Water Resources Researches Organization of Iran) confirmed the linear model’s efficiencies for simulating karst and alluvial the mean springs discharge in the central Alborz highlands and the Caspian Southern coasts.
Also, the artificial neural network was used through three learning techniques: GDX, CG, and LM, and the best learning technique was selected through determining (RMSE) and Rsqr criteria. Tables 4 and 5 show the best results of various models with changing neuron numbers and types in learning technique for all eight network structure. Moreover, to evaluate the effect of the change in the neurons number for estimating the spring discharge in eight models, the numbers of neurons were evaluated consecutively with a changed domain of 2–10. The results showed that there was no exact and definite rule for determining neuron numbers in the hidden layer while estimating the spring’s discharges. On the other hand, Tables 4 and 5 show the RMSE changes toward various learning techniques for all 8 models in the testing stage. Figures 5 and 6 show the evaluation of neural network performance in spring discharge simulation. The presented model along with the ANN can be used for future planning of sustainable groundwater development and management of the groundwater resources (Yasmin 2008; Samani et al. 2007).
Discussion
In this study, using equal data springs discharge was simulated in the central Alborz highlands and the Caspian coasts through two methods of multivariate regression and artificial neural network. The results of these two methods indicated that the aquifer formation porosity (%) had the highest correlation with karst springs discharge (Zhang et al. 1996; Dimitrov et al. 1997). Also, elevation had a significant relationship with karst springs discharge. But in the case of the alluvial springs, the elevation factor was not important and did not reveal any the significant relationship with springs discharge. For alluvial springs discharge, both transmissivity (T) of aquifer formation and aquifer depth had significant relations with spring discharge and they were the most important factors in alluvial springs discharge (William 2003; Gholami et al. 2008). The evaluation of linear models performances showed that the maximum correlations between the simulated and observed values were 0.63 and 0.73 (karst and alluvial springs), respectively. Nonetheless, using the same data for evaluation of the Artificial ANN’s efficiency, the correlation between estimated and observed values was 81 and 79% (karst and alluvial springs, respectively) correspondingly.
Therefore, the efficiency and accuracy of the ANN in springs discharge simulation were greater than the multivariate regression. This finding was consistent with the results of the past studies that showed the high capability of the neural network with LM learning technique in estimating and simulating groundwater parameters (Prohaska and Stevanovic 1993). Generally, the neural network has the capacity to extract the dominant rules of the data, even for scrambled data. This characteristic of the neural network can be considered as the most outstanding characteristic of these techniques, in comparison with the other methods. Considering the appropriate type of learning technique, the proper number of neurons and hidden layer, the suitable type and the number of input factors, and also through applicable calibration, it can be said that this technique is an efficient and appropriate tool for estimating springs discharge in any investigation (Gholami et al. 2018).
Conclusions
The performance of the artificial neural network with a multi-layered perceptron format and LM learning technique was proved in the current study. Considering the results of the network efficiency analysis for different models, and comparing the obtained results with the real data, it can be concluded that the second and seventh models, out of the whole sixteen suggested models, were the best for simulating karst and alluvial springs discharge. In this study, training the network through three learning techniques was investigated and the results indicated that in comparison with the CG, GDX learning techniques, the LM learning technique showed a higher learning speed as well as a greater error decrease. Therefore, through the application of the artificial neural network, it is possible to estimate springs discharge and other qualitative and quantitative water parameters even in data-less situations. Moreover, this approach is applicable in the field of water resources management. Also, for further studies, it is suggested that spring discharges modeling by using the other artificial intelligence.
References
Alvis A, Hargrave R, Francisco E, Fischer D (2005) Aquifer vulnerability of the Inyan Kava Group, Blackhawk QuaDRANGLE, SOUTH Dakota. In: Western South Dakota hydrology conference
Auckenthaler A, Reichert P, Huggenberger P (2005) Modeling discharge and microorganism transport in a karst aquifer. Geophys Res Abstr 7:01603
Bonacci O (2001) Analysis of the maximum discharge of karst springs. Hydrogeol J 9(4):328–338. https://doi.org/10.1007/s100400100142
Brouyere S, Carabin G, Dassargues A (2004) Climate change impacts on groundwater reserves: modeled deficits in a chalky aquifer. Geer Basin, Belgium. Hydrogeol J 12(2):123–134. https://doi.org/10.1007/s10040-003-0293-1
Brunner P, Kinzelbach W (2005) Groundwater modeling in remote Chinese Basin—how can models be improved in areas where data are scarce? European Geosciences Union
Coppola EA, Mclane CF, Poulton MM, Szidarovszky F, Magelky RD (2005) Predicting conductance due to upcoming using neural networks. J Groundw 43(6):827–836. https://doi.org/10.1111/j.1745-6584.2005.00092
Dimitrov D, Machkova M, Damyanov G (1997) On the karst spring discharge forecasting by means of stochastic modeling. In: Günay G, Johnson AI (eds) Karst waters and environmental impacts (Proceedings of international symposium Antalya, Turkey, 1995). Balkema, Rotterdam, pp 353–359
Gholami V, Azodi M, Salimi Taghvaye (2008) Modeling of karst and alluvial springs discharge in the central Alborz highlands and on the Caspian Southern coasts. Casp J Environ Sci 6(1):41–45
Gholami V, Booij MJ, Tehrani EN, Hadian MA (2018) Spatial soil erosion estimation using an artificial neural network (ANN) and field plot data. CATENA 163:210–218
Giustolisi O, Laucelli D (2005) Improving generalization of artificial neural networks in rainfall–runoff modeling. Hydrol Sci J 50(3):439–457. https://doi.org/10.1623/hysj.50.3.439.65025
Gualbert H, Essink O (2001) Fresh groundwater supply-problems and solutions center of hydrology (ICHU). Institute of Earth Science, Ocean & Coastal Management
Hall N, Greiner R, Yangvanit S (2001) Modeling salinity management at from and catchment level in NSW and Thailand and Modsim 2001. Australian National University, Canberra
Hsu K, Gupta HV, Sorooshian GS, Imam B (2002) Self-organizing linear output map (SOLO): an artificial neural network suitable for hydrologic modeling and analysis. J Hydrol 38(12):1–17. https://doi.org/10.1029/2001WR000795
Ioannis N, Daliakopoulosa P, Coulibalya I, Tsanis K (2005) Groundwater level forecasting using artificial neural networks. J Hydrol 309:229–240. https://doi.org/10.1016/j.jhydrol.2004.12.001
Jang CH (2015) Probability-based classifications for spatially characterizing the water temperatures and discharge rates of hot springs in the Tatun Volcanic Region, Taiwan. Environ Monit Assess 187(5):297. https://doi.org/10.1007/s10661-015-4520-8
Khaleghi MR, Varvani J (2018a) Simulation of the relationship between river discharge and sediment yield in the semi-arid river watersheds. Acta Geophys 66:109–119. https://doi.org/10.1007/s11600-018-0110-9
Khaleghi MR, Varvani J (2018b) Sediment rating curve parameters relationship with Watershed, characteristics in the Semiarid River Watersheds. J Arab J Sci Eng. https://doi.org/10.1007/s13369-018-3092-7
Krishna B, Satyajit Rao YR, Vijaya T (2008) Modeling groundwater levels in an urban coastal aquifer using artificial neural networks. Hydrol Process 22:1180–1188. https://doi.org/10.1002/hyp.6686
Lallahem S, Mania J, Hani A, Najjar Y (2005) On the use of neural networks to evaluate groundwater levels in fractured media. J Hydrol 307:92–111. https://doi.org/10.1016/j.jhydrol.2004.10.005
Maier HR, Dandy GC (2000) Neural networks for prediction and forecasting of water resources variables: a review of modeling issues and applications. Environ Model Softw 15:101–123
Mazvimavi D, Maijerink AM, Savenije HH, Stein A (2005) Prediction of flow characteristics using multiple regression and neural networks: a case study in Zimbabwe. Phys Chem Earth 30:639–647. https://doi.org/10.1016/j.pce.2005.08.003
Nichols DS, Verry ES (2001) Streamflow and groundwater recharge from the small forested watershed in north-central Minnesota. J Hydrol 89–103. PII: soo22-1694(01)00337-7
Prohaska S, Stevanovic Z (1993) The development of the autocross-regression model for karst spring flow simulation. Theor Appl Karstol 6:151–155
Riad S, Mania J, Bouchaou L, Najjar Y (2004) Predicting catchment flow in a semi-arid region via an artificial neural network technique. Hydrol Proc J 18(13):2387–2393. https://doi.org/10.1002/hyp.1469
Salehi F, Prasher SO, Amin S, Madani A, Jebelli SJ, Ramaswamy HS, Drury CT (2000) Prediction of annual nitrate-N losses in drain outflows with artificial neural networks. ASAE 43(5):1137–1143
Samani N, Gohari-Moghadam M, Safavi A (2007) A simple neural network model for the determination of aquifer parameters. J Hydrol 340:1–11. https://doi.org/10.1016/j.jhydrol.2007.03.017
William BW (2003) Conceptual models for karstic aquifers. Re-published by permission from. In: Palmer AN, Palmer MV, Sasowsky ID (eds) Karst modeling: special publication 5. The Karst Waters Institute, Charles Town, pp 11–16
Worthington SR (1991) Karst hydrogeology of the Canadian Rocky Mountains. PhD thesis, McMaster University, Hamilton, p 227
Yasmin R (2008) Groundwater modeling of the northeastern part of barind tract its sustainable development and management, Bangladesh. Asian J Inf Technol 7(5):218–225
Zhang M (2001) Information-statistics evaluation of the effects of groundwater buried depth to upper soil and groundwater salinity. China Postdoctoral Preceding Science Press, Beijing, pp 221–224
Zhang YK, Bai EW, Libra R, Rowden R, Huaibai L (1996) Simulation of spring discharge from a limestone aquifer in Iowa, USA. USA Hydrogeol J 4(1):41–54. https://doi.org/10.1007/s100400050087
Acknowledgements
We thank TAMAB (Water Resources Research Organization of Iran) for providing the data for aquifers and groundwater and also for helping us with the data-preprocessing.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Gholami, V., Khaleghi, M.R. A comparative study of the performance of artificial neural network and multivariate regression in simulating springs discharge in the Caspian Southern Watersheds, Iran. Appl Water Sci 9, 9 (2019). https://doi.org/10.1007/s13201-018-0886-4
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s13201-018-0886-4