Abstract
River flow variations directly affect the hydro-climatological, environmental, and ecological characteristics of a region. Accurate river flow prediction is therefore critically important for water managers and planners. The present study compares different data-driven models for predicting monthly flow. It examines two river catchments in Guilan province, Iran, where rivers play an essential role in agricultural production (mainly rice). The monthly river flow dataset, covering 1986–2015, was provided by the Guilan Regional Water Authority. The models fall into two classes: stochastic and machine learning (ML). The stochastic model is the seasonal autoregressive integrated moving average (SARIMA); the ML models are the least squares support vector machine (LSSVM), the adaptive neuro-fuzzy inference system (ANFIS), and the group method of data handling (GMDH). Inputs were selected from the flow rates of previous months using the autocorrelation and partial autocorrelation functions (ACF and PACF). The data were divided into training (75%) and testing (25%) sets, and the models were then implemented. Predictions were evaluated with the root mean square error (RMSE), normalized RMSE (NRMSE), and Nash–Sutcliffe (NS) coefficient. Based on the test-phase values (RMSE = 1.138 cms, NRMSE = 0.109, and NS = 0.826), the SARIMA model outperformed its ML competitors. Among the ML models, GMDH performed best (RMSE = 1.290 cms, NRMSE = 0.124, and NS = 0.777) because it has more optimization parameters and a larger sample space for building the network. The models were also evaluated under hydrological drought conditions in both rivers. The results show that these models, especially the stochastic SARIMA model, predict river flow well under drought conditions.
The NRMSE values, which range between 0.1 and 0.2, indicate that prediction accuracy is within an acceptable range, and the results of the current approaches are promising. Consequently, a comparison of linear stochastic models with complex black-box ML models shows that linear stochastic models are better suited to monthly river flow prediction in the study region.
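The evaluation criteria named in the abstract can be illustrated with a short sketch. It is not the authors' code: the function names are hypothetical, the toy flow values are invented for illustration and are not from the study, and the NRMSE normalizer (the mean of the observations) is an assumption, since the abstract does not say how the RMSE was normalized. The split helper assumes the 75%/25% division is chronological, which is usual for time series but is not stated in the abstract.

```python
import math


def chronological_split(series, train_fraction=0.75):
    """Split a time series into leading training and trailing testing parts.

    Assumption: the 75%/25% division is chronological (no shuffling).
    """
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def rmse(observed, predicted):
    """Root mean square error, in the units of the data (e.g. cms)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def nrmse(observed, predicted):
    """RMSE divided by the mean observed value.

    Assumption: normalization by the mean; other conventions
    (e.g. the observed range) also exist.
    """
    mean_observed = sum(observed) / len(observed)
    return rmse(observed, predicted) / mean_observed


def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe coefficient: 1 minus the ratio of the squared
    prediction errors to the squared deviations from the observed mean."""
    mean_observed = sum(observed) / len(observed)
    error_sum = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance_sum = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - error_sum / variance_sum


if __name__ == "__main__":
    # Toy values for illustration only (not data from the study).
    observed = [2.0, 4.0, 6.0, 8.0]
    predicted = [3.0, 4.0, 5.0, 8.0]
    print("RMSE :", round(rmse(observed, predicted), 4))
    print("NRMSE:", round(nrmse(observed, predicted), 4))
    print("NS   :", round(nash_sutcliffe(observed, predicted), 4))

    # 30 years of monthly records (1986-2015) give 360 values.
    train, test = chronological_split(list(range(360)))
    print("train/test sizes:", len(train), len(test))
```

An NS value of 1 indicates a perfect fit, while a value of 0 means the predictions are no better than always predicting the observed mean; lower RMSE and NRMSE indicate better predictions.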
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
The authors thank the Guilan Regional Water Authority for providing the river flow data, and the editor and reviewers for the valuable time they spent reviewing the article.
Author information
Contributions
Conceptualization: Pouya Aghelpour.
Methodology: Zahra Hamedi, Pouya Aghelpour, and Hedieh Khodakhah.
Software: Hedieh Khodakhah and Pouya Aghelpour.
Formal analysis and investigation: Hedieh Khodakhah.
Writing—original draft preparation: Hedieh Khodakhah and Zahra Hamedi.
Writing—review and editing: Zahra Hamedi and Pouya Aghelpour.
Resources: Hedieh Khodakhah.
Supervision: Pouya Aghelpour.
Visualization: Pouya Aghelpour and Hedieh Khodakhah.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Responsible Editor: Marcus Schulz
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Khodakhah, H., Aghelpour, P. & Hamedi, Z. Comparing linear and non-linear data-driven approaches in monthly river flow prediction, based on the models SARIMA, LSSVM, ANFIS, and GMDH. Environ Sci Pollut Res 29, 21935–21954 (2022). https://doi.org/10.1007/s11356-021-17443-0
DOI: https://doi.org/10.1007/s11356-021-17443-0