Abstract
Hydrologic forecasting is an important tool in water resource management for mitigating disasters and managing water infrastructure. In hydrology, many time series forecasting models have been developed over the years, including data-driven models. In this research, the Support Vector Machine (SVM), an Artificial Intelligence (AI) technique, is used to forecast monthly discharge in the Pemali River Basin, Indonesia. The SVM, optimized with the fast messy genetic algorithm (fmGA), is employed for time series forecasting in which the input data rely solely on previous values. Model performance is assessed with three performance metrics and compared against the Seasonal Autoregressive Integrated Moving Average (SARIMA) method. Scenarios with different input patterns are constructed to identify the input data that give good prediction accuracy. Input data are developed with and without a selection mechanism. The selection mechanism is based on the Autocorrelation Function (ACF) coefficients of the time series, whereas input data without the selection mechanism consist of monthly discharge at lags of up to 12 months (Qt-1, Qt-2, …, Qt-12). The results show that well-correlated input data give good prediction accuracy, and that including poorly correlated input series may decrease model performance. However, with a proper combination of input data, the SVM can achieve good forecasting accuracy even when poorly correlated inputs are included. In addition, an appropriate input data combination can reduce the number of support vectors in the SVM, thus lowering the risk of overfitting.
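The ACF-based selection of lagged inputs described above can be illustrated with a minimal sketch. It is not the authors' implementation. The synthetic discharge series, the function names (`sample_acf`, `select_lags`, `build_lagged_inputs`), and the selection threshold (the approximate 95% bound 1.96/sqrt(n)) are all assumptions made for illustration; the abstract does not state which cutoff was applied to the ACF coefficients.

```python
import numpy as np


def sample_acf(series, max_lag):
    """Sample autocorrelation coefficients for lags 1..max_lag."""
    x = np.asarray(series, dtype=float)
    centered = x - x.mean()
    denom = np.sum(centered ** 2)
    return np.array([
        np.sum(centered[lag:] * centered[:-lag]) / denom
        for lag in range(1, max_lag + 1)
    ])


def select_lags(series, max_lag=12, threshold=None):
    """Keep lags whose |ACF| exceeds the threshold.

    The default threshold 1.96/sqrt(n) is an assumed cutoff,
    not one taken from the paper.
    """
    if threshold is None:
        threshold = 1.96 / np.sqrt(len(series))
    acf = sample_acf(series, max_lag)
    return [lag for lag, r in enumerate(acf, start=1) if abs(r) > threshold]


def build_lagged_inputs(series, lags):
    """Build inputs X (one column per lag, Q[t-lag]) and targets y (Q[t])."""
    x = np.asarray(series, dtype=float)
    max_lag = max(lags)
    X = np.column_stack([x[max_lag - lag: len(x) - lag] for lag in lags])
    y = x[max_lag:]
    return X, y


# Hypothetical monthly discharge: a 12-month cycle plus noise (20 years).
months = np.arange(240)
rng = np.random.default_rng(0)
discharge = 50 + 30 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 2, months.size)

selected = select_lags(discharge)                        # with selection mechanism
all_lags = list(range(1, 13))                            # without: Qt-1 ... Qt-12
X_sel, y_sel = build_lagged_inputs(discharge, selected)
X_all, y_all = build_lagged_inputs(discharge, all_lags)
```

The resulting `X_sel` and `X_all` matrices correspond to the two kinds of input scenario (with and without the selection mechanism) and could be passed to any regression model, such as a support vector regressor, to compare prediction accuracy.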
Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Code availability
The software applications used during the current study are legitimate and were used with the acknowledgement of the corresponding authors.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation and analysis were performed by KC. Data collection and software provision were performed by AFVR and DY. The first draft of the manuscript was written by KC, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
About this article
Cite this article
Christian, K., Roy, A.F.V., Yudianto, D. et al. Application of optimized Support Vector Machine in monthly streamflow forecasting: using Autocorrelation Function for input variables estimation. Sustain. Water Resour. Manag. 7, 29 (2021). https://doi.org/10.1007/s40899-021-00506-y
DOI: https://doi.org/10.1007/s40899-021-00506-y