Abstract
Many previous studies have developed decomposition and ensemble models to improve runoff forecasting performance. However, these decomposition-based models usually introduce large decomposition errors into the modeling process. Moreover, although the variation in runoff time series is driven largely by climate change, most previous studies that considered climate change focused only on rainfall-runoff modeling, with few meteorological factors as input. Therefore, a climate-driven streamflow forecasting (CDSF) framework is proposed to improve runoff forecasting accuracy. The framework is realized using principal component analysis (PCA), long short-term memory (LSTM) and Bayesian optimization (BO), and is referred to as PCA-LSTM-BO. To validate the effectiveness and superiority of the PCA-LSTM-BO method, it was compared with one autoregressive LSTM model and two other CDSF models based on PCA, BO, and either support vector regression (SVR) or gradient boosting regression trees (GBRT), namely PCA-SVR-BO and PCA-GBRT-BO, respectively. A generalization performance index based on the Nash-Sutcliffe efficiency (NSE), called the GI(NSE) value, is proposed to evaluate the generalizability of the model. The results show that (1) the proposed model is significantly better than the other benchmark models, with mean square error (MSE) ≤ 185.782, NSE ≥ 0.819, and GI(NSE) ≤ 0.223 for all the forecasting scenarios; (2) the PCA in the CDSF framework can improve the forecasting capacity and generalizability; (3) the CDSF framework is superior to the autoregressive LSTM models for all the forecasting scenarios; and (4) the GI(NSE) value is effective in selecting the optimal model with better generalizability.
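As an illustration of the two standard evaluation metrics named in the abstract, the following is a minimal sketch of MSE and NSE using their textbook definitions. The function names and the sample flow values are hypothetical and are not taken from the paper. GI(NSE) is not sketched because its formula is not given in this excerpt.

```python
from statistics import mean


def _check(observed, simulated):
    # Both series must be non-empty and aligned day by day.
    if not observed or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")


def mse(observed, simulated):
    """Mean square error between observed and simulated flows."""
    _check(observed, simulated)
    return mean((o - s) ** 2 for o, s in zip(observed, simulated))


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency.

    NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).
    A value of 1 is a perfect fit; 0 means the model is no better
    than always predicting the observed mean.
    """
    _check(observed, simulated)
    obs_mean = mean(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - obs_mean) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - sse / spread


# Hypothetical daily flows, for illustration only.
observed = [10.0, 12.0, 15.0, 11.0, 9.0]
simulated = [11.0, 12.0, 14.0, 10.0, 10.0]

print(f"MSE = {mse(observed, simulated):.3f}")
print(f"NSE = {nse(observed, simulated):.3f}")
```

Lower MSE and higher NSE both indicate a better fit, which is why the abstract reports an upper bound on MSE and a lower bound on NSE.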
Data and Code Availability
Data used in the research work have been acknowledged, and data and code are available on request.
Acknowledgments
The authors sincerely appreciate the data provided by the China Meteorological Data Service Center.
Funding
This work was supported by the Natural Science Basic Research Program of Shaanxi (Grant Nos. 2019JLZ-15 and 2017JQ5076), the National Natural Science Foundation of China (Grant Nos. 51979221 and 51679186), the Research Fund of the State Key Laboratory of Eco-hydraulics in Northwest Arid Region, Xi’an University of Technology (Grant No. 2019KJCXTD-5) and the Special Scientific Research Program of Shaanxi Provincial Education Department (Grant No. 17JK0558).
Author information
Contributions
Conceptualization: Lian YN; Methodology: Lian YN and Zuo GG; Writing-original draft preparation: Lian YN and Wang JM; Writing-review and editing: Luo JG, Wei N and Zuo GG; Funding acquisition: Luo JG.
Ethics declarations
Ethical Approval and Consent to Participate
This article does not contain any studies with human participants or animals performed by any of the authors.
Consent to Publish
All authors have consented to publish this manuscript.
Competing Interests
The authors declare no conflict of interest.
About this article
Cite this article
Lian, Y., Luo, J., Wang, J. et al. Climate-driven Model Based on Long Short-Term Memory and Bayesian Optimization for Multi-day-ahead Daily Streamflow Forecasting. Water Resour Manage 36, 21–37 (2022). https://doi.org/10.1007/s11269-021-03002-2
DOI: https://doi.org/10.1007/s11269-021-03002-2