Abstract
Accurate prediction of daily inflow is critical for effective water resource management, water budgeting, and optimal release discharge from a reservoir. This study applies artificial intelligence (AI) techniques to enhance water management efficiency at the Haditha Dam reservoir. The study area occasionally suffers from severe drought events, which cause significant water shortages and halt the hydroelectric power stations for several months. Four approaches were employed for inflow forecasting: multiple linear regression (MLR), random forest (RF), extreme learning machine (ELM), and regularized extreme learning machine (RELM). The autocorrelation function (ACF) and partial autocorrelation function (PACF) were used to select the best lagged input variables. The results revealed the superiority of the RELM model over the other forecast models. RELM yielded higher prediction accuracy, its predictions were close to the observed values, and it achieved a higher correlation coefficient (R = 0.955). The regularization approach effectively enhanced the prediction capacity and generalization ability of the proposed model. In contrast, the RF model performed poorly compared to the other models because of overfitting. Moreover, the results showed that PACF gave more accurate and realistic predictors than ACF because of its ability to cope with sudden temporal variations in the inflow time series. Overall, the RELM approach provided higher adequacy and tighter confidence in forecasting daily inflow, even with noisy data and under severe climatic conditions.
Data availability
Data are available upon request from the corresponding author.
References
Adamowski J, Sun K (2010) Development of a coupled wavelet transform and neural network method for flow forecasting of non-perennial rivers in semi-arid watersheds. J Hydrol 390(1):85–91. https://doi.org/10.1016/j.jhydrol.2010.06.033
Alizadeh Z, Shourian M, Yaseen ZM (2020) Simulating monthly streamflow using a hybrid feature selection approach integrated with an intelligence model. Hydrol Sci J 65(8):1374–1384. https://doi.org/10.1080/02626667.2020.1755436
AlOmar MK, Hameed MM, Al-Ansari N, AlSaadi MA, Jiang Y-Z (2020a) Data-driven model for the prediction of total dissolved gas: robust artificial intelligence approach. Adv Civ Eng 2020:1–20. https://doi.org/10.1155/2020/6618842
AlOmar MK, Hameed MM, AlSaadi MA (2020b) Multi hours ahead prediction of surface ozone gas concentration: robust artificial intelligence approach. Atmos Pollut Res 11(9):1572–1587. https://doi.org/10.1016/j.apr.2020.06.024
Atiquzzaman M, Kandasamy J (2015) Prediction of hydrological time-series using extreme learning machine. J Hydroinf 18(2):345–353. https://doi.org/10.2166/hydro.2015.020
Belvederesi C, Dominic JA, Hassan QK, Gupta A, Achari G (2020) Predicting river flow using an AI-based sequential adaptive neuro-fuzzy inference system. Water 12(6):1622
Bilhan O, Emiroglu ME, Miller CJ, Ulas M (2018) The evaluation of the effect of nappe breakers on the discharge capacity of trapezoidal labyrinth weirs by ELM and SVR approaches. Flow Meas Instrum 64:71–82. https://doi.org/10.1016/j.flowmeasinst.2018.10.009
Budu K (2014) Comparison of wavelet-based ANN and regression models for reservoir inflow forecasting. J Hydrol Eng 19(7):1385–1400. https://doi.org/10.1061/(ASCE)HE.1943-5584.0000892
Deo RC, Şahin M (2015) Application of the extreme learning machine algorithm for the prediction of monthly Effective Drought Index in eastern Australia. Atmos Res 153:512–525. https://doi.org/10.1016/j.atmosres.2014.10.016
Deo RC, Şahin M (2016) An extreme learning machine model for the simulation of monthly mean streamflow water level in eastern Queensland. Environ Monit Assess 188(2):90. https://doi.org/10.1007/s10661-016-5094-9
Deo RC, Downs N, Parisi AV, Adamowski JF, Quilty JM (2017) Very short-term reactive forecasting of the solar ultraviolet index using an extreme learning machine integrated with the solar zenith angle. Environ Res 155:141–166. https://doi.org/10.1016/j.envres.2017.01.035
Despotovic M, Nedic V, Despotovic D, Cvetanovic S (2015) Review and statistical analysis of different global solar radiation sunshine models. Renew Sustain Energy Rev 52:1869–1880. https://doi.org/10.1016/j.rser.2015.08.035
Diop L, Bodian A, Djaman K, Yaseen ZM, Deo RC, El-shafie A, Brown LC (2018) The influence of climatic inputs on streamflow pattern forecasting: case study of Upper Senegal River. Environ Earth Sci 77(5):182. https://doi.org/10.1007/s12665-018-7376-8
Ebtehaj I, Bonakdari H, Moradi F, Gharabaghi B, Khozani ZS (2018) An integrated framework of extreme learning machines for predicting scour at pile groups in clear water condition. Coast Eng 135:1–15. https://doi.org/10.1016/j.coastaleng.2017.12.012
El-Shafie A, Taha MR, Noureldin A (2007) A neuro-fuzzy model for inflow forecasting of the Nile river at Aswan high dam. Water Resour Manag 21(3):533–556. https://doi.org/10.1007/s11269-006-9027-1
Hadi SJ, Abba SI, Sammen SS, Salih SQ, Al-Ansari N, Yaseen ZM (2019) Non-linear input variable selection approach integrated with non-tuned data intelligence model for streamflow pattern simulation. IEEE Access 7:141533–141548. https://doi.org/10.1109/ACCESS.2019.2943515
Hameed MM, AlOmar MK (2020) Prediction of compressive strength of high-performance concrete: hybrid artificial intelligence technique. In: Khalaf MI, Al-Jumeily D, Lisitsa A (eds) Applied computing to support industry: innovation and technology. Springer International Publishing, Cham, pp 323–335
Hameed MM, AlOmar MK, Baniya WJ, AlSaadi MA (2021a) Incorporation of artificial neural network with principal component analysis and cross-validation technique to predict high-performance concrete compressive strength. Asian J Civ Eng 22(6):1019–1031. https://doi.org/10.1007/s42107-021-00362-3
Hameed MM, AlOmar MK, Baniya WJ, AlSaadi MA (2021b) Prediction of high-strength concrete: high-order response surface methodology modeling approach. Eng Comput. https://doi.org/10.1007/s00366-021-01284-z
Hameed MM, AlOmar MK, Khaleel F, Al-Ansari N (2021c) An extra tree regression model for discharge coefficient prediction: novel, practical applications in the hydraulic sector and future research directions. Math Probl Eng 2021:7001710. https://doi.org/10.1155/2021/7001710
Hameed MM, AlOmar MK, Mohd Razali SF, Kareem Khalaf MA, Baniya WJ, Sharafati A, AlSaadi MA (2021d) Application of artificial intelligence models for evapotranspiration prediction along the southern coast of Turkey. Complexity 2021:8850243. https://doi.org/10.1155/2021/8850243
Huang G-B, Chen L (2007) Convex incremental extreme learning machine. Neurocomputing 70(16):3056–3062. https://doi.org/10.1016/j.neucom.2007.02.009
Huang G-B, Zhu Q-Y, Siew C-K (2006) Extreme learning machine: theory and applications. Neurocomputing 70(1):489–501. https://doi.org/10.1016/j.neucom.2005.12.126
Jiang Z, Li R, Li A, Ji C (2018) Runoff forecast uncertainty considered load adjustment model of cascade hydropower stations and its application. Energy 158:693–708. https://doi.org/10.1016/j.energy.2018.06.083
Jiang Z, Wu W, Qin H, Hu D, Zhang H (2019) Optimization of fuzzy membership function of runoff forecasting error based on the optimal closeness. J Hydrol 570:51–61. https://doi.org/10.1016/j.jhydrol.2019.01.009
Kim J, Kim J, Jang G-J, Lee M (2017) Fast learning method for convolutional neural networks using extreme learning machine and its application to lane detection. Neural Netw 87:109–121. https://doi.org/10.1016/j.neunet.2016.12.002
Kişi Ö (2007) Streamflow forecasting using different artificial neural network algorithms. J Hydrol Eng 12(5):532–539. https://doi.org/10.1061/(ASCE)1084-0699(2007)12:5(532)
Kushwaha NL, Rajput J, Elbeltagi A, Elnaggar AY, Sena DR, Vishwakarma DK, Mani I, Hussein EE (2021) Data intelligence model and meta-heuristic algorithms-based pan evaporation modelling in two different agro-climatic zones: a case study from Northern India. Atmosphere 12(12):1654
Li M-F, Tang X-P, Wu W, Liu H-B (2013) General models for estimating daily global solar radiation for different solar radiation zones in mainland China. Energy Convers Manag 70:139–148. https://doi.org/10.1016/j.enconman.2013.03.004
Lima LMM, Popova E, Damien P (2014) Modeling and forecasting of Brazilian reservoir inflows via dynamic linear models. Int J Forecast 30(3):464–476. https://doi.org/10.1016/j.ijforecast.2013.12.009
Moriasi DN, Arnold JG, Van Liew MW, Bingner RL, Harmel RD, Veith TL (2007) Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Trans ASABE 50(3):885–900. https://doi.org/10.13031/2013.23153
Nacar S, Hınıs MA, Kankal M (2018) Forecasting daily streamflow discharges using various neural network models and training algorithms. KSCE J Civ Eng 22(9):3676–3685. https://doi.org/10.1007/s12205-017-1933-7
Ochoa-Tocachi BF, Buytaert W, De Bièvre B (2016) Regionalization of land-use impacts on streamflow using a network of paired catchments. Water Resour Res 52(9):6710–6729. https://doi.org/10.1002/2016WR018596
Othman NY (2013) Developing expert system for operating Haditha Dam. Al-Qadisiyah J Eng Sci 6(1):1–25
Parisouj P, Mohebzadeh H, Lee T (2020) Employing machine learning algorithms for streamflow prediction: a case study of four river basins with different climatic zones in the United States. Water Resour Manag 34(13):4113–4131. https://doi.org/10.1007/s11269-020-02659-5
Prasad R, Deo RC, Li Y, Maraseni T (2017) Input selection and performance optimization of ANN-based streamflow forecasts in the drought-prone Murray Darling Basin region using IIS and MODWT algorithm. Atmos Res 197:42–63. https://doi.org/10.1016/j.atmosres.2017.06.014
Sahay RR, Srivastava A (2014) Predicting monsoon floods in rivers embedding wavelet transform, genetic algorithm and neural network. Water Resour Manag 28(2):301–317. https://doi.org/10.1007/s11269-013-0446-5
Seo Y, Kim S, Kisi O, Singh VP (2015) Daily water level forecasting using wavelet decomposition and artificial intelligence techniques. J Hydrol 520:224–243. https://doi.org/10.1016/j.jhydrol.2014.11.050
Shiri J, Shamshirband S, Kisi O, Karimi S, Bateni SM, Hosseini Nezhad SH, Hashemi A (2016) Prediction of water-level in the Urmia lake using the extreme learning machine approach. Water Resour Manag 30(14):5217–5229. https://doi.org/10.1007/s11269-016-1480-x
Teutschbein C, Grabs T, Laudon H, Karlsen RH, Bishop K (2018) Simulating streamflow in ungauged basins under a changing climate: the importance of landscape characteristics. J Hydrol 561:160–178. https://doi.org/10.1016/j.jhydrol.2018.03.060
Tian Z, Li S, Wang Y (2020) A prediction approach using ensemble empirical mode decomposition-permutation entropy and regularized extreme learning machine for short-term wind speed. Wind Energy 23(2):177–206. https://doi.org/10.1002/we.2422
Xu W, Zhang C, Peng Y, Fu G, Zhou H (2014) A two stage Bayesian stochastic optimization model for cascaded hydropower systems considering varying uncertainty of flow forecasts. Water Resour Res 50(12):9267–9286. https://doi.org/10.1002/2013WR015181
Yaseen ZM, Jaafar O, Deo RC, Kisi O, Adamowski J, Quilty J, El-Shafie A (2016) Streamflow forecasting using extreme learning machines: a case study in a semi-arid region in Iraq. J Hydrol 542:603–614. https://doi.org/10.1016/j.jhydrol.2016.09.035
Yaseen ZM, Naganna SR, Sa’adi Z, Samui P, Ghorbani MA, Salih SQ, Shahid S (2020) Hourly river flow forecasting: application of emotional neural network versus multiple machine learning paradigms. Water Resour Manag 34(3):1075–1091. https://doi.org/10.1007/s11269-020-02484-w
Zhang K, Luo M (2015) Outlier-robust extreme learning machine for regression problems. Neurocomputing 151:1519–1527. https://doi.org/10.1016/j.neucom.2014.09.022
Zhang X, Wang H, Peng A, Wang W, Li B, Huang X (2020) Quantifying the uncertainties in data-driven models for reservoir inflow prediction. Water Resour Manag 34(4):1479–1493. https://doi.org/10.1007/s11269-020-02514-7
Acknowledgements
The authors would like to thank Al-Maarif University College for supporting this research. The authors also thank the director of the Haditha Dam Project for providing the necessary information to fulfill this research.
Funding
Al-Maarif University College funded this study.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Consent to participate
The authors consent to participate in this research study.
Consent to publish
The authors consent to publish the current research in Stochastic Environmental Research and Risk Assessment journal.
Ethical approval
We acknowledge that the present research has been conducted ethically, and the final form of this research has been agreed by all authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
1.1 Multiple linear regression
At a given time step, the reservoir inflow is correlated with the inflows that precede and follow it, and in a natural system the inputs and outputs may interact with one another. Multiple linear regression (MLR) is a linear equation-fitting technique for modeling the relationship between a predicted variable and two or more predictors. As a multivariate statistical method, MLR can be used to identify the relationship between the dependent and independent variables, as described in Eq. (23).
where Y represents the response variable; \({X}_{1},{X}_{2},\dots ,{X}_{n}\) are the independent variables; and \({\alpha }_{o}\), \({\alpha }_{1},{\alpha }_{2}\dots {\alpha }_{j}\) are the regression coefficients, which can be obtained from Eq. (24):
where e denotes the error between the estimated and actual inflow values, and \({y}_{i}\) denotes the actual inflow value.
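The regression described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the helper names (`make_lagged`, `fit_mlr`, `predict_mlr`), the choice of lags 1 and 2, and the synthetic inflow series are all assumptions made for the example. The coefficients are estimated by ordinary least squares, which minimizes the sum of squared errors between estimated and actual values.

```python
import numpy as np

def make_lagged(series, lags):
    """Build a matrix of lagged values and the matching targets."""
    series = np.asarray(series, dtype=float)
    max_lag = max(lags)
    # The column for lag k holds Q(t-k) for t = max_lag .. len(series)-1
    X = np.column_stack([series[max_lag - k:len(series) - k] for k in lags])
    y = series[max_lag:]
    return X, y

def fit_mlr(X, y):
    """Least-squares estimate of [alpha_0, alpha_1, ..., alpha_n]."""
    A = np.column_stack([np.ones(len(X)), X])  # column of ones gives the intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]

# Hypothetical series that follows Q(t) = 5 + 0.6 Q(t-1) + 0.3 Q(t-2) exactly
q = [10.0, 30.0]
for _ in range(50):
    q.append(5.0 + 0.6 * q[-1] + 0.3 * q[-2])

X, y = make_lagged(q, lags=[1, 2])
coef = fit_mlr(X, y)
residual = y - predict_mlr(coef, X)
```

Because the synthetic series is noise-free, the fitted coefficients should recover the generating values and the residuals should be close to zero; with real inflow records the residual corresponds to the error term e.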
1.2 Random forest
Random forest (RF) is a regression algorithm suited to high-dimensional data. The approach is tree-based: the forest is made up of several regression trees bundled together, and each tree is grown using randomly selected variables (Hameed et al. 2021c, d). Each tree uses a random subset of the variables to determine its prediction. The learning process has two essential parameters: the number of trees (ntrees) and the number of variables considered at each split (mtry). The final prediction is the average of the individual tree outputs once the trees have been combined into the ensemble (the bagging process). The bias of the bagged trees equals that of a single tree, while the variance is reduced as the correlation between the trees decreases.
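The roles of ntrees and mtry can be illustrated with a deliberately simplified sketch. The assumptions here are significant: each "tree" is a single-split stump rather than a fully grown regression tree, so the random variable subset is drawn once per tree instead of at every split, and the data are synthetic. The function names are hypothetical and do not correspond to any library.

```python
import numpy as np

def fit_stump(X, y, feature_ids):
    """Best single-split regression tree over the candidate features."""
    best = None
    for j in feature_ids:
        values = np.unique(X[:, j])
        # Candidate thresholds are midpoints between consecutive distinct values
        for thr in (values[:-1] + values[1:]) / 2.0:
            left = X[:, j] <= thr
            left_mean, right_mean = y[left].mean(), y[~left].mean()
            sse = ((y[left] - left_mean) ** 2).sum() + ((y[~left] - right_mean) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, thr, left_mean, right_mean)
    if best is None:  # no valid split: fall back to a constant prediction
        return (0, np.inf, y.mean(), y.mean())
    return best[1:]

def predict_stump(stump, X):
    j, thr, left_mean, right_mean = stump
    return np.where(X[:, j] <= thr, left_mean, right_mean)

def fit_forest(X, y, ntrees, mtry, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    forest = []
    for _ in range(ntrees):
        rows = rng.integers(0, n, size=n)               # bootstrap sample (bagging)
        cols = rng.choice(p, size=mtry, replace=False)  # random variable subset
        forest.append(fit_stump(X[rows], y[rows], cols))
    return forest

def predict_forest(forest, X):
    # The ensemble prediction is the average over all trees
    return np.mean([predict_stump(s, X) for s in forest], axis=0)

# Synthetic data: only the first variable drives the response
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = np.where(X[:, 0] > 0.5, 10.0, 2.0) + 0.1 * rng.standard_normal(200)

forest = fit_forest(X, y, ntrees=50, mtry=2)
pred = predict_forest(forest, X)
```

Trees that do not draw the informative variable split on noise and contribute little, which is why the averaged prediction is smoother than any single tree. Smaller mtry values make the trees less correlated with one another, at the cost of weaker individual trees.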
For regression, an RF is formed by growing trees that depend on a random vector \((\ominus)\), such that the tree predictor \(h(X,\ominus)\) takes numerical values. The mean squared generalization error of any numerical predictor is expressed in Eq. (25).
The RF predictor is formed by averaging the individual tree predictors over j. In this context, the following theorems apply:
Theorem 1 As the number of trees in the forest increases, the error converges as expressed in Eq. (26).
The right-hand side of Eq. (26) is the generalization error of the forest. Similarly, the average generalization error of a single tree is given by Eq. (27).
Theorem 2 If we assume \({E}_{Y}={E}_{X}h(X,\ominus)\) for every \(\ominus\), then Eq. (28) holds,
where \(\overline{p }\) represents the weighted correlation.
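In the standard formulation of regression forests, the quantities named in Eqs. (25)–(28) are usually written as below. This is a sketch under the assumption that the appendix follows that standard formulation; the labels \(PE^{*}(\mathrm{tree})\) and \(PE^{*}(\mathrm{forest})\) and the averaging operator \(\mathrm{av}_{j}\) are notation introduced here for illustration and may differ from the authors' own.

```latex
% Eq. (25): mean squared generalization error of a numerical predictor h(X)
E_{X,Y}\left(Y - h(X)\right)^{2}

% Eq. (26), Theorem 1: limit as the number of trees grows
E_{X,Y}\left(Y - \mathrm{av}_{j}\, h(X,\ominus_{j})\right)^{2}
  \;\longrightarrow\;
E_{X,Y}\left(Y - E_{\ominus}\, h(X,\ominus)\right)^{2}

% Eq. (27): average generalization error of a single tree
PE^{*}(\mathrm{tree}) = E_{\ominus}\, E_{X,Y}\left(Y - h(X,\ominus)\right)^{2}

% Eq. (28), Theorem 2: bound on the forest error
PE^{*}(\mathrm{forest}) \le \overline{p}\; PE^{*}(\mathrm{tree})
```

Read this way, Theorem 2 states that the forest error is bounded by the average tree error scaled by the weighted correlation between tree residuals, so decorrelating the trees tightens the bound.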
1.3 Extreme learning machine
Extreme learning machine (ELM) is a modern learning algorithm with a simple layout, typically made up of three layers: input, hidden, and output. The hidden layer, which contains several nonlinear hidden nodes, is the most important part of the ELM scheme. A defining feature of ELM is that the model's internal parameters, such as those of the hidden neurons, do not require tuning. ELM is also regarded as an improved version of the conventional ANN, and it can solve regression problems in less time (Deo et al. 2017; Deo and Şahin 2015; Kim et al. 2017). It is very fast because the weights connecting the input layer to the hidden layer and the bias values are randomly selected, while the output weights are calculated analytically using the Moore–Penrose generalized inverse (Huang et al. 2006). As a result, it can perform better than other forecasting models based on the ANN methodology (Atiquzzaman and Kandasamy 2015; Ebtehaj et al. 2018; Shiri et al. 2016).
ELM is often regarded as an effective alternative to conventional modeling techniques such as ANN, which are often hindered by overfitting, slow convergence, local minima, weak generalization, long run times, and iterative tuning. In the basic structure of the ELM, the hidden neurons are randomly assigned in such a way that the ELM can reach a global minimum solution, resulting in universal approximation capabilities (Huang and Chen 2007). Mathematically, an ELM model is given by Eq. (29).
where L is the number of hidden nodes; \({g}_{k}\left({\alpha }_{k}.{x}_{k}+{\beta }_{k}\right)\) is the hidden layer output function; \({\alpha }_{k}\) and \({\beta }_{k}\) are the parameters of the hidden nodes, which are randomly initialized; \({B}_{k}\) is the weight linking the kth hidden node with the output node; and \({z}_{t}\) is the ELM target.
The number of hidden nodes can be established by trial and error in the range of 1–25. The current research used the hyperbolic tangent sigmoid transfer function to activate the hidden nodes, while the predicted values of the ELM model are obtained from a linear activation function at the output layer (Deo and Şahin 2016). The hidden node parameters can be assigned randomly, since the process requires no prior information about the training data, and the hidden layer neurons need not be adjusted according to the sum of squared errors. Accordingly, for any randomly assigned sequence \(\{({\alpha }_{k},{\beta }_{k})\}_{k=1}^{L}\) and any continuous target function \(f(x)\), Eq. (30) is used to approximate a set of N training samples as follows.
A significant benefit of a non-tuned ELM model is that the hidden weight values are assigned randomly. This allows the model to reach zero training error and the network output weights (B) to be determined analytically for the training data set. It is worth noting that the internal transfer function parameters (\({\alpha }_{k}\) and \({\beta }_{k}\)) are drawn from a probability distribution. Ultimately, \(Y=GB\) is equivalent to Eq. (30), and it can be expressed in linear form as given by Eqs. (31) and (32) (Huang et al. 2006).
and
where G is the output matrix of the hidden layer and T denotes the matrix transpose. Subsequently, Eq. (31) can be summarized as shown in Eq. (33).
The minimum-norm least-squares solution of Eq. (32) can be calculated as shown in Eq. (33):
Here \({\check{H}}\) is the Moore–Penrose generalized inverse of the hidden layer output matrix, which is used to compute the output weights of the ELM model. The Singular Value Decomposition (SVD) technique is commonly used to compute this inverse efficiently in the ELM learning process.
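The training procedure described in this section can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the paper: hidden parameters are drawn uniformly from [-1, 1], the data are synthetic stand-ins for lagged inflow predictors, and the number of hidden nodes is chosen by validation RMSE on a simple hold-out split. The function names are hypothetical. `np.linalg.pinv` computes the Moore–Penrose inverse through an SVD.

```python
import numpy as np

def fit_elm(X, z, n_hidden, seed=0):
    """Train a single-hidden-layer ELM; only the output weights B are solved."""
    rng = np.random.default_rng(seed)
    alpha = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # random input weights
    beta = rng.uniform(-1.0, 1.0, size=n_hidden)                 # random hidden biases
    G = np.tanh(X @ alpha + beta)   # hidden layer output matrix
    B = np.linalg.pinv(G) @ z       # Moore-Penrose solution of G B = z
    return alpha, beta, B

def predict_elm(model, X):
    alpha, beta, B = model
    return np.tanh(X @ alpha + beta) @ B  # linear output layer

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic data standing in for lagged inflow predictors (assumption)
rng = np.random.default_rng(42)
X = rng.uniform(-1.0, 1.0, size=(300, 2))
z = np.sin(2.0 * X[:, 0]) + 0.5 * X[:, 1] ** 2
X_train, z_train = X[:200], z[:200]
X_valid, z_valid = X[200:], z[200:]

# Trial and error over 1-25 hidden nodes, scored on the validation split
scores = {}
for n_hidden in range(1, 26):
    model = fit_elm(X_train, z_train, n_hidden)
    scores[n_hidden] = rmse(predict_elm(model, X_valid), z_valid)

best_n = min(scores, key=scores.get)
best_model = fit_elm(X_train, z_train, best_n)
```

No iterative weight updates appear anywhere in the sketch: the hidden parameters stay at their random values and a single linear solve yields the output weights, which is the source of the speed advantage described above.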
Cite this article
Hameed, M.M., AlOmar, M.K., Al-Saadi, A.A.A. et al. Inflow forecasting using regularized extreme learning machine: Haditha reservoir chosen as case study. Stoch Environ Res Risk Assess 36, 4201–4221 (2022). https://doi.org/10.1007/s00477-022-02254-7
DOI: https://doi.org/10.1007/s00477-022-02254-7