Abstract
Accurate rainfall forecasting is one of the most important and challenging tasks in hydrological modeling, with significant benefits for many sectors of the economy. This study offers new insight into improving the accuracy of a new generation of stochastic monthly rainfall forecast models by examining four preprocessing scenarios: (1) time series modeling without preprocessing, the common practice in stochastic modeling, as the base case; (2) preprocessing using differencing, spectral analysis, and seasonal and non-seasonal standardization techniques; (3) two-step preprocessing in which the data are first made stationary and then normalized using 8 different transformations; and (4) two-step preprocessing in the reverse order of scenario 3, in which the original time series is first normalized and then transformed to be stationary. The parameters of the stochastic model are determined from the autocorrelation function and partial autocorrelation function diagrams. The results indicate that the proposed normalization and transformation preprocessing techniques can lead to major improvements in the prediction accuracy of the new monthly rainfall forecast model.
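As a rough illustration of the two-step idea in scenario 4 (normalize first, then make the series stationary), the following minimal Python sketch may help. It is not the authors' implementation. The rainfall series is synthetic, the function names are hypothetical, and the Box-Cox transformation with a fixed exponent of 0.5 and a shift of 1 is only an assumed example of a normalizing transformation, since the abstract does not name the eight transformations that were tested. Seasonal differencing at lag 12 stands in for the stationarization step, and the sample autocorrelation function is the quantity one would plot to help choose model orders.

```python
import math
import random


def box_cox(series, lam, shift=1.0):
    """Box-Cox transform of (x + shift); the shift keeps zero-rainfall months positive."""
    if lam == 0:
        return [math.log(x + shift) for x in series]
    return [((x + shift) ** lam - 1.0) / lam for x in series]


def seasonal_difference(series, period=12):
    """Subtract the value observed one seasonal period earlier."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


def acf(series, max_lag):
    """Sample autocorrelation function for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((x - mean) ** 2 for x in series)
    return [
        sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


# Synthetic monthly rainfall (assumption): 20 years with an annual cycle plus noise.
random.seed(0)
rain = [
    max(0.0, 150.0 + 100.0 * math.sin(2.0 * math.pi * t / 12.0) + random.gauss(0.0, 30.0))
    for t in range(240)
]

# Scenario-4-style ordering: normalize first, then remove the seasonal component.
normalized = box_cox(rain, lam=0.5)
stationary = seasonal_difference(normalized, period=12)

raw_acf = acf(rain, 24)
pre_acf = acf(stationary, 24)
print("lag-12 autocorrelation, raw series:          ", round(raw_acf[12], 3))
print("lag-12 autocorrelation, preprocessed series: ", round(pre_acf[12], 3))
```

Swapping the two preprocessing calls would give a scenario-3-style ordering, although a differenced series contains negative values, so a transformation that accepts them would be needed in place of Box-Cox.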
Acknowledgments
The authors would like to express their gratitude and appreciation to the Department of Irrigation and Drainage (DID), Malaysia, for providing the rainfall dataset for the case study and for their admirable cooperation.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Additional information
Editorial responsibility: Zhenyao Shen.
About this article
Cite this article
Ebtehaj, I., Bonakdari, H., Zeynoddin, M. et al. Evaluation of preprocessing techniques for improving the accuracy of stochastic rainfall forecast models. Int. J. Environ. Sci. Technol. 17, 505–524 (2020). https://doi.org/10.1007/s13762-019-02361-z
DOI: https://doi.org/10.1007/s13762-019-02361-z