Evaluation of preprocessing techniques for improving the accuracy of stochastic rainfall forecast models

  • I. Ebtehaj
  • H. Bonakdari (corresponding author)
  • M. Zeynoddin
  • B. Gharabaghi
  • A. Azari
Original Paper

Abstract

Accurate rainfall forecasting is one of the most important and challenging hydrological modeling tasks, with significant benefits for many sectors of the economy. This study presents novel insight into how to improve the accuracy of a new generation of stochastic monthly rainfall forecast models by examining four preprocessing scenarios: (1) time series modeling without preprocessing, the common practice in stochastic modeling, used as the base case; (2) preprocessing using differencing, spectral analysis, and seasonal and non-seasonal standardization techniques; (3) two-step preprocessing in which the data are first made stationary and then normalized using eight different transformations; and (4) two-step preprocessing in the reverse order of scenario 3, in which the main time series is first normalized and then transformed to be stationary. The parameters of the stochastic model are determined from the autocorrelation function and partial autocorrelation function diagrams. The results indicate that the proposed normalization and transformation preprocessing techniques can substantially improve the prediction accuracy of the new monthly rainfall forecast model.
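The preprocessing steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic rainfall series, the Box-Cox parameter (0.5), and the shift of 1.0 are all assumptions chosen for illustration, and only the Box-Cox transform stands in for the eight transformations the study compares.

```python
import math
from statistics import mean, pstdev


def seasonal_standardize(series, period=12):
    """Subtract each calendar month's mean and divide by its standard deviation."""
    out = []
    for i, x in enumerate(series):
        same_month = series[i % period::period]  # all values sharing this month
        mu, sigma = mean(same_month), pstdev(same_month)
        out.append((x - mu) / sigma if sigma > 0 else 0.0)
    return out


def difference(series, lag=1):
    """Lag-k differencing; lag=12 removes a fixed annual cycle in monthly data."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def box_cox(series, lam, shift=1.0):
    """Box-Cox transform; the shift (an assumption) keeps dry months positive."""
    if lam == 0:
        return [math.log(x + shift) for x in series]
    return [((x + shift) ** lam - 1.0) / lam for x in series]


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (series must not be constant)."""
    mu = mean(series)
    n = len(series)
    c0 = sum((x - mu) ** 2 for x in series)
    return [
        sum((series[t] - mu) * (series[t + k] - mu) for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


# Synthetic monthly rainfall (hypothetical data): annual cycle plus a small
# deterministic perturbation, ten years long.
rain = [100 + 80 * math.sin(2 * math.pi * t / 12) + 10 * ((t * 7) % 5)
        for t in range(120)]

# Scenario 2 style: seasonal standardization only.
standardized = seasonal_standardize(rain)

# Scenario 4 style: normalize first, then make the series stationary.
normalized_then_stationary = difference(box_cox(rain, lam=0.5), lag=12)

# The ACF of the preprocessed series is what one would inspect
# (together with the PACF) to choose the stochastic model orders.
print([round(r, 3) for r in acf(normalized_then_stationary, 12)])
```

Scenario 3 would apply the same building blocks in the opposite order. Because a stationarized series can contain negative values, a transform defined for negative inputs (or a suitable shift) would be needed there.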

Keywords

  • Linear modeling
  • Normality transforms
  • Seasonal auto-regressive integrated moving average
  • Spectral analysis
  • Standardization

Notes

Acknowledgments

The authors would like to express their gratitude and appreciation to the Department of Irrigation and Drainage (DID), Malaysia, for providing the rainfall dataset for the case study and for their admirable cooperation.

Compliance with ethical standards

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Supplementary material

Supplementary material 1: 13762_2019_2361_MOESM1_ESM.doc (DOC, 1205 kB)

Copyright information

© Islamic Azad University (IAU) 2019

Authors and Affiliations

  1. Department of Civil Engineering, Razi University, Kermanshah, Iran
  2. School of Engineering, University of Guelph, Guelph, Canada
  3. Department of Water Engineering, Razi University, Kermanshah, Iran
