Water Resources Management, Volume 32, Issue 15, pp 5207–5239

Univariate Time Series Forecasting of Temperature and Precipitation with a Focus on Machine Learning Algorithms: a Multiple-Case Study from Greece

  • Georgia Papacharalampous
  • Hristos Tyralis
  • Demetris Koutsoyiannis
Article

Abstract

We provide contingent empirical evidence on the solutions to three problems associated with univariate time series forecasting using machine learning (ML) algorithms by conducting an extensive multiple-case study. These problems are: (a) lagged variable selection, (b) hyperparameter handling, and (c) comparison between ML and classical algorithms. The multiple-case study is composed of 50 single-case studies, which use time series of mean monthly temperature and total monthly precipitation observed in Greece. We focus on two ML algorithms, i.e. neural networks and support vector machines, and also include four classical algorithms and a naïve benchmark in the comparisons. We apply a fixed methodology to each individual case and subsequently perform a cross-case synthesis to facilitate the detection of systematic patterns. We fit the models to the deseasonalized time series and compare the one- and multi-step ahead forecasting performance of the algorithms. The one-step ahead forecasting performance is assessed using the absolute error of the forecast of the last monthly observation. To quantify the multi-step ahead forecasting performance, we compute five metrics on the test set (the last year's monthly observations), i.e. the root mean square error, the Nash-Sutcliffe efficiency, the ratio of standard deviations, the coefficient of correlation and the index of agreement. The evidence derived from the experiments can be summarized as follows: (a) the results mostly favour using less recent lagged variables, (b) hyperparameter optimization does not necessarily lead to better forecasts, (c) the ML and classical algorithms seem to be equally competitive.
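
For readers unfamiliar with the five test-set metrics named in the abstract, the following minimal Python sketch computes them from a pair of observed and forecast series. It uses the commonly cited textbook definitions (including Willmott's form of the index of agreement), which is an assumption: the abstract does not state the exact variants used in the study. The function name `forecast_metrics` and the sample values are hypothetical and serve only as illustration.

```python
import math


def forecast_metrics(observed, forecast):
    """Return RMSE, NSE, rSD, Pearson r and index of agreement d.

    Textbook definitions are assumed; they are not taken from the paper.
    """
    if len(observed) != len(forecast) or len(observed) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_f = sum(forecast) / n
    pairs = list(zip(observed, forecast))

    sq_err = sum((f - o) ** 2 for o, f in pairs)       # sum of squared errors
    ss_o = sum((o - mean_o) ** 2 for o in observed)    # spread of observations
    ss_f = sum((f - mean_f) ** 2 for f in forecast)    # spread of forecasts
    if ss_o == 0:
        raise ValueError("observed series must not be constant")
    cov = sum((o - mean_o) * (f - mean_f) for o, f in pairs)
    # Denominator of the index of agreement ("potential error").
    potential = sum((abs(f - mean_o) + abs(o - mean_o)) ** 2 for o, f in pairs)

    return {
        "rmse": math.sqrt(sq_err / n),                 # root mean square error
        "nse": 1.0 - sq_err / ss_o,                    # Nash-Sutcliffe efficiency
        "rsd": math.sqrt(ss_f / ss_o),                 # ratio of standard deviations
        # Correlation is undefined for a constant forecast (e.g. a flat benchmark).
        "r": cov / math.sqrt(ss_o * ss_f) if ss_f > 0 else float("nan"),
        "d": 1.0 - sq_err / potential,                 # index of agreement
    }


if __name__ == "__main__":
    # Hypothetical 12-month test set (values invented for illustration only).
    obs = [9.8, 10.4, 12.1, 15.9, 20.3, 24.8, 27.5, 27.1, 23.6, 18.9, 14.2, 11.0]
    fc = [10.1, 10.9, 12.8, 15.2, 19.7, 25.3, 26.9, 27.8, 22.9, 19.4, 13.6, 11.5]
    for name, value in forecast_metrics(obs, fc).items():
        print(f"{name}: {value:.3f}")
```

A perfect forecast gives an RMSE of 0 and a value of 1 for the other four metrics, so RMSE is read as "lower is better" while the remaining metrics are read as "closer to 1 is better".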

Keywords

Neural networks, Support vector machines, Hyperparameter optimization, Lagged variable selection, Multi-step ahead forecasting, One-step ahead forecasting

Notes

Acknowledgements

A previous, shorter version of the paper was presented at the 10th World Congress of EWRA “Panta Rei”, Athens, Greece, 5–9 July 2017, under the title “Forecasting of geophysical processes using stochastic and machine learning algorithms” (Papacharalampous et al. 2017b). We thank the Scientific and Organizing Committees for selecting this research. We also thank the Guest Editor and two anonymous reviewers of Water Resources Management for the time they have devoted to our work.

References

  1. Achen CH, Snidal D (1989) Rational deterrence theory and comparative case studies. World Polit 41(2):143–169.  https://doi.org/10.2307/2010405
  2. Atiya AF, El-Shoura SM, Shaheen SI, El-Sherif MS (1999) A comparison between neural-network forecasting techniques-case study: river flow forecasting. IEEE Trans Neural Netw 10(2):402–409.  https://doi.org/10.1109/72.750569
  3. Ballini R, Soares S, Andrade MG (2001) Multi-step-ahead monthly streamflow forecasting by a neurofuzzy network model. IFSA World Congress and 20th NAFIPS International Conference, p 992–997.  https://doi.org/10.1109/NAFIPS.2001.944740
  4. Baxter P, Jack S (2008) Qualitative case study methodology: study design and implementation for novice researchers. Qual Rep 13(4):544–559
  5. Belayneh A, Adamowski J, Khalil B, Ozga-Zielinski B (2014) Long-term SPI drought forecasting in the Awash River basin in Ethiopia using wavelet neural network and wavelet support vector regression models. J Hydrol 508:418–429.  https://doi.org/10.1016/j.jhydrol.2013.10.052
  6. Breiman L (2001) Random forests. Mach Learn 45(1):5–32.  https://doi.org/10.1023/A:1010933404324
  7. Brownrigg R, Minka TP, Deckmyn A (2017) maps: draw geographical maps. R package version 3.2.0. https://CRAN.R-project.org/package=maps
  8. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297.  https://doi.org/10.1007/BF00994018
  9. Cortez P (2010) Data mining with neural networks and support vector machines using the R/rminer tool. In: Perner P (ed) Advances in data mining. Applications and theoretical aspects. Springer, Berlin Heidelberg, pp 572–583.  https://doi.org/10.1007/978-3-642-14400-4_44
  10. Cortez P (2016) rminer: Data Mining Classification and Regression Methods. R package version 1.4.2. https://CRAN.R-project.org/package=rminer
  11. Dooley LM (2002) Case study research and theory building. Adv Dev Hum Resour 4(3):335–354.  https://doi.org/10.1177/1523422302043007
  12. El-Shafie A, Taha MR, Noureldin A (2007) A neuro-fuzzy model for inflow forecasting of the Nile river at Aswan high dam. Water Resour Manag 21(3):533–556.  https://doi.org/10.1007/s11269-006-9027-1
  13. Fraley C, Leisch F, Maechler M, Reisen V, Lemonte A (2012) fracdiff: Fractionally differenced ARIMA aka ARFIMA(p,d,q) models. R package version 1.4–2. https://CRAN.R-project.org/package=fracdiff
  14. Guo J, Zhou J, Qin H, Zou Q, Li Q (2011) Monthly streamflow forecasting based on improved support vector machine model. Expert Syst Appl 38(10):13073–13081.  https://doi.org/10.1016/j.eswa.2011.04.114
  15. Hong WC (2008) Rainfall forecasting by technological machine learning models. Appl Math Comput 200(1):41–57.  https://doi.org/10.1016/j.amc.2007.10.046
  16. Hung NQ, Babel MS, Weesakul S, Tripathi NK (2009) An artificial neural network model for rainfall forecasting in Bangkok, Thailand. Hydrol Earth Syst Sci 13:1413–1425.  https://doi.org/10.5194/hess-13-1413-2009
  17. Hyndman RJ, Khandakar Y (2008) Automatic time series forecasting: the forecast package for R. J Stat Softw 27(3):1–22.  https://doi.org/10.18637/jss.v027.i03
  18. Hyndman RJ, O'Hara-Wild M, Bergmeir C, Razbash S, Wang E (2017) forecast: Forecasting Functions for Time Series and Linear Models. R package version 8.2. https://CRAN.R-project.org/package=forecast
  19. Karatzoglou A, Smola A, Hornik K, Zeileis A (2004) kernlab - an S4 package for kernel methods in R. J Stat Softw 11(9):1–20.  https://doi.org/10.18637/jss.v011.i09
  20. Koutsoyiannis D, Yao H, Georgakakos A (2008) Medium-range flow prediction for the Nile: a comparison of stochastic and deterministic methods. Hydrol Sci J 53(1):142–164.  https://doi.org/10.1623/hysj.53.1.142
  21. Krause P, Boyle DP, Bäse F (2005) Comparison of different efficiency criteria for hydrological model assessment. Adv Geosci 5:89–97
  22. Kumar DN, Raju KS, Sathish T (2004) River flow forecasting using recurrent neural networks. Water Resour Manag 18(2):143–161.  https://doi.org/10.1023/B:WARM.0000024727.94701.12
  23. Larsson R (1993) Case survey methodology: quantitative analysis of patterns across case studies. Acad Manag J 36(6):1515–1546.  https://doi.org/10.2307/256820
  24. Lawrimore JH, Menne MJ, Gleason BE, Williams CN, Wuertz DB, Vose RS, Rennie J (2011) An overview of the Global Historical Climatology Network monthly mean temperature data set, version 3. J Geophys Res-Atmos 116(D19121).  https://doi.org/10.1029/2011JD016187
  25. Maier HR, Dandy GC (2000) Neural networks for the prediction and forecasting of water resources variables: a review of modelling issues and applications. Environ Model Softw 15(1):101–124.  https://doi.org/10.1016/S1364-8152(99)00007-9
  26. Moustris KP, Larissi IK, Nastos PT, Paliatsos AG (2011) Precipitation forecast using artificial neural networks in specific regions of Greece. Water Resour Manag 25(8):1979–1993.  https://doi.org/10.1007/s11269-011-9790-5
  27. Nash JE, Sutcliffe JV (1970) River flow forecasting through conceptual models part I—A discussion of principles. J Hydrol 10(3):282–290.  https://doi.org/10.1016/0022-1694(70)90255-6
  28. Nayak PC, Sudheer KP, Rangan DM, Ramasastri KS (2004) A neuro-fuzzy computing technique for modeling hydrological time series. J Hydrol 291(1–2):52–66.  https://doi.org/10.1016/j.jhydrol.2003.12.010
  29. Ouyang Q, Lu W (2017) Monthly rainfall forecasting using echo state networks coupled with data preprocessing methods. Water Resour Manag 32(2):659–674.  https://doi.org/10.1007/s11269-017-1832-1
  30. Papacharalampous GA, Tyralis H, Koutsoyiannis D (2017a) Comparison of stochastic and machine learning methods for multi-step ahead forecasting of hydrological processes. Preprints 2017100133.  https://doi.org/10.20944/preprints201710.0133.v1
  31. Papacharalampous GA, Tyralis H, Koutsoyiannis D (2017b) Forecasting of geophysical processes using stochastic and machine learning algorithms. Eur Water 59:161–168
  32. Peterson TC, Vose RS (1997) An overview of the global historical climatology network temperature database. Bull Am Meteorol Soc 78(12):2837–2849.  https://doi.org/10.1175/1520-0477(1997)078<2837:AOOTGH>2.0.CO;2
  33. R Core Team (2017) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. https://www.R-project.org/
  34. Raghavendra NS, Deka PC (2014) Support vector machine applications in the field of hydrology: a review. Appl Soft Comput 19:372–386.  https://doi.org/10.1016/j.asoc.2014.02.002
  35. Sivapragasam C, Liong SY, Pasha MFK (2001) Rainfall and runoff forecasting with SSA-SVM approach. J Hydroinf 3(3):141–152
  36. Taieb SB, Bontempi G, Atiya AF, Sorjamaa A (2012) A review and comparison of strategies for multi-step ahead time series forecasting based on the NN5 forecasting competition. Expert Syst Appl 39(8):7067–7083.  https://doi.org/10.1016/j.eswa.2012.01.039
  37. Tongal H, Berndtsson R (2017) Impact of complexity on daily and multi-step forecasting of streamflow with chaotic, stochastic, and black-box models. Stoch Env Res Risk A 31(3):661–682.  https://doi.org/10.1007/s00477-016-1236-4
  38. Tyralis H (2016) HKprocess: Hurst-Kolmogorov Process. R package version 0.0–2. https://CRAN.R-project.org/package=HKprocess
  39. Tyralis H, Koutsoyiannis D (2011) Simultaneous estimation of the parameters of the Hurst–Kolmogorov stochastic process. Stoch Env Res Risk A 25(1):21–33.  https://doi.org/10.1007/s00477-010-0408-x
  40. Tyralis H, Papacharalampous GA (2017) Variable selection in time series forecasting using random forests. Algorithms 10(4):114.  https://doi.org/10.3390/a10040114
  41. Valipour M, Banihabib ME, Behbahani SMR (2013) Comparison of the ARMA, ARIMA, and the autoregressive artificial neural network models in forecasting the monthly inflow of Dez dam reservoir. J Hydrol 476(7):433–441.  https://doi.org/10.1016/j.jhydrol.2012.11.017
  42. Vapnik VN (1995) The nature of statistical learning theory, 5th edn. Springer-Verlag, New York.  https://doi.org/10.1007/978-1-4757-3264-1
  43. Vapnik VN (1999) An overview of statistical learning theory. IEEE Trans Neural Netw 10(5):988–999.  https://doi.org/10.1109/72.788640
  44. Venables WN, Ripley BD (2002) Modern applied statistics with S, 4th edn. Springer-Verlag, New York.  https://doi.org/10.1007/978-0-387-21706-2
  45. Wang W, Van Gelder PH, Vrijling JK, Ma J (2006) Forecasting daily streamflow using hybrid ANN models. J Hydrol 324(1–4):383–399.  https://doi.org/10.1016/j.jhydrol.2005.09.032
  46. Warnes GR, Bolker B, Gorjanc G, Grothendieck G, Korosec A, Lumley T, MacQueen D, Magnusson A, Rogers J, et al (2017) gdata: Various R Programming Tools for Data Manipulation. R package version 2.18.0. https://CRAN.R-project.org/package=gdata
  47. Wickham H (2016) ggplot2. Springer International Publishing.  https://doi.org/10.1007/978-3-319-24277-4
  48. Wickham H, Chang W (2017) devtools: Tools to Make Developing R Packages Easier. R package version 1.13.4. https://CRAN.R-project.org/package=devtools
  49. Wickham H, Henry L (2017) tidyr: Easily Tidy Data with 'spread()' and 'gather()' Functions. R package version 0.7.2. https://CRAN.R-project.org/package=tidyr
  50. Wickham H, Hester J, Francois R, Jylänki J, Jørgensen M (2017) readr: Read Rectangular Text Data. R package version 1.1.1. https://CRAN.R-project.org/package=readr
  51. Witten IH, Frank E, Hall MA, Pal CJ (2017) Data mining: practical machine learning tools and techniques, 4th edn. Elsevier Inc., Amsterdam. ISBN:978-0-12-804291-5
  52. Xie Y (2014) knitr: a comprehensive tool for reproducible research in R. In: Stodden V, Leisch F, Peng RD (eds) Implementing reproducible computational research. Chapman and Hall/CRC, London
  53. Xie Y (2015) Dynamic documents with R and knitr, 2nd edn. Chapman and Hall/CRC, London
  54. Xie Y (2017) knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.17. https://CRAN.R-project.org/package=knitr
  55. Yaseen ZM, Allawi MF, Yousif AA, Jaafar O, Hamzah FM, El-Shafie A (2016) Non-tuned machine learning approach for hydrological time series forecasting. Neural C Ap 30(5):1479–1491.  https://doi.org/10.1007/s00521-016-2763-0
  56. Yin RK (2003) Case study research: design and methods, 3rd edn. Sage Publications, Inc., Thousand Oaks
  57. Yu X, Liong SY, Babovic V (2004) EC-SVM approach for real-time hydrologic forecasting. J Hydroinf 6(3):209–223
  58. Zambrano-Bigiarini M (2017a) hydroGOF: Goodness-of-Fit Functions for Comparison of Simulated and Observed Hydrological Time Series. R package version 0.3–10. https://CRAN.R-project.org/package=hydroGOF
  59. Zambrano-Bigiarini M (2017b) hydroTSM: Time Series Management, Analysis and Interpolation for Hydrological Modelling. R package version 0.5–1. https://github.com/hzambran/hydroTSM
  60. Zeileis A, Grothendieck G (2005) zoo: S3 infrastructure for regular and irregular time series. J Stat Softw 14(6):1–27.  https://doi.org/10.18637/jss.v014.i06

Copyright information

© Springer Nature B.V. 2018

Authors and Affiliations

  1. Department of Water Resources and Environmental Engineering, School of Civil Engineering, National Technical University of Athens, Zografou, Greece
