Frequency based imputation of precipitation

  • Fatih Dikbas
Original Paper


Changing climate and precipitation patterns make the estimation of precipitation, which exhibits two-dimensional and sometimes chaotic behavior, more challenging. In recent decades, numerous data-driven methods have been developed and applied to estimate precipitation; however, these methods suffer from the use of one-dimensional approaches, lack generality, require the use of neighboring stations and have low sensitivity. This paper aims to implement the first generally applicable, highly sensitive two-dimensional data-driven model of precipitation. This model, named frequency based imputation (FBI), relies on non-continuous monthly precipitation time series data. It requires no determination of input parameters and no data preprocessing, and it provides multiple estimations (from the most to the least probable) of each missing data unit utilizing the series itself. A total of 34,330 monthly total precipitation observations from 70 stations in 21 basins within Turkey were used to assess the success of the method by removing and estimating observation series in annual increments. Comparisons with the expectation maximization and multiple linear regression models illustrate that the FBI method is superior in its estimation of monthly precipitation. This paper also provides a link to the software code for the FBI method.
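The evaluation described above (removing one year of observations at a time and estimating it from the rest of the series) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the FBI algorithm: the year-by-month table layout, the function names `monthly_mean_imputer` and `leave_one_year_out_mae`, the use of mean absolute error as the score, and the synthetic data are all hypothetical, and the estimator is a plain monthly-mean placeholder that stands in for whichever method is being assessed.

```python
import math


def monthly_mean_imputer(table, hidden_year):
    """Placeholder estimator (NOT the FBI method).

    For each month, average the values observed in every year except
    the hidden one. Returns a list of 12 estimates (None if no data).
    """
    estimates = []
    for month in range(12):
        others = [
            row[month]
            for year, row in table.items()
            if year != hidden_year and row[month] is not None
        ]
        estimates.append(sum(others) / len(others) if others else None)
    return estimates


def leave_one_year_out_mae(table, imputer):
    """Hide one year at a time, estimate its months, return the mean absolute error."""
    errors = []
    for year, row in table.items():
        estimates = imputer(table, year)  # the imputer must ignore table[year]
        for observed, estimated in zip(row, estimates):
            if observed is not None and estimated is not None:
                errors.append(abs(observed - estimated))
    return sum(errors) / len(errors) if errors else math.nan


# Synthetic data for illustration only: year -> 12 monthly totals (mm).
demo = {
    2000: [10.0] * 12,
    2001: [20.0] * 12,
    2002: [30.0] * 12,
}

mae = leave_one_year_out_mae(demo, monthly_mean_imputer)
print(f"Leave-one-year-out MAE: {mae:.2f} mm")
```

Any other estimator with the same `(table, hidden_year)` signature could be passed in place of the placeholder, so several methods can be scored on identical removed years.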


Keywords: Frequency based imputation; Data-driven modelling; Precipitation; Estimation of missing data



I would like to thank The General Directorate of the State Hydraulic Works of Turkey for providing the data used in this study and the editors and reviewers for their valuable contributions and comments, which greatly improved the manuscript.



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. Civil Engineering Department, Pamukkale University, Denizli, Turkey
