Quantifying a Threshold of Missing Values for Gap Filling Processes in Daily Precipitation Series

Water Resources Management

Abstract

This study explores multiple missing-data levels to quantify a threshold of missing values for gap filling in daily precipitation series. An autoregressive model was used to generate rainfall estimates, and subsets of data were selected with four sampling windows (whole data, front, middle, and rear sections) at missing levels of 5, 10, 15, 16, 17, and 18 %. The threshold was identified and evaluated using statistical criteria, including the coefficient of determination (R2) and an associated index termed “the R2 difference index (RDI).” The results indicate that a missing level of about 15 % is a plausible threshold for constructing daily precipitation series for further hydrological analysis when the Gamma distribution function (GDF) is used as the estimation method. The threshold determined in this study will contribute to gap filling guidelines, helping water managers and hydrologists in particular to obtain skillful estimates of missing daily precipitation data.
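The sampling design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the definition of the front, middle, and rear sections as equal thirds, the random placement of gaps, the interpretation of the missing level as a fraction of the whole series, and the names `window_bounds` and `mask_missing` are all assumptions made for illustration, and the rainfall values are synthetic.

```python
import random

def window_bounds(n, window):
    """Return (start, stop) indices of a sampling window over n days.

    Assumption: front, middle, and rear are equal thirds of the record.
    """
    third = n // 3
    bounds = {
        "whole": (0, n),
        "front": (0, third),
        "middle": (third, 2 * third),
        "rear": (2 * third, n),
    }
    return bounds[window]

def mask_missing(series, level, window="whole", seed=0):
    """Copy `series` with round(level * n) values inside the window set to None.

    Assumption: gaps are placed at random positions within the window.
    """
    n = len(series)
    start, stop = window_bounds(n, window)
    n_missing = round(level * n)
    if n_missing > stop - start:
        raise ValueError("window too short for this missing level")
    rng = random.Random(seed)
    gaps = set(rng.sample(range(start, stop), n_missing))
    return [None if i in gaps else v for i, v in enumerate(series)]

# Missing levels and sampling windows listed in the abstract.
levels = [0.05, 0.10, 0.15, 0.16, 0.17, 0.18]
windows = ("whole", "front", "middle", "rear")

# Synthetic stand-in for one year of daily precipitation (mm).
rain = [float(i % 7) for i in range(365)]

for level in levels:
    for window in windows:
        masked = mask_missing(rain, level, window)
        n_gaps = sum(v is None for v in masked)
        print(f"{window:>6} {level:.2f} -> {n_gaps} missing days")
```

Each masked copy would then be passed to a gap filling method, and the estimates compared with the withheld observations.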

Author information

Corresponding author

Correspondence to Jae H. Ryu.

Appendix

The Gamma Distribution Function (GDF):

$$ F\left(x \mid \alpha ,\beta \right)=\int_{0}^{x} f\left(t \mid \alpha ,\beta \right)\,dt $$
(A1)
$$ f\left(x \mid \alpha ,\beta \right)=\frac{1}{\beta^{\alpha }\,\varGamma (\alpha )}\,{x}^{\alpha -1}{e}^{-x/\beta };\quad x\ge 0 $$
(A2)
$$ \varGamma (\alpha )=\int_{0}^{\infty }{t}^{\alpha -1}{e}^{-t}\,dt $$
(A3)
$$ \alpha ={\left(\frac{\overline{x}}{s}\right)}^2,\quad \beta =\frac{s^2}{\overline{x}} $$
(A4)
$$ {F}^{-1}\left(F\left(x \mid \alpha ,\beta \right) \mid \alpha ,\beta \right)=x $$
(A5)

where F is the cumulative distribution function (CDF) of the gamma distribution, F⁻¹ is its inverse, f is the probability density function of the gamma distribution, x is the precipitation at the time step, \( \overline{x} \) is the mean precipitation of the specific month, s is the standard deviation of precipitation for that month, α is the shape parameter, β is the scale parameter, and Γ is the gamma function.
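As a rough illustration of Eqs. A1 to A5, the following Python sketch estimates α and β by the method of moments and evaluates the density, the CDF, and the inverse CDF using only the standard library. The function names, the power-series evaluation of the CDF, the bisection-based inverse, and the sample values are assumptions chosen for illustration; they are not taken from the article.

```python
import math

def fit_gamma_moments(values):
    """Method-of-moments estimates (A4): alpha = (mean/s)^2, beta = s^2/mean."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance s^2
    return mean * mean / var, var / mean

def gamma_pdf(x, alpha, beta):
    """Density (A2), evaluated for x > 0."""
    return (x ** (alpha - 1) * math.exp(-x / beta)
            / (beta ** alpha * math.gamma(alpha)))

def gamma_cdf(x, alpha, beta, tol=1e-12, max_terms=10000):
    """CDF (A1) via the power series of the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    z = x / beta
    term = 1.0 / alpha
    total = term
    for k in range(1, max_terms):
        term *= z / (alpha + k)
        total += term
        if term < tol * total:
            break
    log_scale = -z + alpha * math.log(z) - math.lgamma(alpha)
    return min(1.0, total * math.exp(log_scale))

def gamma_ppf(p, alpha, beta, tol=1e-10):
    """Inverse CDF (A5) by bisection; a simple choice, not tuned for extreme tails."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    lo, hi = 0.0, beta * max(alpha, 1.0)
    while gamma_cdf(hi, alpha, beta) < p:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, alpha, beta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical positive daily amounts (mm) for one calendar month.
sample = [1.2, 0.4, 7.9, 3.3, 12.5, 0.8, 5.1, 2.0]
alpha, beta = fit_gamma_moments(sample)
p = gamma_cdf(5.0, alpha, beta)
x_back = gamma_ppf(p, alpha, beta)  # recovers 5.0 up to the bisection tolerance
print(alpha, beta, p, x_back)
```

With the method-of-moments estimates, the product αβ reproduces the sample mean and αβ² the sample variance, which is a quick way to check a fit.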

Coefficient of determination (R2): R2 is given by

$$ {R}^2={\left(\frac{\frac{1}{N}\sum_{i=1}^N\left({P}_{Qi}-{\overline{P}}_{Q}\right)\left({P}_{Si}-{\overline{P}}_{S}\right)}{\sqrt{\frac{N\sum_{i=1}^N{P}_{Qi}^2-{\left(\sum_{i=1}^N{P}_{Qi}\right)}^2}{N\left(N-1\right)}}\times \sqrt{\frac{N\sum_{i=1}^N{P}_{Si}^2-{\left(\sum_{i=1}^N{P}_{Si}\right)}^2}{N\left(N-1\right)}}}\right)}^2 $$
(A6)

where \( {P}_{Qi} \) is the observed precipitation at time step i, \( {P}_{Si} \) is the estimated precipitation at time step i, \( {\overline{P}}_{Q} \) and \( {\overline{P}}_{S} \) are the means of the observed and estimated precipitation data, respectively, and N is the sample size.
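Eq. A6 can be evaluated directly, as in the short Python sketch below. The function name `r_squared_a6` and the example values are hypothetical; the arithmetic follows the equation as printed, with the numerator divided by N and the denominator built from sample standard deviations that divide by N − 1.

```python
import math

def r_squared_a6(observed, estimated):
    """Coefficient of determination following Eq. A6 term by term."""
    n = len(observed)
    if n != len(estimated) or n < 2:
        raise ValueError("need two series of equal length with N >= 2")
    mean_q = sum(observed) / n
    mean_s = sum(estimated) / n
    # Numerator: (1/N) * sum of cross-deviations from the means.
    cov = sum((q - mean_q) * (s - mean_s)
              for q, s in zip(observed, estimated)) / n
    # Denominator: product of the two sample standard deviations.
    var_q = (n * sum(q * q for q in observed) - sum(observed) ** 2) / (n * (n - 1))
    var_s = (n * sum(s * s for s in estimated) - sum(estimated) ** 2) / (n * (n - 1))
    return (cov / (math.sqrt(var_q) * math.sqrt(var_s))) ** 2

# Hypothetical observed and estimated daily precipitation (mm).
observed = [0.0, 2.5, 10.1, 4.2, 0.3, 7.8]
estimated = [0.4, 2.0, 8.7, 5.0, 0.0, 6.9]
print(r_squared_a6(observed, estimated))
```

Because of the mixed N and N − 1 factors, this expression equals the squared Pearson correlation multiplied by ((N − 1)/N)², so a perfectly linear pair gives slightly less than 1; for long daily series the difference is negligible.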

About this article

Cite this article

Kim, J., Ryu, J.H. Quantifying a Threshold of Missing Values for Gap Filling Processes in Daily Precipitation Series. Water Resour Manage 29, 4173–4184 (2015). https://doi.org/10.1007/s11269-015-1052-5

  • DOI: https://doi.org/10.1007/s11269-015-1052-5
