Stochastic evaluation of simple pairing approaches to reconstruct incomplete rainfall time series

  • Daniele Pedretti
  • Roger D. Beckie
Original Paper


Two-station pairing approaches are routinely used to infill missing information in incomplete rainfall databases. We evaluated the performance of three simple methodologies for reconstructing incomplete time series in the presence of variable nonlinear correlation between data pairs. The nonlinearity stems from the statistics describing the marginal peak-over-threshold (POT) values of rainfall events. A Monte Carlo analysis was developed to quantitatively assess the errors expected from the use of chronological pairing (CP), with linear and nonlinear regression, and frequency pairing (FP). CP is based on an a priori selection of regression functions, while FP is based on matching the probability of non-exceedance of an event in one time series with the probability of non-exceedance of a similar event in another time series. We adopted a generalized Pareto (GP) model to describe POT events, and a t-copula algorithm to generate reference pairs of nonlinearly correlated random time series whose values are distributed according to the GP model. The results suggest that the optimal methodology strongly depends on the GP statistics. In general, CP seems to provide the lowest errors when the GP statistics of the two series are similar and the correlation becomes linear; we found that a power-2 function performs well for the selected statistics when the number of missing points is limited. FP outperforms the other methods when the POT statistics differ and the variables are markedly nonlinearly correlated. The ensemble-based results seem to be supported by the analysis of observed precipitation at two real-world gauge stations.
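The workflow summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the GP shape and scale values, the copula correlation and degrees of freedom, and the 10% gap fraction are all assumptions chosen for illustration. CP is sketched as a polynomial least-squares fit (`degree=1` for the linear case; the exact form of the "power-2" function is not given here, so a degree-2 polynomial would only be one possible reading), and FP is sketched as empirical quantile matching between the two sorted samples.

```python
import numpy as np
from scipy import stats


def t_copula_gp_pairs(n, rho, dof, gp_x, gp_y, seed=0):
    """Draw n pairs with GP marginals coupled by a bivariate t-copula.

    gp_x and gp_y are (shape, scale) tuples; all values are illustrative.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    # Bivariate Student-t sample: correlated normals scaled by sqrt(chi2 / dof).
    z = rng.multivariate_normal(np.zeros(2), cov, size=n)
    w = rng.chisquare(dof, size=n) / dof
    t = z / np.sqrt(w)[:, None]
    # Map to uniform margins; clip to keep the GP quantiles finite.
    u = np.clip(stats.t.cdf(t, df=dof), 1e-12, 1.0 - 1e-12)
    x = stats.genpareto.ppf(u[:, 0], c=gp_x[0], scale=gp_x[1])
    y = stats.genpareto.ppf(u[:, 1], c=gp_y[0], scale=gp_y[1])
    return x, y


def chronological_pairing(x_obs, y_obs, x_new, degree=1):
    """CP sketch: regress y on x using concurrent observations, then predict."""
    coeffs = np.polyfit(x_obs, y_obs, degree)
    return np.polyval(coeffs, x_new)


def frequency_pairing(x_obs, y_obs, x_new):
    """FP sketch: pair values with equal empirical non-exceedance probability.

    Sorting both samples pairs them rank by rank; interpolation maps a new x
    to the y value of matching rank. Values outside the observed x range are
    clamped to the smallest or largest observed y.
    """
    return np.interp(x_new, np.sort(x_obs), np.sort(y_obs))


# Synthetic reference pair with different GP statistics at the two stations.
x, y = t_copula_gp_pairs(500, rho=0.7, dof=4,
                         gp_x=(0.1, 10.0), gp_y=(0.3, 5.0), seed=42)

# Hide every tenth value of y and reconstruct it from x.
missing = np.zeros(x.size, dtype=bool)
missing[::10] = True
y_cp = chronological_pairing(x[~missing], y[~missing], x[missing], degree=1)
y_fp = frequency_pairing(x[~missing], y[~missing], x[missing])

rmse_cp = np.sqrt(np.mean((y_cp - y[missing]) ** 2))
rmse_fp = np.sqrt(np.mean((y_fp - y[missing]) ** 2))
print(f"RMSE CP (linear): {rmse_cp:.3f}  RMSE FP: {rmse_fp:.3f}")
```

A single realization such as this one says little about which method is preferable. A Monte Carlo assessment of the kind described above would repeat the generation and reconstruction over many seeds and parameter combinations and compare the resulting error distributions.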


Missing data; Rainfall; Chronological pairing; Frequency pairing; Nonlinearity; Copulas



We acknowledge the useful suggestions provided by the Associate Editor and three anonymous reviewers, which significantly helped to improve the manuscript. All the data and additional information used and cited in this paper can be provided by the corresponding author upon request.

Supplementary material

477_2015_1195_MOESM1_ESM.pdf (119 kb)
Supplementary material 1 (PDF, 119 kb)
477_2015_1195_MOESM2_ESM.pdf (41 kb)
Supplementary material 2 (PDF, 42 kb)



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Earth, Ocean and Atmospheric Sciences, University of British Columbia (UBC), Vancouver, Canada
