Abstract
Two-station pairing approaches are routinely used to infill missing information in incomplete rainfall databases. We evaluated the performance of three simple methodologies for reconstructing incomplete time series in the presence of variable nonlinear correlation between data pairs. The nonlinearity stems from the statistics describing the marginal peak-over-threshold (POT) values of rainfall events. A Monte Carlo analysis was developed to quantify the errors expected from chronological pairing (CP), with either linear or nonlinear regression, and from frequency pairing (FP). CP is based on the a priori selection of a regression function, while FP matches the probability of non-exceedance of an event in one time series with the probability of non-exceedance of a similar event in the other time series. We adopted a generalized Pareto (GP) model to describe POT events, and a t-copula algorithm to generate reference pairs of nonlinearly correlated random time series distributed according to the GP model. The results suggest that the optimal methodology strongly depends on the GP statistics. In general, CP seems to provide the lowest errors when the GP statistics of the two series are similar and the correlation becomes linear; we found that a power-2 function performs well for the selected statistics when the number of missing points is limited. FP outperforms the other methods when the POT statistics differ and the variables are markedly nonlinearly correlated. The ensemble-based results seem to be supported by the analysis of observed precipitation at two real-world gauge stations.
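The workflow summarized in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It draws pairs with GP margins joined by a t-copula, then fills the "missing" values of one series from the other using CP with a linear regression and FP with empirical quantile matching. Everything specific here is an assumption made for illustration: the function names, the GP scale and shape values, the correlation of 0.8, the sample sizes, and the choice of 2 degrees of freedom for the t-copula (chosen only because the Student's t CDF then has a closed form, so no external library is needed). The nonlinear (power-2) CP variant is omitted for brevity.

```python
import bisect
import math
import random


def t2_cdf(t):
    """CDF of Student's t with 2 degrees of freedom (closed form)."""
    return 0.5 + t / (2.0 * math.sqrt(t * t + 2.0))


def gp_quantile(u, sigma, xi):
    """Inverse CDF of a GP excess above the threshold (xi != 0)."""
    return sigma / xi * ((1.0 - u) ** (-xi) - 1.0)


def sample_pairs(n, rho, gp_x, gp_y, rng):
    """Draw n pairs with GP margins joined by a t-copula (nu = 2, assumed)."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # A chi-square variable with 2 dof is exponential with mean 2.
        scale = math.sqrt(max(rng.expovariate(0.5), 1e-300) / 2.0)
        # Clamp the uniforms away from 0 and 1 to keep the GP quantile finite.
        u1 = min(max(t2_cdf(z1 / scale), 1e-12), 1.0 - 1e-12)
        u2 = min(max(t2_cdf(z2 / scale), 1e-12), 1.0 - 1e-12)
        pairs.append((gp_quantile(u1, *gp_x), gp_quantile(u2, *gp_y)))
    return pairs


def fit_linear(xs, ys):
    """CP with an a priori linear function y = a + b*x (least squares)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x


def fit_frequency_pairing(xs, ys):
    """FP: return the y value whose empirical rank matches the rank of x."""
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    last = len(xs_sorted) - 1

    def predict(x):
        return ys_sorted[min(bisect.bisect_left(xs_sorted, x), last)]

    return predict


def rmse(predict, pairs):
    """Root-mean-square error of a predictor over (x, y) pairs."""
    return math.sqrt(sum((predict(x) - y) ** 2 for x, y in pairs) / len(pairs))


rng = random.Random(0)
# Assumed GP (scale, shape) parameters; the two stations differ on purpose.
pairs = sample_pairs(2000, 0.8, (1.0, 0.1), (2.0, 0.3), rng)
observed, missing = pairs[:1500], pairs[1500:]
obs_x = [x for x, _ in observed]
obs_y = [y for _, y in observed]

cp_linear = fit_linear(obs_x, obs_y)
fp = fit_frequency_pairing(obs_x, obs_y)
print("RMSE, CP linear:", rmse(cp_linear, missing))
print("RMSE, FP:       ", rmse(fp, missing))
```

A single run such as this one says little on its own. A Monte Carlo comparison of the kind described above would repeat the draw over many realizations and over a range of GP parameters, and compare the resulting error distributions.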
Acknowledgments
We acknowledge the useful suggestions provided by the Associate Editor and three anonymous reviewers, which significantly helped to improve the manuscript. All the data and additional information used and cited in this paper are available from the corresponding author upon request.
Cite this article
Pedretti, D., Beckie, R.D. Stochastic evaluation of simple pairing approaches to reconstruct incomplete rainfall time series. Stoch Environ Res Risk Assess 30, 1933–1946 (2016). https://doi.org/10.1007/s00477-015-1195-1
DOI: https://doi.org/10.1007/s00477-015-1195-1