Theoretical and Applied Climatology, Volume 136, Issue 1–2, pp 417–427

Generalised Pareto distribution: impact of rounding on parameter estimation

  • Z. Pasarić
  • K. Cindrić
Original Paper


Problems that arise when common methods (e.g. maximum likelihood and L-moments) for fitting a generalised Pareto (GP) distribution are applied to discrete (rounded) data sets are revealed by analysing real dry spell duration series. The analysis is subsequently performed on generalised Pareto time series obtained by systematic Monte Carlo (MC) simulations. The solution depends on the following: (1) the actual amount of rounding, as determined by the actual data range (measured by the scale parameter, σ) vs. the rounding increment (Δx), combined with (2) applying a certain (sufficiently high) threshold and considering the series of excesses instead of the original series. For a moderate amount of rounding (e.g. σ/Δx ≥ 4), which is commonly met in practice (at least regarding the dry spell data), and where no threshold is applied, the classical methods work reasonably well. If cutting at the threshold is applied to rounded data (which is actually essential when dealing with a GP distribution), then classical methods applied in a standard way can lead to erroneous estimates, even if the rounding itself is moderate. In this case, it is necessary to adjust the theoretical location parameter for the series of excesses. The other solution is to add an appropriate uniform noise to the rounded data (so-called "jittering"). This, in a sense, reverses the process of rounding, and thereafter it is straightforward to apply the common methods. Finally, if the rounding is too coarse (e.g. σ/Δx ~ 1), then none of the above recipes would work, and thus specific methods for rounded data should be applied.
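The two remedies named in the abstract can be illustrated with a small Monte Carlo sketch. It is not the authors' code, and it rests on several assumptions that the abstract does not state: rounding is to the nearest multiple of Δx; the threshold u lies on the rounding grid and rounded values r ≥ u are retained, so the underlying continuous values start at u − Δx/2 (which is why the location of the excesses is set to −Δx/2 here); the parameter values (σ = 4, ξ = 0.1, Δx = 1, u = 5) are arbitrary; and the fit uses the standard L-moment relations for a GP with known location. All function names are hypothetical.

```python
import math
import random


def gp_sample(n, sigma, xi, rng):
    """Draw n variates from GP(location 0, scale sigma, shape xi != 0) by inversion."""
    return [sigma / xi * ((1.0 - rng.random()) ** (-xi) - 1.0) for _ in range(n)]


def round_to_grid(xs, dx):
    """Round each value to the nearest multiple of dx (assumed rounding rule)."""
    return [dx * math.floor(x / dx + 0.5) for x in xs]


def jitter(xs, dx, rng):
    """Add uniform noise of width dx, centred on zero, to each rounded value."""
    return [x + rng.uniform(-dx / 2.0, dx / 2.0) for x in xs]


def gp_lmom_fit(xs, loc=0.0):
    """L-moment estimates (sigma, xi) for a GP with known location loc.

    Uses lambda1 = loc + sigma / (1 - xi) and
    lambda2 = sigma / ((1 - xi) * (2 - xi)).
    """
    s = sorted(xs)
    n = len(s)
    b0 = sum(s) / n
    # enumerate() starts at 0, which is the (rank - 1) weight of the estimator.
    b1 = sum(i * x for i, x in enumerate(s)) / (n * (n - 1))
    l1, l2 = b0, 2.0 * b1 - b0
    xi = 2.0 - (l1 - loc) / l2
    sigma = (l1 - loc) * (1.0 - xi)
    return sigma, xi


def run_demo(n=50000, sigma=4.0, xi=0.1, dx=1.0, u=5.0, seed=1):
    """Compare naive, location-adjusted and jittered fits on rounded GP data."""
    rng = random.Random(seed)
    rounded = round_to_grid(gp_sample(n, sigma, xi, rng), dx)

    # Excesses of the rounded series over a threshold on the rounding grid.
    excesses = [v - u for v in rounded if v >= u]
    naive = gp_lmom_fit(excesses, loc=0.0)            # standard application
    adjusted = gp_lmom_fit(excesses, loc=-dx / 2.0)   # assumed location shift

    # Jitter the whole rounded series first, then threshold as usual.
    jittered_series = jitter(rounded, dx, rng)
    jittered = gp_lmom_fit([v - u for v in jittered_series if v > u], loc=0.0)
    return {"naive": naive, "adjusted": adjusted, "jittered": jittered}


if __name__ == "__main__":
    for name, (s_hat, xi_hat) in run_demo().items():
        print(f"{name:9s} sigma={s_hat:.3f} xi={xi_hat:.3f}")
```

In this setup the naive and adjusted shape estimates differ by exactly (Δx/2)/l₂, where l₂ is the second sample L-moment of the excesses, so the effect of ignoring the location shift does not shrink as the sample grows.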



The constructive comments from two anonymous reviewers are gratefully acknowledged.

Funding information

This work has been supported in part by the Croatian Science Foundation under the project 2831. K. Cindrić received funding from the European Union’s Horizon 2020 research and innovation program under the grant agreement no. 653824/EU-CIRCLE.



Copyright information

© Springer-Verlag GmbH Austria, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Geophysics, Faculty of Science, University of Zagreb, Zagreb, Croatia
  2. Meteorological and Hydrological Service, Zagreb, Croatia
