Abstract
Accurate and precise estimation of return levels is often a key goal of any extreme value analysis. For example, in the UK the British Standards Institution (BSI) incorporate estimates of ‘once-in-50-year wind gust speeds’ (or 50-year return levels) into their design codes for new structures; similarly, the Dutch Delta Commission use estimates of the 10,000-year return level for sea-surge to aid the construction of flood defence systems. In this paper, we briefly highlight the shortcomings of standard methods for estimating return levels, including the commonly adopted block maxima and peaks over thresholds approaches, before presenting an estimation framework which we show can substantially increase the precision of return level estimates. Our work allows explicit quantification of seasonal effects, as well as exploiting recent developments in the estimation of the extremal index for handling extremal clustering. From frequentist ideas, we turn to the Bayesian paradigm as a natural approach for building complex hierarchical or spatial models for extremes. Through simulations we show that the return level posterior mean does not have an exceedance probability in line with the intended encounter risk; we also argue that the Bayesian posterior predictive value gives the most satisfactory representation of a return level for use in practice, accounting for uncertainty in parameter estimation and future observations. Thus, where feasible, we propose a Bayesian estimation strategy for optimal return level inference.
Notes
Provided their “\(D(u_n)\) condition” holds; informally, this condition ensures that, for large enough lags, any dependence is sufficiently negligible so as to have no effect on the limit laws for extremes.
Details of MCMC techniques are now extensively published (see, for example, Smith and Roberts 1993) and so are omitted here.
References
Ancona-Navarrete MA, Tawn JA (2000) A comparison of methods for estimating the extremal index. Extremes 3:5–38
Atyeo J, Walshaw D (2012) A region-based hierarchical model for extreme rainfall over the UK, incorporating spatial dependence and temporal trend. Environmetrics 23(6):509–521
Beirlant J, Goegebeur J, Teugels J, Segers J, De Waal D, Ferro C (2004) Statistics of extremes. Wiley, New York
Bottolo P, Consonni G, Dellaportas P, Lijoi A (2003) Bayesian analysis of extreme values by mixture modelling. Extremes 6:25–48
Brooks SP, Gelman A (1998) General methods for monitoring convergence of iterative simulations. J Comput Graph Stat 7(4):434–455
Cabras S (2013) Default priors based on pseudo-likelihoods for the Poisson-GPD model. In: Torelli N, Pesarin F, Bar-Hens A (eds) Advances in theoretical and applied statistics, studies in theoretical and applied statistics. Springer, Berlin, pp 3–12
Chavez-Demoulin V, Davison A (2005) Generalized additive modelling of sample extremes. J R Stat Soc Ser C 54:207–222
Coles SG (2001) An introduction to statistical modeling of extreme values. Springer, London
Coles SG, Tawn JA (1991) Modelling extreme multivariate events. J R Stat Soc Ser B 53:377–392
Coles SG, Powell EA (1996) Bayesian methods in extreme value modelling: a review and new developments. Int Stat Rev 64(1):119–136
Coles SG, Tawn JA (1996) A Bayesian analysis of extreme rainfall data. J R Stat Soc Ser C 45:463–478
Coles SG, Heffernan JE, Tawn JA (1999) Dependence measures for extreme value analyses. Extremes 2:339–365
Coles SG, Tawn JA (2005) Bayesian modelling of extreme surges on the UK east coast. Philos Trans R Soc A 363:1387–1406
Davison AC, Smith RL (1990) Models for exceedances over high thresholds. J R Stat Soc Ser B 52:393–442 (with discussion)
Davison AC, Padoan SA, Ribatet M (2012) Statistical modeling of spatial extremes. Stat Sci 27:161–186
Diggle PJ, Ribeiro PJ (2007) Model-based geostatistics. Springer, New York
Eastoe EF (2009) A hierarchical model for non-stationary multivariate extremes: a case study of surface-level ozone and NOx data in the UK. Environmetrics 20:428–444
Eastoe EF, Tawn JA (2012) Modelling the distribution for the cluster maxima of exceedances of sub-asymptotic thresholds. Biometrika 99(1):43–55
Efron B (1987) Better bootstrap confidence intervals. J Am Stat Assoc 82:171–185
Castellanos ME, Cabras S (2007) A default Bayesian procedure for the generalized Pareto distribution. J Stat Plan Inference 137(2):473–483
Fawcett L (2005) Statistical methodology for the estimation of environmental extremes. Ph.D. thesis, Newcastle University, Newcastle-upon-Tyne
Fawcett L, Walshaw D (2006a) Markov chain models for extreme wind speeds. Environmetrics 17(8):795–809
Fawcett L, Walshaw D (2006b) A hierarchical model for extreme wind speeds. J R Stat Soc Ser C 55(5):631–646
Fawcett L, Walshaw D (2007) Improved estimation for temporally clustered extremes. Environmetrics 18(2):173–188
Fawcett L, Walshaw D (2008) Bayesian inference for clustered extremes. Extremes 11:217–233
Fawcett L, Walshaw D (2012) Estimating return levels from serially dependent extremes. Environmetrics 23(3):272–283
Ferro CAT, Segers J (2003) Inference for clusters of extreme values. J R Stat Soc Ser B 65:545–556
Galiatsatou P, Prinos P (2011) Modeling non-stationary extreme waves using a point process approach and wavelets. Stoch Environ Res Risk Assess 25(2):165–183
Gomes MI (1993) On the estimation of parameters of rare events in environmental time series. In: Barnett V, Turkman KF (eds) Statistics for the environment 2: water related issues. Wiley, Chichester, pp 225–241
Ho KW (2010) A matching prior for extreme quantile estimation of the generalized Pareto distribution. J Stat Plan Inference 140(6):1513–1518
Hsing T (1993) Extremal index estimation for a weakly dependent stationary sequence. Ann Stat 21:2043–2071
Jonathan P, Ewans KC, Randell D (2014) Non-stationary conditional extremes of northern North Sea storm characteristics. Environmetrics 25(3):172–188
Jonathan P, Ewans KC (2011) Modelling the seasonality of extreme waves in the Gulf of Mexico. ASME J Offshore Mech Arct Eng 133:021104
Leadbetter MR, Lindgren G, Rootzén H (1983) Extremes and related properties of random sequences and series. Springer, New York
Leadbetter MR, Rootzén H (1988) Extremal theory for stochastic processes. Ann Probab 16:431–476
Northrop P (2012) Semiparametric estimation of the extremal index using block maxima. Technical report number 318, Department of Statistical Science, University College London
Northrop P, Attalides N (2014) Posterior propriety in objective Bayesian extreme value analyses. Departmental research report 323, University College London
Padoan SA, Bevilacqua M (2013) CompRandFld: composite-likelihood based analysis of random fields. R Package Version 1:3
Pickands J (1981) Multivariate extreme value distributions. In: Proceedings of the 43rd session of the International Statistical Institute, vol 2, Bull Inst Int Stat 49, pp 859–878
Roberts GO, Gelman A, Gilks WR (1997) Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann Appl Probab 7(1):110–120
Sang H, Gelfand AE (2009) Hierarchical modeling for extreme values observed over space and time. Environ Ecol Stat 16:407–426
Sang H, Gelfand AE (2010) Continuous spatial process models for extreme values. J Agric Biol Environ Stat 15:49–65
Scarrott C, MacDonald A (2012) A review of extreme value threshold estimation and uncertainty quantification. REVSTAT Stat J 10(1):33–60
Schlather M (2002) Models for stationary max-stable random fields. Extremes 5:33–44
Serinaldi F (2015) Dismissing return periods! Stoch Environ Res Risk Assess 29(4):1179–1189
Shiau JT (2003) Return period of bivariate distributed extreme hydrological events. Stoch Environ Res Risk Assess 17(1–2):42–57
Smith AFM, Roberts GO (1993) Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods. J R Stat Soc Ser B 55:3–23
Smith EL, Walshaw D (2003) Modelling bivariate extremes in a region. Bayesian Stat 7:681–690
Smith RL (1992) The extremal index for a Markov chain. J Appl Probab 29:37–45
Smith RL (1999) Bayesian and frequentist approaches to parametric predictive inference (with discussion). In: Bernardo JM, Berger JO, Dawid AP, Smith AFM (eds) Bayesian statistics, vol 6, pp 589–612
Smith RL, Weissman I (1994) Estimating the extremal index. J R Stat Soc Ser B 56:515–528
Smith RL, Tawn JA, Coles SG (1997) Markov chain models for threshold exceedances. Biometrika 84:249–268
Smith RL, Goodman DJ (2000) Bayesian risk analysis. In: Embrechts P (ed) Extremes and integrated risk management. Risk Books, London, pp 235–251
Stephenson A, Ribatet M (2014) evdbayes: Bayesian analysis in extreme value theory. R Package Version 1(8):1
Stephenson A, Tawn JA (2004) Inference for extremes: accounting for the three extremal types. Extremes 7(4):291–307
Süveges M (2007) Likelihood estimation of the extremal index. Extremes 10:41–55
Süveges M, Davison AC (2010) Model misspecification in peaks over threshold analysis. Ann Appl Stat 4:203–221
Van der Vyver H (2015) On the estimation of continuous 24-h precipitation maxima. Stoch Environ Res Risk Assess 29(3):653–663
Vanem E (2011) Long-term time-dependent stochastic modelling of extreme waves. Stoch Environ Res Risk Assess 25:185–209
Walshaw D (1991) Statistical analysis of extreme wind speeds. Ph.D. thesis, University of Sheffield, Sheffield
Walshaw D (1994) Getting the most from your extreme wind data: a step by step guide. J Res Natl Inst Stand Technol 99:399–411
Xu Y, Booij MJ, Tang Y (2010) Uncertainty analysis in statistical modeling of extreme hydrological events. Stoch Environ Res Risk Assess 24(5):567–578
Acknowledgments
We would like to thank three referees, and the Associate Editor, for their extremely helpful comments.
Appendices
Appendix 1: extremal index estimators
1.1 Cluster size estimators
- The runs estimator: \(\hat{\theta }=(\text{mean cluster size})^{-1}\), using cluster termination interval \(\kappa \) to identify clusters (see Sect. 1.2).
- The blocks estimator: As for the runs estimator, but where blocks of length \(\tau \) are considered clusters if there is at least one threshold exceedance within the block.
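As a minimal sketch (not the authors' code), both cluster size estimators can be computed directly from a series, using the fact that the reciprocal of the mean cluster size equals the number of clusters divided by the number of threshold exceedances. The function and argument names (`runs_estimator`, `blocks_estimator`, `x`, `u`, `kappa`, `tau`) are hypothetical, and the convention that a cluster ends once at least `kappa` consecutive observations fail to exceed the threshold is an assumption made for illustration.

```python
import numpy as np


def runs_estimator(x, u, kappa):
    """Runs estimator: number of clusters / number of exceedances of u."""
    times = np.flatnonzero(np.asarray(x) > u)  # indices of exceedances
    if times.size == 0:
        raise ValueError("no threshold exceedances")
    # A gap of more than kappa between exceedance times means at least
    # kappa consecutive non-exceedances, which closes the current cluster.
    n_clusters = 1 + np.sum(np.diff(times) > kappa)
    return n_clusters / times.size


def blocks_estimator(x, u, tau):
    """Blocks estimator: blocks containing an exceedance / exceedances."""
    x = np.asarray(x)
    n_blocks = x.size // tau
    # Drop any incomplete final block, then arrange one block per row.
    exceed = x[: n_blocks * tau].reshape(n_blocks, tau) > u
    if not exceed.any():
        raise ValueError("no threshold exceedances")
    return exceed.any(axis=1).sum() / exceed.sum()
```

Both functions return a value in (0, 1]; smaller values indicate stronger clustering of exceedances. The result depends on the choice of `kappa` or `tau`, which is the sensitivity that motivates the alternative estimators listed in this appendix.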
1.2 Maxima methods
- Gomes’ estimator: Obtain \((\hat{\mu }_{\theta }, \hat{\varsigma }_{\theta }, \hat{\xi }_{\theta })\) for the GEV applied to block maxima \(\{M_{\tau }\}\) with block length \(\tau \). Find also \((\hat{\mu },\hat{\varsigma },\hat{\xi })\) from block maxima \(\{\bar{M}_{\tau }\}\), obtained from an independent series after randomisation of the original series. Then
$$\begin{aligned} \hat{\theta }&= (\hat{\varsigma }/\hat{\varsigma }_{\theta })^{-1/\tilde{\xi }}, \quad \text{where}\\ \tilde{\xi }&= (\hat{\varsigma }-\hat{\varsigma }_{\theta })/(\hat{\mu }-\hat{\mu }_{\theta }) \quad (\text{Gomes 1993}). \end{aligned}$$
- Northrop’s estimator:
$$ \hat{\theta } = -1/\overline{\log V}, $$
with \(\overline{\log V} = \sum _{i=1}^{n}\log V_{i}/n\), \(V_{i}\) being a random sample from a \(\text{Beta}(\theta ,1)\) distribution (Northrop 2012).
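The two closed-form expressions above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the GEV fits needed for Gomes' estimator are assumed to have been obtained elsewhere and are passed in as plain numbers, and the construction of the \(V_{i}\) from the data (described in Northrop 2012) is not reproduced, so `v` is simply taken as given.

```python
import numpy as np


def gomes_estimator(mu, sigma, mu_theta, sigma_theta):
    """Plug GEV location/scale estimates into Gomes' closed form.

    (mu, sigma): fit to block maxima of the randomised series.
    (mu_theta, sigma_theta): fit to block maxima of the original series.
    """
    xi_tilde = (sigma - sigma_theta) / (mu - mu_theta)
    return (sigma / sigma_theta) ** (-1.0 / xi_tilde)


def northrop_estimator(v):
    """theta-hat = -1 / mean(log V) for values v assumed Beta(theta, 1)."""
    return -1.0 / np.mean(np.log(np.asarray(v, dtype=float)))
```

The second function is the maximum likelihood estimator of \(\theta \) for a \(\text{Beta}(\theta ,1)\) sample, since the log-likelihood \(n\log \theta + (\theta -1)\sum \log V_{i}\) is maximised at \(-n/\sum \log V_{i}\).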
1.3 Intervals estimators
- Ferro and Segers’ estimator:
$$ \hat{\theta } = \text{min}\left\{ 1, \frac{2\left\{ \sum _{i=1}^{J-1}(T_{i}-a)\right\} ^{2}}{(J-1)\sum _{i=1}^{J-1}(T_{i}-b)(T_{i}-c)}\right\} ,$$
where \(T_{i}=S_{i+1}-S_{i}\), \(i=1, \ldots , J-1\), are the times between the \(J\) threshold exceedances occurring at times \(S_{1}<\cdots <S_{J}\); \(a=b=c=0\) if \(\text{max}(T_{i})\le 2\); otherwise, \(a=b=1,c=2\) (Ferro and Segers 2003).
- Süveges’ MLE: Maximum likelihood estimator based on an extension of the work in Ferro and Segers (2003). The likelihood for \(U_{i}=T_{i}-1\), \(i=1, \ldots , J-1\), is maximised to obtain a closed-form expression for \(\hat{\theta }\) (Süveges 2007).
- Süveges’ IWLS: Iterative weighted least squares estimator based on the normalised gaps between clusters (Süveges 2007).
- K-gaps estimator: An extension of Süveges’ MLE, shown to have reduced bias and RMSE (given an optimal choice of tuning parameter K) (Süveges and Davison 2010).
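A minimal Python sketch of the intervals estimator is given below, written in the form published by Ferro and Segers (2003), with twice the squared sum of the shifted inter-exceedance times in the numerator. The function name `intervals_estimator` and its arguments `x` and `u` are hypothetical; the likelihood-based estimators in the list are not covered.

```python
import numpy as np


def intervals_estimator(x, u):
    """Intervals estimator of the extremal index for threshold u."""
    s = np.flatnonzero(np.asarray(x) > u)  # exceedance times S_1 < ... < S_J
    if s.size < 2:
        raise ValueError("need at least two threshold exceedances")
    t = np.diff(s).astype(float)           # inter-exceedance times T_i
    if t.max() <= 2:
        # Case a = b = c = 0.
        num = 2.0 * t.sum() ** 2
        den = t.size * np.sum(t ** 2)
    else:
        # Case a = b = 1, c = 2.
        num = 2.0 * np.sum(t - 1.0) ** 2
        den = t.size * np.sum((t - 1.0) * (t - 2.0))
    return min(1.0, num / den)
```

Unlike the runs and blocks estimators, this needs no cluster-identification parameter: only the threshold `u` has to be chosen.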
Appendix 2: Bayesian sampling in the hierarchical model
For the hierarchical model outlined in Sect. 3.2, we have
for the GPD (log) scale and shape, and the logistic dependence parameters (respectively). All random effects for \(\eta _{m,s}\) and \(\xi _{m,s}\) are assumed to be normally distributed:
for seasonal effects, and
for site effects. We choose the mean of the normal distribution of the seasonal effects to be fixed at zero to avoid over-parameterisation and problems of identifiability; however, we could equally have fixed the mean for the distribution of site effects to achieve this. Since the logistic dependence parameter \(\alpha \) must lie between 0 and 1, we draw the site effect for \(\alpha \) from a uniform distribution, and so
The final layer of the model is to specify prior distributions for the random effect distribution parameters. Here, we have chosen largely non-informative priors, adopting conjugacy wherever possible to simplify computations. Thus,
with a suitable specification of hyper-parameters. The MCMC algorithm employed is Metropolis within Gibbs, i.e. we update each component singly using a Gibbs sampler where the conjugacy allows straightforward sampling from the full conditionals, and a Metropolis step elsewhere. The full conditionals for the Gibbs sampling are:
and
where \(n_{m}=12\) is the number of months and \(n_{s}=12\) is the number of sites, and here the notation \(\zeta _{\centerdot }\), for example, is used generically to denote either \(\zeta _{\eta }\) or \(\zeta _{\xi }\). The complexity of the likelihood derived from the GPD means that conjugacy is unattainable for the random effect parameters, and a Metropolis step is used to update each of these.
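The Metropolis within Gibbs scheme described above can be illustrated on a deliberately simplified toy model; this is a sketch under stated assumptions, not the model of Sect. 3.2. Here exceedances at each site are taken to be exponential (a GPD with shape fixed at zero) with site-specific log-scale \(\eta _{j}\), the \(\eta _{j}\) are normal random effects with mean `a` and precision `zeta`, and the vague hyper-parameter values are arbitrary. All names in the code are hypothetical. The random effects are updated by random walk Metropolis steps, and the two random effect distribution parameters by Gibbs draws from their conjugate full conditionals.

```python
import numpy as np


def log_target_eta(eta, y, a, zeta):
    """Log full conditional (up to a constant) of one site's log-scale."""
    # Exponential likelihood with scale exp(eta) plus N(a, 1/zeta) prior.
    return -y.size * eta - y.sum() / np.exp(eta) - 0.5 * zeta * (eta - a) ** 2


def metropolis_within_gibbs(data, n_iter=4000, step=0.15, seed=1):
    """Return draws with columns (eta_1, ..., eta_J, a, zeta)."""
    rng = np.random.default_rng(seed)
    data = [np.asarray(y, dtype=float) for y in data]
    n_sites = len(data)
    eta = np.array([np.log(y.mean()) for y in data])  # start at site MLEs
    a, zeta = eta.mean(), 1.0
    # Assumed vague priors: a ~ N(0, 1/a_prec0), zeta ~ Gamma(shape0, rate0).
    a_prec0, shape0, rate0 = 1e-3, 1e-2, 1e-2
    draws = np.empty((n_iter, n_sites + 2))
    for it in range(n_iter):
        # Metropolis step for each random effect (no conjugacy available).
        for j in range(n_sites):
            prop = eta[j] + step * rng.normal()
            log_ratio = (log_target_eta(prop, data[j], a, zeta)
                         - log_target_eta(eta[j], data[j], a, zeta))
            if np.log(rng.uniform()) < log_ratio:
                eta[j] = prop
        # Gibbs step: normal full conditional for the random-effect mean.
        prec = a_prec0 + n_sites * zeta
        a = rng.normal(zeta * eta.sum() / prec, 1.0 / np.sqrt(prec))
        # Gibbs step: gamma full conditional for the random-effect precision.
        rate = rate0 + 0.5 * np.sum((eta - a) ** 2)
        zeta = rng.gamma(shape0 + 0.5 * n_sites, 1.0 / rate)
        draws[it] = np.concatenate([eta, [a, zeta]])
    return draws
```

In practice an initial portion of the draws would be discarded as burn-in, and the proposal standard deviation `step` tuned to give a reasonable acceptance rate, before summarising the posterior.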
In the absence of expert prior knowledge, prior parameters were chosen to give a highly non-informative specification:
Cite this article
Fawcett, L., Walshaw, D. Sea-surge and wind speed extremes: optimal estimation strategies for planners and engineers. Stoch Environ Res Risk Assess 30, 463–480 (2016). https://doi.org/10.1007/s00477-015-1132-3
DOI: https://doi.org/10.1007/s00477-015-1132-3