Stochastic Environmental Research and Risk Assessment, Volume 19, Issue 6, pp 417–427

Seasonal effects of extreme surges

  • Stuart Coles (Dipartimento di Scienze Statistiche, Università di Padova)
  • Jonathan Tawn (Department of Mathematics and Statistics, Lancaster University)

Original Paper

DOI: 10.1007/s00477-005-0008-3

Cite this article as:
Coles, S. & Tawn, J. Stoch Environ Res Ris Assess (2005) 19: 417. doi:10.1007/s00477-005-0008-3


Extreme value analysis of sea levels is an essential component of risk analysis and protection strategy for many coastal regions. Since the tidal component of the sea level is deterministic, it is the stochastic variation in extreme surges that is the most important to model. Historically, this modelling has been accomplished by fitting classical extreme value models to series of annual maxima. Recent developments in extreme value modelling have led to alternative procedures that make better use of the available data, and hence to much-refined estimates of extreme surge levels. However, one aspect that has been routinely ignored is seasonality. In an earlier study we identified strong seasonal effects at one of a number of locations along the eastern coastline of the United Kingdom. In this article, we discuss in greater detail the construction and inference of extreme value models for processes that include components of seasonality. We use a point process representation of extreme value behaviour, and set our inference in a Bayesian framework, using simulation-based techniques to resolve the computational issues. Though relatively recent, these techniques are now widely used for extreme value modelling. However, the issue of seasonality requires delicate consideration of model specification and parameterization, especially for efficient implementation via Markov chain Monte Carlo algorithms, and this issue seems not to have been much discussed in the literature. In the present paper we make some suggestions for model construction and apply the resultant model to study the characteristics of the surge process, especially its seasonal variation, on the eastern UK coastline. Furthermore, we illustrate how an estimated model for seasonal surge can be combined with tide records to produce return level estimates for extreme sea levels that account for seasonal variation in both the surge and tidal processes.


Keywords: Bayesian statistics · Extreme values · Point process · Sea level · Seasonality · Surge

1 Introduction

The various European coastlines that border the North Sea are especially prone to flooding. The causes of this are compound: some regions are very low lying; tidal ranges are large in some areas; and meteorological depressions causing large surge events can be severe. To some extent, the problem of low land levels and high tidal ranges can be mitigated, as the effects are predictable and quantifiable. More problematic is the effect of extreme surges, since both the timing and magnitude of such events are difficult to predict.

In a recent paper (Coles and Tawn 2004) we reviewed the statistical methodology that has been developed for the analysis of extreme sea levels. The principal developments include the use of asymptotic models for extreme events, the separate analysis of the various sea-level components, the enhancement of the basic extreme value model to incorporate extra information into inference, and the spatial mapping of model outputs. Beyond the review, we provided a case study based on 11 sites on the eastern UK coastline, focusing on the spatial mapping of extremal characteristics of the surge distribution. The analysis, which we set in a Bayesian framework, stopped short of a fully spatial analysis as spatial dependence was not taken into account. Despite this, clear spatial patterns were identified. Focusing on one site in particular, Lowestoft, we also found evidence of a strong seasonal effect in surge behaviour. The aim of this paper, which complements Coles and Tawn (2004), is to address the issue of seasonal variation in the surge process in greater detail. Our aims are twofold: first, to discuss methodological aspects of modelling seasonality in extreme value processes that seem not to have been widely discussed in the literature; second, to explore the seasonal aspects of the surge process based on the same set of data analysed in our earlier paper.

The fact that the sea level is driven by different physical mechanisms, each having their own temporal and spatial dynamic, means that it is almost always more efficient to analyse the constituent processes separately. This concept is at the heart of the joint probability method (Pugh and Vassie 1979, 1980) in which the two principal sea level components—surge and tide—are separately modelled, prior to combination for an inference on total sea level. A complication is that surge and tide processes are generally dependent and it is necessary to modify the joint probability approach to take account of the interaction in the processes (Tawn and Vassie 1989). On the east coast of the UK, for example, the surge distribution tends to be dampened on the high tide. Fortunately, this affords the region some natural protection against excessive sea levels.

In Coles and Tawn (2004), where we also worked with the constituent processes of the sea level, we took a rather more pragmatic approach, filtering the surge series to obtain those values that coincided with a high tide. There are approximately two such events per day, or more precisely, 705 events per year. In this way, the high tide and coincident surge events can reasonably be treated as independent. Moreover, time-dependence within each series is considerably reduced, to the point where independence is a reasonable working approximation. The price to be paid for this simplification is that the surge series generated does not necessarily include the most extreme surges. On the other hand, since they are the ones that occur on the highest tide, it is likely that they are the ones leading to the greatest damage. In any event, it is the series of high-tide surge events that we continue to work with here.

In Sect. 2 we set out a basic modelling strategy for extreme surges. This is based on the latest ideas from extreme value modelling, linking a point process model for extremes to a Bayesian method of inference and adapted to allow for seasonality. There are many advantages to adopting a Bayesian inference, but computation becomes more difficult. Recent years have seen the development of many simulation-based approaches, and in particular the technique of Markov chain Monte Carlo (MCMC). This method is outlined in Sect. 3. However, the efficiency of MCMC algorithms can depend dramatically on the parameterization of a model, and this has serious implications for extreme value models, especially once seasonality is incorporated. This is discussed in Sect. 4, where we make some novel suggestions about parameterization and apply the resultant model to surge series at sites on the eastern UK coastline. Informally, we are able to identify the spatial characteristics of the seasonal variation. Finally, in Sect. 5, we turn to the issue of sea level modelling. In principle, due to the independence of our tide and surge events, the sea level distribution can be obtained as a convolution of the tide and surge distributions. In practice this is complicated not just by seasonality in the surge process, but also by seasonality in the tide process. We illustrate by example how this issue can be handled.

2 Seasonal models for extremes

Though the range of techniques available for modelling extreme surges in particular is set out in our previous paper, it is convenient to detail one particular approach here. Of the many different representations for extremes of stochastic processes, the most general is in terms of point processes. Not only does this representation include others as special cases, but it also provides a simple modelling tool that optimizes the use of available information on extremes. This characterization of extremes was developed by Pickands (1971), but was first used explicitly for inference by Smith (1989). A detailed discussion of the procedure can be found in Coles (2001). In the present application we have a series X1,X2,...,Xn in which each Xi is the surge recorded at a single location at a time of high tide. If there are n_o observations per year, then the number of years of observation is given by n_y = n/n_o. Initially we assume the surge series to be independent in time and homogeneous in distribution, though subsequently we consider the effect of relaxing each of these assumptions. Now let
$$P_{n}= \left\{\left(\frac{i}{n_{\rm o}},X_{i}\right):\;i=1,\ldots, n \right\}. $$
This defines a point process that comprises the surge measurement times, measured in years from the start of the recording period, and the surge levels. The points in Pn that occur in a region of the form
$${\mathcal A}_{u}=(0,n_{y})\times(u,\infty), $$
correspond to those events whose magnitude is greater than some threshold value u. For large u, therefore, the points of Pn falling in \({\mathcal A}_u\) correspond to the extreme events, where extreme now has a very precise meaning. The characterization obtained by Pickands (1971) is based on a limiting result for a scaled version of Pn restricted to \({\mathcal A}_u,\) for increasingly large values of u. Interpreting this limiting result as an approximation implies that, for large u, the process Pn on \({\mathcal A}_u\) is approximately Poisson with intensity function belonging to the family
$$\lambda(t,x)=\frac{1}{\sigma} \left[1+\xi \frac{(x-\mu)}{\sigma}\right]_+^ {-1/\xi-1}, \quad 0 < t < n_{y},\; x > u, $$
where σ>0, a+=max(a, 0) and t measures time in years.
Several things follow. First, by standard properties of the Poisson process, we obtain a likelihood function that provides a basis for parameter estimation:
$$L\left(\mu,\sigma,\xi;(t_1,x_1) \ldots,(t_{n_u},x_{n_u})\right)= \exp\left\{-\int_{{\mathcal A}_u}\lambda(t,x)\hbox{d}t\hbox{d}x\right\} {\prod\limits_{i = 1}^{n_{u} } \lambda (t_i,x_i)}, $$
where \(x_1,\ldots,x_{n_u}\) is an enumeration of the values of xi that exceed the threshold u, and \(t_1,\ldots,t_{n_u}\) are the associated times of these events. Defining the annual maximum of the surge process to be MX, with \(M_X=\max\{X_1,\ldots,X_{n_{\rm o}}\},\) it follows from the point process representation that the distribution of MX is generalized extreme value (GEV) with parameters μ,σ,ξ (Coles 2001, for example). That is,
$$\hbox{pr}(M_X\leq x) =\exp\left[-\left\{1+\xi \frac{(x-\mu)}{\sigma}\right\}_+^{-1/\xi}\right]. $$
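For reference, the GEV distribution function above and its inverse (the T-year return level, i.e. the solution of pr(M_X ≤ x) = 1 − 1/T) can be coded directly. A minimal sketch for ξ ≠ 0, with hypothetical parameter values that are not estimates from the paper:

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """pr(M_X <= x) under the GEV with parameters (mu, sigma, xi), xi != 0."""
    z = 1.0 + xi * (x - mu) / sigma
    if z <= 0.0:
        # below the lower endpoint (xi > 0) or above the upper endpoint (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-z ** (-1.0 / xi))

def return_level(T, mu, sigma, xi):
    """Level exceeded by the annual maximum with probability 1/T in any year,
    i.e. the solution of pr(M_X <= x) = 1 - 1/T."""
    y = -math.log(1.0 - 1.0 / T)
    return mu + sigma * (y ** (-xi) - 1.0) / xi

# Hypothetical illustrative values
mu, sigma, xi = 1.2, 0.25, 0.05
x100 = return_level(100.0, mu, sigma, xi)   # 100-year return level
```

Plugging the return level back into the distribution function recovers the target probability 1 − 1/T, which provides a simple check of the algebra.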
Similar representations hold when processes are stationary but temporally dependent, subject to limitations on long-range dependence at extreme levels. In this case there is a similar limiting Poisson result, except that points in the limiting process, corresponding to the extreme events, can form clusters. The mean number of points per cluster is denoted \(\phi^{-1}\), and ϕ∈(0,1] is termed the extremal index. It then follows that
$$\begin{aligned} \hbox{pr}(M_X\leq x) =&\exp\left[-\phi\left\{1+\xi \frac{(x-\mu)} {\sigma}\right\}_+^{-1/\xi}\right] \\ =&\exp\left[-\left\{1+\xi\frac{(x-\mu_{\phi})} {\sigma_{\phi}}\right\}_+^{-1/\xi}\right], \end{aligned} $$
where \(\mu_{\phi}=\mu+\sigma(\phi^{\xi}-1)/\xi\) and \(\sigma_{\phi}=\sigma\phi^{\xi}\). Consequently, temporal dependence impacts only on model parameters, which in any case have to be inferred, and not on the class of extremal models itself. It should be noted, however, that there are many examples of dependent series for which there is no local clustering of extreme values and consequently ϕ=1. Ledford and Tawn (2003) give examples of series of this type as well as diagnostic methods for distinguishing between the cases ϕ<1 and ϕ=1 from data. When ϕ is identified to be 1, it is appropriate to model extremes of a series as if the series were independent.
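The equality of the two expressions above, with μϕ = μ + σ(ϕ^ξ − 1)/ξ and σϕ = σϕ^ξ, can be verified numerically; a quick sketch with arbitrary illustrative values:

```python
import math

# Arbitrary illustrative values (not estimates from the paper)
mu, sigma, xi, phi, x = 0.8, 0.3, 0.1, 0.6, 1.5

# First form: exp[ -phi {1 + xi (x - mu)/sigma}^{-1/xi} ]
lhs = math.exp(-phi * (1.0 + xi * (x - mu) / sigma) ** (-1.0 / xi))

# Second form: plain GEV with the adjusted parameters mu_phi and sigma_phi
mu_phi = mu + sigma * (phi ** xi - 1.0) / xi
sigma_phi = sigma * phi ** xi
rhs = math.exp(-(1.0 + xi * (x - mu_phi) / sigma_phi) ** (-1.0 / xi))
```

The two forms agree to machine precision, confirming that the extremal index can be absorbed entirely into the location and scale parameters.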

It follows from this discussion that there is a mutual consistency between the classical annual maxima model and the point process model for extremes. The models even have the same parameters, enabling inference to be performed on one of the models, but interpretation given in terms of the other. This is convenient, since inference based on the point process model uses all data that have exceeded some threshold u, which is likely to be a larger set than the set of annual maxima. Consequently, such inference is likely to be more precise than an inference based only on the annual maximum data, but since the parameters of the two models are the same, we can present results in the more usual form of the estimated distribution of the annual maximum. Working with the point process model, however, means that the issue of stationarity becomes more critical, and in the presence of seasonality this has positive and negative aspects. On the negative side, if there is seasonality and it is ignored, inferences based on the point process model are likely to be biased. This contrasts with the annual maximum model, for which the asymptotic basis of the model is also undermined by seasonality, but for which the process of modelling just the annual maximum reduces the impact of model misspecification. On the positive side, the point process machinery provides an opportunity to model the seasonality and to incorporate its effects into the extremal analysis, enabling both aggregated and time-specific estimates of extreme behaviour.

Previous extreme value analyses that have considered seasonal effects include Carter and Challenor (1981) and Coles et al. (1994). Within the point process framework, a natural setup for seasonality proposed by Smith (1989) is to modify the intensity function 1 to
$$\lambda(t,x)=\frac {1}{\sigma(t)}\left[1+\xi(t) \frac{(x-\mu(t))}{\sigma(t)}\right]_+^ {-1/\xi(t)-1}, \quad 0<t<n_y,\;x>u(t), $$
where now μ(t), σ(t) and ξ(t) are parametric functions of time with period 1 year and u(t) is a time-dependent threshold. It follows that the tail of the distribution of a variable in the Xj series that occurs at time t may be approximated as
$$\hbox{pr}(X(t)<x \;|\; \mu(t),\sigma(t),\xi(t))\approx \exp\left[-\frac{1}{n_{o}} \left\{1+\xi(t) \frac{(x-\mu(t))} {\sigma(t)}\right\}_+^{-1/\xi(t)}\right], $$
for large enough x (Smith 1989). Then assuming temporal independence, the distribution of the annual maximum is
$$\hbox{pr}(M_X\leq x)=\prod_{j=1}^{n_{o}} \hbox{pr}(X(j/n_o)<x \;|\; \mu(j/n_o),\;\sigma(j/n_o),\;\xi(j/n_o)). $$
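As a consistency check, if μ(t), σ(t) and ξ(t) are held constant then the product over the n_o = 705 high tides in a year collapses back to the single GEV form of Eq. 3, since exp{−(1/n_o)A} multiplied n_o times is exp(−A). A sketch with hypothetical constant parameters:

```python
import math

n_o = 705  # high-tide events per year

def tail_pr(x, mu, sigma, xi):
    """Per-observation tail approximation pr(X(t) < x), constant parameters."""
    z = 1.0 + xi * (x - mu) / sigma
    return math.exp(-(1.0 / n_o) * max(z, 0.0) ** (-1.0 / xi))

def annual_max_cdf(x, mu, sigma, xi):
    """Product over the n_o high tides in a year."""
    p = 1.0
    for _ in range(n_o):
        p *= tail_pr(x, mu, sigma, xi)
    return p

# Hypothetical illustrative values
mu, sigma, xi = 1.0, 0.3, 0.1
gev = math.exp(-(1.0 + xi * (2.0 - mu) / sigma) ** (-1.0 / xi))
```

With seasonal parameter functions the product no longer simplifies, and the expression must be evaluated term by term exactly as above.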
Substituting Eq. 4 in Eq. 2 leads to a likelihood function in terms of the parameters of these models. One approach to inference is then simply to maximize this function with respect to the model parameters to obtain maximum likelihood estimates. Standard asymptotic likelihood theory can also be exploited to obtain standard errors and confidence intervals. Bayesian statistics provides an alternative form of inference. For a model with data x and parameter θ, Bayes’ theorem states simply
$$f(\theta \;|\; x)\propto f(\theta)f(x \;|\; \theta). $$

In this expression, f(θ) is a prior probability density function that expresses beliefs about θ prior to data observation, f(x | θ) is the likelihood function and f(θ | x) is the posterior density function for θ. Accordingly, Bayes’ theorem updates prior information with information contained in the data to give the posterior distribution, which is the complete inference. Depending on the application, it may be necessary to condense the information in the posterior distribution to provide a single point estimate of θ, or a range of plausible estimates. Usually, the posterior mean, median or mode is taken as a point estimator. Any interval [a, b] such that pr(a ≤ θ ≤ b | x)=1−α is termed a 100(1−α)% credibility interval for θ. The choice of specific point or credibility interval for any particular problem can naturally be set in a decision theoretic framework, leading to an optimal selection. In practice, however, choices are usually determined by pragmatic considerations, using whichever of the point estimators happens to be easiest to calculate, and a credibility interval that is in some sense central.

Arguments in favour of a Bayesian inference include the possibility of including external information through the prior distribution, the fact that inferences have a precise probabilistic interpretation and the extension of results to make predictive assessment. For further discussion in the context of modelling extreme surges see Coles and Tawn (2004). The two principal objections to a Bayesian analysis are first that it requires the specification of a prior distribution, and second that the proportionality factor in Eq. 7 makes computation difficult. The first of these points is a question of taste. In our view, the option to include additional non-data sources of information into the analysis in the form of a prior distribution is an advantage, while the fact that results are often stable across classes of prior distributions with large variance means that a Bayesian analysis, properly conducted, can still be regarded as objective. With regards to computation, many recent advances based on simulation strategies for inference have greatly increased the range of models that can be handled from a numerical point of view. In particular, the technique of MCMC provides a class of algorithms that are especially well suited to Bayesian computation; see Gilks et al. (1995) for a general discussion of MCMC methodology and Coles (2001) for a discussion in the specific context of extreme value modelling.

3 Simulation techniques for Bayesian inference

With a specified likelihood function f(x | θ) and prior f(θ) Bayesian inference requires the calculation of the posterior distribution, or perhaps a summary of it, via Bayes’ theorem Eq. 7. Except in very simple cases, direct computation is not feasible, as calculation of the proportionality factor requires integration of the right-hand side of Eq. 7 over the parameter space. If it is possible to simulate from f(θ | x) directly, we could simply use empirical summaries of the simulated sample to approximate the theoretical equivalents: the sample mean to approximate the posterior mean, marginal histograms to approximate marginal densities, and so on. Unfortunately, explicit simulation is not usually feasible. In such situations MCMC algorithms provide an indirect way of generating a sample from the posterior.

The basic trick of MCMC is to substitute simulation from f(θ | x) with simulation from something simpler, modifying the simulation rule in an appropriate way to ensure that the simulated values are related to f(θ | x), rather than the model from which they were simulated. A widely-used MCMC routine is the Metropolis–Hastings algorithm which takes the form:
  1. Specify θ0; set i=0.

  2. Set i=i+1; simulate θ* ∼ q(· | θi−1).

  3. Compute the acceptance probability
     $$\alpha=\min\left\{1,\frac{f(\theta^*)f(x\;|\;\theta^*)q(\theta_{i-1}\;|\;\theta^*)} {f(\theta_{i-1})f(x\;|\;\theta_{i-1})q(\theta^*\;|\;\theta_{i-1})}\right\} $$

  4. Set
     $$\theta _{i} = \left\{ \begin{aligned} & \theta ^{*} \quad {\text{with}}\;{\text{probability}}\;\alpha \\ & \theta _{i - 1} \quad {\text{with}}\;{\text{probability}}\;1 - \alpha \\ \end{aligned} \right. $$

  5. Return to 2.
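As an illustrative sketch (not the samplers used in the paper), the algorithm with a symmetric random-walk proposal, for which the q-ratio in α cancels, applied to a toy standard normal posterior:

```python
import math
import random

def metropolis(log_post, theta0, scale, n_iter, seed=0):
    """Random-walk Metropolis: q is symmetric, so its ratio cancels in alpha."""
    rng = random.Random(seed)
    theta, lp = theta0, log_post(theta0)
    chain = []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, scale)      # step 2: propose theta*
        lp_prop = log_post(prop)
        # steps 3-4: accept theta* with probability min{1, ratio}
        if math.log(rng.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        chain.append(theta)
    return chain

# Toy target: standard normal "posterior", purely for illustration;
# theta0 = 5.0 is a deliberately poor starting value
chain = metropolis(lambda t: -0.5 * t * t, theta0=5.0, scale=1.0, n_iter=20000)
burned = chain[2000:]                              # discard burn-in
post_mean = sum(burned) / len(burned)
```

Working with log densities avoids numerical underflow in the likelihood ratio, which matters for the point process likelihoods used later.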


The simulation of each θi is dependent on the value of θi-1, but given this value, it is independent of the previous history θ0,...,θi-2. This is the defining characteristic of a Markov chain. Under suitable regularity conditions, such chains converge to their stationary distribution. In other words, for large enough N, the sequence θN+1N+2,... can be regarded as a dependent series from a common distribution that is the stationary distribution for the chain.

The remarkable thing about the MCMC algorithm above is that, again subject to some regularity, for any choice of transition density q, the stationary distribution for the chain is the target posterior distribution f(θ | x). Consequently, for large N, the series θN+1N+2,... can be regarded as a sample, albeit of dependent values, from f(θ | x), and summarized to obtain, for example, posterior means and variances. The fact that this is true for essentially any choice of q means that q itself can be chosen for ease of simulation—a common choice is a simple random walk model. On the other hand, the stochastic acceptance or rejection of the proposal at step 4 means that the generated series has a limit behaviour that inherits the properties of f(θ | x) rather than q.

Of course, nothing comes for free. The flexibility of the MCMC algorithm is paid for in various ways. Primarily, the distribution of the generated θi only converges to the target posterior distribution. Such convergence may be slow (and therefore costly) and hard to identify (and therefore lead to doubts about model estimates). Furthermore, even after convergence, the series is dependent, meaning that many simulations may be needed to obtain reliable results. In practice this means that although q is arbitrary, some fine-tuning may be needed to obtain a chain that both converges reasonably rapidly and does not have excessively high dependence.

When θ is a vector of parameters, the algorithm is still valid, though it can be difficult to make good choices for q in this case. Because of this, it is usual to apply the MCMC algorithm to each component of θ in turn, choosing for each component a one-dimensional (and potentially different) transition model q. The point is that, whilst it may be difficult to propose moves in high dimensional space that explore regions of high density, explorations component-by-component are likely to be more efficient. There is a crucial caveat here though: if the posterior surface has contours that have strong dependencies or curvature in directions away from the coordinate axes, this version of the algorithm may have poor convergence properties. Think about a two-component model in which θ1 and θ2 are highly correlated. This means that the density of (θ1, θ2) lines up along the θ1=θ2 line. Consequently, component-wise proposals for θ1 and θ2 will always lead to departures away from the area of high density and are likely to be rejected. In this simple example, the original algorithm with proposal transitions in the two-dimensional space of (θ1, θ2) may be adequate, but in more complex problems this may also lead to difficulties. A common solution to this problem, if available, is to reparameterize the model. In this simple case, redefining ϕ1=θ1−θ2 and ϕ2=θ1+θ2 is likely to work well. The componentwise Markov chain on this space should have good convergence and mixing properties. Inference on the target parameters θ1 and θ2 is then obtained by applying the inverse transforms for ϕ1 and ϕ2 to the generated chain, and treating the resultant series in the usual way.
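The benefit of the rotation is easy to check on simulated values: for a highly correlated Gaussian pair with equal variances, the transformed coordinates ϕ1 = θ1 − θ2 and ϕ2 = θ1 + θ2 are uncorrelated. A toy sketch (all numbers arbitrary):

```python
import random

rng = random.Random(1)
n, rho = 20000, 0.95

# Simulate correlated Gaussian pairs (theta1, theta2) with correlation rho
theta1, theta2 = [], []
for _ in range(n):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    theta1.append(z1)
    theta2.append(rho * z1 + (1.0 - rho ** 2) ** 0.5 * z2)

def corr(a, b):
    """Sample correlation of two equal-length lists."""
    m = len(a)
    ma, mb = sum(a) / m, sum(b) / m
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / m
    va = sum((x - ma) ** 2 for x in a) / m
    vb = sum((y - mb) ** 2 for y in b) / m
    return cov / (va * vb) ** 0.5

# Rotate to phi1 = theta1 - theta2, phi2 = theta1 + theta2
phi1 = [x - y for x, y in zip(theta1, theta2)]
phi2 = [x + y for x, y in zip(theta1, theta2)]
```

Componentwise proposals on (ϕ1, ϕ2) then explore the posterior far more efficiently than proposals on the original, strongly correlated coordinates.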

4 Bayesian analysis of seasonal surge data

For the reasons discussed in Sect. 1, our raw data comprise a series of surge measurements recorded at the time of high tides. Our objective is to analyse such data, at each of 11 locations, via a Bayesian inference of the point process model specified in Sect. 2.

When applying the point process likelihood an important issue is threshold selection. This choice comprises a bias-variance tradeoff. High thresholds have reliable asymptotics and therefore low bias, but there will be few threshold exceedances and therefore high sampling variance. In contrast, and for the opposite reasons, low thresholds lead to biased estimates, but with low sampling variability. For stationary processes there are simple diagnostics to support threshold choice (Davison and Smith 1990). For non-stationary processes the situation is more difficult. Figure 1 shows the high-tide surges for each site plotted against time within year. Clearly, at all sites, the process is strongly seasonal. Taking constant thresholds in this case will generate more exceedances in the winter period than the summer, leading to an unbalanced bias-variance tradeoff over the year. Intuitively, it makes more sense to select a time-varying threshold, u(t), that also has period 1 year and that has an approximately uniform crossing rate throughout the year.
Fig. 1

High-tide surge observations plotted against within-year time of occurrence for each site on the eastern UK coastline. Superimposed are time varying thresholds calculated as explained in text to achieve a uniform crossing rate of 5%. Sites, from north to south, are Wick (Wic), Aberdeen (Abe), North Shields (Nor), Whitby (Whi), Immingham (Imm), Lowestoft (Low), Harwich (Har), Walton (Wal), Southend (Sou), Sheerness (She) and Dover (Dov)

In principle, we could adopt a quantile regression method to estimate u(t) for some fixed small crossing rate (Koenker and Hallock 2001, for example). In practice, this level of sophistication is probably unnecessary: the Poisson process model is invariant to threshold choice, provided the threshold is high enough. Consequently, we have adopted a simpler approach. We divide the days of the year into a winter and summer period, roughly October–March and April–September respectively. We then specify
$$u(t;a,b)=a+b\cos(2\pi t), $$
where again t measures time in units of years. Consequently, Eq. 8 corresponds to an annual cycle peaking at 1 January, as seems reasonable from Fig. 1 for all sites. Finally, we choose a and b so as to obtain a fixed uniform crossing rate over the winter and summer periods. This requires the solution of two non-linear equations. For computational purposes, we found it easier to minimize the function
$$h(a,b)=\max\{(p_s(a,b)-q)^2,(p_w(a,b)-q)^2\}, $$
where ps(a,b) and pw(a,b) are the proportion of exceedances of the threshold u(t;a,b) in the summer and winter periods respectively, and q is the desired uniform threshold crossing rate. The discreteness in the data leads to non-unique solutions for a and b, but since the idea is just to get a threshold that has a crossing rate that is only approximately uniform, this is not problematic. With q=0.05, the thresholds obtained in this way are included in Fig. 1. The estimated coefficients a and b are plotted as a function of coastal position in Fig. 2. For each site, the parameters a and b can be interpreted respectively as the median threshold level and the scale of variation between peak summer and winter periods. Though this aspect of the analysis is purely empirical, we already see spatial cohesion in each of these characteristics.
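The minimization of h(a, b) can be carried out by any crude numerical method. Below is a sketch on synthetic data; the series, the noise level and the search grid are all invented for illustration, and the paper's surge data are not reproduced:

```python
import math
import random

rng = random.Random(2)
n_years, n_o = 5, 705
times = [i / n_o for i in range(n_years * n_o)]          # time in years
cosv = [math.cos(2.0 * math.pi * t) for t in times]
# Synthetic surge-like series with a winter-peaking annual cycle (illustration only)
surges = [0.4 * c + rng.gauss(0.0, 0.3) for c in cosv]
# Winter: roughly Oct-Mar, i.e. within-year time in [0, 0.25) or [0.75, 1)
winter = [(t % 1.0) < 0.25 or (t % 1.0) >= 0.75 for t in times]

q = 0.05  # target crossing rate in each half-year

def h(a, b):
    """Max of squared deviations of the summer/winter crossing rates from q."""
    cnt = {True: 0, False: 0}
    exc = {True: 0, False: 0}
    for c, w, x in zip(cosv, winter, surges):
        cnt[w] += 1
        exc[w] += x > a + b * c        # threshold u(t) = a + b cos(2 pi t)
    p_w, p_s = exc[True] / cnt[True], exc[False] / cnt[False]
    return max((p_s - q) ** 2, (p_w - q) ** 2)

# Crude grid search in place of solving the two non-linear equations
a_grid = [0.30 + 0.02 * i for i in range(21)]
b_grid = [0.20 + 0.02 * i for i in range(21)]
h_min, a_hat, b_hat = min(((h(a, b), a, b) for a in a_grid for b in b_grid),
                          key=lambda r: r[0])
```

As noted in the text, the discreteness of the data makes the minimizer non-unique, but any (a, b) with a small value of h delivers an approximately uniform crossing rate, which is all that is required.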
Fig. 2

Coefficients a and b in varying threshold specification as functions of location. Distances between annotated locations are in proportion to coastal distances between sites

Having specified thresholds for each site, we now fit the seasonal point process model described in Sect. 2. The model specification is completed by specifying prior distributions with large variances to reflect prior ignorance. However, as discussed in Sect. 3, the efficient implementation of an MCMC algorithm to make inferences on the model may depend on the model parameterization. Consider first the stationary point process model with intensity function 1. Though (μ,σ,ξ) represents a natural parameterization of the model, giving simple connections to the GEV annual maximum model Eq. 3, this parameterization is not well suited to the component-by-component implementation of an MCMC algorithm as described in Sect. 3. The difficulty here is that there is a large amount of information in the data concerning basic threshold crossing rates in the point process compared with information about the scale and form of threshold exceedances. This means that the likelihood function and, by extension, the posterior distribution have contours of high density that lie along paths in the (μ,σ,ξ) space corresponding to constant crossing rates. The strong non-linearity in such contours then creates difficulties for MCMC algorithms parameterized in terms of μ, σ and ξ, which are likely therefore to have poor convergence and mixing properties.

The natural way to resolve this difficulty is to reparameterize the model so that the threshold crossing rate, or some function of it, is one of the parameters updated in the MCMC algorithm. The crossing rate (per observation) is given by
$$\lambda=1- \exp\left[-n_{o}^{-1}\left\{1+\xi\frac {(u-\mu)}{\sigma}\right\}^{-1/\xi}\right], $$
but it is convenient to avoid boundary constraints by transforming to the parameter
$$\tau=\log\{-n_{o}\log(1-\lambda)\}. $$
We can therefore reparameterize the model in terms of μ,τ and ξ, run the MCMC on these parameters, and recapture the inference on σ by transforming back to
$$\sigma=\frac{\xi(u-\mu)}{\exp(-\xi\tau)-1}. $$

The performance of a standard MCMC algorithm on this version of the model is likely to be far superior to one based on the original (μ, σ, ξ) parameterization.
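In code, the reparameterization amounts to a pair of deterministic transforms between (μ, σ, ξ) and (μ, τ, ξ); the roundtrip σ → λ → τ → σ recovers the scale parameter exactly. A sketch with n_o = 705 high-tide observations per year as in the text, and otherwise arbitrary values:

```python
import math

n_o = 705  # high-tide observations per year

def crossing_rate(mu, sigma, xi, u):
    """Per-observation threshold crossing rate lambda."""
    return 1.0 - math.exp(-(1.0 / n_o)
                          * (1.0 + xi * (u - mu) / sigma) ** (-1.0 / xi))

def tau_from_rate(lam):
    """tau = log{-n_o log(1 - lambda)}, removing the boundary constraints."""
    return math.log(-n_o * math.log(1.0 - lam))

def sigma_from_tau(mu, tau, xi, u):
    """Back-transform from (mu, tau, xi) to the scale parameter sigma."""
    return xi * (u - mu) / (math.exp(-xi * tau) - 1.0)

# Arbitrary illustrative values (not estimates from the paper)
mu, sigma, xi, u = 0.5, 0.2, 0.08, 1.0
lam = crossing_rate(mu, sigma, xi, u)
tau = tau_from_rate(lam)
sigma_back = sigma_from_tau(mu, tau, xi, u)
```

An MCMC sampler would update (μ, τ, ξ) and call the back-transform whenever the intensity or likelihood is evaluated, so that inference on σ is recovered for free from the generated chain.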

The situation is slightly more complicated in the non-stationary model because of the non-constant threshold. However, similar arguments can be used to guide sensible periodic models. For example, we might hypothesise that the location parameter μ(t) is sinusoidal,
$$\mu(t)=\mu_0+\mu_1\cos(2\pi t-\mu_{2}) $$
and that the shape parameter is constant,
$$\xi(t)=\xi_0. $$
Now, since the threshold u(t) has already been selected to have a constant threshold exceedance rate λ, it follows immediately that
$$\sigma(t)=\frac{\xi_0\{u(t)-\mu(t)\}}{\exp(-\xi_0\tau)-1}. $$

Thus, the seasonal variation in σ(t) requires no additional parameterization; its form is a consequence of the variation in μ(t) and the form chosen for u(t) to achieve a constant crossing rate. This formulation differs from, say, Coles and Tawn (2004), where non-linked models for μ(t) and σ(t) were adopted. There are pros and cons. On the positive side, if the crossing rate of u(t) is genuinely constant and the model for μ(t) is correct, then model 12 is exact, and it is wasteful to construct and infer a parametric model to approximate σ(t). On the downside, the model is likely to have some sensitivity to these assumptions. With the surge data, the preliminary inference on u(t) seems quite satisfactory at each site, lending weight to the use of Eq. 12. This means that the MCMC can be parameterized simply in terms of (μ012,τ,ξ0). In principle, since the crossing rate is pre-selected as λ=0.05, we could replace τ with the fixed value log{−log (0.95)}, but retaining it as a parameter enables the uncertainty in threshold crossing rates to be incorporated into the inference.
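Given draws of (μ0, μ1, μ2, τ, ξ0) and the threshold coefficients (a, b), the seasonal scale σ(t) follows directly from Eq. 12 with no further parameters. A sketch with hypothetical values (not the paper's estimates):

```python
import math

def sigma_t(t, a, b, mu0, mu1, mu2, tau, xi0):
    """Seasonal scale sigma(t) implied by the linked model."""
    u_t = a + b * math.cos(2.0 * math.pi * t)                # threshold u(t)
    mu_t = mu0 + mu1 * math.cos(2.0 * math.pi * t - mu2)     # location mu(t)
    return xi0 * (u_t - mu_t) / (math.exp(-xi0 * tau) - 1.0)

# Hypothetical illustrative values (not the paper's estimates)
a, b = 0.5, 0.3
mu0, mu1, mu2, xi0 = 1.2, 0.4, 0.0, 0.08
tau = math.log(-705 * math.log(0.95))   # approx 3.59, per the text

grid = [i / 100.0 for i in range(100)]  # within-year times
sig = [sigma_t(t, a, b, mu0, mu1, mu2, tau, xi0) for t in grid]
```

In a full analysis this function would be evaluated at each posterior draw, giving a posterior distribution for σ(t) at every time of year rather than a single curve.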

Working with this parameterization and using the Metropolis–Hastings algorithm with simple random walk proposals for each parameter, we obtained realizations of Markov chains with good mixing properties at each of the 11 locations. As an example, Fig. 3 shows the generated chain for each parameter based on the data from the most northerly site, Wick. The plots show rapid convergence from the initial states and good mixing properties thereafter. On the basis of similar plots at each site, our subsequent analyses are based on 5,000 iterations after an initial burn-in period of 1,000 iterations. There is some apparent residual correlation between the output for μ1 and μ2 that could perhaps have been reduced by further reparameterization, but the effect does not seem serious enough to warrant this. Figure 4 shows the posterior mean estimates for each parameter across sites together with 95% credibility intervals. The patterns of spatial variation in μ0 and ξ are similar to the analogous parameters in Coles and Tawn (2004), but with some differences in the numerical values, presumably due to the change in threshold specification and the original misspecification of a seasonally homogeneous model. The estimated value for τ (around 3.56) is virtually uniform across sites. This is in near perfect agreement with Eq. 9, since log {−705log(0.95)}=3.59.
Fig. 3

MCMC output for parameters in seasonal point process model for Wick
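The sampler itself is standard. The following is a minimal sketch of the component-wise random-walk Metropolis scheme described above, targeting a toy five-parameter log-density in place of the actual point process posterior (which depends on the surge data and Eqs. 10–12); the function name and toy target are illustrative only, not the fitted model:

```python
import math
import random

def rw_metropolis(log_post, init, scales, n_iter, seed=1):
    """Component-wise random-walk Metropolis sampler.

    log_post -- log posterior density (up to a constant) at a parameter vector
    init     -- starting values for the parameter vector
    scales   -- random walk proposal standard deviations, one per component
    """
    rng = random.Random(seed)
    theta = list(init)
    lp = log_post(theta)
    chain = []
    for _ in range(n_iter):
        for j in range(len(theta)):          # update each parameter in turn
            prop = list(theta)
            prop[j] += rng.gauss(0.0, scales[j])
            lp_prop = log_post(prop)
            # accept with probability min(1, posterior density ratio)
            if math.log(rng.random()) < lp_prop - lp:
                theta, lp = prop, lp_prop
        chain.append(list(theta))
    return chain

# Toy target (independent standard normals) standing in for the posterior of
# (mu0, mu1, mu2, tau, xi0); the real target combines the point process
# likelihood with the prior.
log_post = lambda th: -0.5 * sum(x * x for x in th)
chain = rw_metropolis(log_post, init=[2.0] * 5, scales=[1.0] * 5, n_iter=6000)
kept = chain[1000:]   # discard burn-in, as in the analysis above
```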
Fig. 4

Points are posterior mean estimates of the seasonal point process model as a function of location. Vertical bars are the corresponding 95% credibility intervals. Distances between annotated locations are in proportion to coastal distances between sites

The information on seasonal variability is contained in the plots for μ1 and μ2. First, looking at μ2, each credibility interval contains the value 0. Consequently, there would be little to lose in setting μ2=0 in Eq. 10, corresponding to maxima in the annual cycles occurring on 1 January, as with the threshold settings. In contrast, μ1 shows strong spatial variation. Furthermore, the values at each site are clearly quite distinct from zero, reinforcing the evidence for seasonal variation. At first sight, the spatial variation in μ1 appears to match very closely that of μ0, but in detail the situation is rather more complex. Figure 5 shows posterior means and marginal 95% credible intervals for the pairs (μ01) across locations. Though there is an obvious tendency for μ1 to increase with μ0, the relationship is non-linear and possibly non-monotone. Formal testing of linear or non-linear relationships is beyond the scope of our analysis here, requiring the specification and modelling of spatial dependence relationships.
Fig. 5

Posterior mean estimates across locations with marginal 95% credibility intervals for μ0 (horizontal) and μ1 (vertical) shown as lines

Finally, we recall that seasonal variation in the parameter σ is completely specified as a consequence of the chosen threshold and the estimated model parameters via Eq. 12. As an example, Figure 6 shows the estimated model for σ(t) for the site of Lowestoft, where seasonal variation is greatest.
Fig. 6

Seasonal variation in σ(t), based on posterior mean of point process model estimates at Lowestoft

5 Seasonal modelling of extreme sea levels

The seasonal model for surges enables the probability of an extreme surge event to be calculated for a specific time point (Eq. 5) or over an aggregated time period such as a year (Eq. 6). In Coles and Tawn (2004) we focused on the latter, arguing the case for a predictive estimate, obtained by averaging Eq. 6 over the posterior distribution of the model parameters. In applications, however, the distribution of surges in isolation is usually of less interest than the distribution of the total sea level.

In principle, the sea level distribution can be obtained as a convolution of the surge and tide distributions (Tawn 1992). In practice, this procedure is complicated by seasonality, which affects tidal distributions as well as surge distributions. We illustrate with the data from Lowestoft. Figure 7 shows boxplots of the tidal levels stratified into approximately fortnightly within-year periods, for the same period of observations for which the surges are available. All aspects of the tidal distribution are seen to change smoothly across the year. In particular, the maximum tidal level varies by around 20 cm, with a minimum in April and a maximum in October. Since the highest tides tend to occur in a period when the surge distribution is also relatively large (though, perhaps fortunately, not at its largest), a naive calculation that ignored seasonality would be likely to underestimate extreme sea levels.
Fig. 7

Box-plots of tidal observations at Lowestoft stratified by (approx.) fortnightly period

The standard convolution argument can be adapted to incorporate seasonality in the following way. Write Zj=Xj+Yj, where Xj, Yj and Zj correspond to surge, tide and sea level at the jth high tide in a year, and let \(M_Z=\max\{Z_1,\ldots,Z_{n_o}\}\), where n_o is the number of high tides in a year, be the annual maximum of the sea level process. By restricting analysis to events on high tides it is reasonable to assume that Xj is independent of the corresponding tidal level Yj, so that
$$\hbox{pr}\left(Z_j\le z\;|\;\theta\right)=\hbox{pr}\left(X_j\le z-Y_j\;|\;\theta\right) $$
where, for large z, the right hand side is given by approximation 5, with θ corresponding to the parameters of the seasonal model for extreme surges. Though the tidal process is deterministic, its value at a given time in an arbitrary year may be thought of as random. Consequently, we treat all high tide observations within a window centred on time t (in fact we use a 2-week window aggregated across years) as a sample from the high tide distribution at this time point. It follows that
$$\hbox{pr}\left(Z_j\le z\;|\;\theta\right)=\frac{1}{K}\sum_{k=1}^{K}\hbox{pr}\left(X_j\le z-y_{j,k}\;|\;\theta\right) $$
where yj,1, ... ,yj,K are the K high tide observations obtained from the window based on the time of occurrence of the variable Zj.
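As a sketch, the window-averaged probability above is a one-line mixture. Here `pr_surge_tail` is a hypothetical stand-in for approximation 5 (an exponential exceedance tail above a threshold u, with crossing probability 0.05), not the paper's fitted seasonal model; only the tail region z − y > u matters for large z:

```python
import math

def pr_sea_level_below(z, tides, pr_surge_below, theta):
    """pr(Z_j <= z | theta): average the surge distribution over the K
    high-tide observations y_{j,1},...,y_{j,K} from the fortnightly window."""
    return sum(pr_surge_below(z - y, theta) for y in tides) / len(tides)

# Hypothetical stand-in for approximation 5: an exponential exceedance
# tail above threshold u, with crossing probability 0.05.
def pr_surge_tail(x, theta):
    u, sigma = theta
    return 1.0 - 0.05 * math.exp(-max(x - u, 0.0) / sigma)

# Example: sea level 3.0 m, three tide observations in the window.
p = pr_sea_level_below(3.0, [1.8, 2.0, 2.2], pr_surge_tail, theta=(0.5, 0.3))
```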
We next make the working assumption that the Xj series is temporally independent. As discussed in Sect. 2, so far as extremal modelling goes, this is reasonable provided the extremal index, ϕ, is close to 1. Tawn (1992) showed that the hourly surge series has an extremal index less than 1, with clusters of extreme events typically lasting between 2 and 6 h. However, it is the series of surges on high tides that we are modelling here, and there are theoretical arguments that suggest this filtered series should have an extremal index close to 1 (Robinson and Tawn 2000). By standard probability arguments, the distribution of MZ is therefore
$$\hbox{pr}\left(M_Z\leq z\;|\;\theta\right)= \prod_{j=1}^{n_o}\left\{\frac{1}{K}\sum_{k=1}^{K}\hbox{pr}\left(X_j\le z-y_{j,k}\;|\;\theta\right)\right\}$$
Finally, taking account of the estimation of the surge distribution on the basis of the information available, that is the data and the information contained in the prior, which we collectively denote by \({\mathcal H},\) the predictive estimate of this distribution is
$$\hbox{pr}\left(M_Z\leq z\;|\;{\mathcal H}\right)= \int \prod_{j=1}^{n_{\rm o}}\left\{\frac{1}{K}\sum_{k=1}^{K}\hbox{pr}\left(X_j\le z-y_{j,k}\;|\;\theta\right)\right\} f(\theta\;|\;{\mathcal H})\,\hbox{d}\theta, $$
where \(f(\theta\;|\; {\mathcal H})\) is the posterior density of θ. Although this calculation is analytically infeasible, an estimate of \(f(\theta\;|\;{\mathcal H})\) is implicit in the post burn-in MCMC output, θ1,...,θI say. We finally obtain, therefore,
$$\hbox{pr}(M_Z \leq z\;|\; {\mathcal H})\approx \frac{1} {I}\sum_{i=1}^{I} \prod_{j=1}^{n_o}\left\{\frac{1} {K}\sum_{k=1}^{K}\hbox{pr}\left(X_j\le z-y_{j,k}\;|\;\theta_{i}\right)\right\}, $$
where pr(Xj≤z−yj,k | θi) is obtained from Eq. 5. Though conceptually simple, this calculation is computationally demanding, especially as it needs to be repeated across a range of values of z.
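The Monte Carlo approximation above is a double loop over MCMC draws and high tides. The sketch below shows its structure; the draws, tide windows, and surge distribution function are all placeholders for the fitted quantities, and the final check uses a constant surge distribution so that the product reduces transparently to a power:

```python
def predictive_pr_max_below(z, theta_draws, tide_windows, pr_surge_below):
    """Monte Carlo predictive probability pr(M_Z <= z | H): average over the
    I post burn-in draws of the product over the n_o high tides of the
    window-averaged surge probabilities."""
    total = 0.0
    for theta in theta_draws:                          # i = 1,...,I
        prod = 1.0
        for tides in tide_windows:                     # j = 1,...,n_o
            prod *= sum(pr_surge_below(z - y, theta)
                        for y in tides) / len(tides)
        total += prod
    return total / len(theta_draws)

# Transparent check: with a constant surge distribution of 0.99 and three
# high tides, the result must be 0.99 cubed.
draws = [(0.5, 0.3), (0.6, 0.25)]                      # hypothetical MCMC output
windows = [[1.8, 2.0], [1.9, 2.1], [2.0, 2.2]]         # hypothetical tide windows
p = predictive_pr_max_below(5.0, draws, windows, lambda x, th: 0.99)
```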
As we argued in Coles and Tawn (2004), a natural way to present the results of a Bayesian inference on extremes is via a predictive return level plot. A standard return level plot is a graph of zp against −1/log(1−p) on a logarithmic scale, where \(p=1-\hbox{pr}(M_Z\leq z_p\;|\; \hat{\theta})\) is calculated on the basis of a point estimate \(\hat{\theta},\) such as the maximum likelihood estimate. Since −1/log(1−p)≈ 1/p for small p, it follows that the level zp is expected to be exceeded approximately once every 1/p years, hence the terms ‘return level’ and ‘return period’. The predictive return level plot is produced identically, except that p and zp are related by
$$p=1-\hbox{pr}\left(M_Z\leq z_p\;|\; {\mathcal H}\right) $$
according to Eq. 13. Predictive return level plots therefore include implicitly the uncertainty due to parameter estimation. The plot for Lowestoft, obtained by evaluating Eq. 14 across a range of values of z, is shown in Fig. 8. We believe this to be the first time such an estimate has been made that incorporates seasonal variation in both surge and tide processes.
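A predictive return level curve of this kind can be traced by evaluating the annual maximum probability over a grid of levels z and transforming to the return period scale. In the sketch below, the distribution function supplied is an illustrative Gumbel cdf, not the Lowestoft fit; in practice it would be the Eq. 14 computation:

```python
import math

def return_level_curve(z_grid, predictive_pr):
    """Pairs (return period, level): with p = 1 - pr(M_Z <= z | H), the
    period is -1/log(1 - p), so level z is exceeded about once per period."""
    curve = []
    for z in z_grid:
        p = 1.0 - predictive_pr(z)
        if 0.0 < p < 1.0:
            curve.append((-1.0 / math.log(1.0 - p), z))
    return curve

# Illustrative annual maximum distribution (Gumbel, location 3.0, scale 0.2).
cdf = lambda z: math.exp(-math.exp(-(z - 3.0) / 0.2))
curve = return_level_curve([3.5, 4.0, 4.5], cdf)  # return periods grow with z
```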
Fig. 8

Predictive return level plot of annual maximum total sea level at Lowestoft

6 Conclusions

We have extended the seasonal analysis of extreme surges discussed in Coles and Tawn (2004) to a network of 11 sites along the eastern UK coastline. To maintain the advantages of a Bayesian form of inference we developed an MCMC algorithm for computation. However, careful consideration of the form of this algorithm led us to certain conclusions about appropriate forms for seasonal extreme value models that appear not to have been discussed before. In particular, the identification of smoothly varying quantile levels for thresholds has implications for structural forms on extreme value parameters which, if ignored, can lead to MCMC output with very poor convergence properties. These conclusions are likely to have implications for the study of the extremes of any process that displays strong seasonality.

The surge results themselves indicate some similarities in the seasonal effects across locations, but also some differences. The timing of the seasonal cycle appears to be uniform across sites, but the magnitude of this cycle varies, with sites having the more extreme surges also having the greatest difference between summer and winter surge levels. Though all aspects of the surge process appear to vary smoothly along the coastline, we have not attempted to build formal spatial models. Largely, this is because of the difficulty in building models that properly account for spatial dependence, but the limited number of data locations is also a restriction.

Finally, we have shown how the fitted seasonal model for surges can be combined with an observed tidal record to produce an estimate of the annual maximum sea level distribution. The calculation is conceptually simple but computationally demanding. To our knowledge, however, this is the first time such a calculation has been made taking into account the seasonal variation in both the surge and tide processes.


Subsequent repeat analyses verified that the results were essentially unchanged under order-of-magnitude changes to the variances.



We are grateful for the helpful and insightful comments of two anonymous referees. Our work was supported by grants connected to the projects Statistics as an aid for environmental decisions: identification, monitoring and evaluation (2002134337) funded by the Italian Ministry for Education and Methods for the analysis of extreme sea-levels and for coastal erosion (CPDA037217) funded by the University of Padova.

Copyright information

© Springer-Verlag 2005