# Seasonal effects of extreme surges

## Authors

Coles, S. & Tawn, J.

DOI: 10.1007/s00477-005-0008-3

Cite this article as: Coles, S. & Tawn, J. Stoch Environ Res Risk Assess (2005) 19: 417. doi:10.1007/s00477-005-0008-3

## Abstract

Extreme value analysis of sea levels is an essential component of risk analysis and protection strategy for many coastal regions. Since the tidal component of the sea level is deterministic, it is the stochastic variation in extreme surges that is the most important to model. Historically, this modelling has been accomplished by fitting classical extreme value models to series of annual maxima data. Recent developments in extreme value modelling have led to alternative procedures that make better use of available data, and this has led to considerably refined estimates of extreme surge levels. However, one aspect that has been routinely ignored is seasonality. In an earlier study we identified strong seasonal effects at one of a number of locations along the eastern coastline of the United Kingdom. In this article, we discuss the construction and inference of extreme value models for processes that include components of seasonality in greater detail. We use a point process representation of extreme value behaviour, and set our inference in a Bayesian framework, using simulation-based techniques to resolve the computational issues. Though relatively recent, these techniques are now widely used for extreme value modelling. However, the issue of seasonality requires delicate consideration of model specification and parameterization, especially for efficient implementation via Markov chain Monte Carlo algorithms, and this issue seems not to have been much discussed in the literature. In the present paper we make some suggestions for model construction and apply the resultant model to study the characteristics of the surge process, especially in terms of its seasonal variation, on the eastern UK coastline. Furthermore, we illustrate how an estimated model for seasonal surge can be combined with tide records to produce return level estimates for extreme sea levels that account for seasonal variation in both the surge and tidal processes.

### Keywords

Bayesian statistics · Extreme values · Point process · Sea level · Seasonality · Surge

## 1 Introduction

The various European coastlines that border the North Sea are especially prone to flooding. The causes of this are compound: some regions are very low lying; tidal ranges are large in some areas; and meteorological depressions causing large surge events can be severe. To some extent, the problems of low land levels and high tidal ranges can be mitigated, since these effects are predictable and quantifiable. More problematic is the effect of extreme surges, since both the timing and magnitude of such events are difficult to predict.

In a recent paper—Coles and Tawn (2004)—we reviewed the statistical methodology that has been developed for the analysis of extreme sea levels. The principal developments include the use of asymptotic models for extreme events, the separate analysis of the various sea-level components, the enhancement of the basic extreme value model to incorporate extra information into inference and the spatial mapping of model outputs. As well as reviewing, we provided a case study based on 11 sites on the eastern UK coastline, and focused on the spatial mapping of extremal characteristics of the surge distribution. The analysis, which we set in a Bayesian framework, stopped short of a fully spatial analysis as spatial dependence was not taken into account. Despite this, clear spatial patterns were identified. Focusing on one site in particular, Lowestoft, we also found evidence for a strong seasonal effect in surge behaviour. The aim of this paper, which complements Coles and Tawn (2004), is to address the issue of seasonal variation in the surge process in greater detail. Our aims are twofold: first, to discuss methodological aspects of modelling seasonality in extreme value processes that seem not to have been widely discussed in the literature; second, to explore the seasonal aspects of the surge process based on the same set of data analysed in our earlier paper.

The fact that the sea level is driven by different physical mechanisms, each having their own temporal and spatial dynamic, means that it is almost always more efficient to analyse the constituent processes separately. This concept is at the heart of the joint probability method (Pugh and Vassie 1979, 1980) in which the two principal sea level components—surge and tide—are separately modelled, prior to combination for an inference on total sea level. A complication is that surge and tide processes are generally dependent and it is necessary to modify the joint probability approach to take account of the interaction in the processes (Tawn and Vassie 1989). On the east coast of the UK, for example, the surge distribution tends to be damped at high tide. Fortunately, this affords the region some natural protection against excessive sea levels.

In Coles and Tawn (2004), where we also worked with the constituent processes of the sea level, we took a rather more pragmatic approach, filtering the surge series to obtain those values that coincided with a high tide. There are approximately two such events per day, or more precisely, 705 events per year. In this way, the high tide and coincident surge events can reasonably be treated as independent. Moreover, time-dependence within each series is considerably reduced, to the point where independence is a reasonable working approximation. The price to be paid for this simplification is that the surge series generated does not necessarily include the most extreme surges. On the other hand, since they are the ones that occur on the highest tide, it is likely that they are the ones leading to the greatest damage. In any event, it is the series of high-tide surge events that we continue to work with here.

In Sect. 2 we set out a basic modelling strategy for extreme surges. This is based on the latest ideas from extreme value modelling, linking a point process model for extremes to a Bayesian method of inference and adapted to allow for seasonality. There are many advantages to adopting a Bayesian inference, but computation becomes more difficult. Recent years have seen the development of many simulation-based approaches, and in particular the technique of Markov chain Monte Carlo (MCMC). This method is outlined in Sect. 3. However, the efficiency of MCMC algorithms can depend dramatically on the parameterization of a model, and this has serious implications for extreme value models, especially once seasonality is incorporated. This is discussed in Sect. 4, where we make some novel suggestions about parameterization and apply the resultant model to surge series at sites on the eastern UK coastline. Informally, we are able to identify the spatial characteristics of the seasonal variation. Finally, in Sect. 5, we turn to the issue of sea level modelling. In principle, due to the independence of our tide and surge events, the sea level distribution can be obtained as a convolution of the tide and surge distributions. In practice this is complicated not just by seasonality in the surge process, but also by seasonality in the tide process. We illustrate by example how this issue can be handled.

## 2 Seasonal models for extremes

Consider a series \(X_1, X_2,\ldots, X_n\) in which each \(X_i\) is the surge recorded at a single location at a time of high tide. If there are \(n_{\rm o}\) observations per year, then the number of years of observation is given by \(n_y=n/n_{\rm o}\). Initially we assume the surge series to be independent in time and homogeneous in distribution, though subsequently we consider the effect of relaxing each of these assumptions. Now let \(P_n\) be the point process

$$P_n=\left\{\left(i/n_{\rm o},\,X_i\right): i=1,\ldots,n\right\},$$

whose first coordinate records the time of each observation in years, and consider the points of \(P_n\) that occur in a region of the form \({\mathcal A}_u=[0,n_y]\times(u,\infty)\) for a high threshold *u*. For large *u*, therefore, the points of \(P_n\) falling in \({\mathcal A}_u\) correspond to the extreme events, where extreme now has a very precise meaning. The characterization obtained by Pickands (1971) is based on a limiting result for a scaled version of \(P_n\) restricted to \({\mathcal A}_u,\) for increasingly large values of *u*. Interpreting this limiting result as an approximation implies that, for large *u*, the process \(P_n\) on \({\mathcal A}_u\) is approximately Poisson with intensity function belonging to the family

$$\lambda(t,x)=\frac{1}{\sigma}\left[1+\frac{\xi(x-\mu)}{\sigma}\right]_+^{-1/\xi-1},\quad(1)$$

where \(a_+=\max(a,0)\) and *t* measures time in years. The corresponding likelihood for the parameters θ=(μ,σ,ξ) is

$$L(\theta)\propto\exp\left\{-n_y\left[1+\frac{\xi(u-\mu)}{\sigma}\right]_+^{-1/\xi}\right\}\prod_{i=1}^{n_u}\frac{1}{\sigma}\left[1+\frac{\xi(x_i-\mu)}{\sigma}\right]_+^{-1/\xi-1},\quad(2)$$

where \(x_1,\ldots,x_{n_u}\) are the \(n_u\) observations \(x_i\) that exceed the threshold *u*, and \(t_1,\ldots,t_{n_u}\) are the associated times of these events. Defining the annual maximum of the surge process to be \(M_X,\) with \(M_X=\max\{X_1,\ldots,X_{n_{\rm o}}\},\) it follows from the point process representation that the distribution of \(M_X\) is generalized extreme value (GEV) with parameters μ,σ,ξ (Coles 2001, for example). That is,

$$\hbox{pr}(M_X\leq x)\approx\exp\left\{-\left[1+\frac{\xi(x-\mu)}{\sigma}\right]_+^{-1/\xi}\right\}.\quad(3)$$

The assumption of temporal independence can also be relaxed. For a stationary series satisfying a weak condition limiting long-range dependence at extreme levels, the annual maximum distribution is modified only to

$$\hbox{pr}(M_X\leq x)\approx\exp\left\{-\phi\left[1+\frac{\xi(x-\mu)}{\sigma}\right]_+^{-1/\xi}\right\},$$

where extreme events now occur in clusters whose mean size is ϕ^{−1}, and ϕ∈(0,1] is termed the extremal index. It then follows that \(M_X\) again has a GEV distribution, with parameters μ_{ϕ}=μ+σ(ϕ^{ξ}−1)/ξ and σ_{ϕ}=σϕ^{ξ} in place of μ and σ. Consequently, temporal dependence impacts only on model parameters, which in any case have to be inferred, and not on the class of extremal models itself. It should be noted, however, that there are many examples of dependent series for which there is no local clustering of extreme values and consequently ϕ=1. Ledford and Tawn (2003) give examples of series of this type as well as diagnostic methods for distinguishing between the cases ϕ<1 and ϕ=1 from data. When ϕ is identified to be 1, it is appropriate to model extremes of a series as if the series were independent.
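The parameter adjustment implied by the extremal index can be verified directly: raising the GEV distribution function to the power ϕ reproduces a GEV with location μ_{ϕ}, scale σ_{ϕ} and unchanged shape ξ. A minimal numerical check (all parameter values purely illustrative):

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function exp{-[1 + xi*(x - mu)/sigma]_+^(-1/xi)}."""
    s = 1.0 + xi * (x - mu) / sigma
    if s <= 0.0:
        return 0.0 if xi > 0 else 1.0
    return math.exp(-s ** (-1.0 / xi))

# Illustrative values: GEV parameters of the independent series, extremal index phi
mu, sigma, xi, phi = 0.3, 0.12, 0.05, 0.7

# Adjusted parameters for the dependent series (the shape xi is unchanged)
mu_phi = mu + sigma * (phi ** xi - 1.0) / xi
sigma_phi = sigma * phi ** xi

# G(x)^phi equals the GEV with adjusted parameters, for any x
for x in [0.4, 0.6, 0.9, 1.5]:
    lhs = gev_cdf(x, mu, sigma, xi) ** phi
    rhs = gev_cdf(x, mu_phi, sigma_phi, xi)
    assert abs(lhs - rhs) < 1e-12
```

The identity holds exactly because 1 + ξ(x − μ_{ϕ})/σ_{ϕ} = {1 + ξ(x − μ)/σ}/ϕ^{ξ}, so the two distribution functions agree term by term.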

It follows from this discussion that there is a mutual consistency between the classical annual maxima model and the point process model for extremes. The models even have the same parameters, enabling inference to be performed on one of the models, but interpretation given in terms of the other. This is convenient, since inference based on the point process model uses all data that have exceeded some threshold *u*, which is likely to be a larger set than the set of annual maxima. Consequently, such inference is likely to be more precise than an inference based only on the annual maximum data, but since the parameters of the two models are the same, we can present results in the more usual form of the estimated distribution of the annual maximum. Working with the point process model, however, means that the issue of stationarity becomes more critical, and in the presence of seasonality this has positive and negative aspects. On the negative side, if there is seasonality and it is ignored, inferences based on the point process model are likely to be biased. This contrasts with the annual maximum model, for which the asymptotic basis of the model is also undermined by seasonality, but for which the process of modelling just the annual maximum reduces the impact of model misspecification. On the positive side, the point process machinery provides an opportunity to model the seasonality and to incorporate its effects into the extremal analysis, enabling both aggregated and time-specific estimates of extreme behaviour.

Seasonality is accommodated by allowing the model parameters to vary with time: the points of the surge process above a time-dependent threshold are modelled as a non-homogeneous Poisson process with intensity

$$\lambda(t,x)=\frac{1}{\sigma(t)}\left[1+\frac{\xi(t)\{x-\mu(t)\}}{\sigma(t)}\right]_+^{-1/\xi(t)-1},\quad(4)$$

where μ(*t*), σ(*t*) and ξ(*t*) are parametric functions of time with period 1-year and *u*(*t*) is a time-dependent threshold. It follows that the tail of the distribution of a variable in the \(X_j\) series that occurs at time *t* may be approximated as

$$\hbox{pr}(X_j>x)\approx\frac{1}{n_{\rm o}}\left[1+\frac{\xi(t)\{x-\mu(t)\}}{\sigma(t)}\right]_+^{-1/\xi(t)},\quad x>u(t),\quad(5)$$

for large *x* (Smith 1989). Then assuming temporal independence, the distribution of the annual maximum is

$$\hbox{pr}(M_X\leq x)\approx\prod_{j=1}^{n_{\rm o}}\left\{1-\frac{1}{n_{\rm o}}\left[1+\frac{\xi(t_j)\{x-\mu(t_j)\}}{\sigma(t_j)}\right]_+^{-1/\xi(t_j)}\right\},\quad(6)$$

where \(t_j\) is the time of year of the *j*th high tide.

Inference for models of this type is conveniently set in a Bayesian framework. For data *x* and parameter θ, Bayes’ theorem states simply

$$f(\theta\;|\;x)\propto f(\theta)\,f(x\;|\;\theta).\quad(7)$$

In this expression, *f*(θ) is a prior probability density function that expresses beliefs about θ prior to data observation, *f*(*x* | θ) is the likelihood function and *f*(θ | *x*) is the posterior density function for θ. Accordingly, Bayes’ theorem updates prior information with information contained in the data to give the posterior distribution, which is the complete inference. Depending on the application, it may be necessary to condense the information in the posterior distribution to provide a single point estimate of θ, or a range of plausible estimates. Usually, the posterior mean, median or mode is taken as a point estimator. Any interval [*a*, *b*] such that pr(*a* ≤ θ ≤ *b* | *x*)=1−α is termed a 100(1−α)% credibility interval for θ. The choice of specific point or credibility interval for any particular problem can naturally be set in a decision theoretic framework, leading to an optimal selection. In practice, however, choices are usually determined by pragmatic considerations, using whichever of the point estimators happens to be easiest to calculate, and a credibility interval that is in some sense central.

Arguments in favour of a Bayesian inference include the possibility of including external information through the prior distribution, the fact that inferences have a precise probabilistic interpretation and the extension of results to make predictive assessment. For further discussion in the context of modelling extreme surges see Coles and Tawn (2004). The two principal objections to a Bayesian analysis are first that it requires the specification of a prior distribution, and second that the proportionality factor in Eq. 7 makes computation difficult. The first of these points is a question of taste. In our view, the option to include additional non-data sources of information into the analysis in the form of a prior distribution is an advantage, while the fact that results are often stable across classes of prior distributions with large variance means that a Bayesian analysis, properly conducted, can still be regarded as objective. With regards to computation, many recent advances based on simulation strategies for inference have greatly increased the range of models that can be handled from a numerical point of view. In particular, the technique of MCMC provides a class of algorithms that are especially well suited to Bayesian computation; see Gilks *et al*. (1995) for a general discussion of MCMC methodology and Coles (2001) for a discussion in the specific context of extreme value modelling.

## 3 Simulation techniques for Bayesian inference

With a specified likelihood function *f*(*x* | θ) and prior *f*(θ) Bayesian inference requires the calculation of the posterior distribution, or perhaps a summary of it, via Bayes’ theorem Eq. 7. Except in very simple cases, direct computation is not feasible, as calculation of the proportionality factor requires integration of the right-hand side of Eq. 7 over the parameter space. If it is possible to simulate from *f*(θ | *x*) directly, we could simply use empirical summaries of the simulated sample to approximate the theoretical equivalents: the sample mean to approximate the posterior mean, marginal histograms to approximate marginal densities, and so on. Unfortunately, explicit simulation is not usually feasible. In such situations MCMC algorithms provide an indirect way of generating a sample from the posterior.
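The use of empirical summaries of a simulated sample can be illustrated with a toy stand-in for the posterior (the normal target and the sample size below are arbitrary choices, not part of the analysis):

```python
import random
import statistics

random.seed(1)

# Pretend these are draws from a posterior f(theta | x); here a N(2, 0.5^2) stand-in
draws = [random.gauss(2.0, 0.5) for _ in range(20000)]

post_mean = statistics.fmean(draws)  # approximates the posterior mean
draws.sort()
# Central 95% credibility interval via empirical quantiles
lo, hi = draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws))]

assert abs(post_mean - 2.0) < 0.05   # close to the true mean of the target
assert lo < 2.0 < hi                 # interval covers the centre of the target
```

The same summaries (means, quantiles, marginal histograms) apply unchanged to the dependent samples produced by MCMC, once convergence has been reached.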

The idea of MCMC is to replace direct simulation from *f*(θ | *x*) with simulation from something simpler, modifying the simulation rule in an appropriate way to ensure that the simulated values are related to *f*(θ | *x*), rather than the model from which they were simulated. A widely-used MCMC routine is the Metropolis–Hastings algorithm, which takes the following form:

1. Specify θ_{0}; set *i*=0.
2. Set *i*=*i*+1; simulate θ^{*}∼*q*(· | θ_{i−1}).
3. Calculate $$\alpha=\min\left\{1,\frac{f(\theta^*)f(x\;|\;\theta^*)q(\theta_{i-1}\;|\;\theta^*)} {f(\theta_{i-1})f(x\;|\;\theta_{i-1})q(\theta^*\;|\;\theta_{i-1})}\right\}$$
4. Set $$\theta _{i} = \left\{ \begin{aligned} & \theta ^{*} \quad {\text{with}}\;{\text{probability}}\;\alpha \\ & \theta _{i - 1} \quad {\text{with}}\;{\text{probability}}\;1 - \alpha \\ \end{aligned} \right.$$
5. Return to 2.
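The five steps translate directly into code. A minimal sketch with a symmetric Gaussian random-walk proposal *q*, for which the *q* terms in the acceptance ratio cancel, and a toy unnormalized log posterior standing in for *f*(θ)*f*(*x* | θ) (both choices are illustrative, not from the paper):

```python
import math
import random

random.seed(0)

def log_post(theta):
    """Toy unnormalized log posterior: a standard normal."""
    return -0.5 * theta * theta

def metropolis_hastings(n_iter, step=1.0, theta0=5.0):
    theta = theta0
    chain = []
    for _ in range(n_iter):
        proposal = random.gauss(theta, step)       # step 2: theta* ~ q(. | theta_{i-1})
        # Symmetric q, so the q terms in the ratio cancel (step 3)
        log_alpha = log_post(proposal) - log_post(theta)
        if math.log(random.random()) < log_alpha:  # step 4: accept with probability alpha
            theta = proposal
        chain.append(theta)                        # step 5: iterate
    return chain

chain = metropolis_hastings(50000)
burned = chain[5000:]                              # discard pre-convergence iterations
mean = sum(burned) / len(burned)
assert abs(mean) < 0.1                             # posterior mean of N(0,1) is 0
```

Working on the log scale avoids numerical underflow in the ratio; the deliberately poor starting value θ_{0}=5 shows why early iterations are discarded.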

The simulation of each θ_{i} is dependent on the value of θ_{i-1}, but given this value, it is independent of the previous history θ_{0},...,θ_{i-2}. This is the defining characteristic of a Markov chain. Under suitable regularity conditions, such chains converge to their stationary distribution. In other words, for large enough *N*, the sequence θ_{N+1},θ_{N+2},... can be regarded as a dependent series from a common distribution that is the stationary distribution for the chain.

The remarkable thing about the MCMC algorithm above is that, again subject to some regularity, for any choice of transition density *q*, the stationary distribution for the chain is the target posterior distribution *f*(θ | *x*). Consequently, for large *N*, the series θ_{N+1},θ_{N+2},... can be regarded as a sample, albeit of dependent values, from *f*(θ | *x*), and summarized to obtain, for example, posterior means and variances. The fact that this is true for essentially any choice of *q* means that *q* itself can be chosen for ease of simulation—a common choice is a simple random walk model. On the other hand, the stochastic acceptance or rejection of the proposal at step 4 means that the generated series has a limit behaviour that inherits the properties of *f*(θ | *x*) rather than *q*.

Of course, nothing comes for free. The flexibility of the MCMC algorithm is paid for in various ways. Primarily, the distribution of the generated θ_{i} only *converges* to the target posterior distribution. Such convergence may be slow (and therefore costly) and hard to identify (and therefore lead to doubts about model estimates). Furthermore, even after convergence, the series is dependent, meaning that many simulations may be needed to obtain reliable results. In practice this means that although *q* is arbitrary, some fine-tuning may be needed to obtain a chain that both converges reasonably rapidly and does not have excessively high dependence.

When θ is a vector of parameters, the algorithm is still valid, though it can be difficult to make good choices for *q* in this case. Because of this, it is usual to apply the MCMC algorithm to each component of θ in turn, choosing for each component a one-dimensional (and potentially different) transition model *q*. The point is that, whilst it may be difficult to propose moves in high dimensional space that explore regions of high density, explorations component-by-component are likely to be more efficient. There is a crucial caveat here though: if the posterior surface has contours that have strong dependencies or curvature in directions away from the coordinate axes, this version of the algorithm may have poor convergence properties. Think about a two-component model in which θ_{1} and θ_{2} are highly correlated. This means that the density of (θ_{1},θ_{2}) lines up along the θ_{1}=θ_{2} line. Consequently, component-wise proposals for θ_{1} and θ_{2} will always lead to departures away from the area of high density and are likely to be rejected. In this simple example, the original algorithm with proposal transitions in the two-dimensional space of (θ_{1},θ_{2}) may be adequate, but in more complex problems this may also lead to difficulties. A common solution to this problem, if available, is to reparameterize the model. In this simple case, redefining ϕ_{1}=θ_{1}−θ_{2} and ϕ_{2}=θ_{1}+θ_{2} is likely to work well. The componentwise Markov chain on this space should have good convergence and mixing properties. Inference on the target parameters θ_{1} and θ_{2} is then obtained by applying the inverse transforms for ϕ_{1} and ϕ_{2} to the generated chain, and treating the resultant series in the usual way.
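The benefit of the rotation can be seen numerically: for a strongly correlated pair (θ_{1},θ_{2}) with roughly equal variances, the coordinates ϕ_{1}=θ_{1}−θ_{2} and ϕ_{2}=θ_{1}+θ_{2} are essentially uncorrelated, so componentwise proposals no longer fight the geometry of the target. A quick check on simulated values (the simulation model is an illustrative stand-in for a correlated posterior):

```python
import random

random.seed(2)

# Simulate (theta1, theta2) with correlation near 1 and equal variances
pairs = []
for _ in range(50000):
    common = random.gauss(0.0, 1.0)
    theta1 = common + random.gauss(0.0, 0.1)
    theta2 = common + random.gauss(0.0, 0.1)
    pairs.append((theta1, theta2))

def corr(xs, ys):
    """Sample correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

t1 = [p[0] for p in pairs]
t2 = [p[1] for p in pairs]
p1 = [a - b for a, b in pairs]   # phi1 = theta1 - theta2
p2 = [a + b for a, b in pairs]   # phi2 = theta1 + theta2

assert corr(t1, t2) > 0.95       # original coordinates: strong dependence
assert abs(corr(p1, p2)) < 0.03  # rotated coordinates: essentially uncorrelated
```

A componentwise Metropolis–Hastings sampler run on (ϕ_{1},ϕ_{2}) therefore mixes well, and draws for (θ_{1},θ_{2}) are recovered by the inverse transform θ_{1}=(ϕ_{1}+ϕ_{2})/2, θ_{2}=(ϕ_{2}−ϕ_{1})/2.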

## 4 Bayesian analysis of seasonal surge data

For the reasons discussed in Sect. 1, our raw data comprise a series of surge measurements recorded at the time of high tides. Our objective is to analyse such data, at each of 11 locations, via a Bayesian inference of the point process model specified in Sect. 2.

The first issue is the choice of threshold. With a seasonally varying process, we require a threshold, *u*(*t*), that also has period 1-year and that has an approximately uniform crossing rate throughout the year. One possibility is to use quantile regression techniques to estimate *u*(*t*) directly for some fixed small crossing rate (Koenker and Hallock 2001, for example). In practice, this level of sophistication is probably exaggerated: the Poisson process model is invariant to threshold choice, provided the threshold is high enough. Consequently, we have adopted a simpler approach. We divide the days of the year into a winter and summer period, roughly October–March and April–September respectively. We then specify

$$u(t)=a+b\cos(2\pi t),\quad(8)$$

where *t* measures time in units of years. Consequently, Eq. 8 corresponds to an annual cycle peaking at 1 January, as seems reasonable from Fig. 1 for all sites. Finally, we choose *a* and *b* so as to obtain a fixed uniform crossing rate over the winter and summer periods. This requires the solution of two non-linear equations. For computational purposes, we found it easier to minimize the function

$$\{p_{\rm s}(a,b)-q\}^2+\{p_{\rm w}(a,b)-q\}^2,$$

where \(p_{\rm s}(a,b)\) and \(p_{\rm w}(a,b)\) are the proportions of exceedances of the threshold *u*(*t*; *a*, *b*) in the summer and winter periods respectively, and *q* is the desired uniform threshold crossing rate. The discreteness in the data leads to non-unique solutions for *a* and *b*, but since the idea is just to get a threshold that has a crossing rate that is only approximately uniform, this is not problematic. With *q*=0.05, the thresholds obtained in this way are included in Fig. 1. The estimated coefficients *a* and *b* are plotted as a function of coastal position in Fig. 2. For each site, the parameters *a* and *b* can be interpreted respectively as the median threshold level and the scale of variation between peak summer and winter periods. Though this aspect of the analysis is purely empirical, we already see spatial cohesion in each of these characteristics.
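The threshold fit amounts to a small optimization problem. A sketch on synthetic data (the data-generating model, grid ranges and season split below are illustrative stand-ins for the real surge series):

```python
import math
import random

random.seed(3)

N_PER_YEAR = 705
# Synthetic high-tide surges over 10 years with a winter-peaking seasonal cycle
times = [i / N_PER_YEAR for i in range(10 * N_PER_YEAR)]   # time in years
surge = [0.3 * math.cos(2 * math.pi * t) + random.gauss(0.0, 0.4) for t in times]

def is_winter(t):
    """Winter half-year, roughly October-March (t measured from 1 January)."""
    frac = t % 1.0
    return frac < 0.25 or frac >= 0.75

def crossing_props(a, b):
    """Proportions of exceedances of u(t; a, b) in summer and winter."""
    exc_s = exc_w = n_s = n_w = 0
    for t, x in zip(times, surge):
        u = a + b * math.cos(2 * math.pi * t)
        if is_winter(t):
            n_w += 1
            exc_w += x > u
        else:
            n_s += 1
            exc_s += x > u
    return exc_s / n_s, exc_w / n_w

q = 0.05  # desired uniform crossing rate
# Brute-force grid search minimizing {p_s - q}^2 + {p_w - q}^2
best, best_ab = float("inf"), None
for a in [0.3 + 0.02 * i for i in range(30)]:
    for b in [0.02 * j for j in range(25)]:
        ps, pw = crossing_props(a, b)
        score = (ps - q) ** 2 + (pw - q) ** 2
        if score < best:
            best, best_ab = score, (a, b)

a_hat, b_hat = best_ab
ps, pw = crossing_props(a_hat, b_hat)
assert abs(ps - q) < 0.01 and abs(pw - q) < 0.01   # near-uniform crossing rate
```

As in the text, the discreteness of the data means several (a, b) pairs score almost equally well; any of them serves, since only an approximately uniform crossing rate is needed.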

Having specified thresholds for each site, we now fit the seasonal point process model described in Sect. 2. The model specification is completed by specifying prior distributions with large variances to reflect prior ignorance.^{1} However, as discussed in Sect. 3, the efficient implementation of an MCMC algorithm to make inferences on the model may depend on the model parameterization. Consider first the stationary point process model with intensity function Eq. 1. Though (μ,σ,ξ) represents a natural parameterization of the model, giving simple connections to the GEV annual maximum model Eq. 3, this parameterization is not well suited to the component-by-component implementation of an MCMC algorithm as described in Sect. 3. The difficulty here is that there is a large amount of information in the data concerning basic threshold crossing rates in the point process compared with information about the scale and form of threshold exceedances. This means that the likelihood function and, by extension, the posterior distribution have contours of high density that lie along paths in the (μ,σ,ξ) space corresponding to constant crossing rates. The strong non-linearity in such contours then creates difficulties for MCMC algorithms parameterized in terms of μ, σ and ξ, which are likely therefore to have poor convergence and mixing properties.
This suggests reparameterizing in terms of the crossing rate itself. Replacing σ with

$$\tau=\log\left\{\left[1+\frac{\xi(u-\mu)}{\sigma}\right]_+^{-1/\xi}\right\},\quad(9)$$

the logarithm of the expected annual number of threshold exceedances, gives a parameterization (μ,τ,ξ) in which the crossing-rate information is carried almost entirely by τ, so that the posterior contours are far better aligned with the coordinate axes.

The performance of a standard MCMC algorithm on this version of the model is likely to be far superior to one based on the original (μ, σ, ξ) parameterization.

For the seasonal model we take the annual cycle in μ(*t*) to be sinusoidal,

$$\mu(t)=\mu_0+\mu_1\cos\{2\pi(t-\mu_2)\},\quad(10)$$

with *t* again measured in years, and take the shape parameter to be constant through the year,

$$\xi(t)=\xi_0.\quad(11)$$

Since *u*(*t*) has already been selected to have a constant threshold exceedance rate λ, it follows immediately that

$$\sigma(t)=\frac{\xi_0\{u(t)-\mu(t)\}}{(n_{\rm o}\lambda)^{-\xi_0}-1}.\quad(12)$$

Thus, the seasonal variation in σ(*t*) requires no additional parameterization; its form is a consequence of the variation in μ(*t*) and the form chosen for *u*(*t*) to achieve a constant crossing rate. This formulation differs from, say, Coles and Tawn (2004), where non-linked models for μ(*t*) and σ(*t*) were adopted. There are pros and cons. On the positive side, if the crossing rate of *u*(*t*) is genuinely constant and the model for μ(*t*) is correct, then Eq. 12 is exact, and it is wasteful to construct and infer a parametric model to approximate σ(*t*). On the downside, the model is likely to have some sensitivity to these assumptions. With the surge data, the preliminary inference on *u*(*t*) seems quite satisfactory at each site, lending weight to the use of Eq. 12. This means that the MCMC can be parameterized simply in terms of (μ_{0},μ_{1},μ_{2},τ,ξ_{0}). In principle, since the crossing rate is pre-selected as λ=0.05, we could replace τ with the fixed value log{−705 log (0.95)}, but retaining it as a parameter enables the uncertainty in threshold crossing rates to be incorporated into the inference.
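To make the link concrete: under the tail approximation Eq. 5, fixing the per-observation crossing rate of *u*(*t*) at λ pins down σ(*t*) given μ(*t*), ξ and *u*(*t*). A sketch of this algebra (the closed form and all numerical values are our illustration of the constraint, not quoted from the paper):

```python
# Under the tail form pr(X > x) ≈ (1/n_o) * [1 + xi*(x - mu)/sigma]^(-1/xi),
# fixing pr(X > u(t)) = lam for all t determines sigma(t) from mu(t) and u(t).
import math

N_O = 705   # high-tide observations per year
LAM = 0.05  # per-observation crossing rate of u(t)

def sigma_from_constraint(u_t, mu_t, xi):
    """Solve (1/n_o)*[1 + xi*(u - mu)/sigma]^(-1/xi) = lam for sigma."""
    return xi * (u_t - mu_t) / ((N_O * LAM) ** (-xi) - 1.0)

# Illustrative seasonal values at some time t; note the threshold sits below
# the annual-maximum location mu(t), since it is crossed ~35 times per year
u_t, mu_t, xi = 0.4, 0.7, 0.1
sigma_t = sigma_from_constraint(u_t, mu_t, xi)
assert sigma_t > 0

# Round-trip check: the implied crossing rate is indeed lam
rate = (1.0 / N_O) * (1.0 + xi * (u_t - mu_t) / sigma_t) ** (-1.0 / xi)
assert abs(rate - LAM) < 1e-12
```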

The MCMC algorithm, run with this parameterization, mixed well at each site, apart from some residual posterior dependence between μ_{1} and μ_{2} that could perhaps have been reduced by further reparameterization, but the effect does not seem serious enough to warrant this. Figure 4 shows the posterior mean estimates for each parameter across sites together with 95% credibility intervals. The patterns of spatial variation in μ_{0} and ξ are similar to the analogous parameters in Coles and Tawn (2004), but with some differences in the numerical values, presumably due to the change in threshold specification and the original misspecification of a seasonally homogeneous model. The estimated value for τ—around 3.56—is virtually uniform across sites. This is in near perfect agreement with Eq. 9, since log {−705log(0.95)}=3.59.
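The quoted agreement is elementary arithmetic to verify:

```python
import math

# With 705 high-tide events per year and a per-observation non-exceedance
# probability of 0.95, the implied log annual crossing rate is
tau_implied = math.log(-705 * math.log(0.95))
assert abs(tau_implied - 3.59) < 0.005  # matches the value quoted in the text
```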

Of greater interest are the seasonal parameters μ_{1} and μ_{2}. First, looking at μ_{2}, each credibility interval contains the value 0. Consequently, there would be little to lose in setting μ_{2}=0 in Eq. 10, corresponding to maxima in the annual cycles occurring on 1 January, as with the threshold settings. In contrast, μ_{1} shows strong spatial variation. Furthermore, the values at each site are clearly quite distinct from zero, reinforcing the evidence for seasonal variation. At first sight, the spatial variation in μ_{1} appears to match very closely that of μ_{0}, but in detail the situation is rather more complex. Figure 5 shows posterior means and marginal 95% credible intervals for the pairs (μ_{0},μ_{1}) across locations. Though there is an obvious tendency for μ_{1} to increase with μ_{0}, the relationship is non-linear and possibly non-monotone. Formal testing of linear or non-linear relationships is beyond the scope of our analysis here, requiring the specification and modelling of spatial dependence relationships.

## 5 Seasonal modelling of extreme sea levels

The seasonal model for surges enables the probability of an extreme surge event to be calculated for a specific time point (Eq. 5) or over an aggregated time period such as a year (Eq. 6). In Coles and Tawn (2004) we focused on the latter of these, arguing the case for a *predictive* estimate, which is obtained by averaging Eq. 6 over the posterior distribution of the model parameters. However, it is usually the case in applications that the distribution of surges in isolation is of lesser interest than the distribution of total sea level.

Let \(Z_j=X_j+Y_j\), where \(X_j\), \(Y_j\) and \(Z_j\) correspond to surge, tide and sea level at the *j*th high tide in a year, and let \(M_Z=\max\{Z_1,\ldots,Z_{n_o}\}\) be the annual maximum of the sea level process. By restricting analysis to events on high tides it is reasonable to assume that the surge \(X_j\) is independent of the corresponding tidal level \(Y_j\), so that

$$\hbox{pr}(Z_j\leq z)=\int \hbox{pr}(X_j\leq z-y)\,{\rm d}F_{Y_j}(y),$$

where \(F_{Y_j}\) is the distribution function of the tide \(Y_j\). For large *z*, the surge probability on the right hand side is given by approximation 5 with θ corresponding to the parameters of the extreme values of the seasonal surge model. Though the tidal process is deterministic, its value at a certain time in an arbitrary year may be thought of as random. Consequently, we treat all high tide observations within a window centred on time *t*—in fact we use a 2-week time window aggregated across years—as a sample from the high tide distribution at this time point. It follows that

$$\hbox{pr}(Z_j\leq z)\approx\frac{1}{K}\sum_{k=1}^{K}\hbox{pr}(X_j\leq z-y_{j,k}),$$

where \(y_{j,1},\ldots,y_{j,K}\) are the *K* high tide observations obtained from the window based on the time of occurrence of the variable \(Z_j\).
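The empirical mixture over the tide window can be sketched directly (the surge distribution function and the tide values below are synthetic placeholders; in the application pr(X_j ≤ ·) comes from the fitted seasonal model via Eq. 5):

```python
import math
import random

random.seed(4)

def surge_cdf(x, mu=0.3, sigma=0.1, xi=0.05):
    """Placeholder surge distribution function (GEV form)."""
    s = 1.0 + xi * (x - mu) / sigma
    if s <= 0.0:
        return 0.0 if xi > 0 else 1.0
    return math.exp(-s ** (-1.0 / xi))

# y_{j,1},...,y_{j,K}: high-tide levels from a 2-week window across years (synthetic)
K = 140
tide_window = [2.5 + random.gauss(0.0, 0.3) for _ in range(K)]

def sea_level_cdf(z):
    """pr(Z_j <= z) ≈ (1/K) * sum_k pr(X_j <= z - y_{j,k})."""
    return sum(surge_cdf(z - y) for y in tide_window) / K

# The mixture is a proper distribution function: monotone and tending to 1
assert sea_level_cdf(2.0) <= sea_level_cdf(3.0) <= sea_level_cdf(6.0)
assert sea_level_cdf(6.0) > 0.999
```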

A further assumption is that the \(X_j\) series is temporally independent. As discussed in Sect. 2, so far as extremal modelling goes, this is reasonable provided the extremal index, ϕ, is close to 1. Tawn (1992) showed that the hourly surge series has an extremal index less than 1, with clusters of extreme events typically lasting between 2 and 6 h. However, it is the series of surges on high tides that we are modelling here, and there are theoretical arguments that suggest this filtered series should have an extremal index close to 1 (Robinson and Tawn 2000). By standard probability arguments, the distribution of \(M_Z\) is therefore

$$\hbox{pr}(M_Z\leq z\;|\;\theta)=\prod_{j=1}^{n_{\rm o}}\hbox{pr}(Z_j\leq z\;|\;\theta).$$

The Bayesian analysis yields a sample from the posterior distribution of the model parameters, θ_{1},...,θ_{I} say. We finally obtain, therefore,

$$\hbox{pr}(M_Z\leq z)\approx\frac{1}{I}\sum_{i=1}^{I}\prod_{j=1}^{n_{\rm o}}\frac{1}{K}\sum_{k=1}^{K}\hbox{pr}(X_j\leq z-y_{j,k}\;|\;\theta_i),$$

where \(\hbox{pr}(X_j\leq z-y_{j,k}\;|\;\theta_i)\) is obtained from Eq. 5. Though conceptually trivial, this calculation remains computationally demanding, especially as it needs to be repeated across a range of values of *z*.
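The full calculation, averaging over posterior draws and multiplying over high tides, can be coded as a triple loop (all inputs below are synthetic placeholders, with the per-year count reduced from 705 to keep the demonstration fast):

```python
import math
import random

random.seed(5)

N_O = 60  # high tides per "year" (reduced for speed; 705 in the application)
K = 30    # tide observations per 2-week window
I = 50    # posterior draws

# Synthetic posterior draws theta_i = (mu, sigma, xi) and tide windows y[j][k]
thetas = [(0.3 + random.gauss(0, 0.02), 0.1 + abs(random.gauss(0, 0.01)), 0.05)
          for _ in range(I)]
tides = [[2.5 + random.gauss(0.0, 0.3) for _ in range(K)] for _ in range(N_O)]

def surge_cdf(x, mu, sigma, xi):
    """Placeholder surge distribution function (GEV form)."""
    s = 1.0 + xi * (x - mu) / sigma
    if s <= 0.0:
        return 0.0 if xi > 0 else 1.0
    return math.exp(-s ** (-1.0 / xi))

def annual_max_cdf(z):
    """pr(M_Z <= z) ≈ (1/I) sum_i prod_j (1/K) sum_k pr(X_j <= z - y_{j,k} | theta_i)."""
    total = 0.0
    for mu, sigma, xi in thetas:
        prod = 1.0
        for window in tides:
            prod *= sum(surge_cdf(z - y, mu, sigma, xi) for y in window) / K
        total += prod
    return total / I

assert annual_max_cdf(4.0) <= annual_max_cdf(5.0)  # monotone in z
assert annual_max_cdf(8.0) > 0.99                  # tends to 1 for large z
```

The I × n_o × K cost of each evaluation, repeated over a grid of *z* values, is exactly the computational burden noted in the text.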

A standard way of presenting extreme value inferences is the return level plot: a plot of \(z_p\) against \(-1/\log(1-p)\) on a logarithmic scale, where \(p=1-\hbox{pr}(M_Z\leq z_p\;|\; \hat{\theta})\) is calculated on the basis of a point estimate \(\hat{\theta},\) such as the maximum likelihood estimate. Since \(-1/\log(1-p)\approx 1/p\) for small *p*, it follows that the level \(z_p\) is expected to be exceeded approximately once every 1/*p* years, hence the terms ‘return level’ and ‘return period’. The predictive return level plot is produced identically, except that *p* and \(z_p\) are related by

$$1-p=\frac{1}{I}\sum_{i=1}^{I}\hbox{pr}(M_Z\leq z_p\;|\;\theta_i).$$

The resulting predictive return level plot, obtained by repeating the calculation across a range of values of *z*, is shown in Fig. 8. We believe this to be the first time such an estimate has been made that incorporates seasonal variation in both surge and tide processes.
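The return-period reading rests on the approximation −1/log(1−p) ≈ 1/p, which is easy to confirm for small *p*:

```python
import math

# Relative error of the approximation -1/log(1-p) ≈ 1/p shrinks with p
for p in [0.1, 0.01, 0.001]:
    return_period = -1.0 / math.log(1.0 - p)
    assert abs(return_period - 1.0 / p) / (1.0 / p) < 0.06
```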

## 6 Conclusions

We have extended the seasonal analysis of extreme surges discussed in Coles and Tawn (2004) to a network of 11 sites along the eastern UK coastline. To maintain the advantages of a Bayesian form of inference we developed an MCMC algorithm for computation. However, careful consideration of the form of this algorithm led us to certain conclusions about appropriate forms for seasonal extreme value models that appear not to have been discussed before. In particular, the identification of smoothly varying quantile levels for thresholds has implications for structural forms on extreme value parameters which, if ignored, can lead to MCMC output with very poor convergence properties. These conclusions are likely to have implications for the study of the extremes of any process that displays strong seasonality.

The surge results themselves indicate some similarities in the seasonal effects across locations, but also some differences. The timing of the seasonal cycle appears to be uniform across sites, but the magnitude of this cycle varies, with sites having the more extreme surges also having the greatest difference between summer and winter surge levels. Though all aspects of the surge process appear to vary smoothly along the coastline, we have not attempted to build formal spatial models. Largely, this is because of the difficulty in building models that properly account for spatial dependence, but the limited number of data locations is also a restriction.

Finally, we have shown how the fitted seasonal model for surges can be combined with an observed tidal record to produce an estimate of the annual maximum sea level distribution. The calculation is trivial, but computationally demanding. However, to our knowledge, this is the first time such a calculation has been made, taking into account the seasonal variation in both the surge and tide processes.

^{1}Subsequent repeat analyses verified that results were essentially unchanged under order-of-magnitude changes to the prior variances.

## Acknowledgements

We are grateful for the helpful and insightful comments of two anonymous referees. Our work was supported by grants connected to the projects *Statistics as an aid for environmental decisions: identification, monitoring and evaluation* (2002134337) funded by the Italian Ministry for Education and *Methods for the analysis of extreme sea-levels and for coastal erosion* (CPDA037217) funded by the University of Padova.