Basin-wide spatial conditional extremes for severe ocean storms

Physical considerations and previous studies suggest that extremal dependence between ocean storm severity at two locations exhibits near asymptotic dependence at short inter-location distances, leading to asymptotic independence and perfect independence with increasing distance. We present a spatial conditional extremes (SCE) model for storm severity, characterising extremal spatial dependence of severe storms by distance and direction. The model is an extension of Shooter et al. 2019 (Environmetrics 30, e2562, 2019) and Wadsworth and Tawn (2019), incorporating piecewise linear representations for SCE model parameters with distance and direction; model variants including parametric representations of some SCE model parameters are also considered. The SCE residual process is assumed to follow the delta-Laplace form marginally, with distance-dependent parameter. Residual dependence of remote locations given conditioning location is characterised by a conditional Gaussian covariance dependent on the distances between remote locations, and distances of remote locations to the conditioning location. We apply the model using Bayesian inference to estimates extremal spatial dependence of storm peak significant wave height on a neighbourhood of 150 locations covering over 200,000 km2 in the North Sea.


Introduction
A key issue in modelling spatial extremes is assessing the nature of dependence between extreme events. If we observe an extreme event at a location, we are interested in the information provided by this event about the probability of observing extreme events simultaneously at other locations. We expect that over short distances, an extreme event being observed at one location may be related to an extreme observation at another. However extremes observed at distant locations are likely to be less related to each other. To describe extremal dependence, Coles et al. (1999) introduce the measures χ and χ, most easily calculated through their sub-asymptotic forms χ(u) and χ (u). where u ∈ [0, 1]. For bivariate uniform random variables (U, V ) these are defined as χ(u) = 2 − log P(U < u, V < v)/ log P(U < u) and χ (u) = 2 log P(U > u)/ log P(U > u, V > v) − 1. χ and χ may be obtained by taking the respective limits of these functions, as u → 1. The nature of extremal dependence between U and V may then be described by considering χ and χ together. If χ = 0 and −1 ≤ χ < 1, the random variables are asymptotically independent, and the value of χ signifies the level of dependence. On the other hand, if χ = 1 and 0 < χ ≤ 1, then the pair (U, V ) exhibit asymptotic dependence, with χ providing a measure of this. For a full description of extremal dependence types, we refer the reader to Ledford and Tawn (1996) and Coles et al. (1999). Spatial extensions of these measures are discussed in Tawn et al. (2018).
Models for spatial extremes typically require that the type of extremal dependence exhibited be decided before model fitting. Here, we describe a spatial conditional extremes (SCE) model able to capture different types of asymptotic behaviour without prior assumption of dependence class. The SCE framework is useful to assess the risk involved in the construction of coastal and marine structures, enabling improved description of extremal dependence between variables characterising the meteorological-oceanographic ("metocean") environment, and better joint quantification of risk for severe ocean storms.
The current study involves characterisation of extremal spatial dependence of ocean storm severity, quantified for a storm event using storm peak significant wave height (H S ). For two sampling locations a short distance apart (relative to the size of a storm), we expect that an extreme value of H S arises at each location from the same storm event, characteristic of asymptotic dependence. In contrast, since it is unlikely that extremes occurring at two distant locations would be simultaneously large, we expect observations from distant locations to exhibit asymptotic independence. Previous studies (Kereszturi et al. 2016;Ross et al. 2017a) have shown that the nature and extent of extremal dependence in an ocean basin changes with distance between locations, and also potentially with their orientation.
The traditional approach to spatial extremes is to consider max-stable processes, as introduced by Brown and Resnick (1977), Smith (1990) and Schlather (2002). Max-stable process models typically make the assumption that the spatial process is asymptotically dependent, and hence such models may be inappropriate for modelling H S over an ocean basin. Other asymptotically dependent spatial extremes models have been proposed, such as the hierarchical max-stable model of Reich and Shaby (2012) and the Pareto process approach of Ferreira and de Haan (2014) or de Fondeville and Davison (2020), a method motivated by peaks-over-threshold analysis. The link between Pareto processes and the SCE model is described by Tawn et al. (2018); in particular, these processes exhibit asymptotic dependence. More recently, a number of models able to describe either class of extremal dependence have been proposed, including Wadsworth and Tawn (2012), Wadsworth et al. (2017) and Huser and Wadsworth (2019). These models suffer from the restriction that either the fitted model must assume a certain type of extremal dependence across the entire spatial domain in which it is fitted (Huser and Wadsworth 2019), or the model is rather computationally challenging to fit (Wadsworth and Tawn 2012). More detailed overviews of spatial extremes modelling may be found in Davison et al. (2012), Ribatet (2013) and Tawn et al. (2018), for example.
The SCE model presented here overcomes these issues. Careful parameterisation of distance and directional effects enables the spatial extremes problem to be described parsimoniously, even when the number of measurement locations is large. This reduces computational burden when fitting across hundreds of sampling locations compared to broadly equivalent max-stable proceses, although Pareto processes are less computationally expensive than max-stable processes. Moreover, the model admits both asymptotic dependence and asymptotic independence, evolving with distance and direction on the spatial domain.
The model builds upon the work of Wadsworth and Tawn (2019) and Shooter et al. (2019), adopting functional forms for SCE parameters. Shooter et al. (2019) found that assuming a simple two-parameter for the decay of SCE slope parameter, α with distance was adequate to capture parameter behaviour adequately whilst reducing computational time for parameter estimation, since α would otherwise need to be estimated separately for pair of conditioning and remote locations. We build upon this idea by imposing either parametric or piecewise-linear forms for SCE parameters with distance and direction, motivated by theoretical considerations and evidence from preliminary analysis. The model is applied using Bayesian inference to a neighbourhood of 150 locations, to describe the joint extremal dependence of 149 remote locations on a central conditioning location. Such an analysis would be computationally demanding in the max-stable process framework. The current SCE model also adopts a delta-Laplace (or generalised Gaussian) form for the marginal distribution of model residuals, coupled to a spatial conditional Gaussian copula for residual dependence.
The article is presented as follows. In Section 3, we discuss the conditional extremes model of Heffernan and Tawn (2004) and the spatial extension of Wadsworth and Tawn (2019). Section 3 summarises the inferential scheme used for parameter estimation. Section 4 outlines the performance of our model in application North Sea storm severity. Discussion and conclusions are given in Section 5. An outline of the Markov chain Monte Carlo (MCMC) scheme used for inference, and the conditional quantile constraints of Keef et al. (2013), is given in the Appendix.

Multivariate conditional extremes
Suppose we have a vector of random variables (X 0 , X), where X 0 and X = (X 1 , . . . , X p ) have Gumbel marginal distributions, and random variables Z defined by where all operations are taken to be component-wise. Heffernan and Tawn (2004) assume the existence of a and b such that, for x > 0, where G is a joint distribution with non-degenerate margins. This form for the conditional extremes model is asymptotically justified; see Heffernan and Tawn (2004) and Heffernan and Resnick (2007) for details. Keef et al. (2013) show that if the margins of X are assumed to be Laplace-distributed (achievable using the probability integral transform), then canonical functional forms for a(·) and b(·) are a(x) = αx and b(x) = x β (for x > 0), where α = (α 1 , . . . , α p ) and β = (β 1 , . . . , β p ). In this representation, each α i ∈ [−1, 1] and β i ∈ (−∞, 1], with different parameter values corresponding to different classes of extremal dependence. In the present spatial context, we assume positive dependence and restrict α i ∈ [0, 1], β i ∈ [0, 1] for all i ∈ {1, . . . , p}. Motivated by Eq. 1, for some high threshold u and all x 0 > u, we assume the model form where Z ∼ G is independent of X 0 .

Spatial conditional extremes
We extend the model described in Eq. 2 to a spatial context, as described by Tawn et al. (2018) and Wadsworth and Tawn (2019). Consider a stationary spatial process X(·), on a domain S, which has Laplace marginal distributions and locations r, r 0 ∈ S. Then for distance d ∈ R ≥0 and heading θ ∈ [0, 2π) between sites, assuming positive dependence between variables, we have in general that, for all x 0 > u, is a residual process independent of X(·). For inference on a sample of spatial data (X 0 , X) observed at a finite set of locations r 0 , r 1 , . . . , r p , X(·) is treated as a finite-dimensional process. We set d j = dist(r j , r 0 ) for j = 1, 2, . . . , p and θ j = head(r j , r 0 ), for distance and heading metrics dist(·, ·) and head(·, ·). Then θ describes the relative heading of a remote location from the conditioning site; note that θ is defined using the mathematical convention, anti-clockwise from due East, for which θ = 0. We then set α j = α(d j , θ j ) and β j = β(d j , θ j ). We follow Wadsworth and Tawn (2019) and suppose that process Z(·) has delta-Laplace margins with parameters μ, σ and δ dependent on d and θ. That is, the marginal density f Z j of Z at distance d j is for j = 1, 2, . . . , p, δ j , σ j , κ j ∈ R >0 , μ j ∈ R, z j ∈ R, where κ 2 j = Γ 1/δ j /Γ 3/δ j and Γ (·) represents the gamma function. The mean and variance of this distribution are respectively μ j and σ 2 j , regardless of the choice of δ j , and we denote the distribution as DL(μ j , σ 2 j , δ j ). With this notation, δ j = 2 corresponds to a Gaussian distribution, whereas δ j = 1 yields a Laplace distribution; the standard Laplace distribution with variance 2 corresponds to σ 2 j = 2 in our notation. Of particular importance is the requirement that as d → ∞, we approach perfect independence between X(r) and X(r 0 ). Inspection of the equations above suggests that we should simply be left with standard Laplace random variables; i.e., The model is not informative for δ and ρ and d = 0. Now consider the vector of random variables (X 0 , X) corresponding to p + 1 spatial locations, with standard Laplace marginal distributions, i.e., X j ∼ DL(0, 2, 1) for j = 0, 1, . . . , p. We assume, conditional on X 0 = x 0 , for x 0 > u, that X follows a multivariate extension of the delta-Laplace distribution . is the correlation matrix for a Gaussian dependence structure between residual components, discussed further in Section 2.4, such that where F represents a cumulative distribution function and Φ is the cumulative distribution function of a standard Gaussian distribution. Φ p (0, ) is the cumulative distribution function of a p-dimensional Gaussian distribution with mean 0 and covariance matrix . Hence marginally, , for conditional means m j and standard deviations s j given by for j = 1, 2, . . . , p. Hence a p-dimensional delta-Laplace distribution with mean m = (m 1 , m 2 , . . . , m p ), variance s 2 = (s 2 1 , s 2 2 , . . . , s 2 p ), δ = (δ 1 , δ 2 , . . . , δ p ) and (standard Gaussian-scale) covariance .
The model in Eq. 5 describes different types of extremal dependence, inferred from the values of parameters (α j , β j ) for j ∈ {1, . . . , p}. If (α j , β j ) = (1, 0), then random variables X 0 and X j are asymptotically dependent, whereas if α j < 1, the random variables exhibit asymptotic independence. Further discussion of this can be found in Tawn et al. (2018).

Model parameter variation with distance and direction
The p + 1 measurement locations are assumed to have coordinates r j , j = 0, 1, . . . , p. Parameters α, β, μ, σ, δ are assumed to be continuous functions of distance with d j = dist(r j , r 0 ) and heading θ j = head(r j , r 0 ) for remote location r j , j = 1, 2, . . . , p given conditioning location r 0 , defined using a local Cartesian frame discussed further in Section 2.5. Two specific forms for the distance and directional dependence of SCE model parameters are considered. The first representation is piecewise linear in d and θ . The second representation, adopted to reduce model size and improve computational performance, assumes parametric variation of α and β with d.
For the piecewise linear representation, we specify a set of n d equally-spaced nodes d k , k = 1, 2, . . . , n d , covering the domain radially from r 0 , and a corresponding set of n θ directions θ ∈ [0, 2π), = 1, 2, . . . , n θ to codify angular variation. The value of any SCE model parameter η(d, θ) on the spatial domain can then be expressed in terms of the set of values {η k,l } as with analogous definitions for Δ θ terms. The values of η k,l , k = 1, 2, . . . , n d , = 1, 2, . . . , n θ are estimated during inference. We also consider parametric representations of α and β with d for each θ , = 1, 2, . . . , n θ , taking the forms These choices were motivated by previous application studies including Shooter et al. (2019), together with the known theoretical behaviours of α and β for d = 0 and d → ∞. This parametric form does not admit asymptotic dependence, however, since α(d, θ t ) = 1 for d > 0, although for suitably large A 1 and A 2 ≥ 2 then α(d, θ t ) ≈ 1 for d ≈ 0; a possible parametric form for α(d, θ t ) which permits asymptotic dependence is given by Wadsworth and Tawn (2019). Angular effects continue to be parameterised using piecewise linear forms analogous to that in the equation above. Using these parametric forms, the full behaviour of α and β with distance and direction can be inferred by estimating just 5n θ parameters rather than 2n d n θ for the piecewise linear representation. With n d = 7 and n θ = 6, this corresponds to a reduction of (2n d − 5)n θ = 54 in the number of model parameters to be estimated, improving the computational efficiency of inference.

Gaussian residual correlation matrix with distance
The j, j element jj of residual correlation matrix , j, j = 1, 2, . . . , p quantifies the dependence between SCE residuals (on standard Gaussian-scale) at locations r j and r j given conditioning on location r 0 . It is reasonable to expect that the value of jj depends on both the distance dist(r j , r j ) between remote locations, and distances dist(r j , r 0 ) and dist(r j , r 0 ) from remote locations to conditioning site. When dist(r j , r 0 ) and dist(r j , r 0 ) are large relative to dist(r j , r j ), the conditional correlation between remote locations will be similar to the unconditional correlation. However, when dist(r j , r 0 ) and dist(r j , r 0 ) are of comparable size to dist(r j , r j ), we expect conditioning to influence correlation considerably. We adopt a parameterisation for equivalent to the correlation function for a standard Gaussian field evaluated at p + 1 locations conditioned on its value at one location, with powered exponential dependence. With * representing the correlation matrix of the unconditioned field (and matrix indexing starting from zero for convenience), the correlation matrix for the conditional field has elements given by for j, j = 1, 2, . . . , p with conditioning location indexed by zero (see e.g. Mardia et al. 1979, page 63). Further, it is reasonable to assume that the correlation between observations at different locations in the unconstrained field reduces as a function of the distance between the locations. For this reason, we adopt an isotropic powered exponential form for the correlation of the unconditional field, given by and j, j = 0, 1, 2, . . . , p. Parameters R 1 and R 2 are to be estimated.

Calculating distances and directions
Locations of points on the surface of the Earth are typically specified in terms of longitude-latitude coordinates. Temporarily adopting oceanographic notation, the shortest distance (e.g. in metres) on the surface of a spherical Earth between locations with longitude-latitude coordinates (λ, ϕ) and (λ , ϕ ) can be calculated using the spherical law of cosines. For calculations of local distances and directions between locations, it is convenient to adopt a local Cartesian description of displacement on the surface of a sphere, (see e.g. Vallis 2017, page 69). For locations with longitudelatitude coordinates r = (λ, ϕ) and r = (λ , ϕ ), we locate the local origin of coordinates at λ ,φ = (λ + λ )/2, (ϕ + ϕ )/2 , with x axis running West-East (in the Northern Hemisphere) and y axis poleward. Then, to a good approximation when |λ − λ| and |ϕ − ϕ| are small, the local Cartesian displacement d between the points can be written where a is the radius of the spherical Earth. We adopt this model to estimate distances and directions for all pairs of locations considered, with dist(r, r ) = ||d|| and head(r, r ) = arg(d).

Parameter estimation
We use Bayesian inference to estimate the joint posterior distribution of SCE model parameters, with sample negative log-likelihood as defined above. In practice we find that allowing directional variation of SCE residual parameters μ, σ and δ cannot be justified given data examined to date. Similarly, we assume that the residual process depends on distances between locations only. Any directional effects in the Laplacescale data can be accommodated by parameterisation of SCE parameters α and β as described below. Prior marginal transformation of data per location to Laplace scale using directional extreme value models estimates marginal directional variation of the original sample on physical scale. For SCE models with non-parametric representations for α and β terms (henceforth 'ABN' models), the parameter set Ω is {{{α k, , β k, } n θ =1 , μ k , σ k , δ k } n d k , R 1 , R 2 }, consisting of 2n d n θ + 3n d + 2 parameters. In this case, a grouped adaptive MCMC algorithm based on Roberts and Rosenthal (2009) is used for parameter inference, described in Shooter et al. (2019). Briefly, random search is used to find a reasonable starting solution. Then a Metropolis-within-Gibbs algorithm is used iteratively to sample each of the elements of Ω in turn for a total of n MiG iterations. Subsequently we use the grouped adaptive Gaussian random walk Metropolis-within-Gibbs algorithm iteratively for a further n GA iterations. Within the grouped adaptive algorithm, we jointly update groups {{α k , β k , μ k , σ k } n θ =1 } of parameters (for each d = 1, 2, . . . , n d ) following the adaptive Metropolis scheme of Roberts and Rosenthal (2009). Parameters {δ k } n d ,n θ k, =1 , R 1 and R 2 are updated using Gaussian random walk Metropolis-within-Gibbs, with automatic proposal step-size adjustment to achieve proposal acceptance rates near 0.25. Uniform prior distributions on plausible domains are used for model parameters. Chain convergence is judged to have occurred when trace plots for parameters and their dependence stabilise. Typically, prior MCMC analyses on multiple chains is performed for burn-in and to identify "hot" starting solutions for subsequent analysis; it was ensured that the number of MCMC iterations after burn-in was at least 30,000. Mean effective sample size (across all parameters) was approximately 400, with a minimum effective sample size in excess of 200. A summary of the MCMC scheme and prior specification are given in the Appendix.
For SCE models with parametric representations for α and β terms (henceforth 'ABP' models), the parameter set Ω is {A 1 , A 2 , B 1 , B 2 consisting of 5n θ + 3n d + 2 parameters. In this case, we apply the adaptive Metropolis scheme of Roberts and Rosenthal (2009) directly on the full set of parameters. Uniform prior distributions on plausible domains are used for model parameters; details are given in the Appendix.
We further optionally restrict the space of feasible sets of SCE parameters using the conditional extremes model constraints of Keef et al. (2013). The constraints are motivated by the fact that conditional quantiles (of one variable given a large value of another) under any form of asymptotic independence (α < 1) cannot be larger than under asymptotic positive dependence (α = 1). Details of the constraints applied in a spatial setting are given in the supplementary material of Shooter et al. (2019), and summarised here in the Appendix.

North Sea storm peak significant wave height
We apply the SCE model to data for ocean storm severity from a spatial lattice (see Fig. 1) consisting of 150 locations in the North Sea. The sample corresponds to hindcast values of winter storm peak significant wave height from the NEXTRA hindcast (Oceanweather 2002). Storm intervals for a total of 1680 storms during the period 1 Oct 1964 to 31 Mar 1995 were isolated from up-and down-crossings of a sea state significant wave height threshold for a central conditioning location, Fig. 1 Map of the 150 locations in the North Sea ocean. The larger disc represents the conditioning location. The neighbourhood extends approximately ± 150 km West-East, and ± 300 km North-South relative to the conditioning location. UK and Norway land masses also shown using the procedure outlined in Ewans and Jonathan (2008). Storm peak significant wave height for each storm interval at each location provided a sample of 1680 × 150 observations for further analysis. For each storm-location combination, the direction (from which waves emanate, measured clockwise from North) at the time of the storm peak, referred to as the storm direction, was also retained. The spatial extremal characteristics of this sample have been examined previously in Ross et al. (2017a) and Shooter et al. (2019). At each location, marginal directional extreme value analysis of storm peak values was performed, and storm peak data transformed to standard Laplace marginal scale (see Ross et al. 2017b). The central location was adopted as the conditioning site, an SCE model is estimated to describe the joint characteristics of the remaining 149 locations given a large value at the conditioning site.
In the metocean community, extreme value analysis is typically conducted on "storm peak" events, taken to be temporally independent. Storm peak events are isolated from (temporally dependent) time-series of storm severity at a location using physical considerations. The fundamental motivation is that storm peak events are temporally sufficiently separated that the causative physical processes responsible for them are reasonably taken to be independent. Therefore the storm peak significant wave heights are assumed independent; storm peak events are typically separated by at least a day (see Ewans and Jonathan 2008). The characteristics of the ocean environment within storm (including the evolution of sea-state significant wave height, and the dependence between it and other oceanographic and structural loading variables of interest) are then inferred conditional on storm peak characteristics. In other applications, when a physical basis for declustering of time-series of extremes is not available, Ferro and Segers (2003) provides one possible approach to characterising that dependence. Covariates also potentially play a role in the selection of storm periods and storm peaks, and are critical to reasonable transformation of storm peak data to standard marginal scale. Scatter plots of Laplace-scale data for selected locations on West-East and South-North transects are given in Fig. 2. For locations within 50 kilometres (km) of the conditioning site, scatter plots suggest that values of α near unity are expected, decaying with increasing distance.

Results
For the sample of storm peak significant wave height data transformed to standard Laplace margins, the SCE model was estimated as follows.
Using a threshold u at the conditioning location with marginal non-exceedance probability of 0.95, joint posterior distributions of SCE model parameters were estimated based on both the ABN and ABP parameter representations. For the ABN model, a distance-direction lattice of n d × n θ = 7 × 6 was adopted to estimate the distance-direction evolution of α and β parameters, with n d nodes used to characterise the evolution of residual parameters μ, σ and δ. The Gaussian-scale residual correlation was estimated in terms of two parameters R 1 , R 2 . The ABP model is similarly parameterised, except that evolution of α and β with distance is described parametrically for each of n θ directions. Results illustrated here were obtained with the conditional quantile constraints of Keef et al. (2013) active. A discussion of the There is generally good agreement between estimates for the distance-direction behaviour of SCE model parameters using the ABN and ABP parameterisations. Both show decay of α with distance (Fig. 3), and that the rate of decay reduced with increasing distance. At distances between 150 km and 250 km, the parametric form for α from the ABP parameterisation suggests smaller values; this is accompanied by correspondingly larger values of μ (Fig. 5). There is little evidence for directional variation in the decay of α with distance. For α • , = 1, 2, 3, 4, the estimate reverts to the prior Unif(0, 1) at distances larger than 250km, since there are no remote locations present at these distance-direction combinations. Figure 4 suggests that the parametric form for β with distance is also generally appropriate. Estimates of β around 0.4 are prevalent for intermediate distances. There is some support for β approaching zero with decreasing distance. Estimates for β •2 (d) and β •5 (d) in particular appear somewhat different to the others. Estimates for σ and δ, and powered-exponential ρ with distance in Fig. 5 show good agreement between ABN and ABP parameterisations. That σ (d = 0) is greater than zero suggests it may be playing the role of a "nugget effect" at zero distance; the same might be true for β(d = 0). It is noteworthy that δ is near unity (corresponding to a standard Laplace distribution) for both small and large distances, increasing to around 1.5 at intermediate distances; that is, marginal residual distributions appear more Laplace-like (δ = 1) than Gaussian (δ = 2) for the current application. Since the pair of values (α • (d), β • (d)) = (1, 0), for = 1, . . . , 6, is not observed for any d = 0 then the data do not exhibit asymptotic dependence, and Pareto process models would be inappropriate here (Tawn et al. 2018). Achieved sample negative log-likelihoods for ABN and ABP parameterisations were similar relative to the variation in negative log-likelihood MCMC traces, with ABN providing a somewhat lower value. In terms of the deviance information criterion (DIC, Spiegelhalter et al. 2002), the ABP model provided somewhat lower values. A table of DIC values for different comparable models is given in the Appendix.
For practical purposes of simulation, it is useful to examine distance-direction variation of conditional means and standard deviations (4) for the ABN and ABP parameterisations. For a conditioning value x 0 at the conditioning site with nonexceedance probability 0.95, these are shown by distance and direction in Figs. 6 and 7. There is again good agreement between the two parameterisations. The narrow width of credible intervals for conditional means and standard deviations, for a given distance, using the ABP model reflects the constraining influence of the parametric forms assumed; this is also clear in Figs. 3-5 for the underlying model parameters. As previously mentioned, the ABP model has 53 degrees of freedom compared with 107 for the ABN model. Comparison of observed spatial trajectories with those generated by simulation under the fitted model is illustrated in Fig. 8. The figure shows conditional quantiles with non-exceedance probabilities as described in the figure caption, for conditioning values x 0 with non-exceedance probability ∈ (0.95, 0.96] at the conditioning site, for two specific transect directions, using the ABN model. The left hand transect emanates from the conditioning site to the North-East; the right hand transect emanates from the conditioning site to the North-West. The number of observed trajectories available for estimation of conditional quantiles is small (22 and 20 respectively); 1000 realisations were used to estimate simulated conditional quantiles. Agreement between observation and simulation is good. Figure 10 plots the observed trajectories together with 100 simulated trajectories for each transect. Visual inspection of similar plots generated using the ABP model indicated a similar level of agreement. Figure 9 provides scatter plots of the correlation of residuals at pairs of remote locations falling within different distance intervals with from the conditioning site. The x-coordinate in each panel gives the value of residual correlation, on standard Gaussian scale, calculated using the estimated parameters R 1 and R 2 , and Eq. 6. The y-coordinate shows the residual correlation estimated empirically by (a) calculating fit residuals on delta-Laplace scale using r DL ij = (x ij − m j )/s j (for ith observation x ij at location j , see Eq. 4), (b) transforming to standard Gaussian scale residuals r N ij using the probability integral transform and the appropriate estimated marginal delta-Laplace model (see Section 2.2) and (c) estimating the correlation of samples r N ij and r N ij for appropriate choices of j and j . Again, the correspondence between residual correlations is good for all distances from the conditioning site. The characteristics of estimated residual correlation structure are illustrated further in Fig. 11.

Sensitivity studies
The ABN analysis above was repeated for different conditioning threshold choices (corresponding to non-exceedance probabilities of 0.90, 0.95 and 0.975). Estimated characteristics of SCE parameters with distance and direction were very similar to those reported above. The main distinguishing features for the three thresholds examined is the anticipated increase in estimated parameter uncertainty with increasing threshold. The ABN and ABP models were also estimated without imposition of the conditional quantile constraints. In this case, the decay of α with distance was considerably slower, decreasing to approximately 0.7 at 200 km, than for the constrained analysis. However, the peak value of β at around 100 km to 200 km was larger, at around 0.7. However, these changes were compensated for by smaller estimates for μ very close for zero, and more gradual growth of σ with distance. Therefore overall, conditional means and standard deviations for both ABN and ABP models with and without constraints were found to be very similar. Removal of conditional quantile constraints reduces the negative log-likelihood for posterior median parameters, as might be expected. However, there was little difference in the performance of ABN and ABP models, with or without constraints, in terms of agreement with observed trajectories.
The ABN model was also used with each of a number of different conditioning sites. The general characteristics of parameter estimates with distance are similar to those reported above. Again, there was no evidence for systematic directional variation of α. Other models for the residual correlation were considered. We found that the conditional Gaussian correlation adopted above performed better, in terms of negative log-likelihood for models of similar complexity, and better trajectory simulation.
Choice of number n d of distance nodes and n θ of directional nodes was based on preliminary analyses, and physical considerations. For example, the minimum distance between locations on the lattice is approximately 30 km, but physically we do not expect large changes in extremal dependence over such short distances. The maximum distance between a remote location and conditioning site is just over 300 km. With n d = 7, an inter-distance node interval of 56 km provides a reasonably smooth description. Similarly, n θ = 6 corresponds to directional nodes at π/3 or 60 • intervals, sufficient from a physical perspective to characterise any directional variability. A non-directional (n θ = 1) ABN model was also estimated (with and without conditional quantile constraints) yielding values of negative log-likelihood and DIC considerably higher than for n θ = 6. Truncated zero-mean Gaussian priors for μ (on domain [−1, 1]) and σ (on domain (0, √ 2)) were also evaluated, for different choices of Gaussian standard deviation, for the ABN model; no appreciable influence on parameter estimates was observed for reasonable choices of standard deviation.

Discussion
In this paper we develop a two-dimensional spatial conditional extremes model incorporating distance-directional variation of model parameters, with parametric and non-parametric model parameter representations. The SCE model allow asymptotic dependence at short inter-location distances, leading to asymptotic independence, and eventually perfect independence, as distance increases. The SCE residual process is given delta-Laplace (generalised Gaussian) marginal distributions, and a (Gaussian-scale) conditional Gaussian correlation structure with powered exponential correlation function. The model is applied to estimate the conditional extremal dependence of a sample of hindcast storm peak significant wave height data for a lattice of 150 locations in the North Sea. The model fit is found to be reasonable with or without applied conditional quantile constraints, and is relatively insensitive to conditioning threshold choice, within the interval considered. Prior transformation of data to standard Laplace marginal scale is achieved using non-stationarity directional marginal extreme value analysis.
For the current analysis, an occurrence of a large extra-tropical cyclone would cover the entire spatial domain considered. We therefore expect to find some dependence of extreme values of significant wave height over the whole domain. The nature of that dependence is less clear a priori. The importance of allowing for different forms of extremal dependence with distance is illustrated in Fig. 2, for example, which shows that SCE parameter α decays from approximately unity to zero over approximately 300 km for North Sea storm peak significant wave height. The current analysis suggests that there is little directional variation of extremal dependence of storm peak significant wave height for the lattice considered. There is some evidence supporting relatively large values of β for direction π/3 and small values at 4π/3. There are a number of physical features of the North Sea environment which might explain such differences, including prevailing wind and storm directions, and bathymetric effects.
In the absence of conditional quantile constraints, the values of α parameter estimates are generally higher than when then constraints are applied. The values of parameter estimates for μ are near zero in the absence of conditional quantile constraints, but positive when constraints are applied. The net result of these competing effects is that conditional means and standard deviations are very similar for lower conditioning values x 0 (x 0 > u) at the conditioning site. Nevertheless, for conditioning values x 0 corresponding to large non-exceedance probabilities, we might expect that parameter estimates from unconstrained models would provide larger values for conditional means, since αx 0 for α ∈ [0, 1] increases more quickly with x 0 ∈ R >0 than μx β 0 , μ ∈ [0, 1] and β ∈ [0, 1). However, no obvious evidence for this difference was observed for the current sample.
One source of uncertainty not explored in the current work is the effect of uncertainty in marginal fitting of directional generalised Pareto models for the 150 locations. Transformation to standard Laplace marginals was performed using posterior median estimates. Exploratory analysis (and plots such as Fig. 2) suggest that marginal fitting is reasonable. It would be interesting however to propagate uncertainty from marginal fitting through to SCE estimation, or better to jointly estimate marginal and SCE models.
Our results suggest a number of potential avenues for further method development and application. From a physical perspective, we are keen to ensure that the SCE model provides an adequate representation of the conditional spatial structure of large ocean storms. For the sizes of spatial domains examined here (and in Shooter et al. 2019;Shooter 2020), this would appear to be the case. Current work is considering yet larger spatial domains. The approach would appear to be ideally suited for characterisation of spatial ocean surface roughness, e.g. as measured by satellite altimetry (Young and Ribal 2019).
The SCE model allows multiple different classes of extremal dependence. Our results indicate that α ≈ 1 occurs for small distances d, but that α < 1 otherwise. This suggests that a modelling strategy admitting both asymptotic dependence and asymptotic independence is essential to characterise large ocean storms. In particular, competitor approaches involving max-stable and Pareto process models, which admit only asymptotic dependence, are not appropriate. Further, the SCE model is computationally relatively efficient to estimate, and does not require the adoption of composite likelihoods for tractable inference. We believe that the model proposed in this work is a flexible and practical approach, with some obvious advantages to estimation of max-stable and Pareto processes for relatively large neighbourhoods of spatial locations.
Iteration to convergence Throughout, a candidate state is accepted using the standard Metropolis-Hastings acceptance criterion. Since prior distributions for parameters are uniform, and proposals symmetric, this is just a likelihood ratio. Candidates lying outside their prior domains, or violating the conditional quantile constraints if applied, are rejected. Under the ABP model (with parametric forms for α and β with distance only), the estimation scheme is simplified so that the full set of parameters is updated at the grouped adaptive stage. The resulting procedure is the same as the original scheme of Roberts and Rosenthal (2009). Uniform priors A 1 ∼ Unif(1, 20), A 2 ∼ Unif(0.1, 5), B 1 ∼ Unif(0.1, 1), B 2 ∼ Unif(1, 5) and B 3 ∼ Unif(0, 20), = 1, 2, . . . , n θ were applied. The prior distributions for all other parameters under the ABN and ABP models are the same.  Illustration of the estimated Gaussian-scale residual correlation function. Consider a pair of remote locations with common distance to conditioning site. Distance between remote locations is indicated by x-coordinate of black disc for each curve. Curve through disc indicates the residual correlation between remote locations as a function of distance to the conditioning site Figure 11 illustrates the fitted conditional Gaussian correlation function of Eq. 6. Consider two remote locations, equidistant from a third conditioning location. For each of the curves in the figure, the x-coordinate of the black disc indicates the distance between the remote locations. The curve passing through each disc gives the value of conditional residual correlation as a function of the (common) distance of the remote locations from the conditioning site. Thus, the leftmost curve pertains to two remote locations approximately 30 km apart. When the distance of these locations to the conditioning site is large, the remote locations have large residual correlation since conditioning has effectively no effect: the residual process is essentially unconstrained, and the value of correlation determined by the powered exponential ρ. However, as distance to the conditioning site is reduced, residual correlation decreases, since more sample variation is explained by the SCE model (as opposed to the SCE conditional residual process). The minimum value of distance to conditioning site occurs when the three locations are collinear, with the conditioning site mid-way between remote locations. For this arrangement, conditioning induces a small negative residual correlation between remote locations. As the distance between remote locations increases, for different curves left to right, the maximum conditional residual correlation is reduced.

Deviance and DIC
The Columns in the table are as follows: n θ is the number of directional parameters in the model, 'Constraints' indicates whether the conditional quantile constraints of Keef et al. (2013) were imposed, D(Ω) is the posterior mean deviance, sd(D) is its standard deviation and D(Ω) the deviance calculated using posterior mean parameters. In summary, there is evidence in favour of a directional model, since lower values of deviance and DIC are obtained for n θ = 6 compared with n θ = 1 both with and without application of conditional quantile constraints. DIC values for ABP are somewhat lower than for the corresponding ABN, but differences are small.