Improving earthquake ground-motion predictions for the North Sea

Estimates of the expected ground motion are essential for the design, assessment and decommissioning of offshore critical infrastructure. The North Sea is an area of moderate seismic hazard that contains many high-value offshore structures (e.g. oil, gas and wind-turbine facilities). The most recent seismic hazard assessment for the North Sea is about 20 years old, before many innovations in ground-motion modelling were developed. In this study, firstly we investigate which ground-motion model from more than a dozen recent models is the most appropriate for this area based on a residual analysis of ground-motion data from onshore seismic stations surrounding the North Sea. The limited data that are available for this area and the poor magnitude and distance coverage are inherent weaknesses of this residual analysis. A recent model developed for Europe and the Middle East is the model that shows the lowest bias and minimal statistical trends with respect to magnitude and distance. Following this, we develop adjustments to this best-performing model to relax the ergodic assumption, i.e. to make the model more site- and path-specific thereby allowing a smaller aleatory variability (sigma) to be used within a probabilistic seismic hazard assessment. The use of this adjusted model within seismic hazard assessments for the North Sea should lead to better estimates of the expected ground motion for critical offshore infrastructure sites, although this would require the effects of the geotechnical properties of the seafloor to be accounted for.


Introduction
The most recent probabilistic seismic hazard assessment (PSHA) for the North Sea was undertaken almost two decades ago by Bungum et al. (2000). Since the Bungum et al. (2000) PSHA was completed, seismicity related to hydrocarbon production (induced seismicity) in the North Sea has become of interest to the scientific and engineering communities, as well as the oil and gas companies active within the area (Ottemöller 2005). This interest increased after the occurrence of the 2001 Ekofisk local magnitude (M L ) 4.2 earthquake, which generated surprisingly large ground motions given its moderate size (Ottemöller 2005). Because of the emerging problem of induced seismicity, the considerable advances made in PSHA since the Bungum et al. (2000) study and the large number of oil and gas installations present in the region and the increasing prevalence of offshore wind turbine facilities, it is timely to reassess the seismic hazard of the North Sea. The rationale for reassessment is further validated by the offshore seismic hazard maps within the current industry standards for offshore seismic design (API RP 2EQ 2014) being based on the results of a PSHA undertaken over two decades ago for onshore Norway (NORSAR and NGI 1998), rather than specifically for the North Sea region. There have been a number of recent PSHAs for locations surrounding the North Sea (e.g. Grünthal et al. 2018;Mosca et al. 2019;Tromans et al. 2018) but there is nothing in the public domain since the study of Bungum et al. (2000) for the North Sea itself.
A key step when reassessing the seismic hazard is the identification of the ground motion prediction equations (GMPEs) which best fit observed strong ground motions in an area. Because of a lack of observed ground motions from the North Sea, Bungum et al. (2000) only used expert judgement to select the most appropriate GMPEs for their PSHA due to a lack of observations from this area. Ground-motion records for this area are now publicly available. Therefore, we use residual analyses using the available data to determine the best-fitting GMPEs. The best-performing GMPE is used as a base model to which incremental improvements are implemented to develop a GMPE better suited for use in the North Sea. These improvements are made through relaxing the ergodic assumption (Anderson and Brune 1999) with respect to site and path in the area by adapting recently proposed techniques. These improvements are found to provide incremental but significant improvements in GMPE performance, which could further benefit from an expanded ground-motion dataset.

North Sea dataset preparation
A dataset of North Sea ground-motion records was extracted from the European Integrated Data Archive (EIDA, Orfeus-eu.org 2019), which comprises data from broadband seismometers. The preliminary dataset covered most of north-western Europe, and contained 38,562 ground-motion records from 773 earthquakes and 634 stations. A band-pass filter of 1-10 Hz was applied to remove noise from the ground-motion records in this preliminary dataset. The corner-frequencies (1 Hz, 10 Hz) of this filter were constrained by considering signal-to-noise ratios over several bandwidths. Because of the moderate magnitudes of the earthquakes considered here, the low frequency cut-off of 1 Hz should not be affecting significantly the spectral accelerations at periods of 1 s and 2 s as there is little long-period energy in the records.
All ground-motion records pertaining to earthquakes of M L < 2.5, source-to-site distances greater than 500 km or originating from outside of the North Sea were removed. Initial residual analysis was carried out using the Bragato and Slejko (2005) GMPE to remove erroneous records. Erroneous records were defined as those with residuals (in base 10 log) greater than 2 or less than − 2. This GMPE, developed for the Eastern Alps, was chosen because it was a simple model using a dataset with similar magnitude-distance range to the North Sea dataset. The final dataset comprised 120 groundmotion records from 50 earthquakes and 17 stations ( Fig. 1; Fig. 2). The 120 records of the North Sea dataset were checked for noise contamination by visual inspection of observed versus predicted spectra, again using the Bragato and Slejko (2005) GMPE. Such noise contamination was found to be minimal. It is assumed that M L and moment magnitude M w are equal for these records. A uniform value of the average shear-wave velocity of the top 30 m, V s30 , of 600 m/s was assumed for all stations (due to a lack of available subsurface geotechnical information for each site), corresponding to an upper 30 m comprising of tens of metres of very dense sand, gravel or very stiff clay, according to the Eurocode 8 earthquake design code (BS EN 19982004. The North Sea dataset contains few near-source (< 50 km source-to-site distance) records, because there are no public offshore seismometers in the North Sea. The dataset also lacks larger magnitude (M L > 4.0) ground-motion records because of the area's moderate seismicity (Fig. 2). This lack of true "strong-motion" data increases the uncertainty associated with calibrating/testing GMPEs at higher ground-motion levels. Data from near-source seismometers (i.e. those installed on offshore platforms or from seafloor seismic monitoring networks) was sought from private companies; however, no such data were obtained.
Following the preparation of the North Sea dataset, testing of how well the considered GMPEs fit the ground-motion data was undertaken using more thorough residual analyses. These analyses were undertaken using the Python/OpenQuake-based gmpe-smtk toolkit (Global Earthquake Model Foundation 2019). The residuals were computed in natural log units. The 16 GMPEs considered here are Abrahamson et al.  Akkar and Bommer (2010); Akkar and Çağnan (2010); Akkar et al. (2014) using epicentral distance (R epi ); Akkar et al. (2014) using hypocentral distance (R hyp ); Akkar et al. (2014) using the distance to the surface projection of the rupture (R jb ) 1 ; Boore and Atkinson (2008); Boore et al. (2014); Bindi et al. (2017) using R jb ; Bindi et al. (2017) using R hyp ; Campbell and Bozorgnia (2014); Chiou and Youngs (2014); Cauzzi et al. (2015); Rietbrock et al. (2013); Toro et al. (1997) and Toro (2002). These 16 GMPEs were chosen because they had been recently developed for use in Europe and/or were derived using datasets of similar magnitude-distance ranges to the North Sea dataset or they had been used in previous North Sea studies. Equality between RotD50 2 and geometric mean was assumed for the GMPEs developed for RotD50. This assumption is justified by the relations observed between these two horizontal-component ground motion intensity measures by Boore and Kishida (2017).
Residual analysis with the gmpe-smtk toolkit was undertaken for the following ground motion intensity measures: (1) peak ground acceleration (PGA), (2) spectral acceleration (SA) for periods of 0.1 s, 0.5 s and 1.0 s and (3) peak ground velocity (PGV). The fit of each model was determined through the linear trend in the residuals. The residuals of the dataset were evaluated with respect to magnitude and distance. The inter-event standard deviation (τ), intra-event standard deviation (Φ), bias (mean total residual) of each model and p values (i.e. the probability that, when the null hypothesis (here "no linear trend") is true, would be greater than or equal to the observed results) with respect to both magnitude and distance were calculated. In this investigation a p value of less than 0.1 (rather than the more common 0.05) is considered as statistically significant, thus representing a trend of over-or under-predicting a ground motion intensity measure with respect to either magnitude or distance. The inter-event standard deviation is associated with event-specific factors, e.g. randomness in the source process. The intra-event standard deviation represents the variability associated with recordspecific factors, e.g. site amplification or geometrical spreading, for the same event.
The dataset-derived intra-event and inter-event standard deviations of each evaluated GMPE are slightly higher than those predicted by the GMPEs. These elevated values can be attributed to (1) the GMPEs having been developed for regions other than the North Sea, (2) the small events and large distances mainly covered by the dataset [previous studies (e.g. Ambraseys et al. 2005) have found that ground motions from small events and/or large distances are more variable than those from larger events and/or short distances], (3) uncertainties in the magnitude and distance estimates in the North Sea dataset and (4) a lack of information on the near-surface site conditions at the seismometers.

PGA results
For PGA (Table 1), the best-fitting model was the Akkar et al. (2014) GMPE using the distance to the surface projection of the rupture (R jb ). This GMPE provides the lowest bias, inter-event and intra-event standard deviations and does not show statistically significant trends with respect to either magnitude or distance ( Fig. 3;  Fig. 4). This GMPE was derived from data mainly from the Mediterranean region (Italy, Greece and Turkey), and using data from earthquakes down to M w 4 with a functional form that captured the magnitude-and distance-scaling of ground motions from moderate earthquakes. These reasons likely explain its superior performance over the other GMPEs considered.
The Toro et al. (1997) GMPE used by Bungum et al. (2000) was found to significantly over-predict PGA in the North Sea, with the residuals computed with this GMPE displaying significant statistical trends for both magnitude and distance. The Ambraseys et al. (1996) GMPE [the other GMPE used within Bungum et al. 2000's logic tree] is not currently available within the gmpe-smtk toolkit and hence could not be considered.
The constant over-prediction by the Toro et al. (1997) GMPE is most likely the result of the GMPE being primarily applicable to (1) very hard (V s30 > 2 km/s) rock sites of very low near-surface attenuation (kappa) and (2) calibration of this model to central and eastern North America. Anelastic seismic attenuation is probably greater throughout the North Sea than in central and eastern North America, as suggested by the observations of poor Lg wave propagation within the North Sea's crust due to the presence of the Central and Viking Grabens in the region's continental structure (Gregersen and Vaccari 1993). The considerable over-prediction by the Toro et al. (1997) GMPE for PGA is likely minimised by the equal contribution of the Ambraseys et al. (1996) GMPE within Bungum et al. (2000's logic tree. This minimising contribution by the Ambraseys et al. (1996) GMPE is indicated by the seismic hazard curve computed with Bungum et al. "Good" means − 1.5 < bias < 1.5, p value (mag) > 0.01 and p value (dist) > 0.01; "Moderate" means − 2.5 < bias < 2.5, p value (mag) > 0.0001, p value (dist) > 0.0001; and "Poor" means bias < − 2.5, bias > 2.5, p value (mag) < 0.00001 or p value (dist) < 0.00001

SA results
The GMPEs that provide the best overall fits at the selected spectral periods are mostly the same group as those that provide the best fits for PGA. For SA ( The Toro et al. (1997) GMPE again significantly over-predicts ground shaking for a spectral period of 0.1 s, albeit moderately less so than it does for PGA, especially with regard to magnitude and distance dependency (i.e. larger p values are observed for PGA). At larger spectral periods (0.5 s and 1.0 s), the Toro et al. (1997) GMPE slightly to moderately under-predicts. However, the corresponding p values with respect to both magnitude and distance are significantly smaller than for PGA and SA (0.1 s) meaning the overall fit is also poor for Toro et al. (1997) at larger spectral periods.

PGV results
Only the Rietbrock et al. (2013) GMPE provides a good fit when predicting PGV, and only the Cauzzi et al. (2015) GMPE provides a moderate fit. The majority of the evaluated GMPEs show low to moderate bias. However, as with the residual analysis results for SA (1.0 s), many of the considered GMPEs show statistically significant trends with respect to magnitude, which is probably due to these GMPEs not having been calibrated for small-to-moderate earthquakes especially from considerable distances (> 100 km).

Comparative North Sea hazard calculations
Hazard curves were computed for an example North Sea location (Fig. 5) using the CRISIS seismic hazard software (Ordaz et al. 2015) and the R jb variant of the Akkar et al. (2014) GMPE, which was determined to be the most appropriate GMPE for the North Sea from those investigated, as well as the same two GMPEs as used by Bungum et al. (2000). The hazard curves are largely representative of the GMPE testing results; the R jb variant of the Akkar et al. (2014) GMPE (with the GMPE's default standard deviations) predicts considerably lower frequencies of exceedance for higher levels of PGA compared to Bungum et al. (2000)'s logic tree approach.
Use of the larger database-derived inter-event and intra-event standard deviations for the R jb variant of the Akkar et al. (2014) also has a moderate impact on the seismic hazard calculations for this example North Sea location; e.g. for an annual frequency of exceedance of 10 −3 , the predicted PGA is approximately 30% larger with the database-derived standard deviations than the GMPE's reported standard deviations. In North Sea PSHA calculations with the R jb variant of the Akkar et al. (2014), the model's default standard deviations should be used for now until more work has been carried out.
Because of the sparsity of ground-motion records from moderate and large (M L > 5) earthquakes within the North Sea dataset, we also evaluate the performance of the R jb variant of the Akkar et al. (2014) GMPE using a small dataset of ground-motion records that are not in EIDA from larger North Sea earthquakes (8 groundmotion records for 5 earthquakes of sizes M L 4.4, 4.4, 5.3, 5.7 and 6.1 recorded at distances of approximately 300-700 km), including the M L 5.3 Viking Graben event of 1927 and the M L 6.1 Dogger Bank event of 1931. This ground-motion data was collated from Bungum et al. (2003), the Norwegian Seismological Array (NORSAR) and the Norwegian National Seismic Network (NNSN). These additional data are of poorer quality with less reliable metadata, and consequently were not incorporated within the original residual analysis because they could potentially bias the results from the EIDA (North Sea) dataset. Using this independent dataset, the GMPE was found to provide a good fit overall for PGA, with a bias of 0.712 and no statistically significant trending with respect to either magnitude or distance being observed. This good fit with the independent dataset provides additional support for the selection of the R jb variant of the Akkar et al. (2014) GMPE as the base model for the presented North Sea GMPE based on its performance using the North Sea dataset. The improvements proposed below will likely also further enhance the performance of the R jb variant of the Akkar et al. (2014) GMPE for predicting the ground shaking resulting from larger earthquakes.

Constraining site effects in the North Sea
The GMPE testing concluded that no single GMPE provides an overall good fit to the North Sea dataset. However, the R jb variant of the Akkar et al. (2014) GMPE was found to predict the observations well for PGA, SA (0.1 s) and SA (0.5 s), and therefore is considered to perform the best overall for the North Sea dataset. Consequently, for the remainder of the investigation, the R jb variant of the Akkar et al. (2014) GMPE is treated as the base model to which improvements are made.
The intensities of ground shaking resulting from a single earthquake vary among sites due to local site conditions (e.g. Sanchez-Sesma 1987). Generally, if the subsurface comprises soil or soft rock, the groundmotions are amplified with respect to hard rock (e.g. Bowden and Tsai 2017). Site classification schemes enable simple yet effective adjustments to groundmotion predictions at many sites. These adjustments effectively result in the partial relaxation of the ergodic assumption with respect to site, subsequently leading to improved GMPE performance.
Significant challenges are associated with using such site classifications schemes, the most prominent of which being the requirement of an abundance of a priori information. To circumvent this dependency on a priori site information, Kotha et al. (2018) developed a datadriven method for site classification. This method uses a variety of statistical techniques to determine an optimal number of site classes using only the intra-event residuals of a selected GMPE as a priori information, and then known site characteristics (e.g. V S30 -the timeaveraged shear-wave velocity through the top 30 m of the site soil profile) are used as a posteriori information to assign characteristic subsurface conditions to each class. The assignment of characteristic subsurface conditions to each class is not required if applying this approach to sites with ground-motion data. However, this final step is necessary to apply the method to sites with only geotechnical information available and no ground-motion data (as is generally the case at the sites of engineering infrastructure).
Considering the lack of reliable a priori information for each site in the North Sea dataset, the Kotha et al. (2018) methodology is ideal for constraining site effects in the North Sea, so as to improve the performance of the base GMPE. It should be noted that the sites within the North Sea dataset are exclusively onshore sites. Therefore, the site classes/amplifications determined using the Kotha et al. (2018) technique would likely require modification to be applicable to the offshore sites, where oil, gas and wind-turbine facilities are generally located in the North Sea. (1) δS2S s -the average (intra-event) GMPE residual for one site, which represents the site-specific random effects, (2) ΔS2S s -a vector comprising the δS2S s scalar computed for each spectral period considered and (3) ϕS2S s -the standard deviation of the siteto-site (intra-event) GMPE residuals.
PCA is used to reduce the multi-dimensionality of an ΔS2S s vector dataset to produce principal component scores. The principal component scores represent the variability observed in the ΔS2S s vector dataset. The first two principal component scores represent the bulk of the variability in a dataset for which PCA is performed (Aggarwal 2014). The plotting of these first two principal component scores is representative of the variability observed in the ΔS2S s vector dataset in a twodimensional space.
The representation of the bulk variability of the ΔS2S s vector dataset in a two-dimensional space enables clustering of the ΔS2S s vectors using the k-means algorithm (Fig. 6). The k-means clustering assigns each ΔS2S s vector to a cluster comprised of similar ΔS2S s vectors. The k-means clustering of S2S s vectors therefore effectively groups sites with similar site-specific random effects over the range of spectral periods considered.
For each cluster, the means of the ΔS2S s vectors within that cluster are calculated to produce clusterspecific ΔS2S s vectors. The ΔS2S s vectors effectively act as amplification functions relative to one another, from which the response spectra computed for each site can be adjusted. For the ΔS2S s vectors to act as amplification functions relative to one another, a reference cluster must be chosen. This reference cluster should ideally display small site response at all considered spectral periods. Conventionally, outcropping hard bedrock sites (V s30 > 800 m/s) are used as reference sites, for which seismic hazard estimates are computed and then scaled using the appropriate site amplification function (Kotha et al. 2018).
Within this investigation, the North Sea ΔS2S s vector dataset comprises of 17 ΔS2S s vectors (one for each site). The North Sea δS2S s scalars were computed for each spectral period considered (0 s, 0.1 s, 0.5 s, 1.0 s and 2.0 s) 3 with the base model GMPE and the North Sea dataset. The Kotha et al. (2018) methodology was subsequently applied to this North Sea ΔS2S s vector dataset. PCA was performed to produce a twodimensional representation of the bulk variability observed in the North Sea ΔS2S s vector dataset. K-means clustering was then run for 1000 iterations, resulting in successful convergence with two clusters (Fig. 6). Cluster 1 contained 10 sites and cluster 2 contained 7 sites. Cluster 1 was chosen to be the reference cluster for the computation of the other cluster's amplification function due to cluster 1's residuals implying small site response at all considered spectral periods (Fig. 7). The intracluster standard deviations of δS2S s for each spectral period considered (i.e. ϕS2S s ) were also computed (Fig. 8).
Following the partitioning of the North Sea ΔS2S s vector dataset into clusters, and the subsequent computation of cluster amplification functions, characteristic geotechnical conditions should be defined for each cluster. However, minimal information was available for the subsurface conditions of each site in the North Sea dataset. In the Kotha et al. (2018) study, an abundance of geotechnical information enabled rigorous assignment of characteristic site conditions for each cluster, including the computation of two-dimensional kernel distributions of the geotechnical site characteristics, from which representative ranges for each cluster could be identified. In comparison, here the only geotechnical site characteristic which could be reliably assigned (albeit again from the ground-motion data) is the spectral period corresponding to peak amplification, which for cluster 2 was determined to be 0.5 s, thereby matching the period determined by the analysis using ΔS2S s . Because of the lack of geotechnical information, it can only be said that cluster 1 represents the reference cluster and that cluster 2 represents the amplified cluster.
Sites of interest in future PSHA, but not included in the clustering procedure can theoretically be assigned to a site class derived from the Kotha et al. (2018) method based on known geotechnical site characteristics, and the most appropriate cluster amplification function can be applied to response spectra computed for the site. However, Kotha et al. (2018) suggest that for their cluster site amplification functions to be applicable to new sites, additional site-response parameters should be developed and further geotechnical information for certain clustered sites must be obtained. Kotha et al. (2018) suggest such steps because using only the siteresponse parameters available during their study results in some clusters being indistinguishable (e.g. similar V S30 distributions are observed for significantly different site amplification functions). Considering the far smaller size of the dataset used in this study compared to the dataset used by Kotha et al. (2018), such recommendations are clearly necessary for the application of the cluster amplification functions computed in this investigation to new North Sea sites for which no groundmotions records are available.

Effect on aleatory variability
The k-means clustering produced two reasonably distinct amplification functions, with significant overlap only occurring for the North Sea ΔS2S s vectors (from which the amplification functions are calculated) at a period of 2.0 s (Fig. 7), which can be attributed to the high intra-cluster standard deviation for δS2S s values at a period of 2.0 s (Fig. 8). Consequently, it can be concluded that overall cluster 1 (the reference cluster) represents a more similar set of ΔS2S s vectors than cluster 2 (the amplification cluster).
A significant benefit observed within the Kotha et al. (2018) study is also observed in this analysis: the intracluster site-to-site response variability for each considered period is overall significantly smaller than the preclustered overall site-to-site response ( Fig. 8; Table 2)-this reduction is approximately 58% on average for cluster 1, and approximately 43% for cluster 2. The smallest reduction in intra-event standard deviation is observed at a period of 2.0 s for cluster 2 (~22%). This can be attributed to the poorer clustering observed in cluster 2 compared to cluster 1 (see Fig. 8). This large reduction can be attributed to (1) the relatively small size of the dataset and (2) the dataset comprising many small earthquakes [the greater variability in the ground shaking associated with smaller earthquakes (Ambraseys et al. 2005) results in a more significant variability reduction following clustering].
Importantly, the site amplification functions partially relax the ergodic assumption with respect to site, and  Table 2). The largest reduction in δS2S s is observed at a period of 0 s (~10%), although considerable reductions are still observed for periods of 0.1 s, 0.5 s and 1.0 s (~8%,~7% and~6% respectively). A smaller reduction in δS2S s variability is observed at 2.0 s (~1%), which can be attributed to site effects having smaller impacts at longer periods (longer periods are generally more affected by source and path effects).

Constraining path effects in the North Sea
Following application of the cluster amplification functions to the intra-event residuals, scatter in the intraevent residuals now represents variations in the path effects for each ground-motion record. Path effects represent the energy attenuated as seismic waves propagate through the subsurface (Kennett 2009). Seismic attenuation occurs through three mechanisms: (1) geometric spreading, (2) anelastic attenuation and (3) scattering. Through the implementation of a geometric spreading term in the Akkar et al. (2014) GMPE, the first of these mechanisms is already accounted for. The Akkar et al. (2014) GMPE does not include anelastic attenuation or scattering terms because the authors of this model did not have sufficient far-field data to constrain them (although the geometric-spreading term of this model may partially include attenuation from these two sources). Each site-corrected intra-event residual is, therefore, representative of the apparent attenuation (anelastic attenuation plus scattering) for each ground-motion record.  Here, the intra-event residuals are analysed to relax the ergodic assumption with respect to path (and therefore improve GMPE performance) through determining repeated attenuation effects for the North Sea region. These effects are determined by a tomographic inversion based on the approach of Dawood and Rodriguez-Marek (2013) for Japan.

Approach for constraining path effects in the North Sea
Prior to the implementation of the Dawood and Rodriguez-Marek (2013) approach, the latitudes and longitudes of the earthquake epicentres and the stations were converted to Northings and Eastings relative to the corresponding Universal Transverse Mercator (UTM) zone(s). This conversion ensured the curvature of the earth was accounted for even in the two-dimensional visualisation of this tomographic problem. After this conversion, the following four stages were undertaken.
First, a grid of an appropriate resolution was imposed over the region for which repeated attenuation effects were to be determined. For each grid cell, an apparent attenuation rate was computed. Ideally, the resolution of the grid had to be small enough that each grid cell captured significant yet particular variations in apparent attenuation over the region, but large enough that a sufficient number of travel paths also passed through each grid cell. A higher number of passes for each grid cell result in a better constrained attenuation rate for the associated grid cell (Dawood and Rodriguez-Marek 2013). The grid resolution was set at roughly 250 km per grid cell, resulting in a 3 × 3 grid overlay (Fig. 9).
Second, the distance traversed across each grid cell by each travel path was determined. The travel paths are represented as straight lines (Fig. 9), assuming that the seismic waves are travelling at constant velocities. This is a significant assumption but it was made to simplify the calculations.
Third, the apparent attenuation of each grid cell was solved for the considered spectral periods (0 s, 0.1 s, 0.5 s and 1.0 s) (e.g. Fig. 10). The computed attenuation rates are displayed in Table 4. The attenuation rates are solutions of a set of simultaneous linear equations. As within the Dawood and Rodriguez-Marek (2013) investigation, a replacement procedure was implemented for grid cells considered as poorly constrained due to a lack of passes. Within this investigation, any grid cell with fewer than five passes were considered as poorly constrained, and therefore the corresponding apparent attenuation rate for such grid cells was replaced with the average apparent attenuation of the other grid cells. This procedure was applied to one of the nine grid cells (grid cell A-see Fig. 9; Table 3).
The cells' attenuation rates, like the GMPE residuals, were computed in base natural log units. If the apparent attenuation rate for each grid cell, each expressed as an attenuation coefficient, is considered as a correction applied to each ground-motion record's travel path (to reduce the associated site-corrected residual), then a negative attenuation rate represents a zone of high attenuation, and a positive grid cell attenuation rate a zone of low attenuation. Equation 1 demonstrates how these corrections are applied: where Z p represents the path-corrected intra-event residual, Z s represents the site-corrected intra-event residual, P ies represents the distance (in km) through grid cell i for a straight line travel path of source e to site s, and δ i represents the apparent attenuation per km per grid cell. Finally, (1) checkerboard inversion testing and (2) rerunning the tomographic inversion with a smaller dataset were undertaken to check whether the grid resolution was appropriate for the North Sea dataset. Checkerboard testing comprises of a simple inversion, whereby a checkerboard pattern matrix of alternating high and low attenuation areas (relative to the outputted grid cell attenuation rates in the forward calculation) is input, and the forward model (the resolution of the overlain grid and the associated distance coefficients for each travel path) is used to invert the checkerboard matrix, resulting in the computation of synthetic sitecorrected residuals. Perturbations in the form of random sampling from a Gaussian noise distribution (with a standard deviation equal to that observed within the checkerboard matrix) are then added to these synthetic residuals. The perturbed synthetic residuals are then input into the forward model to reconstruct the original checkerboard pattern. The percentage difference between each grid cell in the reconstructed and the inputted checkerboard matrices (Fig. 11) provides an indication of how well the attenuation rate of each grid cell has been constrained using the forward model. A large difference is indicative of a grid cell being poorly constrained, and a small difference is indicative of a grid cell having been well constrained.
It should be noted that the checkerboard test has several weaknesses (Rawlinson et al. 2014;Lévěque et al. 1993): (1) adding perturbations in the form of Gaussian noise with a specified standard deviation is likely not representative of the noise in the observed dataset, (2) the use of an identical model for both the data inversion and the reconstruction will inherently result in a favourable reconstruction of the checkerboard matrix and (3) results can depend strongly on the inputted structure. These inherent weaknesses in the Fig. 9 North Sea region with 3 × 3 grid overlay. Green lines represent travel paths. Bold letters denote each grid cell for which path effects are to be constrained for Fig. 10 Computed North Sea apparent attenuation rates (per km) for a spectral period of 0.5 s. Blue stars represent the earthquake epicentre locations. Red triangles represent the station locations. Black lines represent travel paths checkerboard test led to the choice to rerun the tomographic inversion with smaller datasets for further verification of the viability of the chosen grid resolution.
For this analysis, records were randomly removed from the original dataset and the attenuation rates were recomputed using these reduced datasets. As discussed in detail below, if reasonably similar results can be computed with smaller datasets (e.g. Fig. 12), this indicates that the computed grid cell attenuation rates are not extensively perturbed by moderate reductions in the size of the North Sea dataset, and therefore the grid overlay resolution is appropriate.

Results for constraining path effects in the North Sea
The inclusion of the nine computed attenuation rates within the GMPE leads to small yet significant (all 2-5%) reductions in the standard deviations for the sitecorrected intra-event residuals at all considered spectral periods ( Fig. 13; Table 2). The checkerboard inversion test reconstructs the inputted checkerboard matrix well aside from for grid cell A (Fig. 11), suggesting the grid overlay resolution is appropriate for the travel path coverage provided by the North Sea dataset. The poor reconstruction of grid cell A can be attributed to the fact that only one travel path traverses this grid cell ( Fig. 9; Table 3). To better constrain this grid cell (and the others), additional North Sea ground-motion records would be required, ideally from offshore areas.
Rerunning the inversion with smaller datasets further validates the choice of grid resolution. The data was removed at semi-regular intervals from a latitudeordered form of the dataset, so as to ensure a reasonably even geographical distribution of observed travel paths was maintained, whilst still reducing the number of passes for each grid cell to test the validity of the computed attenuation rates. Similar results are obtained down to 2/3 (66%) of the North Sea dataset (with respect to their respective means and standard deviations Fig. 12). Rerunning the inversion with half (50%) of the dataset leads to considerably different results for some grid cells, and even more so with only 1/3 (33%) of the dataset. The largest differences in the reduced datasets compared with the full and 66% dataset are observed for grid cells D, G and H, which is due to there being fewer (12, 6 and 13, respectively) travel paths traversing them (Fig. 9). An additional indicator that the computed attenuation rates have been well constrained using this grid overlay resolution is the relatively minimal variation within the attenuation rate for each grid cell with respect to spectral period (Fig. 14). The attenuation rates are largest at periods of 0 s (PGA) or 0.1 s and lower at periods of 0.5 s or 1.0 s, which is expected from the theory of anelastic attenuation and scattering (e.g. Aki and Richards 2002). Fig. 12 Mean attenuation rates for each grid cell, computed using different size North Sea datasets. These mean values were determined using the computed attenuation rates for each spectral period considered (0 s, 0.1 s, 0.5 s and 1.0 s). The standard deviations represent the variability in the computed attenuation rates for these spectral periods Fig. 13 Standard deviation reductions in the site-corrected intra-event residuals following implementation of the nine computed apparent attenuation rates The computed apparent attenuation rates were also compared briefly to what would be expected based on the major geological structures present in the North Sea. Generally, tectonically active areas like heavily faulted areas and sedimentary basins should act as zones of high attenuation, whereas tectonically stable regions like cratons, and igneous and metamorphic terrains should act as zones of low attenuation (Hearn et al. 2008). The subsurface of the northern North Sea is dominated by grabens and sedimentary basins (Fossen and Hurich 2005), and therefore is likely to overall be a high attenuation region, although with subareas of lower attenuation. Due to the coarse resolution of the grid overlay, the path effects resulting from larger-scale geological structures like cratonic areas will be better constrained than the influence of comparatively smaller-scale geological structures like individual grabens. Therefore, the expected attenuation behaviour within each grid cell is approximated based solely on the presence of basin or cratonic environments.
Grid cells A, B, D, E, G, H and I are dominated by basin environments, whereas grid cells C and F are primarily cratonic environments dominated by the igneous and metamorphic terrains of the Norwegian coastline (Fossen and Hurich 2005; Fig. 10). Therefore, grid cells A, B, D, E, G, H and I are expected to be zones of relatively high attenuation, whilst grid cells C and F are expected to be zones of relatively low attenuation.
The computed attenuation rate for grid cell A is not compared to its expected attenuation behaviour due to the replacement procedure for poorly constrained grid cells having been applied to this cell. For spectral periods of 0 s and 0.5 s, the computed attenuation rates for six of the eight remaining grid cells are as would be expected (Fig. 10); and for 0.1 s, five out of eight, and for 1.0 s, four out of eight grid cells, are as would be expected. Therefore, overall, the computed attenuation rates correlate reasonably well with what would be expected based on the geological environments present.
This comparison of the expected attenuation rates and the computed attenuation rates is, however, associated with much uncertainty due to (1) the grid overlay resolution and (2) distinguishing between high and low attenuation zones solely on the presence of basin or cratonic settings within each grid cell. The coarse resolution of the grid overlay results in the attenuation effects of smaller geological structures (e.g. faultbounded basins) not being as well constrained as considerably larger-scale geological structures (e.g. basin and non-basin environments); however, these smallerscale structures likely still contribute to the attenuation behaviour within each grid cell. If the grid overlay resolution was higher, the contribution of smaller-scale Fig. 14 Computed apparent attenuation rates for each grid cell over the considered spectral periods geological structures would be better constrained, and therefore the attenuation behaviour expected within each grid cell could be approximated based on both the presence of tectonic faults and basin or non-basin environments, reducing the uncertainty in this comparison of expected and computed attenuation rates. A higher grid overlay resolution requires a larger dataset, further validating the need for additional North Sea ground-motion records to improve upon this analysis.

Impact of path effect corrections on predicted seismic hazard
Following the computation of the North Sea path effects, a Monte Carlo approach PSHA program was modified to implement the attenuation rates as path corrections. A Monte Carlo program was chosen due to its mathematical simplicity making it easier to modify than a program using a conventional approach for PSHA (Musson 2000). The implementation of the path corrections into a Monte Carlo PSHA permits assessment of their impact on the predicted seismic hazard, and the use of the base GMPE with the path adjustments for forward modelling. An example site (Fig. 9) was chosen because a site situated within the grid of computed attenuation rates will display a larger correction than a site situated outside of the grid (if a greater proportion of each travel path traverses through each grid cell, the computed attenuation rate for each traversed grid cell is scaled by a larger distance, resulting in a larger path correction). The results for the example site are typical of path-corrected hazard (i.e. following the implementation the path corrections into the Monte Carlo PSHA).
The computed seismic hazard curves for each considered spectral period and the correction ratios (Fig. 15) indicate the path corrections have varied but significant impacts on the predicted ground-motion intensities for the example site. For PGA and SA (0.1 s), the path corrections consistently reduce the predicted groundmotion by approximately 20-30%. For SA (0.5 s), below an annual frequency of exceedance of approximately 10 −3 , the path corrections result in moderate reductions in the predicted ground-motion of up to 30%, whereas above this annual frequency of exceedance, the path corrections result in increases of up to 60%. For SA (1.0 s), the path corrections result in increases of approximately 20-100%. The larger corrections observed for SA (0.5 s) and SA (1.0 s) can be attributed to the larger attenuation rates computed for these spectral periods ( Fig. 10; Table 4). The larger corrections observed for SA (0.5 s) and SA (1.0 s) at this example site are supported by the largest reductions in the standard deviation of the intra-event residuals being observed for these spectral periods (Fig. 13).

Conclusions
In this study the R jb variant of the Akkar et al. (2014) GMPE was found to perform moderately well for ground-motion predictions in the North Sea region. This GMPE was then modified in two ways to relax the ergodic assumption with respect to site and path in the North Sea region, resulting in incremental reductions in the intra-event residual variability ( Table 2).
The ergodic assumption was relaxed with respect to site through constraining site effects by clustering analysis of the North Sea ΔS2S s vector dataset. This cluster analysis can be considered as moderately successful due to the cluster amplification functions reducing the variability in site-response for onshore sites in the North Sea region ( Fig. 8; Table 2). However, the lack of geotechnical information for the sites in the North Sea dataset prevented the evaluation of the clustering in terms of how similar the site conditions are for the sites within each cluster, resulting in simpler descriptions of the clusters as either representative of high or low amplification subsurface conditions. This lack of geotechnical information highlights the need for abundant post posteriori information to maximise the effectiveness of the Kotha et al. (2018) site classification method. If more detailed descriptions of the geotechnical conditions for each cluster's sites were available, each cluster could also be assigned a more detailed (general) geotechnical description. Improved geotechnical descriptions for each cluster would enable onshore sites in the North Sea region which were not included in the original analysis, for which no ground-motions have been observed, but the site conditions are known, to be assigned the appropriate cluster amplification functions, and therefore be corrected for site effects in a future North Sea PSHA. Importantly, further work on the cluster functions is required for their application to offshore North Sea sites, where critical (in particular oil, gas and wind turbine) infrastructure is situated.
The ergodic assumption was relaxed with respect to path through the constraining of North Sea path effects.
The constraining of North Sea path effects was achieved through a tomographic inversion analysis, and can also be regarded as moderately successful. This success is indicated by the small but significant and reasonably consistent reduction in the standard deviations of the intra-event GMPE residuals over each considered period ( Fig. 13; Table 2). The validity of these reductions is largely supported by the two statistical tests carried out to determine the rigidity of the tomographic analysis results ( Fig. 11; Fig. 12). The comparison of the computed attenuation rates to the expected attenuation rates for each grid cell also indicates the validity of the tomographic analysis (Fig. 10).
Path corrections were successfully implemented for forward modelling through the modification of a Monte Carlo PSHA program. The implemented path corrections were shown to have varied but reasonable impacts on the predicted ground-shaking intensities. For the computed site effects to be implemented for forward modelling, improved geotechnical descriptions are required for each computed cluster. This would permit a site not incorporated within the original analysis, but for which sufficient geotechnical information is available to be assigned the most appropriate cluster amplification function, and therefore for incident ground-motions to be corrected for site effects within the modified Monte Carlo PSHA program.
The validity of the GMPE modifications developed here is limited by the small size of the North Sea dataset, although the limitations of this dataset are well accounted for when constraining North Sea site and path effects through (1) the use of only two clusters when Fig. 15 Path-corrected seismic hazard curves generated using the modified Monte Carlo PSHA program for an example site. The non-corrected predicted ground-motion was calculated using the base model GMPE with the standard deviations computed using the North Sea dataset. The corrected predicted ground-motion was calculated using the base GMPE with site-and path-corrected standard deviations. These standard deviations are listed in Table 2  implementing the k-means clustering algorithm to compute North Sea site classes and (2) the use of a low resolution grid overlay in the tomographic inversion analysis. Consequently, this GMPE provides a promising first step for further work on reassessing seismic hazard in the North Sea.