1 Introduction

The most recent probabilistic seismic hazard assessment (PSHA) for the North Sea was undertaken almost two decades ago by Bungum et al. (2000). Since the Bungum et al. (2000) PSHA was completed, seismicity related to hydrocarbon production (induced seismicity) in the North Sea has become of interest to the scientific and engineering communities, as well as the oil and gas companies active within the area (Ottemöller 2005). This interest increased after the occurrence of the 2001 Ekofisk local magnitude (ML) 4.2 earthquake, which generated surprisingly large ground motions given its moderate size (Ottemöller 2005). Because of the emerging problem of induced seismicity, the considerable advances made in PSHA since the Bungum et al. (2000) study and the large number of oil and gas installations present in the region and the increasing prevalence of offshore wind turbine facilities, it is timely to reassess the seismic hazard of the North Sea. The rationale for reassessment is further validated by the offshore seismic hazard maps within the current industry standards for offshore seismic design (API RP 2EQ 2014) being based on the results of a PSHA undertaken over two decades ago for onshore Norway (NORSAR and NGI 1998), rather than specifically for the North Sea region. There have been a number of recent PSHAs for locations surrounding the North Sea (e.g. Grünthal et al. 2018; Mosca et al. 2019; Tromans et al. 2018) but there is nothing in the public domain since the study of Bungum et al. (2000) for the North Sea itself.

A key step when reassessing the seismic hazard is the identification of the ground motion prediction equations (GMPEs) which best fit observed strong ground motions in an area. Because of a lack of observed ground motions from the North Sea, Bungum et al. (2000) only used expert judgement to select the most appropriate GMPEs for their PSHA due to a lack of observations from this area. Ground-motion records for this area are now publicly available. Therefore, we use residual analyses using the available data to determine the best-fitting GMPEs. The best-performing GMPE is used as a base model to which incremental improvements are implemented to develop a GMPE better suited for use in the North Sea. These improvements are made through relaxing the ergodic assumption (Anderson and Brune 1999) with respect to site and path in the area by adapting recently proposed techniques. These improvements are found to provide incremental but significant improvements in GMPE performance, which could further benefit from an expanded ground-motion dataset.

2 North Sea dataset preparation

A dataset of North Sea ground-motion records was extracted from the European Integrated Data Archive (EIDA, Orfeus-eu.org 2019), which comprises data from broadband seismometers. The preliminary dataset covered most of north-western Europe, and contained 38,562 ground-motion records from 773 earthquakes and 634 stations. A band-pass filter of 1–10 Hz was applied to remove noise from the ground-motion records in this preliminary dataset. The corner-frequencies (1 Hz, 10 Hz) of this filter were constrained by considering signal-to-noise ratios over several bandwidths. Because of the moderate magnitudes of the earthquakes considered here, the low frequency cut-off of 1 Hz should not be affecting significantly the spectral accelerations at periods of 1 s and 2 s as there is little long-period energy in the records.

All ground-motion records pertaining to earthquakes of ML < 2.5, source-to-site distances greater than 500 km or originating from outside of the North Sea were removed. Initial residual analysis was carried out using the Bragato and Slejko (2005) GMPE to remove erroneous records. Erroneous records were defined as those with residuals (in base 10 log) greater than 2 or less than − 2. This GMPE, developed for the Eastern Alps, was chosen because it was a simple model using a dataset with similar magnitude-distance range to the North Sea dataset. The final dataset comprised 120 ground-motion records from 50 earthquakes and 17 stations (Fig. 1; Fig. 2). The 120 records of the North Sea dataset were checked for noise contamination by visual inspection of observed versus predicted spectra, again using the Bragato and Slejko (2005) GMPE. Such noise contamination was found to be minimal. It is assumed that ML and moment magnitude Mw are equal for these records. A uniform value of the average shear-wave velocity of the top 30 m, Vs30, of 600 m/s was assumed for all stations (due to a lack of available subsurface geotechnical information for each site), corresponding to an upper 30 m comprising of tens of metres of very dense sand, gravel or very stiff clay, according to the Eurocode 8 earthquake design code (BS EN 1998 2004).

Fig. 1
figure 1

Map of earthquakes and stations in the North Sea dataset. For more details on these site clusters, please see Section 4

Fig. 2
figure 2

Magnitude-distance plot for the processed North Sea dataset. Each blue circle represents one ground motion record

The North Sea dataset contains few near-source (< 50 km source-to-site distance) records, because there are no public offshore seismometers in the North Sea. The dataset also lacks larger magnitude (ML > 4.0) ground-motion records because of the area’s moderate seismicity (Fig. 2). This lack of true “strong-motion” data increases the uncertainty associated with calibrating/testing GMPEs at higher ground-motion levels. Data from near-source seismometers (i.e. those installed on offshore platforms or from seafloor seismic monitoring networks) was sought from private companies; however, no such data were obtained.

3 North Sea GMPE testing

Following the preparation of the North Sea dataset, testing of how well the considered GMPEs fit the ground-motion data was undertaken using more thorough residual analyses. These analyses were undertaken using the Python/OpenQuake-based gmpe-smtk toolkit (Global Earthquake Model Foundation 2019). The residuals were computed in natural log units. The 16 GMPEs considered here are Abrahamson et al. (2014); Akkar and Bommer (2010); Akkar and Çağnan (2010); Akkar et al. (2014) using epicentral distance (Repi); Akkar et al. (2014) using hypocentral distance (Rhyp); Akkar et al. (2014) using the distance to the surface projection of the rupture (Rjb)Footnote 1; Boore and Atkinson (2008); Boore et al. (2014); Bindi et al. (2017) using Rjb; Bindi et al. (2017) using Rhyp; Campbell and Bozorgnia (2014); Chiou and Youngs (2014); Cauzzi et al. (2015); Rietbrock et al. (2013); Toro et al. (1997) and Toro (2002). These 16 GMPEs were chosen because they had been recently developed for use in Europe and/or were derived using datasets of similar magnitude-distance ranges to the North Sea dataset or they had been used in previous North Sea studies. Equality between RotD50Footnote 2 and geometric mean was assumed for the GMPEs developed for RotD50. This assumption is justified by the relations observed between these two horizontal-component ground motion intensity measures by Boore and Kishida (2017).

Residual analysis with the gmpe-smtk toolkit was undertaken for the following ground motion intensity measures: (1) peak ground acceleration (PGA), (2) spectral acceleration (SA) for periods of 0.1 s, 0.5 s and 1.0 s and (3) peak ground velocity (PGV). The fit of each model was determined through the linear trend in the residuals. The residuals of the dataset were evaluated with respect to magnitude and distance. The inter-event standard deviation (τ), intra-event standard deviation (Φ), bias (mean total residual) of each model and p values (i.e. the probability that, when the null hypothesis (here “no linear trend”) is true, would be greater than or equal to the observed results) with respect to both magnitude and distance were calculated. In this investigation a p value of less than 0.1 (rather than the more common 0.05) is considered as statistically significant, thus representing a trend of over- or under-predicting a ground motion intensity measure with respect to either magnitude or distance. The inter-event standard deviation is associated with event-specific factors, e.g. randomness in the source process. The intra-event standard deviation represents the variability associated with record-specific factors, e.g. site amplification or geometrical spreading, for the same event.

The dataset-derived intra-event and inter-event standard deviations of each evaluated GMPE are slightly higher than those predicted by the GMPEs. These elevated values can be attributed to (1) the GMPEs having been developed for regions other than the North Sea, (2) the small events and large distances mainly covered by the dataset [previous studies (e.g. Ambraseys et al. 2005) have found that ground motions from small events and/or large distances are more variable than those from larger events and/or short distances], (3) uncertainties in the magnitude and distance estimates in the North Sea dataset and (4) a lack of information on the near-surface site conditions at the seismometers.

3.1 PGA results

For PGA (Table 1), the best-fitting model was the Akkar et al. (2014) GMPE using the distance to the surface projection of the rupture (Rjb). This GMPE provides the lowest bias, inter-event and intra-event standard deviations and does not show statistically significant trends with respect to either magnitude or distance (Fig. 3; Fig. 4). This GMPE was derived from data mainly from the Mediterranean region (Italy, Greece and Turkey), and using data from earthquakes down to Mw 4 with a functional form that captured the magnitude- and distance-scaling of ground motions from moderate earthquakes. These reasons likely explain its superior performance over the other GMPEs considered.

Table 1 Parameters determined from residual analysis for each considered GMPE with regard to the North Sea data subset for PGA. All parameters are in terms of natural logarithms
Fig. 3
figure 3

PGA residual analysis of the North Sea data subset with respect to local magnitude, using the Akkar et al. (2014) Rjb GMPE. Each blue circle represents one inter-event residual

Fig. 4
figure 4

PGA Residual analysis of the North Sea data subset with respect to distance, using the Akkar et al. (2014) Rjb GMPE. Each blue circle represents one intra-event residual

The Toro et al. (1997) GMPE used by Bungum et al. (2000) was found to significantly over-predict PGA in the North Sea, with the residuals computed with this GMPE displaying significant statistical trends for both magnitude and distance. The Ambraseys et al. (1996) GMPE [the other GMPE used within Bungum et al. 2000’s logic tree] is not currently available within the gmpe-smtk toolkit and hence could not be considered.

The constant over-prediction by the Toro et al. (1997) GMPE is most likely the result of the GMPE being primarily applicable to (1) very hard (Vs30 > 2 km/s) rock sites of very low near-surface attenuation (kappa) and (2) calibration of this model to central and eastern North America. Anelastic seismic attenuation is probably greater throughout the North Sea than in central and eastern North America, as suggested by the observations of poor Lg wave propagation within the North Sea’s crust due to the presence of the Central and Viking Grabens in the region’s continental structure (Gregersen and Vaccari 1993). The considerable over-prediction by the Toro et al. (1997) GMPE for PGA is likely minimised by the equal contribution of the Ambraseys et al. (1996) GMPE within Bungum et al. (2000)’s logic tree. This minimising contribution by the Ambraseys et al. (1996) GMPE is indicated by the seismic hazard curve computed with Bungum et al. (2000)’s logic tree (Fig. 5) predicting lower annual frequencies of exceedance for a given level of PGA than the hazard curve computed with the Toro et al. (1997) GMPE. The over-prediction by the Toro et al. (1997) GMPE is a strong indication of the individual inadequacy of the Toro et al. (1997) GMPE for predicting ground-motion intensities in the North Sea region.

Fig. 5
figure 5

PGA seismic hazard curves comparing components of Bungum et al. (2000)’s GMPE logic tree with the Rjb variant of the Akkar et al. (2014) GMPE for a North Sea site of moderate hazard (easting = 353,300, northing = 7,100,467 for UTM zone 31 N). Each curve represents the hazard computed for one GMPE

3.2 SA results

The GMPEs that provide the best overall fits at the selected spectral periods are mostly the same group as those that provide the best fits for PGA. For SA (0.1 s), all three variants of the Akkar et al. (2014) GMPE provide good fits. Additionally, the Akkar and Çağnan (2010) and the Rietbrock et al. (2013) GMPEs provide good fits, despite only providing moderate fits when predicting PGA. For SA (0.5 s) none of the considered GMPEs provide good fits. However, the Abrahamson et al. (2014) GMPE, the Rietbrock et al. (2013) GMPE, the Cauzzi et al. (2015) GMPE and all variants of the Akkar et al. (2014) GMPE were found to provide moderate fits. For SA (1.0 s), all considered GMPEs provide poor fits bar the Abrahamson et al. (2014) GMPE, which provides a moderate fit. These poor fits are predominantly due to the significant trends with respect to magnitude.

The Toro et al. (1997) GMPE again significantly over-predicts ground shaking for a spectral period of 0.1 s, albeit moderately less so than it does for PGA, especially with regard to magnitude and distance dependency (i.e. larger p values are observed for PGA). At larger spectral periods (0.5 s and 1.0 s), the Toro et al. (1997) GMPE slightly to moderately under-predicts. However, the corresponding p values with respect to both magnitude and distance are significantly smaller than for PGA and SA (0.1 s) meaning the overall fit is also poor for Toro et al. (1997) at larger spectral periods.

3.3 PGV results

Only the Rietbrock et al. (2013) GMPE provides a good fit when predicting PGV, and only the Cauzzi et al. (2015) GMPE provides a moderate fit. The majority of the evaluated GMPEs show low to moderate bias. However, as with the residual analysis results for SA (1.0 s), many of the considered GMPEs show statistically significant trends with respect to magnitude, which is probably due to these GMPEs not having been calibrated for small-to-moderate earthquakes especially from considerable distances (> 100 km).

3.4 Comparative North Sea hazard calculations

Hazard curves were computed for an example North Sea location (Fig. 5) using the CRISIS seismic hazard software (Ordaz et al. 2015) and the Rjb variant of the Akkar et al. (2014) GMPE, which was determined to be the most appropriate GMPE for the North Sea from those investigated, as well as the same two GMPEs as used by Bungum et al. (2000). The hazard curves are largely representative of the GMPE testing results; the Rjb variant of the Akkar et al. (2014) GMPE (with the GMPE’s default standard deviations) predicts considerably lower frequencies of exceedance for higher levels of PGA compared to Bungum et al. (2000)’s logic tree approach.

Use of the larger database-derived inter-event and intra-event standard deviations for the Rjb variant of the Akkar et al. (2014) also has a moderate impact on the seismic hazard calculations for this example North Sea location; e.g. for an annual frequency of exceedance of 10−3, the predicted PGA is approximately 30% larger with the database-derived standard deviations than the GMPE’s reported standard deviations. In North Sea PSHA calculations with the Rjb variant of the Akkar et al. (2014), the model’s default standard deviations should be used for now until more work has been carried out.

Because of the sparsity of ground-motion records from moderate and large (ML > 5) earthquakes within the North Sea dataset, we also evaluate the performance of the Rjb variant of the Akkar et al. (2014) GMPE using a small dataset of ground-motion records that are not in EIDA from larger North Sea earthquakes (8 ground-motion records for 5 earthquakes of sizes ML 4.4, 4.4, 5.3, 5.7 and 6.1 recorded at distances of approximately 300–700 km), including the ML 5.3 Viking Graben event of 1927 and the ML 6.1 Dogger Bank event of 1931. This ground-motion data was collated from Bungum et al. (2003), the Norwegian Seismological Array (NORSAR) and the Norwegian National Seismic Network (NNSN). These additional data are of poorer quality with less reliable metadata, and consequently were not incorporated within the original residual analysis because they could potentially bias the results from the EIDA (North Sea) dataset. Using this independent dataset, the GMPE was found to provide a good fit overall for PGA, with a bias of 0.712 and no statistically significant trending with respect to either magnitude or distance being observed. This good fit with the independent dataset provides additional support for the selection of the Rjb variant of the Akkar et al. (2014) GMPE as the base model for the presented North Sea GMPE based on its performance using the North Sea dataset. The improvements proposed below will likely also further enhance the performance of the Rjb variant of the Akkar et al. (2014) GMPE for predicting the ground shaking resulting from larger earthquakes.

4 Constraining site effects in the North Sea

The GMPE testing concluded that no single GMPE provides an overall good fit to the North Sea dataset. However, the Rjb variant of the Akkar et al. (2014) GMPE was found to predict the observations well for PGA, SA (0.1 s) and SA (0.5 s), and therefore is considered to perform the best overall for the North Sea dataset. Consequently, for the remainder of the investigation, the Rjb variant of the Akkar et al. (2014) GMPE is treated as the base model to which improvements are made.

The intensities of ground shaking resulting from a single earthquake vary among sites due to local site conditions (e.g. Sanchez-Sesma 1987). Generally, if the subsurface comprises soil or soft rock, the ground-motions are amplified with respect to hard rock (e.g. Bowden and Tsai 2017). Site classification schemes enable simple yet effective adjustments to ground-motion predictions at many sites. These adjustments effectively result in the partial relaxation of the ergodic assumption with respect to site, subsequently leading to improved GMPE performance.

Significant challenges are associated with using such site classifications schemes, the most prominent of which being the requirement of an abundance of a priori information. To circumvent this dependency on a priori site information, Kotha et al. (2018) developed a data-driven method for site classification. This method uses a variety of statistical techniques to determine an optimal number of site classes using only the intra-event residuals of a selected GMPE as a priori information, and then known site characteristics (e.g. VS30the time-averaged shear-wave velocity through the top 30 m of the site soil profile) are used as a posteriori information to assign characteristic subsurface conditions to each class. The assignment of characteristic subsurface conditions to each class is not required if applying this approach to sites with ground-motion data. However, this final step is necessary to apply the method to sites with only geotechnical information available and no ground-motion data (as is generally the case at the sites of engineering infrastructure).

Considering the lack of reliable a priori information for each site in the North Sea dataset, the Kotha et al. (2018) methodology is ideal for constraining site effects in the North Sea, so as to improve the performance of the base GMPE. It should be noted that the sites within the North Sea dataset are exclusively onshore sites. Therefore, the site classes/amplifications determined using the Kotha et al. (2018) technique would likely require modification to be applicable to the offshore sites, where oil, gas and wind-turbine facilities are generally located in the North Sea.

4.1 Application of the Kotha et al. (2018) method to the North Sea dataset

The Kotha et al. (2018) site classification method incorporates principal component analysis (PCA) and k-means clustering (Joliffe 2013; Aggarwal 2014) to derive site classes, and is described below. This site classification method incorporates nomenclature defined by Al Atik et al. (2010). The relevant nomenclature is as follows: (1) δS2Ss—the average (intra-event) GMPE residual for one site, which represents the site-specific random effects, (2) ∆S2Ss—a vector comprising the δS2Ss scalar computed for each spectral period considered and (3) ϕS2Ss—the standard deviation of the site-to-site (intra-event) GMPE residuals.

PCA is used to reduce the multi-dimensionality of an ∆S2Ss vector dataset to produce principal component scores. The principal component scores represent the variability observed in the ∆S2Ss vector dataset. The first two principal component scores represent the bulk of the variability in a dataset for which PCA is performed (Aggarwal 2014). The plotting of these first two principal component scores is representative of the variability observed in the ∆S2Ss vector dataset in a two-dimensional space.

The representation of the bulk variability of the ∆S2Ss vector dataset in a two-dimensional space enables clustering of the ∆S2Ss vectors using the k-means algorithm (Fig. 6). The k-means clustering assigns each ∆S2Ss vector to a cluster comprised of similar ∆S2Ss vectors. The k-means clustering of S2Ss vectors therefore effectively groups sites with similar site-specific random effects over the range of spectral periods considered.

Fig. 6
figure 6

Clustered North Sea ∆S2Ss vectors following 1000 iterations of the k-means algorithm. Each circle represents a clustered data point. Each square represents a cluster centroid

For each cluster, the means of the ∆S2Ss vectors within that cluster are calculated to produce cluster-specific ∆S2Ss vectors. The ∆S2Ss vectors effectively act as amplification functions relative to one another, from which the response spectra computed for each site can be adjusted. For the ∆S2Ss vectors to act as amplification functions relative to one another, a reference cluster must be chosen. This reference cluster should ideally display small site response at all considered spectral periods. Conventionally, outcropping hard bedrock sites (Vs30 > 800 m/s) are used as reference sites, for which seismic hazard estimates are computed and then scaled using the appropriate site amplification function (Kotha et al. 2018).

Within this investigation, the North Sea ∆S2Ss vector dataset comprises of 17 ∆S2Ss vectors (one for each site). The North Sea δS2Ss scalars were computed for each spectral period considered (0 s, 0.1 s, 0.5 s, 1.0 s and 2.0 s)Footnote 3 with the base model GMPE and the North Sea dataset. The Kotha et al. (2018) methodology was subsequently applied to this North Sea ∆S2Ss vector dataset. PCA was performed to produce a two-dimensional representation of the bulk variability observed in the North Sea ∆S2Ss vector dataset. K-means clustering was then run for 1000 iterations, resulting in successful convergence with two clusters (Fig. 6). Cluster 1 contained 10 sites and cluster 2 contained 7 sites. Cluster 1 was chosen to be the reference cluster for the computation of the other cluster’s amplification function due to cluster 1’s residuals implying small site response at all considered spectral periods (Fig. 7). The intra-cluster standard deviations of δS2Ss for each spectral period considered (i.e. ϕS2Ss) were also computed (Fig. 8).

Fig. 7
figure 7

Mean North Sea ∆S2Ss vectors computed following 1000 iterations of the k-means algorithm. Error bars represent plus/minus one standard deviation. Feint lines represent pre-clustering ∆S2Ss vectors

Fig. 8
figure 8

Intra-cluster standard deviations of δS2Ss per period

Following the partitioning of the North Sea ∆S2Ss vector dataset into clusters, and the subsequent computation of cluster amplification functions, characteristic geotechnical conditions should be defined for each cluster. However, minimal information was available for the subsurface conditions of each site in the North Sea dataset. In the Kotha et al. (2018) study, an abundance of geotechnical information enabled rigorous assignment of characteristic site conditions for each cluster, including the computation of two-dimensional kernel distributions of the geotechnical site characteristics, from which representative ranges for each cluster could be identified. In comparison, here the only geotechnical site characteristic which could be reliably assigned (albeit again from the ground-motion data) is the spectral period corresponding to peak amplification, which for cluster 2 was determined to be 0.5 s, thereby matching the period determined by the analysis using ∆S2Ss. Because of the lack of geotechnical information, it can only be said that cluster 1 represents the reference cluster and that cluster 2 represents the amplified cluster.

Sites of interest in future PSHA, but not included in the clustering procedure can theoretically be assigned to a site class derived from the Kotha et al. (2018) method based on known geotechnical site characteristics, and the most appropriate cluster amplification function can be applied to response spectra computed for the site. However, Kotha et al. (2018) suggest that for their cluster site amplification functions to be applicable to new sites, additional site-response parameters should be developed and further geotechnical information for certain clustered sites must be obtained. Kotha et al. (2018) suggest such steps because using only the site-response parameters available during their study results in some clusters being indistinguishable (e.g. similar VS30 distributions are observed for significantly different site amplification functions). Considering the far smaller size of the dataset used in this study compared to the dataset used by Kotha et al. (2018), such recommendations are clearly necessary for the application of the cluster amplification functions computed in this investigation to new North Sea sites for which no ground-motions records are available.

4.2 Effect on aleatory variability

The k-means clustering produced two reasonably distinct amplification functions, with significant overlap only occurring for the North Sea ∆S2Ss vectors (from which the amplification functions are calculated) at a period of 2.0 s (Fig. 7), which can be attributed to the high intra-cluster standard deviation for δS2Ss values at a period of 2.0 s (Fig. 8). Consequently, it can be concluded that overall cluster 1 (the reference cluster) represents a more similar set of ∆S2Ss vectors than cluster 2 (the amplification cluster).

A significant benefit observed within the Kotha et al. (2018) study is also observed in this analysis: the intra-cluster site-to-site response variability for each considered period is overall significantly smaller than the pre-clustered overall site-to-site response (Fig. 8; Table 2)—this reduction is approximately 58% on average for cluster 1, and approximately 43% for cluster 2. The smallest reduction in intra-event standard deviation is observed at a period of 2.0 s for cluster 2 (~ 22%). This can be attributed to the poorer clustering observed in cluster 2 compared to cluster 1 (see Fig. 8). This large reduction can be attributed to (1) the relatively small size of the dataset and (2) the dataset comprising many small earthquakes [the greater variability in the ground shaking associated with smaller earthquakes (Ambraseys et al. 2005) results in a more significant variability reduction following clustering].

Table 2 Intra-event residual standard deviation following each adjustment to the base GMPE

Importantly, the site amplification functions partially relax the ergodic assumption with respect to site, and therefore improve the performance of the Rjb variant of the Akkar et al. (2014) GMPE in the North Sea. This improved performance is primarily indicated by small but mostly consistent reductions in the variability in the intra-event residuals when incorporating the computed cluster amplification functions (Fig. 8; Table 2). The largest reduction in δS2Ss is observed at a period of 0 s (~ 10%), although considerable reductions are still observed for periods of 0.1 s, 0.5 s and 1.0 s (~ 8%, ~ 7% and ~ 6% respectively). A smaller reduction in δS2Ss variability is observed at 2.0 s (~ 1%), which can be attributed to site effects having smaller impacts at longer periods (longer periods are generally more affected by source and path effects).

5 Constraining path effects in the North Sea

Following application of the cluster amplification functions to the intra-event residuals, scatter in the intra-event residuals now represents variations in the path effects for each ground-motion record. Path effects represent the energy attenuated as seismic waves propagate through the subsurface (Kennett 2009). Seismic attenuation occurs through three mechanisms: (1) geometric spreading, (2) anelastic attenuation and (3) scattering. Through the implementation of a geometric spreading term in the Akkar et al. (2014) GMPE, the first of these mechanisms is already accounted for. The Akkar et al. (2014) GMPE does not include anelastic attenuation or scattering terms because the authors of this model did not have sufficient far-field data to constrain them (although the geometric-spreading term of this model may partially include attenuation from these two sources). Each site-corrected intra-event residual is, therefore, representative of the apparent attenuation (anelastic attenuation plus scattering) for each ground-motion record.

Here, the intra-event residuals are analysed to relax the ergodic assumption with respect to path (and therefore improve GMPE performance) through determining repeated attenuation effects for the North Sea region. These effects are determined by a tomographic inversion based on the approach of Dawood and Rodriguez-Marek (2013) for Japan.

5.1 Approach for constraining path effects in the North Sea

Prior to the implementation of the Dawood and Rodriguez-Marek (2013) approach, the latitudes and longitudes of the earthquake epicentres and the stations were converted to Northings and Eastings relative to the corresponding Universal Transverse Mercator (UTM) zone(s). This conversion ensured the curvature of the earth was accounted for even in the two-dimensional visualisation of this tomographic problem. After this conversion, the following four stages were undertaken.

First, a grid of an appropriate resolution was imposed over the region for which repeated attenuation effects were to be determined. For each grid cell, an apparent attenuation rate was computed. Ideally, the resolution of the grid had to be small enough that each grid cell captured significant yet particular variations in apparent attenuation over the region, but large enough that a sufficient number of travel paths also passed through each grid cell. A higher number of passes for each grid cell result in a better constrained attenuation rate for the associated grid cell (Dawood and Rodriguez-Marek 2013). The grid resolution was set at roughly 250 km per grid cell, resulting in a 3 × 3 grid overlay (Fig. 9).

Fig. 9
figure 9

North Sea region with 3 × 3 grid overlay. Green lines represent travel paths. Bold letters denote each grid cell for which path effects are to be constrained for

Second, the distance traversed across each grid cell by each travel path was determined. The travel paths are represented as straight lines (Fig. 9), assuming that the seismic waves are travelling at constant velocities. This is a significant assumption but it was made to simplify the calculations.

Third, the apparent attenuation of each grid cell was solved for the considered spectral periods (0 s, 0.1 s, 0.5 s and 1.0 s) (e.g. Fig. 10). The computed attenuation rates are displayed in Table 4. The attenuation rates are solutions of a set of simultaneous linear equations. As within the Dawood and Rodriguez-Marek (2013) investigation, a replacement procedure was implemented for grid cells considered as poorly constrained due to a lack of passes. Within this investigation, any grid cell with fewer than five passes were considered as poorly constrained, and therefore the corresponding apparent attenuation rate for such grid cells was replaced with the average apparent attenuation of the other grid cells. This procedure was applied to one of the nine grid cells (grid cell Asee Fig. 9; Table 3).

Fig. 10
figure 10

Computed North Sea apparent attenuation rates (per km) for a spectral period of 0.5 s. Blue stars represent the earthquake epicentre locations. Red triangles represent the station locations. Black lines represent travel paths

Table 3 Number of travel path passes for each grid cell in Fig. 9. *Grid cell A is intercepted once by a marginal amount not clearly observable on the scale of Fig. 9

The cells’ attenuation rates, like the GMPE residuals, were computed in base natural log units. If the apparent attenuation rate for each grid cell, each expressed as an attenuation coefficient, is considered as a correction applied to each ground-motion record’s travel path (to reduce the associated site-corrected residual), then a negative attenuation rate represents a zone of high attenuation, and a positive grid cell attenuation rate a zone of low attenuation. Equation 1 demonstrates how these corrections are applied:

$$ {Z}_p={Z}_s-\sum \limits_{i-1}^n{\updelta}_{\mathrm{i}}{P}_{\mathrm{i}\mathrm{es}} $$
(1)

where Zp represents the path-corrected intra-event residual, Zs represents the site-corrected intra-event residual, Pies represents the distance (in km) through grid cell i for a straight line travel path of source e to site s, and δi represents the apparent attenuation per km per grid cell. Finally, (1) checkerboard inversion testing and (2) rerunning the tomographic inversion with a smaller dataset were undertaken to check whether the grid resolution was appropriate for the North Sea dataset.

Checkerboard testing comprises of a simple inversion, whereby a checkerboard pattern matrix of alternating high and low attenuation areas (relative to the outputted grid cell attenuation rates in the forward calculation) is input, and the forward model (the resolution of the overlain grid and the associated distance coefficients for each travel path) is used to invert the checkerboard matrix, resulting in the computation of synthetic site-corrected residuals. Perturbations in the form of random sampling from a Gaussian noise distribution (with a standard deviation equal to that observed within the checkerboard matrix) are then added to these synthetic residuals. The perturbed synthetic residuals are then input into the forward model to reconstruct the original checkerboard pattern. The percentage difference between each grid cell in the reconstructed and the inputted checkerboard matrices (Fig. 11) provides an indication of how well the attenuation rate of each grid cell has been constrained using the forward model. A large difference is indicative of a grid cell being poorly constrained, and a small difference is indicative of a grid cell having been well constrained.

Fig. 11
figure 11

Percentage difference between the reconstructed checkerboard matrix and the inputted checkerboard matrix. Blue stars represent the earthquake epicentre locations. Red triangles represent the station locations. Black lines represent travel paths

It should be noted that the checkerboard test has several weaknesses (Rawlinson et al. 2014; Lévěque et al. 1993): (1) adding perturbations in the form of Gaussian noise with a specified standard deviation is likely not representative of the noise in the observed dataset, (2) the use of an identical model for both the data inversion and the reconstruction will inherently result in a favourable reconstruction of the checkerboard matrix and (3) results can depend strongly on the inputted structure. These inherent weaknesses in the checkerboard test led to the choice to rerun the tomographic inversion with smaller datasets for further verification of the viability of the chosen grid resolution.

For this analysis, records were randomly removed from the original dataset and the attenuation rates were recomputed using these reduced datasets. As discussed in detail below, if reasonably similar results can be computed with smaller datasets (e.g. Fig. 12), this indicates that the computed grid cell attenuation rates are not extensively perturbed by moderate reductions in the size of the North Sea dataset, and therefore the grid overlay resolution is appropriate.

Fig. 12
figure 12

Mean attenuation rates for each grid cell, computed using different size North Sea datasets. These mean values were determined using the computed attenuation rates for each spectral period considered (0 s, 0.1 s, 0.5 s and 1.0 s). The standard deviations represent the variability in the computed attenuation rates for these spectral periods

5.2 Results for constraining path effects in the North Sea

The inclusion of the nine computed attenuation rates within the GMPE leads to small yet significant (all 2–5%) reductions in the standard deviations for the site-corrected intra-event residuals at all considered spectral periods (Fig. 13; Table 2). The checkerboard inversion test reconstructs the inputted checkerboard matrix well aside from for grid cell A (Fig. 11), suggesting the grid overlay resolution is appropriate for the travel path coverage provided by the North Sea dataset. The poor reconstruction of grid cell A can be attributed to the fact that only one travel path traverses this grid cell (Fig. 9; Table 3). To better constrain this grid cell (and the others), additional North Sea ground-motion records would be required, ideally from offshore areas.

Fig. 13
figure 13

Standard deviation reductions in the site-corrected intra-event residuals following implementation of the nine computed apparent attenuation rates

Rerunning the inversion with smaller datasets further validates the choice of grid resolution. The data was removed at semi-regular intervals from a latitude-ordered form of the dataset, so as to ensure a reasonably even geographical distribution of observed travel paths was maintained, whilst still reducing the number of passes for each grid cell to test the validity of the computed attenuation rates. Similar results are obtained down to 2/3 (66%) of the North Sea dataset (with respect to their respective means and standard deviations Fig. 12). Rerunning the inversion with half (50%) of the dataset leads to considerably different results for some grid cells, and even more so with only 1/3 (33%) of the dataset. The largest differences in the reduced datasets compared with the full and 66% dataset are observed for grid cells D, G and H, which is due to there being fewer (12, 6 and 13, respectively) travel paths traversing them (Fig. 9).

An additional indicator that the computed attenuation rates have been well constrained using this grid overlay resolution is the relatively minimal variation within the attenuation rate for each grid cell with respect to spectral period (Fig. 14). The attenuation rates are largest at periods of 0 s (PGA) or 0.1 s and lower at periods of 0.5 s or 1.0 s, which is expected from the theory of anelastic attenuation and scattering (e.g. Aki and Richards 2002).

Fig. 14
figure 14

Computed apparent attenuation rates for each grid cell over the considered spectral periods

The computed apparent attenuation rates were also compared briefly to what would be expected based on the major geological structures present in the North Sea. Generally, tectonically active areas like heavily faulted areas and sedimentary basins should act as zones of high attenuation, whereas tectonically stable regions like cratons, and igneous and metamorphic terrains should act as zones of low attenuation (Hearn et al. 2008). The subsurface of the northern North Sea is dominated by grabens and sedimentary basins (Fossen and Hurich 2005), and therefore is likely to overall be a high attenuation region, although with subareas of lower attenuation. Due to the coarse resolution of the grid overlay, the path effects resulting from larger-scale geological structures like cratonic areas will be better constrained than the influence of comparatively smaller-scale geological structures like individual grabens. Therefore, the expected attenuation behaviour within each grid cell is approximated based solely on the presence of basin or cratonic environments.

Grid cells A, B, D, E, G, H and I are dominated by basin environments, whereas grid cells C and F are primarily cratonic environments dominated by the igneous and metamorphic terrains of the Norwegian coastline (Fossen and Hurich 2005; Fig. 10). Therefore, grid cells A, B, D, E, G, H and I are expected to be zones of relatively high attenuation, whilst grid cells C and F are expected to be zones of relatively low attenuation.

The computed attenuation rate for grid cell A is not compared to its expected attenuation behaviour due to the replacement procedure for poorly constrained grid cells having been applied to this cell. For spectral periods of 0 s and 0.5 s, the computed attenuation rates for six of the eight remaining grid cells are as would be expected (Fig. 10); and for 0.1 s, five out of eight, and for 1.0 s, four out of eight grid cells, are as would be expected. Therefore, overall, the computed attenuation rates correlate reasonably well with what would be expected based on the geological environments present.

This comparison of the expected attenuation rates and the computed attenuation rates is, however, associated with much uncertainty due to (1) the grid overlay resolution and (2) distinguishing between high and low attenuation zones solely on the presence of basin or cratonic settings within each grid cell. The coarse resolution of the grid overlay results in the attenuation effects of smaller geological structures (e.g. fault-bounded basins) not being as well constrained as considerably larger-scale geological structures (e.g. basin and non-basin environments); however, these smaller-scale structures likely still contribute to the attenuation behaviour within each grid cell. If the grid overlay resolution was higher, the contribution of smaller-scale geological structures would be better constrained, and therefore the attenuation behaviour expected within each grid cell could be approximated based on both the presence of tectonic faults and basin or non-basin environments, reducing the uncertainty in this comparison of expected and computed attenuation rates. A higher grid overlay resolution requires a larger dataset, further validating the need for additional North Sea ground-motion records to improve upon this analysis.

5.3 Impact of path effect corrections on predicted seismic hazard

Following the computation of the North Sea path effects, a Monte Carlo approach PSHA program was modified to implement the attenuation rates as path corrections. A Monte Carlo program was chosen due to its mathematical simplicity making it easier to modify than a program using a conventional approach for PSHA (Musson 2000). The implementation of the path corrections into a Monte Carlo PSHA permits assessment of their impact on the predicted seismic hazard, and the use of the base GMPE with the path adjustments for forward modelling. An example site (Fig. 9) was chosen because a site situated within the grid of computed attenuation rates will display a larger correction than a site situated outside of the grid (if a greater proportion of each travel path traverses through each grid cell, the computed attenuation rate for each traversed grid cell is scaled by a larger distance, resulting in a larger path correction). The results for the example site are typical of path-corrected hazard (i.e. following the implementation the path corrections into the Monte Carlo PSHA).

The computed seismic hazard curves for each considered spectral period and the correction ratios (Fig. 15) indicate the path corrections have varied but significant impacts on the predicted ground-motion intensities for the example site. For PGA and SA (0.1 s), the path corrections consistently reduce the predicted ground-motion by approximately 20–30%. For SA (0.5 s), below an annual frequency of exceedance of approximately 10−3, the path corrections result in moderate reductions in the predicted ground-motion of up to 30%, whereas above this annual frequency of exceedance, the path corrections result in increases of up to 60%. For SA (1.0 s), the path corrections result in increases of approximately 20–100%. The larger corrections observed for SA (0.5 s) and SA (1.0 s) can be attributed to the larger attenuation rates computed for these spectral periods (Fig. 10; Table 4). The larger corrections observed for SA (0.5 s) and SA (1.0 s) at this example site are supported by the largest reductions in the standard deviation of the intra-event residuals being observed for these spectral periods (Fig. 13).

Fig. 15
figure 15

Path-corrected seismic hazard curves generated using the modified Monte Carlo PSHA program for an example site. The non-corrected predicted ground-motion was calculated using the base model GMPE with the standard deviations computed using the North Sea dataset. The corrected predicted ground-motion was calculated using the base GMPE with site- and path-corrected standard deviations. These standard deviations are listed in Table 2

Table 4 Computed apparent attenuation rate (per km) for each grid cell

6 Conclusions

In this study the Rjb variant of the Akkar et al. (2014) GMPE was found to perform moderately well for ground-motion predictions in the North Sea region. This GMPE was then modified in two ways to relax the ergodic assumption with respect to site and path in the North Sea region, resulting in incremental reductions in the intra-event residual variability (Table 2).

The ergodic assumption was relaxed with respect to site through constraining site effects by clustering analysis of the North Sea ∆S2Ss vector dataset. This cluster analysis can be considered as moderately successful due to the cluster amplification functions reducing the variability in site-response for onshore sites in the North Sea region (Fig. 8; Table 2). However, the lack of geotechnical information for the sites in the North Sea dataset prevented the evaluation of the clustering in terms of how similar the site conditions are for the sites within each cluster, resulting in simpler descriptions of the clusters as either representative of high or low amplification subsurface conditions. This lack of geotechnical information highlights the need for abundant post posteriori information to maximise the effectiveness of the Kotha et al. (2018) site classification method. If more detailed descriptions of the geotechnical conditions for each cluster’s sites were available, each cluster could also be assigned a more detailed (general) geotechnical description. Improved geotechnical descriptions for each cluster would enable onshore sites in the North Sea region which were not included in the original analysis, for which no ground-motions have been observed, but the site conditions are known, to be assigned the appropriate cluster amplification functions, and therefore be corrected for site effects in a future North Sea PSHA. Importantly, further work on the cluster functions is required for their application to offshore North Sea sites, where critical (in particular oil, gas and wind turbine) infrastructure is situated.

The ergodic assumption was relaxed with respect to path through the constraining of North Sea path effects. The constraining of North Sea path effects was achieved through a tomographic inversion analysis, and can also be regarded as moderately successful. This success is indicated by the small but significant and reasonably consistent reduction in the standard deviations of the intra-event GMPE residuals over each considered period (Fig. 13; Table 2). The validity of these reductions is largely supported by the two statistical tests carried out to determine the rigidity of the tomographic analysis results (Fig. 11; Fig. 12). The comparison of the computed attenuation rates to the expected attenuation rates for each grid cell also indicates the validity of the tomographic analysis (Fig. 10).

Path corrections were successfully implemented for forward modelling through the modification of a Monte Carlo PSHA program. The implemented path corrections were shown to have varied but reasonable impacts on the predicted ground-shaking intensities. For the computed site effects to be implemented for forward modelling, improved geotechnical descriptions are required for each computed cluster. This would permit a site not incorporated within the original analysis, but for which sufficient geotechnical information is available to be assigned the most appropriate cluster amplification function, and therefore for incident ground-motions to be corrected for site effects within the modified Monte Carlo PSHA program.

The validity of the GMPE modifications developed here is limited by the small size of the North Sea dataset, although the limitations of this dataset are well accounted for when constraining North Sea site and path effects through (1) the use of only two clusters when implementing the k-means clustering algorithm to compute North Sea site classes and (2) the use of a low resolution grid overlay in the tomographic inversion analysis. Consequently, this GMPE provides a promising first step for further work on reassessing seismic hazard in the North Sea.