Methods for assessing the epistemic uncertainty captured in ground-motion models

A key task when developing a ground-motion model (GMM) is to demonstrate that it captures an appropriate level of epistemic uncertainty. This is true whether multiple ground motion prediction equations (GMPEs) are used or a backbone approach is followed. The GMM developed for a seismic hazard assessment for the site of a UK new-build nuclear power plant is used as an example to discuss complementary approaches to assess epistemic uncertainty. Firstly, trellis plots showing the various percentiles of the GMM are examined for relevant magnitudes, distances and structural periods to search for evidence of “pinching”, where the percentiles narrow excessively. Secondly, Sammon’s maps, including GMPEs that were excluded from the logic tree, are examined to check the spread of the GMPEs for relevant magnitudes and distances in a single plot. Thirdly, contour plots of the standard deviation of the logarithms of predicted ground motions from each branch of the logic tree (σµ) are compared with plots drawn for other relevant hazard studies. Fourthly, uncertainties implied by a backbone GMM derived using the hybrid stochastic-empirical method of Campbell (2003) are compared to those of the proposed multi-GMPE GMM. Finally, the spread of the percentiles of the hazard curves resulting from implementing the GMM is examined for different return periods to check whether any bands of lower uncertainty in ground-motion space result in bands of lower uncertainty in hazard space. These five approaches enabled a systematic assessment of the level of uncertainty captured by the proposed GMM.


Introduction
Probabilistic seismic hazard assessments (PSHAs) for sites of critical infrastructure (e.g. nuclear power plants) comprise two principal components. The first component is a seismic source model (SSM) capturing forecasts of future earthquakes (in terms of their locations, magnitudes and other characteristics) that could generate strong ground motions of engineering significance. The second component is a ground-motion model (GMM) providing estimates of the median and distribution of strong ground motions for each of the earthquakes in the SSM for intensity measures (IMs) that are useful for the design or assessment of the infrastructure. Both of these models are associated with considerable epistemic uncertainty because of a lack of knowledge and data (e.g. incomplete earthquake catalogues, a lack of local strong-motion data and sparse measurements of geotechnical and geophysical characteristics) from the region surrounding the site and from the site itself. The reader is referred to the recent landmark article on this topic by Bommer (2022).
It has often been shown (e.g. Bradley et al. 2012; Pecker et al. 2017; Porter et al. 2012) that the epistemic uncertainty in the GMM is one of the largest contributors, if not the largest, to the uncertainties in the final hazard results. Consequently, within seismic hazard assessments considerable effort is often devoted to measuring, comparing and justifying the levels of epistemic uncertainty captured in the GMM. The methods used to undertake these steps and the basis of these judgements are not commonly presented in international peer-reviewed journals but are often only contained within reports of commercial projects. The purpose of this article, therefore, is, firstly, to describe work undertaken within a recent PSHA for the site of a new-build nuclear power plant in the UK, Sizewell C (SZC), to assess the epistemic uncertainty in the GMM developed for this project and, secondly, to summarise the arguments used to judge the level of uncertainty captured.
The PSHA for the SZC site was conducted using an approach similar to that followed for the PSHA for the Hinkley Point C site and described by Aldama-Bustos et al. (2019) and Tromans et al. (2019). This approach, which aimed to satisfy UK nuclear regulatory requirements, emphasised: technical rigour, use of modern data and techniques, the use of external experts, logical treatment of uncertainties and review and oversight by means of a three-member external peer review team (PRT). One of the main focusses of the PRT was that the epistemic uncertainties at all stages of the PSHA were being sufficiently well captured in terms of the centre, body and range (CBR) of the technically defensible interpretations (USNRC 2018). In particular, the PRT expressed interest in whether "pinching" (i.e. when predictions from all, or a significant number of, alternative models in the logic tree converge to a single value, or narrow excessively, at a particular magnitude-distance-period scenario) is present in the GMM proposed for this project and what impact this has on the overall uncertainty in the hazard results. The work presented in this article was undertaken partly because of this request from the PRT.
In the next section the GMM developed using the multi-ground motion prediction equation (GMPE) approach to predict the median peak ground acceleration (PGA) and elastic response pseudo-spectral accelerations (PSAs), the principal IMs of interest for the client, is described. The following section presents an alternative GMM developed using the backbone approach, which was compared to the proposed GMM. Sections 4, 5 and 6 describe three approaches (trellis plots, Sammon's maps and σµ contour plots) used to investigate the epistemic uncertainty captured in the proposed GMM and compare it to uncertainties in GMMs from comparable projects. Section 7 shows how the ground-motion uncertainty transfers to uncertainty in the hazard results at the reference velocity horizon. This horizon was selected here rather than the target horizon (the final hazard output of the PSHA) as the latter includes effects of the site response and, hence, the epistemic uncertainty in the derivation of the site amplification factors would be included in the hazard results. The penultimate section discusses how these various pieces of evidence were used to judge the capturing of epistemic uncertainty within the GMM. The article ends with some brief conclusions.

SZC ground-motion model for median
The development of the proposed SZC GMM followed what is commonly referred to as the "traditional" or "multi-GMPE" approach. The selection of the suite of GMPEs considered in the GMM followed a rigorous and systematic approach in which candidate models were identified using the comprehensive compendium of published GMPEs by Douglas (2018a). The preliminary exclusion criteria were predominantly based on recommendations from Cotton et al. (2006) and Bommer et al. (2010), with consideration also given to selection criteria used in previous projects for nuclear facilities (e.g. Renault 2014). From this preliminary selection, 11 GMPEs were retained for further assessment. These GMPEs were assessed by comparing ground-motion predictions from each of the individual GMPEs for various earthquake scenarios in terms of distance and magnitude scaling and in terms of response spectra. Also, Sammon's maps were produced for selected IMs considering the magnitude and distance ranges of interest to the PSHA. Both the graphs of predicted spectral accelerations and the Sammon's maps helped to identify clusters of GMPEs in ground-motion space.
The assessment process was complemented by comparing predictions to various observations of ground shaking from a project-specific database that included ground-motion records from the UK, northern France, Belgium and western Germany. Comparisons against ground-shaking observations, although important and necessary, provided only limited, qualitative guidance for the selection of the most appropriate GMPEs for a site-specific PSHA in the UK. This was mainly because of the small number of records available, which are limited to those from events with magnitudes generally below M 5.3 recorded at considerable (> 100 km) source-to-site distances. Also, due to limitations of the instrumental data (e.g. a lack of detailed site information for most strong-motion stations), quantitative methods to assess the match between predictions from the GMPEs and observations, such as those proposed by Scherbaum et al. (2004a, 2009), were not applied. The difficulties of using data from small magnitude earthquakes for the evaluation of GMPEs are clearly shown in the analyses presented by Beauval et al. (2012).
The final step of the GMPE selection process consisted of an expert assessment of the preliminary selected GMPEs by the GMM team based on the comparisons previously described. The aim of this assessment was to reduce the number of candidate GMPEs to a more manageable figure, whilst ensuring that the range of predicted motions from the selected GMPEs was sufficient to account for the considerable epistemic uncertainty associated with ground-motion prediction in the UK. Other factors, such as whether the GMPEs could be adjusted for site-specific conditions or would allow scaling to correct for stress drop variation, were also taken into consideration, along with other technical and project-specific issues.
The compatibility of the ground-motion predictions from the selected suite of GMPEs was reviewed as the GMPEs may use different definitions of the predicted ground motion parameters and explanatory variables. The only compatibility issue identified was the consideration of alternative faulting mechanisms or "style of faulting". BETAL14, CETAL15 and CY14 all include a style-of-faulting term in their functional form; also, the criterion used to define style-of-faulting is consistent across these three models. However, RE19 and YA15 do not provide an explicit style-of-faulting term.
To address the style-of-faulting compatibility issue, reverse to strike-slip and normal to strike-slip adjustment factors (F RV/SS and F N/SS , respectively) were developed. Predictions from the YA15 model were adjusted using these adjustment factors, following the approach recommended by Bommer et al. (2003), in conjunction with the faulting composition observed in the NGA East flatfile (i.e. 68% reverse, 2% normal and 30% strike-slip). The style-of-faulting distribution for the UK has been estimated by Baptie (2010) as 22% reverse, 39% normal and 39% strike-slip. Predictions from the RE19 model were left unadjusted as any adjustments required were estimated to be small (< 1%), so any improvements in the predictions are likely to be offset by the introduction of additional epistemic uncertainty in the development of the adjustment factors.
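The arithmetic behind this kind of style-of-faulting adjustment can be sketched as follows. This is a minimal illustration, not the project implementation: it assumes that a GMPE without a style-of-faulting term predicts the log-average motion for the mechanism mix of its underlying dataset, and the adjustment factor values and the input prediction are hypothetical placeholders. Only the 68%/2%/30% mechanism mix is taken from the text.

```python
import math

# Mechanism proportions quoted in the text for the NGA East flatfile.
P_REVERSE = 0.68
P_NORMAL = 0.02

# Hypothetical adjustment factors, NOT the values developed for the project.
F_RV_SS = 1.20  # reverse to strike-slip ratio (assumed)
F_N_SS = 0.90   # normal to strike-slip ratio (assumed)


def strike_slip_equivalent(y, p_reverse, p_normal, f_rv_ss, f_n_ss):
    """Strike-slip prediction implied by a mechanism-averaged prediction y.

    Assumes ln(y) is the proportion-weighted mean of the log predictions
    for reverse, normal and strike-slip events.
    """
    ln_y_ss = (math.log(y)
               - p_reverse * math.log(f_rv_ss)
               - p_normal * math.log(f_n_ss))
    return math.exp(ln_y_ss)


y_generic = 0.10  # hypothetical median PGA in g from a GMPE without the term
y_ss = strike_slip_equivalent(y_generic, P_REVERSE, P_NORMAL, F_RV_SS, F_N_SS)
y_rv = y_ss * F_RV_SS  # prediction for reverse faulting
y_n = y_ss * F_N_SS    # prediction for normal faulting
```

Under these assumptions, recombining the three mechanism-specific predictions with the dataset proportions returns the original unadjusted prediction, which is a convenient consistency check.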
All selected GMPEs are generic (i.e. non-site specific) models based on data from regions other than the UK, with exception of RE19 which used UK data. Therefore, the GMPEs required adjustments to consider the site-specific conditions at the reference velocity horizon, which for this study was defined at 82 m depth, with a V S30 of 1,139 m/s. These adjustments were performed via the application of host-to-target adjustments (HTTAs) designed to account for differences in the shear-wave velocity profile, V S , and the factor characterising the high-frequency attenuation, kappa. HTTAs are also commonly known as V S -kappa adjustments.
The calculation of the HTTAs requires four inputs to be defined: the average V S profiles of the host (V S,host ) and target (V S,target ) locations, which allow the amplification due to V S impedance to be computed; and the average kappa in the host (kappa host ) and target (kappa target ) locations, which allow differences in the high-frequency attenuation to be modelled.
The V S,host profiles were assessed based on the information provided by the model developers for the stochastic models (RE19 and YA15). For the three empirical models (BETAL14, CETAL15 and CY14) the generic V S profiles of Cotton et al. (2006) for a V S30 value of 1,000 m/s were used. It should be noted that Boore (2016) developed a procedure to overcome limitations in the Cotton et al. (2006) generic V S profiles, which should be preferred for future studies. In addition, it is also recommended to consider the approach of Al Atik and Abrahamson (2021) to estimate V S,host profiles in future studies as such profiles are then fully-compatible with the GMPE. This approach had not been published at the time of our study. The impact of using the V S profiles from Boore (2016) or those derived using the method of Al Atik and Abrahamson (2021) compared with the profile from Cotton et al. (2006) is likely to be small as differences in kappa have a stronger impact on the HTTAs than differences in V S profiles for similar near-surface velocities.
The kappa host values were assessed using the inverse random vibration theory technique (Al ) and considering a single pair of magnitude and distance (M 5.5 and 24 km) that are representative of the earthquake scenario controlling the hazard in the high-frequency range. The kappa host values assessed for the five selected GMPEs for a V S30 value of 1,000 m/s were: BETAL14, 0.0405 s; CETAL15, 0.0281 s; CY14, 0.0332 s; RE19, 0.0222 s; and YA15, 0.0209 s. These values are comparable to other estimates published for the same or similar GMPEs, where available (Zandieh et al. 2016).
The V S,target profile was developed based on the Poggi et al. (2011) V S profile with a V S of 1,100 m/s (V S30 of 1,139 m/s) at 82 m depth and 3,410 m/s at 4 km depth, which provided the best fit to site-specific data at the reference velocity horizon and the generic UK model (Turbitt 1985) and other regional deep V S profiles (Davis et al. 2012;Ottemöller et al. 2009) at around 4 km.
The best estimate kappa target value for the site was thoroughly investigated for the project site using a combination of approaches including site-specific estimates from an "analogous" site (~ 70 km from the project site) with very similar geological conditions at the reference velocity horizon, estimates from other UK data/sites (Rietbrock and Edwards 2017;Villani et al. 2019a, b) and estimates based on V S30 -kappa correlations from global data (Cabas and Rodriguez-Marek 2017;Ktenidou et al. 2016;Van Houtte et al. 2011). Based on these various estimates of kappa, the following logic-tree branches were used to construct the HTTAs: lower target kappa of 0.01 s; middle target kappa of 0.02 s; and higher target kappa of 0.03 s.
The approach of Al  was extended to account for differences in V S profiles as followed by Bommer et al. (2015) and Rodriguez-Marek et al. (2014), and then was implemented for the estimation and application of the final HTTAs. The final HTTAs were developed for the three target kappa values stated above and all five GMPEs, leading to a suite of 15 adjustment factors.
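To give a feel for the size of the kappa part of these 15 adjustments, the sketch below evaluates a simplified Fourier-amplitude ratio, assuming the commonly used high-frequency diminution operator exp(-pi * kappa * f). This is not the inverse random vibration theory procedure used in the project, which works on response spectra, and it ignores the V S impedance term entirely. The frequencies chosen are arbitrary; the host and target kappa values are those quoted in the text.

```python
import math

# Host kappa values (s) quoted in the text for V S30 = 1,000 m/s.
HOST_KAPPA_S = {
    "BETAL14": 0.0405,
    "CETAL15": 0.0281,
    "CY14": 0.0332,
    "RE19": 0.0222,
    "YA15": 0.0209,
}

# Target kappa logic-tree branches (s) quoted in the text.
TARGET_KAPPA_S = {"lower": 0.01, "middle": 0.02, "higher": 0.03}

# Arbitrary illustrative frequencies (Hz).
FREQS_HZ = [1.0, 5.0, 10.0, 25.0]


def kappa_ratio(freq_hz, kappa_host_s, kappa_target_s):
    """Target/host Fourier amplitude ratio due to the kappa difference only."""
    return math.exp(-math.pi * freq_hz * (kappa_target_s - kappa_host_s))


# One list of ratios per (GMPE, target kappa branch) pair: 5 x 3 = 15 sets.
factors = {
    (gmpe, branch): [kappa_ratio(f, k_host, k_target) for f in FREQS_HZ]
    for gmpe, k_host in HOST_KAPPA_S.items()
    for branch, k_target in TARGET_KAPPA_S.items()
}
```

In this simplified view, a target kappa below the host kappa amplifies high frequencies (ratio above 1) and a target kappa above the host kappa reduces them, with the effect growing with frequency.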
Specifications for the SZC PSHA required that the GMM provide ground-motion predictions for periods up to 10 s. To account for unrealistic behaviour observed in the predicted spectral displacements (SD) above T D , which is the period at which spectral displacement amplitudes are expected to become constant, and to extrapolate to periods above the longest period for which the GMPEs provide coefficients, the GMM team developed an approach to adjust long-period ground-motion predictions for all GMPEs. This approach consisted of capping the SD for long periods at T D , where T D was calculated using Eq. 1 assuming a stress drop, ∆σ, of 80 bars (8 MPa). In the case where T D was longer than the longest period for which the GMPE provides coefficients, a linear extrapolation was performed in the PSA domain considering a log-log scale for periods below T D and capped at T D , in the SD domain, for longer periods. T D from Eq. 1 was constrained to be always above 1.0 s, where the predictions from the GMPEs are expected to be reliable and no adjustment is required.
The median ground-motion logic tree is presented in Fig. 1. This captures the uncertainty in the selection of the most appropriate GMPE for the purposes of seismic hazard assessment at the SZC site as well as in the kappa target . The weights assigned to the GMPEs were based on the assessments of each GMPE's merits and weaknesses and reflect the level of confidence in each particular GMPE to provide an appropriate prediction of the ground-motion amplitudes expected at the SZC site. Weights assigned to each kappa target branch reflect the GMM team's expectations, based on the assessment discussed above, that the "true" kappa value at the SZC site is more likely to be somewhere between 0.01 and 0.02 s (and closer to the latter) than above 0.02 s.
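The long-period capping of spectral displacement described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: SD is taken as PSA multiplied by (T / 2 pi) squared, so a constant SD beyond T D implies PSA decaying as 1 / T squared; T D is supplied by the caller rather than computed from Eq. 1; and T D is assumed to lie within the period range of the GMPE, so the extrapolation case is not covered. The function name and the example spectrum are hypothetical.

```python
import math

MIN_T_D_S = 1.0  # lower bound on T_D stated in the text


def cap_long_period_psa(periods_s, psa, t_d_s):
    """Return PSA with spectral displacement held constant beyond T_D.

    periods_s must be increasing and T_D must fall inside its range.
    """
    t_d_s = max(t_d_s, MIN_T_D_S)
    if not periods_s[0] <= t_d_s <= periods_s[-1]:
        raise ValueError("T_D outside the period range; extrapolation not handled")

    # PSA at T_D by linear interpolation on log-log axes.
    for i in range(len(periods_s) - 1):
        t0, t1 = periods_s[i], periods_s[i + 1]
        if t0 <= t_d_s <= t1:
            frac = (math.log(t_d_s) - math.log(t0)) / (math.log(t1) - math.log(t0))
            ln_psa = math.log(psa[i]) + frac * (math.log(psa[i + 1]) - math.log(psa[i]))
            psa_td = math.exp(ln_psa)
            break

    # Constant SD beyond T_D means PSA scales with (T_D / T) ** 2.
    return [a if t <= t_d_s else psa_td * (t_d_s / t) ** 2
            for t, a in zip(periods_s, psa)]


# Hypothetical spectrum (periods in s, PSA in g) with T_D = 2 s.
periods = [0.5, 1.0, 2.0, 4.0, 8.0]
spectrum = [0.20, 0.10, 0.05, 0.03, 0.02]
capped = cap_long_period_psa(periods, spectrum, 2.0)
```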

Comparison to a backbone model
An alternative, which has been increasingly used in site-specific PSHAs, to using multiple GMPEs with V S -kappa adjustments is the so-called "backbone" approach (e.g. Atkinson et al. 2014;Douglas 2018c). Therefore, one method that was used to judge the appropriateness of the epistemic uncertainty in the SZC GMM was to compare it to a UK backbone model that was developed specifically for this project. The backbone model presented in this section accounts for uncertainty in the median stress drop and the kappa at the bedrock horizon of the SZC site. It is noted that this backbone model is not recommended for use within PSHAs because it only considers uncertainties in two inputs to the stochastic model and an approximate stochastic model was used for the host-region empirical GMPE. Nevertheless, this backbone model is a useful point of reference to help assess whether the SZC multi-GMPE GMM is capturing the CBR of the expected ground motions at the SZC site.

Application of the hybrid-empirical method for the SZC site
In their recent article Bommer and Stafford (2020) recommend using the Chiou and Youngs (2014) GMPE as an empirical backbone GMPE for site-specific PSHAs. They propose branching out this backbone GMPE to capture uncertainty in the source, path and site conditions in the target region (in this case the region of the SZC site) using a variation on the hybrid empirical method (HEM) of Campbell (2003, 2004). The HEM adjusts an empirical GMPE [here, Chiou and Youngs (2014)] for the host region [here, western North America, WNA, i.e. the host region of Chiou and Youngs (2014)] by multiplying predictions from this GMPE for each magnitude and distance of interest by the ratio between predictions from stochastic ground-motion models for the target region (here, the SZC site) and the host region (here, WNA) for that magnitude and distance. The output of this calculation is an "empirical" GMM for the target region that accounts for effects (e.g. finite-fault behaviour) that cannot easily be accounted for by a stochastic model. Bommer and Stafford (2020) propose that, instead of having a single host-to-target adjustment factor incorporating the effects of all relevant parameters for which the GMPE is being adjusted (e.g., stress drop, focal depth, geometrical spreading, path quality factor, kappa and site V S ), as proposed in Campbell (2003, 2004), a better approach would be to make these adjustments individually to capture the uncertainty in each of the adjustments through a logic tree. This has the advantage of a more transparent mapping of the epistemic uncertainty. The HEM is useful for areas with little strong-motion data from which to derive fully empirical GMPEs as it provides a way to account, in a transparent manner, for uncertainties in the input parameters of the stochastic model for the target region.
This is achieved by using alternative values for the various input parameters to the stochastic model along with weights expressing "belief" in those values being appropriate for the target region. In this manner, multiple target stochastic models are generated for each magnitude and distance each with appropriate overall weights. The latter are produced by multiplying the weights of each input parameter (correlations between input parameters can be modelled by only allowing certain combinations, although in the implementation presented here all combinations of parameters are allowed).
The HEM was used by Campbell (2003, 2004) to develop GMPEs for central and eastern North America (CENA) by adjusting GMPEs derived for WNA by multiplication by the ratio between stochastic models for CENA and WNA. Douglas et al. (2006), as part of a study to derive sets of GMPEs for southern Spain and southern Norway, developed an open-source Fortran program, CHEEP, based on the SMSIM software of David M. Boore to undertake the HEM. Because it is generally not possible to find a stochastic model that provides a perfect match to a host-region empirical GMPE, CHEEP extends the HEM to account for uncertainties in the host stochastic model. In addition, CHEEP converts the distance metrics of the empirical GMPEs and makes other conversions of independent (e.g. style of faulting) and dependent parameters (e.g. horizontal component definitions). CHEEP was used to develop a preliminary backbone GMM for the SZC site. Chiou and Youngs (2014) was not originally included within CHEEP as this software was developed about a decade before the Chiou and Youngs (2014) GMPE was published. As part of the SZC project a new subroutine was included in CHEEP to evaluate the Chiou and Youngs (2014) GMPE.
The HEM requires stochastic model(s) for the host region that closely fit(s) the empirical GMPE. In the original formulation of Campbell (2003, 2004) a single model is used and in the extended formulation of Douglas et al. (2006) a set of such models is used. CHEEP includes sets of 100 stochastic models for each of the 10 host empirical GMPEs that were coded in the original version of this software. These stochastic models were taken from Scherbaum et al. (2006), who used a genetic algorithm to determine the 100 best-fitting models for each GMPE to account for the uncertainty in the inversion process. Recent studies that have built on the work of Scherbaum et al. (2006) to develop equivalent stochastic models include those by Zandieh et al. (2018) and Stafford et al. (2022).

Host stochastic model for Chiou and Youngs (2014)
CHEEP did not include Chiou and Youngs (2014) as one of its in-built GMPEs so stochastic models for this GMPE are also not available in the form required by this software. Because the software to repeat Scherbaum et al. (2006)'s inversion process is not available, the stochastic models provided in CHEEP for the WNA GMPEs of Abrahamson and Silva (1997) and Boore et al. (1997) were both used as proxies for the unavailable stochastic models for Chiou and Youngs (2014). By examining the match between the predictions from these stochastic models and the Chiou and Youngs (2014) GMPE over a wide range of magnitudes and distances it was confirmed that these stochastic models are suitable for the HEM. In addition, similar backbone GMMs were obtained by using both the stochastic models for Abrahamson and Silva (1997) and that for Boore et al. (1997). As the backbone GMM was only being used for comparisons and not for the actual PSHA, and because small misfits between empirical and stochastic host models have previously been shown to be relatively unimportant (Campbell 2003;Douglas et al. 2006), it was decided that the use of these approximate stochastic models was sufficient for our purposes.

Target stochastic model for SZC site
For the target-region stochastic models, Table 2 of Rietbrock and Edwards (2019) was used for: source spectral shape, source duration, median stress drop parameter (Δσ = 5, 10 and 20 MPa), geometric spreading, path attenuation Q (only the top layer of their layered structure was used because CHEEP cannot account for a layered Q structure; this simplification will have only a minor effect and mainly at large distances) and path duration [corrected for the two typographical errors that seem to be present in the publication of Rietbrock and Edwards (2019) for longer distances]. This part of the stochastic model accounts for uncertainty in the median stress drop for the UK. The site amplification and attenuation components of the target stochastic models are the SZC shear-wave velocity profile and the three values of kappa proposed for this site (0.01, 0.02 and 0.03 s), which are discussed above. In total, there are 3 × 3 = 9 separate target stochastic models accounting for uncertainty in Δσ and kappa.
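The core HEM operation and the construction of the nine weighted target branches can be sketched as follows. This is only an illustration of the bookkeeping, not of CHEEP: the branch weights are hypothetical placeholders (the project weights are not reproduced here), all combinations are allowed as in the implementation described above, and the function name and input values are assumptions.

```python
from itertools import product

# Stress drop (MPa) and kappa (s) values quoted in the text.
# The weights are HYPOTHETICAL placeholders chosen only to sum to one.
STRESS_DROP_WEIGHTS = {5.0: 0.25, 10.0: 0.5, 20.0: 0.25}
KAPPA_WEIGHTS = {0.01: 0.25, 0.02: 0.5, 0.03: 0.25}


def hybrid_empirical_median(y_empirical_host, y_stochastic_target, y_stochastic_host):
    """Scale a host-region empirical prediction by the target/host stochastic ratio."""
    return y_empirical_host * y_stochastic_target / y_stochastic_host


# Overall weight of each branch is the product of its parameter weights.
branches = {
    (stress_drop, kappa): w_stress * w_kappa
    for (stress_drop, w_stress), (kappa, w_kappa)
    in product(STRESS_DROP_WEIGHTS.items(), KAPPA_WEIGHTS.items())
}

# Hypothetical medians (g) for one magnitude-distance pair.
y_target = hybrid_empirical_median(0.10, 0.06, 0.03)
```

Correlations between input parameters could be represented by dropping selected keys from `branches` and renormalising the remaining weights.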

Resulting backbone GMM
Running CHEEP for the inputs presented above leads to many thousands of samples of the median predicted ground motions for the 9 combinations of stress drop and kappa for all magnitudes (4 to 7.5 in steps of 0.25) and distances (from 1 to 300 km in roughly logarithmic steps) considered. Because of the use of multiple stochastic models for the host region and the conversion of distance metrics (Scherbaum et al. 2004b), from rupture distance (for Chiou and Youngs 2014) to hypocentral distance (to evaluate the stochastic models) and then to distance to the surface projection of rupture (for comparison with the other GMPEs for SZC), 100 Monte Carlo samples were required to obtain stable estimates of the ground motions at each magnitude and distance.
To facilitate the use of the models, the samples for each magnitude and distance and given Δσ and kappa were averaged and then regression analysis conducted to obtain coefficients that can be used to evaluate the nine models for any magnitude and distance. The functional form assumed was that used by Campbell (2003) for his HEM for eastern North America [and also used by Douglas et al. (2006) for Norway] but modified to account for the distances at which the geometric spreading of Rietbrock and Edwards (2019) changes (50 and 100 km). A check confirmed that this step introduced only a very minimal error [similar findings were reported by Douglas et al. (2006)]. The resulting backbone GMM is presented in Fig. 2 with respect to distance for magnitude 5 and PGA. Results for other magnitudes and spectral periods show similar behaviour. The spread in the curves decreases as period increases because stress drop and kappa uncertainty in the target stochastic models only have a large influence at short structural periods.

Trellis plots
To assess whether the median predictions from the various GMPEs in the proposed SZC GMM narrow excessively at any particular magnitude-distance-period scenario, trellis plots were produced in terms of response spectra and distance scaling of the GMM predictions. These plots were drawn for percentiles between 0 and 100, in steps of 1%, for a number of selected IMs and for magnitudes between M 4.0 and 7.5, and distances between 0.1 and 300 km.
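The percentile curves in such trellis plots are weighted percentiles of the branch medians at each magnitude-distance-period point. A minimal sketch of that calculation is given below, assuming the percentile is defined as the smallest branch value whose cumulative logic-tree weight reaches the requested level (other interpolation conventions exist). The branch values, weights and the 16th-to-84th log spread used as a simple pinching indicator are illustrative assumptions, not project values.

```python
import math


def weighted_percentile(values, weights, pct):
    """Smallest value whose cumulative normalised weight reaches pct (0-100)."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight / total
        # Small tolerance guards against floating-point round-off.
        if cumulative >= pct / 100.0 - 1e-12:
            return value
    return pairs[-1][0]


def log_spread(values, weights, low=16, high=84):
    """Natural-log ratio of two percentiles; small values hint at pinching."""
    return math.log(weighted_percentile(values, weights, high)
                    / weighted_percentile(values, weights, low))


# Hypothetical branch medians (g) and weights for a single scenario.
branch_medians = [0.1, 0.4, 0.2, 0.3]
branch_weights = [0.25, 0.25, 0.25, 0.25]
curve = [weighted_percentile(branch_medians, branch_weights, p) for p in range(0, 101)]
```

Repeating this for every magnitude, distance and period on the plotting grid gives the 101 percentile curves, and plotting `log_spread` against distance or period makes any narrowing easier to locate.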
Median predictions of the Chiou and Youngs (2014) model, including the HTTA for the middle kappa (CY14-middle kappa), were also included in the response spectra and distance scaling plots. The main reason for this was to assess whether the SZC GMM is "well centred" based on the comparisons between the SZC GMM median predictions and the median predictions from a well-constrained model (i.e. CY14) adjusted to the local conditions at the SZC site. The CY14 model was selected for this comparison as it is suggested by Bommer and Stafford (2020) to be a good backbone model candidate. Distance scaling plots for PGA and PSA at 0.05, 0.2 and 1.0 s are presented in Fig. 3, while response spectra plots for M 4.0, 5.0, 6.0 and 7.0 are shown in Fig. 4.
In addition to the plots described above, distance attenuation and response spectra were produced comparing the 16th, 50th and 84th percentile predictions of the SZC GMM and the UK backbone model (see Sect. 3), for the same magnitudes, distances and periods previously described. These comparisons are presented in Fig. 5 in terms of distance scaling and in Fig. 6 in terms of response spectra.
From these comparisons it was concluded that, although bands of lower epistemic uncertainty (i.e. where the spread of the percentiles is reduced) were observed at specific periods and magnitude-distance ranges (e.g. the narrow spread of the percentiles observed in the response spectra for M 5.0 at 0.2 s in Fig. 4), these were not so narrow as to justify rejecting the GMM. These bands of lower uncertainty are more easily observed in the σµ plots presented in Sect. 6 and their effects on the uncertainty in the hazard space (i.e., at the reference velocity horizon) are discussed in Sect. 7.
Fig. 2 Ground-motion models derived for the SZC site using the HEM for PGA and magnitude 5 against R JB , distance to the surface projection of rupture. The blue curves are for 20 MPa, red for 10 MPa, black for 5 MPa, the upward triangles are for kappa = 0.01 s, the crosses for kappa = 0.02 s and the downward triangles for kappa = 0.03 s
SZC GMM median predictions are in good agreement with median predictions from the CY14-middle kappa model at distances shorter than ~ 80 km. At longer distances differences are driven mainly by the plateau on the attenuation of the ground motion at about 100 km, which is characteristic of stable continental regions (SCRs) and is not modelled by the CY14-middle kappa branch. In addition, there is generally a good agreement between the SZC GMM and UK backbone model predictions for the 16th, 50th and 84th percentiles. Again, differences at about 100 km are mainly due to the 50-100 km plateau assumed in the functional form of the target stochastic model, which is associated with Moho bounce effects and is characteristic of SCRs. It is questionable whether such effects would be visible for > M 6 earthquakes where a significant part of the crust is ruptured. The RE19 model (from where the target-region stochastic model was taken) assumes that such effects are seen at all magnitudes but this model was derived using only data from small earthquakes.
In summary, comparisons between median predictions from the SZC GMM and the CY14-middle kappa model, and between the SZC GMM and the UK backbone model, provided confidence that the SZC GMM is "well centred". Likewise, similarities in the distribution of the percentiles (16th, 50th and 84th) from the SZC GMM and the UK backbone model suggest that the "body" of the epistemic uncertainty is also well captured by the SZC GMM.

Sammon's maps
As discussed in Sect. 2, Sammon's maps were produced as part of the assessment and selection of the GMPEs comprising the SZC GMM. The Sammon's map approach is based on a well-established visualisation technique (Sammon 1969) that was first applied to the assessment of GMPEs by Frank Scherbaum and co-workers (Scherbaum and Kuehn 2011). This approach provides a two-dimensional map representing the magnitude and distance dependence of median predictions from GMPEs. This map allows the proximity of GMPEs to be seen more easily than would be possible in a series of graphs for different magnitudes and distances. The method has been implemented in various recent PSHA projects for nuclear facilities (e.g. PEGASOS Refinement Project, Thyspunt PSHA). Such assessment helps ensure that epistemic uncertainty is adequately captured, whilst avoiding involuntary biases in the ground-motion distribution implied by the GMM logic tree. Such biases result from the fact that available GMPEs are frequently not independent of each other, as they are developed from overlapping, and in some cases identical, databases.
Sammon's maps were produced considering all GMPEs in the preliminary selection, which were considered sufficient for the purposes of this study to identify clustering of models or outliers. Figure 7 presents Sammon's maps for PGA and PSA at 0.2 and 1.0 s. These maps were produced using median ground-motion predictions for magnitudes between M 4.0 and M 7.5 and distances between 10 and 300 km. All preliminary GMPEs not included in the final GMM are greyed out, while the edges of the selected unadjusted GMPEs (i.e., without HTTAs) are colour coded; asterisks correspond to the 15 branches of the SZC GMM logic tree (i.e., HTTA-adjusted GMPEs) and the SZC GMM (i.e., weighted median prediction of the GMM logic tree) is shown as a red square. Note that Fig. 7 also includes rejected variants of the same GMPE, when provided by the developers of the GMPE (e.g. RE19 provides coefficients for stress drops of 5, 10 and 20 MPa; all three alternatives are plotted but only RE19-5 MPa is colour coded).
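To illustrate the idea behind such maps, the following is a minimal sketch of Sammon's (1969) mapping, in which each GMPE is represented by a vector of logarithmic median predictions over a grid of magnitude-distance scenarios and a two-dimensional layout is sought that preserves the pairwise distances between these vectors. The Euclidean distance, the simple gradient descent with step halving, the function names and the four example vectors are assumptions made for illustration; they are not the distance metric, optimiser or models used in the SZC study.

```python
import math
import random


def pairwise(points):
    """Matrix of Euclidean distances between all pairs of points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def sammon_stress(d_high, d_low):
    """Sammon's stress between original and mapped distances.

    Assumes all original distances between distinct models are non-zero.
    """
    n = len(d_high)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    scale = sum(d_high[i][j] for i, j in pairs)
    error = sum((d_high[i][j] - d_low[i][j]) ** 2 / d_high[i][j] for i, j in pairs)
    return error / scale


def sammon_map(vectors, n_iter=200, step=0.5, seed=0):
    """Return a 2-D layout and its stress, using gradient descent with step halving."""
    rng = random.Random(seed)
    n = len(vectors)
    d_high = pairwise(vectors)
    scale = sum(d_high[i][j] for i in range(n) for j in range(i + 1, n))
    layout = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    stress = sammon_stress(d_high, pairwise(layout))
    for _ in range(n_iter):
        d_low = pairwise(layout)
        grad = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                dl = max(d_low[i][j], 1e-12)  # guard against coincident points
                coeff = -2.0 / scale * (d_high[i][j] - dl) / (d_high[i][j] * dl)
                for axis in range(2):
                    grad[i][axis] += coeff * (layout[i][axis] - layout[j][axis])
        trial = [[layout[i][a] - step * grad[i][a] for a in range(2)] for i in range(n)]
        trial_stress = sammon_stress(d_high, pairwise(trial))
        if trial_stress < stress:
            layout, stress = trial, trial_stress  # accept only improving steps
        else:
            step *= 0.5  # otherwise shorten the step and try again
    return layout, stress


# Hypothetical ln(median) predictions of four models for three scenarios
ln_medians = [
    [-2.3, -3.1, -4.0],
    [-2.1, -3.0, -4.2],
    [-2.6, -3.5, -4.1],
    [-2.0, -2.7, -3.6],
]
layout_2d, final_stress = sammon_map(ln_medians)
print(final_stress)
```

Models that plot close together in the resulting layout give similar median predictions over the scenarios considered, which is how clustering and outliers are identified visually.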
The spread of the models in the two-dimensional surface can be considered representative of the range of the epistemic uncertainty in the predicted ground motion captured by the models used in the analysis. This spread most likely does not represent the full range of the epistemic uncertainty, which by definition is unknown. However, if a relatively large number of models for relevant tectonic regimes are included in the analysis, the predictions from the set of models can collectively be expected to capture a range of epistemic uncertainty large enough to be considered a reasonable representation of the full range, although how close it comes to the full range cannot be known.
From Fig. 7 it is observed that the 15 branches of the SZC GMM logic tree comfortably cover the ground-motion space indicated by all preliminary selected models, while the median predictions of the SZC GMM tend towards the centre. It is also worth noting that no clustering of the 15 branches of the SZC GMM logic tree can be observed. This confirms that the SZC GMM spans a range of behaviours in ground-motion space and suggests that the GMM is sufficiently diverse to capture epistemic uncertainty, whilst avoiding model redundancy.
One way of judging if the level of epistemic uncertainty captured in the median ground-motion logic tree is appropriate is to compare it to the levels of uncertainty captured by other GMMs, particularly those developed for use in site-specific PSHAs for critical infrastructure. A standard measure of the epistemic uncertainty is the weighted standard deviation of the natural logarithms of the estimates of the median ground motion (σµ, where µ is the natural logarithm of the estimates of the median ground motion), where the weights correspond to the weights assigned to the alternative models in the logic tree. It could be argued that the use of σµ is not ideal when considering a multi-GMPE GMM, as this measure is not directly captured in the PSHA (unlike when using some backbone models). We believe, however, that it is still a useful tool for comparison with other GMMs in the absence of a better alternative.
Plots of σµ were produced for a number of selected periods, for the SZC GMM and the ground-motion models listed below, with the objective of assessing the level of epistemic uncertainty captured by the SZC GMM (i.e. the "range" of the epistemic uncertainty):
• Preliminary UK backbone model (see Sect. 3);
• NGA-W2 models (Al Atik and Youngs 2014);
• NGA-East models (Goulet et al. 2018);
• SERA model (Weatherill and Cotton 2020);
• Wylfa GMM (Villani et al. 2020);
• Thyspunt GMM (Bommer et al. 2015).
The preliminary UK backbone model was developed as part of this study (see Sect. 3) with the purpose of developing a model that explicitly captured uncertainty in the source and site conditions in the target region (in this case the SZC site) using the HEM. This is used as a reference point to help assess whether the SZC GMM adequately captures the CBR of the expected ground motions at the SZC site.
Al Atik and Youngs (2014) assess the level of epistemic uncertainty captured by the set of NGA-W2 models (i.e. model-to-model variability) and propose a minimum additional epistemic uncertainty to lead to a more appropriate spread. The minimum epistemic uncertainty suggested by Al Atik and Youngs (2014) for the NGA-W2 models captures the epistemic uncertainty in ground-motion estimation for active shallow crustal regions (ASCR), using the most complete and verified ground-motion database available worldwide. It could, therefore, be considered as a lower bound of epistemic uncertainty for GMMs in other regions (i.e. epistemic uncertainty in any other region can be expected to be higher). Bommer (2022) discusses three limitations of using the NGA-W2 models for this purpose: the uncertainty implied by these models tends to decrease as magnitudes increase (and data become sparser), the geographical region covered by the NGA-W2 models is much larger than many target regions (here, a small area of the UK), and over half of the VS30 values used to develop the NGA-W2 models are estimates rather than measured values. Despite these limitations, Boore et al. (2022) also use the NGA-W2 models to judge the epistemic uncertainties captured by their GMM.
Fig. 8 Comparison of σµ for PGA for the selected GMMs
The NGA-East GMPEs were developed specifically for CENA, using a range of alternative approaches for modelling ground motions in a region with scarce data (Goulet et al. 2018). The final suite of GMPEs was developed based on stochastic sampling using Sammon's maps of the GMPE space covered by the set of "seed" GMPEs developed by the various modelling teams. The area in ground-motion space covered by the resampled models was considered to be representative of the epistemic uncertainty. This area was then divided into 17 sectors and a representative non-parametric model (i.e. a set of magnitude-distance-frequency-lnY quadruplets, where lnY is the logarithmic ground-motion amplitude) selected for each sector. Weights for each of the 17 final models were then determined from comparisons of the model predictions with a subset of well-recorded data, considering both the fit to the data and the shape of the overall ground-motion distribution. The NGA-East models cover a very large study region where some of the parameters controlling ground-motion amplitudes (e.g. stress drop or crustal attenuation, Q) can be considered nonhomogeneous, with regional adjustments only partially capturing the associated epistemic uncertainty. Therefore, the σµ values based on the weighted model-to-model variance of the 17 models developed as part of the NGA-East project can be considered an upper bound of the modelling epistemic uncertainty (i.e. epistemic uncertainty in smaller regions with more homogeneous conditions can reasonably be expected to be lower).
Weatherill and Cotton (2020) developed a backbone model, referred to herein as the SERA model, for shallow seismicity in Europe using the Kotha et al. (2020) GMPE as its backbone. This model was developed as part of the Horizon 2020 Seismology and Earthquake Engineering Research Infrastructure Alliance for Europe (SERA) project.
The ground-motion models developed for the Wylfa (Villani et al. 2020) and Thyspunt (Bommer et al. 2015) PSHAs were considered for comparisons with the SZC GMM as examples of what the UK nuclear regulator refers to as "relevant good practice" for single-site PSHAs. The Wylfa GMM is a site-specific, single-station, multi-GMPE model and consists of seven alternative GMPEs. Three of the seven GMPEs in this model are the alternative variants of the model by Rietbrock and Edwards (2019). As the Rietbrock and Edwards (2019) GMPEs were developed specifically for the UK, the authors of the Wylfa PSHA only applied HTTA factors to the GMPEs from other regions.
As part of the Thyspunt PSHA, the authors of the study developed a site-specific backbone model using the models of Akkar and Çağnan (2010), Abrahamson and Silva (2008) and Chiou and Youngs (2008) as the backbone GMPEs. This backbone model considered adjustment factors for VS, kappa and stress drop parameters.
Plots of σµ were produced for a selected number of structural periods for each of the GMMs discussed above and the proposed SZC GMM. Figures 8, 9 and 10 show comparisons of the σµ estimates for all GMMs for PGA and PSA at 0.2 and 1.0 s, respectively, as examples. Note that the Thyspunt GMM is not plotted for magnitudes below 5.0 because that PSHA used a minimum magnitude of 5, but the same magnitude scale on the horizontal axis was kept for easy comparison with the other GMMs.
From examination of the σµ plots across all selected structural periods, an epistemic uncertainty of ~0.2 could be considered as an acceptable lower bound, based on the uncertainty captured by the NGA-W2 models. In a similar manner, based on the uncertainty captured by the NGA-East models, an upper bound for the epistemic uncertainty of ~0.4 could be defined for all structural periods, which could be increased up to ~0.5 for magnitudes higher than M 7.0. The uncertainty captured by the SZC GMM was generally above the lower bound of 0.2, with the exception of structural periods above 1.0 s at very short distances (< 10 km) and a narrow band of magnitudes and distances (~M 4.5 at 8 km) for periods around 0.5 s. However, this was considered not to affect significantly the epistemic uncertainty in hazard space at the reference velocity horizon, as discussed in the following section.
At the shorter structural periods, epistemic uncertainty in the SZC GMM was equivalent to the uncertainty in the UK backbone model and comparable with other models, including the NGA-East models. At intermediate periods (0.2-0.5 s) zones of lower epistemic uncertainty were observed in the SZC GMM. However, this seems to be a characteristic consistent across most of the selected GMMs (see NGA-W2, Thyspunt and Wylfa in Fig. 9), the only exception being the two backbone models that are based on a single GMPE (i.e. UK backbone and SERA). The presence of this feature in the total epistemic uncertainty recommended by Al Atik and Youngs (2014) for the NGA-W2 may suggest that these zones of lower uncertainty represent a real behaviour of the distribution of the epistemic uncertainty in the magnitude-distance space and across periods. This could be as a result of the presence of a larger number of good quality records for that range of period-magnitude-distance in the various databases used to derive the GMPEs, or simply that the variability at intermediate periods from variations in stress drop, geometric spreading, site amplification and other characteristics is intrinsically lower. If these zones of lower uncertainty are a real feature of the epistemic uncertainty across response periods, it could be suggested that backbone models based on a single GMPE and constant scaling may fail to capture such features and overestimate the epistemic uncertainty at intermediate periods and at magnitude-distance scenarios that are likely to be relevant to the seismic hazard.
Fig. 10 Comparison of σµ for PSA at 1.0 s for the selected GMMs
At long periods (≥ 1.0 s) the epistemic uncertainty captured by the SZC GMM is generally higher than in other models and is in good agreement with the NGA-East model. The exception is the Wylfa GMM, where epistemic uncertainty ranges between 0.5 and 1.0 for the range of magnitudes and distances expected to control the hazard. This is likely to be driven by unrealistic predictions at long periods from the RE19 models, particularly for the 10 and 20 MPa variants of this GMPE.
It is also worth noting the lack of magnitude dependency of the epistemic uncertainty in the SERA backbone and the very weak, almost null, distance dependency of the epistemic uncertainty in the UK backbone for periods above 0.2 s (at shorter periods the uncertainty is effectively uniform across all magnitudes and distances). This lack of magnitude or distance dependency could be argued to be an unrealistic distribution of the epistemic uncertainty as it is in disagreement with the epistemic uncertainty observed from physics-based models (Frankel 2018) where the epistemic uncertainty is magnitude and distance dependent. It should be noted that these comments reflect these specific applications of the backbone approach rather than a general criticism of the method.

Effect on hazard
To understand how the epistemic uncertainty in the SZC GMM propagates to the hazard estimates at the reference velocity horizon, the Toro (2006) equation, as re-arranged by Douglas (2018b), was used to estimate σµ for the range of return periods and response periods relevant to the SZC PSHA (Fig. 11). These estimates were obtained using the hazard curves from the final hazard calculations of the SZC PSHA. This is especially important given the zones of lower epistemic uncertainty discussed in previous sections.
The re-arranged Toro (2006) equation uses the mean and median hazard curves from the hazard calculations to estimate σµ. There is currently no approach to split the epistemic uncertainty into the SSM and GMM parts from the hazard results (which account for the combined uncertainties in both the SSM and the GMM), which limits the insights that can be reached from these estimates. σµ estimates obtained using the re-arranged Toro (2006) equation have, however, been observed to be a good approximation to the true uncertainty in the GMM (Douglas 2018b; Douglas et al. 2014).
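The following is a minimal sketch of how an estimate of this kind might be computed. It assumes that the hazard curve is locally a power law, λ(a) ∝ a^(-k), and that epistemic uncertainty acts as a normally distributed shift of the logarithmic ground motion with standard deviation σµ. Under these assumptions the ratio of mean to median hazard is exp(k²σµ²/2), so that σµ = sqrt(2 ln(mean/median))/k. This specific form, the use of the median curve to estimate the slope k, the function names and the synthetic curves are assumptions made for illustration and are not taken from the text.

```python
import math


def local_slope(amplitudes, rates, i):
    """Negative log-log slope k of a hazard curve around index i."""
    lo = max(i - 1, 0)
    hi = min(i + 1, len(amplitudes) - 1)
    d_ln_rate = math.log(rates[hi]) - math.log(rates[lo])
    d_ln_amp = math.log(amplitudes[hi]) - math.log(amplitudes[lo])
    return -d_ln_rate / d_ln_amp


def sigma_mu_from_hazard(mean_rate, median_rate, k):
    """Estimate sigma_mu from mean and median exceedance rates at one amplitude.

    Assumes a locally power-law hazard curve with slope k and a lognormal
    distribution of the exceedance rate about its median.
    """
    ratio = mean_rate / median_rate
    if ratio < 1.0:
        raise ValueError("mean hazard should not be below median hazard")
    return math.sqrt(2.0 * math.log(ratio)) / k


# Synthetic check: build curves with a known slope and sigma_mu (hypothetical values)
k_true, sigma_true = 2.5, 0.3
amplitudes = [0.05, 0.1, 0.2, 0.4]  # e.g. PGA in g
median_curve = [1e-4 * (a / 0.1) ** (-k_true) for a in amplitudes]
mean_curve = [r * math.exp(0.5 * (k_true * sigma_true) ** 2) for r in median_curve]

k_est = local_slope(amplitudes, median_curve, 1)
print(sigma_mu_from_hazard(mean_curve[1], median_curve[1], k_est))
```

Because the mean and median hazard curves reflect the uncertainty in both the SSM and the GMM, a value obtained in this way is an estimate for the combined model and not for the GMM alone.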
The main difference between σµ values shown in Fig. 11 and σµ values in Figs. 8, 9 and 10 is that σµ values in Fig. 11 directly account for the relative contributions to the total hazard of the range of magnitudes and distances considered in the SZC PSHA. As observed in Fig. 11, uncertainty in the hazard estimates tends to be lower at structural periods around 0.2 s, but remains, across all structural periods, consistently above the value of 0.2 adopted as the acceptable lower bound for the epistemic uncertainty from the GMM. As expected, the lower levels of uncertainty observed in the SZC GMM for periods above 1.0 s, at short distances (< 10 km), do not contribute significantly to the epistemic uncertainty at the reference velocity horizon, where the observed σµ values are about 0.3 or higher.

Discussion
From observation of the trellis and σµ plots, bands of lower epistemic uncertainty were identified at specific periods and magnitude-distance ranges. This issue is mitigated, however, by the hazard at the reference velocity horizon not being dominated by a single magnitude-distance scenario at any response period. This issue could become important if the hazard is controlled by magnitude-distance scenarios that correspond to those where lower epistemic uncertainty is observed.
Comparisons with the median predictions of the CY14-middle kappa, and selected percentiles of the UK backbone model, provided confidence that the SZC GMM is well "centred" and that it adequately captures the "body" of the epistemic uncertainty of the ground motions expected at the SZC site. Also, from the distribution of the predictions from the 15 branches comprising the SZC GMM logic tree in the ground-motion space, as shown in the Sammon's maps (Fig. 7), it can be considered that the SZC GMM adequately captures the "range" of the epistemic uncertainty while avoiding model redundancy.
In addition to this, the epistemic uncertainty captured by the SZC GMM, expressed in terms of σµ, is generally above the acceptable lower bound of 0.2, which was defined based on the uncertainty captured by the NGA-W2. The exceptions to this were periods above 1 s at very short distances (< 10 km) and a narrow band of magnitudes and distances (around M 4.5 at 8 km) for periods around 0.5 s. However, σµ estimates obtained using the re-arranged Toro (2006) equation (Fig. 11) confirmed that these bands of lower epistemic uncertainty propagate only slightly to the hazard estimates at the reference velocity horizon. These findings provided confidence that the epistemic uncertainty in the SZC GMM, for the magnitude-distance ranges of interest for the SZC PSHA, is sufficiently well captured.
Fig. 11 σµ estimates at the reference velocity horizon from the final hazard calculations for the SZC site
One of the most significant challenges when assessing whether the level of epistemic uncertainty is appropriate is the definition of a minimum level of uncertainty that could be considered adequate, or acceptable, at a given structural period and magnitude-distance range. In this study, this minimum level of uncertainty was defined as 0.2 based on the epistemic uncertainty captured by the NGA-W2 models. Comparisons against the variability observed from ground-motion measurements could also help in this respect. For example, Douglas (2018b) provides estimates of the standard error of the median PGA from binning the NGA-W2 database (his Fig. 6) into small magnitude-distance intervals, which could be used to assess a lower bound of acceptable epistemic uncertainties from a GMM. In areas of low to very low seismicity levels, as is the case of the UK, the limitations of the available instrumental database (i.e., small magnitude events, typically below the lower magnitude of interest for PSHA, at long distances, > 100 km) do not allow a meaningful assessment of the ground-motion uncertainty. Nevertheless, such assessments could be carried out using macroseismic intensity data, which in the case of the UK are relatively abundant and cover magnitude and distance scenarios relevant to PSHA (although there is still a lack of data for earthquakes with M > 5.5). This type of comparison was not considered in the present assessment. Such comparisons may be useful in future studies, although the large uncertainties in converting macroseismic intensities to instrumental ground motions (e.g., Caprio et al. 2015) mean that they are more likely to lead to qualitative rather than quantitative conclusions.

Conclusion
Seismic hazard assessments are always associated with considerable epistemic uncertainty because of a lack of knowledge and information on potential future earthquakes in the region of the site and the strong ground motions that such events could generate. Capturing epistemic uncertainty within the seismic-source and ground-motion models of these assessments is critical, as is demonstrating to reviewers, the regulator and the client that the uncertainties are appropriate and well justified. This article has applied five methods to assess the epistemic uncertainty within the ground-motion model developed within a recent seismic hazard assessment for the site of a new-build nuclear power plant in the UK. The impact of uncertainties in the ground-motion model on the uncertainties in the hazard curves was also measured. Examining the results from applying these methods and comparing them to results for ground-motion models developed in comparable projects led the project team to conclude that the epistemic uncertainties were sufficiently well captured. This conclusion was also informed by how the epistemic uncertainty in the ground-motion model propagated to the reference velocity horizon, which is the key output of the hazard assessment.
Acknowledgements We thank Julian Bommer and an anonymous reviewer for their detailed and constructive comments on a previous version of this article.
Author contributions All authors contributed to the study conception and design. Analyses were performed by Guillermo Aldama Bustos, John Douglas, Fleur Strasser and Manuela Daví. The first draft of the manuscript was written by Guillermo Aldama Bustos and John Douglas and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Funding We express our gratitude to SZC NNB GenCo, the sponsor of the project, for agreeing to the publication of this article.

Data Availability
No data were used in the writing of this article.

Statements and declarations
Competing interests The authors have no relevant financial or non-financial interests to disclose.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.