1 Introduction

Seismic risk expresses the expected losses due to earthquake shaking, measured through different metrics, e.g., economic, social, and environmental (e.g., Musson 2000). The basis of estimating seismic risk is a hazard analysis, which establishes the likelihood of earthquake ground motion of a given intensity occurring in the area under investigation. Probabilistic predictions of earthquake hazard and risk can be obtained via two distinct approaches: unconditional and conditional (e.g., Scozzese et al. 2020).

Unconditional methods use direct observations to estimate hazard and risk, examples being Monte Carlo hazard assessment (e.g., Musson 2000) and subset simulation (e.g., Au and Beck 2003). The unconditional method is noted for its adaptability, flexibility, and conceptual simplicity, and has been used frequently in research to good effect, for example in EqHaz (Assatourians and Atkinson 2013), a program to assess seismic hazard. Unconditional methods are considered more robust approaches to estimating risk; however, they are computationally expensive, meaning they are rarely used in practice.

Conditional methods are a more practice-orientated approach to estimating risk. Such methods include the Pacific Earthquake Engineering Research Center’s performance-based earthquake engineering approach (Moehle and Deierlein 2004). Cornell (2005), among others, discusses both the benefits and drawbacks of this type of approach, which is illustrated by Fig. 1. These techniques require the definition of an intensity measure (IM), which describes the ground-motion intensity at the site of interest. Seismic hazard can then be evaluated by characterising the seismic source zones affecting the site and combining this characterisation with a ground-motion model (GMM), to fully describe the frequency of exceeding different levels of this IM during a time period of interest. The next step is to perform structural analyses for a given system to calculate the conditional probability of exceeding a given engineering demand parameter (EDP) at certain IM values; this can be done through various approaches, the most common being incremental dynamic analysis (e.g., Vamvatsikos and Cornell 2002) and multiple stripe analysis (e.g., Mackie and Stojadinovic 2005; Scozzese et al. 2020). Finally, risk is estimated by convolving the results from these two steps. It is worth noting here that this article uses two different terms to describe methods of predicting ground motion: ground motion prediction equations (GMPEs) refer to traditional methods of predicting ground motions, i.e., those relying on regression analysis and a prescribed functional form, whereas GMM refers to all models that predict ground motion.

Fig. 1

Workflow of the conditional approach to risk assessment. The unconditional approach can be described by the blue workflow. P(IM) represents the complementary cumulative distribution function (CCDF) of IM within the region of interest, whilst P(EDP|IM) is the CCDF of the EDP conditional on the IM

One disadvantage of the unconditional approach is that it needs a large number of ground motions to be accurate; therefore, a rapid method of simulating ground motions is required. Stochastic ground-motion simulations model the randomness of the earthquake rupture process and seismic wave propagation that generates ground motions, thereby creating samples from just a few seismological inputs. This research uses the same terminology as Boore (2003), i.e. the general means of simulating ground motions is referred to as the stochastic method, whilst the specific application of this method is called a stochastic model. Stochastic methods can create a large number of records rapidly, helping to remove potential gaps and biases in empirical data. This makes them attractive for conditional risk assessments, where stochastic models can be used to bolster empirical datasets and improve probabilistic seismic hazard assessment (PSHA), as is done by Meirova et al. (2018), building upon the SvE approach described in Shapira and Van Eck (1993) to produce an updated PSHA for Israel. Kowsari et al. (2021) and Zolfaghari (2015) also used ground motion simulations to help improve PSHA in Iceland and Iran, respectively.
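As a concrete illustration, the sketch below implements a heavily simplified version of the stochastic method: Gaussian white noise is windowed, transformed to the frequency domain, and shaped by a point-source omega-squared spectrum with simple geometric and anelastic attenuation. All parameter values and the window shape are illustrative assumptions, not those of the models used later in this paper.

```python
# A minimal sketch of the stochastic method (in the sense of Boore 2003).
# The seismological parameters (stress drop, Q, density, shear-wave velocity)
# and the shaping window are illustrative assumptions only.
import numpy as np

def stochastic_acceleration(mw, r_km, stress_bars=100.0, dt=0.005, dur=20.0,
                            beta=3.5, rho=2.8, q0=200.0, seed=None):
    """Return one simulated acceleration trace (cm/s^2) for a magnitude-distance pair."""
    rng = np.random.default_rng(seed)
    n = int(dur / dt)
    t = np.arange(n) * dt

    # 1. Windowed Gaussian white noise (a simple exponential window for
    #    brevity; Boore 2003 uses a Saragoni-Hart window).
    noise = rng.standard_normal(n) * t * np.exp(-t / (0.2 * dur))

    # 2. Transform to the frequency domain and normalise so the mean squared
    #    spectral amplitude is one, keeping the noise's spectral variability.
    spec = np.fft.rfft(noise)
    spec /= np.sqrt(np.mean(np.abs(spec) ** 2))

    # 3. Target omega-squared point-source acceleration spectrum (Brune source,
    #    1/R geometric spreading, frequency-dependent anelastic attenuation).
    f = np.fft.rfftfreq(n, dt)
    m0 = 10.0 ** (1.5 * mw + 16.05)                        # seismic moment, dyne-cm
    fc = 4.9e6 * beta * (stress_bars / m0) ** (1.0 / 3.0)  # corner frequency, Hz
    c = 0.55 * 2.0 * 0.71 / (4 * np.pi * rho * beta ** 3) * 1e-20
    source = c * m0 * (2 * np.pi * f) ** 2 / (1 + (f / fc) ** 2)
    path = np.exp(-np.pi * f * r_km / (q0 * beta)) / r_km

    # 4. Shape the noise spectrum and return to the time domain.
    return np.fft.irfft(spec * source * path, n) / dt
```

Intensity measures such as Sa(0.2s) would then be obtained by passing each trace through a damped single-degree-of-freedom oscillator.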

This study makes use of a stochastic model to directly estimate hazard and risk from simulated ground motions through the unconditional approach. Few studies explicitly compare conditional and unconditional methods. Both Jalayer and Beck (2008) and Franchin et al. (2012) explored the effects of using conditional and unconditional approaches on seismic risk estimates for a reinforced-concrete frame structure, whilst Azar and Dabaghi (2021) and Bijelic et al. (2019) examined unconditional hazard and risk assessments, respectively, using the CyberShake software (Graves et al. 2011). Unconditional techniques also provide a good reference solution to evaluate different methods of carrying out a conditional risk assessment. Bradley et al. (2015) used this direct method of estimating hazard and risk to compare different ground motion selection strategies when evaluating the peak displacement response of a nonlinear single degree of freedom (SDOF) system. Scozzese et al. (2020) used the outputs of Monte Carlo analyses as a reference solution to investigate the accuracy of a conditional method based on multiple stripe analysis. Moreover, Beauval et al. (2009) used earthquake simulations in the unconditional method to derive a hybrid deterministic-probabilistic hazard assessment for the island of Guadeloupe (France). Comparisons were made between the built model and the Ambraseys et al. (2005) GMPE, but purely for illustrative purposes, as Douglas et al. (2006) showed that GMPE to be poorly adapted for predicting ground motions in the region. The study by Beauval et al. (2009) built on the work of Convertito et al. (2006) and Hutchings et al. (2007) to develop unconditional hazard assessments based on simulations of ground motion.

Stupazzini et al. (2021) also used ground motion simulations to perform hazard assessments using different GMMs and to compare the results. That study also examined their impact on the conditional risk assessment procedure, estimating the risk to a portfolio of buildings for a case study in Istanbul. Medel-Vera and Ji (2016) also proposed an approach using ground-motion simulations in an unconditional risk assessment, although that study was specifically for nuclear facilities in the UK. They found the results to be comparable to current practice, justifying the method’s use for earthquake risk assessments.

When estimating hazard, and subsequently conditional risk, conditional methods are far more prevalent in the literature, as seen by the ever-growing number and variety of GMMs (Douglas 2022). It is difficult to make an effective comparison between different conditional methods for risk assessment, especially with respect to the precision of the GMM. This is because there is a lack of high-quality ground-motion data in many regions of the world (e.g., Xie et al. 2020), and so it is hard to establish an effective benchmark against which models can be compared.

Moreover, the data are being constantly updated, meaning that hazard models of the same region can provide different estimates. Gkimprixis et al. (2021) showed that using two different hazard models (built in different years) at the same site in Italy yielded significantly different hazard results, which directly impacted the estimated risk. Again, this shows the difficulty of comparing hazard models, as the benchmark that does exist is constantly changing, alongside its associated uncertainties.

Comparisons between hazard assessments could be made redundant if enough high-quality ground-motion data were available in a region of interest, as the need for GMMs could then be removed completely. Sufficient data collected from observations of shaking, used in a Monte Carlo hazard assessment, could fully capture all forms of variability and uncertainty without the need for modelling, giving the most accurate hazard assessment possible. The collation of large strong-motion databases, such as the NGA-West2 database (Ancheta et al. 2014), provides a dataset with which this idea can be tested.

This paper aims to compare the impact of using different GMMs on estimates of seismic hazard, and subsequently risk. For this purpose, a stochastic model is employed within a fictive scenario to simulate ground motions (Sect. 2), allowing the creation of benchmark hazard and risk estimates through the unconditional approach. The following sections describe the creation of three GMMs (Sect. 3), before investigating hazard assessment results from these GMMs against the created benchmark (Sect. 4). These results are extended to a risk assessment of a simple structural model for both conditional and unconditional approaches (Sect. 5). For the conditional approaches, an empirical fragility curve is created from the structural model and convolved with the hazard estimates from each of the GMMs (Sect. 5). This procedure allows for a direct comparison between the unconditional and conditional approaches, so that judgements can be made on the impact that each of the conditional approaches has on risk. Finally, an observation-based hazard assessment approach is demonstrated (Sect. 6), which directly estimates hazard using a strong-motion database, to investigate how well recorded data can match the benchmark hazard.

2 Seismic scenario and stochastic model

For this study, a fictive scenario is established with a circular source zone of radius 100 km and two faults of lengths 75 km and 25 km. Ground motions are computed at eleven stations along a line through the centre of the areal source, with hazard and risk calculated for a site at the centre-point of the region. This allows different distances from the faults to be sampled, and so properly accounts for ground-motion attenuation from the faults. The location and details of each of these seismic sources are shown in Fig. 2. All earthquakes in the scenario follow the Gutenberg-Richter relationship with minimum magnitude 5 and b = 1.0; the maximum magnitude and the a value for each source are provided in Fig. 2.

Fig. 2

The seismic source model used for this study
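For illustration, a catalogue consistent with the seismicity model above could be sampled as in the sketch below, which draws event counts for each source and magnitudes from the doubly truncated Gutenberg-Richter distribution. The activity rates and maximum magnitudes shown are placeholders for the values given in Fig. 2, and the sketch draws Poisson event counts, whereas this study fixes the number of events from each source's yearly activity.

```python
# Sketch of sampling a synthetic earthquake catalogue: event counts from each
# source's activity rate and magnitudes from a doubly truncated
# Gutenberg-Richter law (Mmin = 5, b = 1). Rates and maximum magnitudes below
# are placeholders, not the values of Fig. 2.
import numpy as np

def sample_gr_magnitudes(n, m_min=5.0, m_max=7.5, b=1.0, rng=None):
    """Inverse-CDF sampling of the doubly truncated Gutenberg-Richter law."""
    if rng is None:
        rng = np.random.default_rng()
    beta = b * np.log(10.0)
    u = rng.random(n)
    # CDF: F(m) = (1 - exp(-beta (m - m_min))) / (1 - exp(-beta (m_max - m_min)))
    return m_min - np.log(1.0 - u * (1.0 - np.exp(-beta * (m_max - m_min)))) / beta

rng = np.random.default_rng(42)
years = 25_000
# source name: (annual rate of M >= 5 events, maximum magnitude) -- placeholders
sources = {"area": (0.30, 7.5), "fault_A": (0.08, 7.0), "fault_B": (0.04, 6.5)}
catalogue = {name: sample_gr_magnitudes(rng.poisson(rate * years), m_max=mx, rng=rng)
             for name, (rate, mx) in sources.items()}
```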

The Atkinson and Silva (2000) stochastic model is implemented within this study. The model is altered using the generalised double-corner-frequency source spectral approach of Boore et al. (2014), which allows the stress drop of each source (detailed in Fig. 2) to be changed. This changes the strength of each source, so that the ground motions from each source can be properly accounted for when developing hazard assessments.

The stochastic model is used to obtain ground motions for this scenario, with the IM assumed to be the spectral acceleration (Sa (T, ζ)) corresponding to a damping ratio ζ = 5% and a period T = 0.2s. One hundred catalogues are generated for a time period of 25,000 years to form the hazard and risk assessments. The number of events occurring from each source is a function of its yearly activity; multiplying this by the number of years produces a total of 10,605 events within the time frame from all sources. Each event is recorded by all 11 stations, yielding a total of 116,655 records per catalogue. All stations are used to derive GMMs; however, hazard and risk are only evaluated at the site of interest, defined in Fig. 2. A large number of simulations are used within this research to capture the precision of both the unconditional and conditional approaches, with mean hazard and risk estimates presented alongside their accompanying uncertainties.

Both magnitude and distance samples are generated based on their respective probability distributions to create ground-motion samples using the stochastic model. The distributions of the simulated magnitude, distance, and Sa were evaluated to ensure that they conform to expectations.

The assumption is made that the stochastic model used produces accurate and realistic ground-motion intensities for the earthquakes modelled in this study. This is justified by various studies that validate the stochastic method (e.g., Silva et al. 1996; Tsioulou et al. 2019). Under this assumption, the unconditional hazard and risk estimates from the stochastic model are treated as the truth, and so act as a benchmark against which the created GMMs are compared.

3 Ground-motion models

Separate to the unconditional analysis, the simulated ground motions are used to create three GMMs. These include a basic and a more complex GMPE, created through least-squares regression analysis, with the functional forms shown in Eq. (1) and Eq. (2), respectively:

$$\mathrm{ln}\left(Sa\right)={C}_{0}+{C}_{1}M+{C}_{2}\,\mathrm{ln}\left(R+5\right)+{C}_{3}\,{Fault}_{A}+{C}_{4}\,{Fault}_{B}$$
(1)

where \({C}_{0}=-2.6642\), \({C}_{1}=1.110\), \({C}_{2}=-1.6812\), \({C}_{3}=-0.4639\) and \({C}_{4}=0.2926\);

$$\mathrm{ln}\left(Sa\right)={C}_{0}+{C}_{1}M+{C}_{2}\,\mathrm{ln}\left(R+5\right)+{C}_{3}{M}^{2}+{C}_{4}R+{C}_{5}\,{Fault}_{A}+{C}_{6}\,{Fault}_{B}$$
(2)

where \({C}_{0}=-8.0230\), \({C}_{1}=2.4141\), \({C}_{2}=-1.1646\), \({C}_{3}=-0.1134\), \({C}_{4}=-0.0073\), \({C}_{5}=-0.4154\) and \({C}_{6}=0.3748\);

and where \(M\) represents magnitude, \(R\) distance (km) and \(Sa\) is 5% damped spectral acceleration (g). The terms \({Fault}_{A}\) and \({Fault}_{B}\) equal 1 when predicting ground motions from Fault A or Fault B, respectively, and 0 otherwise.
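For reference, Eqs. (1) and (2) translate directly into the following functions, using the coefficients reported above (median predictions only; the model sigmas are discussed in Sect. 4.1):

```python
# The fitted GMPEs of Eqs. (1) and (2), returning median ln(Sa) with Sa in g,
# m the magnitude, r the distance in km, and fault_a/fault_b the 0/1 flags.
import numpy as np

def basic_gmpe(m, r, fault_a=0, fault_b=0):
    """Eq. (1): basic GMPE."""
    return (-2.6642 + 1.110 * m - 1.6812 * np.log(r + 5)
            - 0.4639 * fault_a + 0.2926 * fault_b)

def complex_gmpe(m, r, fault_a=0, fault_b=0):
    """Eq. (2): complex GMPE with quadratic-magnitude and linear-distance terms."""
    return (-8.0230 + 2.4141 * m - 1.1646 * np.log(r + 5) - 0.1134 * m ** 2
            - 0.0073 * r - 0.4154 * fault_a + 0.3748 * fault_b)

# Example: median Sa (g) for an M 6.0 areal-source event at 25 km.
sa_median = np.exp(basic_gmpe(6.0, 25.0))
```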

A feedforward artificial neural network (ANN) is also considered, consisting of a single hidden layer of five nodes. The inputs of the ANN are the same as those of the GMPEs (i.e., \(M\), \(R\), \({Fault}_{A}\) and \({Fault}_{B}\)) and the output remains \(\mathrm{ln}\left(Sa\right)\). The ANN uses the Levenberg-Marquardt optimisation technique (e.g., Dhanya and Raghukanth 2018), and five nodes are used to prevent over-fitting (e.g., Derras et al. 2014).
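The sketch below shows one way such a network could be set up and trained with Levenberg-Marquardt, here via SciPy's MINPACK wrapper rather than the tool actually used in the study (which is not specified). The design matrix X is assumed to hold columns [M, R, Fault_A, Fault_B] (standardised in practice) and y the corresponding ln(Sa) values.

```python
# Sketch of a single-hidden-layer feedforward ANN (five tanh nodes) trained
# with the Levenberg-Marquardt algorithm via scipy.optimize.least_squares.
import numpy as np
from scipy.optimize import least_squares

N_IN, N_HID = 4, 5   # inputs [M, R, Fault_A, Fault_B]; five hidden nodes

def unpack(theta):
    """Split the flat parameter vector into layer weights and biases."""
    w1 = theta[:N_IN * N_HID].reshape(N_HID, N_IN)
    b1 = theta[N_IN * N_HID:N_IN * N_HID + N_HID]
    w2 = theta[-(N_HID + 1):-1]
    b2 = theta[-1]
    return w1, b1, w2, b2

def ann_predict(theta, x):
    w1, b1, w2, b2 = unpack(theta)
    hidden = np.tanh(x @ w1.T + b1)   # single hidden layer, tanh activation
    return hidden @ w2 + b2           # linear output: predicted ln(Sa)

def fit_ann(x, y, seed=0):
    rng = np.random.default_rng(seed)
    theta0 = rng.normal(scale=0.1, size=N_IN * N_HID + N_HID + N_HID + 1)
    # method="lm" is Levenberg-Marquardt; it requires more residuals than
    # parameters, easily satisfied with thousands of simulated records.
    fit = least_squares(lambda t: ann_predict(t, x) - y, theta0, method="lm")
    return fit.x
```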

Machine learning tools have become more common in the field of engineering seismology in recent years (Kong et al. 2018). There is clear potential for their use in ground-motion prediction, helping to model complex nonlinear behaviours of ground motion that the fixed functional form of a GMPE may fail to capture (e.g., Alavi and Gandomi 2011), provided there are enough data. An ANN was selected for this study as its use has been well investigated for the purpose of ground-motion prediction; further research could investigate the use of other machine learning tools, as discussed by Khosravikia and Clayton (2021).

Each of these GMMs is used in PSHA to predict ground-motion intensity samples based on the simulated magnitude-distance combinations from the stochastic earthquake catalogue. The conditional hazard analyses rely on the standard deviation (sigma) of each GMM to introduce variability into the estimated Sa, whilst the unconditional approach already models the variability in ground motions by using every simulated record in the hazard analysis. In addition, a model-error parameter (\({\varepsilon }_{mod}\)) is introduced to the stochastic model, as proposed by Jalayer and Beck (2008). This parameter scales the radiation spectra calculated by the stochastic model in order to account for modelling uncertainty. It is characterised by a normal distribution with a mean of zero and a standard deviation (in natural logarithm) of 0.5. For this study, only the total sigma is considered within the GMMs created. Future research could consider the effects of inter-event and intra-event variabilities separately.
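In the conditional analyses, each simulated magnitude-distance pair therefore receives a predicted Sa sampled about the GMM median, as in this short sketch (basic_gmpe is the Eq. (1) function from the earlier sketch; the sigma value is a placeholder, not the fitted value of Table 1):

```python
# Sketch of how a GMM's total sigma introduces variability into the
# conditional hazard analysis: ln(Sa) = median + sigma * epsilon.
import numpy as np

SIGMA_TOTAL = 0.6   # placeholder total sigma in ln units (cf. Table 1)

def sample_conditional_sa(m, r, fault_a, fault_b, rng):
    eps = rng.standard_normal(np.shape(m))            # epsilon ~ N(0, 1)
    ln_sa = basic_gmpe(m, r, fault_a, fault_b) + SIGMA_TOTAL * eps
    return np.exp(ln_sa)                              # Sa samples in g
```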

4 Results

In this section, results from the conditional hazard models are presented and compared to each other, and against the results obtained using the unconditional approach. This includes a comparison of the residuals from each GMM, the median spectral acceleration values returned by the GMMs, the hazard results produced by each of the models, and an investigation into the differences in these hazard results.

4.1 Residual analysis

Residual plots for the three models indicated that magnitude-dependent sigmas could be considered (e.g., Youngs et al. 1997): plotting sigma against magnitude (binned at 0.1-unit intervals) showed a relationship between the two variables. An ANN (with a single hidden layer of two neurons) was therefore fitted for each of the models to predict sigma based on magnitude. An ANN was selected to achieve a good fit to these data, ensuring accurate sigma prediction. A summary of the sigma values for the six resulting models is presented in Table 1. The basic GMPE has a higher sigma value than both the complex GMPE and the ANN, implying it has a worse fit to the true ground-motion samples than the other two models. The sigmas are slightly smaller than generally observed for GMPEs obtained from actual ground-motion records (e.g., Douglas et al. 2014), because the stochastic model does not include all sources of variability in earthquake ground motions.

Table 1 Comparing sigma values from the GMMs created in this study

4.2 Predicting spectral acceleration

Median predictions of Sa from the three conditional models are presented in Fig. 3 for fixed distances of 25 km and 75 km, and fixed magnitudes of 5.75 and 6.75, with dashed lines in Fig. 3 representing plus and minus one standard deviation about the median. Magnitude-dependent sigma models are not included in this plot as median Sa predictions are not affected by them. As the differences between the GMMs were consistent across all three seismic sources, only the median Sa predictions for the background source are presented here.

Fig. 3

Median Sa predictions for the area source from all three conditional models (solid lines), at fixed distances of 25 km and 75 km, and fixed magnitudes of 5.75 and 6.75, dashed lines represent plus and minus one standard deviation about the median

For the fixed-distance plots, predictions from the ANN appear almost identical to those of the complex GMPE, whilst all three models vary from each other at extreme distances in the fixed-magnitude plot. The median Sa predictions are similar at points where there is a wealth of magnitude and distance samples, allowing each of the models to be sufficiently trained. The differences in predictions appear where fewer events are expected in the catalogue. This is most noticeable with the basic GMPE, which considerably overpredicts Sa for higher magnitudes and extreme distances due to the simplicity of its functional form, meaning it cannot fully describe the possible ground motions for all possible earthquakes.

4.3 Assessing hazard

The hazard, or mean annual frequency (MAF) of exceedance, is estimated for each source by finding the fraction of ground motions that exceed an Sa threshold and multiplying it by the seismicity rate of the source. This is performed for a series of Sa thresholds (finely discretised at \(1.0\times {10}^{-3}\) g spacing between 0 and 2.5 g) and summed across all sources at these thresholds to produce the overall site hazard. The process is repeated for 100 sets of records to obtain the MAF. This procedure is performed on the simulated ground motions in the case of the unconditional Monte Carlo-based hazard assessment, and on the predicted ground motions from the GMMs in the case of the conditional methods. Figure 4 shows the mean hazard curves for all created models, with dashed lines representing the 16th and 84th percentiles of the mean.
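This exceedance-counting procedure can be summarised in a few lines; the sketch below assumes per-source arrays of Sa values (simulated, or predicted by a GMM) and the activity rates of the three sources:

```python
# Sketch of the exceedance-counting hazard calculation: for each source, the
# fraction of Sa values exceeding each threshold is scaled by the source's
# activity rate, then summed over sources to give the site hazard curve.
import numpy as np

def hazard_curve(sa_by_source, rate_by_source, thresholds):
    """Return the MAF of exceedance at each Sa threshold (g); sa arrays are 1-D."""
    maf = np.zeros_like(thresholds)
    for name, sa in sa_by_source.items():
        frac_exceed = (sa[:, None] > thresholds[None, :]).mean(axis=0)
        maf += rate_by_source[name] * frac_exceed
    return maf

thresholds = np.arange(0.0, 2.5 + 1.0e-3, 1.0e-3)   # the paper's discretisation
```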

Fig. 4

Mean hazard curves for all created models, solid lines show the mean hazard results whilst dashed lines show the 16th and 84th percentiles of the mean. Sa is plotted on a linear scale to better show the differences between the results

The standard deviation of the MAF values is also computed, allowing the calculation of the 16th and 84th percentiles, assuming a lognormal distribution. All models appear to have a similar predictive quality at more frequent hazard occurrences, matching until an MAF of exceedance of 0.025 years\(^{-1}\), but are quite different at lower MAFs. The hazard curves assuming magnitude-dependent sigma models show similar behaviour to the fixed-sigma alternatives, and so results from these models are not shown in the following plots. This similarity between magnitude-dependent and magnitude-independent GMMs could be because the magnitude dependency of sigma is quite small in the developed GMMs.

Since the differences between the hazard curves are large when using a wealth of simulated data, they will likely be even more significant when using actual strong-motion records, which are fewer in number and sparser in distribution. This indicates that GMM selection can be important when carrying out a hazard analysis, as these inaccuracies will be propagated through to the risk assessment, leading to poorer loss estimates.

To check whether the differences in the hazard curves persist, the same procedure was carried out to predict Sa at a period of T = 1.0s. Figure 5 presents the MAF of exceedance hazard curves for the three GMMs derived for T = 1.0s. There is still a difference between the results from each of the GMMs, especially with respect to the basic GMPE; however, the models are more similar than when evaluated at T = 0.2s. Further research could investigate whether the differences between conditional hazard models are maintained for a range of IMs. For this study, the differences in hazard estimates of Sa at T = 1.0s are not investigated further, as the differences for this spectral period still indicate the importance of GMM selection for risk assessment.

Fig. 5

Mean hazard curves for all created models using Sa (T = 1.0s), solid lines show the mean hazard results whilst dashed lines show the 16th and 84th percentiles

A different set of mean hazard results can also be obtained by finding the mean ground-motion intensity for a fixed set of MAFs of exceedance. The suitability of these two approaches has been discussed previously, and they were found to yield distinct hazard curves (e.g., Bommer and Scherbaum 2008). Interestingly, with the large suite of simulated data created by this study, both approaches to calculating the mean hazard yield very similar results, as shown in Fig. 6.

Fig. 6

Comparison of mean hazard calculations via two approaches: calculating the mean of the annual frequency of exceedance, and calculating the mean Sa. Solid lines show the mean hazard, with dashed lines showing the 16th and 84th percentiles

4.4 Evaluating differences between hazard results

The Kolmogorov–Smirnov (KS) test (e.g., Stephens 1974) is used to compare results from the conditional hazard models against the benchmark hazard results. The test is carried out by finding the maximum absolute difference between the two cumulative distribution functions (CDFs) being tested. Although different techniques exist to compare hazard results, such as Cohen's effect size (e.g., Malhotra 2015), the KS test was chosen for its ease of application. As a non-parametric test, the KS test does not rely on assuming and describing a particular probability distribution. It evaluates differences across the entire range of the two distributions, so it can detect differences at the extremes, including the stronger ground motions, which are the most likely to differ between the various hazard results.

The null hypothesis of the KS test is that the two CDFs are drawn from the same population. A CDF can be obtained from a hazard curve by first converting the frequency of exceedance for a range of Sa values to a probability. Given that our seismic source models assume a Poisson process, Eq. (3) converts MAF of exceedance to annual probability of exceedance:

$$P=1-{{\text{e}}}^{-\lambda }$$
(3)

where \(P\) is the annual probability of exceeding a certain Sa value, and \(\lambda\) is the MAF of exceeding the same Sa value at the site, i.e., the cumulative MAF of exceedance from the three sources. One minus the annual probability of exceedance gives the CDF value for the given Sa. CDFs for each hazard model are obtained by applying this procedure to hazard results for Sa values at 0.01 g intervals between 0 and 2.5 g.
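Concretely, the comparison reduces to converting each hazard curve to a CDF via Eq. (3) and taking the maximum absolute difference, as sketched below (the p-value calculation, which depends on the effective sample size, is omitted):

```python
# Sketch of the hazard-curve comparison: convert MAF of exceedance to a CDF
# via Eq. (3) and compute the KS statistic between benchmark and model.
import numpy as np

def hazard_to_cdf(maf):
    """Eq. (3): CDF(Sa) = 1 - (1 - exp(-lambda)) = exp(-lambda)."""
    return np.exp(-np.asarray(maf))

def ks_statistic(maf_benchmark, maf_model):
    """Maximum absolute difference between the two CDFs on a common Sa grid."""
    return float(np.max(np.abs(hazard_to_cdf(maf_benchmark)
                               - hazard_to_cdf(maf_model))))
```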

Each of the CDFs from the conditional approaches can then be tested against the CDF of the reference solution. Failing the test indicates that the created GMM did not generate hazard estimates similar to the reference curve, implying that it would not be an appropriate model to use for risk assessment.

Table 2 presents the returned p values of the KS test, with only the ANN model passing the test at the 5% significance level, agreeing with visual inspection of the hazard curves. Thus, the ANN is the only model that could be considered as coming from the same distribution as the simulated data, and so it provides the best prediction of the hazard.

Table 2 Results from Kolmogorov–Smirnov Test on the three conditional hazard models

The KS test is known to be sensitive to the sampling points at which the CDFs are described, which can influence the returned result, and this holds true for this scenario. For instance, carrying out the test at intervals of \({10}^{-4}\) g spacing between 0 and 2.5 g rejected the null hypothesis for all three models, whilst using intervals of 0.1 g for the same Sa bounds only rejected the null hypothesis for the basic GMPE. Ultimately, the interval spacing of 0.01 g used to perform the test was considered acceptable for this scenario, as this broader spacing is more likely to be used when comparing conditional hazard models to real-world data. Nevertheless, further research should consider other statistical tests of the similarity between hazard results, because of this sensitivity of the KS test.

4.5 Hazard disaggregation

To investigate the differences in hazard predictions, hazard disaggregation (Bazzurro and Cornell 1999) is performed. This breaks down the hazard into the contributions of the factors that drive it, in this case magnitude, distance, and epsilon. Figure 7 plots disaggregation results from both the Monte Carlo and the conditional hazard analysis approaches for Sa values of 0.1 g and 1.0 g; mean magnitudes and distances are also provided. Results are very similar at 0.1 g, as expected from the agreement of the hazard curves (Fig. 4) at this value of Sa. However, when disaggregation is performed at 1.0 g, results vary between the models, as seen by the differences in the plots.

Fig. 7

Hazard disaggregation results from the Monte Carlo-based approach and three conditional hazard assessment approaches, for Sa values of 0.1 g and 1.0 g

The disaggregation results at 1.0 g show a change in the dominant earthquake scenario. The mean magnitude-distance combination for the Monte Carlo-based approach at 0.1 g is 5.86 and 33.56 km, respectively, whilst at 1.0 g this changes to 6.51 and 10 km. At smaller magnitudes and longer distances, the GMPEs accurately predict ground motions, but at higher magnitudes and shorter distances they poorly predict the less abundant, stronger ground motions. This creates greater inaccuracies within the hazard assessment, which could lead to poor-quality risk assessments, reaffirming the importance of GMM selection.
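A sketch of the disaggregation computation is given below: among the records that exceed the target Sa, the contribution of each magnitude-distance bin is counted and normalised (epsilon could be binned in the same way):

```python
# Sketch of magnitude-distance disaggregation: the normalised contribution of
# each M-R bin to the exceedance of a target Sa level.
import numpy as np

def disaggregate(m, r, sa, sa_target, m_edges, r_edges):
    """m, r, sa are per-record arrays; returns the 2-D contribution matrix."""
    exceed = sa > sa_target
    contrib, _, _ = np.histogram2d(m[exceed], r[exceed], bins=[m_edges, r_edges])
    return contrib / max(contrib.sum(), 1.0)

# Mean magnitude and distance of the exceeding records (as quoted for 0.1 g)
# would simply be m[sa > 0.1].mean() and r[sa > 0.1].mean().
```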

4.6 Restricting the magnitude and distance ranges of seismicity

All hazard models appear to perform well at smaller Sa, as indicated by the hazard curves of Fig. 4 and the hazard disaggregation of Fig. 7. To confirm this agreement, the hazard assessment is performed again for each of the models, but this time the seismicity is restricted to magnitude and distance ranges of 5.5–6.5 and 10–50 km, respectively. To create this restricted scenario, both faults are halved in length and in their distance from the centre of the site, so that all distances simulated from the stochastic model lie between 10 and 50 km. The mean hazard results for this restricted scenario are plotted in Fig. 8.

Fig. 8

Mean hazard curves produced by the Monte Carlo-based hazard approach and the three conditional hazard approaches, for a restricted distance range of 10–50 km and magnitude range of 5.5–6.5, plotted as solid lines. Dashed lines show the 16th and 84th percentiles of the mean hazard

When comparing the hazard curves created by this new, restricted scenario, all models match quite well, with the ANN mirroring the true values over the whole range under investigation. This shows that the models fit the simulated data well within this range of interest, and that the discrepancies in hazard predictions between the models (when considering the whole range of magnitudes and distances) are likely due to the influence of ground motions from higher magnitudes and shorter distances, as suggested by the hazard disaggregation.

5 Extension to risk assessment

To demonstrate the impact of GMM selection on seismic risk assessment, each hazard model was extended to assess the risk of an inelastic SDOF system, with system ductility selected as the EDP. The system has elastic-perfectly plastic behaviour and is designed to withstand a ductility factor of q = 4 for a MAF of exceedance of \(2.1\times {10}^{-3}\) years\(^{-1}\), with elastic period T = 0.2s and yield displacement \({u}_{y}\) = 0.0013 m. The yield displacement was calculated for this ductility factor, based on the assumption that the SDOF system is in the medium ductility class defined in Eurocode 8 (EN-1998-1). As the structure is in the short-period range, the equal-energy rule is used (Eq. 4), as inferred by the N2 method (Fajfar 1999) and Eurocode 8 (EN-1998-1):

$${q}_{d}=\frac{u}{{u}_{y}}=1+(q-1)\frac{{T}_{c}}{T}>q$$
(4)

where \({q}_{d}\) is the modified ductility factor, \(u\) is the inelastic displacement of the SDOF, \({u}_{y}\) is the yield displacement, \(q\) is the ductility factor, \({T}_{c}\) is the corner period of the response spectrum (\({T}_{c}\) = 0.5s according to Eurocode 8 (EN-1998-1)) and \(T\) is the period of the SDOF. The inelastic displacement of the SDOF, \(u\), is estimated by multiplying the design elastic displacement by the Newmark and Hall inelastic displacement coefficient, C (Newmark and Hall 1982), where C was found to equal 2.34.
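For the system considered here, substituting the values quoted above into Eq. (4) gives

$${q}_{d}=1+\left(4-1\right)\frac{0.5}{0.2}=8.5>q=4$$

confirming that this short-period system must sustain a ductility demand well above its design ductility factor.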

Nonlinear dynamic structural analyses were carried out for all 100 sets of 10,605 ground-motion samples at the site of interest to calculate the maximum system displacement for each record. Displacements are normalised by the system’s yield threshold to produce the system ductility demand.
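As an illustration, this structural analysis step could be sketched as below: an elastic-perfectly plastic SDOF with the design values quoted above, integrated with a simple semi-implicit Euler scheme (the integration method actually used in the study is not stated, so this is an assumption), returning the ductility demand.

```python
# Sketch of the nonlinear dynamic analysis: an elastic-perfectly-plastic SDOF
# (T = 0.2 s, u_y = 0.0013 m, 5% damping) integrated with a semi-implicit
# Euler scheme; the peak displacement is normalised by the yield displacement
# to give the ductility demand. A small time step is assumed for accuracy.
import numpy as np

def epp_ductility(accel, dt, T=0.2, u_y=0.0013, zeta=0.05):
    """accel: ground acceleration time series (m/s^2); returns peak ductility."""
    omega = 2.0 * np.pi / T
    c = 2.0 * zeta * omega          # damping coefficient (unit mass)
    k = omega ** 2                  # elastic stiffness (unit mass)
    f_y = k * u_y                   # yield strength of the spring
    u = v = fs = 0.0
    u_peak = 0.0
    for ag in accel:
        a = -ag - c * v - fs                          # equation of motion (m = 1)
        v += a * dt
        du = v * dt
        u += du
        fs = float(np.clip(fs + k * du, -f_y, f_y))   # EPP spring-force update
        u_peak = max(u_peak, abs(u))
    return u_peak / u_y
```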

For the unconditional Monte Carlo-based approach, seismic risk can be assessed by finding the annual frequency of exceeding a ductility demand threshold, as illustrated by Eq. 5:

$$v\left(\mu \right)=\sum_{j=1}^{sources}{\lambda }_{{source}_{j}}\sum_{i=1}^{{N}_{j}}\frac{{I}_{i,j}\left(\mu \right)}{{N}_{j}}$$
(5)

where \(v\left(\mu \right)\) is the annual frequency of ductility demand exceedance, \({N}_{j}\) is the number of magnitude-distance pairs simulated from each source, \({\lambda }_{{source}_{j}}\) is the activity rate of each source, and \({I}_{i,j}(\mu )\) is an indicator function equal to one if the ductility demand of the \(i\)th record exceeds the ductility threshold and zero otherwise. The mean risk estimate is the average rate at which the ductility demand value is exceeded across all 100 sets of records. Reproducing this assessment for a series of ductility demand thresholds creates a Monte Carlo-based risk curve.

To estimate risk for the conditional approach, a fragility curve must be created and combined with the hazard results. An empirical fragility curve is created for each ductility threshold by finding the conditional probability that the system ductility exceeds the threshold level, given the intensity measure of shaking. These curves are derived from the simulated Sa and ductility values, with fragility curves for ductility thresholds of 1, 2, 3, 4 and 5 shown in Fig. 9. To produce a fragility curve, Sa is separated into 40 bins, displayed by the dashed lines in Fig. 9. The probability of exceeding a ductility threshold is then calculated for each bin for all 100 sets of records, with the process repeated for each ductility threshold to create a series of fragility curves.

Fig. 9

Example empirical fragility curves created for conditional risk assessment, at ductility thresholds 1, 2, 3, 4 and 5. Curves are created as a mean from all 100 sets of catalogues
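One way to construct such curves is sketched below, binning the records by Sa and taking within-bin exceedance fractions; the exact bin edges used in the study are not stated, so quantile-based bins are assumed here.

```python
# Sketch of the empirical fragility construction: bin the records by Sa and,
# within each bin, take the fraction whose ductility demand exceeds the
# threshold as an estimate of P(EDP > edp | IM).
import numpy as np

def empirical_fragility(sa, ductility, threshold, n_bins=40):
    bins = np.quantile(sa, np.linspace(0, 1, n_bins + 1))   # one binning choice
    idx = np.clip(np.digitize(sa, bins) - 1, 0, n_bins - 1)
    p_exceed = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            p_exceed[b] = (ductility[in_bin] > threshold).mean()
    centres = 0.5 * (bins[:-1] + bins[1:])
    return centres, p_exceed
```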

Conditional risk estimates are made for each ductility threshold by convolving the corresponding fragility curve with the hazard curve of each GMM (e.g., Baker et al. 2021). Risk curves for both the Monte Carlo-based approach and the conditional approaches are presented in Fig. 10, with the mean risk (solid lines) and the 16th and 84th percentiles of the mean risk (dashed lines) shown.
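Numerically, this convolution amounts to weighting the fragility by the rate of occurrence of each IM interval, obtained by differencing the hazard curve, as sketched below:

```python
# Sketch of the hazard-fragility convolution used for the conditional risk
# estimates: nu(edp) = sum over IM bins of P(EDP > edp | IM) * |d lambda(IM)|.
# maf and fragility must be defined on a common, ascending grid of Sa values.
import numpy as np

def convolve_risk(maf, fragility):
    """Return the annual frequency of exceeding the EDP threshold."""
    d_lambda = -np.diff(maf)                          # occurrence rate per Sa bin
    frag_mid = 0.5 * (fragility[:-1] + fragility[1:])
    return float(np.sum(frag_mid * d_lambda))
```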

Fig. 10

Mean risk curves for all created models, solid lines show the mean risk whilst dashed lines show the 16th and 84th percentiles of the mean

The basic GMPE considerably overpredicts risk at frequencies lower than approximately 0.025 years\(^{-1}\). For instance, at a ductility of 2 the basic GMPE overpredicts the benchmark risk by 24%, and at a ductility of 4 by 35%.

Both the complex GMPE and ANN lead to far better risk estimates, with the ANN marginally better than the complex GMPE, especially at higher ductility values. This extension reinforces the results from the hazard assessment, demonstrating the importance of GMM selection when carrying out a risk assessment.

For completeness, risk is also estimated for an SDOF whose period is changed to T = 1.0s. As this structure is in the medium-to-long-period range, the equal-displacement rule can be used, as per Eurocode 8 (EN-1998-1). This rule assumes that the peak elastic displacement of the system is equal to its peak inelastic displacement. Therefore, dividing the calculated elastic displacement of the SDOF by the ductility factor q = 4, the system is found to have a yield displacement \({u}_{y}\) = 0.0067 m. Figure 11 presents the mean risk curves with both the 16th and 84th percentiles of the mean (dashed lines) for the unconditional and conditional approaches investigated. As with the hazard assessment for a period of T = 1.0s, the conditional risk estimates match the unconditional approach more closely; however, there is still a noticeable difference between the two approaches.

Fig. 11

Mean risk curves for all created models evaluating an SDOF with period T = 1.0s, solid lines show the mean risk whilst dashed lines show the 16th and 84th percentiles of the mean

6 Observation-based hazard assessment

So far, this study has compared the impact that different GMMs have on risk estimates. However, if sufficient real-world ground motions were available, stochastic models and GMMs would no longer be needed to assess seismic risk. Instead, the empirical data could be used directly within a Monte Carlo analysis to estimate the hazard and risk in the area, as has been done with the simulated ground motions in this study. To test this idea, an “observation-based” hazard assessment is presented here, where the hazard estimated using real strong-motion records is compared against the benchmark hazard.

First, to improve the match between the benchmark hazard and the empirical data, a simpler seismic scenario than that of the previous sections is considered, leaving just the areal source. To provide strong-motion records, the NGA-West2 database (Ancheta et al. 2014) was selected. This is one of the largest databases yet compiled, so there is a good chance that it contains enough records to provide a hazard assessment comparable to the benchmark. Moreover, NGA-West2 mainly includes records from the same geographical region as the Atkinson-Silva stochastic model (i.e., the western United States), so ground motions from this database should be similar to those simulated by the stochastic model, making a comparison of this method against the benchmark hazard suitable.

To estimate hazard from these observational data, both the simulated magnitude-distance pairs from the stochastic model and the magnitude-distance combinations from the NGA-West2 database are binned. The bins, and the number of magnitude-distance pairs in each bin, are shown in Fig. 12. Records that fall outside these bins (i.e., with magnitudes less than 5 or greater than 7.5, or hypocentral distances greater than 100 km) are discarded as they fall outside the range of the seismicity model used.

Fig. 12

Heat-map showing the number of records belonging to each magnitude-distance bin from (a) an example simulated catalogue and (b) the NGA-West2 database

Each simulated magnitude-distance combination is randomly matched to an NGA-West2 record that falls into the same bin, with the corresponding empirical ground motion assigned to the simulated event. For example, if the simulated magnitude-distance pair fell in the range of magnitude 5.5–6.0 and distance 25.12–39.81 km, the ground motion assigned to this simulated event would be one of the 129 records that fell into the same bin. Records are randomly selected from each bin to introduce variability into the hazard model. The process is repeated for all 100 sets of records to produce mean hazard results.
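This matching procedure could be sketched as follows, where the sim_* arrays describe the simulated catalogue and the db_* arrays the database records (all variable names are illustrative; events landing in empty bins are discarded, as above):

```python
# Sketch of the observation-based matching: each simulated magnitude-distance
# pair draws, at random, a database record from the same magnitude-distance
# bin, and that record's Sa is assigned to the simulated event.
import numpy as np

def match_records(sim_m, sim_r, db_m, db_r, db_sa, m_edges, r_edges, rng):
    sim_bin = (np.digitize(sim_m, m_edges), np.digitize(sim_r, r_edges))
    db_bin = (np.digitize(db_m, m_edges), np.digitize(db_r, r_edges))
    matched = np.empty(len(sim_m))
    for i in range(len(sim_m)):
        pool = np.where((db_bin[0] == sim_bin[0][i])
                        & (db_bin[1] == sim_bin[1][i]))[0]
        matched[i] = db_sa[rng.choice(pool)] if len(pool) else np.nan
    return matched   # NaN entries correspond to discarded (empty-bin) events
```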

The stochastic model used in this research is calibrated to rock sites; however, there were insufficient data to include only rock sites from the NGA-West2 database in this analysis. To account for this, two different approaches were considered: in the first, all sites are used within the hazard assessment, irrespective of the average shear-wave velocity in the top 30 m of the ground (Vs30); in the second, only sites with Vs30 greater than 450 m/s are considered. Figure 13 shows the mean hazard curves produced by these two approaches, with dashed lines representing the 16th and 84th percentiles.

Fig. 13

Mean hazard curves for the four observation-based hazard assessment models compared to the unconditional benchmark hazard, solid lines show the mean hazard whilst dashed lines show the 16th and 84th percentiles of the mean

Figure 13 shows that the purely observational method with no site restrictions consistently over-predicts hazard. This is likely because the NGA-West2 database is dominated by sites with relatively low Vs30, meaning site amplification is greater and stronger ground motions are produced than for the fixed Vs30 of 620 m/s assumed in the benchmark stochastic model. The site-restricted model only produces an improved hazard prediction at small Sa, suggesting that there is an abundance of sites with low Vs30 producing small ground motions.

To achieve a better fit to the benchmark hazard assessment, a Vs30-based adjustment is introduced. This adjustment scales the observed ground motions by a site amplification factor, estimated using the site amplification equation from the Chiou and Youngs (2014) GMPE. This model was calibrated on the NGA-West2 database and hence is appropriate for this adjustment. The adjustment was applied to both the site-restricted and unrestricted approaches to produce two new observation-based hazard curves, also shown in Fig. 13.
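The adjustment has the general form sketched below: each observed Sa is scaled by the ratio of the modelled site amplification at the benchmark Vs30 to that at the recording site's Vs30. Only a schematic linear site term is shown, with placeholder coefficients; the actual Chiou and Youngs (2014) site equation also includes nonlinear, amplitude-dependent terms.

```python
# Heavily simplified sketch of the Vs30-based adjustment. PHI1 and V_REF only
# gesture at the form of the Chiou and Youngs (2014) linear site term; they
# are placeholders, not the published coefficients.
import numpy as np

PHI1 = -0.5          # placeholder slope of the linear site term (ln units)
V_REF = 1130.0       # placeholder reference velocity (m/s)

def ln_site_amp(vs30):
    return PHI1 * np.minimum(np.log(vs30 / V_REF), 0.0)

def adjust_to_benchmark(sa_obs, vs30_site, vs30_target=620.0):
    """Scale an observed Sa from its recording site to the benchmark Vs30."""
    return sa_obs * np.exp(ln_site_amp(vs30_target)
                           - ln_site_amp(np.asarray(vs30_site)))
```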

Of these two new hazard curves, only the non-restricted model with site-effects scaling (denoted “observational scaled” in Fig. 13) greatly improves the results. This model matches the benchmark considerably better than the other curves across the whole range of Sa values, only slightly overpredicting hazard. Although promising, this method does not provide a particularly accurate description of the scenario hazard when compared against the benchmark. To improve this, further study could consider a larger database with a closer match to the site conditions of the stochastic model, in the hope of obtaining a better hazard assessment without the need to introduce any ground-motion modelling at all.

It is worth noting that the uncertainties presented in Fig. 13 appear small. This is likely due to the limited sampling available to the observation-based models. For example, in the bin for magnitudes 5.5–6.0 and distances 39.81–63.10 km there are 204 empirical records and 6088 simulated records, meaning each record would be sampled on average 30 times for this set of data. Using a bigger database may resolve this issue and capture the natural variability of ground motions more accurately.

7 Conclusions

In this study, a comparison of different ground-motion prediction methods has been carried out with regard to their impact on hazard and risk estimates. A fictive scenario was established, with a stochastic model employed to simulate ground motions and build these models.

Three ground-motion models were considered: a basic and a more complex classical ground motion prediction equation, and an artificial neural network. An empirical fragility curve was also created from simulated ductility data, before being convolved with the hazard estimates from each of the ground-motion models to produce risk estimates via the conditional approach. Alongside this, the simulations were used directly in a Monte Carlo analysis to estimate benchmark hazard and risk results, against which the ground-motion-model-based estimates were compared.

Finally, an observation-based hazard assessment technique was outlined that demonstrated potential to estimate hazard when compared against the benchmark results, if a site-amplification scaling was applied. This technique indicates that in certain situations it may be possible to estimate hazard and risk purely from empirical data, without needing a ground-motion model.

Conditional results show that careful selection of a ground-motion model is required to obtain the best estimates of seismic risk. Of the three ground-motion models created, only the artificial neural network produces hazard estimates similar to the benchmark results, and the same outcome is visible in the risk estimates. It is important to note, however, that artificial neural networks are only successful when trained on large, complete datasets, something that is hard to replicate in the real world.

Hazard disaggregation was performed on each hazard model, showing that the ground motion prediction equations struggle to predict high-magnitude, short-distance events at higher ground-motion levels, causing the over-prediction of hazard and, ultimately, risk. It may be possible to partially reduce this problem by using more complex functional forms, but these are difficult to constrain without large datasets. Uncertain risk estimates may result in inaccurate risk assessment and design, and ill-informed decision-making. If estimates are poor in a data-rich scenario, they will be worse in the real world, where data are less comprehensive.