1 Introduction

Large shallow crustal earthquakes occur infrequently. For example, in the Global Centroid Moment Tensor (CMT) catalogue for the 10-year period running from 1 January 2006 to 31 December 2015 there were 75 earthquakes with moment magnitude (M) 7.2 or larger and centroid depths 50 km or shallower. Of these earthquakes, only a small proportion are crustal earthquakes (the rest are interface subduction events) and, of those, only seven were recorded at close distances by strong-motion networks, i.e. an average of less than one per year. As demonstrated by the 2015 Gorkha (Nepal) earthquake (M 7.9) such events have the capability of causing great destruction over a wide area. Because of the large fault ruptures associated with such events, their study also improves our understanding of tectonics. Therefore, these earthquakes are important both from engineering and scientific viewpoints.

The purpose of this study is to collect the largest possible set of peak ground accelerations (PGAs) from large shallow crustal earthquakes and to compare them to predictions from recent ground motion prediction equations (GMPEs). The recent NGA-West2 GMPEs were derived using a global database that contained 15 earthquakes with M ≥ 7.2 contributing 1224 records (Ancheta et al. 2014). By conducting an extensive literature review and by including data that would generally not be considered for the derivation of GMPEs (e.g. PGAs from non-digitized analogue accelerograms) it has been possible to compile PGAs from an additional 23 crustal earthquakes with M ≥ 7.2. The total number of available PGA observations from these additional events is quite low, and there are often uncertainties associated with their use. Nevertheless, due to the rarity of large earthquakes and the often-stated need to better constrain ground-motion predictions for such events (e.g. Douglas and Edwards 2016), the use of these data for this study was considered acceptable.

In the following section characteristics of the collected data are reported (for the interested reader an Electronic Supplement containing the data used for this study as well as references is provided). The methods used to estimate the independent parameters used by the considered GMPEs are also presented. Section 3 presents the comparisons to eight recent GMPEs in terms of residual plots. Some brief conclusions are then given.

2 Data

To identify data from large earthquakes that are not listed in the NGA-West2 (Ancheta et al. 2014) or RESORCE (Akkar et al. 2014a) databases, a list of shallow earthquakes with magnitudes (any scale) larger than or equal to 7.2 since 1933 (the advent of strong-motion recording) was obtained from the online catalogue of the International Seismological Centre (2016). PGA observations from all earthquakes in this list that occurred close to land and were not clearly subduction events were sought via a web and literature search. Valuable sources of data were the Seismic Engineering Program Reports published as United States Geological Survey Circulars as well as earthquake field reports published by the Earthquake Engineering Research Institute and, for recent events, the Center for Engineering Strong Motion Data. In an attempt to add PGAs for poorly-recorded earthquakes, data were sought from the broadband Global Seismographic Network available through the Wilber 3 data service of the Incorporated Research Institutions for Seismology. However, because of the low sampling rates and high sensitivity of these instruments, no reliable PGAs were obtained from this network at distances less than 500 km, so these data were not considered further.

A lower magnitude limit of 7.2 was chosen for this study rather than 7.0, which would be more usual for “large” earthquakes, because lowering the limit would have significantly increased the number of earthquakes that would need to be considered. Assuming a Gutenberg–Richter b value of 1, the number of earthquakes would have increased by more than 50% in going from 7.2 to 7.0, with a concomitant increase in the time required for data collection. Given more time this study could be repeated with a lower magnitude limit but it is likely that the conclusions would be similar.
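The size of this increase follows from the Gutenberg–Richter relation, log10 N(≥M) = a − bM, with b taken as a positive number. The arithmetic can be sketched as follows; the function name is illustrative only and the b value of 1 is the assumption stated in the text.

```python
def gr_count_ratio(m_low, m_high, b=1.0):
    """Ratio N(>= m_low) / N(>= m_high) under log10 N = a - b*M.

    The a value cancels in the ratio, so only b and the magnitude
    difference matter.
    """
    return 10.0 ** (b * (m_high - m_low))


# Lowering the threshold from 7.2 to 7.0 with b = 1
ratio = gr_count_ratio(7.0, 7.2)
increase_percent = (ratio - 1.0) * 100.0
print(f"ratio = {ratio:.2f}, increase = {increase_percent:.0f}%")
```

For b = 1 the ratio is 10^0.2, i.e. roughly 1.58, which is the source of the "more than 50%" figure.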

From this search, 23 earthquakes, in addition to the 15 contained in the NGA-West2 database, were identified as having PGA observations from ground-response stations (these are not necessarily “free-field” installations—the instrument housing may have slightly affected the measured ground motions). Because it is thought that some of the identified records were never digitised (due to their considerable source-to-site distances, for example) or publicly disseminated, it was often not possible to obtain the actual accelerograms associated with these PGAs. Consequently this study is limited to an analysis of PGAs. Some additional PGAs were also included for the 15 earthquakes already listed in NGA-West2. In total 1895 PGAs have been collated (1224 of which are from the NGA-West2 database, whose PGAs and metadata are used when available). A summary of these data is given in Table 1. Most of the recent earthquakes are well-recorded but for many of the early events and those occurring in areas with sparse instrumentation (e.g. 2006 Koryak and 2009 Offshore Honduras) only a few PGAs are available. This means that the between-event terms for these earthquakes are poorly constrained. For some of the earthquakes (e.g. the Japanese earthquakes of 1964, 1983 and, particularly, 1993) more accelerograms were probably recorded but these do not appear to have been published in publicly-available English-language sources.

Table 1 Summary of the earthquakes and PGAs compiled for this study

An extension of the data collection to response spectral acceleration or peak ground velocity was considered. For at least 290 of the 671 additional observations identified for this study (i.e. those not in the NGA-West2 database) only PGAs are readily available as the corresponding accelerograms were never digitised or the time-histories publicly released. Obtaining the accelerograms from which to compute other intensity measures for the remaining identified PGAs would often require considerable effort (or it would be impossible if the data have been lost or were not released by the owners) because the data are not available online but simply referred to in publications. Consequently it was decided to limit the study to PGA rather than examine a smaller dataset similar to the original NGA-West2 database for other intensity measures.

For the 15 earthquakes contained in the NGA-West2 database the moment magnitudes reported there have been retained. For the other earthquakes, the moment magnitudes reported in the Global CMT catalogue have been used except for pre-1976 events, for which estimates reported in earthquake-specific studies are adopted (because the Global CMT catalogue begins in 1976). For the 1976 Caldiran event NGA-West2 reports a magnitude of 7.21 and hence it is included within the selected data, whereas Global CMT gives 7.0 for this earthquake and hence it should not perhaps have been retained. For the 1992 Cape Mendocino earthquake NGA-West2 lists a magnitude of 7.01 and hence it is not used here but Global CMT reports 7.2 for this event and hence it should perhaps have been included. A comparison was made between magnitudes from Global CMT and those provided by the National Earthquake Information Center (NEIC), which is another common global source for moment magnitudes, for the 32 earthquakes considered here that are in both catalogues. For two earthquakes (1979 St Elias; 1983 Sea of Japan), which contribute only 16 PGAs, the Global CMT magnitudes were 0.4 and 0.3 units higher than those given by NEIC, respectively, but for all other events the magnitudes were within 0.1 units and the overall mean difference was zero (to two decimal places). Therefore, we conclude that the magnitudes used for the considered events are generally consistent with those provided by other sources.

The Global CMT catalogue is also used to classify events by mechanism, with earthquakes with rake angles within 30° of the horizontal classified as strike-slip, and normal and reverse earthquakes being those with other negative and positive rake angles respectively. Of the 38 earthquakes only two (1959 Hebgen Lake; 1995 Aqaba) are normal-faulting events (contributing 20 PGAs), 20 are reverse faulting (contributing 1073 PGAs) and the remaining 16 have strike-slip mechanisms (contributing 802 PGAs). The classification of some of these earthquakes (1964 Niigata; 1975 Kalapana; 1983 Sea of Japan; 1993 Hokkaido-Nansei-Oki; 2015 Gorkha and its aftershock) in the same category as the shallow crustal earthquakes used for the derivation of the considered GMPEs is debatable because of their locations and shallow dip angles but they were retained for completeness (as shown below PGAs in these events do not show large differences compared to the other data). The 2001 Bhuj earthquake is generally considered as having occurred in a stable continental region rather than an active region so again it may have atypical characteristics (as shown below this seems to be the case). As shown below, distant PGAs from the 2008 Wenchuan earthquake, which occurred in an active region (as exemplified by its inclusion in the NGA-West2 database), appear affected by travel paths through stable continental crust that attenuate high-frequencies less than does active crust. Large earthquakes often have complex ruptures so the pertinence of mechanism categories based on teleseismic centroid moment tensors is debatable but it is retained here for simplicity and consistency with previous studies. A map showing the locations of the analysed earthquakes is given in Fig. 1. The magnitude–distance-mechanism distribution of the collected data is shown in Fig. 2. 
Figure 3 shows a histogram of the source-to-site distances of the data—much of the data are from considerable distances but nevertheless they provide valuable information on ground motions in large events.
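The rake-based classification described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, the 30° boundary is assumed to be inclusive, and the choice of which nodal plane supplies the rake is not stated in the text.

```python
def classify_mechanism(rake_deg):
    """Classify style of faulting from the rake angle (degrees).

    Rakes within 30 degrees of horizontal (near 0 or +/-180) are
    strike-slip; other positive rakes are reverse and other negative
    rakes are normal.
    """
    # Wrap the rake into the interval [-180, 180)
    rake = (rake_deg + 180.0) % 360.0 - 180.0
    if abs(rake) <= 30.0 or abs(rake) >= 150.0:
        return "strike-slip"
    return "reverse" if rake > 0.0 else "normal"


for rake in (5.0, 175.0, 90.0, -90.0):
    print(rake, classify_mechanism(rake))
```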

Fig. 1 Geographical distribution of the collated data. The filled markers indicate earthquakes already included in the NGA-West2 database, the size of the markers indicates the number of PGA observations used and the colour indicates the mechanism: black (strike-slip), red (reverse) and cyan (normal)

Fig. 2 Magnitude–distance-mechanism distribution of collated data (unfilled symbols indicate data not from the NGA-West2 database). Note that recordings at RJB < 0.1 km have been plotted at 0.1 km

Fig. 3 Distribution of data with RJB. Note that most distant (>400 km) data are from the 2001 Bhuj and 2008 Wenchuan earthquakes

Thanks to their large size, the location and geometry of the fault rupture planes for almost all of the earthquakes are well established through special studies or from surface rupture (the Finite-Source Rupture Model Database: http://equake-rc.info/SRCMOD/ was useful in providing geometries). This information allows the computation of the extended-source distance metrics employed by all the selected GMPEs as well as the depth to the top of rupture used by three of the models. For those data within the NGA-West2 database the estimates of these parameters reported there have been used but for the other records these distances and depths have been calculated or taken from other sources.

The selected GMPEs all use the average shear-wave velocity in the top 30 m (VS30) to characterise the near-surface site conditions at the strong-motion stations. If this information was available in the NGA-West2 database or was given in other published sources, these estimates are used. Only descriptions of the site conditions were available for many of the additional data. For these sites, the procedure proposed by Seyhan et al. (2014) of using various geological or geotechnical proxies to estimate VS30 was used. For some PGAs no information on the local site conditions could be found—for these we assume a VS30 of 310 m/s (corresponding to generic “soil”). Such estimates are obviously associated with large uncertainties but again given the sparsity of data from large earthquakes it was decided that this uncertainty was acceptable.
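The order of preference for VS30 described above can be summarised in a short sketch. The function and argument names are hypothetical, and the proxy estimate is treated as an input because the proxy procedure itself is not reproduced here; only the 310 m/s default is taken from the text.

```python
DEFAULT_VS30 = 310.0  # m/s, the generic "soil" value assumed in the text


def select_vs30(published=None, proxy=None, default=DEFAULT_VS30):
    """Return (vs30, source) following the stated order of preference.

    published: value from NGA-West2 or another published source, if any
    proxy:     estimate from geological or geotechnical proxies, if any
    """
    if published is not None:
        return published, "published"
    if proxy is not None:
        return proxy, "proxy"
    return default, "assumed"


print(select_vs30(published=520.0))
print(select_vs30(proxy=270.0))
print(select_vs30())
```

Keeping the source label alongside each value makes it straightforward to rerun the analysis on the better-constrained sites only.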

Because of their age, much of the additional data collected for this study come from analogue instruments that recorded on film or paper in triggered mode. Such instruments are less sensitive than modern digital sensors to ground motions and hence there is a chance that they only triggered because the ground motions were higher than would be expected given the earthquake size, source-to-site distance and site conditions. Because of this, strong-motion studies often limit analysis to those records that occurred at distances shorter than the average triggering limit (e.g. Douglas 2003). For this study the triggering limits proposed by Boore et al. (2014, their Fig. 1) for analogue instruments (220 km for M ≥ 7) were considered along with using all data regardless of the distance at which it was recorded, although this is often far beyond the stated range of applicability of the GMPEs.

Because some analogue records were never examined in detail, the true PGA was not reported, only that it is over the (generally unknown) triggering threshold and below the (unknown) amplitude for digitisation. To retain these records it was decided to assume a PGA of 0.03 g, which is half way between a 0.01 g trigger and a 0.05 g minimum for digitisation. This obviously introduces an additional uncertainty into the analysis but it only affects 14 PGAs from five different earthquakes; hence, the impact on the overall conclusions is minimal.

An Excel spreadsheet containing the data used for this study and the references is provided as an Electronic Supplement.

3 Comparisons to GMPEs

The observations are compared to predictions from eight recent GMPEs that are not excluded by the criteria of Bommer et al. (2010): Abrahamson et al. (2014), Akkar et al. (2014b, c), Bindi et al. (2014a, b), Boore et al. (2014), Campbell and Bozorgnia (2014), Cauzzi et al. (2015), Chiou and Youngs (2014) and Zhao et al. (2016). Only ground-motion models for shallow crustal earthquakes in active tectonic regimes are considered as almost all identified records come from such regions (as opposed to stable areas). The four NGA-West2 models selected (Abrahamson et al. 2014; Boore et al. 2014; Campbell and Bozorgnia 2014; Chiou and Youngs 2014) are derived predominantly from Californian data when considering all magnitudes. For M ≥ 7.2, however, the 1999 Chi–Chi and 2008 Wenchuan events contribute just over half of the records used to derive these models, the 2010 El Mayor earthquake (in Baja California) about a third, the 1992 Landers and the 1952 Kern County earthquakes (in California) less than 10% and other non-Californian earthquakes the rest. Two of the models (Akkar et al. 2014b, c; Bindi et al. 2014a, b) used data from Europe, the Mediterranean and the Middle East, for which the database is particularly sparse for large events. The final two models (Cauzzi et al. 2015; Zhao et al. 2016) are derived principally from Japanese data, where again the database of strong-motion data for large crustal events is limited. Nevertheless, the GMPE of Zhao et al. (2016) accounts for the findings of Zhao and Lu (2011), based on a worldwide database of 710 records, that magnitude-scaling of M7+ earthquakes is much lower than for smaller events. Information on these models can be obtained from the original articles and the online GMPE compendium of Douglas (2017).

The mean offset (or bias, ck), event terms and within-event residuals for each of the eight GMPEs were computed using the procedure given in Boore et al. (2014, Section “Methodology and Model Performance”) using the linear mixed effects algorithm (lme function) as implemented in the nlme package of R (Pinheiro et al. 2017). Natural logarithms (ln) are used here as this is becoming the de facto standard. These biases and between-event and within-event residuals are the basis of most of the comparisons discussed in this section.
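The decomposition of the total residuals into a bias, event terms and within-event residuals can be illustrated with a simplified sketch. It is not the lme fit used in this study: it swaps in a plain maximum-likelihood EM-style iteration for a one-way random-effects model, r_ij = c + η_i + ε_ij, and all names and the example residuals are hypothetical.

```python
import math


def decompose_residuals(resid_by_event, n_iter=500):
    """Split total residuals, ln(observed) - ln(predicted), grouped by event.

    Returns the bias c, event terms eta_i, within-event residuals eps_ij,
    and the standard deviations tau (between-event) and phi (within-event).
    """
    events = list(resid_by_event)
    n = {e: len(resid_by_event[e]) for e in events}
    mean = {e: sum(resid_by_event[e]) / n[e] for e in events}
    n_total = sum(n.values())
    grand = sum(sum(resid_by_event[e]) for e in events) / n_total
    total_var = sum((r - grand) ** 2
                    for e in events for r in resid_by_event[e]) / n_total
    tau2 = phi2 = max(total_var / 2.0, 1e-12)  # starting values

    for _ in range(n_iter):
        denom = {e: phi2 + n[e] * tau2 for e in events}
        # Bias: weighted mean of the event means (generalised least squares)
        w = {e: n[e] / denom[e] for e in events}
        c = sum(w[e] * mean[e] for e in events) / sum(w.values())
        # Conditional mean and variance of each event term
        eta = {e: tau2 * n[e] * (mean[e] - c) / denom[e] for e in events}
        post_var = {e: tau2 * phi2 / denom[e] for e in events}
        ss_within = sum((r - c - eta[e]) ** 2
                        for e in events for r in resid_by_event[e])
        # Update the variance components
        tau2 = max(sum(eta[e] ** 2 + post_var[e] for e in events)
                   / len(events), 1e-12)
        phi2 = max((ss_within + sum(n[e] * post_var[e] for e in events))
                   / n_total, 1e-12)

    within = {e: [r - c - eta[e] for r in resid_by_event[e]] for e in events}
    return c, eta, within, math.sqrt(tau2), math.sqrt(phi2)


# Hypothetical total residuals for three events (illustration only)
example = {
    "A": [0.5, 0.6, 0.4, 0.55],
    "B": [-0.5, -0.4, -0.6, -0.45],
    "C": [0.0, 0.1, -0.1],
}
c, eta, within, tau, phi = decompose_residuals(example)
print(f"bias = {c:.3f}, tau = {tau:.3f}, phi = {phi:.3f}")
```

Event terms estimated in this way are shrunk towards zero for events with few records, which is why poorly recorded earthquakes contribute weakly constrained between-event terms.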

Figure 4 shows the collated PGAs against Joyner–Boore distance (RJB). The data generally follow the predictions from the GMPEs, although there is considerable dispersion in the predictions, particularly close (<20 km) and far (>200 km) from the source. The purpose of this figure is not a rigorous comparison between predictions and observations but to show the distribution of the PGAs and to examine general trends in the data for certain distance ranges (hence the use of a single magnitude, VS30 and style of faulting). The PGAs show clear saturation at short (<20 km) distances. At large distances, there is more variability in the observations because of different regional attenuation rates. PGAs from two events (2001 Bhuj and 2008 Wenchuan, which are highlighted) at large distances are much higher than PGAs in other earthquakes of similar size (e.g. 2016 Kaikoura, also highlighted) as well as predictions from models that explicitly include anelastic attenuation (e.g. Boore et al. 2014), even if the regional adjustments for these models are included. Predictions from models that do not include terms for anelastic attenuation (e.g. Akkar et al. 2014b, c), however, match the observations better. This is clearly seen in a plot of the within-event residuals (Fig. 5). The good fit of the Akkar et al. (2014b, c) model to the data at greater distances may be coincidental, given that these developers did not use data beyond 200 km or any data from the Bhuj or Wenchuan earthquakes.

Fig. 4 PGA versus distance for all collated data and for three highlighted events that are discussed in the text. Also shown are predicted PGAs from all eight GMPEs (and two regional variants of two models) evaluated for M = 7.5, VS30 = 370 m/s and strike-slip faulting. The Scherbaum et al. (2004) approach is used to convert RRUP to RJB if required to evaluate the GMPEs. The vertical lines at 220 and 400 km show the two distance limits used in the residual analyses reported in this article; all subsequent figures are for the 400 km maximum distance

Fig. 5 Within-event residuals with respect to RJB for two GMPEs. The vertical grey line marks the maximum distance used in subsequent figures. Data at distances less than 0.1 km are plotted at 0.1 km

The maximum distance for which the selected GMPEs are recommended for use by their developers varies between 150 km (Cauzzi et al. 2015) and 400 km (Boore et al. 2014), with most suggesting 300 km. Because of these recommendations, Figs. 4 and 5 (showing divergent behaviour of the observations and GMPEs for large distances), and the fact that PGAs at great distances are too small to be of much engineering interest, we decided to limit our subsequent analysis to data from RJB ≤ 400 km for all GMPEs. Removal of these data means that the Koryak 2006 earthquake (with a single PGA from a distance of 577 km) is not included in the final analysis. As noted above, we also repeat the analysis using a cut-off of 220 km (based on probable triggering distances).
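The effect of a distance cut-off on the set of usable events can be sketched as below. The record structure and function name are hypothetical, and apart from the single Koryak observation at 577 km the example values are invented for illustration.

```python
def apply_distance_cutoff(records, max_rjb_km):
    """Keep records with RJB <= max_rjb_km and report events that drop out."""
    kept = [r for r in records if r["rjb_km"] <= max_rjb_km]
    events_before = {r["event"] for r in records}
    events_after = {r["event"] for r in kept}
    return kept, sorted(events_before - events_after)


records = [
    {"event": "2006 Koryak", "rjb_km": 577.0},  # single PGA, from the text
    {"event": "Event X", "rjb_km": 12.0},       # invented values
    {"event": "Event X", "rjb_km": 310.0},
    {"event": "Event Y", "rjb_km": 85.0},
]

for cutoff in (400.0, 220.0):
    kept, dropped = apply_distance_cutoff(records, cutoff)
    print(cutoff, len(kept), dropped)
```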

Three of the considered GMPEs include hanging wall terms (Abrahamson et al. 2014; Campbell and Bozorgnia 2014; Chiou and Youngs 2014). For consistency between models and because the effect of these terms for this dataset was limited, we decided not to consider these terms for prediction. For the same reasons, because of the minor differences in predictions when they are applied (Fig. 4), and because this is a global rather than a regional study, we chose not to apply the regional correction factors of the NGA-West2 models. As an example, the overall bias of the Boore et al. (2014) model only changes from −0.046 to −0.057 (220 km cut-off) and from 0.095 to 0.068 (400 km cut-off) when the regional adjustments are applied.

The overall bias, within-event and between-event variability of the eight GMPEs are listed in Table 2 for both cut-off distances: part a of this table shows the results using all events whereas parts b, c, d and e consider various subsets of the data (see Sect. 3.4). The lowest overall biases (less than ±0.1 in natural-logarithm units, equivalent to roughly ±10%) are for the models of Akkar et al. (2014b, c), Abrahamson et al. (2014), Bindi et al. (2014a, b) and Zhao et al. (2016). All models show an overall bias less than ±20%. The analysis was repeated after removing all PGAs from distances RJB > 220 km. This reduces the total number of records to 1424 from 36 earthquakes (1969 Offshore Portugal and 2006 Koryak each have only a single PGA observation, which is from farther than 220 km). In general, the between-event and within-event variabilities decrease slightly (Table 2) and the absolute overall biases are less than 10% for Akkar et al. (2014b, c), Abrahamson et al. (2014), Bindi et al. (2014a, b), Boore et al. (2014) and Campbell and Bozorgnia (2014), less than 20% for Cauzzi et al. (2015) and less than 30% for Chiou and Youngs (2014) and Zhao et al. (2016) (note the biases for these two models are slightly higher using this shorter maximum distance). More discussion of the within- and between-event variabilities is given in Sect. 3.4.
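Because the residuals are in natural-logarithm units, a bias converts to a percentage difference through the exponential. A minimal sketch of the conversion, assuming residuals are defined as ln(observed) − ln(predicted) and using a hypothetical function name:

```python
import math


def ln_bias_to_percent(bias_ln):
    """Percentage by which observations differ from predictions on average."""
    return (math.exp(bias_ln) - 1.0) * 100.0


for bias in (0.1, -0.1, 0.095, 0.068):
    print(f"{bias:+.3f} ln units -> {ln_bias_to_percent(bias):+.1f}%")
```

For small values the percentage is close to 100 times the bias in ln units, which is why ±0.1 is quoted as roughly ±10%.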

Table 2 Summary of the overall bias, within-event variability (φ) and between-event variability (τ) using data from RJB ≤ 400 km and RJB ≤ 220 km

3.1 Magnitude scaling

Examining the between-event residuals with respect to magnitude (Fig. 6) allows the magnitude-scaling of the GMPEs to be tested. The magnitude scaling of five of the models (Abrahamson et al. 2014; Boore et al. 2014; Campbell and Bozorgnia 2014; Cauzzi et al. 2015; Chiou and Youngs 2014) is valid for these data, with no clear trends. The two RESORCE models (Akkar et al. 2014b, c; Bindi et al. 2014a, b) show slight overprediction for M > 7.6; neither model was derived using any data from such events. Zhao et al. (2016) shows slight underprediction for M > 7.6, again a range for which it is lacking data. We checked to see whether there are trends in the between-event residual with respect to the number of PGAs from each event that are available (see Fig. 7 for an example GMPE, although the conclusion is the same for all GMPEs) but the lack of data from some earthquakes does not appear to be affecting the results as no trends are seen. No trends are seen in the event terms grouped by mechanism.

Fig. 6 Between-event residuals for all eight GMPEs

Fig. 7 Between-event residuals for Zhao et al. (2016) with respect to the number of recordings per event

3.2 Distance scaling

The within-event residuals with respect to RJB (Fig. 8) show no clear trends for all NGA-West2 models, which were derived using a considerable number of records from all distances. The GMPEs of Akkar et al. (2014b, c), Bindi et al. (2014a, b), Cauzzi et al. (2015) and Zhao et al. (2016) show slight overprediction for RJB < 10 km, a distance range for which their underlying datasets are lacking records. There are some trends in the residuals at great distances (>200 km), which could be an indication of differing regional attenuation characteristics but these trends are limited. The residuals from the 2008 Wenchuan earthquake are more centred for some GMPEs (e.g. Bindi et al. 2014a, b) than for others (e.g. Zhao et al. 2016) but this should not be taken as a clear indication that those GMPEs are in general more reliable because this earthquake’s observations are probably not representative of the attenuation in generic active crustal regions (see discussion above). For many engineering purposes PGAs at distances greater than 100 km are generally of limited interest even for the largest crustal earthquakes.

Fig. 8 Within-event residuals for all eight GMPEs with respect to RJB (ck is the overall bias). Versions with narrower y axes are given in the “Appendix”. The red dots and bars are means and 95% CI of the means. Data at distances less than 0.1 km are plotted at 0.1 km
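Binned means and confidence intervals of the kind shown in the residual plots can be computed as sketched below. The bin edges, the normal approximation (mean ± 1.96 standard errors) and all names are assumptions made for illustration; the binning actually used for the figures is not stated.

```python
import math


def binned_mean_ci(distances_km, residuals, edges_km, z=1.96):
    """Mean and approximate 95% CI of the mean of residuals per distance bin."""
    summary = []
    for lo, hi in zip(edges_km[:-1], edges_km[1:]):
        vals = [r for d, r in zip(distances_km, residuals) if lo <= d < hi]
        n = len(vals)
        if n < 2:
            continue  # a standard error needs at least two values
        mean = sum(vals) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / (n - 1))
        half_width = z * sd / math.sqrt(n)
        summary.append({"lo_km": lo, "hi_km": hi, "n": n, "mean": mean,
                        "ci_low": mean - half_width,
                        "ci_high": mean + half_width})
    return summary


# Invented within-event residuals, for illustration only
dist = [1.0, 2.0, 3.0, 50.0, 60.0]
res = [0.1, 0.2, 0.3, -0.1, -0.3]
for row in binned_mean_ci(dist, res, edges_km=[0.1, 10.0, 100.0]):
    print(row)
```

A trend is suggested where the interval for a bin excludes zero, although bins with few records give wide intervals.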

3.3 VS30 scaling

Within-event residuals with respect to VS30 (Fig. 9) should not be considered in detail as most of the VS30 for the new data are estimates from site descriptions or are assumed generic values. There are some apparent trends: the site terms of Akkar et al. (2014b, c), Abrahamson et al. (2014) and Zhao et al. (2016) lead to underprediction for low VS30 (<300 m/s) and overprediction for high VS30 (>700 m/s). All other models show varying levels of overprediction for high VS30 (>700 m/s).

Fig. 9 Within-event residuals for all eight GMPEs with respect to VS30. The y axes have been cut at −1 and 1 to better show the trends in the averages. Versions with wider y axes are given in the “Appendix”. The red dots and bars are means and 95% CI of the means

3.4 Ground-motion variability

This study should provide better between-event variability estimates (τ) for large earthquakes because of the larger number of events, although the data uncertainties mentioned in Sect. 2 and the inclusion of earthquakes from various regions need to be considered before making firm conclusions on τ. Because most data collected were already used by the NGA-West2 developers, the within-event variability (φ) is still mainly controlled by those data, which have already been extensively analysed by those authors. In addition, because of the uncertainty in the VS30 estimates for most new records, this component of ground-motion variability is not as well constrained as for the NGA-West2 models.

The φ estimates obtained from the residual analysis are comparable to, if slightly higher than, those in the GMPEs (Table 2). The slightly higher values are likely attributable to the uncertainties in the VS30 estimates for some data as well as the inclusion of distant data from many regions with differing attenuation rates.

The estimates of τ from the residual analysis from the complete dataset cut at 400 km or at 220 km are larger (often by 0.1 or more) than those from all considered GMPEs except for that of Cauzzi et al. (2015), which has a particularly large model τ value (0.521) (Table 2). The NGA-West2 models have magnitude-dependent τ values, which decrease with magnitude. The addition of more events from the large-magnitude range suggests that the τ values associated with all models except Cauzzi et al. (2015) may need revising upwards if all the included earthquakes are considered representative of events in active crustal regions. As discussed below, however, the high τ values are coming mainly from two potentially atypical earthquakes and, therefore, it is not clear that the τ values in these models actually need to be increased.

Two events (2001 Bhuj and 2005 Crescent City) have high absolute event terms (PGAs from Bhuj are higher than the average while those from Crescent City are lower than the average). As noted above the Bhuj earthquake occurred in a stable continental region (north-west India) and hence perhaps should not be used to develop robust τ models for use with GMPEs derived for active zones. The Crescent City earthquake occurred near a plate boundary off the northern California coast and, hence, may not be a typical shallow crustal event. Removing those two events from the calculation leads to much lower τ estimates (they reduce by roughly 0.1) (Table 2b). They are now similar to the τ models proposed in the GMPEs, except for Cauzzi et al. (2015) where these data suggest a lower value is appropriate. The impact of also removing the data from the 2008 Wenchuan earthquake, which as noted above includes data with travel paths through stable crust, is minimal (Table 2c). Restricting the selection to only those contained in the NGA-West2 database also leads to similar estimates of τ and φ (Table 2d) thereby showing that these estimates are stable with respect to data selection (if potentially atypical events are excluded). Using only the data collected specifically for this study (i.e. the 671 PGAs not in the NGA-West2 database) leads to similar biases as for the other runs but the τ and φ values are slightly increased (Table 2e), which is possibly related to the higher metadata uncertainties of these data (e.g. limited VS30 estimates).

4 Conclusions

We have constructed a large peak ground acceleration dataset (1895 recordings) for earthquakes with moment magnitudes greater than or equal to 7.2. Those data are generally in good agreement with predictions from eight recently-published ground-motion prediction equations (GMPEs). The majority of the GMPEs have an overall bias in their predictions of less than 10%, with the largest bias being less than 20% when data to 400 km are considered. Most of the GMPEs show almost no trends in magnitude scaling. There is a general tendency to overpredict observations at distances within about 10 km by up to 40%; the GMPEs were developed using relatively few data in this distance range. There is an indication of some trends in the within-event residuals beyond about 200 km, where regional differences in attenuation, not included in our evaluation of the GMPEs, can be important. The trends are not consistent between the GMPEs, being overprediction of 10 to 20% for some GMPEs and underprediction by similar amounts for others. The within-event residuals plotted against VS30 show a trend with a negative slope, with overprediction at large VS30. The bias amounts to about a 20 to 30% difference in the range of VS30 from 200 m/s to 800 m/s. The within-event uncertainties are similar to those from the GMPEs, as are the between-event uncertainties if two events with unusually large and small motions, which are probably atypical for active crustal regions, are removed.

Although much of the data collected for this study have missing or uncertain independent parameters (particularly site conditions) and the PGAs themselves are associated with uncertainty (because some are read directly from undigitised accelerograms, many of which we have not seen), it is believed that the additional constraints provided by these data are valuable given how infrequently large events occur. These data provide additional support to the ground-motion scaling predicted by the eight examined GMPEs and could be useful for future revisions of these models.