Abstract
Seismic hazard and risk analyses are increasingly tapping into the previously underused resource of local weak-motion records. This is facilitating the development of local or even application-specific models for the characterisation of earthquake ground motion. In turn, this offers the opportunity to derive non- or partially non-ergodic models and significantly reduce bias and uncertainty. However, weak-motion data, while carrying important information about local earthquake source, path and site effects, are susceptible to noise. We show that high-frequency noise has a record- or region-specific impact on pseudo-spectral acceleration (PSA). This impact depends on the shape of the records’ Fourier amplitude spectrum (FAS): PSA from moderately to highly damped ‘soil’ records (e.g. Groningen, the Netherlands) is much less susceptible to high-frequency noise than PSA from weakly damped ‘rock’ records (e.g. Eastern North America). We make use of simulated ground motion records to develop a parametric model for the lower usable period of PSA (T_{min}). The model accounts for the impact of high-frequency noise on PSA, conditional on easily measured parameters characterising the shape of a record’s FAS. We then present a workflow, describing processing undertaken for records of induced seismicity from the Groningen gas field. The workflow includes the definition of maximum and minimum usable frequencies and periods of FAS and PSA, respectively. As part of the workflow, we present an approach that considers multiple estimates of T_{min}. These include the parametric model and, additionally, record-specific hybrid simulations that artificially extend or modify time series’ FAS beyond the noise floor to assess subsequent impacts on PSA.
1 Introduction
Ground motion models (GMMs) are a key component of a variety of scientific and engineering products, from seismic hazard and risk analyses, to shakemaps and magnitude scales. GMMs are developed using empirical data, either through direct regression, or in the case of simulation-based models, through calibration to recorded data. The quality of the underlying ground motion dataset is therefore of significant importance. The main issues that may reduce the quality of ground motion records are the instrument and datalogger (i.e. the record’s fidelity), and the background noise. The use of modern instrumentation, with broadband flat sensitivity, high-resolution dataloggers and reliable data transmission means that the main constraint on the usability of an earthquake ground motion record now lies with background noise. A great deal of attention has been paid to the processing of earthquake records in strong ground motion datasets to avoid or minimise the impact of noise on response spectra at long periods (Akkar and Bommer 2006) and short periods (Douglas and Boore 2011). However, little attention has been paid to the impact of noise on weak-motion records, which are increasingly utilised in studies developing application- and location-specific GMMs (e.g. Bommer et al. 2016; Novakovic et al. 2018; Edwards et al. 2021).
The recent increase in induced seismicity related to emerging industries, such as hydraulic fracturing, geothermal energy and CO_{2} sequestration, is of particular global concern, especially within the context of the transition to low-carbon economies. Induced seismicity tends to be of smaller magnitude, but often occurs at shallow depths and in close proximity to urban areas. This means that smaller magnitude events contribute to the seismic hazard and risk of local population centres. Furthermore, there is increasing recognition that ground motions are regionally variable, particularly for small to moderate earthquakes occurring at shallow depths (Douglas and Edwards 2016). As a result, engineers must either modify existing models, or develop local GMMs using regional data or, better still, data local to the target site. This spatial limitation means that motions will inevitably be of smaller amplitude if uncertainty and biases in developed models are to be reduced.
Seismic noise is a continuous, but variable, vibration with various sources. At low frequency, the microseism dominates and is related to natural phenomena such as ocean waves (Webb 1992). At high frequency, sources of seismic noise tend to be anthropogenic, owing to their lower propagation efficiency. Monitoring instruments in close proximity to urban environments are therefore susceptible to both low- and high-frequency disturbances. Cauzzi and Clinton (2013) and Peterson (1993) provide an overview of seismic noise and develop models for ‘high’ and ‘low’ noise cases. In terms of the impact of seismic noise on ground motion time-histories or, equivalently, Fourier spectra, an unambiguous assessment is possible by comparing earthquake records with ‘noise records’. Noise records are typically taken from the immediate pre-event time-history (such that transient noise at the time of recording may be captured), but equally an estimate could be reconstructed based on the high- and low-noise models (Cauzzi and Clinton 2013; Peterson 1993). The ratio of signal (plus noise) to noise Fourier spectral amplitudes, typically referred to as the signal-to-noise ratio (SNR), provides a useful measure of signal contamination. Thresholds above three are usually considered suitable, but this varies between applications, with authors typically striking a balance between data quality and quantity.
The impact of noise on response spectral ordinates such as pseudo-spectral acceleration (PSA), which form the basis of seismic hazard and risk analyses, is more difficult to quantify than for Fourier amplitude spectra (FAS). This is because of the non-linear transform between the Fourier domain (representing the signal amplitude at a given signal frequency) and response spectral amplitudes (representing peak motions of an oscillator with characteristic period). Bora et al. (2016) show that this transform results in a roughly linear relationship between low Fourier frequencies and long oscillator periods (i.e. FAS(f = 1/T) ∝ PSA(T) for T > 0.5 s), but quickly becomes non-linear as periods reduce. At long periods, Akkar and Bommer (2006) showed that a usability limit of T_{max} = [0.7 to 0.97]/f_{l} was required to avoid the impact of long-period noise on PSA, with f_{l} (the minimum uncontaminated signal frequency) defined, for example, by an SNR threshold.
On the other hand, Bora et al. (2016) show that the shortest oscillator periods of engineering interest (e.g. 0.01–0.1 s) are driven by motions with longer Fourier periods than those of the corresponding response-spectrum oscillator period. In fact, noise present in the time series at very short periods, such as 0.01–0.03 s, does not necessarily affect the corresponding response spectrum at all, even in that specific period range. For instance, PSA at T = 0.01 s (often assumed equivalent to PGA) is typically related to ground motions at 20–30 Hz or lower, and nowhere near the 100 Hz implied by the reciprocal of the oscillator period. This was explored in detail by Douglas and Boore (2011), who concluded, through simulations of moderate to large events, that contaminating records with high-frequency noise had a negligible impact on their response spectra. This is frequently used as justification to ignore high-frequency noise and T_{min} when dealing with PSA.
Douglas and Boore (2011) investigated records for earthquakes typically found in strong ground motion datasets (M > 4.5) and, furthermore, made use of site conditions representative of such records, typically soil or stiff-soil sites. Both the moderate to high magnitude of events and the relatively high damping result in records with Fourier spectral amplitudes naturally lacking in high-frequency content. The records used by Douglas and Boore (2011) therefore exhibited low source corner frequencies (f_{0} < 1 Hz) and strong exponential decay at high frequency due to damping. For application to smaller events (weak-motion data), with higher source corner frequencies, or to records from rock or hard rock sites with low damping, we must consider signals with very different spectral content to those investigated by Douglas and Boore (2011). As noted previously, in such cases, the degree to which short oscillator period PSA is driven by longer period motions reduces and the conclusions of Douglas and Boore (2011) may, therefore, not be transferable to weak-motion data.
It is the aim of this study to investigate the impact of high-frequency noise on the response spectrum and propose a robust workflow for defining the usable bandwidth of both FAS and PSA from weak-motion records. For clarity, throughout this manuscript, we refer to PSA in terms of oscillator period, T, and FAS in terms of signal frequency, f, as per convention. A parametric model for the lowest usable (uncontaminated) period, T_{min}, is initially developed using simulations that account for the influence of Fourier spectral shape on the impact of noise on a record’s 5% damped response spectrum. The model is then used in direct application to investigate the impact of high-frequency noise on response spectral ordinates from weak-motion records of induced seismicity in the Groningen Gas Field, the Netherlands. The database consists of 803 triaxial recordings from events between 2006 and 2020 with local magnitudes ranging from M_{L} 2.5 to M_{L} 3.6 and is being used in the framework of the Groningen gas field hazard and risk analyses (van Elk et al. 2017) to develop a GMM (Bommer et al. 2017). The recordings are from high-quality digital accelerographs at 98 sites belonging to the B- and G-networks of the Royal Netherlands Meteorological Institute (KNMI; see Ntinalexis et al. 2019; Dost et al. 2017; KNMI 1993). By virtue of the small magnitude of the events, the recordings contain small-amplitude motions, with as-recorded horizontal PGA values ranging from 0.068 cm/s^{2} (7 × 10^{−5}g) to 108.68 cm/s^{2} (0.11 g).
2 Noise and its impact on FAS and PSA
The assessment of noise and its impact in the Fourier domain is relatively straightforward. Typically, pre-event noise samples are taken and compared with the record in the Fourier domain. It is important to account for differences in signal duration when sampling the time-history for noise. Authors sometimes ensure that both the earthquake time-history and the noise time-history are of equal length, but this is not always possible. In this case, noise FAS should be scaled by the square root of the ratio of duration between the earthquake and noise time-history (after Parseval’s theorem) to provide FAS amplitudes that are consistent (i.e. corresponding to equivalent signal lengths). Due to the characteristic ‘trapezoidal’ shape of the earthquake acceleration spectrum, it stands out over the broadly flat noise floor within the passband that can be considered acceptable (Fig. 1). Lower (f_{l}) and upper (f_{u}) usable frequency limits are therefore clearly identifiable.
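The duration scaling described above can be sketched as follows. This is a minimal illustration rather than the processing code used in the study: the function names `fas` and `snr_spectrum` are hypothetical, the FAS is taken as the modulus of the discrete Fourier transform multiplied by the sample interval, no tapering or smoothing is applied, and the noise spectrum is linearly interpolated onto the frequency grid of the earthquake window (an assumption made here for convenience).

```python
import numpy as np

def fas(x, dt):
    """One-sided Fourier amplitude spectrum (|DFT| * dt) and its frequencies."""
    x = np.asarray(x, dtype=float)
    freqs = np.fft.rfftfreq(x.size, d=dt)
    return freqs, np.abs(np.fft.rfft(x)) * dt

def snr_spectrum(signal, noise, dt):
    """SNR on the signal's frequency grid using duration-normalised noise FAS."""
    f_sig, a_sig = fas(signal, dt)
    f_noi, a_noi = fas(noise, dt)
    # Scale the noise FAS by sqrt(T_signal / T_noise) so that both spectra
    # correspond to equivalent signal lengths.
    scale = np.sqrt(len(signal) / len(noise))
    a_noi_scaled = np.interp(f_sig, f_noi, a_noi) * scale
    with np.errstate(divide="ignore", invalid="ignore"):
        snr = a_sig / a_noi_scaled
    return f_sig, snr
```

With this normalisation, two windows of stationary noise of different lengths yield an SNR that scatters around one, rather than around the square root of the length ratio.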
As noted earlier, for the response spectrum (PSA), the correlation with FAS amplitudes at corresponding oscillator periods decreases dramatically as signal frequencies increase above the record’s apparent corner frequency (roughly the peak of the FAS) (Bora et al. 2016). While at long periods we can therefore assume a correspondence of the minimum usable frequency of the record’s FAS (f_{l}) and the maximum usable period of the record’s response spectrum (i.e. T_{max} = [0.7 to 0.97]/f_{l}, after Akkar and Bommer 2006), at high signal frequencies and short oscillator periods, we cannot make this assumption.
In order to assess the impact of high-frequency noise on PSA, we initially work with synthetic data. This allows us an unambiguous definition of the true signal amplitude and corresponding uncontaminated response spectrum, which is not afforded with real data. Time-domain stochastic simulations have been performed using EXSIM (Motazedian and Atkinson 2005) as modified by Boore (2009). Simulations have been performed using the GMMs for:

i) the Groningen gas field at a buried rock horizon (V_{s30} = 1400 m/s), as detailed in Edwards et al. (2019), and
ii) Eastern North America (ENA, Atkinson and Boore 2006).
For the Groningen simulations, we investigate the impact of damping by varying the simulation parameter κ_{0} (Anderson and Hough 1984), using κ_{0} = 0.01, 0.03 and 0.05 s (roughly equivalent to damping expected at competent rock outcrops through to low V_{s} soil site conditions). The ENA GMM specifically allows the investigation of weakly damped motions, with a very hard-rock site condition (κ_{0} = 0.005 s) implicit in the GMM. In both cases, the GMMs are calibrated against local empirical data in the magnitude range of interest for this study (approx. M_{L} < 4), and the simulations can therefore be considered to be representative, yet diverse, in terms of amplitude and frequency content, of real earthquake records.
Noise-free acceleration time series from earthquakes with moment magnitudes from 1.0 to 6.0 (in 0.5 unit increments) are simulated at 20 log-spaced Joyner-Boore distances from 0.1 to 60 km (Fig. 1). Noise is subsequently applied to the simulations in increasing amplitude until the signals are completely lost. We use two noise forms: (i) white noise and (ii) the noise model of Cauzzi and Clinton (2013). The white noise is generated in the time domain and defined by a normal distribution with zero mean and standard deviations of 0.01, 0.1, 1, 10 and 100 cm/s^{2}. The higher values are not intended to reflect typical noise levels, but to ensure that all records are affected by noise. The model of Cauzzi and Clinton (2013) is used to consider a realistic high-noise scenario, using their high-noise power-spectrum model converted from dB/Hz to absolute units of spectral acceleration. We scale those amplitudes incrementally by factors of 0.1, 0.2, 0.5, 0.75 and 1.0, to generate a realistic stochastic-phase noise time-history that is added to the noise-free simulation in the time domain. With this approach, we retain acceleration time-histories for the noise-free simulation, the background noise and the contaminated ‘noisy’ simulation, with the latter referred to in the following as the ‘synthetic’ time series.
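The white-noise contamination step can be sketched as below. This is an illustrative sketch only: `contaminate` and `synthetic_suite` are hypothetical names, the noise-free simulation is assumed to be available as an array of acceleration samples in cm/s^{2}, and the seed is arbitrary. The noise standard deviations are those quoted in the text. The noise-free trace, the noise and the contaminated trace are all retained, as in the procedure described above.

```python
import numpy as np

# White-noise standard deviations quoted in the text (cm/s^2).
WHITE_NOISE_STD = (0.01, 0.1, 1.0, 10.0, 100.0)

def contaminate(clean_acc, noise_std, rng):
    """Add zero-mean Gaussian white noise in the time domain.

    Returns the contaminated ('synthetic') trace and the noise realisation.
    """
    clean_acc = np.asarray(clean_acc, dtype=float)
    noise = rng.normal(0.0, noise_std, size=clean_acc.size)
    return clean_acc + noise, noise

def synthetic_suite(clean_acc, seed=0):
    """One contaminated trace per noise level, keyed by noise standard deviation."""
    rng = np.random.default_rng(seed)
    return {std: contaminate(clean_acc, std, rng) for std in WHITE_NOISE_STD}
```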
From each of the synthetic time series’ FAS, various measures are determined to allow investigation of the impact of the noise level:

f_{u}: the upper usable FAS frequency in Hz (defined by a signal to noise ratio of 3);

f_{peak}: the frequency at the FAS peak;

A_{peak}: the natural logarithm FAS amplitude at f_{peak};

A_{u}: the natural logarithm FAS amplitude at f_{u};

∆A: the amplitude difference, A_{peak} − A_{u};

∆f: the frequency difference, f_{u} − f_{peak}.
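As a hedged illustration, the shape measures listed above could be extracted from a sampled FAS as follows. The function name `fas_shape_measures` is hypothetical, f_{u} is assumed to have been determined beforehand from an SNR analysis, and the FAS is assumed to be unsmoothed or smoothed consistently by the caller; the amplitude at f_{u} is taken at the nearest sampled frequency.

```python
import numpy as np

def fas_shape_measures(freqs, fas_amp, f_u):
    """Spectral-shape measures from a sampled FAS and a known upper usable frequency."""
    freqs = np.asarray(freqs, dtype=float)
    fas_amp = np.asarray(fas_amp, dtype=float)
    i_peak = int(np.argmax(fas_amp))             # index of the FAS peak
    i_u = int(np.argmin(np.abs(freqs - f_u)))    # sample nearest to f_u
    a_peak = np.log(fas_amp[i_peak])             # natural-log amplitude at the peak
    a_u = np.log(fas_amp[i_u])                   # natural-log amplitude at f_u
    return {
        "f_u": float(freqs[i_u]),
        "f_peak": float(freqs[i_peak]),
        "A_peak": float(a_peak),
        "A_u": float(a_u),
        "delta_A": float(a_peak - a_u),
        "delta_f": float(freqs[i_u] - freqs[i_peak]),
    }
```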
We measure T_{min} by determining the lowest period at which the 5% damped response spectrum of the acceleration time-history is unaffected by noise. This is defined as where the response spectrum of the synthetic time-history is within a 5% tolerance of the true value (Fig. 2). This is a conservative estimate, as PSA at shorter periods than the subsequently defined T_{min} may return to within the defined threshold. Our observations show that while this is often the case, PSA then tends to fluctuate within and outside the acceptable tolerance level at periods below T_{min} (Fig. 2). Low-pass frequency filtering of the time series at, or around, f_{u} results in severe impact on the PSA for weak-motion data, and, as such, should not be used. It is clear that unfiltered (or high-pass frequency filtered: band-pass f > f_{l}) time series allow calculation of PSA to periods well below 1/f_{u}. In fact, at the 5% tolerance level, the PSA from both these cases in Fig. 2 (from the simulation shown in Fig. 1) only just fail, with most PSA amplitudes being within ~10% of the true values. On the other hand, the low-pass frequency filtered time series result in up to 50% underestimation of PSA.
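The conservative definition of T_{min} can be sketched as a scan from long to short periods that stops at the first ordinate falling outside the tolerance. The function name `measure_tmin` and the decision to return `None` when even the longest period is contaminated are assumptions of this sketch; the response spectra themselves are assumed to have been computed elsewhere.

```python
import numpy as np

def measure_tmin(periods, psa_noisy, psa_true, tol=0.05):
    """Shortest period above which every PSA ordinate is within tol of the true value."""
    periods = np.asarray(periods, dtype=float)
    order = np.argsort(periods)[::-1]            # scan from long to short periods
    t_sorted = periods[order]
    ratio = np.asarray(psa_noisy, dtype=float)[order] / np.asarray(psa_true, dtype=float)[order]
    ok = np.abs(ratio - 1.0) <= tol
    if not ok[0]:
        return None                              # contaminated even at the longest period
    bad = np.flatnonzero(~ok)
    if bad.size == 0:
        return float(t_sorted[-1])               # usable down to the shortest period
    return float(t_sorted[bad[0] - 1])           # last acceptable period before first failure
```

For example, with PSA ratios of 1.02, 1.20, 1.03, 1.08, 1.01, 1.00 and 1.00 at periods of 0.01, 0.02, 0.05, 0.1, 0.2, 0.5 and 1.0 s, the function returns 0.2 s at the 5% tolerance, even though the ordinates at 0.05 and 0.01 s fall back within tolerance.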
Figure 3 shows the suite of measured f_{u} and T_{min} from simulations using the Groningen GMMs with alternative damping (κ_{0}), along with the ENA GMM. Clearly, as expected, for signals with higher f_{u}, we obtain PSA with smaller T_{min}. However, the spectral shape has a significant impact on the usability of the response spectra: low κ_{0} (weakly damped) records require much higher values of f_{u} to maintain usability of PSA down to 0.01 s. In terms of estimating T_{min} from the time series, we therefore require knowledge of not only the usable FAS bandwidth (i.e. f_{u}, which can be directly measured from field records), but also the spectral shape. In the following, we therefore develop a model for determining T_{min} that accounts for spectral shape by using easily measured characteristics of a waveform and its FAS.
First, we define an adjusted upper usable FAS frequency (\( {f}_{\mathrm{u}}^{\ast } \)) in Hz. The adjustment normalises f_{u} to that expected, given the same noise and peak-signal amplitude, for a signal with a reference damping, defined by κ_{ref} = 0.03 s. It therefore results in a predictor for T_{min} that is unbiased. \( {f}_{\mathrm{u}}^{\ast } \) is given by:
Figure 4 shows the resulting \( {f}_{\mathrm{u}}^{\ast } \) for the three alternative Groningen GMM simulations (κ_{0}= 0.01, 0.03 and 0.05 s). Note that \( {f}_{\mathrm{u}}^{\ast } \) values for the κ_{0}= 0.03 s simulations (the same as our selected reference, κ_{ref}) are broadly consistent with measured f_{u}. For the κ_{0}= 0.01 s simulations, \( {f}_{\mathrm{u}}^{\ast } \) are systematically reduced with respect to f_{u}, while for the strongly damped κ_{0}= 0.05 s simulations, \( {f}_{\mathrm{u}}^{\ast } \) are higher than measured f_{u}. Note that we define a maximum adjustment factor, 0.4, in Eq. 1 based on trial and error after observing overcorrection of very weakly damped (low κ_{0}) signals.
Figure 5 compares the original f_{u} (as Fig. 3) and adjusted \( {f}_{\mathrm{u}}^{\ast } \) against T_{min}. The use of \( {f}_{\mathrm{u}}^{\ast } \) clearly removes the dependence of the correlation on spectral shape. Based on \( {f}_{\mathrm{u}}^{\ast } \) and T_{min} for the Groningen GMM simulations (Fig. 5), a best-estimate lower usable period, \( \overline{T_{\mathrm{min}}} \) (in seconds), within an acceptable tolerance is defined by:
where a_{3} is the log-mean \( {f}_{\mathrm{u}}^{\ast } \) for 0.01 < T_{min} < 0.02 s and a_{1} and a_{2} are determined through log-linear regression of \( {f}_{\mathrm{u}}^{\ast } \) versus T_{min}. Bounds on T_{min} are then given by introducing a scaling factor, c, on \( {f}_{\mathrm{u}}^{\ast } \):
with the factors c = 1.113 (upper bound) and c = 1/1.113 (lower bound) designed to encapsulate the data (where 0.01 < T_{min} < 0.02 s) at n standard deviations of \( {\log}_{\mathrm{e}}\left({f}_{\mathrm{u}}^{\ast}\right) \). Based on an average over three Groningen GMM simulation scenarios with 1100 simulations in each, and using only data where T_{min} > 0.01 s, we determine a_{1} = − 1.753, a_{2} = 1.946, and a_{3} = 25.41 Hz (Fig. 5c). In addition, a maximum threshold of T_{min} = 0.1 s is imposed beyond which it is not possible to reliably estimate T_{min} from \( {f}_{\mathrm{u}}^{\ast } \). Predicted values of T_{min} are therefore deemed unresolved if they exceed 0.1 s.
In order to test if there is any sensitivity of the model to the selected magnitude-distance range, the data are split into subsets with magnitude 1–4.5 and 4.5–6 and distances 0–30 and 30–60 km. The model was found to be equally applicable to all of the data subsets. An example is shown for M = 4.5–6 at all distances in Fig. 6. These simulations were subject to unrealistically high levels of noise in order to obtain relatively low f_{u} for such large events. Interestingly, the model appears equally valid for these very noisy records of larger events, in addition to weak-motion data. While not routinely useful for strong-motion datasets (since noise levels rarely reach such amplitudes), the model would be useful for cases where significant anthropogenic noise levels are present, such as those used for earthquake early warning in industrial settings (Cauzzi et al. 2016). As a further validation exercise, we apply the model to the ENA dataset (Fig. 7), which has so far been withheld from the model development. Disregarding the very noisy data with T_{min} > 0.1 s (which, as noted previously, show very little correlation with f_{u}), the consistency with results from the Groningen simulations is very good, and the model for T_{min} is clearly suitable independent of the region.
Our choice of 5% tolerance for selecting observed T_{min} will clearly have an impact on the results discussed previously: allowing a larger tolerance when measuring T_{min} means that lower f_{u} are required (for a given T_{min}). In order to facilitate choice when implementing the T_{min} model, we have also calibrated coefficients for Eqs. 2 and 3 using alternative tolerances of 10 and 15% (Table 1, Fig. 8). An alternative to white noise was also explored by implementing the high-noise model of Cauzzi and Clinton (2013), which is somewhat more forgiving in the mid-period range than white noise. Here the noise is more realistic, but the larger events, particularly those simulated at near distances, are unaffected by the noise and are therefore not included in the derivation of the alternative model (since T_{min} = 0.01 s for those records). Using the high-noise model, a_{3}, which defines the frequency \( {f}_{\mathrm{u}}^{\ast } \) above which T_{min} = 0.01 s, is almost unchanged, being instead strongly related to the acceptable tolerance with respect to the true PSA. For tolerance values of 5, 10 and 15%, we observe a_{3} values of 24.4–25.4, 19.3–20.3 and 17.0–17.1 Hz, respectively. The shape of the T_{min} versus \( {f}_{\mathrm{u}}^{\ast } \) slope does change depending on the noise model used, however. This suggests that the shape of the noise spectrum itself, as well as the earthquake time series, has an impact on the usability of PSA.
3 Workflow: usable frequency range of FAS
In the following sections, we detail the application of a workflow used to define usable frequency (for FAS) and period (for PSA) for an induced seismicity dataset, specifically, a database of 803 recordings from the B- and G-networks of the KNMI in the Groningen region, the Netherlands. The recording networks and instrumentation used to record the acceleration time series are described in Ntinalexis et al. (2019). Prior to 2014, the monitoring network consisted of several GeoSig digital accelerographs. As a consequence of a M_{L} 3.6 earthquake that occurred in Huizinge on 16 August 2012, more detailed seismic studies were commissioned for the area. A significant upgrade and expansion of the existing network as well as the installation of new networks became part of this effort (Ntinalexis et al. 2019). The KNMI networks now consist of almost 100 modern Kinemetrics accelerometer stations with high-rate 24-bit datalogging. We can therefore safely assume that the predominant source of signal contamination in the dataset analysed will be external noise. The recordings examined were obtained during induced events of local magnitudes ranging from M_{L} 2.5 to M_{L} 3.6 that occurred between 2006 and 2020 in Groningen. The as-recorded horizontal PGA values of the records range from 0.068 cm/s^{2} to 108.68 cm/s^{2} and were recorded at epicentral distances ranging from 0.4 to 34 km (Fig. 9).
3.1 Maximum usable frequency
As mentioned previously, the maximum usable FAS frequency, f_{u}, can be selected via a signal-to-noise ratio analysis. We choose to select f_{u} as the maximum frequency of the continuous frequency window with SNR above 3. This is the simplest method to select the maximum usable frequency and is also widely employed in engineering and seismology. To conduct the SNR analysis, it is first necessary to obtain a noise model representative of the noise in the record. This is routinely determined as the FAS of the pre-event time series. In most modern recording networks, continuous data streams are available via online services and data portals, which allows the user to select a time window of their choice around the event. In these cases, it suffices for the user to select a time window with a long pre-event memory and select the first several seconds of that window to sample noise adequately. However, in networks operating on a triggering-only basis, such as the KNMI B-network in Groningen prior to 2014 (see Ntinalexis et al. 2019), limited time lengths of the pre-event memory may be available. In small-amplitude records such as those included in the Groningen database, the SNR at frequencies above 20 Hz can also be very sensitive to the selection of the noise window due to transient signals, and hence it is important to make sure that the noise window is carefully selected.
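A minimal sketch of the f_{u} selection is given below. The text does not state how the continuous window is chosen when several disjoint bands exceed the threshold; this sketch assumes the longest contiguous run of frequency samples with SNR above 3 is intended, and the function name `usable_band` is hypothetical. The SNR is assumed to have been computed (and smoothed, if desired) beforehand.

```python
import numpy as np

def usable_band(freqs, snr, threshold=3.0):
    """Frequency limits of the longest contiguous run with SNR above threshold.

    Returns (f_low, f_high), where f_high plays the role of f_u, or None if
    no sample exceeds the threshold.
    """
    above = np.asarray(snr, dtype=float) > threshold
    best = None
    start = None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i                            # a new run begins
        if start is not None and (not flag or i == len(above) - 1):
            end = i if flag else i - 1           # last index of the current run
            if best is None or (end - start) > (best[1] - best[0]):
                best = (start, end)
            start = None
    if best is None:
        return None
    return float(freqs[best[0]]), float(freqs[best[1]])
```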
A technique of dynamic noise window selection is employed in our analyses. We use the vertical component motion to determine the noise window to ensure we avoid P wave energy in the selected analysis window. While small in amplitude on the horizontal components, the P wave has non-negligible high-frequency energy that may bias the noise estimate (and therefore f_{u}). We begin by locating the time window from the beginning of the record to the point where the Arias Intensity is 0.5% of the total. We then determine short-term (−1 to +0.5 s) and long-term (−3 to +0.5 s) moving averages (STA and LTA, respectively) and compute the ratio (STA/LTA). A ratio above 1.2 signifies a significant amplitude change that can be associated with the first observable arrivals of the earthquake signal. We choose the end of the noise window to be the earliest of either the 0.5% Arias Intensity or the STA/LTA trigger (assumed to be the P wave). The noise window, as defined on the vertical component, is then used for the horizontal components. An example is shown in Fig. 10.
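The noise window selection can be sketched as follows. Several details are assumptions of this sketch rather than statements from the text: the moving averages are taken over squared acceleration, windows that extend beyond the record are truncated to the available samples, and the function name `noise_window_end` is hypothetical. The constant factor in the Arias Intensity cancels when normalising by the total, so only the cumulative sum of squared acceleration is needed.

```python
import numpy as np

def noise_window_end(acc_vertical, dt, arias_frac=0.005, ratio_trigger=1.2,
                     sta_pre=1.0, lta_pre=3.0, post=0.5):
    """End time (s) of the pre-event noise window on the vertical component."""
    acc = np.asarray(acc_vertical, dtype=float)
    n = acc.size
    csum = np.concatenate(([0.0], np.cumsum(acc ** 2)))   # prefix sums of energy

    # Time at which the normalised Arias Intensity first reaches arias_frac.
    i_arias = int(np.argmax(csum[1:] / csum[-1] >= arias_frac))

    # STA over (t - sta_pre, t + post) and LTA over (t - lta_pre, t + post).
    idx = np.arange(n)
    hi = np.minimum(n, idx + int(round(post / dt)) + 1)
    lo_sta = np.maximum(0, idx - int(round(sta_pre / dt)))
    lo_lta = np.maximum(0, idx - int(round(lta_pre / dt)))
    with np.errstate(divide="ignore", invalid="ignore"):
        sta = (csum[hi] - csum[lo_sta]) / (hi - lo_sta)
        lta = (csum[hi] - csum[lo_lta]) / (hi - lo_lta)
        ratio = sta / lta
    triggered = np.flatnonzero(ratio > ratio_trigger)
    i_trigger = int(triggered[0]) if triggered.size else n - 1

    # The noise window ends at the earlier of the two markers.
    return min(i_arias, i_trigger) * dt
```

Because both averaging windows look 0.5 s ahead, the STA/LTA trigger in this sketch tends to fire about half a second before the onset itself, which keeps the noise window clear of the first arrival.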
For the Groningen dataset, we found that the G-station sensors were often located close to the electricity mains network. In this case, it was very likely that the record was contaminated with 50-Hz noise. For small-amplitude records, this may result in a significant peak in the FAS (Fig. 11) and affect the calculation of f_{u}, as well as the response spectra of the record (Fig. 12). Douglas and Boore (2011) recommend the removal of this peak at 50 Hz with a narrow notch filter and in our case, we found it absolutely necessary to remove the 50-Hz noise in order to obtain correct estimates of short-period PSA (Fig. 12).
3.2 Minimum usable frequency
Determining the lower usable frequency (f_{l}) by employing the same SNR > 3 criterion as used for f_{u} is a choice that is often employed. However, because the SNRs of small-amplitude records are smaller and the resulting bandwidth can be very limited, it is desirable in our case to use a method that results in more forgiving estimates of f_{l}. The first step is to obtain an initial estimate of f_{l}. This is defined as the first point (with decreasing frequency) that the linear trend of the recording’s FAS is observed to systematically decay more slowly than a theoretical Brune (1970) spectrum. The next step is to low-cut filter the record using f_{l} as the filter corner frequency and then compute the displacement trace through double integration of the acceleration time series. The filter used is an 8th order acausal Butterworth filter. Any low-frequency noise can then easily be observed in the time domain. If the total displacement is zero and long-period noise cannot be readily observed in the displacement trace, then the initial estimate is selected as the final f_{l} value. If the user judges the displacement trace to still be unacceptably contaminated with noise, a higher frequency is selected, and the process is iterated until an f_{l} is found that results in a noise-free displacement time series.
An example of the application of the iterative selection of f_{l} is shown in Figs. 13 and 14. Figure 13 shows the FAS of the North-South component of recording KANT from the M_{L} 3.2 Garrelsweer earthquake of 27 June 2011. The identification of different possible low-cut filter frequencies from the FAS of the record is illustrated. The displacement traces obtained after the application of the different filters are compared in Fig. 14. It is obvious that applying a filter of 0.342 Hz (the initial estimate based on spectral shape) is insufficient, as long-period waves are still clearly observable in the displacement trace (Fig. 14). After iterating through increased values of f_{l}, we observe that a frequency of 1.611 Hz is excessive as it results in a reduction in the amplitude of the record. Low-cut frequencies of 0.635 Hz and 0.732 Hz both produce acceptable results; hence, the lowest, 0.635 Hz, is selected.
3.3 Removal criteria
Figure 15 shows ratios of the PGV and PSA of noise-contaminated synthetic recordings to the noise-free versions using the Groningen GMM with κ_{0} = 0.03 s. The ratios are plotted as a function of the maximum usable frequency (f_{u}). It is immediately apparent that, when f_{u} is low, PGV and the short-period spectral ordinates have significantly increased amplitudes. We therefore recommend that records with f_{u} below 15 Hz should not be used at all and should be discarded from ground-motion databases. As shown in Fig. 14, low-cut filtering with an excessively high cut-off frequency can result in a reduction in amplitude and should be avoided. Therefore, when f_{l} is identified above 2 Hz, we also consider a record to be unusable. When either horizontal component fulfils at least one of these removal criteria, we discard the entire triaxial recording, as both horizontal components are required to compute the intensity measures commonly predicted by GMPEs and GMMs.
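The removal criteria amount to a simple screening rule, sketched below. The function name `keep_recording` and its argument layout are hypothetical; the thresholds of 15 Hz and 2 Hz are those given in the text.

```python
def keep_recording(horizontals, fu_min=15.0, fl_max=2.0):
    """Return True if a triaxial recording passes the removal criteria.

    horizontals: iterable of (f_l, f_u) pairs in Hz, one per horizontal component.
    The whole recording is rejected if either component has f_u below fu_min
    or f_l above fl_max.
    """
    for f_l, f_u in horizontals:
        if f_u < fu_min or f_l > fl_max:
            return False
    return True
```

For instance, a recording with (f_{l}, f_{u}) of (0.6, 30) Hz on one horizontal component and (0.7, 12) Hz on the other would be discarded because the second component fails the f_{u} criterion.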
Figure 16 illustrates which records of the database were removed entirely by applying the constraints on f_{l} and f_{u}. A total of 96 out of the 800 records (12%) from the Groningen database were removed. As expected, these recordings correspond to the relatively weaker motions within the database, which come from the lower end of the magnitude range and stations at longer epicentral distances (Fig. 16).
4 Workflow: usable period range of PSA
4.1 Maximum usable period
Once the usable bandwidth of the FAS is defined, the next step is to low-cut filter the records. We recommend the use of an 8th order acausal (zero phase) Butterworth filter, which has been found to be more suitable for use on digital records (Boore and Akkar 2003). For the correct use of this type of filter, it is necessary to zero-pad both ends of the record (Boore and Bommer 2005). The pad length is calculated using the function of Converse and Brady (1992), which is dependent on both the chosen filter corner frequency and the order of the filter. We apply the same filter to both horizontal components, using the lowest cut-off, f_{l}, of the two components, as they are typically used in conjunction when calculating intensity parameters for use in GMPE/GMM development.
The amplitudes of long-period spectral ordinates are highly sensitive to the application of low-cut filters. As the filter removes both signal and noise, an unknown combination of both is left behind by the filter at frequencies lower than and close to the cut-off frequency. Therefore, the response spectra are reliable for use only up to a certain period, lower than the long-period cut-off (T_{c}, the inverse of the cut-off frequency, f_{l}). Different studies have employed schemes to define this usable period limit. Some examples are described in Boore and Bommer (2005) and Akkar and Bommer (2006). The most widely employed technique to define the usable period limit, and the one adopted in this study, is to identify the ratio T_{max}/T_{c}.
According to Akkar and Bommer (2006), for digital records from soft soil sites such as those in Groningen, this ratio is between 0.7 and 0.97. The method we adopted to select from this range consists of comparing the PSA before and after filtering and only using the spectral ordinates where the change in amplitude is within a certain threshold. For the Groningen data, we selected this threshold to be 5%. Figure 17 shows ratios of PSA post- to pre-filtering, plotted as a function of the ratio of each period to the cutoff period. In this case, it can be observed that more than 95% of the response spectra have changed by less than 5% up to a period of 70% of the cutoff period. Hence, we selected the ratio of 0.7 and define the maximum usable period for each record as T_{max} = 0.7T_{c} = 0.7/f_{l}. It must be noted that, for databases with a small number of available records, it may be preferable to define a larger ratio to maximise the available data, using a more generous threshold.
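The selection of the ratio and the resulting T_{max} can be sketched as follows. This is an illustrative reading of the procedure, not the code used in the study: the function names and the input layout (one list of post/pre-filtering PSA ratios per record, sampled at common values of T/T_{c}) are assumptions.

```python
def select_ratio(period_ratios, psa_ratios, tolerance=0.05, min_fraction=0.95):
    """Largest T/T_c up to which more than `min_fraction` of the records
    change by less than `tolerance` after filtering.

    period_ratios: ascending values of T/T_c.
    psa_ratios: one list per record of post/pre-filtering PSA ratios,
                aligned with period_ratios (hypothetical layout).
    Returns None if even the first period ratio fails the test.
    """
    if not psa_ratios:
        raise ValueError("at least one record is required")
    selected = None
    for i, ratio in enumerate(period_ratios):
        within = sum(1 for rec in psa_ratios if abs(rec[i] - 1.0) < tolerance)
        if within / len(psa_ratios) > min_fraction:
            selected = ratio
        else:
            break  # stop at the first period ratio that fails
    return selected


def max_usable_period(f_l: float, ratio: float = 0.7) -> float:
    """T_max = ratio * T_c = ratio / f_l, with f_l the low-cut frequency in Hz."""
    return ratio / f_l
```

With the ratio of 0.7 adopted here, a record filtered at f_{l} = 0.5 Hz has T_{c} = 2 s and T_{max} = 1.4 s.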
4.2 Minimum usable period
Filtering high frequencies prior to computing PSA is not recommended as it may have a knock-on effect on a wide range of periods (see Fig. 2). However, as shown earlier, it is still necessary to define a minimum usable period in order to exclude noise-contaminated PSA from use. The first estimate of T_{min} is the result of the upper-bound T_{min} model presented earlier at n = 3 (Eq. 3), which we apply for a threshold of 5% using the white noise model (Table 1).
In addition to the parametric T_{min} model, we devise additional measures to constrain T_{min}. We create two hybrid-synthetic records using the FAS of each record under analysis. To create the first synthetic, we fit an idealised Brune (1970) spectrum to the FAS of the record (Fig. 18), and use the FAS of the record within its usable frequency range (f_{l} to f_{u}) and the Brune spectrum in the unusable frequencies. Thus, we create an idealised ‘noise-free’ version of the record when performing an inverse Fourier transform. To create the second synthetic, we use the full FAS of the record but double it for frequencies higher than f_{u}. In this way, we obtain a noisier version of the same record.
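The construction of the two hybrid amplitude spectra can be sketched as below. This is a minimal illustration under stated assumptions: the Brune (1970) acceleration spectrum is written in its usual ω-squared form, the optional κ term (Anderson and Hough 1984) is our addition for damped records, the fitted parameters (omega0, f_c, kappa) are taken as given, and all function names are hypothetical. Recombining the modified amplitudes with the original phase and applying the inverse Fourier transform is not shown.

```python
import math


def brune_acceleration_fas(f, omega0, f_c, kappa=0.0):
    """Omega-squared acceleration spectrum; kappa = 0 gives the plain Brune shape."""
    return (omega0 * (2.0 * math.pi * f) ** 2 / (1.0 + (f / f_c) ** 2)
            * math.exp(-math.pi * kappa * f))


def noise_free_fas(freqs, fas, f_l, f_u, omega0, f_c, kappa=0.0):
    """Recorded FAS inside [f_l, f_u]; fitted Brune spectrum outside it."""
    return [a if f_l <= f <= f_u else brune_acceleration_fas(f, omega0, f_c, kappa)
            for f, a in zip(freqs, fas)]


def noisier_fas(freqs, fas, f_u, factor=2.0):
    """Recorded FAS, multiplied by `factor` (2 in the text) above f_u."""
    return [a * factor if f > f_u else a for f, a in zip(freqs, fas)]
```

Comparing PSA computed from each modified spectrum with the PSA of the original record then indicates which periods are controlled by the unusable part of the FAS.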
By comparing the response spectrum of the original record to the idealised ‘noise-free’ version, we obtain an estimate of the periods that are affected by noise. At the same time, by comparing the original response spectrum with the ‘noisier’ version, we can observe which periods are sensitive to additional noise. From these comparisons, we can define two additional estimates of T_{min}, based on the divergence (with 5% tolerance) of the hybrid-synthetic and the original response spectra. Finally, we select T_{min} using the following logic (Fig. 19):
• If the parametric T_{min} model is 0.01 s (the shortest period defined), we retain that value.
• If two of the three T_{min} estimates are within 10% of one another, we retain the average value of those T_{min}.
• Otherwise, we select the result of the parametric T_{min} model, but restrict T_{min} between the values calculated using the two hybrid-synthetics.
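The selection logic above can be written as a short function. This is a sketch, not the implementation used in the study: the text does not state how "within 10%" is measured or which pair is used when more than one pair agrees, so the sketch assumes a difference relative to the larger value of the pair and picks the closest pair. The function and argument names are hypothetical.

```python
import math
from itertools import combinations


def select_t_min(t_param, t_noise_free, t_noisier, shortest=0.01, rel_tol=0.10):
    """Combine the parametric and the two hybrid-synthetic T_min estimates (s)."""
    # Rule 1: the parametric model already returns the shortest defined period.
    if math.isclose(t_param, shortest, rel_tol=1e-9):
        return shortest

    # Rule 2: two estimates agree within rel_tol (ASSUMED relative to the
    # larger of the pair); use the closest such pair and average it.
    estimates = [t_param, t_noise_free, t_noisier]
    close = [(abs(a - b) / max(a, b), a, b)
             for a, b in combinations(estimates, 2)
             if abs(a - b) <= rel_tol * max(a, b)]
    if close:
        _, a, b = min(close)
        return 0.5 * (a + b)

    # Rule 3: parametric value, restricted to the range of the two hybrids.
    lo, hi = sorted((t_noise_free, t_noisier))
    return min(max(t_param, lo), hi)
```

For instance, with a parametric estimate of 0.3 s and hybrid estimates of 0.05 s and 0.1 s, no pair agrees within 10%, so the parametric value is restricted to the hybrid range and T_{min} = 0.1 s.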
The number of usable PSA, as defined by T_{min} and T_{max}, is shown in Fig. 20 over 13 approximately linearly spaced periods from 0.01 to 1.5 s. The largest quantity of usable spectral accelerations corresponds to the intermediate periods (0.1–0.7 s); a smaller number (498) is available at 0.01 s, and a rapid decay can be observed with increasing period from 0.85 s onward. At 1.5 s, the number of usable spectral accelerations is 184, which can still be considered sufficient for the limited distance (R_{epi} < 35 km) and magnitude range covered by the database. In total, 206 records (29.2% of the 704 usable records) are unusable at 0.01 s due to noise.
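Counts of this kind follow directly from the record-specific limits: a spectral ordinate at period T is usable when T_{min} ≤ T ≤ T_{max}. A minimal sketch, with a hypothetical function name and an assumed input layout of one (T_{min}, T_{max}) pair per record:

```python
def usable_counts(periods, bands):
    """Number of records usable at each period.

    periods: oscillator periods in seconds.
    bands: one (t_min, t_max) pair per record, in seconds (hypothetical layout).
    """
    return [sum(1 for t_min, t_max in bands if t_min <= period <= t_max)
            for period in periods]
```

For three illustrative records with bands (0.01, 1.4), (0.05, 2.0) and (0.01, 0.7) s, the counts at 0.01, 0.1 and 1.5 s are 2, 3 and 1, respectively.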
5 Conclusions
Short-period noise in acceleration time series has the potential to influence response spectral accelerations at short oscillator periods. This has previously been investigated by Douglas and Boore (2011) in the context of data found in typical strong-motion datasets. Analysis of ‘strong-motion’ data, however, generally avoids the influence of high-frequency noise. This is due both to the relative amplitude of signal and noise, and to the fact that the dominant frequency of motion of strong-motion data is much lower than any high-frequency noise. Our simulations show that PSA from noisy weak-motion records, as present in many ground motion databases such as those for induced seismicity, is susceptible to high-frequency noise. This is particularly so for weakly damped records, such as those on ‘hard-rock’ sites. The impact of high-frequency noise on PSA should be considered by assigning record-specific T_{min} and without any form of low-pass frequency filtering. A parametric T_{min} model, based on easily measurable properties of waveform FAS (peak/noise amplitudes, frequencies), is proposed herein and can be used as a guide to assign T_{min}. We additionally propose an easily implementable approach to assess the impact of noise using hybrid-synthetic records, which modify the ‘unusable’ noisy portion of the records’ FAS, before reconstructing time series and subsequently PSA for comparison with the original spectrum. An example of the full workflow used to define usable FAS frequencies and PSA periods was presented for the Groningen induced seismicity database. We showed that only 12% of the records (96 out of the 800 available) had to be removed in their entirety due to excessive noise. Further to the removal of records in the long-period range (based on T_{max}), which is already common practice for GMPE/GMM databases, we showed that 29% of the usable records of the database are unusable at 0.01 s due to the influence of high-frequency noise.
References
Akkar S, Bommer JJ (2006) Influence of long-period filter cut-off on elastic spectral displacements. Earthq Eng Struct Dyn 35(9):1145–1165
Anderson JG, Hough SE (1984) A model for the shape of the Fourier amplitude spectrum of acceleration at high frequencies. Bull Seismol Soc Am 74(5):1969–1993
Atkinson GM, Boore DM (2006) Earthquake ground-motion prediction equations for eastern North America. Bull Seismol Soc Am 96(6):2181–2205
Bommer JJ, Dost B, Edwards B, Stafford PJ, van Elk J, Doornhof D, Ntinalexis M (2016) Developing an application-specific ground-motion model for induced seismicity. Bull Seismol Soc Am 106(1):158–173
Bommer JJ, Dost B, Edwards B, Kruiver PP, Ntinalexis M, Rodriguez-Marek A, Stafford PJ, van Elk J (2017) Developing a model for the prediction of ground motions due to earthquakes in the Groningen gas field. Neth J Geosci 96(5):203–213
Boore DM (2009) Comparing stochastic point-source and finite-source ground-motion simulations: SMSIM and EXSIM. Bull Seismol Soc Am 99(6):3202–3216
Boore DM, Akkar S (2003) Effect of causal and acausal filters on elastic and inelastic response spectra. Earthq Eng Struct Dyn 32(11):1729–1748
Boore DM, Bommer JJ (2005) Processing strong-motion accelerograms: needs, options and consequences. Soil Dyn Earthq Eng 25(2):93–115
Bora SS, Scherbaum F, Kuehn N, Stafford P (2016) On the relationship between Fourier and response spectra: implications for the adjustment of empirical ground-motion prediction equations (GMPEs). Bull Seismol Soc Am 106(3):1235–1253
Brune JN (1970) Tectonic stress and the spectra of seismic shear waves from earthquakes. J Geophys Res 75(26):4997–5009
Cauzzi C, Clinton J (2013) A high- and low-noise model for high-quality strong-motion accelerometer stations. Earthq Spectra 29(1):85–102
Cauzzi C, Behr Y, Le Guenan T, Douglas J, Auclair S, Woessner J, Clinton J, Wiemer S (2016) Earthquake early warning and operational earthquake forecasting as real-time hazard information to mitigate seismic risk at nuclear facilities. Bull Earthq Eng 14(9):2495–2512
Converse A, Brady AG (1992) BAP: basic strong-motion accelerogram processing software, version 1.0. US Department of the Interior, US Geological Survey
Dost B, Ruigrok E, Spetzler J (2017) Development of seismicity and probabilistic hazard assessment for the Groningen gas field. Neth J Geosci 96(5):235–245
Douglas J, Boore DM (2011) High-frequency filtering of strong-motion records. Bull Earthq Eng 9(2):395–409
Douglas J, Edwards B (2016) Recent and future developments in earthquake ground motion estimation. Earth Sci Rev 160:203–219
Edwards B, Zurek B, Van Dedem E, Stafford PJ, Oates S, Van Elk J, DeMartin B, Bommer JJ (2019) Simulations for the development of a ground motion model for induced seismicity in the Groningen gas field, The Netherlands. Bull Earthq Eng 17(8):4441–4456
Edwards B, Crowley H, Pinho R, Bommer JJ (2021) Seismic hazard and risk due to induced earthquakes at a shale gas site. Bull Seismol Soc Am. https://doi.org/10.1785/0120200234
KNMI (1993) Netherlands Seismic and Acoustic Network. Royal Netherlands Meteorological Institute (KNMI). https://doi.org/10.21944/e970fd3423b93411b366e4f72877d2c5
Motazedian D, Atkinson GM (2005) Stochastic finite-fault modeling based on a dynamic corner frequency. Bull Seismol Soc Am 95(3):995–1010
Novakovic M, Atkinson GM, Assatourians K (2018) Empirically calibrated ground-motion prediction equation for Oklahoma. Bull Seismol Soc Am 108(5A):2444–2461
Ntinalexis M, Bommer JJ, Ruigrok E, Edwards B, Pinho R, Dost B, Correia AA, Uilenreef J, Stafford PJ, van Elk J (2019) Ground-motion networks in the Groningen field: usability and consistency of surface recordings. J Seismol 23(6):1233–1253
Peterson J (1993) Observations and modeling of background seismic noise. Open-File Report 93-322, US Geological Survey, Albuquerque, NM, pp 42
van Elk J, Doornhof D, Bommer JJ, Bourne SJ, Oates SJ, Pinho R, Crowley H (2017) Hazard and risk assessments for induced seismicity in Groningen. Neth J Geosci 96(5):259–269
Webb SC (1992) The equilibrium oceanic microseism spectrum. J Acoust Soc Am 92(4):2141–2158
Acknowledgements
We thank Nederlandse Aardolie Maatschappij (NAM) for funding this study. The authors are also grateful for the funding from EU project New challenges for Urban Engineering Seismology (URBASIS) (Grant Number 813137), which has supported this work. We also thank KNMI for providing access to the data used during this study (KNMI 1993).
Availability of data and material
Recorded waveform data used in this study are available from the Royal Netherlands Meteorological Institute (KNMI) on request, and online at http://rdsa.knmi.nl/dataportal/ (last accessed January 2021).
Code availability
The software EXSIM used for this study is available on request from the author, and online at http://www.daveboore.com/software_online.html (last accessed January 2021).
Funding
This work has been funded by the Nederlandse Aardolie Maatschappij (NAM).
Ethics declarations
Competing interests
The authors declare no competing interests.
Highlights
• We show that PSA from weak-motion records is susceptible to high-frequency noise, which can lead to overestimation of spectral ordinates.
• A model defining the minimum usable period of 5% damped PSA, T_{min}, is presented, based on easily measurable characteristics of a waveform.
• A full workflow, defining the upper and lower usable frequencies (of FAS) and periods (of 5% damped PSA), is presented, with application to an induced seismicity dataset.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Edwards, B., Ntinalexis, M. Defining the usable bandwidth of weakmotion records: application to induced seismicity in the Groningen Gas Field, the Netherlands. J Seismol 25, 1043–1059 (2021). https://doi.org/10.1007/s10950021100107
DOI: https://doi.org/10.1007/s10950021100107