1 Introduction

The increasing number of seismic stations has enabled the study of near-fault earthquake ground motions. Occasionally, stations record earthquake signals with unexpected patterns. These ground motions, henceforth called pulse-shape signals, can be seen at the beginning of the earthquake signal in velocity records.

Pulse-shape signals are studied in both classical and engineering seismology. In engineering seismology, it is vital to identify pulse-shape signals, since they can impose high demands on structures around the period of the pulse (Bertero et al. 1978; Anderson and Bertero 1987; Hall et al. 1995; Iwan 1997; Alavi and Krawinkler 2001; Menun and Fu 2002; Makris and Black 2004; Mavroeidis et al. 2004; Akkar et al. 2005; Luco and Cornell 2007; Kalkan and Kunnath 2006). However, some probabilistic seismic hazard analysis (PSHA) models (Lanzano et al. 2016) and building design codes do not consider their effects. Although pulse-shape signals are rare, it is vital to investigate them because of their hazardous effects.

Pulse-shape signals can appear in certain earthquake scenarios: forward directivity, which occurs when receivers are located in the forward direction of the fault rupture (Somerville et al. 1997; Somerville 2003, 2005; Spudich and Chiou 2008); the fling step effect, which is a permanent displacement of the ground resulting from fault rupture (Mavroeidis and Papageorgiou 2002); and cases in which the rupture velocity and the shear-wave velocity of the bedrock at the site of interest are similar.

The rare occurrence of pulse-shape signals depends mostly on the above conditions. Because of their scarcity, velocity pulses are not taken into account in most ground motion prediction equations (GMPEs) (Abrahamson et al. 2016; Boore et al. 2014). However, the median and standard deviation of GMPEs can be calibrated with certain factors to include the effect of a pulse-shape signal (Akkar and Cheng 2016).

Indicators of pulse-shape signals in the waveforms are:

  1. Signals with long periods and large amplitudes (Somerville et al. 1997),

  2. A high PGV/PGA ratio (Bray and Rodriguez-Marek 2004),

  3. Earthquake energy concentrated on one (or a few) pulse(s) (Somerville et al. 1997),

  4. Unexpectedly high response values at the pulse period on response spectra (Yang and Wang 2012).

Various methods have been developed to identify pulse-shape signals. Mavroeidis and Papageorgiou (2003) proposed a wavelet analysis to construct a mathematical representation of the pulse, which depends on amplitude, period, duration, and phase shift. Shahi and Baker (2014) used a 4th-order Daubechies wavelet to determine pulse-shape signals. The method has some constraints, such as a minimum PGV amplitude, a pulse arrival located at the beginning of the signal, and arbitrary thresholds for the energy function. Mena and Mai (2011) used windowed Fourier transform analysis to find the pulse-shape signal and its position with certain energy thresholds. Chang et al. (2016) used the energy function with certain thresholds to determine the position and period of the pulse-shape signal. Ghaffarzadeh (2016) used the S-transform to identify the pulses. Kardoutsou et al. (2017) used a cross-correlation between the potential pulse-shape signal and the wavelet functions to determine the pulse shape. The methods of Shahi and Baker (2014) and Chang et al. (2016) are explained in detail in Section 3.

The goal of this study is to create an alternative pulse identification algorithm. The main considerations are determining the time location of the pulse and mimicking the impulsive part of the signal with known wavelets. Ricker and Morlet wavelets are used both for spectral analysis, to determine the pulse period and the region of maximum energy, and for mimicking the impulsive part of the signal. Waveforms that are identified as pulse shaped are compared with wavelets created with the calculated pulse period by checking their spectral responses. If the wavelets reproduce the features of the long-period part of the earthquake signal, the algorithm is considered successful.

2 Data

The analyzed ground motions are selected from the NGA-West2 (Ancheta et al. 2012), GeoNet, Itaca (Pacor et al. 2011; Luzi et al. 2016), and K-Net databases, which contain data from crustal earthquakes. We selected signals recorded from Mw ≥ 5.5 earthquakes within a maximum distance of 150 km from the epicenter. In order to study pulse-shape signals, the East and North components are rotated to radial and transverse components. The acceleration waveforms are bandpass filtered between 0.05 and 10 Hz and integrated to obtain velocity waveforms. In total, our database contains 2785 waveforms.
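
The rotation and integration steps can be sketched as follows. This is a minimal NumPy illustration, not the processing code used in the study: the function names, the azimuth convention (source-to-station azimuth measured clockwise from north, radial pointing away from the source), and the trapezoidal integration are assumptions, and the 0.05 to 10 Hz bandpass filter is omitted for brevity.

```python
import numpy as np

def rotate_ne_to_rt(north, east, azimuth_deg):
    """Rotate north/east traces to radial/transverse components.

    azimuth_deg: source-to-station azimuth, clockwise from north
    (assumed convention; radial is positive away from the source).
    """
    north = np.asarray(north, dtype=float)
    east = np.asarray(east, dtype=float)
    az = np.deg2rad(azimuth_deg)
    radial = north * np.cos(az) + east * np.sin(az)
    transverse = -north * np.sin(az) + east * np.cos(az)
    return radial, transverse

def integrate_to_velocity(acc, dt):
    """Integrate acceleration to velocity (trapezoidal rule, zero initial velocity)."""
    acc = np.asarray(acc, dtype=float)
    vel = np.zeros_like(acc)
    vel[1:] = np.cumsum(0.5 * (acc[1:] + acc[:-1])) * dt
    return vel
```

For a station due east of the source (azimuth of 90 degrees), motion recorded purely on the East component maps entirely onto the radial component, which is a quick sanity check of the sign convention.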

3 Previous methods

Two of the aforementioned algorithms, Shahi and Baker (2014) and Chang et al. (2016), are compared with our new method. These algorithms were chosen because both are well known and widely used for this purpose.

3.1 Shahi and Baker (2014)

The classification algorithm of Shahi and Baker (2014) uses wavelet-based signal processing to detect pulse-shape signals around the largest velocity value, the peak ground velocity (PGV). The algorithm can differentiate early- and late-arriving pulses by analyzing the arrival time of the PGV. Early arrivals of the PGV generally indicate directivity effects.

The classification algorithm uses two criteria to determine whether the signal is impulsive or non-impulsive. The first criterion is the hazardousness of the signal: if the PGV is less than 30 cm/s, the signal is considered non-hazardous. The second criterion is that the pulse indicator (PI) should be greater than 0. The calculation of PI has two stages. In the first stage, the principal component (PC) is computed as in Eq. 1:

$$ PC = 0.63 \times (\text{PGV ratio}) + 0.777 \times (\text{energy ratio}) $$
(1)

The PGV ratio is the ratio of the PGV of the residual signal to the PGV of the original signal, where the residual signal is obtained by subtracting the pulse extracted with the 4th-order Daubechies wavelet from the original signal. The energy ratio is calculated by dividing the power of the residual signal by the power of the original signal. In the second stage, PI is calculated as in Eq. 2:

$$ \begin{array}{@{}rcl@{}} PI &=& 9.384 (0.76 - PC - 0.00616 PGV)\\ &&\times(PC \!+ 6.914 \times\! 10^{-4} PGV - 1.072) - 6.179\\ \end{array} $$
(2)
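
As a worked illustration of Eqs. 1 and 2, the two stages can be written as a short function. This is a sketch only: the function names are hypothetical, the coefficients are copied exactly as printed in Eqs. 1 and 2 above, and the PGV is assumed to be in cm/s.

```python
def pulse_indicator(pgv_ratio, energy_ratio, pgv):
    """Return (PC, PI) following Eqs. 1 and 2 as printed above; pgv in cm/s."""
    pc = 0.63 * pgv_ratio + 0.777 * energy_ratio
    pi = 9.384 * (0.76 - pc - 0.00616 * pgv) * (pc + 6.914e-4 * pgv - 1.072) - 6.179
    return pc, pi

def is_pulse_like(pgv_ratio, energy_ratio, pgv, pgv_min=30.0):
    """Apply both criteria: PGV of at least pgv_min (cm/s) and PI greater than 0."""
    _, pi = pulse_indicator(pgv_ratio, energy_ratio, pgv)
    return pgv >= pgv_min and pi > 0.0
```

For example, with hypothetical inputs of a PGV ratio of 0.5, an energy ratio of 0.5, and a PGV of 50 cm/s, PC is 0.7035 and PI is negative, so the record would not be classified as pulse shaped even though it passes the PGV threshold.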

If the signal is classified as pulse shaped, a 4th-order Daubechies wavelet is fitted to the entire waveform. As a result, it is hard to determine where the impulsive part of the signal starts and ends with the classification algorithm of Shahi and Baker (2014).

3.2 Chang et al. (2016)

Chang et al. (2016) use an energy-based classification algorithm. The algorithm determines a region around the PGV and computes the ratio between the energy of this pulse region and the total energy of the signal, where the energy is obtained from the squared velocity values. The region around the PGV is found by least-squares fitting for various pulse periods; the period with the smallest residual defines the pulse region. The energy ratio is then calculated as in Eq. 3:

$$ E(t) = \frac{{\int}_{t_{\mathrm{s}}}^{t_{\mathrm{e}}} v^{2}(\tau) d\tau}{{\int}_{0}^{\infty} v^{2}(\tau) d\tau} $$
(3)

ts and te represent the starting and ending points of the impulsive part on the time axis, and v represents the velocity time history of the signal. If the ratio of the pulse region energy (the numerator of Eq. 3) to the total energy (the denominator of Eq. 3) exceeds 0.34, the signal is considered a pulse-shaped signal.
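
A discrete version of Eq. 3 can be sketched as below. The function name and the rectangle-rule summation are assumptions made for illustration; the least-squares search for the pulse region is not reproduced, so the window limits are taken as given inputs.

```python
import numpy as np

def energy_ratio(v, dt, t_start, t_end):
    """Fraction of the total squared-velocity energy inside [t_start, t_end].

    With uniform sampling, dt cancels in the ratio, so plain sums are used.
    """
    v = np.asarray(v, dtype=float)
    t = np.arange(v.size) * dt
    in_window = (t >= t_start) & (t <= t_end)
    total = np.sum(v**2)
    if total == 0.0:
        return 0.0
    return float(np.sum(v[in_window] ** 2) / total)
```

A record would then be labeled pulse shaped under this criterion when `energy_ratio(...)` exceeds 0.34.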

If the signal is classified as pulse shaped, a waveform is fitted to the impulsive part of the signal. In contrast to Shahi and Baker (2014), the algorithm of Chang et al. (2016) allows the impulsive part of the seismic signal to be identified.

4 The new method

Previous attempts to identify pulse-shape signals concentrated on determining whether the signal is impulsive or non-impulsive. Another common goal is to determine the period of the pulse, since it can have significant effects on structures. Therefore, we focused on the pulse period when mimicking the pulse signal. One of the main assumptions about pulse-shape signals is that the impulsive part is located where the PGV occurs.

In this method, we set the minimum PGV threshold to 30 cm/s when the pulse occurs at the PGV. Following Shahi and Baker (2014), we used wavelet analysis to identify pulse-shape signals. Unlike previous studies, we also considered the possibility that the pulse occurs in time intervals of the earthquake signal other than that of the PGV.

Mavroeidis and Papageorgiou (2003) and Chang et al. (2016) focused on the impulsive part of the signal for analysis, whereas Shahi and Baker (2014) fitted almost all of the signal. We focused only on the impulsive part of the signal, similar to Mavroeidis and Papageorgiou (2003) and Chang et al. (2016). The main goal of this study is to create a robust alternative for identifying pulse-shaped signals. We implemented a PGV threshold and wavelet analysis for signal processing to remain comparable with previous studies, while adding new features that these studies did not consider.

4.1 Wavelet analysis

The wavelet analysis package of Torrence and Compo (1998) is used for signal analysis. Two wavelet types, the Ricker wavelet (Appendix I) and the Morlet wavelet (Appendix II), are used in the wavelet analysis. The wavelet power spectra of the signals are calculated with these wavelets. The Morlet wavelet is complex, while the Ricker wavelet is real-valued. A complex wavelet function returns both amplitude and phase information, whereas a real wavelet function returns only a single real component, which makes it possible to isolate discontinuities. Since both wavelets give the same qualitative results on the power spectra, they can be treated as equivalent for this purpose. However, the Ricker wavelet can distinguish discontinuities, since it is a real-valued function, whereas the Morlet wavelet gives smoother results, which is important when there is high-frequency content in the pulse region (Fig. 1).

Fig. 1 Velocity waveform (upper), Ricker wavelet power spectrum (center) and Morlet wavelet spectrum (lower) of 1992 Landers earthquake (Mw = 7.3), Yermo Fire Station (Epicentral distance (rep) = 85.99 km). Red and blue colors represent high and low concentration of power, respectively

The resolution of a wavelet function depends on its width in real space and its width in Fourier space. A broad function gives poor time resolution but good frequency resolution, and vice versa. The width of the wavelet is proportional to the sampling rate of the signal.

The analysis yields the power spectrum values over time. The maximum power spectrum value at the PGV and the largest power spectrum value of the whole signal (if it does not occur at the PGV) are used in the pulse identification.
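
To make the idea of a wavelet power spectrum over time concrete, the sketch below computes a Ricker wavelet power spectrum by direct convolution with NumPy. It is not the Torrence and Compo (1998) package, which works in the Fourier domain; the function names, the truncation of the wavelet at five scale lengths, and the unit-energy normalization are assumptions. The conversion between wavelet scale and Fourier period uses the relation for the second-order derivative-of-Gaussian (Ricker) wavelet tabulated by Torrence and Compo (1998).

```python
import numpy as np

def ricker(t, scale):
    """Unit-energy Ricker (Mexican hat) wavelet sampled at times t."""
    x = t / scale
    norm = 2.0 / (np.sqrt(3.0 * scale) * np.pi**0.25)
    return norm * (1.0 - x**2) * np.exp(-0.5 * x**2)

def ricker_power_spectrum(v, dt, periods):
    """Return the wavelet power, shape (len(periods), len(v))."""
    v = np.asarray(v, dtype=float)
    # Fourier period = 2*pi*scale / sqrt(m + 1/2) with m = 2 for the Ricker wavelet.
    scales = np.asarray(periods, dtype=float) * np.sqrt(2.5) / (2.0 * np.pi)
    power = np.empty((scales.size, v.size))
    max_half = (v.size - 1) // 2  # keep the kernel no longer than the signal
    for i, s in enumerate(scales):
        half = min(int(np.ceil(5.0 * s / dt)), max_half)
        t = np.arange(-half, half + 1) * dt
        coeff = np.convolve(v, ricker(t, s), mode="same") * dt
        power[i] = coeff**2
    return power

# Usage: locate the period and time of maximum power for a synthetic pulse.
dt = 0.02
time = np.arange(0.0, 40.0, dt)
signal = np.exp(-((time - 20.0) / 6.0) ** 2) * np.cos(2.0 * np.pi * (time - 20.0) / 2.0)
periods = np.arange(0.5, 5.01, 0.1)
wps = ricker_power_spectrum(signal, dt, periods)
i_period, i_time = np.unravel_index(np.argmax(wps), wps.shape)
best_period, best_time = periods[i_period], time[i_time]
```

Because the Ricker wavelet is real-valued, the power oscillates in time within the pulse, so the maximum is searched over the whole time-period plane rather than at a single sample.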

4.2 Pulse identification

Unlike previous studies, our method can identify velocity pulses that occur away from the time interval of the PGV. Several decision criteria are used to identify pulse-shape signals. The criteria differ with the position of the pulse, as explained in Sections 4.2.1 and 4.2.2.

4.2.1 Velocity pulse at PGV

Most of the seismic energy is assumed to be concentrated where the PGV occurs, so a logical way to analyze the signal is to focus on the region around the PGV. Our method is similar to that of Chang et al. (2016), since it also considers the energy ratio of the waveform. In addition, it considers the spectral energy, similar to the method of Mena and Mai (2011). A signal is classified as impulsive when the average of these two ratios, computed for the region around the PGV, is equal to or greater than 30% of the corresponding totals for the whole waveform. The criteria are listed below:

  1. PGV ≥ 30 cm/s.

  2. $$ \frac{\left( \frac{{\int}_{t_{\mathrm{s}}}^{t_{\mathrm{e}}} v^{2}(\tau) d\tau}{{\int}_{0}^{\infty} v^{2}(\tau) d\tau}+\frac{{\int}_{t_{\mathrm{s}}}^{t_{\mathrm{e}}} WPS(\tau) d\tau}{{\int}_{0}^{\infty} WPS(\tau) d\tau}\right)}{2} \geq 0.30 $$
    (4)

In Eq. 4, ts and te represent the starting and ending points of the pulse, respectively. These points are found by identifying the period (Tp) at which the maximum wavelet power spectrum occurs at the PGV. The pulse region is then defined as tPGV ± Tp/2, where tPGV is the time of the PGV. WPS denotes the wavelet power spectrum. The parameters are shown in Fig. 2.

Fig. 2 1992 Landers earthquake, Yermo Fire Station velocity waveform. Red line and blue lines represent width (Tp) and borders (ts and te) of the pulse region, respectively. Background image is Ricker wavelet power spectrum of the signal with the same color content of Fig. 1

The left-hand term of the numerator in Eq. 4 is the ratio between the energy of the impulsive part (the velocity time history between ts and te) and the energy of the whole waveform. The right-hand term is the same ratio computed for the wavelet power spectrum. The integrals denote summation over the samples, and the infinite upper limit indicates the whole waveform.
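
The two criteria of this subsection can be sketched as follows. The function names are hypothetical, and the sketch assumes that the wavelet power spectrum has already been reduced to one value per time sample (`wps_t`); how the period axis is collapsed is not specified in the text, so that step is left to the caller. The pulse period `tp` is also taken as an input.

```python
import numpy as np

def window_fraction(x, t, t_start, t_end):
    """Fraction of sum(x) that falls inside [t_start, t_end]."""
    total = np.sum(x)
    if total <= 0.0:
        return 0.0
    inside = (t >= t_start) & (t <= t_end)
    return float(np.sum(x[inside]) / total)

def pulse_at_pgv(v, dt, wps_t, tp, pgv_min=30.0, threshold=0.30):
    """Check PGV >= pgv_min (cm/s) and Eq. 4; return (is_pulse, score)."""
    v = np.asarray(v, dtype=float)
    wps_t = np.asarray(wps_t, dtype=float)
    t = np.arange(v.size) * dt
    i_pgv = int(np.argmax(np.abs(v)))
    pgv = float(np.abs(v[i_pgv]))
    # Pulse region: t_PGV - Tp/2 to t_PGV + Tp/2.
    t_s, t_e = t[i_pgv] - tp / 2.0, t[i_pgv] + tp / 2.0
    score = 0.5 * (window_fraction(v**2, t, t_s, t_e)
                   + window_fraction(wps_t, t, t_s, t_e))
    return bool(pgv >= pgv_min and score >= threshold), score
```

Averaging the two fractions means that a record with a large waveform energy ratio can still be rejected when its wavelet power is spread over the whole record.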

4.2.2 Velocity pulse outside the PGV region

Unlike previous studies, we also checked for the largest energy arrival rather than only the position of the PGV. The logic behind the energy calculation is the same as in Section 4.2.1. The minimum amplitude is fixed at 25 cm/s. However, this amplitude is not the PGV but the largest amplitude in the region where the maximum energy is concentrated. Furthermore, the energy of this region should be at least 10% larger than the energy of the PGV region (Eqs. 5 and 6). The average of the waveform energy ratio and the wavelet power spectrum energy ratio of this region should be equal to or greater than 30% (Eq. 7). The criteria are listed below:

  1. The largest absolute amplitude in the region where the maximum power spectrum value occurs should be equal to or greater than 25 cm/s.

  2. The time difference between the PGV and the maximum of the power spectrum should be larger than Tp/4.

  3. $$ \frac{{\int}_{{{}{t}_{\mathrm{s}}}_{\text{emax}}}^{{{}{t}_{\mathrm{e}}}_{\text{emax}}} v^{2}(\tau) d\tau}{{\int}_{t_{\mathrm{s}}}^{t_{\mathrm{e}}} v^{2}(\tau) d\tau} \geq 1.1 $$
    (5)

  4. $$ \frac{{\int}_{{{}{t}_{\mathrm{s}}}_{\text{emax}}}^{{{}{t}_{\mathrm{e}}}_{\text{emax}}} WPS(\tau) d\tau}{{\int}_{t_{\mathrm{s}}}^{t_{\mathrm{e}}} WPS(\tau) d\tau} \geq 1.1 $$
    (6)

  5. $$ \frac{\left( \frac{{\int}_{{{}{t}_{\mathrm{s}}}_{\text{emax}}}^{{{}{t}_{\mathrm{e}}}_{\text{emax}}} v^{2}(\tau) d\tau}{{\int}_{0}^{\infty} v^{2}(\tau) d\tau}+\frac{{\int}_{{{}{t}_{\mathrm{s}}}_{\text{emax}}}^{{{}{t}_{\mathrm{e}}}_{\text{emax}}} WPS(\tau) d\tau}{{\int}_{0}^{\infty} WPS(\tau) d\tau}\right)}{2} \geq 0.30 $$
    (7)

In Eqs. 5 to 7, tsemax and teemax represent the starting and ending points of the pulse in the maximum energy region on the time axis. These points are found by identifying the maximum pulse period (Tp,emax) of the signal in the region where the maximum power spectrum values are located. The pulse region is then defined as teemax ± Tp,emax, where teemax represents the time of the largest value in the Tp,emax region. The parameters are shown in Fig. 3.

Fig. 3 1999 Chi-Chi Taiwan Earthquake (Mw = 7.6), TCU051 Station (rep = 38.53 km) velocity waveform. Red line and blue lines represent the width (Tp) and borders (ts and te) of the pulse region around PGV, respectively. The green line and cyan lines represent the width (Tp,emax) and borders (tsemax and teemax) of the area where the maximum energy is concentrated, respectively. Background image is Ricker wavelet power spectrum of the signal with the same color content of Fig. 1

Equations 5 and 6 set the thresholds for the energy ratios between the region around the maximum energy, if it exists, and the region around the PGV, for the waveform and the wavelet power spectrum, respectively. The other parameters have the same meanings as in Section 4.2.1.
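
The five criteria for a pulse outside the PGV region can be combined into a single check, sketched below. The function name, the argument layout, and the use of a per-sample wavelet power series (`wps_t`, assumed to be already reduced over periods) are illustrative assumptions. The two regions are passed in as (start, end) time pairs rather than derived inside the function.

```python
import numpy as np

def pulse_outside_pgv(v, dt, wps_t, pgv_window, emax_window, tp,
                      amp_min=25.0, ratio_min=1.1, threshold=0.30):
    """Return True if all five criteria of this subsection are met.

    pgv_window, emax_window: (start, end) times of the PGV region and of the
    maximum energy region; tp: pulse period of the PGV region (s).
    """
    v = np.asarray(v, dtype=float)
    wps_t = np.asarray(wps_t, dtype=float)
    t = np.arange(v.size) * dt
    energy = v**2
    in_pgv = (t >= pgv_window[0]) & (t <= pgv_window[1])
    in_emax = (t >= emax_window[0]) & (t <= emax_window[1])
    if not in_emax.any() or energy.sum() == 0.0 or wps_t.sum() == 0.0:
        return False
    t_pgv = t[np.argmax(np.abs(v))]
    t_emax = t[np.argmax(np.where(in_emax, wps_t, -np.inf))]
    e_pgv, e_emax = energy[in_pgv].sum(), energy[in_emax].sum()
    w_pgv, w_emax = wps_t[in_pgv].sum(), wps_t[in_emax].sum()
    checks = (
        np.abs(v[in_emax]).max() >= amp_min,   # criterion 1: amplitude
        abs(t_emax - t_pgv) > tp / 4.0,        # criterion 2: time separation
        e_emax >= ratio_min * e_pgv,           # Eq. 5
        w_emax >= ratio_min * w_pgv,           # Eq. 6
        0.5 * (e_emax / energy.sum() + w_emax / wps_t.sum()) >= threshold,  # Eq. 7
    )
    return all(bool(c) for c in checks)
```

Writing Eqs. 5 and 6 as products rather than quotients avoids a division by zero when the PGV region carries no energy in one of the two measures.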

The energy ratio threshold between the PGV region and the maximum energy region was determined by trial and error. A time gap of Tp between the PGV region and the maximum energy region is imposed because the Ricker wavelet power spectrum can identify discontinuities, which may cause a single pulse to be erroneously interpreted as two or more separate pulses, depending on the period. The amplitude threshold of the maximum power spectrum region was selected with the same reasoning as the 30 cm/s PGV threshold, namely the potential to cause damage to structures.

Both Ricker and Morlet wavelets are fitted to the pulse region when the algorithm detects a pulse shape signal.

5 Results

There are four main features of pulse-shape signals, as listed in Section 1. In this study, we mostly focused on the position, amplitude, and period of the pulse. One way to assess the validity of the method is to compare the spectral responses of the original and created signals. An unusual spectral response is also an indicator of a pulse-shape signal. The wavelet signal is expected to imitate the behavior around the pulse period. To check this, we visually compared the spectral responses of the original strong motion data and of the wavelet that is expected to mimic the pulse. Some of the spectral response graphs are shown in Fig. 4.

Fig. 4 Velocity waveforms and fitted wavelets (left column) and pseudo-spectral velocity graphs (right column) of the 1992 Landers Earthquake, Yermo Fire Station signal and obtained Ricker wavelet signal (a, b); the 1999 Chi-Chi Taiwan Earthquake, TCU039 Station signal and obtained Ricker wavelet signal (c, d); the 1980 Irpinia Earthquake (Mw = 6.9), STN Station (rep = 30.35 km) signal and obtained Ricker wavelet signal (e, f); and the 1994 Northridge Earthquake (Mw = 6.7), SCE Station (rep = 24.97 km) signal and obtained 3rd-order Morlet wavelet signal (g, h). In all panels, the blue line represents the period of the pulse. Red and black colors indicate the velocity waveform and fitted wavelet signal, respectively

We also created a method to check the phase of the impulsive part of the velocity waveform. Only impulsive signals that can be identified with the Ricker wavelet are analyzed, since the Ricker wavelet dominated the representation of the impulsive part of the signals. The Ricker wavelet can be fitted to the original waveform to visualize the impulsive part more easily; however, it does not provide further information about the impulsive part. The method for phase determination is explained in Appendix III.

Two hundred twenty-nine of the 2738 waveforms have been identified as pulse-shape signals. Shahi and Baker (2014) and Chang et al. (2016) identified 225 and 229 waveforms as pulse-shape signals, respectively. One hundred seventy-eight of the signals are identified as pulse-shape signals by all three methods, whereas 196 signals are identified by both Shahi and Baker (2014) and this study (Fig. 5), and 198 signals by both Chang et al. (2016) and this study. Twenty-six of the pulses are located outside the PGV region. We also checked where the impulsive part occurred in the signal and which wavelet better explains the pulse region. Two hundred twenty-six of the pulse-shaped signals are mimicked better by a Ricker wavelet, whereas only 3 are mimicked better by a 3rd-order Morlet wavelet. A 4th-order Morlet wavelet is not suitable for mimicking any of the pulse-shape signals.

Fig. 5 Pulse periods determined by Shahi and Baker (2014) and this study (a), pulse periods determined by Chang et al. (2016) and this study (b), pulse periods of the signals in which impulsive signals are outside of the PGV region determined by Shahi and Baker (2014) and this study (c), pulse periods of the signals in which impulsive signals are outside of the PGV region determined by Chang et al. (2016) and this study (d). In panels a and b, both periods of velocity pulses at PGV (Tp) and periods of velocity pulses at other places (Tp,emax) are plotted. In panels c and d, only the periods of velocity pulses at other places (Tp,emax) are plotted

5.1 Comparison with previous studies

Pulses that occurred outside of the PGV region are also partially detected by Shahi and Baker (2014) and Chang et al. (2016) (Fig. 5). Shahi and Baker (2014) identified 18 of the 26 signals as impulsive. Since Shahi and Baker (2014) fit the full waveform of impulsive signals, it is not clear whether the impulsive part they detected is the same region as that found by our algorithm. Chang et al. (2016) identified 20 of the 26 signals as impulsive. The pulse periods calculated by our algorithm and by the algorithms of Shahi and Baker (2014) and Chang et al. (2016) are close to each other.

The parts of the signals considered by Shahi and Baker (2014), Chang et al. (2016), and our method are shown in Fig. 6. The two previous methods cover a larger part of the waveform than our method does. This makes it harder to analyze the impulsive part of the waveform, since it is contaminated by the non-impulsive parts of the waveform.

Fig. 6 Velocity waveforms and fitted wavelets (left column) and pseudo-spectral velocity graphs (right column) of the 1992 Landers Earthquake, Yermo Fire Station signal (a, b); the 1999 Chi-Chi Taiwan Earthquake, TCU039 Station signal (c, d); the 1980 Irpinia Earthquake (Mw = 6.9), STN Station (rep = 30.35 km) signal (e, f); and the 1994 Northridge Earthquake (Mw = 6.7), Rinaldi Receiving Station (rep = 9.30 km) signal (g, h). Black, red, blue, and green signals represent the velocity waveform, the Ricker wavelet, the 4th-order Daubechies wavelet extracted by the algorithm of Shahi and Baker (2014), and the waveform extracted by the algorithm of Chang et al. (2016), respectively. The vertical blue line represents the period of the pulse

We also examined inconsistencies between the numerical results of previous methods and our method. Some signals are not identified as impulsive by one study but are considered impulsive by another (Fig. 5a and b). The numerical results of Shahi and Baker (2014) (Eq. 2), Chang et al. (2016) (Eq. 3), and our method (Eq. 4) are listed in Table 1.

Table 1 Numerous signals with dissimilar impulsive results

Shahi and Baker (2014) were not able to identify some of the signals that are considered impulsive by both Chang et al. (2016) and this study. For these examples, PI is very close to the threshold of 0. The same holds for Chang et al. (2016): the threshold of 0.34 for Eq. 3 is almost exceeded at D08C Station. Brawley Airport Station, for which our study finds a long pulse period, is also just below the thresholds of Eq. 2 and Eq. 3 (Fig. 7). On the other hand, our method fails when the pulse period is short. For the waveform energy parameter, which is the left-hand term of the numerator of Eq. 4, the threshold is exceeded for almost all of the signals that our method classifies as non-impulsive, namely AQK, Pacoima Dam, KJMA, and Port Island Stations (Fig. 7). However, the wavelet power spectrum energy (the right-hand term of the numerator of Eq. 4) is so small that the signal is classified as not impulsive. The common feature of these signals, which are identified as impulsive by other methods, is that they have very short pulses. One of the features of impulsive signals is their long periods (Section 1). Our method filters out short-period signals thanks to the wavelet power spectrum energy.

Fig. 7 Velocity waveforms and fitted wavelets of Brawley Airport (a), TCU078 (b), Pacoima Dam (c), and KJMA (d) stations. Colors represent the studies as in Fig. 6. Brawley Airport is labeled as impulsive only by our method (Tp = 6.04); TCU078 is considered impulsive by both our study (Tp = 3.60) and Chang et al. (2016) (Tp = 1.00); Pacoima Dam and KJMA are identified as impulsive by both Shahi and Baker (2014) and Chang et al. (2016), with pulse periods of 0.78, 0.70 and 1.09, 1.00, respectively

6 Conclusion

In this study, we sought an alternative way of identifying pulse-shape signals. We combined several methods that were created to look for the same features. The possibility that impulsive signals are located away from the PGV is also taken into account. We have reached the following conclusions:

  1. Ricker wavelet analysis gives a higher resolution in the time domain, which makes it more suitable for determining the exact timing of the pulse.

  2. A Ricker wavelet is better than Morlet wavelets for mimicking the pulse part of the earthquake signal, based on residual analysis.

  3. Our method reproduces the spectral periods of the pulses, which makes the method convincing.

  4. Most of the velocity pulses occurred at the PGV. However, pulses may also occur in other intervals of the signal.

  5. This study agrees with previous studies while expanding the information about pulse-shaped signals, for example by identifying pulses that occur outside the PGV region.