1 Introduction

Seismic electromagnetic signals (SEMS) are electric, magnetic, and electromagnetic signals related to the genesis and occurrence of earthquakes and the structural recovery of the seismogenic region. SEMS have been widely reported in the literature (Eftaxias et al. 2001, 2009; Fujinawa et al. 1998; Han et al. 2014; Hattori et al. 2012; Huang et al. 2011b; King 1983; Park et al. 1993; Zhang et al. 2011; Zhao et al. 2009). Numerous laboratory and/or field experiments and numerical simulations have been conducted to verify the existence of SEMS phenomena (Huang et al. 1998; Kuo et al. 2014; Zhao et al. 2009; Enomoto et al. 2012; Ren et al. 2012; Potirakis et al. 2012; Huang 2011a). For example, radiation of ultra-low-frequency (ULF) electromagnetic signals was observed in the early, middle, and late stages of a rock-fracture experiment (Hao et al. 2003). Toward the end of the last century, Greek physicists proposed the VAN method for earthquake monitoring on the basis of seismic electrical signals (SES). It was claimed that this method could be used to predict earthquakes above magnitude 5 (Uyeda et al. 2009; Varotsos et al. 1991), which generated much interest (Huang 2005). Since the 1960s, China has established a nationwide network of earthquake-monitoring stations based on geoelectric resistivity, geoelectric fields, and electromagnetic waves. In recent years, much research has been conducted in China on the development of a method for monitoring extremely low-frequency electromagnetic radiation and on an earthquake electromagnetic satellite (Zhao et al. 2007, 2012). The development of these monitoring techniques reflects the attention given to SEMS-based earthquake prediction methods by the scientific community.

To identify and extract SEMS effectively, approaches such as maximum entropy estimation (Liu et al. 2012; Fan et al. 2010), time–frequency analysis (Eftaxias et al. 2001), wavelet transform (Zhang et al. 2013; Xie et al. 2013; Han et al. 2011), and principal component analysis (Uyeda et al. 2002; Han et al. 2009) have been used. The occurrence time, frequency band, and propagation path of SEMS have been studied (Fujinawa et al. 1998). However, most of these methods compare electromagnetic signals (or treated data) within a period of time around a single earthquake to seek earthquake-related anomalies and analyze the potential SEMS characteristics. In fact, SEMS may not be obviously abnormal signals because of their low intensity and/or mixing with the background field and interfering noise. This may be why no SEMS anomalies have been observed before and after some earthquakes, and why some abnormal electromagnetic signals cannot be related to the corresponding earthquakes (Orihara et al. 2012; Fan 2010; Han et al. 2014; Hattori et al. 2012). Although many methods have been used to investigate and extract SEMS on the basis of ground and/or ionosphere measurements, the correlation between earthquakes and electromagnetic signals is still difficult to quantify solely on the basis of electromagnetic anomalies before and after a single event, and the general features of complex and inconstant SEMS are harder still to establish. Therefore, although many studies in this field have been reported, the nature of SEMS remains elusive, as do the temporal–spatial distribution of SEMS and their variations.

For a specific earthquake, SEMS may arise during the seismogenic process, in the course of rock rupture during the co-seismic moment, and during post-seismic recovery of the seismogenic structure. By taking the time of occurrence of an earthquake as a reference point, the time axis can be divided into three stages, pre, co, and post-seismic, in which SEMS may or may not exist. When multiple earthquakes occur successively, the SEMS in the three stages may overlap and be submerged by noise. Therefore, the specific characteristics of SEMS in the three stages cannot be determined with certainty by analyzing a single seismic event before the physical mechanisms of seismic electromagnetic radiation are clear. Recently, statistical study by superposed epoch analysis (SEA) has been used to investigate the relationship between earthquakes and geomagnetic variations. The results of statistical analysis performed in Japan suggested there was a correlation between earthquakes and geomagnetic anomalies, and that ULF geomagnetic anomalies were probably more sensitive to earthquakes which were larger and closer to geomagnetic monitoring stations (Han et al. 2014; Hattori et al. 2012). As an alternative, in this paper, we introduce a new statistical method for investigating the direct correlation between earthquakes and electromagnetic signals.

Earthquakes are caused by tectonic movement with similar dynamic processes. For all earthquakes, or all earthquakes of a specific type, the corresponding SEMS may share similar temporal distribution characteristics. This means we could choose several earthquakes for statistical analysis and extract the common features of SEMS. Correlation analysis is a basic statistical tool. For simultaneous recognition of pre, co, and post-seismic SEMS, we propose use of the shifting correlation method (SCM) for SEMS and earthquake series. Continuous shifting of two sequences and calculation of correlation coefficients can result in a plot of correlation coefficients over the entire time axis. By shifting correlation analysis of electromagnetic signals of different frequency, a diagram based on the correlation coefficients can be obtained, and these diagrams clearly show the time–frequency distribution characteristics of SEMS.

In this study, we first introduce the basic principle of the shifting correlation method (SCM) and then verify the performance and functional characteristics of the SCM by use of synthetic data. We then discuss the application of the SCM to SEMS recognition by applying it to the mainshock–aftershock sequence of the Minxian–Zhangxian earthquake and the corresponding electromagnetic monitoring data.

2 Basic Principle of SCM

In mathematics, the correlation coefficient between two discrete sequences can be used to measure the correlation between two variables, as shown by Eq. (1).

$$R_{ME} = \frac{{\sum\nolimits_{k = 1}^{m} {(M_{k} - \overline{M} )(E_{k} - \overline{E} )} }}{{\sqrt {\sum\nolimits_{k = 1}^{m} {(M_{k} - \overline{M} )^{2} } } \sqrt {\sum\nolimits_{k = 1}^{m} {(E_{k} - \overline{E} )^{2} } } }}$$
(1)

where \(R_{ME}\) is the correlation coefficient between the two discrete sequences M and E, \(\overline{M}\) and \(\overline{E}\) are the means of the two sequences, and m is the number of samples.

Under actual conditions, the variations of the two variables may not be synchronized, i.e. the effect or action of one variable on another may happen in advance or lag behind. Such asynchronous correlation can be extracted by shifting the two sequences relative to each other and then calculating their correlation: before the correlation coefficients are calculated by use of Eq. (1), one sequence is fixed and the other sequence is continually shifted. The correlation coefficient between the two sequences after each shift is then calculated. On the basis of the amount of shifting corresponding to the maximum correlation coefficient, we can determine not only whether the two variable sequences are significantly correlated but also the time difference between two asynchronous variables. This is highly important for analysis of the temporal distribution characteristics of SEMS. Figure 1 shows a schematic diagram of this calculation process, which we call the shifting correlation method (SCM).

Fig. 1

Schematic diagram of the SCM calculation process. Three sequences are given in the figure. The upper sequence is the long sequence; the middle sequence \(M_i\) is derived from it, shown here for i = 6. E is a short sequence which determines the sample size involved in the shifting correlation calculation. m is the sample size involved in calculation; m = 14 is used in this example. The range of variation of i is [1, n − m + 1] when the sample size of the long sequence is defined as n. The linear correlation coefficient between \(M_6\) and E is calculated when the value of i is 6. This means that, with continual variation of the positive integer i, different \(M_i\) are obtained from the long sequence for calculation of linear correlation coefficients with the sequence E. Different i correspond to different \(M_i\) and to different correlation coefficients.

\(M_i\) is a sequence derived from a long sequence; as an example, i = 6 is shown in Fig. 1. E is the short sequence and determines the sample size involved in the shifting correlation calculation. m denotes the sample size involved in calculation; m = 14 is used as an example in Fig. 1. The sequence \(M_i\) used to calculate the correlation coefficient with the sequence E changes as the positive integer i varies, which is equivalent to shifting the fixed sequence E to the right or left relative to the long sequence on the horizontal axis. The range of variation of i is [1, n − m + 1] when the sample size of the long sequence is defined as n. The above process can be expressed as Eq. (2):

$$R_{{M_{i} E}} = \frac{{\sum\nolimits_{k = 1}^{m} {(M_{i,k} - \overline{{M_{i} }} )(E_{k} - \overline{E} )} }}{{\sqrt {\sum\nolimits_{k = 1}^{m} {(M_{i,k} - \overline{{M_{i} }} )^{2} } } \sqrt {\sum\nolimits_{k = 1}^{m} {(E_{k} - \overline{E} )^{2} } } }},\;\,i \in [1,n - m + 1]$$
(2)

where \(M_i\) and E are the two equal-length sequences for correlation calculation (\(M_i\) is obtained from the long sequence, and E is the short sequence), \(M_{i,k}\) and \(E_k\) are the kth sample values of the two sequences, \(\overline{{M_{i} }}\) and \(\overline{E}\) are the respective means of the two sequences, i is the serial number of the long sequence, m is the sample size involved in calculation, and \(R_{{M_{i} E}}\) is the correlation coefficient. In comparison with Eq. (1), M is replaced by \(M_{i}\) in Eq. (2).

The result of the shifting correlation method is not a single value but a sequence of correlation coefficients, i.e. the curve of variation of the correlation coefficient as the short sequence shifts to the left or right relative to the long sequence.
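As an illustration of Eqs. (1) and (2), the following is a minimal sketch in Python (standard library only). The function names `pearson` and `shifting_correlation` and the sample values are assumptions made for this example; they are not taken from the original processing code.

```python
import math


def pearson(x, y):
    """Linear correlation coefficient of two equal-length sequences, Eq. (1)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant sequence
    return cov / math.sqrt(var_x * var_y)


def shifting_correlation(long_seq, short_seq):
    """Correlate short_seq (E) with every length-m window M_i of long_seq, Eq. (2).

    Returns n - m + 1 coefficients; list entry i - 1 corresponds to M_i.
    """
    n, m = len(long_seq), len(short_seq)
    if m > n:
        raise ValueError("short sequence must not be longer than long sequence")
    return [pearson(long_seq[i:i + m], short_seq) for i in range(n - m + 1)]


# Hypothetical data: the short sequence is an exact copy of window M_5.
long_seq = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
short_seq = long_seq[4:9]
coeffs = shifting_correlation(long_seq, short_seq)
best_i = max(range(len(coeffs)), key=lambda i: coeffs[i]) + 1  # 1-based i
print(best_i, len(coeffs))
```

The position of the largest coefficient in the returned list gives the shift at which the two sequences are most strongly correlated.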

3 Effectiveness of the SCM Verified with Synthetic Data

First, recognition of single correlated signals is confirmed. A series of discrete random numbers is generated to form a long sequence, as shown in Fig. 2a. Samples are derived from different sections of the long sequence and added to a specific amount of random noise to form the short sequences. As shown in Fig. 2a, three sequences with a sample size of 30 are selected from the blocks M1, M2, and M3 (representing different sections). Approximately 30 % random noise is then added to form the three sequences E1, E2, and E3. E1 and M1 are synchronously correlated sequences, E2 is a post-correlated sequence obtained by left shifting M2 by 20 samples, and E3 is a pre-correlated sequence obtained by right shifting M3 by 10 samples.

Fig. 2

Shifting correlation coefficients obtained from synthetic data. a Long sequence and three short sequences. The horizontal axis represents the serial number of the sample; the vertical axis represents the preset amplitude, varying from 0 to 7. E1, E2, and E3 are sequences M1, M2, and M3 after addition of 30 % noise. The sample size is 30. E1 and M1 are synchronously correlated. E2 is derived by left shifting M2 by 20 samples; it is the post-correlated sequence. E3 is derived by right shifting M3 by 10 samples; it is the pre-correlated sequence. b \(R_{ME1}\), \(R_{ME2}\), and \(R_{ME3}\) are plots of the variation of the correlation coefficients between the three short sequences (E1, E2, and E3) and the long sequence. The absolute positive and negative values on the horizontal axis represent the number of samples by which the short sequences shift to the left and right.

Figure 2b shows the shifting correlation coefficients calculated when the three short sequences are shifted relative to the long sequence M. Relative to the starting point of the shifting, the numbers of the shifted samples are denoted by positive and negative values on the horizontal axis when the short sequences are shifted right or left. It is apparent that strong correlation occurs at positions of 0, −20, and 10, respectively, on the three curves, which coincide completely with the preset signals. This indicates that the SCM can accurately distinguish asynchronous correlation between two sequences, and provides a preliminary validation of the effectiveness of the method.
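The single-signal test described above can be reproduced in outline with a short sketch. Several details are assumptions made for illustration only: the random seed, the zero-shift position `start`, the interpretation of "30 % noise" as Gaussian noise whose standard deviation is 30 % of that of the selected block, and the sign convention (a negative shift means a window further to the left in the long sequence).

```python
import math
import random
import statistics


def pearson(x, y):
    """Linear correlation coefficient of two equal-length sequences, Eq. (1)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n, m = 200, 30
long_seq = [rng.uniform(0, 7) for _ in range(n)]  # amplitudes in 0-7

start = 80        # assumed zero-shift position of the short sequence
true_shift = -20  # the short sequence copies the block 20 samples to the left

block = long_seq[start + true_shift:start + true_shift + m]
noise_sd = 0.3 * statistics.pstdev(block)  # assumed meaning of "30 % noise"
short_seq = [v + rng.gauss(0, noise_sd) for v in block]

# Correlation curve for shifts of -30 ... +30 samples around the start.
shifts = list(range(-30, 31))
coeffs = [pearson(long_seq[start + s:start + s + m], short_seq) for s in shifts]
best_shift = shifts[max(range(len(coeffs)), key=lambda j: coeffs[j])]
print(best_shift, round(max(coeffs), 2))
```

With this setup, the largest coefficient is expected at the preset shift, while the coefficients at all other shifts fluctuate around zero.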

The test above validated the effectiveness of the SCM for a single correlation in synthetic data. However, if the sequences E1, E2, and E3 in Fig. 2a were superimposed in equal proportion to synthesize one short sequence of equal length, and the shifting correlation calculation were then performed with the corresponding long sequence, could the SCM distinguish the correlation at the three different positions simultaneously? Following this idea, further verification of the effectiveness of the SCM was conducted.

As shown in Fig. 3, a long sequence containing 200 samples was randomly generated and the method for short sequence generation was the same as for Fig. 2, but the three short sequences (E1, E2, and E3) were superimposed into one short sequence in equal proportion. To examine the effect of sample size on the shifting correlation coefficients, three sample sizes of the short sequence (m = 30, 50, and 100) were analyzed. Figure 3b shows an example of a short sequence for which the sample size was 50; it was located at a position synchronous with the long sequence. Comparison of the curve shapes in Figs. 2 and 3 reveals that, although E1, E2, and E3 contain random interference noise relative to M1, M2, and M3, some similarities can still be observed by pair-wise comparison, whereas the curves in Fig. 3a, b seem to be completely uncorrelated.

Fig. 3

Result of recognition of several superimposed correlated signals. a Long sequence with a sample size of 200, randomly fluctuating in the range 0–7. b Short sequence formed by superimposition of three sequences. Take the sample size of 50 as an example. For the range 51–100, the short sequence is synchronous with the long sequence, i.e. the starting point. c Plot of variation of the correlation coefficient; there are three positions with significant correlation. Relative to the long sequence, the short sequence shifted to the left and right by 30 steps

In the same way, the shifting correlation calculation is performed for the two sequences in Fig. 3a, b. In Fig. 3c, it is apparent that three significant correlations have been distinguished simultaneously at 0, −20, and 10, respectively. To test the generality of this result, the whole process above was repeated one hundred times. The results are shown in Fig. 4a–c for m (the sample size of the short sequence) equal to 30, 50, and 100, respectively. There is a “column-shaped” significant correlation at −20, 0, and 10 on the horizontal axis. This means that the SCM can simultaneously distinguish several hidden correlations which may occur at different times. Figure 4a–c also indicates that the resolution of the calculation increases with increasing sample size of the short sequence.
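The superimposed test and its hundred repetitions can be sketched as follows. This is an illustrative reconstruction, not the original experiment: the seed, the zero-shift position `start`, the noise model (Gaussian, standard deviation 30 % of that of each block), and the use of the mean coefficient over the trials as a summary are all assumptions.

```python
import math
import random
import statistics


def pearson(x, y):
    """Linear correlation coefficient of two equal-length sequences, Eq. (1)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)
n, m, start = 200, 50, 50    # start is the assumed zero-shift position
true_shifts = (-20, 0, 10)   # preset positions of the three hidden signals
shifts = list(range(-30, 31))
n_trials = 100

mean_r = [0.0] * len(shifts)
for _ in range(n_trials):
    long_seq = [rng.uniform(0, 7) for _ in range(n)]
    # Superimpose three noisy blocks in equal proportion.
    short_seq = [0.0] * m
    for s in true_shifts:
        block = long_seq[start + s:start + s + m]
        noise_sd = 0.3 * statistics.pstdev(block)
        for k, v in enumerate(block):
            short_seq[k] += v + rng.gauss(0, noise_sd)
    # Accumulate the correlation curve of this trial.
    for j, s in enumerate(shifts):
        window = long_seq[start + s:start + s + m]
        mean_r[j] += pearson(window, short_seq) / n_trials

ranked = sorted(range(len(shifts)), key=lambda j: mean_r[j], reverse=True)
detected = sorted(shifts[j] for j in ranked[:3])
print(detected)
```

Because each of the three blocks contributes only about one third of the variance of the superimposed sequence, the individual peaks are lower than in the single-signal case, which is why repeating the experiment helps to separate them from chance fluctuations.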

Fig. 4

One hundred experiments of shifting correlation calculation. The vertical axis represents the experiment number; the horizontal axis represents the number of samples by which the short sequence is shifted to the left and right, up to a maximum of 30. The negative and positive convention is the same as above; the color code is the magnitude of the correlation coefficient. a–c show the results when the sample size of the short sequence (m) is 30, 50, and 100, respectively. The “column-shaped” significant correlation occurs at −20, 0, and 10 on the abscissa.

These calculations using synthetic data indicate that the SCM can clearly distinguish simple, hidden, and complex correlations between two sequences. The larger the sample size of the short sequence, the higher the resolution. This method can be used to investigate the correlation between two physical quantities and the correlation characteristics. As an example, the method was used to analyze the correlation between an earthquake and electromagnetic signals.

4 Case Study

Electromagnetic signals are regarded as among the most sensitive physical responses to earthquakes (Zhao et al. 2007) but are vulnerable to interference from several sources. If SEMS indeed exist, they may be mixed with a variety of electromagnetic signals and noise, which makes it difficult to distinguish them. If the earthquake is correlated with the electromagnetic signals, the correlation can be recognized by the shifting correlation method, as indicated by the verification above.

If SEMS are manifest as a co-seismic effect, the earthquake sequence may have a strong correlation with the electromagnetic sequences monitored synchronously; but if they are a pre or post-seismic effect, the correlation between the two physical quantities cannot be observed unless the electromagnetic sequences are shifted relative to the earthquake sequences by a corresponding distance backward (left) or forward (right). Therefore, it is possible to extract the pre, co, and post-seismic SEMS simultaneously by applying the SCM to the electromagnetic signals and the earthquake sequence. On the basis of the calculated results, we can further analyze the time–frequency features of SEMS and their spatial distribution characteristics. It should be noted that the SCM is a statistical method based on several seismic events, not simply an analysis of the correspondence between one or several seismic events and anomalies in the electromagnetic signals.

4.1 General Information about the Minxian–Zhangxian Earthquake

The \(M_L\) 6.5 earthquake (after correction by Fang et al.), with a focal depth of 20 km, occurred at the boundary of Minxian and Zhangxian in Dingxi City, Gansu Province (34.5°N, 104.2°E) at 07:45 on July 22, 2013 Beijing Time. According to the Institute of Geophysics, CEA (see Data and Resources Section), the earthquake was a thrusting sinistral strike-slip earthquake. The earthquake was closely associated with the Lintan–Tanchang fault belt (F2) in the northeast. This fault lies between the East Kunlun fault system in the south and the northern margin of the western Qinling fault in the north, and between the Diebu–Bailongjiang fault (F3) and the Guanggaishan–Dieshan fault belt (F1). The three faults (F1–F3) were involved in tectonic transformation and regional stress redistribution (Zheng et al. 2013). After the earthquake, four monitoring stations, Zhujiawan (ZJW), Majiagou (MJG), Shimen (SHM), and Shuguang (SHG) (Fig. 5), were arranged within 15 km of the epicenter to perform continuous electromagnetic monitoring for nearly a month. The V5-2000 magnetotelluric system made by Phoenix (Canada) was used.

Fig. 5

Schematic maps of emergency electromagnetic monitoring sites and positions of the main shock and aftershocks (a) and active tectonics of the seismic region (b) (revised from Zheng et al. 2013). In a the big solid circle indicates the main shock, and the smaller solid circle indicates the largest aftershock. In b F1 is the Guanggaishan–Dieshan fault, F2 is the Lintan–Tanchang fault, and F3 is the Diebu–Bailongjiang fault

4.2 Shifting Correlation Analysis

The shifting correlation method was used to analyze electromagnetic monitoring data from the Minxian–Zhangxian earthquake. The main procedures were:

  • first, the main shock and aftershock sequence of the earthquake was processed into a long sequence used for correlation analysis;

  • second, the electromagnetic monitoring data was processed into the short sequence; and

  • third, final treatment, calculation, and analysis of the shifting correlation coefficient between the two sequences.

Note that road construction and mining were in progress near SHM and SHG stations, causing strong electromagnetic interference; as a result, data from these two stations were not of sufficient quality, and only data from ZJW and MJG stations were used in the SCM analysis. Basic information about the ZJW and MJG stations is listed in Table 1.

Table 1 Basic time and data information for electromagnetic monitoring of the Minxian–Zhangxian earthquake

4.2.1 Earthquake Sequence Treatment

Here, we first define an equivalent earthquake magnitude. The original conventional earthquake sequence was divided into equal time intervals [we use local time (LT) rather than universal time]. Within each time interval, the total seismic energy released was estimated from the conventional magnitudes M and then converted to an “equivalent magnitude” (Meq) to represent the seismic energy sequence. Specifically, using Eq. (3), the original conventional magnitude sequence (the mainshock–aftershock sequence from 07:45 on July 22 to 21:50 on September 26) was converted to an energy sequence and summed within each time interval. All the earthquake events were located less than 26 km from both stations, and most were within 20 km of the stations.

$$E = \sum\limits_{i = 1}^{s} {10^{{(A \times M_{i} + B)}} }$$
(3)

where E is the seismic energy (Joules) released, \(M_i\) is the magnitude of the ith earthquake, and s is the number of seismic events within the time interval. A and B are constants, with values of 1.96 and 2.05, respectively (Gutenberg et al. 1956). To ensure the effectiveness of the method, when determining the time interval, factors such as data amount, potential duration of SEMS for one event, and the continuity of seismic energy release should be taken into account. After repeated calculation and analysis, 1 day was selected as the time interval for the sequence of the Minxian–Zhangxian earthquake, that is, one energy value per day. The energy sequence was then converted to the Meq sequence by use of Eq. (4):

$$M_{\text{eq}} = \frac{\lg (E) - B}{A}$$
(4)

where \(M_{\text{eq}}\) is the equivalent magnitude (Meq). For days on which no earthquakes were detected, cubic spline interpolation was performed. (This was also performed on synthetic data to validate its suitability. The result of the calculation did not differ significantly from Fig. 4, so no results are given in this paper.) Figure 6a and b show the original earthquake sequence and the Meq sequence after treatment, respectively. It is apparent that the two sequences do not differ significantly before and after treatment. The Meq sequence in Fig. 6b was used for subsequent calculation.
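The conversion in Eqs. (3) and (4) can be written as a short sketch. The constants A and B are those given above; the function names and the example magnitudes are hypothetical. Days without events are returned as `None` here, to be filled afterwards by interpolation as described in the text.

```python
import math

A, B = 1.96, 2.05  # constants of Eqs. (3) and (4)


def interval_energy(magnitudes):
    """Total seismic energy (J) released by the events of one interval, Eq. (3)."""
    return sum(10 ** (A * m + B) for m in magnitudes)


def equivalent_magnitude(magnitudes):
    """Equivalent magnitude Meq of one interval, Eq. (4); None if no events."""
    if not magnitudes:
        return None  # left for interpolation
    return (math.log10(interval_energy(magnitudes)) - B) / A


# Hypothetical daily catalogue: magnitudes grouped by day.
daily_events = [[4.0], [3.0, 3.0], [], [2.5, 3.1, 2.8]]
meq_sequence = [equivalent_magnitude(day) for day in daily_events]
print(meq_sequence)
```

A single event returns its own magnitude, and two events of magnitude 3.0 give Meq = 3.0 + lg(2)/A, i.e. the equivalent magnitude grows only slowly with the number of events because the energy scale is logarithmic.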

Fig. 6

Earthquake sequences (a, b) before and after treatment and electromagnetic signal sequence (c). a Original earthquake sequence. The main shock occurred on July 22. The last major aftershock occurred on September 26; b Meq sequence after treatment to convert it into 1-day time intervals. The horizontal axis is continuously marked with positive integers; “0” represents the time of occurrence of the main shock, which was on July 22, and “66” represents September 26. During this period, the days are marked consecutively; c daily variation of the electric field Ex at the frequency of 120 Hz at station MJG. The relative status of the two sequences in b and c is non-shifting

4.2.2 Treatment of Electromagnetic Monitoring Data

The robust time series processing software SSMT2000 provided by Phoenix was modified so that the continuously observed time series could be divided into equal time intervals and processed automatically in batches. In accordance with the Meq sequence, the time series (Table 1) of the electromagnetic signal was also divided into time intervals of one day (LT 00:00–24:00). SSMT2000 was then used to obtain the full-band (0.00055–320 Hz) electromagnetic spectrum. The treatment result for each day was output in the form of an EDI file. The EDI files of the power spectrum, after auto-edit by MTEDITOR (software provided by Phoenix), were managed by MT-Pioneer software (Chen et al. 2004), and output as diurnal curves of the power spectrum values at different frequencies (a total of forty frequency points). As a result, forty curves of electromagnetic response were obtained. For data missing from Table 1, cubic spline interpolation was performed to form a complete electromagnetic signal sequence at the different frequency points. Figure 6c shows the sequence of Ex at a frequency of 120 Hz at MJG station.

The noise level of electromagnetic data can be judged by the quality of the apparent resistivity curves. In Fig. 7, panels (c) and (f) show the daily apparent resistivity curves for stations MJG and ZJW, respectively. Except for a few small and individual disturbances, the apparent resistivity curves of the two stations are very smooth, which indicates that the artificial noise in the electromagnetic signals is very small. To further investigate whether the electromagnetic disturbance around the stations was stronger during the daytime than at nighttime, we divided the 1-day time series into days and nights (LT; daytime is 06:00–18:00, and the remaining time is nighttime) and then calculated the corresponding apparent resistivity. The results are given in Fig. 7. Panels (a) and (b) show the daytime and nighttime curves, respectively, for MJG station. Panels (d) and (e) show the daytime and nighttime curves, respectively, for ZJW station. By comparing panels (a) and (b), and panels (d) and (e), we found that the quality of apparent resistivity at nighttime was slightly better than in the daytime. However, in the relatively low frequency bands, neither the daytime nor the nighttime result was good enough at the two stations, and the results for the entire day were much better than the daytime-only or nighttime-only results. Furthermore, whenever an earthquake occurs, whether in the daytime or nighttime, its pre, co, and/or post-seismic electromagnetic responses may arise in the daytime and/or nighttime. Much SEMS information would probably be lost if only one time period was used in the statistical analysis. The mean of the entire-day power spectrum was therefore judged the best choice for SCM calculation, and was used in the work discussed in this paper.

Fig. 7

Apparent resistivity (Rxy) curves for two stations. a–c are daytime, nighttime, and entire day apparent resistivity curves, respectively, from MJG station; d–f are the same as a–c but for ZJW station.

4.2.3 Calculation of Shifting Correlation Coefficients

By using the Meq sequence and the electromagnetic signal sequence divided into time intervals of 1 day, the curve of the shifting correlation coefficient between the two physical quantities was calculated. During the shifting process, the number of days shifted was determined by the length of the electromagnetic signal sequence, m, after treatment, i.e. by controlling the magnitude of n (in Fig. 1). To ensure satisfactory statistical analysis, the largest sample size was used when calculating the pre-seismic correlation, i.e. when the electromagnetic signal sequence shifts to the right; for MJG station, for example, m = 28. In this case the number of days shifted was marked by a negative value, representing pre-seismic correlation. In this study, because the earthquake sequences before July 22 were not included in the calculation, the number of samples used for calculation of the correlation coefficient is reduced when the electromagnetic signal is shifted to the left relative to the Meq sequence. The number of left-shifted days was marked by a positive value, representing post-seismic correlation. Finally, we calculated the shifting correlation coefficients between full-band electromagnetic signals and the Meq sequence and obtained nephrograms of the time–frequency distribution of the correlation coefficients.
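The reduction in sample size under left shifting can be made concrete with a small sketch. The sign convention follows the text (negative shift for pre-seismic, positive for post-seismic); everything else is an assumption for illustration: the function name, the starting day of the electromagnetic record (`em_start = 1`), and the synthetic data, in which the electromagnetic sequence is constructed to lead the Meq sequence by 5 days.

```python
import math
import random


def pearson(x, y):
    """Linear correlation coefficient of two equal-length sequences, Eq. (1)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def shifted_correlation(meq, em, em_start, shift):
    """Correlate EM day t with Meq day t - shift, using only overlapping days.

    shift < 0: EM compared with later earthquakes (pre-seismic).
    shift > 0: EM compared with earlier earthquakes (post-seismic).
    Returns (coefficient, number of sample pairs actually used).
    """
    x, y = [], []
    for k, value in enumerate(em):
        day = em_start + k - shift
        if 0 <= day < len(meq):  # Meq is undefined outside its record
            x.append(meq[day])
            y.append(value)
    r = pearson(x, y) if len(x) >= 3 else float("nan")
    return r, len(x)


rng = random.Random(2)
meq = [rng.uniform(2, 5) for _ in range(67)]    # synthetic Meq, days 0-66
em_start = 1                                    # assumed first EM day
em = [meq[em_start + k + 5] for k in range(28)]  # EM leads Meq by 5 days

results = {s: shifted_correlation(meq, em, em_start, s) for s in range(-25, 16)}
best = max(results, key=lambda s: abs(results[s][0]))
print(best, results[best][1], results[15][1])
```

Repeating this calculation for the sequence at each frequency point would give one row of the time–frequency nephrogram per frequency. The returned sample count makes it easy to flag shifts where a high coefficient rests on few data.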

5 Calculation Results and Analysis

It was found that the correlation coefficients between Ex and Hy and between Ey and Hx at the two stations were higher than 0.99, and the nephrograms of the correlation with Meq were also similar. Therefore, only the shifting correlation nephrograms between the Ex, Ey, and Meq sequences at the two stations are given in this paper (Figs. 8 and 9).

Fig. 8

Calculation results for MJG station. Correlation nephrograms obtained by shifting the electromagnetic signal sequences in Ex (upper) and Ey (lower) by up to 15 and 25 samples (days) relative to the Meq sequence. The marks on the horizontal axis are the same as those in Fig. 2a. The vertical axis represents the logarithm of frequency. The color represents the magnitude of the correlation coefficient, as in the colored bar on the right.

Fig. 9

Calculation results for ZJW station. The meanings of the symbols are the same as in Fig. 8

5.1 MJG Station

After a series of treatments, the lengths of the electromagnetic signal sequence and the Meq sequence at MJG station were 28 and 67, respectively. In most previous studies, the SEMS probably arose within 2 or 3 weeks before the earthquakes (Zhang et al. 2011; Uyeda et al. 2002; Orihara et al. 2012; Hattori et al. 2012; Han et al. 2009, 2014), so to ensure sufficient sample size, the electromagnetic signal sequence was shifted to the left and right relative to the Meq sequence by a maximum of 15 and 25 days, respectively. In the former situation, when the sequence was left shifted by 15 days, the number of samples for SCM calculation was 14. Thus, in Fig. 8, the strong correlation 15 days after the earthquake may be attributed to reduction of the amount of data available for calculation during the process of shifting to the left.

For the two anomaly regions with very prominent correlation at low frequency, 23 days before the earthquake and 11 days after the earthquake, the number of samples involved in the calculation was 28 and 19, respectively. The correlation coefficient at 0.01 Hz was 0.5 and 0.7, respectively. The lead time of the pre-seismic anomaly was similar to that found in studies of the Wenchuan, Lushan, and Izu Island earthquakes by use of different methods (Uyeda et al. 2002; Ma et al. 2013; Fan et al. 2010). Moreover, the correlation of the high-frequency component was not significant, especially before the earthquake. This is in agreement with the results of Fan et al. (2010), Fujinawa et al. (1998), and Park et al. (1993), who studied the frequency of electromagnetic anomalies.

It is apparent from the nephrogram that during the period from 22 days before to 10 days after the earthquake, the correlation coefficient was very small. There was almost no co-seismic correlation. A weak correlation coefficient of approximately 0.38 appeared at high frequency approximately 1 day after the earthquake. Eftaxias et al. (2001) reported a failure to record the co-seismic anomaly; in contrast, Orihara et al. (2012), Contoyiannis et al. (2010), and Tang et al. (2010) reported that they had observed co-seismic electromagnetic signals. The missing co-seismic signals are discussed in detail below.

5.2 ZJW Station

A moderately strong correlation was observed 20 days before the earthquake. A significant correlation occurred at a relatively high frequency (approx. 15 Hz) six days before the earthquake, with a correlation coefficient of 0.67. Between sixteen and six days before the earthquake, correlation appeared intermittently at high frequency, with the highest value six days before the earthquake. The anomaly at low frequency was quite continuous from the co-seismic stage to six days after the earthquake. Starting from seven days after, the correlation extended to the medium frequency until 11 days after the earthquake. The correlation coefficient was 0.56 in the co-seismic stage and increased to 0.65 five days after the earthquake. The continuous variation of the correlation coefficient from the co-seismic stage to 11 days after the earthquake may be indicative of post-seismic stress adjustment. This result corroborates the hypothesis concerning the disappearance of the anomaly for the impending earthquake and that the post-seismic anomaly does not disappear immediately (Tang et al. 1998; Eftaxias et al. 2001). It can be seen from Fig. 7f that the quality of low-frequency data is still satisfactory at this station, which indicates the high reliability of this phenomenon. It should also be noted that strong correlation appeared at medium frequency in Ey approximately 14 days after the earthquake. This might be for the same reason as the strong correlation 15 days after the earthquake at MJG station: a reduction in the amount of data.

6 Discussion

6.1 Correlation Between EM Sequence and Random Sequences

Considering the limited sample size of the electromagnetic signal sequences, the reliability of the results obtained with the shifting correlation method may be in doubt. Could a similar correlation be found between the electromagnetic signal sequence and a random sequence? To answer this question, the shifting correlation calculation was performed between the electromagnetic signal sequence and a “random sequence” derived from the Meq sequence. The electromagnetic signal sequence was unchanged, but the Meq values in Fig. 6b were rearranged randomly, giving a new sequence with the same amplitudes in a different order. The nephrogram with the best correlation was chosen from among 10 such experiments and is shown in Fig. 10. The shifting was to the left and right by 25 and 35 days, respectively. It is apparent from the figure that the correlation coefficients increase systematically after shifting to the right by 14 days. This demonstrates that the reduction in sample size has a significant effect on the correlation coefficients, which is consistent with the expectation stated above. When the sample size was 28 and no shifting was performed, or the shifting was to the right, the correlation coefficients were below 0.4, as shown in Fig. 10. No correlation coefficient exceeded 0.5, although some reached 0.3, as in the measured data. Therefore, in this paper we focus mainly on correlation coefficients larger than 0.5 in the measured data.

Fig. 10 Result of shifting correlation coefficients between the disordered Meq sequence and the electromagnetic signal sequence in the Ex direction at MJG station
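The randomization control described in Sect. 6.1 can be sketched as follows. This is a minimal illustration in Python and not the authors' implementation: the function names (`pearson`, `max_abs_r`, `shuffle_control`), the lag-pairing convention, the use of the largest |r| over all shifts as a summary score, and the placeholder input data are all assumptions made for the example.

```python
import math
import random

def pearson(x, y):
    """Pearson r; nan when fewer than 2 points or an input is constant."""
    n = len(x)
    if n < 2:
        return math.nan
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan
    return sxy / math.sqrt(sxx * syy)

def max_abs_r(em, meq, max_left, max_right):
    """Largest |r| over shifts -max_left..max_right.

    Assumed convention: lag > 0 pairs em[t] with meq[t + lag].
    """
    n = len(em)
    best = 0.0
    for lag in range(-max_left, max_right + 1):
        if abs(lag) >= n:
            continue  # no overlap left between the two sequences
        x = em[max(0, -lag): n - max(0, lag)]
        y = meq[max(0, lag): n - max(0, -lag)]
        r = pearson(x, y)
        if not math.isnan(r):
            best = max(best, abs(r))
    return best

def shuffle_control(em, meq, max_left, max_right, trials=10, seed=0):
    """Repeat the shifting correlation against randomly reordered Meq values."""
    rng = random.Random(seed)
    scores = []
    for _ in range(trials):
        shuffled = list(meq)   # same amplitudes ...
        rng.shuffle(shuffled)  # ... in a different order
        scores.append(max_abs_r(em, shuffled, max_left, max_right))
    return scores

# Placeholder data only: 28 samples of Gaussian noise stand in for real sequences.
rng = random.Random(1)
em = [rng.gauss(0, 1) for _ in range(28)]
meq = [rng.gauss(0, 1) for _ in range(28)]
print(max(shuffle_control(em, meq, max_left=10, max_right=10, trials=10)))
```

Comparing the largest shuffled score with the coefficients obtained from the correctly ordered sequences gives a rough sense of how large a correlation can arise by chance at a given sample size.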

6.2 Comparative Analysis of Two Stations

In this paper, absolute correlation coefficients were used in the nephrograms (Figs. 4, 8, 9). Therefore, the nephrograms cannot show whether the two physical quantities are positively or negatively correlated. Figure 11 shows an example for MJG station: the variation over time of the Ex sequence and the Meq sequence after shifting to the right by 23 days and to the left by 11 days, at a frequency of about 0.05 Hz (curves smoothed by cubic spline interpolation). As seen from the curves, the correlation is primarily positive before the earthquake and negative after it, with correlation coefficients of 0.53 and −0.67, respectively. The same phenomenon is observed in the high-value area of the correlation coefficient for ZJW station. We also found that the correlation was negative in the co-seismic stage, with a correlation coefficient of −0.56. These phenomena are interesting, but their cause requires further investigation.

Fig. 11 Comparison of the electromagnetic signal sequence in the Ex direction and the Meq sequence over time at a frequency of 0.05 Hz. The left and right vertical axes are the logarithms of the electromagnetic spectrum value and of Meq, respectively. The horizontal axis has the same meaning as in Fig. 6. a Comparison of the two sequences after shifting to the right by 23 days; b comparison of the two sequences after shifting to the left by 11 days
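Because the nephrograms plot |r|, the sign has to be read from the signed coefficient at each high-correlation cell. A small Python sketch of that bookkeeping follows; the helper `label_correlation` is hypothetical, the 0.5 threshold is the one adopted in Sect. 6.1, and the input values are the coefficients quoted in the text.

```python
def label_correlation(r, threshold=0.5):
    """Label a signed Pearson coefficient by sign, or as weak if |r| <= threshold."""
    if abs(r) <= threshold:
        return "weak"
    return "positive" if r > 0 else "negative"

# Signed coefficients quoted in the text
quoted = {
    "MJG, shifted right by 23 days": 0.53,
    "MJG, shifted left by 11 days": -0.67,
    "co-seismic stage": -0.56,
}
for name, r in quoted.items():
    print(f"{name}: |r| = {abs(r):.2f}, {label_correlation(r)}")
```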

Analysis of the data from the two stations shows that the correlation nephrograms have both similarities and differences. First, the nephrograms show no co-seismic correlation in the high-frequency component at either station. We suggest two possible interpretations for the absence of co-seismic high-frequency signals:

  1. The frequency of the co-seismic electromagnetic radiation may be too high for the instrument to record. In other reports, the frequency of the co-seismic electromagnetic anomaly is on the order of kHz or MHz (Eftaxias et al. 2009). However, the highest frequency of the electromagnetic monitoring instrument used in this study (V5-2000, Phoenix) is 320 Hz; thus, such high-frequency signals were not recorded.

  2. The high-frequency signals in the observed frequency range are absorbed by crustal media. The seismic electromagnetic signals may come from two sources: some are released from the hypocenter and transmitted through the earth medium, whereas others are transmitted by the waveguide between the ionosphere and the earth (Fujinawa et al. 1998). In this study, the monitoring stations were located near the epicenter, within 12 km of the earthquake swarm. No surface ruptures were caused by this shock (Zheng et al. 2013). Therefore, we infer that most of the observed SEMS may have come directly from the hypocenter through the earth medium, and that the high-frequency SEMS may have been absorbed by the crust, as suggested by Eftaxias et al. (2001). At ZJW station, highly correlated SEMS appeared below a frequency of 0.001 Hz, and the ribbon-like signals lasted for 10 days after the shock. To some extent, this phenomenon seems to indicate regularity in seismogenic and post-seismic activity.

Second, although the distance of the two stations from the epicenter is almost the same, significant correlations arose on different days and at different frequencies. Possible reasons for the different results at MJG and ZJW are:

  • because the number of samples in the electromagnetic sequence is limited, the result is vulnerable to local noise and may not reliably reflect the time–frequency characteristics of SEMS, which may lead to differences between the two stations; or

  • the distance between the epicenter and the two stations is only 12 km, but the apparent resistivity curves (Fig. 7) and the results of one-dimensional magnetotelluric inversions (see the attachment) show that their deep resistivity structures are quite different. The resistivity at MJG is significantly lower than that at ZJW. Moreover, previous studies have suggested that differences in deep structure may result in different recordings of SEMS (Varotsos et al. 1991; Hattori et al. 2012; Huang et al. 2010); different distributions of tectonic deformation fields, stress and strain fields, and active tectonics may also affect the results.

Because this study focuses mainly on the theory and realization of the SCM, further studies of the exact reason for the above phenomenon will be conducted in the future.

7 Conclusions

The shifting correlation method (SCM) is proposed for analysis of the correlation between earthquakes and electromagnetic signals. We assumed that seismic electromagnetic signals (SEMS) could exist in the pre-, co-, and post-seismic stages. After continuous shifting of one sequence, the correlation coefficients between the two physical quantities were calculated. Thus, we could seek information about SEMS along the entire time axis (the positions with high correlation), with the time of occurrence of the earthquakes as the origin. Synthetic data were first used to estimate the efficacy of the SCM for recognition of signals with asynchronous correlation. SCM was then used for analysis of electromagnetic monitoring data from the Minxian–Zhangxian earthquake, to obtain the preliminary temporal–frequency distribution characteristics of SEMS.
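The shift-then-correlate step summarized above can be illustrated for a single frequency with a short Python sketch. It is written under stated assumptions and is not the authors' code: the names `pearson` and `shifting_correlation`, the sign convention for the lag, the minimum overlap of three samples, and the synthetic test data are all hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson r; nan when fewer than 2 points or an input is constant."""
    n = len(x)
    if n < 2:
        return math.nan
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan
    return sxy / math.sqrt(sxx * syy)

def shifting_correlation(em, meq, max_left, max_right, min_overlap=3):
    """Return {lag: (r, overlap)} for every lag in -max_left..max_right.

    Assumed convention: lag > 0 pairs em[t] with meq[t + lag], so the EM
    sequence leads the seismic sequence; lag < 0 means it lags behind.
    The overlap shrinks by one sample for every unit of |lag|.
    """
    if len(em) != len(meq):
        raise ValueError("sequences must have equal length")
    n = len(em)
    out = {}
    for lag in range(-max_left, max_right + 1):
        overlap = n - abs(lag)
        if overlap < min_overlap:
            out[lag] = (math.nan, max(overlap, 0))
            continue
        if lag >= 0:
            x, y = em[: n - lag], meq[lag:]
        else:
            x, y = em[-lag:], meq[: n + lag]
        out[lag] = (pearson(x, y), overlap)
    return out

# Synthetic check: the EM sequence copies the Meq sequence five samples early.
rng = random.Random(0)
meq = [rng.gauss(0, 1) for _ in range(60)]
em = meq[5:] + [rng.gauss(0, 1) for _ in range(5)]  # em[t] == meq[t + 5]
result = shifting_correlation(em, meq, max_left=20, max_right=20)
best_lag = max(result, key=lambda lag: result[lag][0])
print(best_lag, result[best_lag])
```

Applying the same function to each frequency and stacking the results by lag would give a lag-by-frequency grid of the kind shown in the nephrograms; the signed r is returned here so that the absolute value can be taken at the plotting stage.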

From the synthetic study we found that SCM could suppress noise to some extent. The larger the sample size involved in SCM, the more effectively the noise was minimized and the higher the resolution of the correlating signals. Thus, SCM is very suitable for treatment and analysis of long-term monitoring data obtained by use of seismic station networks. Moreover, as was apparent from the analytical procedure, SCM is not confined to correlation analysis between earthquake and electromagnetic signals. It is also suitable for correlation analysis of other precursory physical quantities.
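The dependence on sample size can be illustrated with a small simulation of chance correlation between two independent noise sequences. This Python sketch is an illustration only: the function `mean_abs_r_of_noise`, the Gaussian noise model, and the trial count are assumptions, and the printed reference scale 1/sqrt(n − 1) is the standard deviation of r for independent normally distributed sequences.

```python
import math
import random

def pearson(x, y):
    """Pearson r; nan when fewer than 2 points or an input is constant."""
    n = len(x)
    if n < 2:
        return math.nan
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan
    return sxy / math.sqrt(sxx * syy)

def mean_abs_r_of_noise(n, trials=200, seed=0):
    """Average |r| between two independent Gaussian noise sequences of length n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        total += abs(pearson(x, y))
    return total / trials

# Chance correlation shrinks as the number of samples grows.
for n in (10, 28, 100, 400):
    print(n, round(mean_abs_r_of_noise(n), 3), round(1 / math.sqrt(n - 1), 3))
```

The shrinking chance correlation is one reason why shifts that leave only a few overlapping samples should be interpreted with caution.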

The results of a case study of the Minxian–Zhangxian earthquake are consistent with the view that SEMS precede earthquakes. In the frequency range involved in this study, SEMS may appear within 23 days before the shock and disappear 5 days before the shock. Strongly correlated SEMS appear at low frequency in the co-seismic and post-seismic stages, and may disappear 10 days after the earthquake. We also found that the time of occurrence of SEMS varied between stations and that the frequency band of SEMS differed between stages. However, the case study had some limitations: the number of samples of observed electromagnetic data was limited, and only linear correlation between earthquakes and electromagnetic signals was considered. Non-linear correlation, examined with sufficient samples, is worthy of study.

In general, a new method has been proposed for investigation of the relationship between earthquakes and electromagnetic signals, and some results are in agreement with those from previous studies. The relationships among SEMS characteristics, the position of monitoring stations, active tectonics, and seismic rupture merit further in-depth study.