# Extending the double difference location technique—improving hypocenter depth determination


## Abstract

Locating the seismic event hypocenter is the very first issue undertaken when studying any seismological problem. Thus, the accuracy of the obtained solution can significantly influence consecutive stages of an analysis, so there is a continuous demand for new, more efficient, and accurate location algorithms. It is well recognized that there exists no single universal location algorithm which performs equally well in all situations. Seismic activity and its spatial variability over time, seismic network geometry, and the controlled area’s geological complexity are factors influencing the performance of location algorithms. For example, in the case of mining applications, the planarity of the seismic network usually operated at the exploitation level becomes an important issue limiting the accuracy of location of the hypocenter depths. In this paper, we push forward the discussion on the performance of the newly proposed location algorithm called the extended double difference (EDD), concentrating on the reliability of source depth estimation for mining-induced seismic events. We demonstrate that the EDD algorithm very efficiently uses information originating from the nonplanarity of the seismic network, improving the hypocenter depth estimates with respect to the classical double difference technique. Methodological considerations are illustrated by real data analysis of selected events from the Rudna copper mine (Poland).

### Keywords

Mining seismology, Source location, Double difference technique, Extended double difference

## 1 Introduction

The main goal of seismic networks operating in mines is continuous monitoring of seismicity in the mining area (Gibowicz and Kijko 1994; Gibowicz and Lasocki 2001; Mendecki and Sciocatti 1997). This comprises detection of seismic events followed by location of hypocenters, energy and seismic moment estimation, and, possibly further, more advanced analysis. The growing demand for more and more precise monitoring of induced seismic activity, for increasing the accuracy and reliability of data analysis as well as the necessity of analyzing very small seismic events requires many improvements in seismic data analysis procedures. In this paper, we concentrate on the second step of mining (seismic) data analysis, namely, the hypocenter location task.

The possibility of achieving high location accuracy crucially depends on a few elements, namely, data quality, seismometer network geometry, and knowledge of the seismic velocity spatial distribution (Lomax et al. 2000; Wiejacz and Debski 2001). None of these elements are usually sufficiently known for the location problem. Some of them, like the optimum network configuration (Kijko and Sciocatti 1995) or data quality, are strongly connected to the mining process and cannot be easily changed or improved. Another factor, the velocity spatial distribution, is subject to many simplifications and is usually very poorly known (Lomax et al. 2000). Moreover, in a mining environment, the velocity also changes with time due to the dynamic response of the rock mass to excavation, and the changes can reach up to 10–20 % of the background velocity (Gibowicz and Lasocki 2001). As a consequence, the velocity distribution is usually poorly known and actually is the most significant factor limiting location accuracy (Lomax et al. 2000; Husen et al. 1999; Debski et al. 1997). Therefore, there is a need in mining practice to use new location algorithms which are as independent as possible of the velocity structure. An example of an algorithm which meets this criterion is the double difference (DD) technique proposed by Waldhauser and Ellsworth (2000).

The main idea behind this algorithm is to locate a group (spatial cluster) of events simultaneously rather than process seismic data for each event separately. As observational data, the method uses differential travel times (the differences between travel times of seismic waves coming from different events recorded at the same stations) instead of absolute travel time onsets for each event separately. This significantly reduces the dependence of location results on velocity models (Waldhauser and Ellsworth 2000). Moreover, it makes it possible to achieve sub-sample timing accuracy and to automate data preprocessing (Waldhauser 2009) by using any of the signal cross-correlation techniques (Waldhauser and Schaff 2008). The DD algorithm has proved its outstanding performance in locating clusters and swarms of seismic events, significantly contributing to the better characterization of large earthquake source areas (see, e.g., Enescu et al. 2005). For this reason, in a previous paper (Rudzinski and Debski 2008), we analyzed its application in a mining environment and concluded that the direct application of the DD approach to mining-induced seismicity analysis is quite limited. The reason for this is the algorithm's stability. Achieving it requires the seismic signals from events located together to be recorded by most (ideally all) of the stations used for the DD location. Unfortunately, this condition can hardly be met in mining practice. Firstly, the located events are usually quite small and are recorded by nearby stations only, which usually differ from event to event because of mining noise. Secondly, mining progress requires almost continuous updating of the seismic monitoring network. Again, this results in rejecting a number of events from the DD analysis because the events recorded before and after a network update cannot be located together due to differences in network configuration.

Having identified the problems with using the DD algorithm, we have proposed its extension and call it the extended double difference (EDD) (Rudzinski and Debski 2011). In the cited paper, we already noticed an improvement of the depth estimation achieved by the EDD algorithm with respect to the DD one. This is a very important point because while the DD algorithm significantly reduces the influence of the velocity structure on location results, the problem of precise estimation of hypocenter depth still remains if the seismic network is almost planar and located at the same depth at which most seismic events occur. This is usually the case in deep underground mines, where the seismic network, being located at the exploitation level, is almost planar. This inherently limits location accuracy because for such a network and event configuration, there is no natural depth variability scale which could help determine the depth of events. It should be noted that in the case of horizontal spatial (epicentral) coordinates, such a length scale is provided by the horizontal spreading of the network. Thus, any improvement in hypocenter depth estimation calls for algorithms which can efficiently “enhance” the information on network and source nonplanarity. The EDD algorithm seems to fulfill this requirement.

In this paper, we push forward the previous analysis (Rudzinski and Debski 2011) of the performance of the DD and EDD algorithms, concentrating this time on the resolution of the hypocenter depth and using real mining rockburst data as a case study.

The paper is organized as follows: first, we briefly describe the basic elements of the DD and EDD techniques; next, some elements of the probabilistic inverse theory are briefly outlined. This theoretical part is followed by an analysis of 10 events from the Rudna copper mine (Poland) and the obtained results are discussed.

## 2 Mathematics of the DD and EDD location algorithms

Let \(m^{i} = (X^{i}, Y^{i}, Z^{i}, T^{i})\) be the set of parameters describing the hypocenter spatial coordinates and origin time of the *i*th seismic event. Let \(T_{k}^{i}\) denote the generic wave onset time from the *i*th event read at the *k*th station. From now on, we assume that all arrival times correspond to the same seismic phase. Solving the location problem requires theoretical/numerical calculations of onset times (hereafter, denoted as \({(T^\textrm{m})}_{k}^{i}\)) for comparing with observed arrival times (\({(T^\textrm{o})}_{k}^{i}\text{(m)}\)). Next, let us introduce the generalized differential arrival times, defined as the difference between arrival times from different sources at different stations.

Let us consider events forming a spatial cluster and assume that the hypocenter separation between sources is small compared to the hypocenter–station distance and to local velocity heterogeneity sizes. Then, the ray paths between the sources and a common station can be hypothetically divided into two parts. The first part is common for all rays and roughly speaking spans from the station to the source cluster. The second one takes into account differentiation of ray paths in the neighborhood of and within the cluster. This hypothetical splitting of ray paths demonstrates that wave travel times from different sources have a large common factor connected to the wave traveling along the common part of all rays. The remaining part of the travel times brings information on the relative source location within the cluster. By using the differential travel times, we cancel (at least partially) the propagation time contribution from the common ray part, thus diminishing the location result’s sensitivity to the velocity model. Simultaneously, the differential travel times preserve information on the relative source distribution within the cluster (Evangelidis et al. 2008).
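The cancellation argument above can be illustrated with a minimal numerical sketch. It assumes straight rays in a homogeneous medium, and the station position, hypocenters, and velocities are hypothetical values chosen only for illustration; none of them come from the Rudna data.

```python
import numpy as np

def travel_time(source, station, velocity):
    """Straight-ray travel time in a homogeneous medium (toy assumption)."""
    return np.linalg.norm(np.asarray(station) - np.asarray(source)) / velocity

# Hypothetical geometry (meters): one distant station, two nearby events.
station = np.array([3000.0, 0.0, 0.0])
event_i = np.array([0.0, 0.0, -900.0])
event_j = np.array([50.0, 30.0, -880.0])

# Hypothetical velocities (m/s): the model is 10 % slower than the truth.
v_true, v_model = 5800.0, 5220.0

# Error of an absolute travel time caused by the wrong velocity model.
abs_err = travel_time(event_i, station, v_model) - travel_time(event_i, station, v_true)

# Error of the differential travel time between the two events at the same station.
dt_true = travel_time(event_i, station, v_true) - travel_time(event_j, station, v_true)
dt_model = travel_time(event_i, station, v_model) - travel_time(event_j, station, v_model)
diff_err = dt_model - dt_true

print(f"absolute travel time error:     {abs_err * 1e3:.2f} ms")
print(f"differential travel time error: {diff_err * 1e3:.2f} ms")
```

In this toy setting the velocity error acts on the full ray length for the absolute time but only on the small path-length difference for the differential time, which is the effect the paragraph describes.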

*l*_{2} norm reads

Since the differential travel times \((\Delta^\textrm{o})_k^{ij}\) can be calculated automatically with very high accuracy from seismograms by means of the cross-correlation technique, the approach is very suitable for automatic almost real-time event location (Waldhauser and Ellsworth 2000).
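As a rough illustration of how a differential time can be read from two similar seismograms, the sketch below takes the lag of the cross-correlation maximum. The function name `differential_time`, the synthetic wavelet, and the 7-sample delay are assumptions made for this example; the sampling interval of 2 ms is borrowed from the case study. Only integer-sample resolution is obtained here, so reaching sub-sample accuracy would require an additional interpolation step that is not shown.

```python
import numpy as np

def differential_time(trace_i, trace_j, dt):
    """Delay (s) of trace_i relative to trace_j from the cross-correlation peak."""
    cc = np.correlate(trace_i, trace_j, mode="full")
    # In "full" mode, zero lag sits at index len(trace_j) - 1.
    lag = int(np.argmax(cc)) - (len(trace_j) - 1)
    return lag * dt

dt = 0.002                      # sampling interval (s)
t = np.arange(256) * dt         # time axis of the synthetic records

def pulse(t0):
    """Synthetic wavelet: a 40 Hz sine under a Gaussian envelope centred at t0."""
    return np.exp(-((t - t0) / 0.01) ** 2) * np.sin(2 * np.pi * 40.0 * (t - t0))

trace_j = pulse(0.200)          # reference event
trace_i = pulse(0.214)          # same wavelet arriving 7 samples later

delay = differential_time(trace_i, trace_j, dt)
print(f"estimated differential time: {delay * 1e3:.1f} ms")
```

A positive value means the first trace arrives later than the second one.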

*a*_{i} coefficients to 0 or 1. They are listed in Table 1.

Seven elementary misfit functions and the resulting location algorithms generated from the generic misfit function described in Eq. 8 by setting the weighting coefficients *a*_{i} to 0 or 1

No. | \(a_{\textrm{dd}}\) | \(a_{\textrm{se}}\) | \(a_{\textrm{ed}}\) | \(S(\mathbf{m})\) |
---|---|---|---|---|
1 | 1 | 0 | 0 | \(S_{\textrm{DD}}\) |
2 | 0 | 1 | 0 | \(S_{\textrm{SE}}\) |
3 | 0 | 0 | 1 | \(S_{\textrm{ED}}\) |
4 | 0 | 1 | 1 | \(S_{\textrm{SE}} + S_{\textrm{ED}}\) |
5 | 1 | 0 | 1 | \(S_{\textrm{DD}} + S_{\textrm{ED}}\) |
6 | 1 | 1 | 0 | \(S_{\textrm{DD}} + S_{\textrm{SE}}\) |
7 | 1 | 1 | 1 | \(S_{\textrm{DD}} + S_{\textrm{SE}} + S_{\textrm{ED}}\) |
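The seven rows of the table are simply all nonzero 0/1 choices of the three weighting coefficients. A short sketch that enumerates them is given below; the helper name `elementary_misfits` is hypothetical, and the rows come out in binary counting order rather than in the order used in the table.

```python
from itertools import product

TERMS = ("S_DD", "S_SE", "S_ED")

def elementary_misfits():
    """List every nonzero 0/1 choice of (a_dd, a_se, a_ed) with the resulting sum of terms."""
    combos = []
    for a in product((0, 1), repeat=3):
        if not any(a):
            continue  # all coefficients zero: no misfit function at all
        label = " + ".join(name for name, on in zip(TERMS, a) if on)
        combos.append((a, label))
    return combos

combos = elementary_misfits()
for (a_dd, a_se, a_ed), label in combos:
    print(a_dd, a_se, a_ed, label)
```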

*μ*(·) stands for the reference distribution (usually a noninformative probability distribution which accounts for the size of the model space (Mosegaard and Tarantola 2002; Tarantola 2005)). We have chosen this parameter instead of the more traditional measures, like the a posteriori error or the RMS value of residuals for the optimum model found, because the Shannon measure takes into account not only the “goodness” of the optimum best fitting model (RMS residuals) but also the shape of the a posteriori probability distribution which determines the a posteriori errors. The price for choosing this robust measure **I**_{S} is the necessity of performing the full probabilistic (Bayesian) inversion rather than the most popular optimization-based inversion (Debski 2010). Since the probabilistic inverse theory is still not commonly used, let us describe its very basic elements in the next section. Readers interested in the details of the probabilistic inverse theory are referred to the basic textbooks and review papers, for example, Tarantola (1987, 2005), Debski (2010), Mosegaard and Tarantola (2002), Mosegaard and Sambridge (2002), Lomax et al. (2000), and Matsu’ura (1984).
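To make the role of the Shannon measure more tangible, the sketch below evaluates the relative information content of a one-dimensional a posteriori density with respect to a reference density on a depth grid. It assumes the common form of the measure, the integral of σ log(σ/μ) in the sense of Tarantola (2005), which may differ in detail from the expression used in the paper; the grid, the uniform reference distribution, and the Gaussian test densities are hypothetical.

```python
import numpy as np

def shannon_information(sigma, mu, dz):
    """Approximate the integral of sigma * log(sigma / mu) on a regular grid (natural log)."""
    sigma = sigma / (sigma.sum() * dz)   # normalise both densities on the grid
    mu = mu / (mu.sum() * dz)
    mask = sigma > 0                     # 0 * log(0) is taken as 0
    return float(np.sum(sigma[mask] * np.log(sigma[mask] / mu[mask])) * dz)

z = np.linspace(-1500.0, -500.0, 2001)   # hypothetical depth grid (m)
dz = z[1] - z[0]
mu = np.ones_like(z)                     # uniform (noninformative) reference

def gaussian(z0, s):
    """Unnormalised Gaussian density centred at z0 with standard deviation s."""
    return np.exp(-0.5 * ((z - z0) / s) ** 2)

i_sharp = shannon_information(gaussian(-850.0, 50.0), mu, dz)    # well-resolved depth
i_broad = shannon_information(gaussian(-850.0, 150.0), mu, dz)   # poorly resolved depth
print(f"I_S (sharp) = {i_sharp:.2f}, I_S (broad) = {i_broad:.2f}")
```

A narrower a posteriori distribution yields a larger value, which is why a larger measure is read as a better-resolved depth.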

## 3 Inverse theory—probabilistic point of view

The primary goal of most inverse tasks is to determine the values of some parameters (**m**) which cannot be measured directly (Tarantola 2005). In the case of the location problem, they are the hypocenter coordinates and rupture origin time. To estimate the **m** parameters, we choose some additional physical parameters (**d**) which can be directly measured and which are connected to the sought ones in a known way (**d** = **G**(**m**)). Then, we infer **m** from the measured \(\mathbf{d}^{\textrm{obs}}\).

There are basically three possible approaches to carry out this inference (Debski 2010; Tarantola 2005).

**G**(·) function.

**m**—to find the model \(\mathbf{m}^{\textrm{ml}}\) for which the theoretical prediction best fits the measured one. In practice, this is achieved by solving the optimization task which in a simplified form reads

**m** and assigning to it the probability of being the true one. In the simplest case, this a posteriori probability reads (Tarantola and Valette 1982)

*f*(·) stands for an arbitrary a priori probability function known from elsewhere and the misfit function *S*(**m**) reads

The advantage of the probabilistic approach is that having the a posteriori probability density, we can evaluate any characteristic of the final solution allowing for any exhaustive error, resolution, trade-off, etc. analysis (Debski 2004; Wiejacz and Debski 2001). We need this feature to evaluate and compare the performance of the DD and EDD location algorithms.

However, any practical use of the probabilistic inverse technique requires an efficient method of sampling the a posteriori distribution, which is by no means trivial if the number of inverted parameters is large (Curtis and Lomax 2001). The most popular class of sampling algorithms used in such situations is the Markov chain Monte Carlo (MCMC) technique (see Gilks et al. 1995; Robert and Casella 1999). This technique can efficiently perform so-called importance sampling in any multidimensional model space, that is, sampling only that part of the model space where the sampled function has its dominating values. Since we were locating nine seismic events simultaneously in the studied problem, which gives a total of 36 inverted parameters, the sampling of the a posteriori distribution was carried out by a very simple MCMC algorithm, the Metropolis algorithm (Chib and Greenberg 1995).
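For readers unfamiliar with the Metropolis algorithm, a minimal random-walk variant is sketched below. It is not the implementation used in this study: it assumes a flat a priori distribution, an a posteriori density proportional to exp(−*S*(**m**)), and a hypothetical three-parameter quadratic misfit in place of the DD or EDD misfit functions. The names `metropolis` and `toy_misfit`, the step size, and the chain length are all illustrative choices.

```python
import numpy as np

def metropolis(misfit, m0, step, n_samples, rng):
    """Random-walk Metropolis sampling of a density proportional to exp(-S(m))."""
    m = np.asarray(m0, dtype=float)
    s = misfit(m)
    samples = np.empty((n_samples, m.size))
    for n in range(n_samples):
        candidate = m + step * rng.standard_normal(m.size)
        s_new = misfit(candidate)
        # Accept with probability min(1, exp(S_old - S_new)).
        if np.log(rng.random()) < s - s_new:
            m, s = candidate, s_new
        samples[n] = m
    return samples

def toy_misfit(m):
    """Hypothetical quadratic misfit: depth (third parameter) is the least constrained."""
    target = np.array([0.0, 0.0, -850.0])
    width = np.array([20.0, 20.0, 50.0])
    return 0.5 * np.sum(((m - target) / width) ** 2)

rng = np.random.default_rng(0)
samples = metropolis(toy_misfit, m0=[0.0, 0.0, -800.0], step=25.0,
                     n_samples=20000, rng=rng)

burn_in = 2000                                   # discard the initial transient
posterior_mean = samples[burn_in:].mean(axis=0)
posterior_std = samples[burn_in:].std(axis=0)
print("mean:", posterior_mean.round(1), "std:", posterior_std.round(1))
```

The collected samples can then be used to estimate any characteristic of the a posteriori distribution, such as the covariance matrix or marginal densities of depth.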

## 4 Rudna copper mine case

*dt* = 2 ms. The accuracy of routinely located events is better than 100 m, typically around 50 m for the epicentral coordinates and much worse for hypocenter depth (Fig. 1). For the comparison analysis, we have selected a cluster composed of 10 events from the XVII/1 mining panel described in Table 2 and shown in Fig. 2 where a sketch of the part of the mine where the considered events occurred is depicted. Event no. 1 was fixed as the master event and has not been relocated by either the DD or EDD algorithm; its location, well supported by mining observations, was provided by the mine. The location procedures were based on the P wave arrival times picked manually from seismograms. We have decided not to use any cross-correlation technique for observational data preprocessing to avoid any numerical errors which could influence the performance of the DD and EDD algorithms in different ways. The location procedure was carried out within the probabilistic (Bayesian) approach by sampling the a posteriori probability functions defined for DD and EDD as

Seismic events recorded on mining panel XVII/1 used in the current analysis

Event | Date | Time (hh:mm:ss) | Energy (J) | \(N_{\textrm{arrivals}}\) |
---|---|---|---|---|
1 | 2010.03.18 | 16:45:38 | 1.60E+07 | 28 |
2 | 2010.01.29 | 05:58:56 | 4.10E+06 | 27 |
3 | 2010.06.23 | 21:24:56 | 1.20E+07 | 28 |
4 | 2010.07.25 | 16:52:24 | 2.10E+05 | 26 |
5 | 2010.05.26 | 14:17:16 | 2.60E+05 | 21 |
6 | 2010.06.04 | 21:50:50 | 2.20E+05 | 27 |
7 | 2010.01.25 | 20:24:04 | 5.90E+05 | 18 |
8 | 2010.03.31 | 05:06:12 | 5.60E+04 | 16 |
9 | 2010.12.07 | 00:48:22 | 8.10E+04 | 11 |
10 | 2010.09.11 | 06:59:08 | 1.50E+03 | 8 |

Let us begin the analysis of the location results by noticing, according to Table 2, a significant variation in the number of stations contributing to the location of different events. It ranges from 8 for the smallest event (event no. 10) up to 28 (event no. 3), which means that almost all the stations of the network contribute to the location of this event. The consequence of this fact is that the location errors for events no. 9 and 10 are larger than for other events from the cluster, as reported in Table 5.

*X*, *Y*) shows that both solutions coincide quite well within 50–100-m accuracy, as is also shown in Fig. 1 where the epicentral distances weighted by the doubled epicentral errors are shown.

The EDD maximum likelihood solutions

Event | \(X_{\textrm{EDD}}\) | \(Y_{\textrm{EDD}}\) | \(Z_{\textrm{EDD}}\) |
---|---|---|---|
1 | 31948 | 8775 | −781 |
2 | 31886 | 8883 | −834 |
3 | 31846 | 8887 | −889 |
4 | 31898 | 8846 | −868 |
5 | 31948 | 8812 | −855 |
6 | 31897 | 8931 | −902 |
7 | 31970 | 8916 | −840 |
8 | 31975 | 8727 | −831 |
9 | 32185 | 8721 | −882 |
10 | 32172 | 8743 | −911 |

The DD maximum likelihood solutions

Event | \(X_{\textrm{DD}}\) | \(Y_{\textrm{DD}}\) | \(Z_{\textrm{DD}}\) |
---|---|---|---|
1 | 31948 | 8775 | −781 |
2 | 31942 | 8762 | −756 |
3 | 31894 | 8804 | −694 |
4 | 31899 | 8769 | −894 |
5 | 31951 | 8688 | −1144 |
6 | 31899 | 8772 | −935 |
7 | 31967 | 8747 | −892 |
8 | 31875 | 8712 | −835 |
9 | 31892 | 8807 | −683 |
10 | 31976 | 8857 | −755 |

*Z* not only differ between the two methods, but the inversion errors for the DD solutions are also almost twice as large as for the EDD solutions. This is a result of significant differences between the a posteriori probability densities, as shown in Fig. 3.

The EDD and DD location errors (in meters) estimated by the a posteriori covariance matrix

Event | EDD \(\Delta X\) | EDD \(\Delta Y\) | EDD \(\Delta Z\) | DD \(\Delta X\) | DD \(\Delta Y\) | DD \(\Delta Z\) |
---|---|---|---|---|---|---|
2 | 17 | 22 | 47 | 28 | 27 | 103 |
3 | 17 | 22 | 50 | 27 | 27 | 106 |
4 | 17 | 22 | 43 | 28 | 28 | 153 |
5 | 21 | 23 | 47 | 29 | 27 | 108 |
6 | 17 | 21 | 43 | 29 | 28 | 166 |
7 | 21 | 25 | 51 | 29 | 28 | 170 |
8 | 21 | 27 | 53 | 28 | 26 | 153 |
9 | 28 | 30 | 84 | 27 | 27 | 118 |
10 | 35 | 36 | 100 | 27 | 28 | 137 |

**I**_{S} for the solution generated by the \(S_{\textrm{DD}}\) part of the misfit function alone (classical DD solution) has a much smaller value than for solutions obtained for the misfit function containing the remaining combinations of the \(S_{\textrm{DD}}\), \(S_{\textrm{SE}}\), and \(S_{\textrm{ED}}\) terms. This is especially true when comparing the **I**_{S} values for solutions generated by the \(S_{\textrm{DD}}\) and \(S_{\textrm{SE}}\) terms and by \(S_{\textrm{ED}}\) alone. In the case of the poorly resolved event (right panel in Fig. 4), this effect disappears. The value of the **I**_{S} factor is now very low and similar for all choices of the misfit function. It seems that in this case, the resolution of all methods is similar and limited by insufficient information in the input data. The conclusion which we have drawn from these observations is that algorithms built upon \(S_{\textrm{ED}}\) or \(S_{\textrm{SE}}\) combinations explore information about event depth more efficiently than the DD technique. To understand the mechanism of this “depreciation” of the DD method, let us examine the Shannon measure for all nine located events and different choices of the misfit function. The results are shown in Fig. 5.

**I**_{S} factor for the DD solutions with respect to other choices of \(S(\mathbf{m})\) for all well-resolved events. Moreover, the **I**_{S} values are almost the same for all events. The second characteristic is an apparent progressive diminishing of the **I**_{S} factor for all but one, \(S_{\textrm{DD}}\) alone, of the combinations of the \(S_{\textrm{DD}}\), \(S_{\textrm{SE}}\), and \(S_{\textrm{ED}}\) terms with successive models. Inspection of Table 2 indicates that the varying number of seismic time onsets recorded and used for locating events is the main factor influencing **I**_{S}. In fact, the dependence of **I**_{S} on the number of used onsets plotted in Fig. 6 fully proves this expectation. A similar effect has been reported for the DD technique by Bai et al. (2006). An analysis of Fig. 6 suggests that for a large number of onsets, **I**_{S} saturates for any considered misfit functions. At the moment, we are not sure if this is caused by exhaustion of all the independent information contained in the data or if it results from the inherent resolution ability of the algorithms imposed by the structure of the assumed misfit functions.

## 5 Conclusions

The EDD location technique was originally proposed in order to include the specific demands of the mining seismic environment, such as event clustering (Gibowicz and Lasocki 2001; Orlecka-Sikora and Lasocki 2002) and continuous changing of the monitoring system (Mendecki and Sciocatti 1997; Rudzinski and Debski 2011), in the DD method. We expected the resulting algorithm to have a large enough spatial resolution to enable a detailed analysis of the structure of the spatial seismic clusters. Through performing synthetic tests, we have concluded (Rudzinski and Debski 2011) that the performance of the EDD approach with respect to the epicentral coordinates is essentially the same as for the DD technique. The current analysis of real data further supports this conclusion. However, the performance of the EDD approach with respect to the hypocenter depth has been found to be significantly better than that of DD.

The analysis performed in this paper clearly shows that the DD algorithm does not fully exploit the information contained in the input seismic data. This is not the case with the EDD approach. Using the same data set, the EDD algorithm leads to more precise depth solutions (larger value of the Shannon information measure). We believe that this is because the additional terms introduced into the misfit function by the EDD algorithm bring additional constraints (information) on the final solution with respect to the DD term. To illustrate this point, let us assume that we are locating *N* events using *M* stations and that signals from all the events are recorded by all stations. It is easy to show that in such a case, the \(S_{\textrm{DD}}\) term contains *N* × (*N* − 1) × *M*/2 travel time difference factors, the \(S_{\textrm{SE}}\) term contains *M* × (*M* − 1) × *N*/2 factors, and finally the \(S_{\textrm{ED}}\) term consists of *M* × *N* × (*N* − 1) × (*M* − 1)/2 factors. Assuming that each such factor represents some constraint imposed on the final solution, it follows that in typical conditions when *M* > *N*, the \(S_{\textrm{DD}}\) term has the lowest “resolving power.” This analysis remains true provided that there are no dominating factors in the \(S_{\textrm{DD}}\), \(S_{\textrm{SE}}\), and \(S_{\textrm{ED}}\) terms. Otherwise, the dominating factors will determine the structure and the most important features of the misfit function. Consequently, adding new factors to the misfit function \(S(\mathbf{m})\) by changing *N* or *M* does not change it significantly. The consequence of this is stationarity of the a posteriori probability distribution, which means that the Shannon measure saturates when the number of phase readings increases.
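The counting argument above is easy to check numerically. The sketch below evaluates the three expressions for an illustrative case of *N* = 10 events and *M* = 28 stations; these numbers are only loosely inspired by the cluster size and the largest number of arrivals in Table 2, and the idealized assumption that every station records every event does not hold for the real data set. The function name `n_factors` is hypothetical.

```python
def n_factors(n_events, n_stations):
    """Number of travel time difference factors in each misfit term (full recording assumed)."""
    n, m = n_events, n_stations
    return {
        "S_DD": n * (n - 1) * m // 2,            # event pairs at a common station
        "S_SE": m * (m - 1) * n // 2,            # station pairs for a common event
        "S_ED": m * n * (n - 1) * (m - 1) // 2,  # expression given in the text for the ED term
    }

counts = n_factors(n_events=10, n_stations=28)
for term, count in counts.items():
    print(f"{term}: {count}")
```

For these values, the DD term contributes 1260 factors, the SE term 3780, and the ED term 34020, so the DD term indeed carries the fewest constraints whenever *M* > *N*.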

It is not easy to say when the misfit function can be dominated by travel time difference factors. From the physical point of view, by analogy to phase transition processes, such a situation can happen if there exists some natural temporal or spatial scale for the analyzed (inverted) parameters. This is the case, for example, with epicentral coordinates for which the spatial extension of the network provides such a characteristic length scale. On the other hand, if such a scale length is missing, we can expect that all factors in \(S(\mathbf{m})\) have similar importance and thus the DD approach will not be able to exploit all available information efficiently. We believe that one example of this is the planar seismic network and seismicity located at the depth of the network.

Finally, to conclude our discussion about the performance of the EDD algorithm, let us observe that similar to the DD approach, the EDD algorithm uses the differential travel times as input data. However, contrary to the DD approach, it includes all possible combinations of source–receiver pairs. This means, however, that some of the input differential times can hardly be calculated reliably by means of cross-correlation techniques because of a lack of similarity between the considered seismograms. One consequence of this fact is that the algorithm needs careful and, in some part, manual preprocessing of the input data. Its automatic implementation may be problematic.

## Acknowledgements

The authors are very grateful to the Rudna copper mine for its cooperation and kind permission to use its seismic data. This work was partially supported by grant no. 2011/01/B/ST10/07305 from the National Science Center. The anonymous reviewers are acknowledged for their effort and help in improving the paper.

**Open Access**

This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.