1 Introduction

1.1 Current research on earthquakes in mines

According to available information, the first rock burst accident in a coal mine in China occurred in Fushun City, Liaoning Province (Pan et al. 2003, 2007; Benjun 1987). Coal mine earthquakes can be caused by various factors, including the ground pressure, rock bursts, and artificial blasting (Ye et al. 2008; Yan et al. 2010; Li et al. 2011). Therefore, real-time monitoring and location of mine earthquakes are of critical significance for predicting the occurrence of rock bursts and protecting the lives of workers and property (Zhao et al. 2023; Pang et al. 2023; Fu-xing et al. 2015; Fu-hua et al. 2014; Liu and Gao 2012).

There are two major sources of error in the location of coal mine seismic sources. The first is that the first-arrival picking accuracy is not high enough, and the second is that the wave velocity model does not match the actual geological structure (Xiao-hu et al. 2011; Lin et al. 2010). Therefore, the purpose of this study was to develop a high-accuracy mine seismic source location algorithm that is applicable to small-scale localized geological structures such as mines. The three-dimensional tomography of the small-scale geological structures was vectorized, thereby constructing a vector geological model. Many first-arrival picking stations were distributed in the small-scale area (Linli et al. 2023; Shaoquan et al. 1993), and a large amount of first-arrival difference data were used to offset the first-arrival picking error. In this way, a system of analytical differential equations was established, and the analytical definite solution conditions were obtained, thereby establishing a new system for the location of a mine seismic source.

The location of a seismic source in a mine is an ill-posed problem. Therefore, when establishing the analytical differential equations, i.e., the definite solution conditions, the numerical stability of the algorithm needs to be taken into account. The numerically ill-posed problem was converted into a well-conditioned problem, or the numerical ill-posedness was reduced as much as possible. The aim of this study was to provide a solution to the numerical ill-posedness of the seismic location problem in coal mines.

In a system, if a small perturbation of parameter A causes a large perturbation of parameter B, then B is sensitive to A. For a numerical problem, if a small error in the input data results in a large error (generally > 10 times) in the solution, the numerical problem is considered to be ill-posed. The larger the fluctuation is, the more serious the ill-posedness is. The location of mine seismic sources is not only an inverse problem with multiple solutions but also a highly ill-posed problem. This is because of the complexities of the vector geological model and the nonlinear differential equations, as well as the fact that the wave velocity and coordinates at any point are not decoupled. Therefore, a special numerical algorithm needs to be designed from the perspective of differential equations to solve the multi-solution and highly ill-posed problem.
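The sensitivity criterion above can be illustrated with a small linear system. The following is a minimal sketch only: the 2 x 2 matrix and the perturbation are hypothetical values chosen for illustration and are not taken from any mine location problem. It compares the relative error in the input with the relative error in the solution and reports the amplification factor next to the condition number of the matrix.

```python
import numpy as np

# Hypothetical nearly singular system, chosen only to illustrate sensitivity.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
b = np.array([2.0, 2.0001])            # exact solution is x = (1, 1)
b_noisy = b + np.array([0.0, 1e-4])    # small perturbation of the input data

x = np.linalg.solve(A, b)
x_noisy = np.linalg.solve(A, b_noisy)

# Relative input error, relative output error, and their ratio
rel_in = np.linalg.norm(b_noisy - b) / np.linalg.norm(b)
rel_out = np.linalg.norm(x_noisy - x) / np.linalg.norm(x)
amplification = rel_out / rel_in

# The condition number bounds the worst-case amplification
cond = np.linalg.cond(A)
print(f"amplification = {amplification:.3g}, condition number = {cond:.3g}")
```

In this sketch the amplification is far above the factor of 10 mentioned in the text, so the system would be classed as ill-posed in that sense.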

We believe that there is a natural coordinate system in the geological structure of mines. The coordinate system is naturally bound to the wave velocity structure of the mine and they are not decoupled from each other. Another coordinate system is an artificial coordinate system, which we used to establish the velocity structure model and differential equations. Therefore, densely distributed stations were placed in small-scale mines to reduce the first-arrival error; however, this was not the focus of this study. This study focused on three aspects: (1) establishing a vector geological model that best fits the natural coordinate system; (2) establishing a nonlinear differential equation system that is compatible with the vector geological model; and (3) using key technologies to deal with the problems of multiple solutions and numerical ill-posedness.

In this study, we made a few improvements to existing models for seismic source location in coal mines. Specifically, in view of the multiple solutions and ill-posed nature of the inverse problem, the problem was abstracted into an analytical model, analytical equations were established in the process of solving the problem, and an analytical solution was finally obtained. These improvements overcome the bottlenecks of traditional numerical and optimization methods for gridded media.

To obtain real-time data and information about geotechnical and underground engineering events, we must rely on digitization of the geological structures. To achieve digitization, an important and practical approach is to establish a vector geological model.

The vector geological model is used to replace the traditional gridded medium, and high-precision and high-efficiency analytical models are used to replace traditional numerical models. These are key prerequisites for realizing real-time dynamic monitoring and location of microseismic events.

Seismic rays are a high-frequency approximation of seismic waves. The actual path of the seismic wave is a Fresnel zone with both microstructural fractal characteristics and elastic–plastic fuzzy characteristics. The underground rock and soil mass is complex, and accurate seismic source location requires consideration of the microstructure fractal characteristics and elastic–plastic fuzzy characteristics. Academically, this could be a future research direction in this area. Notably, geological vectorization is an important carrier of these computation theories.

Mine seismic location technology is a technical method used to infer the spatiotemporal position of the seismic source based on the differences in the signal arrival time obtained by the observation system. This is a typical inverse problem.

Simons, King, and others (Simons and Menzies 1975; King 1971; Haojiang et al. 1997; Xiating et al. 2009; Huang 2011) held that a wide range of underground rock and soil masses can be represented by multi-layer horizontal or inclined media. Therefore, we started from the simplest model, i.e., a multi-layer horizontal or inclined velocity structure model, to develop the mine seismic location algorithm.

Wang and Zhang developed a seismic source location method applicable to small-scale geological structures based on the assumption of typical horizontal or inclined transversely isotropic media (Shuai 2023, 2016; Shuai et al. 2018; Xiang-dong et al. 2015, 2014).

1.2 Traditional seismic source location in large-scale geological structures

The seismic source location method is described as follows (Hu 2006; Shen et al. 2000; Chen 2007; Ministry of Housing and Urban–Rural Development of the People's Republic of China. Code for Seismic Design of Buildings, 2010; Zhang and Xie 1989; Wang et al. 2007; Liao 2002; Qian and Yin 1980; Zhuang 2006; Wang 2011; Li 2005; Tan et al. 2017; Li and Bai 2012; Chao 2009; Bo 2012; Jiang et al. 2005; Huaye 1999; Mengtao 1993; Xu et al. 2004). Based on the waveform parameters and first-arrival time of seismic waves picked up by stations, the spatial coordinates of the seismic source and the time of the seismic perturbation are determined, and sometimes, the velocity structure of the medium is determined. Finally, an evaluation of the solution is formulated.

Developing seismic location methods and improving the seismic location accuracy has been an important topic in seismic science. Seismologists are constantly improving or proposing location methods in the hope of obtaining a higher location accuracy. The accuracy of seismic location can be affected by a number of factors. Zhu et al. (Yue and Xiaofei 2002) conducted a comprehensive analysis of the possible causes of the errors in seismic location. They concluded that the main factors included the layout of the observation stations, seismic phase identification, arrival readings, structure of the crust, and location algorithms.

Seismic location can be divided into two categories based on actual applications, namely, fine location for research purposes and rapid location for seismic monitoring. Fine location methods can yield high-quality seismic location results, which is of great significance to research on seismology, Earth's interior physics, and engineering seismology. In comparison, rapid location methods based on observation stations can enhance the timeliness of earthquake warning and the accuracy of rapid reporting data, which are crucial for emergency response, post-earthquake disaster reduction and relief, and earthquake trend prediction. With the introduction of earthquake early warning, higher requirements have been put forward for seismic location, especially for the speed and accuracy of location by the observation station network or by individual stations. Improving the precision and accuracy of seismic phase picking during real-time observations is the foundation for rapid automatic seismic source location. Therefore, high-quality seismic phase identification technology is needed. Future research directions involve seismic location based on three-dimensional crustal structure models and the development of rapid, high-precision, and high-quality seismic location methods (Zhu and Zhao 1997).

From the mathematical perspective, the essence of the seismic source location problem is to determine the minimum of the objective function. Various location methods have been proposed using different construction, processing, and minimization methods of the objective function. Numerical computations typically suffer from the following problems: the calculation of the travel time, the calculation of the partial derivatives, and the inversion of the equations. The fact that the observation stations are distributed on the surface leads to certain challenges in deep location. Various location methods have been proposed to address some of these problems. Each has its own advantages and disadvantages and different application fields (Yang et al. 2005).

Early seismic location was mainly based on the geometric mapping method. This method can be traced back to the advent of seismographs, and a variety of classic methods have been developed (Li 2014). In the past three decades, due to the rapid development and widespread application of computer technology, automatic location methods based on intelligent numerical calculations have developed rapidly and have become the primary method in current practice. Shanbang Li pioneered China’s seismic location work at the Jiufeng Seismic Observatory in Beijing in 1930. In 1953, large-scale observation data from multiple stations were used to determine the epicenter. Among the commonly used computer location methods, the Geiger method, proposed by the German physicist Geiger in 1912, is the most popular and classic method (Fu and Liu 1991), and it is the origin of computational location. Most existing location methods were developed based on the Geiger method, such as the joint location method in absolute location, the main event location method, and the double difference method in relative location. The Geiger method and its derivative algorithms are all time-domain location methods. A different location approach is based on the spatial domain, such as the station-couple time difference method.

1.2.1 Absolute location methods

(1) Single event location method

Existing linear location methods were mostly derived from the Geiger method. In essence, this method establishes a nonlinear relationship between the differences in the observed and calculated travel times and the undetermined seismic source parameters. Then, the nonlinear problem is linearized, and the least squares principle or the weighted least squares principle is used to establish a system of linear equations. First, a tentative solution that is close to the true solution is substituted into the equations, and then, the problem is iteratively solved using numerical methods.
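The linearize-and-iterate idea described above can be sketched in a few lines. This is a minimal illustration under loudly stated assumptions: a homogeneous medium with a known velocity `v`, noise-free arrival times, and a hypothetical six-station layout. The function name `geiger_locate` and all numerical values are illustrative and do not come from the original programs.

```python
import numpy as np

def geiger_locate(stations, t_obs, v, m0, n_iter=20, tol=1e-9):
    """Linearized least-squares iteration for m = (x, y, z, t0).

    Assumes a homogeneous medium with known velocity v (an illustrative
    simplification, not the layered models discussed in the text).
    """
    m = np.asarray(m0, dtype=float)
    for _ in range(n_iter):
        d = stations - m[:3]                 # vectors from trial source to stations
        dist = np.linalg.norm(d, axis=1)
        r = t_obs - (m[3] + dist / v)        # observed minus calculated arrival times
        # Jacobian: dt/d(x, y, z) = -(s - x) / (dist * v), dt/dt0 = 1
        G = np.column_stack([-d / (dist[:, None] * v), np.ones(len(stations))])
        dm, *_ = np.linalg.lstsq(G, r, rcond=None)   # least-squares correction
        m = m + dm
        if np.linalg.norm(dm) < tol:
            break
    return m

# Hypothetical station layout (m), velocity (m/s), and true source (x, y, z, t0)
stations = np.array([[0.0, 0.0, 0.0], [1000.0, 0.0, 0.0], [0.0, 1000.0, 0.0],
                     [1000.0, 1000.0, 0.0], [500.0, 500.0, -400.0],
                     [200.0, 800.0, -600.0]])
v = 4000.0
true_m = np.array([620.0, 340.0, -250.0, 0.05])
t_obs = true_m[3] + np.linalg.norm(stations - true_m[:3], axis=1) / v

# Tentative solution close to the true one, as the method requires
estimate = geiger_locate(stations, t_obs, v, m0=[600.0, 400.0, -300.0, 0.0])
print(estimate)
```

The trial solution is deliberately placed near the true source. The dependence of this kind of iteration on the initial value is the weakness discussed later in this subsection.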

In the 1970s, with the rapid application of computers, Geiger's ideas were widely accepted for seismic source location. Lee et al. proposed the HYPO71, HYPO78–81 series of programs (Geiger 1912), which have been developed into the HYPOELLIPSE location program (Lee et al. 1975), which is widely used today. Zhonghe Zhao participated in the development of versions 80 and 81 of this program. Yan Jiang wrote the BLOC 86 program in 1986. After many modifications and additions, the NC 91 seismic location and magnitude calculation program was developed. This program is behind the earthquake data reports (EDR) of the U.S. National Earthquake Information Center (NEIC), the Interim Report of the China Seismological Observatory, and the China Earthquake Annual Report published by the Institute of Geophysics, China Earthquake Administration.

After Backus and Gilbert proposed a new inversion theory, Klein proposed the HYPOINVERSE algorithm (John et al. 1999). Lienert et al. further proposed the HYPOCENTER algorithm (Klein 1978). Nelson and Vidale improved the HYPOINVERSE algorithm and proposed the QUAKE3D method based on a three-dimensional velocity model (Lienert et al. 1986). In China, classic methods have also been widely used. Zhao used HYPO81 for the Beijing observation station network (Nelson and Vidale 1990), and Wu et al. (Zhonghe 1983) and Zhao et al. (MINGXIup W, et al. 1985) used the classic method for the location of the Luquan earthquake and Lingwu earthquake sequences, respectively.

However, the Geiger method is largely dependent on the initial value. Sometimes, this can result in divergent location or even no location results. In actual seismic observations, if the stations cover the epicenter, this method can yield accurate location results. However, if the station distribution is not optimal, for instance, the earthquake is on one side of the stations, then the equation becomes ill-posed and the least squares method is unstable, resulting in an incorrect source location or solution divergence. In view of these problems, many scholars have proposed various improvement methods. (1) Many methods can be used to conduct inversion of the linear equations obtained using the least squares principle. For example, when the coefficient matrix of a linear equation system is singular or close to singular, it can cause instability and divergence in the iteration process. In this case, the singular value decomposition (SVD) method can be used to obtain the estimated solution, and the resolution and error of the solution can also be obtained. When the matrix is large, the conjugate gradient method can be used to solve the equations. (2) In order to improve the stability of the numerical calculations, methods such as centering, scaling, and damped least squares are usually used (Klein 1978). (3) The premise of the least squares method (L2 criterion) is that the differences follow a Gaussian distribution. Yet, this requirement is often not met. In this case, the L1 criterion can be used, that is, the sum of the absolute values of the differences, rather than the sum of squares of the differences, is used to establish the functional relationship with the source parameters to be determined, thereby reducing the impact of large differences (Zhao et al. 1992).
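The stabilizing effect of damping on a nearly singular system can be sketched through the SVD. The example below is illustrative only: the matrix `G`, the residual vector `r`, and the damping value are hypothetical, and the helper name `damped_step` is not taken from any of the cited programs. It computes the damped least-squares correction by replacing the factor 1/s for each singular value s with s/(s^2 + damping^2).

```python
import numpy as np

def damped_step(G, r, damping):
    """Solve min ||G dm - r||^2 + damping^2 ||dm||^2 through the SVD of G."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    # Filter factors s / (s^2 + damping^2) replace 1 / s and tame tiny singular values
    coeff = s / (s**2 + damping**2) * (U.T @ r)
    return Vt.T @ coeff

# Hypothetical Jacobian with nearly collinear columns (close to singular)
G = np.array([[1.0, 1.0],
              [1.0, 1.0001],
              [1.0, 0.9999]])
r = np.array([0.1, -0.1, 0.05])      # hypothetical travel-time differences

step_plain = damped_step(G, r, 0.0)   # ordinary least squares
step_damped = damped_step(G, r, 0.1)  # damped least squares

print(np.linalg.norm(step_plain), np.linalg.norm(step_damped))
```

With zero damping the correction is dominated by the smallest singular value and becomes very large, whereas the damped correction stays bounded. The price is a bias toward small steps, so the damping value is a tuning choice rather than a fixed constant.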

(2) Multi-event location methods

Multi-event location methods jointly determine multiple seismic sources and parameters such as the station correction or velocity model in order to reduce the errors caused by replacing complex crustal structures with simple velocity models while improving the location efficiency.

(a) Joint inversion of source location and station correction (joint epicenter determination, JED; joint hypocenter determination, JHD)

Assuming that there are m events (sources) and n stations, for each station, a station correction is introduced to compensate for the error caused by the simplification of the velocity model. For all of the events and stations, corresponding algorithms are used to jointly determine the source positions of the m events and the corrections of the n stations.

Douglas first proposed the joint epicenter determination (JED) theory in 1967 (Prugger and Gendzwill 1988), and later, Dewey expanded it into the joint hypocenter determination (JHD) method, which includes source depth determination (Douglas 1976). In order to solve the problem of large matrices due to large m and n values, in 1983, Pavlis and Booker proposed the penalized maximum likelihood estimation (PMLE) method of parameter separation (Dewey 1972), which was simplified by Pujol (Pavlis and Booker 1983; Pujol 1988). Wang et al. (Pujol 2000) used the JHD and parameter separation methods to obtain P-wave travel time corrections for each station based on the first-arrival P-wave travel time data in the Kunming area, which greatly improved the location accuracy.

(b) Joint inversion of source location and velocity structure (simultaneous structure and hypocenter determination, SSH)

In 1976, Crosson proposed the simultaneous structure and hypocenter (SSH) joint inversion theory (Wang et al. 1993). Since the SSH method does not require calibration of the wave velocity and can obtain abundant information about the velocity structure, it is currently widely used for seismic location. Compared with the JED, the SSH does not involve station correction, and the velocity structure is used as an unknown parameter to be inverted simultaneously with the seismic source, thereby avoiding the potential errors caused by the artificial velocity model.

On the theoretical basis of the joint inversion of the one-dimensional velocity structure and seismic source, Aki et al. gridded the horizontal non-uniform velocity structure of the Earth and proposed a method for the joint inversion of a three-dimensional velocity structure and seismic source in 1977 (Crosson 1976; Aki and Lee 1976). However, joint inversion using a single set of equations requires a large volume of computations. Pavlis and Booker (Aki et al. 1977) and Spencer and Gubbins (Pavlis and Booker 1980) used the parameter separation method to solve the coupled velocity parameters and source parameters, which greatly improved the calculation efficiency. Regarding Chinese scholars, Zhao established a new seismic wave velocity model (MDBJ81) in 1983 to deal with the sparse distribution of stations in Beijing, and this model was combined with the SSH method to improve the capabilities of the observation stations in Beijing (Spencer and Gubbins 1980). In addition, Liu introduced the orthogonal projection operator to achieve parameter separation and proposed a method of sequential orthogonal triangulation using the block structure of the matrix, which reduced the volume of the computations (Zhao 1983). Furthermore, Li and Liu improved the SSH by applying the latest research results on the three-dimensional velocity structure. Considering the balance of the equation system, the accuracy of the focal depth and earthquake onset time were improved (Futian 1991). Sun et al. used the damped least squares method to determine the source location and velocity parameters (Li and Liu 1991). In addition, Guo et al. (Sun et al. 1986), Zhao et al. (Gui-an and Rui 1992), and Zhu et al. (Zhao et al. 1993) used the SSH method to determine seismic sources in different scenarios.

However, the above multi-event location methods have certain problems. For example, the station correction method of the JHD is oversimplified and cannot reflect the complex structure of the Earth’s crust. Additionally, the three-dimensional velocity model in the SSH method results in a large volume of computations. By combining the ideas of these two methods, a few events can be selected for absolute location using the SSH method, which reduces the volume of the computations.

1.2.2 Relative location method

The relative location method can effectively reduce the errors associated with an insufficient understanding of the crustal structure. That is, the travel time difference between two events to the same station is only determined by the relative positions of the two events and the wave velocity between them, and it has nothing to do with the wave velocity from the event location to the station. It has been used to study shallow seismic activities related to reservoir faults (Fitch and Muirhead 1974) and the seismic depth in the Middle East (Jackson and Fitch 1979). Ma (1982) was the first to use the relative location method in China. Zhou et al. (Zhu et al. 1995) improved this method by introducing reference stations. In order to prevent the errors in the main shock from being transmitted to undetermined earthquakes, Got (1994) proposed the multiple relative location method. Deichmann et al. (1992) performed waveform cross-correlation to correct the travel time difference readings for a group of earthquakes clustered in space and time. Lin (2000) referred to this method as the improved main event location method. Waldhauser et al. improved this method, proposed the double-difference earthquake location algorithm (DDA), and applied it to the Northern Hayward Fault in California (Zhou et al. 1999). Based on the DDA and standard tomography technology, Clifford et al. (2003) developed the DDA tomography location method. The results of this method in practical applications were superior to those of the DDA method. Yang et al. conducted research using the double difference location method, which yielded good relocation results (Waldhauser and Ellsworth 2000). Bai et al. (2003) used the multiple relative location method to determine the location of the aftershocks of the Shunyi earthquake that occurred on December 16, 1996. Yang et al. (2002) used the main event location method to determine the location of the 1998 Zhangbei-Shangyi earthquake.

(1) Main event method (arrival time difference, ATD)

The relative location method was developed based on the JED and has been a classic and popular method. Spence presents a detailed elaboration of this theory (Yang et al. 2004). The basic principle of the arrival time difference (ATD) method is to select a main event with a relatively accurate position, calculate the positions of a group of events occurring near the main event, and then calculate the source positions of the group of events.

The relative location method eliminates the error caused by the velocity model by introducing the arrival time difference. It has many advantages. For instance, since the main event and the events to be determined are very close to each other, iteration is not needed. Moreover, the arrival time differences are not required for the main event or the event to be determined. The errors in the relative position and relative arrival time obtained using this method are 30% smaller than those of the classic method. However, the determination of the absolute position and absolute arrival time depends on the main event.

Zhou et al. made great improvements to this method (Zhu et al. 1995). Specifically, they avoided direct solving of the earthquake onset time. After determining the seismic source, the arrival time is calculated based on the propagation velocity and distance, and the first wave arrival time is used to determine the depth.

The relative epicenter position obtained via relative location has a relatively high accuracy. For the main event, the improved classic method can be used for single event location. The combination of the two can provide good location results.

(2) Double difference algorithm

In 2000, Waldhauser and Ellsworth proposed the double difference earthquake location algorithm (Zhou et al. 1999), referred to as the double difference algorithm (DDA). Specifically, two events are selected to form an event pair, and the difference between the calculated and observed arrival times of each event at a certain station is calculated. The differences between the two events in an event pair are then subtracted. Next, the above steps are repeated for all stations and event pairs, thereby determining the absolute position of the seismic source.

An advantage of the DDA is that it uses the cross-correlation analysis method in the spectral domain to obtain the arrival time difference of the events, which greatly improves the accuracy of the arrival time. Different from the relative location method, the distance between the event pairs does not matter, which greatly enhances the applicability of the method. If the arrival time differences of multiple phases are used, the location accuracy could be further improved. In addition, the algorithm has a strong anti-interference ability and robustness. Currently, this is a promising location method.

In the DDA, when the event pairs are close to each other, the equation can be simplified to obtain the relative distance between the event pairs, thereby simplifying the algorithm and reducing the calculation time.
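The double-difference quantity described above, i.e., the observed arrival time difference of an event pair at one station minus the calculated difference, can be written out directly. The sketch below assumes a homogeneous medium and uses hypothetical event coordinates, a hypothetical station, and a hypothetical unmodeled delay near the station; the function names are illustrative and are not part of any published DDA code.

```python
import math

def arrival_time(event, station, v):
    """Arrival time for event = (x, y, z, t0) in a homogeneous medium (assumed)."""
    return event[3] + math.dist(event[:3], station) / v

def double_difference(obs_i, obs_j, trial_i, trial_j, station, v):
    """(t_i - t_j) observed minus (t_i - t_j) calculated at one station."""
    cal = arrival_time(trial_i, station, v) - arrival_time(trial_j, station, v)
    return (obs_i - obs_j) - cal

v = 4000.0                                   # assumed velocity (m/s)
station = (1000.0, 0.0, 0.0)
event_i = (620.0, 340.0, -250.0, 0.05)       # hypothetical neighboring events
event_j = (640.0, 330.0, -260.0, 0.90)

# Hypothetical unmodeled delay near the station, shared by both events
station_delay = 0.03
obs_i = arrival_time(event_i, station, v) + station_delay
obs_j = arrival_time(event_j, station, v) + station_delay

# Residual of a single event keeps the delay; the double difference cancels it
single_residual = obs_i - arrival_time(event_i, station, v)
dd_true = double_difference(obs_i, obs_j, event_i, event_j, station, v)
print(single_residual, dd_true)
```

The single-event residual retains the full delay, while the double difference evaluated at the true positions is zero to rounding error. This is the cancellation of common path errors that relative methods exploit.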

(3) Applicable conditions of the relative location methods

The ATD and DDA are both relative location methods, yet their establishment conditions and application scopes are different. The results of the ATD depend on the location of the main event. The DDA does not require a main event, and the results do not depend on the initial source location. The ATD is only suitable for precise seismic location in a small spatial range, whereas the DDA can be used for precise seismic location in a large spatial range.

1.2.3 Location methods in the spatial domain, i.e., station-couple time differences

The above location methods are all time-domain methods based on the Geiger method. Based on the processing of the arrival time differences, the four source parameters are not completely independent of each other, and the location results depend on the velocity structure and observation station distribution. In order to overcome the above shortcomings, a location method in the spatial domain was developed by replacing the arrival time differences with the distance differences. The equation only involves the epicenter location, depth, and earthquake onset time, which are solved separately, avoiding mutual compromise of these parameters and achieving a high location accuracy. Lomnitz (Spence 1980) and Carza et al. (Lomnitze 1977) used this method for teleseismic location, and the error in the epicenter location was 8–20 km.

In 1957, Romney proposed the station-couple time difference method for the location of local earthquakes (Carza et al. 1979). The arrival time differences and average apparent surface velocities of two stations (i.e., a station couple) with similar arrival times and short distances were utilized to establish a distance difference equation. The condition number of the equation is low, and the equation is easy to solve. Moreover, the location result has little dependence on the structure. However, it is not very effective for the determination of the focal depth and earthquake onset time. Zhao and Zeng improved this method (Romney 1957). Specifically, the focal depth was determined using the slope of the arrival time curve, and the earthquake onset time was determined based on the intercept of the arrival time curve with the time axis. Ding and Zeng used the station-couple time difference method to determine the epicenter in the Beijing-Tianjin-Tangshan area. The focal depth was determined using the arrival times of the different earthquake phases (Zhao and Zeng 1987).

1.2.4 Nonlinear methods

Both single-event and multi-event location methods are linear methods based on the Geiger method, and these linear methods can cause various problems in many cases (Zhifeng and Rongsheng 2011). For example, when dealing with nonlinear expressions of the differences and undetermined parameters, the linearization method may not be reasonable. If the initial value of the iteration is not appropriate, linear iteration can cause the solution to fall into a local minimum. Nonlinear methods are an effective approach for avoiding these problems.

(1) Newton's method

When dealing with the nonlinear expression of the difference between the observed and calculated travel times and the undetermined parameters, Thurber proposed the use of the nonlinear Newton’s method containing second-order partial derivatives to avoid the problems encountered by the Geiger method (Lee and Stewart 1981). Both uniform and multi-layer velocity models have the following problems: for a shallow earthquake with a focal depth close to the surface, the second-order partial derivative tends to be the maximum, while the first-order partial derivative is close to zero. Hence, the second-order partial derivative is extremely important. In addition, when the seismic source is outside the station network, the introduction of the second-order partial derivative improves the stability of the algorithm. Since the uncertainty in the focal depth location comes from the correlation between the earthquake onset time and the focal depth in the linear method and the second-order partial derivative is more sensitive to changes in the focal depth than the first-order partial derivative, the introduction of the second-order partial derivative reduces the correlation and improves the stability of the algorithm. It should be noted that for multi-event location or three-dimensional velocity structures, the introduction of the second-order partial derivative greatly increases the volume of the computations. Thurber presented a nonlinear least squares solution based on Newton’s method.

(2) Global search methods

Various global search algorithms have been developed for nonlinear optimization and have also been used in seismic location. Finding the minimum of the objective function can prevent the solution from falling into a local minimum, and there is no restriction on the form of the objective function. However, the efficiency of this method is generally low.

Prugger and Gendzwill (Zhao et al. 1992) and Zhao et al. (Thurber 1985) used the simplex method for seismic location in 1988 and 1994, respectively. The simplex method is a simple algorithm and does not require partial derivatives or inverse matrices, yet it cannot determine the resolution and error of the solution. In addition, other global search methods have been used in seismic source location, such as Monte Carlo (Zhao et al. 1994), simulated annealing, and genetic (Billing et al. 1994; Xie et al. 1996) algorithms.

(3) Powell method

The Powell method (Wan et al. 1997) is a global search method. As an effective method for directly searching for the minimum of the objective function, it is a modified conjugate gradient method. The Powell method does not require partial derivatives or inverse matrices and has low requirements for the initial iteration value. Generally, the position of the station with the earliest arrival is used as the initial value. It has certain advantages in automatic and rapid seismic forecasting.

In 1979, Tang used the Powell method for seismic location (Powou 1964). The Powell method itself cannot give an error estimate. Wang et al. (Tang 1979) used a numerical experiment method to randomly perturb the theoretical travel time to obtain the relationship between the root mean square residual error and the source position error. In addition, Yan et al. (Wang et al. 1994) and Wang et al. (Yan and Xue 1987) applied the Powell method to determine the locations of earthquakes in the Three Gorges area and the Qinghai-Tibet Plateau, respectively.

(4) Bayesian method

This method was developed based on the Bayesian evaluation theory, that is, from a statistical point of view, the optimal values of the model parameters maximize the probability of the observation data. In the 1980s, Tarantola and Valette (Wang et al. 2000) and Jackson and Matsu’ura (Tarantola and Valette 1982; Matsu'ura 1984) proposed strict equations and solutions for the Bayesian method.
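The statistical viewpoint above can be illustrated with a posterior probability evaluated on a grid of trial epicenters. The sketch rests on strong assumptions that are not taken from the cited formulations: Gaussian picking errors with standard deviation `sigma`, a uniform prior over the grid, a known origin time of zero, a source at the station level, and a homogeneous velocity. The function name and all values are hypothetical.

```python
import numpy as np

def epicenter_posterior(stations, t_obs, v, xs, ys, sigma):
    """Posterior over a grid of epicenters (Gaussian errors, uniform prior assumed)."""
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    # Distance from every grid node to every station: shape (nx, ny, n_stations)
    dist = np.sqrt((X[..., None] - stations[:, 0])**2 +
                   (Y[..., None] - stations[:, 1])**2)
    misfit = np.sum((t_obs - dist / v)**2, axis=-1) / (2.0 * sigma**2)
    post = np.exp(-(misfit - misfit.min()))   # subtract the minimum to avoid underflow
    return post / post.sum()                  # normalize so the grid sums to one

# Hypothetical surface stations (m), velocity (m/s), and true epicenter
stations = np.array([[0.0, 0.0], [1000.0, 0.0], [0.0, 1000.0], [1000.0, 1000.0]])
v = 4000.0
true_xy = np.array([600.0, 300.0])
t_obs = np.linalg.norm(stations - true_xy, axis=1) / v   # origin time taken as zero

xs = np.arange(0.0, 1001.0, 100.0)
ys = np.arange(0.0, 1001.0, 100.0)
post = epicenter_posterior(stations, t_obs, v, xs, ys, sigma=0.01)
peak = np.unravel_index(post.argmax(), post.shape)
print(xs[peak[0]], ys[peak[1]])
```

Unlike a single best-fit point, the normalized grid also shows how probability spreads around the peak, which is one way to express the location uncertainty.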

1.2.5 Other location methods

(1) Non-iterative method

The non-iterative method is simple and has low requirements for the user. It is an important algorithm and was the simplest type of seismic source location algorithm in the early stage. In the 1920s, Inglada (Jackson and Matsu'ura 1985) proposed a seismic source location algorithm, referred to as the Inglada method. This algorithm uses a minimum number of sensors (three for a two-dimensional plane, and four for a three-dimensional space) for seismic source location. This method not only leads to multiple solutions but also does not introduce any optimization method. In addition, the Inglada method only uses one wave velocity model for seismic source location, which greatly limits the application of the method.

In the early 1970s, researchers from the United States Bureau of Mines (USBM) proposed a novel non-iterative seismic source location algorithm called the USBM method (Inglada 1928; Leighton and Blake 1970). First, it separates the earthquake onset time to obtain a new set of linear equations. In addition, the USBM method uses an optimization analysis method in the location solution, which improves the location accuracy to a certain extent. Owing to these advantages, the USBM method quickly became the most widely used method in North America. However, like the Inglada method, it uses only one wave velocity model for seismic source location.
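The idea of separating the onset time to obtain linear equations can be illustrated with a minimal sketch. It assumes a single uniform wave velocity v and differences the squared distance equation of each sensor against that of a reference sensor, which leaves a system that is linear in (x, y, z, t0) and can be solved without iteration. The function names `solve_linear` and `locate_linear`, the sensor layout, and the velocity are illustrative assumptions and do not reproduce the original USBM implementation.

```python
import math

def solve_linear(a, b):
    """Solve a square system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: check the sensor geometry")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def locate_linear(sensors, arrivals, v):
    """Return [x, y, z, t0] from five sensors in a uniform medium (assumption).

    Each row is the squared distance equation of one sensor minus that of the
    reference sensor (index 0); the quadratic terms in the unknowns cancel.
    """
    s_ref, t_ref = sensors[0], arrivals[0]
    rows, rhs = [], []
    for s, t in zip(sensors[1:5], arrivals[1:5]):
        rows.append([2.0 * (s[k] - s_ref[k]) for k in range(3)]
                    + [-2.0 * v * v * (t - t_ref)])
        rhs.append(sum(c * c for c in s) - sum(c * c for c in s_ref)
                   - v * v * (t * t - t_ref * t_ref))
    return solve_linear(rows, rhs)

# Synthetic check with hypothetical values: arrivals generated from a known source.
sensors = [(0.0, 0.0, 0.0), (1000.0, 0.0, 0.0), (0.0, 1000.0, 0.0),
           (0.0, 0.0, -500.0), (1000.0, 1000.0, -200.0)]
true_source, true_t0, velocity = (300.0, 400.0, -600.0), 0.25, 4000.0
arrivals = [true_t0 + math.dist(s, true_source) / velocity for s in sensors]
print(locate_linear(sensors, arrivals, velocity))
```

With exactly five sensors the system is square; with more sensors the same rows could be solved in a least squares sense, which is where an optimization step would enter.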

(2) GMEL method

The grid-search multiple-event location (GMEL) method (Zhu and Zhao 1997) is a new multi-event location method based on a grid search. The GMEL is an extension of the earlier single event method, the GSEL. It uses a grid search to solve the problem and is suitable for multiple stations and multiple seismic events. By picking the phases of multiple seismic events, the arrival time and seismic source location are determined.
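A grid search of this kind can be sketched for the simpler single-event case. The example below is only an illustration under stated assumptions: a uniform velocity, a regular grid of trial source positions, and an origin time that is eliminated at each node by taking the mean arrival-time residual. The function name `grid_search_locate` and all numerical values are hypothetical, and the multiple-event formulation of GMEL is not reproduced.

```python
import math

def grid_search_locate(stations, arrivals, v, axis_values):
    """Scan all grid nodes and return (misfit, (x, y, z), t0) of the best node.

    For a trial node, the residual of each station is the observed arrival
    time minus the straight-ray travel time; the mean residual is the best
    origin time for that node, and the misfit is the sum of squared
    deviations from that mean.
    """
    best = None
    n = len(stations)
    for x in axis_values[0]:
        for y in axis_values[1]:
            for z in axis_values[2]:
                resid = [t - math.dist((x, y, z), s) / v
                         for s, t in zip(stations, arrivals)]
                t0 = sum(resid) / n
                misfit = sum((r - t0) ** 2 for r in resid)
                if best is None or misfit < best[0]:
                    best = (misfit, (x, y, z), t0)
    return best

# Hypothetical layout: five stations, 100 m grid spacing, uniform velocity.
stations = [(0.0, 0.0, 0.0), (1000.0, 0.0, -50.0), (0.0, 1000.0, -100.0),
            (1000.0, 1000.0, 0.0), (500.0, 500.0, -900.0)]
velocity = 4000.0
source, origin_time = (300.0, 700.0, -400.0), 0.5
arrivals = [origin_time + math.dist(s, source) / velocity for s in stations]
xs = [100.0 * i for i in range(11)]
zs = [-100.0 * i for i in range(11)]
misfit, node, t0 = grid_search_locate(stations, arrivals, velocity, (xs, xs, zs))
print(node, t0)
```

The resolution of the result is bounded by the grid spacing, so in practice the search is usually repeated on a finer grid around the best node.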

(3) EHB method

In 1999, Engdahl et al. proposed the Engdahl, van der Hilst, and Buland (EHB) method for global teleseismic location (Leighton and Duvall 1972). A variety of seismic phases were used, including P, S, PKiKP, PKPdf, pP, pwP, and sP. A global velocity model suitable for later-arriving seismic phases was obtained to confirm the teleseismic depth phases (pP, pwP, and sP). Compared with the location results of the International Seismological Centre (ISC) and National Earthquake Information Center (NEIC) methods, the accuracies of the epicenter and the focal depth determined by the EHB method are significantly higher. The EHB method can be used for rapid location, as well as for tomography and studying the Earth's internal structure.

(4) Intersection method

The intersection method (Engdahl et al. 1998; Zhao et al. 2008; Ai-Hua et al. 2015) is an intuitive, efficient, and stable method and has been widely used. It uses source trajectories for location and does not require solving equations. Even if there are only a small number of earthquake records, valuable information about the epicenter can still be obtained using the intersection method. However, the accuracy of the intersection method is low, especially regarding the focal depth. Therefore, this method is currently mainly used as an auxiliary location method. The main reasons for its low accuracy are that the Earth model is assumed to be a uniform or horizontally uniform medium and the source trajectory is assumed to be circular or hyperbolic. In reality, the Earth is characterized by strong inhomogeneity both radially and transversely, and using an oversimplified velocity model will inevitably lead to large location errors. In addition, the intersection method uses the source trajectories on the surface rather than spatial trajectories for location: the epicenter is taken to be at the intersection of the surface source trajectories, and the focal depth is the depth at which the corresponding surface source trajectories intersect best. When the focal depth is not particularly small compared to the epicentral distance, even for a uniform medium model, the source trajectories on the surface will not intersect at the epicenter, but rather within a region. Determining the epicenter location within the intersection region is still an open problem.

1.3 Seismic location using three-dimensional crustal structure

In recent years, automatic numerical location methods based on modern digital seismic observation technology and scientific computing technology have developed rapidly and have become the primary method used in current seismic location. Studies have been carried out on the three-dimensional tomography of global and regional velocity structures, and the application of a three-dimensional velocity structure model for seismic location has attracted much attention. For example, Harvard University has always regarded improving the accuracy of seismic source location as a major research direction, which mainly involves the use of three-dimensional velocity structure models (Zhou and Zhao 2012).

In the past, seismic location problems have been studied based on one-dimensional crustal models, and the accuracy of seismic location has been improved using different parameter inversion algorithms and via the assignment of data weights.

Owing to the previously poor understanding of the Earth’s internal structure and the limited available data, it was not until 1940 that Jeffreys and Bullen used seismic data collected around the world to obtain an average travel time curve through statistical smoothing, which is widely known as the Jeffreys-Bullen seismological tables. The tables were widely used as soon as they were published (Chen et al. 2001). The U.S. National Earthquake Information Center, which is internationally known for providing rapid response to global earthquakes (such as the weekly Preliminary Determination of Epicenters (PDE) Catalog), and the International Seismological Centre, which collects and processes huge amounts of seismic phase data (ISC Bulletins are published monthly, usually with a lag of 2 years), still use the Jeffreys-Bullen tables for data processing (Leighton and Duvall 1972). The theoretical travel times of the main teleseismic phases obtained from the Jeffreys-Bullen tables can have errors of several seconds. For example, if the typical teleseismic P-wave travel time is 500 s, the theoretical travel time prediction of the Jeffreys-Bullen tables has an error of within 1% (i.e., 5 s) (Jeffreys and Bullen 1940). Hence, the Jeffreys-Bullen tables are widely used around the world. The TauP Toolkit for travel time calculation released by the University of South Carolina and the HYPOSAT seismic location program developed by the Norwegian seismic array NORSAR in 1997 include the Jeffreys-Bullen tables as one of the critical structure models.

The one-dimensional crustal model has been further developed into the iasp91 model and the sp6 model based on the initial Jeffreys-Bullen tables. On this basis, Kennett et al. proposed the ak135 model. Engdahl et al. (Leighton and Duvall 1972) applied it to relocate global earthquakes from 1900 to 1999, and they found that an accurate one-dimensional model could significantly improve the seismic location results.

In the era of digital seismic observation, there have been fruitful developments in seismic tomography inversion in recent years, resulting in the rapid advancement of three-dimensional tomography of global velocity structures (Lay and Wallace 1995; Su et al. 1994; Bijwaard et al. 1998), especially joint tomography of different velocities (e.g., P-wave and S-wave) (Kárason and Hilst 1999; Su and Dziewonski 1997), as well as regional three-dimensional velocity structures.

Seismic tomography aims to reveal the changes in seismic wave velocities within the Earth, which is one of the key research fields in seismology. Travel time observations in the past 40 to 50 years have confirmed lateral changes in the Earth’s structure, and in the late 1970s, three-dimensional non-uniform velocity structure inversion of seismic travel times was realized for the first time. The stable inversion method proposed by Aki et al. (Crosson 1976; Aki and Lee 1976) uses the seismic phase travel time differences to invert the three-dimensional P-wave velocity of the regional block structure model, and it has been successfully used to invert global mantle structural changes in 10° grids. Dziewonski et al. (Kennett et al. 1998) proposed a different inversion method, which uses spherical harmonic functions instead of block divisions to represent the mantle structure. This parameterization method greatly reduces the number of unknown variables in the inversion, making it possible to use a large number (up to 700,000) of P-wave travel time differences to directly calculate the inversion tomography matrix. In the early 1980s, the introduction of effective matrix solution methods greatly accelerated the inversion of the Earth’s structure (Asad et al. 1999; Dodge et al. 1996), enabling inversion tomography of a large amount of seismic data and numerous model parameters. For the first time, a global block inversion tomography with a resolution of 5° × 5° × 100 km was obtained. Almost simultaneously, spherical harmonic functions were expanded from the third order to high order. These developments led to the publication of many studies on the Earth’s internal structure.

In China, Liu et al. (Zhao 1983) in 1989 reduced seismic tomography for velocity reconstruction to the solution of a matrix equation system and introduced an orthogonal projection operator. This made the mathematical description of the joint inversion of the seismic source location and velocity structure more concise and gave it a clearer physical meaning; after parameter separation, the system of equations related to the seismic source was also compatible. Therefore, it is possible to obtain a unique solution for the seismic source location through an appropriate station layout. Regarding numerical calculations, Liu et al. proposed a sequential orthogonal triangulation method using the block matrix structure and then performed damped least squares inversion on the upper triangular matrix. This method greatly reduces the memory usage and the volume of the calculations.

The acquisition of numerous global and regional deep and shallow three-dimensional tomography data has given 3-D seismic location great application prospects and research potential in seismic imaging, fine geological structure characterization, verification, fine seismic location, fault trend inference, earthquake nucleation, and source parameter inversion (Dziewonski et al. 1977; Clayton and Corner 1983; Nolet 1985). Researchers have obtained detailed earthquake rupture processes during seismic source location.

The Ground Truth Events database established by the Experimental International Data Center (EIDC) provides a large number of nuclear explosion and earthquake catalogs with accuracies of 0, 1, 2, 5, and 10 km. These catalogs provide reliable data for verifying and improving three-dimensional seismic tomography results. The establishment of three-dimensional crustal models such as the 3SMAC model and the CRUST5.1 model (Dreger et al. 1998) provides references for three-dimensional seismic location.

Smith et al. (Mooney et al. 1998) compared the location results of the global three-dimensional P-wave velocity SP12/WM13 model, the one-dimensional Jeffreys-Bullen tables, and the PREM and iasp91 models for 26 known nuclear explosion events and 83 seismic events with good location results. Their results showed that the three-dimensional location results were significantly better than the results of the one-dimensional models, and the location error was reduced by about 40%. In addition, Antolik et al. also studied the application of three-dimensional structure models for seismic source location.

Based on three-dimensional seismic tomography, researchers from China have conducted abundant research on seismic source location in three-dimensional complex velocity structure and regional velocity structure models.

Zhao et al. (Engdahl et al. 1998; Zhao et al. 2008; Ai-Hua et al. 2015) used the minimum travel time tree ray tracing method to determine the seismic source trajectory in a complex laterally inhomogeneous medium, which can be used for seismic source location in complex velocity models.

Conventional seismic location methods usually need to pick the first arrival of seismic waves. When the first arrival is not obvious or is overwhelmed by a higher level of noise, the accuracy is low. Cao et al. (Smith and Ekstrom 1996) used 3-D Gaussian beam migration imaging for seismic source location, which avoids the need for first-arrival picking.

In view of the problems of traditional methods, such as the poor ability to deal with complex geological conditions, long processing time, and inability to automatically modify the velocity model according to additional seismic information so as to improve the accuracy of the next seismic location, Jin et al. (Cao et al. 2015) proposed a gridded seismic location method based on seismic station data, which achieved rapid seismic source location.

Tan et al. (Jin et al. 2007) and Zhao et al. (Tan et al. 2015) used perforation monitoring data and the arrival time difference to invert the velocity structure using the Levenberg–Marquardt algorithm and Occam algorithm, thus providing a foundation for micro-seismic source location.

Deng et al. (Zhong et al. 2015) combined the double-difference seismic location method proposed by Waldhauser and Ellsworth in 2000 with traditional travel time seismic tomography, and the relative travel time difference between adjacent earthquakes and the absolute travel time from the epicenter to the station were used to determine the location of the seismic source and the velocity structure. Then, double-difference tomography was conducted on the fine velocity structure of the Longmenshan fault zone.

Wang et al. (Deng et al. 2014; Wang et al. 2015) combined double-difference seismic tomography with absolute and relative P-wave arrival data to obtain the fine crustal P-wave velocity structure and source parameters of the Lushan seismic source area, and a velocity structure with a resolution of less than 5 km was obtained. In addition, seismic tomography was also conducted on the Zhaotong area to analyze the Yiliang earthquake area.

Xu et al. (Wang et al. 2014) proposed a nonlinear method called the time-reversal imaging technique to determine the seismic source location. This method is an absolute location method that avoids the problems encountered by the classic Geiger method and its absolute and relative derivative methods, including errors caused by linearization, the rationality of the least squares method, and the use of statistical methods to evaluate solutions.

Li et al. (Xu et al. 2013) used the differential evolution nonlinear global optimization algorithm (DE algorithm) to invert the crustal velocity structure model and determine the seismic source location, which provided an initial velocity model and repositioned source parameters for regional-scale seismic tomography.

Bai et al. (Li et al. 2006) analyzed the problem of matrix inversion to obtain the global solution under the global selection of the initial position of the seismic source in a three-dimensional complex velocity model. They also studied the use of the matrix inversion algorithm to obtain the global minimum in order to perform rapid and accurate seismic source location. This method is suitable for early warning of disasters such as earthquakes and tsunamis.

Li et al. (Chaoying et al. 2009) introduced a source-scanning algorithm (SSA) to study seismic location and earthquake rupture surfaces. This method used a digital seismic waveform and achieved a good location accuracy without accurate arrival time picking and calculation of the theoretical seismogram. Their results revealed that SSA is a reasonable tool for micro-seismic location and can be applied in the monitoring and location of micro-earthquakes in coal mines.

Joint inversions of regional velocity structures and seismic source locations in China [135–147] have greatly enriched regional velocity structure models in various parts of China. The three-dimensional seismic tomography of China’s continental plate has been gradually improved, providing references for accurate seismic source location based on three-dimensional velocity models.

Although global-scale 3-D seismic source location is currently still in the exploration stage and the joint inversion of the 3-D velocity structure is limited to specific areas, it is expected that the use of 3-D seismic source location on a daily basis will be achieved soon.

With the establishment of China’s digital seismic network and the continuously enriched results of velocity structure models in various regions, it is reasonable to carry out seismic source location on three-dimensional structures as soon as possible.

1.4 Problems in existing studies

After decades of development in seismology, a consensus has been reached, that is, discretization is a good way of representing a medium. There are two reasons for this. First, the velocity on the grid is determined and the velocity between grid points is obtained via interpolation, so this method is suitable for ray tracing. Second, the establishment of underground models requires seismic tomography, and the current methods are mostly based on a gridded medium. Regarding discrete grid representation, numerical calculations and optimization are needed to obtain the forward and inverse solution of the seismic source location. Existing studies have the following issues that need to be discussed.

1.4.1 The accuracy, efficiency, and timeliness of the location algorithms do not meet the high requirements for real-time monitoring of micro-seismic events

Conventional micro-seismic source location does not have high requirements for timeliness. Current computing resources can readily handle conventional mine micro-seismic monitoring and location, and hence the problem of algorithm efficiency is not prominent.

However, substandard location accuracy and computational efficiency are two of the key problems that hinder the realization of real-time monitoring of microseismic events. Complex location algorithms cause the problems of a low computational efficiency and high consumption of computing resources. This paper focuses on real-time monitoring of micro-seismic events, which has a high requirement for the timeliness of seismic source location. Therefore, the requirements for the algorithm, as an important theoretical basis, are very stringent, especially the location accuracy and calculation efficiency requirements.

1.4.2 High accuracy leads to a large volume of calculations

Previous studies have explored the balance between computational complexity and efficiency in numerical calculations and optimization. For example, the grid search method is a typical grid optimization method. Li et al. (Tan et al. 2017) stated that “one of the problems of grid-based algorithms is that the calculation accuracy is as a function of the grid spacing. Lowering the grid size to 50% in a three-dimensional medium means an 8 times increase in the calculation time. Therefore, when the requirement for the travel time accuracy is high, the calculation time may become unacceptable.”
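The factor of 8 quoted above follows from simple counting, as the short sketch below shows. It assumes a cubic domain and a calculation time proportional to the number of grid cells; the function name `cell_count` and the domain size are illustrative assumptions.

```python
def cell_count(extent, spacing, dims=3):
    """Number of grid cells in a cubic (or square) domain of side `extent`."""
    per_axis = round(extent / spacing)
    return per_axis ** dims

coarse = cell_count(1000.0, 20.0)   # 50 cells per axis
fine = cell_count(1000.0, 10.0)     # spacing halved: 100 cells per axis
print(fine / coarse)                # prints 8.0
```

Halving the spacing doubles the number of cells along each axis, so the count grows by 2 to the power of the number of dimensions: a factor of 4 in two dimensions and 8 in three.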

1.4.3 Conflict between the Runge phenomenon of high-order interpolation and the smoothness of the velocity model

In classic coal mine seismic source location methods, the velocity on the grids is determined and the velocity between grid points is obtained via interpolation. The medium discretization has been well documented in existing studies. The local smoothness requirements of the velocity models in classic algorithms are determined based on the specific algorithm, and the global high-order smoothness does not necessarily need to be satisfied in advance.
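As a minimal sketch of how a velocity between grid points can be obtained via interpolation, the example below applies trilinear interpolation to a regular grid. The nested-list layout `grid[i][j][k]`, the uniform spacing, and the function name `trilinear` are assumptions made for illustration; the original studies may use other interpolation schemes.

```python
def trilinear(grid, spacing, point):
    """Trilinear interpolation of node values on a regular grid.

    grid[i][j][k] holds the velocity at (i*spacing, j*spacing, k*spacing).
    Points outside the grid are extrapolated from the nearest cell.
    """
    shape = (len(grid), len(grid[0]), len(grid[0][0]))
    idx, frac = [], []
    for c, n in zip(point, shape):
        u = c / spacing
        i = min(max(int(u), 0), n - 2)   # clamp so that node i + 1 exists
        idx.append(i)
        frac.append(u - i)
    (i, j, k), (fx, fy, fz) = idx, frac
    value = 0.0
    # Weighted sum over the eight corners of the enclosing cell.
    for di, wx in ((0, 1.0 - fx), (1, fx)):
        for dj, wy in ((0, 1.0 - fy), (1, fy)):
            for dk, wz in ((0, 1.0 - fz), (1, fz)):
                value += wx * wy * wz * grid[i + di][j + dj][k + dk]
    return value

# Hypothetical 3 x 3 x 3 grid with 10 m spacing and a velocity that increases
# linearly along the third axis.
grid = [[[3000.0 + 15.0 * k for k in range(3)] for _ in range(3)] for _ in range(3)]
print(trilinear(grid, 10.0, (5.0, 5.0, 5.0)))
```

The interpolated field is continuous, but its first derivatives jump across cell faces, which is why linear schemes alone do not satisfy smoothness requirements on the velocity model.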

In order to improve the calculation efficiency and accuracy while reducing the volume of the calculations, in this paper, we aim to convert the grid medium into a vectorized medium within a certain range and to optimize it into a vectorized geological structure model with first-order smoothness and continuous second-order derivatives. In order to ensure the smoothness of the velocity model, high-order interpolation is used; however, the grid will be dense, which will cause the Runge phenomenon due to high-order interpolation. An effective method to avoid the Runge phenomenon is to perform linear interpolation, yet the interpolation curve is not smooth, which not only does not meet the smoothness requirements of the velocity model but also cannot guarantee the continuity of the second derivative. Therefore, there is a conflict between the Runge phenomenon of high-order interpolation in the grid medium and the smoothness requirements of the velocity model. Appropriate measures must be used to convert the grid medium into a velocity structure model that meets both requirements.
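The conflict described here can be reproduced numerically with a short sketch. It uses the classic test function 1/(1 + 25x²) on equally spaced nodes, which is a standard illustration of the Runge phenomenon and not a velocity model from this study; the function names are hypothetical.

```python
def runge(x):
    """Classic test function for the Runge phenomenon."""
    return 1.0 / (1.0 + 25.0 * x * x)

def lagrange_eval(xs, ys, x):
    """Evaluate the single high-order polynomial through all nodes."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def linear_eval(xs, ys, x):
    """Evaluate the piecewise linear interpolant through the same nodes."""
    for k in range(len(xs) - 1):
        if xs[k] <= x <= xs[k + 1]:
            w = (x - xs[k]) / (xs[k + 1] - xs[k])
            return (1.0 - w) * ys[k] + w * ys[k + 1]
    raise ValueError("x lies outside the node range")

def max_errors(n_nodes, n_test=2001):
    """Maximum error of both interpolants on [-1, 1] with equally spaced nodes."""
    xs = [-1.0 + 2.0 * i / (n_nodes - 1) for i in range(n_nodes)]
    ys = [runge(x) for x in xs]
    ts = [-1.0 + 2.0 * i / (n_test - 1) for i in range(n_test)]
    poly = max(abs(lagrange_eval(xs, ys, t) - runge(t)) for t in ts)
    lin = max(abs(linear_eval(xs, ys, t) - runge(t)) for t in ts)
    return poly, lin

for n in (11, 21):
    print(n, max_errors(n))
```

Refining the nodes makes the high-order polynomial worse near the ends of the interval, whereas the piecewise linear interpolant improves but has kinks at every node. A piecewise scheme with continuity conditions at the nodes is the usual way to obtain both properties.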

1.5 Scientific significance

The theoretical seismic source location algorithm needs to deal with the relationship between the observed arrival time difference and the seismic source coordinates. However, in practice, the seismic source location algorithm is very complex. In addition to the unknown spatiotemporal parameters (t0, x0, y0, z0), the medium velocity structure distribution V is uncertain and inaccurate, and V is a function of the spatial coordinates, that is, V = v (x, y, z). Therefore, the wave velocity in the medium and the mine seismic source coordinates are not decoupled. For the implicit condition of Snell’s equation, the ray path and seismic source coordinates are highly coupled with the three major categories of known conditions, and the known conditions and the conditions to be solved for are not decoupled. In three-dimensional structure models, this phenomenon is even more prominent.

Therefore, studying the analytical expression of the seismic source location problem using three-dimensional structure models has gradually become a hot research topic in academia.

Such research not only has important theoretical significance for improving the accuracy and efficiency of micro-seismic source location but also plays an important role in realizing real-time monitoring of micro-seismic events and has important scientific value.

2 Wavefront forward modeling method for any observation system

The wavefront forward modeling method and the joint forward modeling and inversion location method are based on similar premises. Assuming that the seismic wave propagation path is reversible, then we can swap the location of the stations with that of the source. Then, applying our travel time constraint to this spherical wave forward modeling method, we determine the wavefront distribution in a stratified medium. Because the actual seismic wave propagation path must be one of the paths connecting the points on the wavefront to the station, it should be possible to locate the seismic source by tracing a path back from the station along the corresponding points on a wavefront curve as it is projected backward in time. In this four-dimensional model space (three spatial dimensions and one temporal dimension), the source is located using four variables (x0, y0, h0, T0). To constrain these four source variables, we require four travel time equations and thus seismic data from four non-collinear stations. By combining the specific wavefront distribution equations generated using the seismic data from these four stations, the spatiotemporal coordinates (x0, y0, h0, T0) of the source can be determined.

An advantage of this method is that it does not require a specific station configuration beyond requiring at least four stations on two non-collinear lines. This technique is viable using data from both uniform and non-uniform observation systems. Moreover, the location accuracy is not affected by the approximation of the refraction point. Regardless of the station distribution, the forward modeling method always produces accurate results.

The derivation proceeds from simple to complex velocity structures: from two-dimensional to three-dimensional velocity models, from a two-layer medium to a multi-layer medium, and from a horizontally layered medium to an inclined medium. The complexity of the velocity structure model is thus increased gradually so that the derivation of the equations remains clear and logical.

2.1 Two-dimensional forward modeling of a spherical wave in a two-layer medium

In the forward modeling of spherical waves in a two-layer horizontal medium, the following geometric equations apply:

$$\cos \theta_{{1}} = \frac{{h_{1} }}{{v_{{{\text{layer}}1}} T_{{1}} }},\quad {\text{or}}\quad R_{{1}} = v_{{{\text{layer}}1}} T_{{1}} = \frac{{h_{1} }}{{\cos \theta_{{1}} }}$$
(1)

where T1 is the travel time of the shock wave from Station 1 in Medium 1, R1 is the radius of the wavefront in Medium 1 after time T1 has elapsed, \(\theta_{{1}}\) is the angle between the vertical direction and the line connecting the station to the intersection of the wavefront with the medium interface, \(h_{i}\) is the burial depth of the medium, and the subscript i denotes the station number.

The law of wave refraction is expressed as

$$\frac{{v_{{{\text{layer}}2}} }}{{v_{{{\text{layer}}1}} }} = \frac{{\sin \theta_{{\text{r}}} }}{{\sin \theta_{{\text{i}}} }} = \frac{{\sin \theta_{{\text{r}}} }}{{\sin (\mu_{{1}} \theta_{{1}} )}},$$
(2)

where \(\mu_{{1}}\) is a parameter related to the position of the refraction point, \(\theta_{{\text{i}}} = \mu_{{1}} \theta_{{1}}\) is the incidence angle of the reversed propagation path from the station to the source, and \(\theta_{{\text{r}}}\) is the refraction angle of the shock wave. By rearranging Eq. (2), we obtain

$$\sin \theta_{{\text{r}}} = \frac{{v_{{{\text{layer}}2}} }}{{v_{{{\text{layer}}1}} }}\sin (\mu_{{1}} \theta_{{1}} )$$
(3)

Using the identity \(\cos^{2} \theta_{{\text{r}}} = 1 - \sin^{2} \theta_{{\text{r}}}\), we can rewrite Eq. (3) in terms of the cosine function:

$$\cos \theta_{{\text{r}}} = \sqrt {1 - \frac{{v_{{{\text{layer}}2}}^{2} }}{{v_{{{\text{layer}}1}}^{2} }}\sin^{2} (\mu_{{1}} \theta_{{1}} )}$$
(4)

With this formulation, we can use the forward model of a spherical wave propagating in a two-layer medium to determine the wavefront equation in the second layer:

$$\begin{gathered} x_{{0}} = \left( \begin{gathered} x_{1} + \frac{{v_{{{\text{layer}}2}} }}{{v_{{{\text{layer}}1}} }}\left( {\frac{{h_{1} }}{{\cos \theta_{{1}} }} - \frac{{h_{1} }}{{\cos (\mu_{{1}} \theta_{{1}} )}}} \right)\sin \theta_{{\text{r}}} \hfill \\ + h_{1} \tan (\mu_{{1}} \theta_{{1}} ) \hfill \\ \end{gathered} \right), \hfill \\ = x_{1} + \frac{{v_{{{\text{layer}}2}}^{2} }}{{v_{{{\text{layer}}1}} }}T_{{1}} \sin \mu_{{1}} \theta_{{1}} + \left( {1 - \frac{{v_{{{\text{layer}}2}}^{2} }}{{v_{{{\text{layer}}1}}^{2} }}} \right)h_{1} \tan (\mu_{{1}} \theta_{{1}} ) \hfill \\ \end{gathered}$$
(5)
$$\begin{gathered} h_{0} = \frac{{v_{{{\text{layer}}2}} }}{{v_{{{\text{layer}}1}} }}\left( {\frac{{h_{1} }}{{\cos \theta_{{1}} }} - \frac{{h_{1} }}{{\cos (\mu_{{1}} \theta_{{1}} )}}} \right)\cos \theta_{{\text{r}}} , \hfill \\ = \left( {v_{{{\text{layer}}2}} T_{{1}} - \frac{{v_{{{\text{layer}}2}} }}{{v_{{{\text{layer}}1}} }}\frac{{h_{1} }}{{\cos (\mu_{{1}} \theta_{{1}} )}}} \right)\sqrt {1 - \frac{{v_{{{\text{layer}}2}}^{2} }}{{v_{{{\text{layer}}1}}^{2} }}\sin^{2} (\mu_{{1}} \theta_{{1}} )} \hfill \\ \end{gathered}$$
(6)

where \((x_{0} ,h_{0} )\) are the source coordinates, \((x_{1} ,h_{1} )\) are the coordinates of Station 1, and T1 is the shock wave travel time from the source to Station 1. While the travel times T1 and T2 are unknown, the difference between these two travel times can be calculated. Because the origin time T0 and the travel times Ti are unknown, the values of \(\theta_{{1}}\) and \(\mu_{{1}}\) are also unknown. The parametric formulations expressed in Eqs. (5) and (6) describe the geometry of a spherical wave in a stratified medium, and the points on the wavefront represent possible paths tracing back to the seismic source from the seismic station. If the arrival time at another set of stations can be obtained, then these equations can be used to provide constraints on additional wavefront point distributions. Solving the parametric framework expressed in Eqs. (5) and (6) for three different stations (i.e. calculating parameters \(T_{1} ,\theta_{1} ,\mu_{1} ,T_{2} ,\theta_{2} ,\mu_{2} ,T_{3} ,\theta_{3} ,{\text{and }}\mu_{3}\)) will generate the source location, which is represented geometrically as the intersection of the three wavefront curves.
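Equations (1) to (6) can be checked numerically with a short sketch. The example evaluates both lines of Eqs. (5) and (6) for one trial pair (T1, μ1) and confirms that they agree; the function name `wavefront_point_2d` and all numerical values are illustrative assumptions rather than data from this study.

```python
import math

def wavefront_point_2d(x1, h1, v1, v2, T1, mu1):
    """Evaluate Eqs. (5) and (6) in both forms for a single wavefront point.

    Returns ((x0, h0) from the first lines, (x0, h0) from the second lines).
    """
    if h1 > v1 * T1:
        raise ValueError("the wavefront has not reached the interface yet")
    theta1 = math.acos(h1 / (v1 * T1))            # Eq. (1)
    theta_i = mu1 * theta1                        # incidence angle
    sin_r = (v2 / v1) * math.sin(theta_i)         # Eq. (3)
    if sin_r >= 1.0:
        raise ValueError("beyond the critical angle: no transmitted ray")
    cos_r = math.sqrt(1.0 - sin_r ** 2)           # Eq. (4)
    # Length of the ray segment in layer 2 (bracketed term times v2 / v1).
    seg2 = (v2 / v1) * (h1 / math.cos(theta1) - h1 / math.cos(theta_i))
    first = (x1 + seg2 * sin_r + h1 * math.tan(theta_i), seg2 * cos_r)
    second = (
        x1 + (v2 ** 2 / v1) * T1 * math.sin(theta_i)
        + (1.0 - v2 ** 2 / v1 ** 2) * h1 * math.tan(theta_i),
        (v2 * T1 - (v2 / v1) * h1 / math.cos(theta_i)) * cos_r,
    )
    return first, second

# Hypothetical values: 200 m thick first layer, 3000 and 4500 m/s, T1 = 0.1 s.
first, second = wavefront_point_2d(0.0, 200.0, 3000.0, 4500.0, 0.1, 0.5)
print(first, second)
```

Sweeping μ1 over (0, 1) for a fixed T1 traces the wavefront curve in the second layer; repeating the sweep for other stations gives the curves whose intersection is sought.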

For each pair of stations, there are two linearly independent equations containing the coordinates of variables x and h. For three stations, there are \({2} \times {(3} - {1)} = {4}\) linearly independent equations. Using the travel time differences T1 − T2 and T2 − T3, we can generate two more linearly independent equations. The geometric relationships \(\cos \theta_{{1}} = \frac{{h_{1} }}{{v_{{{\text{layer}}1}} T_{{1}} }}\), \({\text{cos}}\theta_{{2}} = \frac{{h_{{2}} }}{{v_{{{\text{layer}}1}} T_{{2}} }}\), and \({\text{cos}}\theta_{{3}} = \frac{{h_{3} }}{{v_{{{\text{layer}}1}} T_{{3}} }}\) are also linearly independent. Using these nine linearly independent equations, we can solve for the nine unknown parameters in Eqs. (5) and (6). Once we have constraints for these parameters, we can determine the source location. The spherical wavefront and two-dimensional source location in a two-layer horizontal medium are illustrated in Fig. 1.

Fig. 1 Spherical wavefront and two-dimensional source location in a two-layer horizontal medium

2.2 Three-dimensional forward modeling of a spherical wave in a two-layer medium

Because the actual locations of the source and the seismic stations exist in three dimensions, the possible source locations reside on a surface. To resolve this extra parameter, it is necessary for a fourth station to be added to the existing station configuration.

The horizontal direction of the source is described by the azimuth \(\phi_{{1}}\), which is the angle between the x-axis and the vertical plane containing the seismic wave propagation path. For Station 1, the parametric equations for the surface projections of the possible source locations are expressed as follows:

$$\begin{gathered} x_{{0}} = x_{1} + \left[ {\frac{{v_{{{\text{layer}}2}} }}{{v_{{{\text{layer}}1}} }}\left( {\frac{{h_{1} }}{{\cos \theta_{{1}} }} - \frac{{h_{1} }}{{\cos \theta_{{\text{i}}} }}} \right)\sin \theta_{{\text{r}}} + h_{1} \tan \theta_{{\text{i}}} } \right]\cos \phi_{{1}} , \hfill \\ = x_{1} + \left[ {\frac{{v_{{{\text{layer}}2}}^{2} }}{{v_{{{\text{layer}}1}} }}T_{{1}} \sin \mu_{{1}} \theta_{{1}} + \left( {1 - \frac{{v_{{{\text{layer}}2}}^{2} }}{{v_{{{\text{layer}}1}}^{2} }}} \right)h_{1} \tan (\mu_{{1}} \theta_{{1}} )} \right]\cos \phi_{{1}} \hfill \\ \end{gathered}$$
(7)
$$\begin{gathered} y_{{0}} = y_{1} + \left[ {\frac{{v_{{{\text{layer}}2}} }}{{v_{{{\text{layer}}1}} }}\left( {\frac{{h_{1} }}{{\cos \theta_{{1}} }} - \frac{{h_{1} }}{{\cos \theta_{{\text{i}}} }}} \right)\sin \theta_{{\text{r}}} + h_{1} \tan \theta_{{\text{i}}} } \right]\sin \phi_{{1}} , \hfill \\ = y_{1} + \left[ {\frac{{v_{{{\text{layer}}2}}^{2} }}{{v_{{{\text{layer}}1}} }}T_{{1}} \sin \mu_{{1}} \theta_{{1}} + \left( {1 - \frac{{v_{{{\text{layer}}2}}^{2} }}{{v_{{{\text{layer}}1}}^{2} }}} \right)h_{1} \tan (\mu_{{1}} \theta_{{1}} )} \right]\sin \phi_{{1}} \hfill \\ \end{gathered}$$
(8)
$$\begin{gathered} h_{0} = \frac{{v_{{{\text{layer}}2}} }}{{v_{{{\text{layer}}1}} }}\left( {\frac{{h_{1} }}{{\cos \theta_{{1}} }} - \frac{{h_{1} }}{{\cos \theta_{{\text{i}}} }}} \right)\cos \theta_{{\text{r}}} , \hfill \\ = \left( {v_{{{\text{layer}}2}} T_{{1}} - \frac{{v_{{{\text{layer}}2}} }}{{v_{{{\text{layer}}1}} }}\frac{{h_{1} }}{{\cos (\mu_{{1}} \theta_{{1}} )}}} \right)\sqrt {1 - \frac{{v_{{{\text{layer}}2}}^{2} }}{{v_{{{\text{layer}}1}}^{2} }}\sin^{2} (\mu_{{1}} \theta_{{1}} )} \hfill \\ \end{gathered}$$
(9)

where \((x_{0} ,y_{0} ,h_{0} )\) is the location of the source, and \((x_{1} ,y_{1} ,h_{1} )\) are the coordinates of Station 1.

In the parametric equations for the four stations of interest, there are 16 unknown parameters: \(T_{i} ,\theta_{i} ,\mu_{i}\), and \(\phi_{i}\) (i = 1–4). Solving for these 16 parameters determines the spatial location of the source. If we define \(T_{i}\) in terms of \(\theta_{i}\), then two stations yield 3 + 1 + 3 = 7 linearly independent equations in x, y, h, T, and \(\theta\), while four stations yield 3 × (4 − 1) + 3 + 4 = 16 linearly independent equations. The number of equations therefore matches the number of unknowns, so all 16 parameters can be solved for and the spatial position of the source can be obtained.

2.3 Forward modeling of a spherical wave in a multi-layer medium

With an arrival time at Station 1 of \(t_{1}\), an origin time of \(T_{0}\), and a shock wave travel time of \(T_{1} = t_{1} - T_{0}\), we define the propagation radius of the wavefront in the first medium as \(R = v_{1} T_{1}\), where the angle between the intersection line of the wavefront with the medium and the source location is \(\theta\). To expand our analysis to multiple layers, we define the burial depth \(h_{i}^{j}\) in terms of the station number i and the layer number j. Because we make no assumptions about the study area, we allow the value of \(h_{i}^{{1}}\) to vary. Assuming that the layers are horizontal, if \(j \ne 1\), then \(h_{i}^{j} = h_{i + 1}^{j}\). We also define the following geometric relationships:

$$\cos \theta_{{1}} = \frac{{h_{{1}}^{{1}} }}{{v_{{{\text{layer}}1}} T_{{1}} }},\;or\;R_{{1}} = v_{{{\text{layer}}1}} T_{{1}} = \frac{{h_{{1}}^{{1}} }}{{\cos \theta_{{1}} }}.$$
(10)

The source Q is a point on the wavefront surface, and the geometry of the wavefront surface is dictated by the parameter \(\mu { (}0 < \mu < 1{)}\). If the incidence angle of the shock wave in medium 1 is \(\mu \theta\), then \(\mu\) represents the refraction point on the wave path. If the incidence angle of the shock wave in medium i is \(\theta_{i}\), then \({\text{sin}}\theta_{1} = \sin \mu \theta\), \({\text{cos}}\theta_{1} = \cos \mu \theta\), and \({\text{tan}}\theta_{1} = \tan \mu \theta\). According to Snell’s Law,

$$\sin \theta_{i} = \frac{{v_{i} }}{{v_{1} }}\sin \theta_{1} ,\;\cos \theta_{i} = \sqrt {1 - \frac{{v_{i}^{2} }}{{v_{1}^{2} }}\sin^{2} \theta_{1} } ,\;{\text{and}}\;\tan \theta_{i} = \frac{{\sin \theta_{1} }}{{\sqrt {\frac{{v_{1}^{2} }}{{v_{i}^{2} }} - \sin^{2} \theta_{1} } }}\left( {i = 2, 3, 4} \right).$$
(11)
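The relations in Eq. (11) can be checked numerically. The following is a minimal sketch, not the authors' implementation; the function name `layer_angles` and its argument names are hypothetical, and angles are assumed to be in radians.

```python
import math

def layer_angles(theta1, v1, vi):
    """Return (sin, cos, tan) of the ray angle in a layer with velocity vi,
    given the incidence angle theta1 (radians) in layer 1 with velocity v1."""
    sin_i = vi / v1 * math.sin(theta1)          # Snell's law
    if abs(sin_i) >= 1.0:
        # Beyond the critical angle no transmitted ray exists.
        raise ValueError("no transmitted ray: beyond the critical angle")
    cos_i = math.sqrt(1.0 - sin_i ** 2)
    return sin_i, cos_i, sin_i / cos_i
```

The tangent returned here is simply the ratio of the sine and cosine, which is algebraically the same as the closed form \(\sin \theta_{1} /\sqrt {v_{1}^{2} /v_{i}^{2} - \sin^{2} \theta_{1} }\).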

By running the forward propagation model for a spherical wave on the path from Station 1 toward the source in the kth layer, we obtain the wavefront equation in the kth layer:

$$x_{0} = x_{1} + \left[ {\left( {\frac{{h_{1}^{1} }}{{v_{{{\text{layer}}1}} \cos \theta_{1} }} - \sum\limits_{j = 1}^{k - 1} {\frac{{h_{1}^{j} }}{{v_{j} \cos \theta_{j} }}} } \right) \cdot v_{k} \sin \theta_{k} + \sum\limits_{j = 1}^{k - 1} {h_{1}^{j} \tan \theta_{j} } } \right] \cdot \cos \phi_{1} ,$$
(12)
$$y_{0} = y_{1} + \left[ {\left( {\frac{{h_{1}^{1} }}{{v_{{{\text{layer}}1}} \cos \theta_{1} }} - \sum\limits_{j = 1}^{k - 1} {\frac{{h_{1}^{j} }}{{v_{j} \cos \theta_{j} }}} } \right) \cdot v_{k} \sin \theta_{k} + \sum\limits_{j = 1}^{k - 1} {h_{1}^{j} \tan \theta_{j} } } \right] \cdot \sin \phi_{1} ,$$
(13)
$$h_{0} = \left( {\frac{{h_{1}^{1} }}{{v_{{{\text{layer}}1}} \cos \theta_{1} }} - \sum\limits_{j = 1}^{k - 1} {\frac{{h_{1}^{j} }}{{v_{j} \cos \theta_{j} }}} } \right) \cdot v_{k} \cos \theta_{k} ,$$
(14)

where \(\varphi\) is the angle between the x-axis and the vertical plane that contains the wave path.

Up to this point, all of the parameters, except for \(\mu\) and \(\phi\), are known. By analytically describing the geometry of a spherical wave in the layered medium in Eqs. (12–14), we can explore the ranges of the possible \(\mu\) and \(\phi\) values.
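To make the wavefront equations concrete, the sketch below evaluates Eqs. (12)–(14) for one station and one choice of \((\mu ,\phi )\). It is an illustration under stated assumptions rather than the authors' code: the function name `wavefront_point`, the list-based layer description, and the use of radians are all hypothetical choices, and the incidence angle in the first layer is taken to be \(\mu \theta\) as described in the text.

```python
import math

def wavefront_point(station_xy, h, v, theta, mu, phi, k):
    """Evaluate Eqs. (12)-(14) for one station.

    station_xy : (x1, y1) horizontal station coordinates
    h, v       : layer thicknesses and velocities below the station (h[0] = h_1^1)
    theta      : angle defined by cos(theta) = h[0] / (v[0] * T1), Eq. (10)
    mu, phi    : wavefront parameters (0 < mu < 1, phi in radians)
    k          : 1-based index of the layer assumed to contain the source
    """
    total_time = h[0] / (v[0] * math.cos(theta))    # travel time T1
    sin1 = math.sin(mu * theta)                     # incidence angle in layer 1
    time_above = 0.0                                # time spent in layers 1..k-1
    horiz = 0.0                                     # horizontal offset in layers 1..k-1
    for j in range(k - 1):
        sin_j = v[j] / v[0] * sin1                  # Snell's law
        cos_j = math.sqrt(1.0 - sin_j ** 2)         # ValueError past the critical angle
        time_above += h[j] / (v[j] * cos_j)
        horiz += h[j] * sin_j / cos_j
    sin_k = v[k - 1] / v[0] * sin1
    cos_k = math.sqrt(1.0 - sin_k ** 2)
    path_k = (total_time - time_above) * v[k - 1]   # ray length inside layer k
    horiz += path_k * sin_k
    x0 = station_xy[0] + horiz * math.cos(phi)
    y0 = station_xy[1] + horiz * math.sin(phi)
    h0 = path_k * cos_k
    return x0, y0, h0
```

Sweeping `mu` over (0, 1) and `phi` over a full turn traces out the candidate source positions for that station. With `k = 1` the points lie on a sphere of radius \(v_{{{\text{layer}}1}} T_{1}\) around the station, which is a convenient sanity check.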

If the travel times at three stations are known, each can be inserted into the forward model to generate a wavefront distribution, i.e., the set of possible source locations for that station. The parametric equations for the three stations contain six unknown parameters: \(\mu_{1} ,\phi_{1} ,\mu_{2} ,\phi_{2} ,\mu_{3} ,\phi_{3}\). Equating the source coordinates obtained from two stations gives three linearly independent equations, so three stations give six. By solving for all six parameters, the spatial position of the source can be obtained. Geometrically, the intersection of the three wavefronts is the location of the hypocenter; algebraically, it is the common solution of the parametric equations corresponding to the three stations.

2.4 Explanation of the spherical wave forward equation in any layer of the medium

The physical meanings of Eqs. (12), (13) and (14) are described below.

Inside the brackets in Eqs. (12) and (13): \(h_{1}^{1}\) is the thickness of the first layer of the medium where the station is located, \(\theta\) is the angle between the intersection of the wavefront on the medium interface and the seismic source after the total travel time for the station, \(\frac{{h_{1}^{1} }}{\cos \theta }\) is the maximum distance that the seismic wave propagates in the first layer of the medium where the station is located after the total travel time to the station, and thus, \(\frac{{h_{1}^{1} }}{{v_{{{\text{layer}}1}} \cos \theta }}\) is the total travel time for station 1.

\(h_{1}^{j}\) is the thickness of the jth layer of the medium, and \(\theta_{j}\) is the refraction angle of the actual path of the seismic wave into the jth layer of the medium or the incidence angle as the wave exits the jth layer of the medium. \(\frac{{h_{1}^{j} }}{{\cos \theta_{j} }}\) is the ray tracing length of the actual path of the seismic wave in the jth layer of the medium, so \(\frac{{h_{1}^{j} }}{{v_{j} \cos \theta_{j} }}\) is the propagation time of the seismic wave in the jth layer of the medium, and \(\sum\limits_{j = 1}^{k - 1} {\frac{{h_{1}^{j} }}{{v_{j} \cos \theta_{j} }}}\) is the total time it takes the seismic wave to propagate through the 1st to (k − 1)th layers. Accordingly, \(\left( {\frac{{h_{1}^{1} }}{{v_{{{\text{layer}}1}} \cos \theta }} - \sum\limits_{j = 1}^{k - 1} {\frac{{h_{1}^{j} }}{{v_{j} \cos \theta_{j} }}} } \right)\) is the total travel time for Station 1 minus the time spent in the 1st to (k − 1)th layers, that is, the propagation time of the seismic wave in the kth layer of the medium, and \(\left( {\frac{{h_{1}^{1} }}{{v_{{{\text{layer}}1}} \cos \theta }} - \sum\limits_{j = 1}^{k - 1} {\frac{{h_{1}^{j} }}{{v_{j} \cos \theta_{j} }}} } \right) \cdot v_{k}\) is the distance that the seismic wave propagates in the kth layer.

\(\theta_{k}\) is the refraction angle of the actual seismic wave path entering the kth layer of the medium, \(\left( {\frac{{h_{1}^{1} }}{{v_{{{\text{layer}}1}} \cos \theta }} - \sum\limits_{j = 1}^{k - 1} {\frac{{h_{1}^{j} }}{{v_{j} \cos \theta_{j} }}} } \right) \cdot v_{k} \sin \theta_{k}\) is the horizontal projection of the propagation distance of the seismic wave in the kth layer, \(\left( {\frac{{h_{{1}}^{{1}} }}{{v_{{{\text{layer}}1}} \cos \theta }} - \sum\limits_{j = 1}^{k - 1} {\frac{{h_{1}^{j} }}{{v_{j} \cos \theta_{j} }}} } \right) \cdot v_{k} \cos \theta_{k}\) is the vertical projection of the propagation distance of the seismic wave in the kth layer, or the thickness h0 of the kth layer in which the seismic source is located. These are the physical meanings of Eq. (14).

For Eqs. (12) and (13), because \(h_{1}^{j}\) is the thickness of the jth layer and \(\theta_{j}\) is the refraction angle of the actual path of the seismic wave into the jth layer of the medium or the incidence angle as it exits the jth layer, \(h_{1}^{j} \tan \theta_{j}\) is the horizontal length of the actual path of the seismic wave in the jth layer. \(\sum\limits_{j = 1}^{k - 1} {h_{1}^{j} \tan \theta_{j} }\) is the sum of the horizontal lengths of the seismic wave path in the 1st to (k − 1)th layers, and \(\left( {\frac{{h_{1}^{1} }}{{v_{{{\text{layer}}1}} \cos \theta }} - \sum\limits_{j = 1}^{k - 1} {\frac{{h_{1}^{j} }}{{v_{j} \cos \theta_{j} }}} } \right) \cdot v_{k} \sin \theta_{k} + \sum\limits_{j = 1}^{k - 1} {h_{1}^{j} \tan \theta_{j} }\) is the sum of the horizontal lengths of the seismic wave path in the 1st to kth layers, i.e., the horizontal distance between the seismic source and the station.

Since \(\varphi\) is the angle between the vertical plane of the seismic wave propagation path and the x-axis, \(\left[ \begin{gathered} \left( {\frac{{h_{{1}}^{{1}} }}{{v_{{{\text{layer}}1}} \cos \theta }} - \sum\limits_{j = 1}^{k - 1} {\frac{{h_{1}^{j} }}{{v_{{{\text{layer}}j}} \cos \theta_{j} }}} } \right) \hfill \\ \cdot v_{k} \sin \theta_{k} + \sum\limits_{j = 1}^{k - 1} {h_{{1}}^{j} \tan \theta_{j} } \hfill \\ \end{gathered} \right] \cdot \cos \varphi_{1}\) is the projection of the horizontal distance between the seismic source and the station onto the x-axis, and \(\left[ \begin{gathered} \left( {\frac{{h_{1}^{1} }}{{v_{{{\text{layer}}1}} \cos \theta }} - \sum\limits_{j = 1}^{k - 1} {\frac{{h_{1}^{j} }}{{v_{j} \cos \theta_{j} }}} } \right) \hfill \\ \cdot v_{k} \sin \theta_{k} + \sum\limits_{j = 1}^{k - 1} {h_{1}^{j} \tan \theta_{j} } \hfill \\ \end{gathered} \right] \cdot \sin \varphi_{1}\) is the projection of the horizontal distance between the seismic source and the station onto the y-axis. These are the physical meanings of Eqs. (12) and (13).

For \(\theta_{j}\) (\(j = 1 \cdots k\)), it can be seen from the description of the variables in Sect. 2.4 that \(\theta_{j}\) is the refraction angle of the actual path of the seismic wave into the jth layer or the incidence angle as it exits the jth layer, so \(\sin \theta_{1} = \sin \mu \theta\), \(\cos \theta_{1} = \cos \mu \theta\), and \(\tan \theta_{1} = \tan \mu \theta\). In addition, by Snell's law, \(\sin \theta_{i} = \frac{{v_{i} }}{{v_{1} }}\sin \mu \theta\), \(\cos \theta_{i} = \sqrt {1 - \frac{{v_{i}^{2} }}{{v_{1}^{2} }}\sin^{2} \theta_{1} }\), and \(\tan \theta_{i} = \frac{{\sin \theta_{1} }}{{\sqrt {\frac{{v_{1}^{2} }}{{v_{i}^{2} }} - \sin^{2} \theta_{1} } }}\) (i = 2, 3, 4). Thus, for \(\theta_{j}\) (\(j = 1 \cdots k\)), regardless of how large k is, the expression for \(\theta_{j}\) contains only one unknown parameter, \(\mu\).

The thickness and the wave velocity of each layer of the medium are known; the vertical coordinates of each station and the detected arrival time \(t_{i}\) are known; the travel time \(T_{i}\) of each station and \(\theta_{i}\) are unknown; and the seismic source spatial parameters \(\mu_{i}\) and \(\varphi_{i}\) are unknown. Equations (12), (13), and (14) are parametric equations with \(T_{i}\), \(\theta_{i}\), \(\mu_{i}\), and \(\varphi_{i}\) as the parameters, which represent the geometric shape of the wavefront of the spherical wave in the layered medium.

Therefore, the known and unknown variables contained in the spherical wave forward modeling equation in any layer of the medium are consistent with those contained in the spherical wave forward modeling equation for a two-layer medium. The unknown variables are \(T_{i}\), \(\theta_{i}\), \(\mu_{i}\), and \(\varphi_{i}\) (where i = 1–4). Thus, the modeling process for the seismic source location in any horizontally layered medium is the same as that in a two-layer horizontal medium, but the location equations for a general layered medium take more complex forms.

2.5 Identifying the layer that contains the seismic source

During forward modeling of the seismic wave from the station to the source, we search through each layer sequentially to find the source and calculate the h0 coordinate of the source.

For example, in a multi-layer horizontally layered medium, it is easy to determine if the source is located in the shallowest layer. However, if the source is not located in the shallowest layer, then we can use forward modeling of the origin time to calculate the shape and the position of the seismic wavefront in the second layer of the medium. In this fashion, we develop a framework of nonlinear equations that describe our system. Each solution to these equations represents a possible focal depth h0. If h0 is less than the thickness of H2 (i.e., the layer directly beneath the surface layer), then we infer that the source is located in the second layer. Conversely, if h0 is greater than H2, then we know that the source is located in a deeper layer and use the forward model to project the wavefront into the third layer (i.e., the next deepest). This process is repeated until we find the layer that contains the source. We illustrate an example of this layer-by-layer search in Fig. 1.
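The layer-by-layer search can be sketched for a single ray as follows. This is a simplified illustration, not the procedure used in the study: it assumes the ray angle in the first layer is already known, whereas the full method obtains it by solving the nonlinear system. The function name `find_source_layer` and its arguments are hypothetical.

```python
import math

def find_source_layer(travel_time, theta1, thicknesses, velocities):
    """Walk down the layers along one ray and return (layer_index, h0).

    layer_index is 1-based; h0 is the vertical distance travelled inside
    that layer. theta1 is the ray angle in the first layer (radians).
    """
    remaining = travel_time
    sin1 = math.sin(theta1)
    for k, (thick, vel) in enumerate(zip(thicknesses, velocities), start=1):
        sin_k = vel / velocities[0] * sin1           # Snell's law
        cos_k = math.sqrt(1.0 - sin_k ** 2)
        h0 = remaining * vel * cos_k                 # depth reached if the ray ends here
        if h0 <= thick:
            return k, h0                             # source lies inside layer k
        remaining -= thick / (vel * cos_k)           # time used to cross layer k
    raise ValueError("travel time exceeds the modelled layers")
```

The test `h0 <= thick` mirrors the comparison of \(h_{0}\) with the layer thickness described above; when it fails, the time needed to cross the layer is subtracted and the search moves one layer deeper.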

2.6 Micro-seismic monitoring in the Muchengjian mining area (Jia and Li 2010; Jia et al. 2014; Pan et al. 2012)

We also used our joint forward modeling and inversion location method to monitor and locate the source of the seismic activity in the Muchengjian Coal Mine. The Muchengjian Coal Mine is located in the Mentougou District of Beijing. The lithology of this coal mine consists of two layers: an upper layer that does not contain coal seams (burial depth of 600 m, P-wave velocity of 4500 m/s) and a lower layer that contains the main coal seams (burial depth of 400 m, P-wave velocity of 2250 m/s). We captured micro-seismicity data for this coal mine using the monitoring and location system developed by the Rock Burst Research Institute of Liaoning University of Engineering and Technology (accuracy of 1 ms). The No. 1 station group is located near the main shaft at an altitude of 820 m and the No. 2 station group is also located at an elevation of 820 m, but the stations are located farther from the main coal seam. With the stations covering an area of 3 km × 10 km, we were able to record micro-seismic signals with magnitudes ranging from − 1 to 3 on the Richter scale. For one particular micro-seismic event, the source coordinates (x, y, h) were (− 13,475 m, 4,420,814 m, − 180 m) and the origin time was 14:20:11.000 s on June 6, 2010. With a T0 value of 0.067, the corrected origin time was 14:20:11.067.

In order to determine the source coordinates using our joint forward modeling and inversion location method, we required data from three non-collinear stations. For Stations 1, 3, and 6 (Table 1), we solved Eqs. (7–9) to determine the values of θ1 (1.3605), θ2 (1.456447), and θ3 (1.49481). After substituting these values into Eqs. (12–14), we determined the source location by solving the six nonlinear equations. We set our initial parameter values to \((\mu_{1} ,\phi_{1} ,\mu_{2} ,\phi_{2} ,\mu_{3} ,\phi_{3} ) = (0.9, 3, 0.9, 3, 0.9, 3)\). After using Matlab to iteratively solve these equations, we arrived at our final parameter values of \((\mu_{1} ,\phi_{1} ,\mu_{2} ,\phi_{2} ,\mu_{3} ,\phi_{3} ) = (0.94328, 3.14159, 0.98524, 2.71497, 0.99379, 2.71323)\).

Table 1 Station locations and arrival times in the test case of the Muchengjian Coal Mine

By substituting these values into Eqs. (12–14), three estimates of the source location, one from each station, were obtained: \((x_{01} ,y_{01} ,h_{01} ) = ( - 13664, 4420814, 332.599)\), \((x_{02} ,y_{02} ,h_{02} ) = ( - 13692, 4420913, 359.609)\), and \((x_{03} ,y_{03} ,h_{03} ) = ( - 13675, 4420905, 372.02)\). These estimates, derived from the source-station distances, are expressed as absolute coordinates. The micro-seismic test results obtained using different algorithms are presented in Table 2.

Table 2 Micro-seismic test results obtained using different algorithms

3 Simplifications of complex nonlinear systems

Applying the method to specific examples showed that the wavefront method involves as many as 16 nonlinear equations in 16 unknown parameters. When an iterative method is used to solve the equations, initial iteration values must be estimated for all of the unknowns, which is very difficult. Therefore, the complex nonlinear system needs to be simplified.

To solve a nonlinear equation system, the initial iteration values need to be determined. Hence, the smaller the number of parameters that need initial iteration values is, the easier it is to determine the overall initial value range. Therefore, the variable substitution method was used to eliminate the location parameters, thereby reducing the difficulty of determining the initial iteration values.

The wavefront method includes 16 equations, i.e., 16 unknowns. Determining the initial iteration values is very difficult. However, after the nonlinear system is simplified, the number of parameters that require an initial iteration value is reduced to 11 in the most complex case, which greatly simplifies the system.

The fewer the parameters that need initial iteration values, the easier it is to determine those values. This finding has been obtained from engineering research on complex systems. In general, as engineering systems become increasingly complex, it becomes difficult to make accurate statements about the processes within and the data generated by that system (Zadeh 1973; Zhao 2006). This is a concern because, in our modeling method, the number of nonlinear equations needed to adequately characterize the system increases as the number of dimensions and the number of layers increase.

During preliminary testing, we found that it is difficult to determine the initial parameter values while simultaneously solving 16 nonlinear equations. If we use very general values to initialize the unknown parameters, then our model often generates meaningless complex solutions. In order to obtain a meaningful and unique real solution, it is necessary to select the location parameters correctly and to define them with a precision of two or more decimal places. However, because the source parameters are unknown, we cannot define the 16 location parameters with an accuracy of two decimal places. Despite designing our technique with this task in mind, it is still difficult for the model to accurately locate the seismic source without making some simplifying assumptions. To work around these issues, we reduce the number of unknown parameters or nonlinear equations.

3.1 Simplification of the two-dimensional problem

Two of the parameters in the two-dimensional observation system are the distance between station i and medium interface hi and the horizontal distance between station i and source xi. In the two-dimensional problem, we require at least three stations to form a well-posed equation system.

$$\cos \theta_{{1}} = \frac{{h_{1} }}{{v_{{{\text{layer}}1}} T_{{1}} }},$$
(15)
$$\cos \theta_{{2}} = \frac{{h_{{2}} }}{{v_{{{\text{layer}}1}} T_{{2}} }},$$
(16)
$$\cos \theta_{{3}} = \frac{{h_{{3}} }}{{v_{{{\text{layer}}1}} T_{{3}} }},$$
(17)
$$T_{{2}} - T_{{1}} = a_{{1}} ,$$
(18)
$$T_{{3}} - T_{{2}} = a_{{2}} .$$
(19)

In Eqs. (18) and (19), a1 and a2 are the arrival time differences between the three stations.

In Eqs. (15–19), there are six unknowns: T1, T2, T3, \(\theta_{1} ,\theta_{2}\), and \(\theta_{3}\). We can reduce the number of unknown parameters by redefining these parameters.

If we define \(\frac{{T_{1} }}{{T_{2} }} = m\), then according to Eqs. (18) and (19), T1, T2, and T3 can be expressed as \(T_{1} = \frac{{ma_{1} }}{1 - m}\), \(T_{2} = \frac{{a_{1} }}{1 - m}\), and \(T_{3} = \frac{{a_{1} }}{1 - m} + a_{2}\), respectively. By substituting T1, T2, and T3 into Eqs. (15), (16), and (17), respectively, we can redefine \(\theta_{1} ,\theta_{2}\), and \(\theta_{3}\) in terms of the unknown variable m. This new system of equations has only four unknown parameters: \(\mu_{{1}}\), \(\mu_{{2}}\),\(\mu_{{3}}\), and m. Now that we have simplified the original nonlinear system, the problem is well-posed.
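The intermediate algebra behind these expressions is implied rather than stated. Written out, and assuming \(m \ne 1\) (equivalently, \(a_{1} \ne 0\)), the substitution reads:

$$\begin{aligned} T_{1} &= mT_{2} , \\ T_{2} - T_{1} &= (1 - m)T_{2} = a_{1} \;\Rightarrow\; T_{2} = \frac{{a_{1} }}{1 - m},\quad T_{1} = \frac{{ma_{1} }}{1 - m}, \\ T_{3} &= T_{2} + a_{2} = \frac{{a_{1} }}{1 - m} + a_{2} . \end{aligned}$$

Each \(\theta_{i}\) then follows from \(\cos \theta_{i} = h_{i} /(v_{{{\text{layer}}1}} T_{i} )\), so m is the only remaining unknown among the travel-time and angle parameters.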

3.2 Simplification of the three-dimensional problem

We can apply a similar simplification to the three-dimensional problem:

$$\cos \theta_{{\text{i}}} = \frac{{h_{i} }}{{v_{{{\text{layer}}1}} T_{{\text{i}}} }}\left( {{\text{i }} = { 1},{ 2},{ 3},{ 4}} \right),$$
(20)
$$T_{{{\text{i}} + 1}} - T_{{\text{i}}} = a_{{\text{i}}} \left( {{\text{i}} = 1, 2, 3} \right).$$
(21)

With these two equations, we can reduce the number of unknown values from eight (T1, T2, T3, T4, \(\theta_{{1}} ,\theta_{{2}} ,\theta_{{3}} ,\theta_{{4}}\)) to one.

If we define \(\frac{{T_{1} }}{{T_{2} }} = m\), then \(T_{i}\) (i = 1, 2, 3, 4) can be expressed as

$$T_{1} = \frac{{ma_{1} }}{1 - m}\;{\text{and}}\;T_{2} = \frac{{a_{1} }}{1 - m},T_{3} = \frac{{a_{1} }}{1 - m} + a_{2} \;{\text{and}}\;T_{4} = \frac{{a_{1} }}{1 - m} + a_{2} + a_{3}$$
(22)

Substituting these four travel times into Eq. (20) allows us to express the remaining unknown values \(\theta_{{1}} ,\theta_{{2}} ,\theta_{{3}} ,\theta_{{4}}\) in terms of the single variable m.
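The substitution can be sketched in a few lines. This is an illustrative helper, not part of the published method; the names `travel_times` and `station_angles` are hypothetical, and the arrival-time differences are assumed to be ordered as \(a_{1} ,a_{2} ,a_{3}\).

```python
import math

def travel_times(m, diffs):
    """Travel times T1..Tn from m = T1/T2 and the arrival-time differences
    diffs = [a1, a2, ...] with a_i = T_(i+1) - T_i, following Eq. (22)."""
    if m == 1:
        raise ValueError("m = 1 would require a zero arrival-time difference")
    t2 = diffs[0] / (1.0 - m)
    times = [m * t2, t2]
    for a in diffs[1:]:
        times.append(times[-1] + a)       # each later time adds the next difference
    return times

def station_angles(times, depths, v_layer1):
    """Angles theta_i from Eq. (20): cos(theta_i) = h_i / (v_layer1 * T_i)."""
    return [math.acos(h / (v_layer1 * t)) for h, t in zip(depths, times)]
```

Because every \(T_{i}\) and \(\theta_{i}\) is generated from m alone, an iterative solver only needs an initial guess for m in place of eight separate initial values.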

The new equation system contains nine parameters: \(\mu_{{1}}\),\(\phi_{{1}}\), \(\mu_{{2}}\), \(\phi_{2}\), \(\mu_{{3}}\), \(\phi_{3}\), \(\mu_{{4}}\), \(\phi_{4}\), and m. Thus, the remaining nine nonlinear equations constitute a well-posed nonlinear equation system. Hence, the original nonlinear system is simplified.

In this way, the simplified equation system contains a total of 11 variables (\(T_{1}\), \(\theta_{{1}}\), \(\mu_{{1}}\), \(\varphi_{{1}}\), \(T_{2}\), \(\mu_{{2}}\), \(\phi_{2}\), \(T_{{3}}\), \(\mu_{{3}}\), \(T_{{4}}\), and \(\mu_{{4}}\)), and the number of nonlinear equations is 11. The stability of the joint forward modeling and inversion method is slightly lower than that of the geometric mean method, but it is more stable than the classical linear method.

4 Seismic source location in a multi-layer inclined medium

The geometric method and the joint forward modeling and inversion location method require that the stations are distributed on the horizontal surface plane. For a multi-layer horizontal medium, the distances from each station to the interface of the first layer are equal, that is, the vertical coordinates of the stations are \(h_{i}^{1} = h_{j}^{1}\), where i and j are the station numbers, and \(i \ne j\).

The wavefront forward modeling method is applicable to any observation system. The algorithm does not require the stations to be on the same horizontal plane; hence, the vertical coordinates of the stations can be different, i.e., \(h_{i}^{1} \ne h_{j}^{1}\), without affecting the location process.

The wavefront forward modeling method does not require the stations to have the same vertical coordinates (\(h_{i}^{1} = h_{j}^{1}\)); yet, when the wavefront of the shock wave emitted by the station is forwarded in a multi-layer medium, the forward propagation direction of the wavefront is required to be orthogonal to the medium interface, and only through this can the source coordinates \((x_{0} ,y_{0} ,h_{0} )\) be reparametrized to new parameters \((\mu ,\varphi )\). The forward propagation direction of the wavefront in the method is generally parallel to the z-axis of the three-dimensional coordinate system. In the current geodetic coordinate system used for the station layout, the z-axis is generally orthogonal to the horizontal surface plane rather than to the medium interface. Therefore, if the geodetic coordinate system is used, the forward propagation direction of the wavefront forward modeling is orthogonal to the horizontal surface plane rather than to the medium interface, which will make the wavefront forward modeling method inapplicable. Therefore, in order to apply the wavefront forward modeling method to a multi-layer inclined medium, it is necessary to rotate the geodetic coordinate system until the z-axis is orthogonal to the medium interface.

Although the multi-layer inclined medium is at an angle relative to the horizontal surface plane, the interfaces remain parallel to each other. Therefore, by rotating the geodetic coordinate system, the wavefront propagation direction can be orthogonal to the medium interface in each layer, allowing seismic source location using the wavefront forward modeling method.

Therefore, in this study, we made the following assumption for the inclined medium: regardless of whether the number of layers of the inclined medium is two or more, we only need to rotate the station coordinates to convert the horizontal coordinates \((x_{i} ,y_{i} )\) to a plane parallel to the medium interface. The change in the vertical coordinates of the stations is the vertical distance of the first layer of the medium. In this way, the seismic source location problem in a two-layer inclined medium or a multi-layer inclined medium is transformed into a location problem in a multi-layer horizontal medium in an irregular observation system, and the wavefront forward modeling method can then be applied.

The steps of the wavefront forward modeling method for a multi-layer inclined medium are described below.

Conditions: Extend the medium interface, and assume that the extended surface of the medium interface intersects with the horizontal surface plane at line l. The angle between line l and the y-axis of the geodetic coordinate system where the station is located is \(\alpha\). The angle between the normal direction of the medium interface and the horizontal surface plane is \(\beta\).

  1. (1)

    Rotate the geodetic coordinate system where the station is located around the z-axis by \(\alpha\) degrees (clockwise rotation is positive and counterclockwise rotation is negative) so that the y-axis of the new coordinate system after rotation is parallel to line l and the horizontal coordinates of the station \((x_{i} ,y_{i} )\) change to Eq. (23). The rotation process is illustrated in Fig. 1.

    $$\left\{ {\begin{array}{*{20}c} {x_{i} ^{\prime} = x_{i} \cos \alpha + y_{i} \sin \alpha } \\ {y_{i} ^{\prime} = - x_{i} \sin \alpha + y_{i} \cos \alpha } \\ \end{array} } \right.$$
    (23)
  2. (2)

    Rotate the new coordinate system obtained in step (1) around the y-axis \((x = 0,z = 0)\) by \(\beta\) degrees (clockwise rotation is positive and counterclockwise rotation is negative) so that the z-axis of the new coordinate system after rotation is parallel to the normal direction of the medium interface and the three-dimensional coordinates of the station after the second rotation change from \((x_{i} ^{\prime},y_{i} ^{\prime},z{}_{i})\) to Eq. (24). The rotation process is illustrated in Fig. 2.

    $$\left\{ {\begin{array}{*{20}l} {x_{i} ^{\prime\prime} = x_{i} ^{\prime}\cos \beta + z_{i} \sin \beta = (x_{i} \cos \alpha + y_{i} \sin \alpha )\cos \beta + z_{i} \sin \beta } \\ {y_{i} ^{\prime} = - x_{i} \sin \alpha + y_{i} \cos \alpha } \\ {z_{i} ^{\prime} = - x_{i} ^{\prime}\sin \beta + z_{i} \cos \beta = - (x_{i} \cos \alpha + y_{i} \sin \alpha )\sin \beta + z_{i} \cos \beta } \\ \end{array} } \right.$$
    (24)
    Fig. 2 The first rotation of the geodetic coordinate system

  3. (3)

    The coordinate system obtained in step (2) is parallel to the medium interface, and the wavefront forward modeling method can then be used to locate the seismic source. In this new coordinate system, the horizontal coordinates of the station are calculated using Eq. (24).

  4. (4)

    Calculate the distance from the station to the medium interface. This distance cannot be calculated using Eq. (24). Since the vertical distance between each station and the medium interface \(H_{i}\) is known, the projection of \(H_{i}\) on the normal direction of the interface can be used: \(h_{i} = H_{i} \cdot \cos \beta\), where \(h_{i}\) is the vertical distance between the station and the medium interface.

  5. (5)

    Back-calculate the source coordinates from the coordinate system obtained in step (2) through de-rotation. Specifically, the coordinate system is rotated around the y-axis by − \(\beta\) degrees (the negative sign indicates counterclockwise rotation), converting the source coordinates (x, y, z) to the coordinates (x', y', z') in the coordinate system obtained in step (1); the y-coordinate is unchanged by this rotation.

    $$\left\{ {\begin{array}{*{20}c} {x^{\prime} = x \cdot \cos ( - \beta ) + z \cdot \sin ( - \beta )} \\ {z^{\prime} = - x \cdot \sin ( - \beta ) + z \cdot \cos ( - \beta )} \\ \end{array} } \right.$$
    (25)
  6. (6)

    Back-calculate the source coordinates in the geodetic coordinate system through de-rotation. Specifically, the coordinate system is rotated around the z-axis by − \(\alpha\) degrees (the negative sign indicates counterclockwise rotation), converting the source coordinates to the coordinates (x'', y'', z') in the geodetic system.

    $$\left\{ {\begin{array}{*{20}c} {x^{\prime\prime} = x^{\prime} \cdot \cos ( - \alpha ) + y^{\prime} \cdot \sin ( - \alpha )} \\ {y^{\prime\prime} = - x^{\prime} \cdot \sin ( - \alpha ) + y^{\prime} \cdot \cos ( - \alpha )} \\ \end{array} } \right.$$
    (26)

The coordinates (x'', y'', z') are the final coordinates of the seismic source.
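The rotation and de-rotation steps above can be sketched as a pair of functions. This is a minimal illustration of Eqs. (23)–(26) only; the function names are hypothetical, angles are assumed to be in radians, and the separate depth correction of step (4) is not included.

```python
import math

def to_interface_frame(x, y, z, alpha, beta):
    """Apply Eqs. (23) and (24): rotate by alpha about z, then by beta about y."""
    x1 = x * math.cos(alpha) + y * math.sin(alpha)
    y1 = -x * math.sin(alpha) + y * math.cos(alpha)
    x2 = x1 * math.cos(beta) + z * math.sin(beta)
    z2 = -x1 * math.sin(beta) + z * math.cos(beta)
    return x2, y1, z2

def to_geodetic_frame(x, y, z, alpha, beta):
    """Apply Eqs. (25) and (26): undo the beta rotation, then the alpha rotation."""
    x1 = x * math.cos(-beta) + z * math.sin(-beta)
    z1 = -x * math.sin(-beta) + z * math.cos(-beta)
    x0 = x1 * math.cos(-alpha) + y * math.sin(-alpha)
    y0 = -x1 * math.sin(-alpha) + y * math.cos(-alpha)
    return x0, y0, z1
```

Applying `to_geodetic_frame` to the output of `to_interface_frame` with the same angles returns the original point, which confirms that the two de-rotation steps are the exact inverse of the two rotation steps.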

5 Condition number and ill-posedness measurement of wavefront forward method

For a numerical problem, if a small perturbation (i.e., error) in the input data causes a large relative error in the output data (i.e., the solution), the problem is ill-posed. For example, when calculating the value of a function \(f(x)\), if x is perturbed to \(x^{ * }\) with \(\Delta x = x - x^{ * }\), the relative error of the input is \(\frac{\Delta x}{x}\), the relative error of the function value \(f(x^{ * } )\) is \(\frac{{f(x) - f(x^{ * } )}}{f(x)}\), and the ratio of the relative errors is

$$\left| {\frac{{f(x) - f(x^{ * } )}}{f(x)}} \right|/\left| {\frac{\Delta x}{x}} \right| \approx \left| {\frac{{xf^{\prime}(x)}}{f(x)}} \right| = C_{p}$$
(27)

\(C_{p}\) is the condition number of the problem. The relative error of the independent variable is generally small. If the condition number \(C_{p}\) is very large, it will cause a large relative error in the value of the function.

Generally, if \(C_{p} \ge {10}\), the problem is considered to be ill-posed, and the larger the condition number is, the more serious the ill-posedness is. The ill-posedness is not caused by the calculation method but is inherent in the numerical problem. Special measures can be taken to reduce the errors.
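The condition number in Eq. (27) is easy to estimate numerically for a scalar function. The sketch below uses a central finite difference for the derivative, which is an assumption of this example rather than something prescribed in the text; the function name `condition_number` is hypothetical, and both x and f(x) are assumed to be nonzero.

```python
import math

def condition_number(f, x, rel_step=1e-6):
    """Estimate C_p = |x f'(x) / f(x)| using a central difference for f'(x)."""
    dx = abs(x) * rel_step
    derivative = (f(x + dx) - f(x - dx)) / (2.0 * dx)
    return abs(x * derivative / f(x))

# A power function x**n has C_p = n, so it is well conditioned for small n,
# whereas tan(x) becomes ill-conditioned as x approaches pi/2.
cubic_cp = condition_number(lambda x: x ** 3, 2.0)
tangent_cp = condition_number(math.tan, 1.5)
```

For the tangent, \(C_{p} = x/(\sin x\cos x)\), which exceeds the threshold of 10 at x = 1.5 and grows without bound near \(\pi /2\). This is relevant here because the location equations contain terms such as \(\tan (\mu \theta )\) and \(1/\cos \theta\).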

The wavefront forward modeling method uses a deterministic model in the numerical calculations. In particular, the layer velocity is a deterministic value. Although the layer velocity is measurable, it is rare that the layers are perfectly horizontal or inclined with parallel orientations, and it is also difficult to accurately obtain the wave velocity. There are errors in the medium information, especially the medium velocity structure. Therefore, theoretical analysis and measurement of the ill-posedness of the wavefront forward modeling method is particularly important for ensuring the location accuracy and improving the applicability of the method.

In this study, we measured the ill-posedness of the wavefront forward modeling method by calculating the condition number of the function value. In order to facilitate calculation and save space, the wavefront forward modeling method in a two-layer medium was taken as an example. The wavefront forward modeling in a multi-layer horizontal or inclined medium is essentially the same as that in a two-layer medium, and the only difference is the number of velocity structure parameters. Therefore, the wavefront forward modeling of a two-layer medium is representative and facilitates the calculation of the condition number.

In order to measure the impacts of errors in the wave velocities of the two layers, \(v_{{{\text{layer}}1}}\) and \(v_{{{\text{layer}}2}}\), and in the burial depth \(h_{i}^{1}\) on the location parameters (\(\mu_{{1}}\), \(\varphi_{{1}}\), T1, \(\theta_{{1}}\), \(\mu_{{2}}\), \(\varphi_{{2}}\), T2, \(\theta_{{2}}\), \(\mu_{{3}}\), \(\varphi_{{3}}\), T3, \(\theta_{{3}}\), \(\mu_{{4}}\), \(\varphi_{{4}}\), T4, and \(\theta_{{4}}\)), it is necessary to calculate the partial derivatives of the 16 parameters with respect to \(v_{{{\text{layer}}1}}\), \(v_{{{\text{layer}}2}}\), and \(h_{i}^{1}\).

The wavefront forward modeling comprises a total of 16 nonlinear equations in 16 unknown parameters. According to the implicit function theorem, as long as the function determinant (i.e., the Jacobi determinant) composed of the partial derivatives of the equations with respect to the unknown parameters is not 0 at the solution, the nonlinear equation system defines the 16 unknown parameters as differentiable implicit functions of each velocity structure parameter. The derivatives of these 16 implicit functions at the solution are calculated as follows. Each of the 16 nonlinear equations is differentiated with respect to each unknown parameter. Each row of the Jacobi determinant contains the partial derivatives of one nonlinear equation with respect to the 16 unknown parameters, and each column contains the partial derivatives of the 16 nonlinear equations with respect to one unknown parameter. This 16-row, 16-column function determinant is the denominator of the partial derivative of each unknown parameter with respect to a velocity structure parameter (\(v_{{{\text{layer}}1}}\), \(v_{{{\text{layer}}2}}\), or \(h_{i}^{1}\)).

To form the numerator of the partial derivative of an unknown parameter with respect to a velocity structure parameter, the partial derivative of each nonlinear equation with respect to that velocity structure parameter is calculated, giving a 16-dimensional column vector. This column vector replaces the column of the Jacobi determinant that corresponds to the unknown parameter in question. The partial derivative of that unknown parameter with respect to the velocity structure parameter is then the fraction formed by this numerator determinant and the denominator determinant.
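This construction can be written compactly under an assumed notation, in which the symbols \(F_{k}\), \(u_{j}\), \(p\), \(J\), and \(J_{j}\) are introduced here for illustration and do not appear elsewhere in the paper. Let \(F_{k}(u_{1}, \ldots ,u_{16} ;p) = 0\), \(k = 1, \ldots ,16\), denote the nonlinear equations, where \(u_{j}\) are the unknown parameters and \(p\) is one of the velocity structure parameters. Differentiating each equation with respect to \(p\) gives a linear system for the derivatives, and Cramer's rule yields

$$\sum\limits_{j = 1}^{16} {\frac{{\partial F_{k} }}{{\partial u_{j} }}\frac{{\partial u_{j} }}{\partial p}} = - \frac{{\partial F_{k} }}{\partial p},\quad \frac{{\partial u_{j} }}{\partial p} = - \frac{{\det J_{j} }}{\det J},\quad J = \left[ {\frac{{\partial F_{k} }}{{\partial u_{l} }}} \right]_{k,l = 1}^{16}$$

where \(J_{j}\) is \(J\) with its j-th column replaced by \(\left( {\frac{{\partial F_{1} }}{\partial p}, \ldots ,\frac{{\partial F_{16} }}{\partial p}} \right)^{T}\). The minus sign arises from moving \(\frac{{\partial F_{k} }}{\partial p}\) to the right-hand side. It does not affect the condition number, because Eq. (27) takes the absolute value.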

Although the analytical expression of the derivative of any unknown parameter with respect to a velocity structure parameter in the wavefront forward modeling method can be obtained theoretically, the method involves 16 nonlinear equations and hence a \(16 \times 16\) function determinant, so writing out explicit expressions for the partial derivatives is both difficult and unnecessary (Fig. 3).

Fig. 3 The second rotation of the coordinate system

The numerator and denominator determinants are evaluated using the det function in the MATLAB software. Loops are used for the repetitive calculations, so that the values of the determinants and the partial derivatives are obtained automatically and efficiently. Then, Eq. (27) is used to calculate the condition number \(C_{p}\).
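The procedure can be sketched in Python on a toy problem. The study itself uses MATLAB and a system of 16 equations. The two-equation system below (\(x + y - p = 0\), \(xy - 1 = 0\)), the parameter value p = 2.5, and all function names are hypothetical stand-ins chosen so that the result can be checked by hand. The sketch follows the column-replacement construction described above, with the sign taken from the implicit function theorem.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def implicit_derivatives(jacobian, dF_dp):
    """d(unknown_j)/dp = -det(J with column j replaced by dF/dp) / det(J)."""
    denom = det(jacobian)
    if denom == 0.0:
        raise ValueError("Jacobi determinant is zero; theorem does not apply")
    derivs = []
    for j in range(len(jacobian)):  # loop over the unknown parameters
        replaced = [row[:] for row in jacobian]
        for k, row in enumerate(replaced):
            row[j] = dF_dp[k]
        derivs.append(-det(replaced) / denom)
    return derivs


def condition_numbers(unknowns, p, derivs):
    """Eq. (27) per unknown; None where the unknown is zero (undefined)."""
    return [abs(p * d / u) if u != 0.0 else None
            for u, d in zip(unknowns, derivs)]


# Toy system (hypothetical): F1 = x + y - p, F2 = x*y - 1, solved at p = 2.5.
p = 2.5
x, y = 2.0, 0.5
jacobian = [[1.0, 1.0],   # dF1/dx, dF1/dy
            [y, x]]       # dF2/dx, dF2/dy
dF_dp = [-1.0, 0.0]       # dF1/dp, dF2/dp

derivs = implicit_derivatives(jacobian, dF_dp)
cond = condition_numbers([x, y], p, derivs)
print("derivatives:", derivs)
print("condition numbers:", cond)
```

For this toy system, the derivatives are 4/3 and −1/3 and both condition numbers equal 5/3, which agrees with differentiating \(x + 1/x = p\) directly. An unknown whose value is zero has no defined relative error, so the sketch returns `None` for it rather than a number.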

In this section, we present several calculation examples with varying wave velocity ratios, station distances, burial depth ratios, and focal depths. The systems of 16 nonlinear equations in these examples are solved using MATLAB. Then, the method described above is used to calculate the partial derivatives and the condition numbers.

In Figs. 4, 5, 6 and 7, the symbols m1, f1, T1, c1, m2, f2, T2, c2, m3, f3, T3, c3, m4, f4, T4, and c4 represent \(\mu_{{1}}\), \(\varphi_{{1}}\), T1, \(\theta_{{1}}\), \(\mu_{{2}}\), \(\varphi_{{2}}\), T2, \(\theta_{{2}}\), \(\mu_{{3}}\), \(\varphi_{{3}}\), T3, \(\theta_{{3}}\), \(\mu_{{4}}\), \(\varphi_{{4}}\), T4, and \(\theta_{{4}}\), respectively. It should be noted that no condition number is plotted for the parameter \(\varphi_{{4}}\) in the graphs. For the station parameters and source positions used here, \(\varphi_{{4}}\) = 0. Since \(\varphi_{{4}}\) is the dependent variable \(f(x)\) in the denominator of Eq. (27), its condition number cannot be calculated and is omitted from the following analysis.

Fig. 4 Condition numbers of 16 unknown parameters for three velocity structure parameters under different wave velocity ratios

Fig. 5 Condition numbers of 16 unknown parameters for three velocity structure parameters under different station distances

Fig. 6 Condition numbers of 16 unknown parameters for three velocity structure parameters at different focal depths

Fig. 7 Condition numbers of 16 unknown parameters for three velocity structure parameters under different burial depth ratios

It can be seen from Fig. 4–1 that as the wave velocity ratio decreases, that is, as the wave velocity difference between the two layers of the medium increases, the condition number of each unknown parameter for the wave velocity \(v_{{{\text{layer}}1}}\) increases significantly. This indicates a high sensitivity to \(v_{{{\text{layer}}1}}\); that is, an error in \(v_{{{\text{layer}}1}}\) causes large relative errors in the location parameters. Except for \(\mu_{{4}}\), the largest condition number in Fig. 4–1 is less than 10. In Figs. 4–2 and 4–3, although the condition numbers for \(v_{{{\text{layer}}2}}\) and \(h_{i}^{1}\) vary differently, they are all less than 10. The condition number of \(\mu_{{4}}\) for \(v_{{{\text{layer}}1}}\) increases sharply to greater than 10 only at very small wave velocity ratios. Such small ratios are of theoretical interest only, because they cannot be observed in actual geotechnical engineering. For the common wave velocity ratios in geotechnical engineering, the condition number of \(\mu_{{4}}\) for \(v_{{{\text{layer}}1}}\) is much less than 10. Therefore, from the perspective of the wave velocity ratio, the wavefront forward modeling method generally does not exhibit ill-posedness.

As can be seen from Fig. 5, except for \(\mu_{{1}}\), \(\mu_{{2}}\), \(\mu_{{3}}\), and \(\mu_{{4}}\), which exhibit different variations, all of the remaining parameters exhibit good consistency, and the maximum condition number in Figs. 5–1 to 3 is less than 1.5, i.e., much less than 10. Therefore, from the perspective of the station distance, the wavefront forward modeling method does not exhibit ill-posedness.

It can be seen from Fig. 6–1 that, similar to Fig. 4, when the focal depth increases to a large value, the condition number of \(\mu_{{4}}\) for \(v_{{{\text{layer}}1}}\) is significantly larger than the condition numbers of the other unknown parameters, but it remains less than 5.5. The condition number for \(v_{{{\text{layer}}2}}\) in Fig. 6–2 is less than 2, and that for \(h_{i}^{1}\) in Fig. 6–3 is less than 0.4. Therefore, although the condition numbers of the unknown parameters for the velocity structure parameters exhibit an increasing trend, they are still below the critical value of 10, suggesting that the problem is not ill-posed as the focal depth changes.

It can be seen from Fig. 7–1 that the condition numbers of \(\mu_{{2}}\), \(\mu_{{4}}\), \(T_{{1}}\), \(T_{{2}}\), \(T_{{3}}\), and \(T_{{4}}\) for \(v_{{{\text{layer}}1}}\) initially increase and then decrease. The largest condition number occurs at a certain burial depth ratio between 0 and 1. The condition numbers for \(v_{{{\text{layer}}1}}\), \(v_{{{\text{layer}}2}}\), and \(h_{i}^{1}\) in Figs. 7–1 to 7–3 are all less than 2. Therefore, the problem does not exhibit numerical ill-posedness as the burial depth ratio changes.

The wavefront forward modeling method is a new formulation of the seismic source location problem via parameterization of the seismic source coordinates. Whether the condition number has reached an ill-posed level is directly related to the sensitivity of the unknown parameters to the location conditions, as well as the feasibility of the algorithm. An analysis of the condition numbers in Figs. 4, 5, 6 and 7 reveals that the condition number of each variable in the wavefront forward modeling method is low. For the condition parameters in micro-seismic location in geotechnical engineering practice, the condition numbers of all of the variables for the velocity structure parameters are much less than 10. Therefore, wavefront forward modeling is not an ill-posed problem, and conventional numerical calculations can be used to solve the problem.

6 Conclusions

The algorithm developed in this study is a medium-oriented algorithm. The starting point was to design the non-uniform velocity structure of the medium and to establish the corresponding seismic location algorithm in a targeted manner by gradually increasing the complexity of the non-uniform velocity structure. For a real three-dimensional velocity structure, the number of layers is arbitrary, which is a natural thought process when extending the two-layer medium to more complex velocity structures. In addition, according to the three-dimensional velocity structure of the medium layers in the coal mine area, the velocity is not only heterogeneous along the vertical direction but also exhibits lateral heterogeneity. A layered inclined medium is the simplest example of such lateral heterogeneity. In order to improve the accuracy of micro-seismic source location based on the three-dimensional velocity structure of the rock and soil layers in the mining area, it is critical to achieve seismic source location in a multi-layer inclined laterally heterogeneous medium.

(1) The wavefront forward modeling method only requires the P-wave arrival time differences at the observation stations, and the earthquake onset time is introduced into the nonlinear equation system as an unknown variable. The wavefront forward modeling involves only rounding errors; there are no modeling errors for either multi-layer horizontal or multi-layer inclined media.

(2) The wavefront forward modeling method is a location method applicable to any observation system. Under different conditions with varying burial depth ratios and medium heterogeneity, the location accuracy of the wavefront forward modeling method was demonstrated to be superior to those of the joint forward modeling and inverse location method and the classical linear method.

(3) By rotating the geodetic coordinate system, a new coordinate system that was parallel and orthogonal to the occurrence of the inclined medium was obtained, thereby converting the seismic source location in a multi-layer inclined medium to source location in a multi-layer horizontal medium under an irregular observation system.

(4) The orthogonal rotational transformation of the geodetic coordinate system in a multi-layer inclined medium was developed. Once the coordinates of the seismic source were calculated, they were de-rotated to the original geodetic coordinate system and the corresponding equations were obtained.

(5) The condition numbers of the 16 unknown parameters of the wavefront forward modeling method for three velocity structure parameters were calculated. The results showed that under the conditions encountered in geotechnical engineering practice, the condition numbers were much less than 10, indicating that the wavefront forward modeling method was not ill-posed. The errors in the velocity structure parameters did not lead to large errors in the unknown parameters, and hence, conventional numerical calculations could be used to solve the problem.