Introduction

Neural networks have received much attention in seismological research and the geosciences in recent years. Artificial Neural Networks (ANNs) have been shown to provide effective solutions to problems in pattern recognition and database retrieval, and to offer important new approaches to information processing because of their adaptability and ability to learn. The term neural network describes various topologies of highly interconnected simple processing elements that offer an alternative to traditional methods of computing.

The idea of a neural network was originally conceived as an attempt to model the biophysiology of the brain, i.e., to understand and explain how the brain operates and functions. Neural net models are specified by a net topology, node characteristics, and training or learning rules. These rules specify an initial set of weights and indicate how the weights should be adapted to improve performance during use. Four choices must be made in the design of a supervised network training application: the input and output structures, the hidden layer structure, the unit activation function, and the network training algorithm. These models are composed of many nonlinear computational elements operating in parallel and arranged in patterns reminiscent of biological neural nets. The computational elements, or nodes, are connected via the weights. The potential benefits of neural nets extend beyond the high computation rates provided by massive parallelism. Adaptation or learning is a major focus of neural net research.

The knowledge in an ANN is encoded in the hundreds of interconnecting weights. For example, consider a trained network, an ANN which has learned to solve a problem correctly for all the elements in the training set. The first layer, the input layer, accepts the individual components of the input vector and distributes them to all the units of the next layer. Each of the units of this hidden layer then computes a weighted sum of the received inputs and performs a nonlinear squashing operation and then distributes its output to each of the processing elements of the output layer. The exact architecture of an ANN is highly problem dependent, and the number of hidden units is usually based on some ad hoc rules of thumb, some of which are discussed.
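The forward pass described above can be sketched in a few lines of Python. This is a minimal illustration, not the network used in this study: the logistic squashing function, the 3-2-1 layer sizes, the hand-picked weights, and the function names are all assumptions made for the example, and the output layer is assumed to use the same squashing as the hidden layer.

```python
import math

def sigmoid(z):
    """Logistic squashing function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """Each unit forms a weighted sum of all inputs, then squashes it."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]

def network_forward(x, hidden_w, hidden_b, output_w, output_b):
    """The input layer distributes x; hidden and output layers transform it."""
    hidden = layer_forward(x, hidden_w, hidden_b)
    return layer_forward(hidden, output_w, output_b)

# Hypothetical 3-2-1 network with hand-picked weights (illustration only).
hidden_w = [[0.5, -0.4, 0.1], [-0.3, 0.8, 0.2]]
hidden_b = [0.0, 0.1]
output_w = [[1.2, -0.7]]
output_b = [0.05]

y = network_forward([1.0, 0.5, -1.0], hidden_w, hidden_b, output_w, output_b)
print(y)  # a single value strictly between 0 and 1
```

In a trained network the weights would come from the learning algorithm rather than being chosen by hand; the knowledge of the network lives entirely in those numbers.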

In recent years, several articles have been published on using neural networks to distinguish natural earthquakes from man-made events. ANNs are among the outstanding methods for handling classification problems by learning from examples. An ANN can be used for feature extraction as well as for the classification task itself; it can extract non-trivial features with large discriminative power and discover highly nonlinear structures in the signal space. In addition, ANNs are capable of forming disconnected decision regions and thereby operating as discriminators and clusterers simultaneously.

Bitto et al. (1989), AllamehZadeh and Nassery (1999), and Hush and Horne (1993) used autocorrelation estimates of the records as the training set for the Multi-Layered Perceptron (MLP) to discriminate between earthquakes and chemical explosions. Recent developments in neural network classifiers indicate that they are useful for solving many difficult seismological problems in discriminant analysis and pattern classification, with powerful theoretical support (Dysart and Pulli 1990). Neural classifiers can form complex, highly nonlinear, and disconnected regions in the feature space of classes with significant overlap (Kohonen 1990).

Dysart and Pulli (1990) employed the MLP to discriminate earthquakes from underground nuclear explosions. They used spectral amplitudes of the picked windows as a training database for the neural network and classified explosions and earthquakes based on the spectral ratios Pn/Sn and Pn/Lg. Alexander et al. (1992) applied a frequency-slowness seismic image as the input to a neural network to discriminate between earthquakes and nuclear explosions. AllamehZadeh and Nassery (1999) used ARMA filter coefficients and band-limited spectral ratios as discriminant parameters with Quadratic Neural Networks (QNNs). A qualitative comparison between the MLP and the QNN is made in this article.

In this article, we demonstrate an Autoregressive Moving Average (ARMA) classification system based on an ANN technique for seismic discrimination problems and introduce a new technique for building the system using a QNN. We show that a QNN is not only powerful as a classifier but is also useful for feature extraction, such as the determination of ARMA filter coefficients.

This approach has a number of operational advantages over conventional methods, chiefly a higher speed of learning, a smaller classification error rate, and better performance on small sets of seismic data.

Event Selection

The first stage of the identification is to classify a seismic event into one of two classes: natural earthquakes and man-made events, mainly chemical explosions. Considering the great importance of seismological studies in Iran, the Iranian National Broadband Seismic Network (INSN) was established in 1995 to study the seismotectonics of Iran, determine earthquake locations accurately, provide rapid announcements to rescue teams, and mitigate seismic hazards.

At present, INSN is equipped with 23 remote stations using Guralp systems installed by the International Institute of Earthquake Engineering and Seismology (IIEES). The stations are distributed across the country, and each consists of a broadband, three-component seismometer equipped with a 24-bit digitizer, a GPS antenna, and an acquisition computer. Continuous real-time data are transferred to a network center at the main building of IIEES through a very small aperture terminal (VSAT) satellite link. Reliable and rapid earthquake location and reporting to responsible persons via short message service (SMS) and the IIEES web site is one of the main achievements of IIEES. For a more detailed description of the seismic network and data-acquisition system, we refer readers to our web site (http://www.iiees.ac.ir).

The database used in this study comprises 25 chemical explosion records supplied by the U.S. Geological Survey–National Earthquake Information Center (USGS–NEIC) and 61 earthquake records obtained from INSN stations and USGS–NEIC data. Tables 1 and 2 show, respectively, the list of the events and the seismic stations. Short-period displacement records of the vertical components of P-waves were used. Most of the records had been digitized at a sampling rate of 20 Hz; however, several records had a sampling rate of 50 Hz. For each explosion event, we selected a single station, preferably at a distance of about 10° (Figs. 1, 2). Table 3 shows the names and locations of the selected stations.

Table 1 Location Coordinates from the INSN Stations and Codes of Some Frequent Regions
Table 2 Earthquakes’ Location Coordinates, Time, Date, Depth, and Magnitude from USGS–NEIC Catalogs Used in This Study
Figure 1 Map of the INSN stations used in this study

Figure 2 Location of events used in this study. Stars are explosions and circles are earthquakes. The paths between sources and stations are shown

Table 3 Explosions’ Location Coordinates, Time, Date, Depth, and Magnitude from USGS–NEIC Catalogs Used in This Study

Preprocessing of the Raw Data

The preprocessing stage is a critical process that transforms the data space into a feature space in order to remove redundancy from the recorded signals. The following preprocessing steps were performed. First, instrument correction was applied to all of the input records, based on the given instrument frequency response or pole-zero diagram. Second, in order to extract suitable features from the P-wave, each record was filtered through the 0.5–4 Hz band using a fourth-order band-pass Butterworth filter. Third, the appropriate window lengths for the P-waves were identified. Finally, feature extraction (parameterization) and classification by artificial neural nets were applied (Kim et al. 1993).

Discovering reliable seismic discriminants for a given purpose is usually difficult because it requires long-term gathering of expert knowledge, and the automation of feature extraction can become a rather complicated problem (see, e.g., Der et al. 1982). This is especially true for events recorded at long distances, whose records are related to the energy release of the event through a complex functional relationship. The elastic and anelastic response characteristics of the propagation paths and the response characteristics of the seismometers are undesired effects that usually obscure the inherent source properties (Der and Less 1985).

In this study, ARMA analysis of seismic P-waves recorded at regional distances was performed. The ARMA method, which is especially suited to calculating the spectra of stationary signals, was used for frequency analysis of the signals; it renders the frequency response as sharp peaks and valleys. As a result of this ARMA analysis in the frequency–time domain, frequency spectrum curves (histogram curves) were obtained for the seismic signals. As a further feature, P-coda/P spectra were calculated using the P signal (6–7 s).

Our final features are the corner frequency, slope, and constant level of the ARMA-gram curves, together with the values of the P-coda/P and S/N spectra sampled at a set of frequencies (13 features in 2 separate groups). The results obtained show that these methods are more accurate and efficient than other methods on account of using the nonlinear ANN technique.

ARMA Coefficients’ Determination Using Artificial Neural Network Technique

In the seismological literature, most studies on seismic signal classification are based on parametric statistical models. Extraction or estimation of the model parameters usually demands significant effort from the analyst.

Neural networks are a type of adaptive computing system that can learn from data and generalize from stored knowledge to produce appropriate outputs in response to new data. An interesting illustration of the application of ANNs to seismic discrimination is given by Dysart and Pulli (1990) for the development of mathematical models such as non-parametric pattern classifiers, nonlinear filters, and clustering algorithms. In their study, backpropagation with a learning rate was used for training. The QNN consists of simple quadratic processing elements (Fig. 3), called neurons, interconnected by links associated with numeric coefficients, or weights, indicating the strength of each connection (Kohonen 1990). A QNN can form complex and non-planar decision surfaces between populations with significant overlaps.

Figure 3 Block diagram showing the architecture of the artificial neural networks used in seismic discrimination

Neural networks are suitable for seismic signal problems with the following characteristics:

  • Nonlinearities, that is, finding a nonlinear function that will separate the earthquakes from explosions,

  • Dimensionality reduction, for example, when there are a large number of input parameters,

  • Noisy data due to weak local events,

  • Problem dependence on the various seismic source mechanisms, and

  • No known mathematical algorithm to determine the changing media or environment conditions.

AllamehZadeh and Lucas (1995) used an ANN to determine ARMA coefficients and showed that their method performed better than some of the then-existing ARMA coefficient methods because of the nonlinear nature of the ANN. A general three-layer neural network for determining the ARMA coefficients was reported in their article.

Detection and classification problems in seismology relate to discrimination between natural events, which include mainly tectonic and volcanic earthquakes, and artificial events, such as underground nuclear explosions, mine blasts, underwater explosions, and military explosions. Perhaps the most important aspect of the seismic source identification problem is that of finding proper discriminant parameters. We extracted the ARMA filter coefficients of the windowed P-wave phase through matrix manipulations, using conventional Prony ARMA modeling with coefficient determination by ANN. The derived coefficients are then applied to the QNN for training and classification. The power spectrum estimate of a few tens of seconds of seismogram can be described in terms of 5–10 ARMA coefficients. The changes in both the order and the values of the ARMA coefficients are diagnostic of the differences between noise and signal and can be exploited to characterize seismic data.
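For reference, the standard relation that lets a handful of ARMA coefficients stand in for a power spectrum can be written in the notation of Eq. (1). The white-noise input with variance σ² and the normalized angular frequency ω (ω = 2πf divided by the sampling rate) are assumptions introduced here for illustration and are not stated in the text:

$$ A(z) = 1 + \sum\limits_{k = 1}^{p} {a_{k} z^{ - k} } ,\quad B(z) = \sum\limits_{k = 0}^{q} {b_{k} z^{ - k} } ,\quad P_{y} (\omega ) = \sigma^{2} \frac{{\left| {B(e^{j\omega } )} \right|^{2} }}{{\left| {A(e^{j\omega } )} \right|^{2} }} $$

Under these assumptions the p + q + 1 coefficients fix the shape of the whole spectrum, so that zeros of A(z) close to the unit circle produce sharp peaks and zeros of B(z) close to the unit circle produce valleys.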

Consider a system described by the linear constant-coefficient difference equation (1). In the purely autoregressive case, the output y(n) is obtained using only the previous outputs, i.e., y(n − 1), y(n − 2), y(n − 3), …, y(n − p), and the current input x(n), which means that b(k) = 0 for k > 0 and only a(k) and b(0) must be determined. The general ARMA equation is given by

$$ y\left( n \right) = - \sum\limits_{k = 1}^{p} {a_{k} y(n - k)} + \sum\limits_{k = 0}^{q} {b_{k} x(n - k)} $$
(1)

The addition of an external input u(n) to the system results in

$$ y\left( n \right) = - \sum\limits_{k = 1}^{p} {a_{k} y(n - k)} + \sum\limits_{k = 0}^{q} {b_{k} x(n - k)} + \sum\limits_{k = 1}^{r} {c_{k} u(n - k)} $$
(2)
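As a minimal sketch of how the difference equation (1) is evaluated sample by sample, the recursion can be written directly in Python. The function name `arma_filter`, the zero initial conditions, and the example coefficients are assumptions made for illustration; the external-input term of Eq. (2) would add a third sum of the same form.

```python
def arma_filter(x, a, b):
    """Evaluate y(n) = -sum_{k=1..p} a_k y(n-k) + sum_{k=0..q} b_k x(n-k).

    a holds a_1..a_p and b holds b_0..b_q; samples before n = 0 are zero.
    """
    y = []
    for n in range(len(x)):
        ar_part = sum(a[k - 1] * y[n - k]
                      for k in range(1, len(a) + 1) if n - k >= 0)
        ma_part = sum(b[k] * x[n - k]
                      for k in range(len(b)) if n - k >= 0)
        y.append(-ar_part + ma_part)
    return y

# Hypothetical coefficients: a_1 = -0.5, b_0 = 1.0, b_1 = 0.5.
impulse = [1.0, 0.0, 0.0, 0.0]
response = arma_filter(impulse, a=[-0.5], b=[1.0, 0.5])
print(response)  # [1.0, 1.0, 0.5, 0.25]
```

The impulse response shows both parts at work: the moving-average terms shape the first two samples, after which the autoregressive feedback alone produces the geometric decay.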

Various methods have been reported in the literature for determining ARMA model coefficients, among which are direct methods including the least squares method, Padé approximation, Prony's method, Shanks' method, the autocorrelation method, and the covariance method.
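To make one of these direct methods concrete, the following sketch implements the autocorrelation method for the autoregressive part only, using the Levinson-Durbin recursion and the sign convention of Eq. (1). It is an illustration under stated assumptions, not the procedure used in this study: the function names and the biased autocorrelation estimate are choices made here, and the moving-average coefficients are not estimated.

```python
def autocorrelation(signal, max_lag):
    """Biased sample autocorrelation r(0..max_lag) of a real sequence."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i + lag] for i in range(n - lag)) / n
        for lag in range(max_lag + 1)
    ]

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for a_1..a_p of an all-pole model.

    Sign convention of Eq. (1): y(n) = -sum_k a_k y(n-k) + input.
    Returns (a, err): a = [a_1, ..., a_p], err = prediction-error power.
    """
    a = [1.0]  # a_0 is fixed at 1
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        reflection = -acc / err
        extended = a + [0.0]
        # Order update: a_j <- a_j + reflection * a_{m-j}, j = 0..m.
        a = [extended[j] + reflection * extended[m - j] for j in range(m + 1)]
        err *= 1.0 - reflection ** 2
    return a[1:], err

# Autocorrelation of an ideal first-order process y(n) = 0.5 y(n-1) + e(n),
# normalized so that r(0) = 1 (hypothetical test values).
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, err)
```

For this sequence the recursion recovers a_1 = −0.5 with a vanishing a_2, as expected for a first-order process; with real records the autocorrelation values would come from `autocorrelation` applied to the windowed signal.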

In this article, we introduce an application of ARMA coefficients to seismic signal classification using ANN as shown in Figure 3.

The ARMA–ANN coefficient-determination algorithm is given below:

  1. Initialize the weight vectors and set the polynomial order to R = 2.

  2. Select the training input and output pairs for the network (the input pattern and the ARMA section), using six coefficients for each signal, so that six coefficients are extracted from each signal file.

  3. Run the selected pattern.

  4. Evaluate the error, then back-propagate the error and update the weights (input to hidden layer).
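The four steps above can be illustrated with the simplest possible case: a single linear neuron trained by gradient descent to predict each sample from the preceding ones, so that its weights converge to the autoregressive coefficients. This is a sketch under stated assumptions, not the network of this study: it estimates only the autoregressive part, uses order 2 rather than six coefficients, trains on a synthetic signal, and all names, the learning rate, and the epoch count are hypothetical.

```python
import random

def simulate_ar(a, n_samples, seed=0):
    """Generate y(n) = -sum_k a_k y(n-k) + e(n), e(n) unit-variance Gaussian."""
    rng = random.Random(seed)
    y = []
    for n in range(n_samples):
        past = sum(a[k] * y[n - 1 - k]
                   for k in range(len(a)) if n - 1 - k >= 0)
        y.append(-past + rng.gauss(0.0, 1.0))
    return y

def train_linear_neuron(y, order, learning_rate=0.1, epochs=100):
    """Fit weights w so that sum_k w[k] * y(n-1-k) predicts y(n).

    In the sign convention of Eq. (1), the estimate of a_k is -w[k-1].
    """
    weights = [0.0] * order                                 # step 1
    pairs = [([y[n - 1 - k] for k in range(order)], y[n])   # step 2
             for n in range(order, len(y))]
    for _ in range(epochs):
        gradient = [0.0] * order
        for inputs, target in pairs:                        # step 3
            prediction = sum(w * x for w, x in zip(weights, inputs))
            error = prediction - target                     # step 4
            for k in range(order):
                gradient[k] += error * inputs[k]
        # Weight update from the error averaged over all patterns.
        weights = [w - learning_rate * g / len(pairs)
                   for w, g in zip(weights, gradient)]
    return [-w for w in weights]

true_a = [-0.5, 0.3]  # hypothetical AR(2) coefficients
signal = simulate_ar(true_a, n_samples=2000)
estimated_a = train_linear_neuron(signal, order=2)
print(estimated_a)  # close to true_a, up to sampling noise
```

With one layer and a linear unit, the back-propagated error reduces to the delta rule; a multilayer network would pass the same error signal back through its hidden layer.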

An automatic strategy for classification of a vector is built into a two-layer feed-forward neural network. In the most general sense, a quadratic neuron is described as a quadratic form defined in the input space, whose numeric value for an input can be positive or zero, depending on the position of that input point (AllamehZadeh and Nassery 1999). Compatibility with standard statistical techniques and a higher speed of classification are the main advantages of employing such networks. The available data have been divided into training and testing sets using a Leave-One-Out strategy.
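A quadratic neuron of the kind described above can be sketched as follows. The exact parameterization is an assumption: here the unit computes a full second-order form with a matrix W, a linear term v, and a bias b, followed by a logistic squashing function, and the parameter values are hypothetical.

```python
import math

def quadratic_neuron(x, W, v, b):
    """Squashed quadratic form: sigmoid(x^T W x + v^T x + b)."""
    quadratic = sum(x[i] * W[i][j] * x[j]
                    for i in range(len(x)) for j in range(len(x)))
    linear = sum(vi * xi for vi, xi in zip(v, x))
    return 1.0 / (1.0 + math.exp(-(quadratic + linear + b)))

# Hypothetical parameters: the boundary x1^2 + x2^2 = 1 is a circle, a
# closed, non-planar decision surface that a single linear unit cannot form.
W = [[1.0, 0.0], [0.0, 1.0]]
v = [0.0, 0.0]
b = -1.0

inside = quadratic_neuron([0.2, 0.1], W, v, b)
outside = quadratic_neuron([1.5, 1.0], W, v, b)
print(inside < 0.5, outside > 0.5)  # True True
```

A single unit of this form already separates a closed region from its surroundings, which is why fewer hidden nodes may suffice than in a network of linear-threshold units.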

Classification Results and Accuracy Measures

In this study, ARMA coefficients with a QNN have been used for the analysis of seismic signals in a manner completely different from that of previous networks. The system employed here generates a dynamic piecewise model for the signal according to the latest information derived from it. The network used in this article detects the type of the source whenever a natural or artificial source changes the nature of the background noise of the seismograms. The structure of the ARMA model obtained using the QNN is depicted in Figure 3. The input, hidden, and output layers of this network contain 6, 4, and 1 neurons, respectively.

The two-layer QNN is assumed for the first and third cases, with eight neurons in the input layer corresponding to the eight ARMA features.

The best number of hidden units depends in a complex way on the numbers of input and output units, the number of training cases, the amount of noise in the targets, the complexity of the function or classification to be learned, the architecture, the type of hidden unit activation function, and the training algorithm.

We have calculated the number of hidden nodes by the general rule [Ni (number of inputs) + No (number of outputs) * (2/3)] using the commercial MATLAB neural network software.

In most situations, there is no way to determine the best number of hidden units without training several networks and estimating the generalization error of each. If we have too few hidden units, we will get high training error and high generalization error due to underfitting and high statistical bias. If we have too many hidden units, we may get low training error but still have high generalization error due to overfitting and high variance.

Table 4 shows the results of a Leave-One-Out testing scheme. Because the sample size was small, each method was evaluated by the Leave-One-Out method. For a sample size of n, the training is done with n − 1 samples, and the remaining sample is then classified and checked for accuracy. The procedure is repeated until every sample has been used once as the test sample. For small samples, this method generally gives a more reliable estimate of accuracy than a single division of the data into a training set and a test set. For our new network, an initial value of 10 was chosen for the number of output neurons, and the learning algorithm then reached an optimum value of six automatically.
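The Leave-One-Out procedure itself is independent of the classifier, as the following sketch shows. A nearest-centroid rule stands in for the network purely to keep the example short, and the two-feature toy data, labels, and function names are hypothetical; none of them come from this study.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Assign the label whose class mean is closest to the sample."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        dist = sum((s - c) ** 2 for s, c in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out_accuracy(features, labels, predict):
    """Train on n - 1 samples, test on the held-out one, repeat n times."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)

# Hypothetical two-feature toy data (not the seismic data of this study).
features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
            [1.0, 1.1], [0.9, 1.0], [1.2, 0.8]]
labels = ["earthquake"] * 3 + ["explosion"] * 3
accuracy = leave_one_out_accuracy(features, labels, nearest_centroid_predict)
print(accuracy)  # 1.0
```

Because every sample serves once as the test case, all n records contribute to the accuracy estimate, at the cost of training the classifier n times.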

Table 4 International Station Coordinates Used in This Study and Codes of Some Frequent Regional Sites

The second column of Table 4 shows the number of clusters found (for the first two methods) or reached (for the last method) by each of the algorithms. The error rates are derived for each case and the average numbers of the iterations are also given in the third and fourth columns of Table 4.

Table 5 shows the complete experimental results of using the characteristic curves as a discriminator of earthquakes from chemical explosions. In this table, the QNN memory length used for the ARMA model and the accuracy ratio are varied to reach the best rate of correct classification. Each cell of the table shows the ratio of the number of correctly classified events to the total number of events, as a percentage (Table 6).

Table 5 Experimental Results for Classification Between Earthquakes and Explosions
Table 6 Discriminant Analysis Classification

When the accuracy ratio is increased while the memory length of the window is relatively small, the error increases because the ARMA prediction by the QNN is imprecise and the resulting data clustering is consequently improper. On the other hand, for longer QNN memory lengths and low accuracy ratios, the QNN may get trapped in local minima, which increases the probability of error. The table shows that a QNN memory length of 12 with an accuracy ratio of 0.95 yields the best successful classification rate of 0.81 (81%).

This provides a good measure of confidence in the classification and QNN develops an approximation to the decision regions of each class. Its adaptive capabilities allow it to be used in problems in which there is little prior knowledge of pattern class distributions.

This property leads to excellent performance in classification tasks, as shown in the “Preprocessing of the Raw Data” section. One of the key features of QNN nets is their amenability to generalization, a property that can be exploited to reduce the number of hidden nodes in classification tasks. As a result of normalization, the output activities turn out to be activity-weighted averages of the input weights, in which the weights from the most active inputs contribute most to the value of the output activity. In other words, the roles of the output weights and the hidden nodes’ activities are interchanged. In standard QNN nets, the weights determine how much each hidden node contributes to the output. In normalized QNN nets, the activities of the hidden nodes determine which weights contribute most to the output.

Conclusion

The discrimination of small earthquakes from explosions based on the seismic signal recorded at regional distances is an important and difficult task. In this study, a flexible, automated technique for processing seismograms to discriminate between small earthquakes and explosions was developed. A thorough investigation of the discrimination capabilities of the technique was undertaken using regional event seismograms from 61 earthquakes and 25 chemical explosions recorded by the USGS–NEIC. The proposed technique for seismic discrimination is founded on the statistical approach and uses measurements of the ARMA coefficients with QNNs. Our experiments showed that the conventional ARMA filter-coefficient discriminants were included in the optimal feature set by the automated feature-selection procedure, along with other, regionally dependent, features.

Our results for discrimination are based on events selected from the Semipalatinsk test site recorded at various international seismic stations (CHTO, ANTO) and IIEES stations. We have presented detailed results of classification using ANNs and have shown how accuracy can be obtained based on ARMA coefficients. We have also demonstrated that QNNs with ARMA coefficient measurements are able to classify seismic data, and we have presented an experimental comparison with a conventional MLP on a pattern classification problem.

Development of an advanced feature-selection procedure might be accomplished by using the theoretical error probability formulated for nonlinear classification rules. Such ANNs have been derived from the quadratic discriminator, and it is natural to employ them at the stage of feature selection if the final decision making is to be carried out by the quadratic discriminator.

The work outlined in this study compares conventional MLPs and QNNs on a pattern classification problem involving a set of seismic data. This ARMA dataset is particularly challenging for any pattern classifier because of the extensive overlap between samples belonging to different classes. The comparison of the decision surfaces for each class with the structure of the input space verified that QNNs are capable of representing both well-separated and highly overlapping classes in the input feature space.

An important contribution of this study was to show that a backpropagation neural network can be used to determine ARMA parameters. The estimation of a continuous parameter is a task that stands in contrast to a discrimination or classification problem, such as identifying an event. For a QNN network, a good measure of confidence in the classification is achieved with a classification error between 10% and 25%.

This dataset has been extensively used to compare different pattern classification approaches since it contains well-separated and highly overlapping classes on the input feature space.

It was experimentally verified that a leave-one-out strategy is not the best approach to assigning class labels.

The QNN has been used for the analysis of seismic signals in a manner completely different from that of previous networks. This study is based on the fact that, just moments before the onsets of most natural earthquakes, activity in the Earth’s interior can affect the seismographic characteristics captured by the ARMA filter coefficients. These pre-event changes may sometimes be used as precursors of future seismological events.

The results obtained so far indicate that there is considerable justification for using the method of ARMA coefficients with QNN to discriminate between natural earthquakes and artificial events, and thus this can be adopted as an attractive alternative for practical pattern-recognition applications.