A proposed data acquisition system and algorithm for signal processing of moving-coil geophone’s output

The study of different types of vibrational and seismic movements is important for exploration in strata monitoring, machine health monitoring, earthquake detection, etc. To study these vibrational movements, they must first be acquired for analysis. Data acquisition using seismic sensors is a challenging task. This paper presents a data acquisition system developed to acquire seismic signals from a moving-coil geophone. The paper also discusses a signal interpretation algorithm devised to automatically detect the occurrence of a seismic event by separating the waveform and non-waveform components in the sensor’s output using a Gaussian naive Bayes classifier and the kernel density estimation technique. The proposed method is effective in identifying a useful signal and the nature of its origin. The accuracy of the algorithm was 99% for waveform classification. The sensitivity of the data acquisition system for the seismic sensors was 1.589 µm s⁻¹. Further, the developed data acquisition system and algorithm can be used in mines for seismological studies aimed at separating the vibration signal generated by explosions from the one caused by the Earth’s tectonic and seismic activities.


Introduction
Mining is renowned for being one of the most hazardous sectors in the world due to its complex work environment. Underground coal mining is considered one of the most challenging jobs due to the confined working space, harsh and unpredictable working conditions, and continuously varying environmental parameters such as flammable and toxic gases, temperature and humidity, as well as complex geo-mining conditions. When these parameters exceed a certain limit, conditions become hazardous for miners working underground. The potentially hazardous atmosphere, strata and hydrological conditions inside underground mines have often proved fatal for miners. Underground coal mines are particularly vulnerable to disasters, and the lives of miners are at risk due to collapse of mine pillars, roof fall, inundation, air blast, presence of gases and coal dust, and occurrence of explosion and mine fire. Such events cause fatalities, injuries, loss of property and production interruption. Miners in underground coal mines are exposed to several risk conditions during their work that may cause loss of life or serious injury, with direct and indirect costs for employees and employers. Accidents in underground coal mines can often have catastrophic consequences. As per the statistical report of the Directorate General of Mines Safety (DGMS), India, around 2460 fatal accidents were recorded during 1992-2015 in Indian coal mines, in which 2990 miners died and 446 miners were injured, whereas during the same period around 15,991 serious accidents occurred in which 16,413 miners were injured [7]. Further, the major causes of disasters in Indian underground coal mines during 1901-2007 were roof fall (57%), side fall (10%), explosion (16%), inundation (14%) and fire/gas (3%), excluding other causes of mine hazards.
Therefore, there is an urgent need to develop an intelligent system for real-time monitoring and prediction of the underground mine parameters that cause the above hazards, and for giving audio-visual warning to working miners before the occurrence of impending hazards, so that the lives of miners and the mines' property can be saved.
Many mines around the world face the problem of seismic activity. The vibration caused by even small sources must be considered potentially hazardous for both the mining infrastructure and the personnel deployed in mines [20]. Strata failure is also caused by seismic events in mines. Seismic activity caused by mining affects both the safety and productivity of mines. Mining-induced seismic events, which are caused by the sudden release of strain energy accumulated in the coal-rock mass during different mining activities, are associated not only with superficial structure movement triggered by stress manifestation, but also with large geological discontinuities as well as the extent and means of mining [8,18]. Rock bursts and strong tremors induced by mining activity result in damage to underground workings or surface buildings and, in some cases, injury and loss of life [32]. Mining-induced seismic events thus pose considerable danger to underground miners [10,23].
The rotating machinery used in industries requires continuous monitoring of roller and ball bearings. Failures that generally occur in machines are due to the unexpected overloading condition, inadequate lubrication or ineffective sealing which may cause machine breakdowns and fatal accidents [6]. Because of these failures, vibrational signals of different recurring nature occur in the machine which can be monitored for machine health. Health monitoring of equipment with rotating parts can also be done through vibration monitoring [11]. Hence, there is a need for continuous health monitoring and vibration monitoring as it is an efficient tool for machine health condition monitoring [1,30].
Further, there is a requirement for an early detection system to provide early warning and minimize the disaster caused by an earthquake [27]. Thus, there is a need for continuous monitoring of information relevant to vibrational movements.
The acquisition of vibrational or seismic data is required in various applications such as strata and micro-seismic movement monitoring in underground mines, machine health monitoring, seismic exploration for earthquake prediction, etc. Several systems for vibrational data acquisition have been developed [26,31]. However, these are very expensive and do not include data analysis features. Also, the efficiency of a seismic monitoring system mostly depends on the accuracy of the seismic sensors used. Generally, seismic measurement applications use moving-coil transducers such as geophones. A geophone is a velocity-detecting transducer that transforms mechanical vibration into a voltage signal [34].
It is an active transducer that requires an auxiliary data acquisition system for measuring the electrical signal and converting it into digital form to make the data usable for further processing. However, the data acquisition systems available in the market are usually expensive and complex [22]. The signals from the data acquisition system are analyzed either manually or through dedicated software. Seismic data analysis also involves classifying the signal into seismic events and vibration caused by explosion or blasting [9].
The manual methods are generally time-consuming and prone to human error, whereas the software analysis methods are expensive. Recently, however, various researchers have applied artificial intelligence to signal data using different seismic analysis code formats [5,13]. Therefore, to overcome the above issues related to vibrational data acquisition and analysis, a data acquisition system using analog electronic circuits has been developed. The system interprets the moving-coil geophone's output with the help of an algorithm developed using a Gaussian naive Bayes classifier and the kernel density estimation method. The seismic sensor used in this system is a moving-coil geophone. The system provides 16-bit analog-to-digital conversion. This resolution is solely dependent on the analog-to-digital converter module employed in the circuit. The output signal's quality generally depends on the proper selection and application of an analog-to-digital converter [29].
Geophone's output is fed to the analog-to-digital conversion circuit for conversion of analog signal into its digital form. Subsequently, the signal processing algorithm using Gaussian naive Bayes classifier and kernel density estimation method uses the digitized signal. The algorithm facilitates interpretation of the digital signal into a piece of meaningful seismic information.
This paper describes the developed data acquisition system and the signal interpretation algorithm for automatic detection of seismic events, and presents a performance analysis of the system.

System architecture
The developed system combines a data acquisition system with a signal processing algorithm based on a Gaussian naive Bayes classifier and kernel density estimation, for acquiring data from the geophone and processing them into a meaningful output. Figure 1 presents a schematic diagram of the system, which consists of a moving-coil geophone, a data acquisition module and a signal interpretation module. The system is capable of data acquisition and signal processing of any vibration or seismic signal.
The system uses a geophone as the primary sensor. This kind of active transducer is good at sensing medium- and high-frequency disturbances but performs poorly with low-frequency perturbations [3]. The proposed signal interpretation algorithm was implemented in the Python programming language. Classification and clustering methods were studied and found to be suitable for signal interpretation, as the analysis of seismic signals primarily comprises identifying the occurrence of seismic movements. Whenever seismic activity or any vibration occurs, the sensor's output signal shows a noticeable change in magnitude. Hence, classification and clustering help identify such changes in amplitude by separating the output data into waveform and non-waveform components.

Geophone
A geophone is an active transducer that converts mechanical vibrating motion into a voltage signal. Seismic activity monitoring applications utilize these types of sensors. Based on their construction and principle of transduction, geophones are of two types, namely (i) moving-coil geophones and (ii) piezoelectric geophones [16,17]. Piezoelectric sensors have high sensitivity and are generally utilized in seismic exploration, which requires capturing and monitoring weak signals. Moving-coil geophones are more rugged and less sensitive [18]. They work on the principle of Faraday's law of electromagnetic induction [9]. The frequency range over which they detect a signal depends on the model used [18]. They provide seismic data with high fidelity [14]. An SM-24 geophone (Make: Sensor Nederland B.V) was used for conducting experiments with the developed system and algorithm. This geophone has a frequency range of 10 to 240 Hz, a sensitivity of 28.8 V/(m/s), and a natural frequency of 10 Hz (Table 1).
A moving-coil geophone can be modeled as a spring-mass-damper system representing the transducer structure, by replacing the spring with the coil and the mass with a combination of mass and a permanent magnet [35]. The model can be expressed by a transfer function Ho(s), where h = system damping coefficient and ωo = corner frequency.
According to Table 1, a geophone with the calibration shunt resistance of 1 kΩ has damping equal to 0.69. Figure 2 presents the Bode plot for the geophone. The plot represents the magnitude and phase response of the geophone, which denotes the variation of the sensor's phase and magnitude according to the frequency.
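A magnitude and phase response of this kind can be sketched numerically. The example below is only an illustration under stated assumptions: it takes the standard normalized second-order high-pass form $H_o(s) = s^2 / (s^2 + 2 h \omega_o s + \omega_o^2)$ for a velocity sensor, which may differ from the exact expression used for Fig. 2 (for example by a gain factor), and uses the damping (0.69) and natural frequency (10 Hz) quoted in the text. The function name `geophone_response` is hypothetical.

```python
import numpy as np

def geophone_response(freq_hz, f0_hz=10.0, damping=0.69):
    """Normalized response of an assumed second-order high-pass geophone model.

    Assumed form: H(s) = s^2 / (s^2 + 2*h*w0*s + w0^2), evaluated at s = j*w.
    """
    w = 2.0 * np.pi * np.asarray(freq_hz, dtype=float)
    w0 = 2.0 * np.pi * f0_hz
    s = 1j * w
    return s**2 / (s**2 + 2.0 * damping * w0 * s + w0**2)

# Evaluate the response at a few frequencies across the stated 10-240 Hz band.
freqs = np.array([1.0, 10.0, 100.0, 240.0])
H = geophone_response(freqs)
magnitude_db = 20.0 * np.log10(np.abs(H))
phase_deg = np.degrees(np.angle(H))

for f, m, p in zip(freqs, magnitude_db, phase_deg):
    print(f"{f:6.1f} Hz  {m:7.2f} dB  {p:7.2f} deg")
```

Under this assumed model the magnitude at the natural frequency equals 1/(2h), and the response flattens towards unity well above 10 Hz, which matches the qualitative statement that the sensor performs poorly at low frequencies.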

Data acquisition module
Data acquisition is a very significant aspect of an instrumentation system. Conversion of an analog signal into digital form is the essential function of any data acquisition system, since most analysis and monitoring applications require digital data for processing. The principal aim of a data acquisition system is to acquire data with high accuracy and a high sampling rate [21]. Figure 3 presents a functional block diagram of the developed data acquisition system. The major components of the system are (i) sensor input unit, (ii) analog-to-digital converter module (ADS1115), (iii) microcontroller (Arduino Uno/ATmega328), and (iv) computing unit/digital data processing unit. The ADS1115 module has been used for analog-to-digital conversion (ADC). It includes an input multiplexer that allows measurement from two differential input nodes, making it compatible with geophones, which also have two output terminals. One advantage of the ADS1115 ADC chip is its low power requirement [12].
Whenever there is a vibrational movement, the geophone senses it and converts it into an electrical signal, which is sent to the data acquisition system. The ADC module (ADS1115) converts the received electrical signal into digital form and sends it to the computing module. The microcontroller plays the role of the control unit in the system. An Arduino Uno development board, which contains an ATmega328 microcontroller, has been used. It controls sensing and data conversion through the embedded program. Figure 4 presents the output waveform obtained when an event occurs. The peak in the waveform plot represents the occurrence of a vibrational movement. Once the computing unit receives the data, it pre-processes them by checking for null values and then passes them to the developed algorithm for further processing.

Circuit diagram
The geophone is connected to the ADS1115 through a set of resistors. According to the SM-24 datasheet, a 1 kΩ resistor (R2) needs to be connected between the geophone terminals to obtain a flat, spike-free response curve [25]. Two resistors (R1 and R3) with values equal to or above 1 kΩ are connected to the input terminals of the ADS1115 to limit the current entering it.
The ADS1115 module communicates with the microcontroller through an I2C interface, which uses two wires: serial clock (SCL) and serial data (SDA). I2C is a synchronous serial communication protocol in which data are transferred bit by bit over the SDA line. The ADS1115 acts as a slave device when connected to the microcontroller. Since the I2C bus involves only two lines (SDA and SCL), only two pins of the microcontroller need to be connected to the ADC [24].
The SDA and SCL pins of the ADS1115 module are connected to the A4 and A5 analog pins of the Arduino Uno, respectively [15]. Apart from being analog pins, A4 and A5 are also dedicated to I2C communication on the Arduino Uno. Figure 5 shows the circuit diagram of the geophone data acquisition system.

Sensitivity calculation for the system
Sensitivity of the 16-bit ADC (ADS1115) for a 3 V full-scale range [14] is:

$$S_{\text{ADC}} = \frac{3\ \text{V}}{2^{16}} \approx 45.776\ \mu\text{V}$$

The sensitivity of the geophone is 28.8 V/(m/s) (Table 1). Hence, the overall system's sensitivity is:

$$S_{\text{system}} = \frac{45.776\ \mu\text{V}}{28.8\ \text{V/(m/s)}} \approx 1.589\ \mu\text{m s}^{-1}$$

Therefore, any movement in the ground with a velocity equal to 1.589 µm s⁻¹ can be converted into a digital signal by this system if there is no noise in the system.
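The arithmetic behind this figure can be checked with a few lines of Python. The sketch below uses only the values stated in the text (3 V full scale, 16 bits, 28.8 V per m/s); the helper `counts_to_velocity` is a hypothetical convenience for turning a difference in ADC counts into ground velocity, and it ignores noise, gain settings and the loading effect of the input resistors.

```python
FULL_SCALE_V = 3.0      # full-scale range stated in the text
ADC_BITS = 16           # resolution of the ADS1115
GEOPHONE_SENS = 28.8    # geophone sensitivity in V per (m/s), from Table 1

# Voltage represented by one ADC count.
adc_step_v = FULL_SCALE_V / 2**ADC_BITS

# Ground velocity that produces exactly one count of output.
min_velocity = adc_step_v / GEOPHONE_SENS

def counts_to_velocity(counts):
    """Convert a difference in ADC counts to ground velocity in m/s (idealized)."""
    return counts * adc_step_v / GEOPHONE_SENS

print(f"ADC step: {adc_step_v * 1e6:.3f} uV")
print(f"Smallest resolvable velocity: {min_velocity * 1e6:.3f} um/s")
```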

Signal interpretation algorithm module
The stochastic behavior of vibrational data makes precise prediction difficult [28]. The moving-coil geophone and the designed data acquisition circuit were installed in the laboratory under simulated conditions similar to an underground mine, and vibration was generated artificially. The data recorded by the installed geophone were exported to an Excel sheet for further analysis. Vibration data were analyzed for event detection, waveform classification and arrival picking. The main focus of this paper is waveform classification using a combination of a Gaussian naive Bayes classifier and the kernel density estimation technique. Waveform classification is a technique that divides a seismic wavelet into clusters based on the defined waveform shape. Detection of seismic activity involves the use of these techniques. The main aim of the algorithm is to reduce manual effort by clustering the data from the data acquisition module into two cluster labels: the first cluster with waveform components and the second with non-waveform components. Waveform components are the events that occur due to any seismic or any expected vibrational activity, whereas non-waveform components are unwanted perturbations. Hence, the developed algorithm identifies the waveform and non-waveform components using clustering and classification methods. Time-domain and frequency-domain techniques are generally used for these kinds of analysis. Common time-domain methods are short-time average over long-time average (STA/LTA) triggering, amplitude threshold triggering (ATT), and the Z detector [9], while the fast Fourier transform (FFT) and power spectral density are among the most used methods in the frequency domain. Before running the developed algorithm, an FFT was computed on the recorded data. The FFT is a computational tool that facilitates signal analysis in the frequency domain [2].
It shows the intensity of the vibration components at each frequency. Python was used as the programming language to implement these algorithms. Figure 6 presents the FFT plot obtained by executing the code, plotted as frequency (Hz) versus magnitude (dB).
The plot in Fig. 6 shows that many signal components appeared at around 30 Hz, which indicates that when there were no vibrations, the signal frequency remained around 30 Hz. However, frequency-domain analysis alone does not capture the essential characteristics of a vibration motion. Hence, the algorithm based on a Gaussian naive Bayes classifier and kernel density estimation has been developed.
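A frequency-versus-magnitude spectrum of the kind described for Fig. 6 could be computed along the following lines. This is a minimal sketch rather than the code used in the study: the 500 Hz sampling rate, the synthetic 30 Hz test signal and the function name `magnitude_spectrum_db` are assumptions made purely for illustration.

```python
import numpy as np

def magnitude_spectrum_db(signal, sample_rate_hz):
    """Return (frequencies in Hz, magnitude in dB) of a real-valued signal."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()                  # remove the DC offset
    spectrum = np.fft.rfft(signal)                   # one-sided FFT
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate_hz)
    magnitude = np.abs(spectrum) / signal.size
    # Floor the magnitude to avoid log10(0) for empty bins.
    magnitude_db = 20.0 * np.log10(np.maximum(magnitude, 1e-12))
    return freqs, magnitude_db

# Synthetic stand-in for recorded data: a 30 Hz component plus noise.
rng = np.random.default_rng(0)
fs = 500.0                                           # assumed sampling rate
t = np.arange(1000) / fs
x = np.sin(2 * np.pi * 30.0 * t) + 0.1 * rng.standard_normal(t.size)

freqs, mag_db = magnitude_spectrum_db(x, fs)
peak_hz = freqs[np.argmax(mag_db)]
print(f"Dominant frequency: {peak_hz:.1f} Hz")
```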
The developed algorithm comprises four steps: (i) kernel density computation, (ii) estimation of the local maxima and minima values, (iii) clustering based on the local maxima and minima values, and (iv) implementation of a Gaussian naive Bayes classifier model, which was trained and tested on the clustered data obtained by kernel density estimation. The Gaussian naive Bayes classifier is a machine learning tool, while kernel density estimation is a statistical tool that can be used to cluster one-dimensional data [19]. Kernel density estimation was first used to cluster the data, and the same data were then used to train a model based on the Gaussian naive Bayes classifier so that it can be used later for classification of seismic data [16]. Figure 7 shows the flowchart of the developed algorithm.
The developed algorithm's primary purpose is to perform waveform classification of the signals coming from the sensor. It classifies the data recorded from the sensor into waveform and non-waveform components, where the waveform component is the part of the data that represents useful information, while the non-waveform component is the part of the data that represents noise. This kind of classification has applications in various fields, for example, in the classification of sound waveforms for speech recognition, classification of seismic waveforms for detection of an earthquake, and classification of signals generated by the brain to understand the mechanism of functioning of the brain for various research purposes.

Implementation of the algorithm
Initially, the sensor's output data were pre-processed in Python by checking for null values and for recorded entries with incorrect formats. After pre-processing of the output data, the following steps were performed:

a) Kernel density computation
Kernel density computation was performed on the pre-processed sensor data. Kernel density estimation is a statistical technique that predates the Python programming language; here it was implemented in Python using the 'KernelDensity' estimator in the 'sklearn.neighbors.kde' library. The estimator calculates the probability density function at each point, governed by the 'kernel' and 'kernel bandwidth' parameters. Figure 8 represents the plot obtained by executing it on the datasets created from the data acquisition system's output.

b) Estimating the local maxima and minima
The local maxima and minima of the data distribution density were determined using the 'argrelextrema' function from the 'scipy.signal' library. Cluster 1 has very few points, while cluster 2 has a larger number of points (Fig. 9). Cluster 1 represents the non-waveform component of the dataset, while cluster 2 denotes the dataset's waveform component. Because artificial vibration was produced for most of the time during recording in the laboratory, most data points fall in cluster 2.

c) Implementation of the Gaussian naive Bayes classifier model, and training and testing using the kernel density estimation clustering data
Gaussian naive Bayes is one of the simplest classifiers based on Bayes' theorem, with the naive assumption that the predictors are independent of each other. Since it is a supervised machine learning algorithm, it needs training sets with several records showing whether each one is a waveform component or a non-waveform component [33]. Hence, the clustered data obtained from the kernel density estimation output were used for training and testing the classifier.
Once the clustering was done by the kernel density estimation method, the clustered data were split in an 80/20 ratio, where 80% of the clustered data were used to train the Gaussian naive Bayes classifier model, while the remaining 20% were used to test the classifier. This was executed using code written in the Python programming language with the 'GaussianNB' class from the 'sklearn.naive_bayes' library.
The developed algorithm used Python programming language in Jupyter notebook environment for its implementation. The algorithm used the dataset which was recorded by the developed data acquisition system for testing.
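The four steps described above can be sketched end to end as follows. This is an illustrative outline under several assumptions, not the implementation used in the study: the input is synthetic one-dimensional amplitude data, the bandwidth of 0.5 is arbitrary, the split into two clusters uses only the deepest local minimum of the density as the cut point, and `KernelDensity` is imported from the public `sklearn.neighbors` path.

```python
import numpy as np
from scipy.signal import argrelextrema
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KernelDensity

# Synthetic stand-in for pre-processed sensor readings (hypothetical values).
rng = np.random.default_rng(42)
samples = np.concatenate([
    rng.normal(0.0, 0.3, 300),   # smaller group of readings
    rng.normal(5.0, 0.5, 700),   # larger group of readings
])
X = samples.reshape(-1, 1)

# (i) Kernel density computation on a regular grid over the data range.
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X)
grid = np.linspace(samples.min(), samples.max(), 500)
density = np.exp(kde.score_samples(grid.reshape(-1, 1)))

# (ii) Local maxima and minima of the estimated density.
maxima_idx = argrelextrema(density, np.greater)[0]
minima_idx = argrelextrema(density, np.less)[0]

# (iii) Clustering: cut the data at the deepest local minimum (a simplification).
cut = grid[minima_idx[np.argmin(density[minima_idx])]]
labels = (samples > cut).astype(int)

# (iv) Train and test a Gaussian naive Bayes model on an 80/20 split.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=0, stratify=labels
)
model = GaussianNB().fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"Cut point: {cut:.2f}, test accuracy: {accuracy:.3f}")
```

Because the classifier is trained on labels produced by the clustering step, its test accuracy measures how well it reproduces the density-based clustering rather than agreement with independently labeled events.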

Evaluation criterion
The sensitivity of the developed system was evaluated to assess the performance. Further, the developed signal interpretation algorithm model's accuracy was analyzed in terms of classification accuracy, F1 score, and confusion matrix.

Sensitivity
The system's sensitivity was determined by measuring the output in response to variation in the input. Sensitivity quantifies the smallest change that a sensor system can detect and characterizes the response of the sensing system. Mathematically, it can be expressed as the partial derivative of the output with respect to its input parameters [4]. It is represented as:

$$S = \frac{\partial y}{\partial x}$$

where S is the sensitivity, y is the system output, and x is the input parameter.

Classification accuracy
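Classification accuracy is what is commonly referred to simply as accuracy. It is the ratio of the number of samples that are predicted correctly to the total number of input samples. It is represented as:

$$\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of input samples}}$$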

F1 score
The F1 score of a classifier depends on the precision and recall of the classifier. Precision is the ratio of correct positive results to the total number of positive results predicted by the classifier. Recall is the ratio of correct positive results to the total number of positive results that should have been predicted by the classifier. Mathematically, the F1 score is defined as the harmonic mean of precision and recall. It is represented as:

$$F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

Confusion matrix
The confusion matrix is a matrix that represents the overall performance of a model. It reveals the number and types of errors being made by the classifier. The layout of the confusion matrix is shown in Fig. 10.
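The relationship between the confusion matrix and the metrics defined above can be illustrated with a short sketch. The counts used here are hypothetical and are not taken from Table 3; the function name `classification_metrics` is likewise only illustrative.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts chosen only to exercise the formulas.
metrics = classification_metrics(tp=8, fp=2, fn=4, tn=86)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts, a high accuracy coexists with a noticeably lower recall for the positive class, which is why the F1 score and confusion matrix are reported alongside accuracy.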

Performance of data acquisition system
The calculated sensitivity of the system was 1.589 µm s⁻¹. Therefore, ground movement with a velocity of about 1.589 µm s⁻¹ or greater can be converted into a digital signal by the system in the absence of noise.

Performance of the developed algorithm
The performance of the Gaussian naive Bayes classifier was analyzed by comparing its predictions with the classification obtained from kernel density estimation. The calculated F1 score and related data are given in the classification report (Table 2). The confusion matrix for the Gaussian naive Bayes classifier is presented in Table 3. The algorithm divided the dataset into 0 (waveform component) and 1 (non-waveform component). In the classification report, the precision for both '1' and '0' was equal to 1, which means that the data points predicted as belonging to each category were classified correctly. The recall was 1 for '0' and 0.66 for '1', which indicates that all data points of category '0' in the dataset were predicted correctly, whereas the classifier missed some of the data points of category '1'. From the confusion matrix, it is clear that some of the data points in the '1' category were predicted as the '0' category.

Calculation for classification accuracy for the developed system
Accuracy of the Gaussian naive Bayes classifier was calculated from the confusion matrix presented in Table 3. To assess the effectiveness of the developed data acquisition system and algorithm, a comparison was made with previous research on the subject. Lindenbaum et al. [17] used the K-nearest neighbors algorithm for classification, whereas Gaussian naive Bayes-based classification was used to develop the present algorithm. The detection rate was found to be 93.4% with the K-nearest neighbors-based algorithm [16], whereas the accuracy was found to be 99.6% using the Gaussian naive Bayes-based classification. Sokolowski et al. [23] applied STA/LTA and some improved STA/LTA algorithms for P-wave detection, which differs from the developed algorithm. However, these STA/LTA algorithms involve high computational complexity, as incoming signals require multiple fittings for these methods [27].
The above test results and comparisons lead to the conclusion that the developed classifier algorithm performs better than the other methods because of its simplicity and higher accuracy.
Table 2 Classification report for Gaussian naive Bayes classifier
Table 3 Confusion matrix for Gaussian naive Bayes classifier

Conclusions
The developed data acquisition system with a moving-coil geophone and signal interpretation algorithm is useful for monitoring and predicting seismic and roof fall events in underground mines. Further, the application of the supervised machine learning technique based on the Gaussian naive Bayes classifier and kernel density estimation method makes the system intelligent and efficient for measuring seismic events with higher accuracy. The experimental results showed that the developed signal interpretation algorithm achieved an accuracy of 99.6%. The execution times of the data acquisition system and the data interpretation model are 70 and 30 ms, respectively. Hence, the total time required for execution of the overall system is 100 ms. Further, the system is automatic in nature, as it detects seismic events through the machine learning model. Thus, the developed system can be implemented in underground mines to predict roof fall by measuring seismic events and to provide warning of impending roof fall, so that mine management can take precautionary measures to avoid fatal accidents.
In the future, the use of a 32-bit ADC might increase the data acquisition system's accuracy and sensitivity. It was also observed that the naive Bayes classifier's accuracy is good, but it needs to be tested with a variety of vibration data to check its robustness.