Research on Underground Location Algorithm Based on Random Forest and Environmental Factor Compensation

Abstract: In view of the complex and changeable underground environment of coal mines and the long, strip-like shape of underground roadways, a new underground location algorithm based on random forest and environmental factor compensation is proposed. First, the underground AP network model and roadway environment are analyzed, and the fingerprint localization algorithm is constructed. The Kalman filter algorithm is used to filter the RSS signal in both the offline sampling stage and the real-time positioning stage. Then the algorithm based on random forest and environmental factor compensation is proposed. Under the assumption that the attenuation factors between two anchor nodes are the same, a signal strength ratio compensation algorithm is proposed, which mitigates the localization errors caused by the similarity of neighbouring regions. A target speed constraint is introduced to reduce the errors caused by RSS signal transmission and environmental factors. Experiments were carried out in an air-raid shelter that simulates a real mine roadway environment. The results show that the proposed algorithm can meet the requirements of high-precision underground positioning under the conditions of sparse anchor nodes and a complex underground environment.


Introduction
Safety has always been a hot issue. Coal accounts for 70% of the energy structure, and it is estimated that this situation will last for the next 20 years. In 2018, 224 accidents and 333 deaths occurred in the coal mines of China, including 2 major accidents with 34 deaths, and the mortality rate of coal mines was 0.093 per million tons. The overall safety situation is nevertheless expected to improve with the development of science and technology. In the management of coal mine safety production, it is very important to grasp real-time information such as the number of underground personnel, activity trajectories, precise location distribution, and disaster location monitoring [1]. In the event of an accident, timely and accurate rescue depends on a high-precision positioning system [2]. Therefore, research on underground location algorithms is very important [3].
The active areas of coal mine workers and locomotives are mainly working faces and roadways. Compared with the transmission of radio frequency signals on the ground, the underground wireless transmission environment in a coal mine is more complex [4]. At present, positioning technologies mainly include Bluetooth, RFID, WiFi, ZigBee, UWB, and ultrasound [5]. Positioning algorithms are mainly divided into ranging algorithms, such as AOA, TOA, TDOA [7], and RSSI [8], and non-ranging algorithms, such as DV-Hop [9], APIT, and MDS-MAP [10]. Underground positioning in a coal mine is different from ground positioning: GPS does not work underground, and the underground environment is complex and variable, so applying location algorithms in an underground mine is more difficult [11]. Compared with other networks, a Wi-Fi [12] network has the advantages of strong signal, wide bandwidth, and fast transmission rate. In an underground coal mine, the WLAN basically covers the roadways and working faces, so no additional network laying or equipment installation is needed. Positioning response can be sped up by adjusting the data transmission rate in real time, which largely meets the needs of personnel. Transmitting real-time voice, images, and other information while locating is an inevitable trend in the future development of wireless networks [13]. Wang Dongdong proposed an AP planning method suitable for underground mines by studying the electromagnetic wave propagation loss law of coal mine roadways, which satisfies the coverage of mobile terminals in the WLAN. Literature [19] proposes using the kernel function method and a particle filter algorithm to locate underground targets, which realizes the tracking and positioning of static and dynamic targets.
In this paper, an underground location algorithm based on random forest and environmental factor compensation is proposed. The AP network model and the roadway environment are analyzed. The RSS noise is processed by a Kalman filter, and the random forest classification algorithm is applied to fingerprint location. In addition, optimizations based on signal intensity ratio compensation and a target speed constraint are proposed, which address the inaccuracy of current underground positioning algorithms and the inefficiency of fingerprint positioning, and provide a reference for subsequent applications of high-precision underground positioning.

Roadway environment analysis
The underground roadway and working face are narrow, long, tunnel-like enclosed spaces. The height and width of the roadway are fixed, while its length is variable and its shape is irregular. The environmental characteristics can be summarized as follows:
1) The roadway is a long strip, usually up to several kilometers long and only a few meters wide, which limits radio transmission.
2) Both sides of the roadway consist of coal and rock structures, which are optically dense media. The roofs and floors are uneven to varying degrees, so the refraction and reflection of electromagnetic waves are serious.
3) Many properties of the air in coal mine roadways, such as high humidity and gas concentration, have a great influence on radio signal attenuation.
The signals received by WiFi terminals are composite waves formed after multiple reflections, scattering, and diffraction, and they usually arrive from multiple paths and directions. The multipath transmission model of the multi-AP network in the roadway is shown in Figure 2.

Kalman filter algorithm
As analyzed in Section 2.1, the signal received by the STA contains noise. In this paper, the noise is filtered by the Kalman filtering algorithm [20]. The received signal is treated as a discrete system without control variables, which can be described by the following two formulas:

$$x(k) = A\,x(k-1) + N(k)$$

$$z(k) = H\,x(k) + V(k)$$

where $x(k)$ is the filtered received signal strength value at time $k$. Since there is no control input $u(k-1)$, when the process excitation noise $N(k)$ is zero the $n \times n$ gain matrix $A$ linearly maps the state at the previous moment $k-1$ to the state at the current moment $k$; the value of the system parameter $A$ is 1. $z(k)$ is the received signal strength value measured at time $k$, and the system parameter $H$ is taken as 1. It is assumed that the process excitation noise $N(k)$ and the observation noise $V(k)$ are independent of each other and are both white Gaussian noise. In practical systems, the process excitation noise covariance $Q$ and the observation noise covariance $R$ may vary with each iteration; here they are assumed to be constant.

First, the process model of the system is used to predict the next state of the system. Suppose the current system state is at time $k$; according to the system model, the current state can be predicted from the previous state of the system:

$$x(k|k-1) = A\,x(k-1|k-1)$$

where $x(k-1|k-1)$ is the optimal result of the previous state. The covariance $P$ corresponding to $x(k|k-1)$ is then updated:

$$P(k|k-1) = A\,P(k-1|k-1)\,A^{T} + Q$$

where $P(k-1|k-1)$ is the covariance corresponding to $x(k-1|k-1)$. Combining the predicted value with the measured value gives the optimal estimate $x(k|k)$ of the current state:

$$x(k|k) = x(k|k-1) + Kg(k)\,\bigl(z(k) - H\,x(k|k-1)\bigr)$$

where $Kg(k)$ is the Kalman gain at the current moment:

$$Kg(k) = \frac{P(k|k-1)\,H^{T}}{H\,P(k|k-1)\,H^{T} + R}$$

Finally, the covariance of $x(k|k)$ is updated so that the filter can continue to iterate:

$$P(k|k) = \bigl(I - Kg(k)\,H\bigr)\,P(k|k-1)$$

where $I$ is the identity matrix, which takes the value 1 here.
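The predict and update cycle described above reduces to a few lines when A = H = 1. The following is a minimal sketch under that assumption; the function name `kalman_filter_rss`, the default values of `q`, `r` and `p0`, and the choice of the first measurement as the initial state are illustrative assumptions, not values taken from the paper.

```python
def kalman_filter_rss(measurements, q=0.01, r=4.0, p0=1.0):
    """Scalar Kalman filter for an RSS sequence, with A = H = 1 and no control input.

    q and r stand for the constant process and observation noise covariances
    Q and R; the defaults are illustrative placeholders only.
    """
    if not measurements:
        return []
    x = measurements[0]  # assumed initial state: the first measurement
    p = p0               # assumed initial error covariance
    filtered = []
    for z in measurements:
        # Prediction: x(k|k-1) = x(k-1|k-1), P(k|k-1) = P(k-1|k-1) + Q
        x_pred = x
        p_pred = p + q
        # Update: Kalman gain, state correction, covariance correction
        kg = p_pred / (p_pred + r)
        x = x_pred + kg * (z - x_pred)
        p = (1.0 - kg) * p_pred
        filtered.append(x)
    return filtered
```

A larger `r` relative to `q` makes the filter trust its prediction more and smooths the RSS sequence more strongly, at the cost of a slower response when the terminal actually moves.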
Sampling is performed 15 m away from the access point at a rate of once per second for 50 s. The filtering result is shown in Figure 3.
The fingerprint localization process consists of two stages:
1) Offline sampling stage: a mapping between positions (fingerprints) and the corresponding signal intensities is established. Sampling points are set at certain intervals in the area to be detected. The signal intensity measured at each sampling point and its corresponding position information are stored in a database to form the fingerprint database.
2) Real-time localization stage: when a worker or device moves to a certain location, the portable WiFi terminal calculates its location by matching the signal intensity measured in real time against the information in the fingerprint database.
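The two stages can be sketched as follows. This is a minimal illustration only: the function names and the data layout are assumptions, and the matcher shown here is a plain nearest-fingerprint lookup used as a placeholder, not the random forest classifier that this paper uses for matching.

```python
from statistics import mean


def build_fingerprint_db(samples):
    """Offline stage: average the RSS readings collected at each sampling point.

    samples maps a position (x, y) to a list of readings, each reading being
    a list with one RSS value per AP (hypothetical layout).
    """
    db = {}
    for position, readings in samples.items():
        # zip(*readings) regroups the readings so that each column is one AP
        db[position] = tuple(mean(column) for column in zip(*readings))
    return db


def match_nearest(db, rss):
    """Real-time stage placeholder: nearest fingerprint in signal space."""
    def squared_distance(fingerprint):
        return sum((a - b) ** 2 for a, b in zip(fingerprint, rss))
    return min(db, key=lambda position: squared_distance(db[position]))
```

Averaging several readings per sampling point keeps a single noisy reading from defining the fingerprint for that position.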
In this paper, the random forest algorithm is used for classification and prediction in the real-time localization stage [21] [22]. Random forest is an ensemble learning algorithm that combines multiple classifiers. Building on Bagging, it further introduces random attribute selection into the training of decision trees and uses bootstrap sampling (sampling with replacement) on the original data set. Several sample sets are extracted, and each is used to train a weak classifier, a decision tree. These decision trees are then combined, and the final classification or prediction result is obtained by voting, as shown in Figure 4.
Step 2: Decision tree generation. If there are D features in the feature space, then each time a decision tree is generated, d features (d<D) are randomly selected from the D features to form a new feature set, and the decision tree is generated using this new feature set. A total of n decision trees are generated in n rounds.
Step 3: Model combination. Since the n decision trees are random in both the selection of the training set and the selection of features, they are independent of each other and equally important, so they can be given the same weight when combined. For fingerprint matching, the votes of all decision trees determine the final result.
Step 4: Model verification. Verifying the model requires a validation set. When the training sets are drawn from the original samples, some samples are never selected, and when features are selected, some features may not be used. The unselected samples from the original sample set can therefore be used as the validation set.
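The steps above can be sketched in plain Python as follows. This is a small illustrative implementation, not the authors' code: the Gini split criterion, the depth limit, the default parameter values and all function names are assumptions. Each tree is trained on a bootstrap sample and restricted to d randomly chosen features, predictions are combined by equal-weight voting, and the samples a tree never saw (its out-of-bag samples) serve as its validation set.

```python
import random
from collections import Counter


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def build_tree(X, y, features, depth=0, max_depth=8):
    """Grow a decision tree that may only split on the given features."""
    if depth >= max_depth or len(set(y)) == 1:
        return {"label": majority(y)}
    best = None
    for f in features:
        for t in sorted({row[f] for row in X})[:-1]:  # keeps both sides non-empty
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # every selected feature is constant in this node
        return {"label": majority(y)}
    _, f, t, left, right = best
    return {"f": f, "t": t,
            "lo": build_tree([X[i] for i in left], [y[i] for i in left],
                             features, depth + 1, max_depth),
            "hi": build_tree([X[i] for i in right], [y[i] for i in right],
                             features, depth + 1, max_depth)}


def predict_tree(tree, row):
    while "label" not in tree:
        tree = tree["lo"] if row[tree["f"]] <= tree["t"] else tree["hi"]
    return tree["label"]


def train_forest(X, y, n_trees=25, d=1, seed=0):
    rng = random.Random(seed)
    n, D = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of size N
        features = rng.sample(range(D), d)          # d of the D features
        tree = build_tree([X[i] for i in idx], [y[i] for i in idx], features)
        forest.append((tree, set(range(n)) - set(idx)))  # keep out-of-bag indices
    return forest


def predict(forest, row):
    """Equal-weight majority vote over all trees."""
    return majority([predict_tree(tree, row) for tree, _ in forest])


def oob_accuracy(forest, X, y):
    """Score each sample using only the trees that never saw it."""
    hits = total = 0
    for i, row in enumerate(X):
        votes = [predict_tree(tree, row) for tree, oob in forest if i in oob]
        if votes:
            total += 1
            hits += majority(votes) == y[i]
    return hits / total if total else float("nan")
```

In a fingerprint setting, each row of `X` would hold the filtered RSS values of the visible APs and each label in `y` would identify a sampling point.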
The idea of the fingerprint location algorithm based on random forest is as follows. First, the sampled signals are processed by the Kalman filter to form the fingerprint database. Then the feature data of the current target are obtained in real time, processed by the Kalman filter, and classified by the random forest. Finally, the location information of the unknown node is obtained. The algorithm model is shown in Figure 5.
In order to reduce the influence of the narrow roadway space on radio frequency signal propagation, it is assumed that the roadway environment of adjacent APs is the same, which means that their attenuation factors are the same. In this paper, a signal intensity ratio compensation algorithm is proposed to further optimize the positioning results.
Assume that multiple APs are deployed underground. According to the RSS, the two nearest APs are AP1 and AP2, with coordinates $(x_1, y_1)$ and $(x_2, y_2)$; the distance $d$ between AP1 and AP2 is known; and $d_1$ and $d_2$ are the distances from the terminal to AP1 and AP2, respectively. Let $R$ be the ratio of $d_1$ to $d_2$:

$$R = \frac{d_1}{d_2}$$

Because the attenuation factors are assumed to be equal, $R$ can be determined from the RSS values of the two APs, and the $x$ coordinate of the terminal then follows from $R$, $d$, and the AP coordinates. Similarly, the $y$ coordinate can be obtained. The terminal coordinates $(x, y)$ are taken as the arithmetic average of the results of the signal strength ratio compensation algorithm and the WIFI-RFFL algorithm.
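One way to make the ratio computation concrete is sketched below. It rests on assumptions that the text does not state explicitly: a log-distance path-loss model RSS_i = P0 - 10 n log10(d_i) with the same P0 and attenuation factor n for both APs, and a terminal lying on the segment between AP1 and AP2 so that d1 + d2 = d. Under these assumptions R = d1/d2 = 10^((RSS2 - RSS1)/(10 n)) and d1 = d R / (1 + R). The function names and the default n are hypothetical.

```python
def sir_position(ap1, ap2, rss1, rss2, n=2.0):
    """Estimate the terminal position from the RSS of the two nearest APs.

    Assumes a log-distance path-loss model with a common attenuation
    factor n and a terminal located between AP1 and AP2 (d1 + d2 = d).
    """
    ratio = 10 ** ((rss2 - rss1) / (10.0 * n))  # R = d1 / d2
    frac = ratio / (1.0 + ratio)                # d1 / d
    x = ap1[0] + (ap2[0] - ap1[0]) * frac
    y = ap1[1] + (ap2[1] - ap1[1]) * frac
    return (x, y)


def fuse(p_sir, p_rffl):
    """Arithmetic average of the ratio-based and fingerprint-based estimates."""
    return ((p_sir[0] + p_rffl[0]) / 2.0, (p_sir[1] + p_rffl[1]) / 2.0)
```

Because only the difference RSS2 - RSS1 enters the ratio, the unknown reference power P0 cancels out, which is what makes the equal-attenuation assumption useful.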

Actual mine
The targets of underground positioning are usually moving miners and locomotives, so the velocity of the moving target can be taken as a constraint. The normal walking speed of underground personnel is generally less than $v_p$, and the speed of locomotives is less than $v_c$. At the same time, AP position coordinate correction is introduced: an error correction is made every time the terminal passes an AP, which reduces the accumulated error. The speed constraint algorithm builds on the relationship between RSS and communication distance presented in the literature [8].
(2) If step (1) is not satisfied, the terminal coordinates $(x_t, y_t)$ at the current time are calculated according to the WIFI-RFFL-SIR algorithm.
(3) Calculate the distance $d_{t\_t-1}$ between the positions at the current time $t$ and the previous time $t-1$. If $d_{t\_t-1} \le v \times t$, where $v$ is the applicable speed limit and $t$ is the time interval, the current estimated position coordinates of the WiFi terminal are considered credible. Otherwise, according to the speed constraint, the current estimated position coordinates are considered not credible, and the current unknown node coordinates $(x, y)$ are corrected. If the result is still not credible, step (3) is repeated. After n comparisons, the final position coordinates of the WiFi terminal are obtained.
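The credibility check can be illustrated with the sketch below. The correction applied to a rejected estimate is not recoverable from the text, so this sketch substitutes a simple rule as an assumption: the estimate is pulled back along the line from the previous position until it is just reachable at the maximum speed. The AP correction threshold `rss_near` and both function names are likewise hypothetical.

```python
import math


def ap_snap(rss_by_ap, ap_coords, rss_near):
    """AP correction: if some AP is received at least as strongly as
    rss_near (assumed threshold), return that AP's coordinates, else None."""
    best = max(rss_by_ap, key=rss_by_ap.get)
    return ap_coords[best] if rss_by_ap[best] >= rss_near else None


def apply_speed_constraint(prev, est, v_max, dt=1.0):
    """Accept est if it is reachable from prev within dt at speed v_max;
    otherwise clamp it to the reachable distance (assumed correction rule)."""
    dx, dy = est[0] - prev[0], est[1] - prev[1]
    dist = math.hypot(dx, dy)
    limit = v_max * dt
    if dist <= limit:
        return est
    scale = limit / dist
    return (prev[0] + dx * scale, prev[1] + dy * scale)
```

With a maximum walking speed of 3 m/s and a 1 s reporting interval, an estimate that jumps 13.2 m from the previous position would be rejected and replaced by a point 3 m away in the same direction.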

Experimental analysis
In order to verify the positioning performance of the algorithm, positioning experiments were carried out in an air-raid shelter. The air-raid shelter is about 160 m long, 2.8 m wide, and 3 m high. Other environmental parameters are similar to those of an underground roadway. The length of the shelter is taken as the x axis, the width as the y axis, and the center of the shelter at the entrance as the coordinate origin. It is assumed that the staff walk along the middle line of the AP arrangement, which means that the terminal y coordinate is always 0; the maximum moving speed of personnel is 3 m/s; there are no locomotives in the air-raid shelter; and the terminal reports its position at an interval of 1 second.

Effect of offline sampling interval on positioning accuracy
Two APs are deployed in the shelter at coordinates AP1 (0, 1.4) and AP2 (25, 1.4), both at a height of 1.2 meters. In the fingerprint training stage, staff hold WiFi terminals based on the wireless SoC chip GS1011 and sample at intervals of 1 meter, 2 meters, 3 meters, and 4 meters, respectively, to build the fingerprint databases. At each sampling point, 60 received signal intensity values are collected continuously and averaged to form the signal intensity fingerprint. The WIFI-RFFL algorithm is used for localization. As shown in Figures 6 and 7, when the interval between sampling points is 1 m and 2 m, the average positioning errors are 0.62 m and 0.66 m, respectively, and the root mean square positioning errors are 0.772 m and 0.798 m, respectively. These values are very close because the smaller the interval between sampling points, the denser the sampling points and the smaller the difference between the signal intensity fingerprints of neighbouring sampling points, which makes random forest fingerprint matching more difficult. When the interval is more than 2 meters, the positioning error increases gradually. Therefore, the interval between sampling points is set to 2 meters in subsequent experiments, which keeps the positioning error low while reducing the workload of offline sampling.
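For reference, the two error measures quoted above can be computed as in the following sketch. The function names are assumptions, and the positioning error of a single point is taken to be the Euclidean distance between the true and estimated positions.

```python
import math


def positioning_errors(true_points, estimated_points):
    """Euclidean distance between each true and estimated position."""
    return [math.hypot(tx - ex, ty - ey)
            for (tx, ty), (ex, ey) in zip(true_points, estimated_points)]


def mean_error(errors):
    return sum(errors) / len(errors)


def rms_error(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

The root mean square error is never smaller than the mean error and grows faster when a few points have large errors, which is why both values are reported.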

Experiment on random forest fingerprint location (WIFI-RFFL)
Three APs are deployed in the air-raid shelter at coordinates AP1 (0, 1.4), AP2 (80, 1.4), and AP3 (160, 1.4), all at a height of 1.2 meters. With a fingerprint sampling interval of 2 meters, the maximum positioning error in the ideal case can be predicted to be 1 meter. Workers walk through the shelter carrying WiFi terminals based on the wireless SoC chip GS1011. Based on the WIFI-RFFL algorithm, the signal intensity of the APs measured by the WiFi terminal in real time is matched against the fingerprints in the database. As shown in Figure 8, environmental factors in the air-raid shelter such as diffraction, reflection, and refraction cause very large positioning errors at observation points between 30 and 50 meters. This also confirms the relationship between RSS and distance: when the distance is more than 30 meters, the RSS value does not change significantly as the distance increases, resulting in a significant increase in the error. The average positioning error of the WIFI-RFFL algorithm is 5 meters.
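The flattening of RSS at larger distances can be illustrated with a short calculation. It assumes a log-distance path-loss model with attenuation factor n = 2, which is an illustrative choice rather than a value measured in the shelter; the function name is hypothetical.

```python
import math


def rss_step_db(d, step=2.0, n=2.0):
    """RSS drop in dB when moving from distance d to d + step,
    under an assumed log-distance path-loss model."""
    return 10.0 * n * math.log10((d + step) / d)
```

Under this assumption, moving one 2 m sampling interval changes the RSS by about 6.0 dB at a distance of 2 m from the AP, but only by about 0.56 dB at 30 m, which is easily masked by noise and makes neighbouring fingerprints hard to distinguish.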

Experiments on the signal intensity ratio compensation location algorithm (WIFI-RFFL-SIR)
The AP deployment is the same as in Section 4.2, and staff move through the shelter carrying WiFi terminals based on the wireless SoC chip GS1011. Based on the WIFI-RFFL-SIR algorithm, the signal intensity of the APs measured by the WiFi terminal in real time is matched against the fingerprints in the database. As shown in Figure 9, after signal intensity ratio compensation, the positioning errors of most nodes are effectively reduced, with an average positioning error of 4.25 meters. However, due to the continuous movement of staff in the air-raid shelter, there are still outlier positioning points with large errors; for example, the positioning error at coordinates (42.3, 0) is 13.2 meters.

Experiments on the velocity constrained compensation location algorithm (WIFI-RFFL-SIR-VC)
The AP deployment is the same as in Section 4.2, and staff move through the shelter carrying WiFi terminals based on the wireless SoC chip GS1011. Based on the WIFI-RFFL-SIR-VC algorithm, the signal intensity of the APs measured by the WiFi terminal in real time is matched against the fingerprints in the database. As shown in Figure 10, after the speed constraint is applied, the influence of individual noise points on the average positioning error is weakened. At the same time, the terminal positions are corrected when the terminal passes an AP. The overall positioning accuracy is high, with an average positioning error of 3 meters.

Conclusions
In this paper, an underground localization algorithm based on random forest and environmental factor compensation is proposed for the complex environment of coal mines. The underground AP network model and tunnel environment are analyzed, and a multi-AP bridge networking model is constructed. The introduction of the Kalman filter weakens the influence of noise and reduces the complexity of the fingerprint matching algorithm. The proposed signal strength ratio compensation algorithm and speed constraint optimization algorithm further improve the positioning accuracy and reduce the influence of noise. In the experiments, the average positioning error is 3 m, which satisfies applications such as underground rescue, activity track playback, and disaster monitoring and positioning. However, because underground APs are sparse, the failure of an AP greatly affects the positioning accuracy; blind spot localization will be studied further.
communication system of digital mine [14]. By studying the long strip characteristics of roadways, Yang Cheng et al. proposed a neural network interpolation algorithm based on WLAN region division and a localization algorithm based on a signal strength weight index [15]; compared with the traditional algorithm, the computational complexity and accuracy are improved. By establishing dual WiFi channels and a signal transmission and reception timing mode, Sun Jiping et al. proposed a TOA-based coal mine underground target location method with time error suppression [16]. Wu Jingran et al. proposed an improved fingerprint location algorithm combined with the pedestrian dead reckoning (PDR) algorithm to locate underground personnel [17]. By analyzing the transmission loss model of roadways and dynamically acquiring the path fading index, Han Dongsheng et al. proposed a weighted centroid location algorithm based on RSSI [18].

2 Fingerprint localization algorithm based on Kalman filter and random forest classification

Figure 1 Bridge networking model for multi-AP

Figure 2 Multipath transmission model of multi-AP network in roadway

Figure 3 Comparison results before and after Kalman filtering

Figure 4 Random forest algorithm

The algorithm steps are as follows:
Step 1: Sample set selection. Assume that there are N samples in the original sample set. In each round, N samples are drawn from the original sample set by bootstrapping (sampling with replacement), giving a training set of size N. The training sets extracted in the n rounds are T1, T2, ..., Tn.

Figure 5 Fingerprint location algorithm model based on random forest

3 Fingerprint localization algorithm based on signal intensity ratio and speed constraint optimization

Figure 6 Location error of terminal at different sampling intervals

Figure 7

Figure 8 Location error of WIFI-RFFL algorithm

Figure 9 Location error of WIFI-RFFL-SIR algorithm

Figure 10 Location error of WIFI-RFFL-SIR-VC algorithm

In the literature [8], Ding Enjie et al. presented the relationship between RSS and communication distance. It can be seen that within a certain distance, RSS and communication distance are almost linear. According to this characteristic, and taking the received signal strength value at a distance of 3 m from an AP as a reference value, the speed constraint algorithm steps are as follows:
(1) Obtain the RSS value of the current terminal. If the signal intensity value of one of the APs received by the WiFi terminal at a certain time can satisfy

Table 1 Test results of positioning accuracy at different sampling intervals