Abstract
A novel device-free target tracking and following method is proposed based on both the received signal strength (RSS) and channel state information (CSI) of WiFi signals. Different from the typical device-free target tracking method, we consider a scenario where the device-free target under tracking is followed by a device that transmits the reference signals for location tracking of the target and the device itself. To meet the goal, a deep spatial-temporal neural network model is designed to learn and exploit the multi-resolution spatial and temporal features of RSSI and CSI for location tracking. By experimental results on our testbed, we show that the average positioning accuracy of the proposed method for the device-free target can reach 0.773 meters, which has a \(64\%\) improvement over the accuracy of 2.164 meters of a typical device-free tracking method under the same experimental condition.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Avoid common mistakes on your manuscript.
1 Introduction
Indoor positioning systems (IPS) have seen increasing demand and applications in location-based services (LBS), the Internet of Things (IoT), and intelligent healthcare [1]. In view of the diverse applications of IPS, significant research efforts have been made for positioning with different wireless technologies, such as Bluetooth, WiFi, and mobile network [2]. Among them, WiFi-based positioning has attracted the most attention for at least two reasons: First, existing WiFi networks can serve as a general infrastructure for positioning purposes. With the dense deployment of WiFi devices and their built-in circuits for measurements of radio signal strength (RSS) or even channel state information (CSI), a WiFi network can be easily transformed into a positioning system. Second, the orthogonal frequency division multiplexing (OFDM) technology that underpins the WiFi and the fourth generation (4G) and beyond systems can provide the signature CSI of targets in space, time, and frequency, allowing for more precise positioning than using RSS readings only. In addition, the rich research results in WiFi-based positioning also lay a solid foundation for technology developments in this field [3, 4].
The idea of RSS-based positioning is first introduced in [5], and is realized with the readings of RSS indicators (RSSI) of received WiFi packets. Despite the significant research efforts in this area over the past decade, IPS with RSSI measurements remains a challenging task in practice, particularly, in the consistency of positioning accuracy, which is sensitive to measurement campaigns and channel modeling in complex indoor environments [3, 4].
Compared to RSS-based positioning that offers a meter-level accuracy, positioning with instantaneous CSI has a potential to provide a centimeter-level accuracy [6, 7] at the cost of a measurement density 2 to 3 orders higher in both space and time than that for the RSS-based approach, and is, thus, more suitable for positioning in small-scale areas of few meters squared (\(m^2\)). Nevertheless, the combination of these two positioning technologies creates a possibility for high-precision end-to-end autonomous indoor navigation services that can use RSS-based positioning methods for coarse trajectory control and CSI-based methods for fine localizations at endpoints.
From an estimation theoretical point of view, either RSS- or CSI-based positioning can be further classified into two categories: the deterministic versus the statistical ones. In the deterministic approach, a geometric model is needed to serve the positioning purpose. For RSS-based methods in this category, a radio power path-loss model is typically used to characterize the relationship of RSS measurements to radio propagation distances [5, 8, 9] between transmitters and receivers. For CSI-based methods in this category, a radio wave propagation model is employed to estimate the angle of arrival (AOA) or the time difference of arrival (TDOA) based on the phase differences in the CSI of designated receivers [10, 11]. These geometric measurements in terms of propagation distances, AOA, or TDOA can then be transformed into the location information of the target, given the predetermined locations of the receivers.
In contrast to the deterministic approach, location information of a statistical approach is directly extracted from measurement data. No specific physical model is needed for location estimation. Either the well-known pattern matching methods, e.g. [12, 13], or the more sophisticated deep learning methods, e.g. [14,15,16], fall in this category and have been extensively studied for positioning with RSSI or CSI measurements.
The aforementioned positioning methods, either with RSS or CSI measurements, or either following a deterministic or a statistical approach, are all based on reference signals transmitted or received from the target device. The physical principle underpinning the positioning algorithms is the same and results from electromagnetic (EM) wave propagations and their interactions with the environments, including reflection, diffraction, or refraction effects on obstacles. From this point of view, positioning can also be done for a target that is not actively involved in the transmissions or receptions of reference signals. This kind of passive positioning for a device-free target is somewhat like Radar remote sensing, but perhaps more challenging due to the complicated EM wave propagation channels in indoor or outdoor urban environments. Consequently, IPS with such a device-free target localization method typically relies on a deep learning model to extract the location information of the target intrinsic in the CSI that is reflected or diffracted from the target and is received by the IPS receivers, such as [17,18,19], to name a few.
Despite the rich research results in IPS, location tracking with CSI remains to be a challenging task due to the complexity to represent the spatial and temporal structure of CSI with regard to moving targets without a specific physical model for it, although several recent research efforts have been made to investigate this issue with, for instance, recurrent neural networks (RNN) [20,21,22] or particle filters [22]. This problem becomes even more complicated for device-free target tracking if the transmitter of the reference signal also moves with the target. This could happen in applications where a robotic pet or a care robot is approaching or following its master. In view of the technical challenges in device-free target tracking and its great potential in applications such as care robots, automatic guided vehicles (AGVs), and intelligent patrol inspection, etc., we study herein a device-free target following method based on neural networks that use both the CSI and RSSI of reference signals.
To exploit the spatial structures and characteristics of both CSI and RSSI, we reexamine their individual spatial correlations in indoor space and design neural networks to extract both the coarse and fine location information of the target out of RSSI and CSI. With these spatial features of the target, we further construct an RNN to learn and exploit the temporal structures and characteristics in the spatial features for continuous location tracking and target following. The performance of the proposed device-free target following system is verified by experiments on our testbed, and has a positioning accuracy of 0.773 meters, which has a \(64\%\) improvement over traditional device-free approaches. The contributions of this research are summarized as follows:
-
We reinvestigated the spatial correlations of CSI and RSSI, and developed a deep neural network (NN) structure to exploit the spatial and temporal correlations of CSI and RSSI for location tracking of moving targets.
-
Based on the proposed NN structure, we designed a common framework for IPS which not only can be applied for typical device-based or device-free location tracking, but also for device-free target tracking where reference signals are sent from a moving device that is also tracked by the IPS.
-
By experimental verifications, we showed that the proposed device-free target tracking method enjoys a \(64\%\) improvement in positioning accuracy when reference signals are sent from a device that follows the target.
This paper is organized as follows. Section 2 introduces the system model for IPS and Section 3 provides some background knowledge on data preprocessing for CSI measurements. A spatial-temporal NN structure is proposed for IPS in Section 4. Experimental results are presented in Section 5, followed by some discussions and concluding remarks drawn in Section 6.
2 System Architecture
In current WiFi and 4G/5G systems, Orthogonal Frequency Division Multiplexing (OFDM) and Multiple Input Multiple Output (MIMO) technologies [23] are used to combat the multipath effects in broadband wireless transmissions and to increase the system capacity. In MIMO transmissions, data is usually split into multiple streams, with each stream transmitted from a different transmit antenna in the same frequency band. For MIMO-OFDM systems, each data stream is further modulated with OFDM. Specifically, data are further split, modulated, and transmitted in multiple orthogonal subcarriers in the occupied frequency band to avoid inter-symbol interferences.
Due to this orthogonality property, CSI amplitude and phase of each subcarrier can be easily estimated in OFDM systems. For a legacy 802.11n WiFi that supports \(2\times 2\) MIMO, and has a total of K subcarriers, e.g. \(K=56\) across a total bandwidth of 20 MHz, the CSI matrix \(\textbf{H}\) of the \(L=4\) antenna pairs is given by
where \(h_{k,l}=\vert h_{k,l}\vert e^{-j \angle {h_{k,l}}}\) is the CSI value on the k-th subcarrier and the l-th antenna pair, and consists of the amplitude (\(\vert h_{k,l}\vert\)) and phase (\(\angle {h_{k,l}}\)) parts. The CSI matrix \(\textbf{H}\) represents the overall channel frequency response (CFR) of the wireless channels, including the effects of small-scale channel fading and large-scale channel reflection, scattering and shadowing effects. In other words, we can sense a slight variation in an indoor space via its CSI matrix.
On the other hand, the RSS measurements, expressed as RSSI (RSS Indicator), give the average power of received OFDM symbols. In free space, their values mainly depend on the propagation loss from a transmitter (TX) to a receiver (RX), and are characterized by a log-distance path-loss model [24] on the decibel (dB) scale as
with \(PL(d_{0})\) being the path loss at the reference distance (\(d_0\)), \(\gamma _d\) being the decay exponent and \(\eta _{s}\) being a Gaussian random variable (RV) to model the shadowing effect. This model basically assumes that the radio power decays exponentially with distance in free space. While, an RSSI is also affected by the shadowing and multipath effects such that the RSS-based positioning approach is not able to capture a small-scale power variation, and has an indoor localization accuracy at the meter level [6].
To improve positioning accuracy, instantaneous CSI can be used to extract the information of individual propagation path in order to achieve a decimeter-level localization accuracy. In our proposed system, we use WiFi APs to collect the CSI and RSS measurements of a target device. The WiFi APs use a Linux-based operating system, OpenWRT [25], and can synchronously collect their RSSI readings of WiFi packets from an associated device on an assigned WiFi channel. Their CSI, however, is collected individually by their on-board Atheros network interface cards (NIC) (QCA9558) for IEEE 802.11n standard. The Atheros-CSI-tool [26] built on top of the Atheros WiFi driver (ath9k) can extract the CSI, RSSI, and the timestamps of received WiFi packets.
Moreover, in order to monitor the CSI data from multiple devices simultaneously, the Atheros-CSI-tool can be implemented with 3 different modes: the AP-client, broadcast, and injector monitor modes. The AP-client mode is the basic configuration of this tool, which provides one-to-one communication and captures the CSI data from the associated device. The broadcast mode involves multiple clients and one broadcast AP. Clients share the same MAC address to simultaneously receive packets from the broadcast AP. In the injector monitor mode, all devices share the same MAC address and can acquire CSI from an injector. In our WiFi-based IPS, the system is implemented with the broadcast mode and uses multiple receivers to obtain CSI measurements.
The IPS is developed, deployed, and tested on the 8-th floor of the EED building of National Yang Ming Chiao Tung University (NYCU), Taiwan, and controls a total of 3 WiFi APs, which serve as CSI clients to collect CSI and RSSI measurements. The floor plan and the locations of the three RX APs (marked as triangles) are illustrated in Fig. 1. The entire testing area consists of two hallways, and the dimensions of the testing area are around \(13.8\times 8.2\ m^2\). The RX APs are installed on the ceiling and configured to use the WiFi channel 2 in the industry, scientific, and medical (ISM) band.
3 Calibration Procedure of CSI Measurements
In 802.11n WiFi networks, CSI data are obtained from the high-throughput (HT) preamble [27]. Signal processing on the reference signals of preamble can be divided into four functional steps: start-of-packet detection, automatic gain control (AGC), frequency offset estimation, and channel estimation. In these functional steps, the AGC will introduce an unknown scaling parameter in the CSI amplitude. On the other hand, the phase offset in CSI is caused by hardware mismatches in the carrier and the sampling frequencies between the transceiver and the receiver. Besides, a packet detection error also causes a CSI phase rotation error. These errors on both the CSI phase and amplitude must be calibrated if we aim for a precise positioning result.
To address the issue of CSI phase correction, many studies have been proposed [28,29,30]. In their results, CSI phase distortions are found attributed to four sources: sampling frequency offset (SFO), sampling time offset (STO), carrier frequency offset (CFO), and Phase-Lock-Loop (PLL) phase offset. According to the linear phase offset assumption [29], the observed CSI phase of each CFR sample can be defined as
where \(\varphi _{k,l}\) denotes the true value of the phase on the k-th subcarrier of the l-th antenna pair, with \(k=-28,...,-1,1,...,28\) in our system setting, \(\Delta t\) is the time delay caused by STO and SFO, \(\beta\) is the phase offset caused by CFO and PLL phase offsets, N is the OFDM dimension, and \(Z_k\) is measurement error.
To compensate the CSI phase due to the time delay and phase offset, we need to remove the effect of \(\Delta t\) and \(\beta\) accordingly. Given that k is the subcarrier index, a phase calibration method can estimate the mean \(\beta\) and the slope \(\Delta n \triangleq \frac{\Delta t}{N}\) of phase offsets from the CSI phase measurements with a least squares (LS) estimator. As a result, we can calculate the estimate as follows:
with the number of total subcarriers \(K=56\) and the set of available subcarriers \(S_k=\{-28,...,-1,1,...,28\}\) in our system. And
Based on the estimated slope (\(\Delta n\)) and mean (\(\hat{\beta }\)), the calibrated phase value is given by
On the other hand, the amplitudes of CSI measurements are affected by the AGC in the WiFi NIC. When received WiFi packets are under different AGC settings, the scales of received signal strengths are different. Therefore, we only consider the morphology of CSI amplitudes and normalize the collected CSI amplitudes into a range of [0,1] as follows
On the basis of CSI calibration, stable CSI are likely to be obtained on each subcarrier. To verify the CSI calibration performance, we test it for a fix reference point (RP) by experiments. In the experiment, the TX is placed at RP 7 in Fig. 1 and the CSI data are measured by receive AP 3 (RX3). Figure 2(a) presents the original phases of 100 collected CSI samples. The phases after removing the offsets due to CFO and PLL are presented in subfigure (b), and the phases further eliminating the distortions caused by STO and SFO are provided in subfigure (c). The normalized CSI amplitudes are shown in subfigure (d). As shown in the results, the calibrated CSI data, especially the phase terms, are more stable than the raw data in static environments, allowing us to use the CSI to represent the complex channel between RP7 and RX3.
Furthermore, we also investigated the spatial correlation among the collected CSI samples. Spatial correlation measures the continuous changes in CSI measurements at nearby locations. In the experiment, we collected CSI measurements along a straight line (the vertical hallway in Fig. 1) and calculated the correlation between two sampling points. In order to avoid AGC distortion and ensure consistency in the NIC configuration, the TX was continuously transmitting WiFi packets and moving at a constant speed of 2 m/s. Therefore, the collected CSI measurements are partitioned into sampling intervals of 100 ms. The average CSI in different intervals are used to calculate the spatial correlation as follows
where \(\varvec{h}_{j}=[ h_{1,j},...,h_{K,j}]^T\) represent the average CFR values at the j-th sampling location, and \(<\cdot>\) is the inner product of two complex vectors.
Figure 3 shows the relationship of CSI correlations to distances, and compares it to the spatial correlations of RSSIs that are collected in the same experiment. We can observe that the correlations of CSI are rapidly decreasing with separation distances, compared to that of RSSI. The effective correlation distances in terms of CSI and RSSI correlations are 0.12 m and 2 m, respectively. Similar results were also presented in [31]. In their works, the correlations of the time-reversal resonance strength (TRRS) samples, i.e., the time-domain CSI measurements, was modeled with a Bessel function, and the effective correlation distance was also about 0.1 m [31]. The results clearly show that CSI and RSSI measurements have different spatial correlation resolutions and can be combined to provide location information of mixed resolutions for indoor navigation purposes.
4 Deep Spatial-Temporal Network Architecture for Target Tracking and Following
In this research, we aim to use CSI and RSSI of reference signals received by RX APs in our IPS to assist an agent device to continuously track and follow a target which does not carry a WiFi device with it. In contrast to typical device-free location estimation that uses a fixed WiFi signal source, the agent device carries a WiFi with it to continuously send reference signals for IPS to track its location and the target’s, as well. As shown in the previous section, the CSI and RSSI have different correlation distances in space. RSSIs can reflect meter-level location variations, which are useful to build a large-scale radio power map for coarse locationing in indoor space. CSI readings, on the other hand, can provide a decimeter-level location information for fine positioning with respect to (w.r.t.) designated points. Combing these two positioning technologies allows us to provide an automatic indoor navigation service with acceptable measurement efforts for RPs of different scales of spatial resolutions in different indoor space.
To fulfill the goal, we need to design a deep learning model for IPS to merge and transform the information continuously acquired from CSI and RSSI into the locations of the agent device and the target, as well, when both are moving. This implies that the neural network, on one hand, needs to be able to represent the spatial structure of RSSI and CSI with regard to the device locations, and, one the other hand, able to interpret the temporal variations in CSI and RSSI into continuous movements of the devices. This inspires us to construct a neural network structure as illustrated in Fig. 4.
The entire network mainly consists of two parts: a spatial feature extraction network that decodes the RSSI and CSI observations into the location features of the agent device and the target for individual signal sampling and processing intervals, and an encoding network that keeps track of and transforms the features of consecutive time intervals into the current locations of the agent device and the target. Details of them are specified in the following subsections.
4.1 Spatial Feature Extractions
For the spatial feature extraction network as illustrated in Fig. 5, it mainly contains three parts: a deep neural network (DNN) used to extract the coarse spatial structure of RSSI with regard to the location of the agent device that transmits a reference signal for positioning purposes, and two convolutional neural network (CNN) that are employed to extract the fine spatial features of CSI for the target’s location relative to the agent device. The inputs to the two CNNs are \(\angle {H_{k,l}}\) and \(\vert H_{k,l}\vert\), respectively, for \(k=-28,\dots ,-1,1,\dots ,28\) and \(l=1,\dots ,4\), collected from the three APs of our IPS as illustrated in Fig. 1. Consequently, \(\angle {H_{k,l}}\) and \(\vert H_{k,l}\vert\) are formatted as two individual maps of dimension \(56\times 12\), and fed into individual CNNs to extract the spatial features among the CSI of different subcarriers, transmit and antenna pairs, and APs.
Each CNN contains 5 layers of 2-dimensional (2D) spatial convolutions as illustrated in Fig. 6. The number of channels in the five convolutional layers are 16, 32, 32, 32, and 32, respectively, and each channel is implemented with a kernel size of \(3\times 3\) and stride of 1 for filtering. The activation functions in the CNN network are leaky ReLU. The first four convolutional layers are followed by maximum pooling layers, which use a kernel of \(2\times 2\) and stride of 2 for pooling. The outputs of the final convolutional layer are flattened into a feature vector of dimension 128. The CNN output feature vectors are denoted by \(\varvec{f}^t_{amp}\) for CSI amplitudes \(\vert H_{k,l}\vert\) and \(\varvec{f}^t_{phase}\) for CSI phase \(\angle {H_{k,l}}\). The two vectors are concatenated to form a CSI feature vector of \(\varvec{f}^t_{CSI}=\varvec{f}^t_{amp} \oplus \varvec{f}^t_{phase}\), where \(\oplus\) stands for the concatenation operator.
Given that the importance of the different feature contents may also vary with the location of the reference agent device, the CSI feature vector is weighted by the outputs of an attention network [32] in an attempt to extract the essential location information of the target relative to the agent device. The attention network is formed with a DNN network whose output is an importance vector \(\varvec{\alpha }^t\) in the range of [0, 1]. Multiplying \(\varvec{f}^t_{CSI}\) by \(\varvec{\alpha }^t\) gives the weighted CSI feature vector of
where \(\otimes\) stands for an element-wise product.
Unlike CSI inputs, the RSSI inputs \(\hat{R}^t_{p}\) to their feature extraction network only have \(P=3\) variables given by the three received WiFi APs in our IPS. We, thus, use a DNN model to extract the RSSI feature vector \(\varvec{f}^t_{RSS}\) of dimension \(16\times 1\). Concatenating this feature vector to \(\hat{\varvec{f}}^t_{CSI}\) from CSI, and sending them to another DNN generates a final feature vector \(\varvec{f}_t\) of dimension 32.
4.2 Temporal Feature Extraction
The spatial feature \(\varvec{f}_t\) serves an observation for the locations of the agent device and the target during a certain observation time interval. The observation can be good to serve the locationing purpose if the target and the agent device are static and the observation interval is long enough. For the target tracking problem considered in this work, however, the observation interval cannot be set too long to track the dynamic location changes, nor too short to combat the observation noise. To resolve this dilemma, we use a recurrent neural network (RNN) to learn and exploit the relationship among the time series of feature vectors \(\varvec{f}_t\) with regard to the time-varying locations of the target and the agent device, as well, that follows the target. The proposed architecture is presented in Fig. 7, formed based on the well-known long-short-term memory (LSTM) model.
The network consists of a layer of bidirectional LSTM model (BiLSTM) [33] and a layer of unidirectional LSTM. The BiLSTM processes the times series of feature vectors \(\varvec{f}_t\) on a time-window basis, and derives a forward feature vector \(\overrightarrow{\varvec{f}}^t\) of dimension 32 and a backward feature vector \(\overleftarrow{\varvec{f}}^t\) of the same dimension for every time instant t within a time window, making use of the temporal relationship among \(\varvec{f}_t\). These two feature vectors \([\overrightarrow{\varvec{f}}^t, \overleftarrow{\varvec{f}}^t]\) are then merged by another layer of LSTM to generate a feature vector \(\varvec{y}^t\). The feature vector \(\varvec{y}^t\) is directly transformed by a DNN into the location estimate \(\hat{\varvec{p}}^{t}\).
4.3 Overall System Architecture for IPS
With the signal preprocessing procedure specified in Section 3 and the neural network architectures depicted in Figs. 4\(\sim\)7, we construct an IPS as illustrated in Fig. 8. The operations of the IPS system are divided into two phases: the offline training phase and the online inference phase. In the offline training phase, CSI and RSSI readings are collected for the different types of devices under test (DUT) along preplanned paths. The loss function for the training of the entire neural network is defined as the MSE of positioning, given by
where \(\hat{\varvec{p}}^{i}\) and \(\varvec{p}^{i}\) are the estimated and the ground-truth locations of the DUT, \(\varvec{\theta }\) are learnable parameters in the system, and \(\xi\) is the number of samples for evaluation. For the generalization of neural network, training samples are randomly shuffled into multiple mini-batches of size \(\xi\). The parameters \(\varvec{\theta }\) are updated for every mini-batch. The convergence curves of the training loss functions for device-based tracking are presented in Fig. 9, where the mini-batch size is 190 and the number of mini-batches in an epoch is 24.
The IPS can be applied in 3 positioning scenarios: (1) device-based location tracking for an agent who is carrying a WiFi device and continuously sending reference signals for IPS to estimate its location; (2) device-free location tracking for a target in which the target’s location is continuously estimated by reference signals sent from a fixed TX in our IPS; and (3) device-free target tracking and following in which the IPS continuously tracks the target’s location based only on the reference WiFi signals sent from a moving agent device that is following the target. The dimensions of the output vector are different according to the operating scenarios. In the next section, we will present our experimental results for the three positioning scenarios and investigate their performance limitations, respectively.
5 Experimental Results
We perform experiments for the three aforementioned positioning scenarios: (1) device-based location tracking, (2) device-free target tracking, and (3) device-free target tracking and following, to analyze and compare their positioning accuracies. All experiments are carried out in a corridor of dimensions \(16\times 8\) as illustrated in Fig. 1. RSSI and CSI are sampled and processed at an interval of 100 ms, and collected individually for the three scenarios. In addition, all network models are trained with a commercial server, which has a CPU of Intel i9-9900X 3.50 GHz (10 cores), 64 GB memory, and a GPU of NVIDIA GeForce GTX2080. Details of the experimental settings and results are provided in the following subsections.
5.1 Scenario 1: Device-Based Location Tracking
In this scenario, we assume that the DUT is equipped with a WiFi TX, which is an OpenWRT WiFi AP and programmed to transmit WiFi packets continuously. During the data collection period, CSI data are collected by 3 WiFi RXs installed on the ceiling of the corridor in Fig. 1, and the DUT is moved along the 6 predefined paths at a constant speed of 0.46 m/s, as shown in the figure. For each path, we repeat the experiments for 5 times and record the CSI and RSSI measurements in 5 individual data sets. Three of the data sets are used for network training and two of them are used for network validation. The DUT locations are labeled at a meter-level scale in the format of \((p_x, p_y)\), and are interpolated between end locations based on a constant traveling speed of 0.46 m/s between them. Therefore, the location resolution is 4.6 cm at the system data sampling rate of 10 Hz. And the total number of sampling locations is 1540 along the 6 predefined paths of the entire route of 70 m.
The MSEs of location tracking in each path are presented in Table 1. The MSEs of different paths are all about 1 meter, which implies that the proposed model is robust and can offer an acceptable accuracy in a device-based tracking scenario. To provide a baseline for performance comparison, we also implemented a straightforward model, which learns spatial features with a 3-layer DNN network (512-256-128) and uses the same RNN model to learn the temporal correlation. The results show that their positioning accuracies are at the same level. However, the DNN-based model requires 854400 neurons to learn the spatial characteristics, which is 3.37 times larger than the proposed CNN feature extraction model. As a result, the proposed model can characterize the spatial relationship with a lower complexity.
On the other hand, the window size of the RNN model also affects the performance of the tracking algorithm. In this work, the maximum window size is set to 21 sampling intervals to exploit the spatial correlations of CSI measurements in a maximum separation distance of 46 cm. Table 1 also shows the results of different window sizes. Although the accuracies improve with larger window sizes, the improvement gets saturated when the window size is beyond 15, or the correlation distance reaches \(7\times 4.6 \simeq 32\) cm, which is also a distance where the CSI correlation drops below 0.7 as shown in Fig. 3.
5.2 Scenario 2: Device-Free Target Tracking
In this positioning scenario, the DUT is not equipped with a WiFi TX. The reference signal for positioning is transmitted by a WiFi TX installed at a reference point (RP) in the test field. Therefore, the location of the RP is a key factor for target tracking performance. In our experimental setting, we have 7 RPs as shown in Fig. 1. The distance between TX RPs is about 1.4 m.
During the data collection period, the WiFi TX is placed at a given RP while the DUT moves along three predefined paths (Path 1, Path 3, and Path 5) at a constant traveling speed of 0.46 m/s. There are a total of 740 sampling locations along the three paths of the entire route of 34 m. For each path, we repeated the experiments for 5 times and recorded the CSI and RSSI measurements in 5 separate data sets. Following the same setting in scenario 1, three of the data sets are used for network training and the other two for network validation. However, the data collection procedure was repeated for 7 times to verify the performance w.r.t. the 7 WiFi TX RPs in the device-free scenario.
The MSEs of device-free location tracking for the different TX RPs are provided in Table 2. Among them, when WiFi TX is placed at RP 6, the IPS achieves the best performance with an average MSE of 1.3 m. In contrast, the IPS has the worst MSE of 2.1 m when WiFi TX is placed at RP 1. The varying tracking performance shows that device-free positioning can be severely affected by the source location of WiFi TX. To find the reason, we closely examined the results for WiFi TX at RP 5 and RP 7, respectively, and plotted the positioning MSE at the true locations of DUT as shown in Fig. 10. We can see from the figures that the positioning MSEs for DUT closer to the WiFi TX are better than those for DUT away from the WiFi TX. This is because device-free positioning relies on the channel disturbances due to the moving DUT. If the DUT is far from the WiFi TX, the wireless channels and the measured CSI, would become very weak and noisy, hence, not able to provide useful location information for the DUT. Table 2 also compares the performance with the proposed feature extraction model and that of the DNN model.
5.3 Scenario 3: Device-Free Target Tracking and Following
In this scenario, both the locations of a target and an agent device are tracked by the IPS. The agent device, e.g. an AGV, is equipped with a WiFi TX and follows behind the device-free target at a fixed distance, and continuously transmits WiFi packets for IPS to track its location and the target’s as well. This makes the IPS a hybrid positioning system that provides both a device-based and a device-free location tracking services. To conduct experiments for this challenging scenario, we consider a case where the agent device is within 1.4 m of the target and is at one of the RPs defined in Fig. 1. This allows us to use the CSI and the RSSI data collected in the device-free target tracking scenario for the target tracking and following problem considered herein. Taking path 3 as an example, the agent device, namely the WiFi TX, moves along a sequence of RPs, \(\{RP7, RP6, RP1, RP4, RP5\}\), and the CSI and RSSI measurements collected for the target that is within 1.4 m in front of the agent device are used for positioning the target by our IPS.
There are a total of 3 paths in our experiments as defined in Fig. 11(a). For each path, we have 5 sets of measurement data. Three of the data sets are used for network training, and two of them are used for network validation. The total length of these three paths is around 18 m, yielding 401 samples in each data set. The output labels of the network consist of two parts: the locations of the device-free target and the agent device. Both locations are labeled at a meter-level scale in the format of \((p_x, p_y)\). The proposed method has a very competitive performance with an average MSE of 0.773 m for the device-free target. Compared to typical device-free positioning, the IPS enjoys a \(64\%\) improvement in MSE of positioning when an agent device follows the device-free target.
As shown in Fig. 11(b), the positioning errors on the testing routes are more uniform and precise than the results in Fig. 10. Although target tracking is implemented in a device-free manner, the agent device and its corresponding TX signal are able to produce highly spatially correlated CSI and RSSI measurements, due to the closer distances between the WiFi TX of the agent and the moving target. As a result, this target tracking method not only leads to a more accurate positioning precision for device-based tracking of the agent, but also provides a better precision for device-free tracking of the target.
6 Conclusions
The problem of device-free target tracking with RSSI and CSI measurements of wireless signals were reinvestigated in this work. Compared to existing works in this area, we are particularly interested in applications where the device-free target under tracking is followed by a device that transmits the reference signals for location tracking of the target and the device itself. To fulfill the goal, we reinvestigated the spatial correlations of RSSI and CSI measurements, and designed a deep neural network model to learn and exploit the spatial and temporal characteristics of CSI and RSSI for location tracking. The neural network architecture consists of a DNN to extract the coarse spatial features of RSSI w.r.t. the indoor space, and two CNNs to extract the fine spatial features in the phases and the amplitudes of CSI w.r.t. the device that transmits the reference signals. The multi-resolution spatial features of RSSI and CSI are further merged and transformed by an RNN model to extract the temporal features in them for target tracking purposes. The proposed learning-based IPS can be applied for three locating scenarios: (1) device-based location tracking; (2) device-free target tracking; and (3) device-free target tracking and following. Our experimental results show that the proposed IPS can provide a positioning accuracy of 0.898 meters in device-based location tracking. For device-free target tracking, however, the average MSE of positioning is 2.164 meters, and is found mainly affected by the propagation distances from the reference signals to the target. To improve location precision, a novel device-free target tracking method was proposed in scenario (3) where the target’s location is tracked based on reference signals sent from a device that follows the target. Both the target and the device need to be positioned in this scenario. The results show that the MSE of target tracking with the proposed method can reduce to 0.773 meters, which has a significant \(64\%\) improvement over the device-free tracking method in scenario (2). This suggests that the proposed method has a good potential to be extended for device-free multi-target tracking in the future.
Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
References
Balo, F., & Torǧul, B. (2016). Internet of things: A survey. International Journal of Applied Mathematics, Electronics and Computers, 104–110. https://doi.org/10.18100/ijamec.267197
Yassin, A., Nasser, Y., Awad, M., Al-Dubai, A., Liu, R., Yuen, C., Raulefs, R., & Aboutanios, E. (2017). Recent advances in indoor localization: A survey on theoretical approaches and applications 19(2), 1327–1346. https://doi.org/10.1109/COMST.2016.2632427
Khalajmehrabadi, A., Gatsis, N., & Akopian, D. (2017). Modern WLAN fingerprinting indoor positioning methods and deployment challenges. IEEE Communications Surveys & Tutorials, 19(3), 1974–2002. https://doi.org/10.1109/COMST.2017.2671454
He, S., & Chan, S.-H.G. (2016). Wi-Fi fingerprint-based indoor positioning: Recent advances and comparisons. IEEE Communications Surveys & Tutorials, 18(1), 466–490. https://doi.org/10.1109/COMST.2015.2464084
Bahl, P., & Padmanabhan, V. N. (2000). Radar: an in-building RF-based user location and tracking system. In: Proceedings IEEE INFOCOM 2000 - Conference on Computer Communications, vol. 2, pp. 775–784. https://doi.org/10.1109/INFCOM.2000.832252
Yang, Z., Zhou, Z., & Liu, Y. (2013). From RSSI to CSI: Indoor localization via channel response. ACM Computing Surveys, 46(2), 25–12532. https://doi.org/10.1145/2543581.2543592
Wu, Z.-H., Han, Y., Chen, Y., & Liu, K. J. R. (2015). A time-reversal paradigm for indoor positioning system. IEEE Transactions on Vehicular Technology, 64(4), 1331–1339. https://doi.org/10.1109/TVT.2015.2397437
Mazuelas, S., Bahillo, A., Lorenzo, R. M., Fernandez, P., Lago, F. A., Garcia, E., Blas, J., & Abril, E. J. (2009). Robust indoor positioning provided by real-time RSSI values in unmodified WLAN networks. IEEE Journal of Selected Topics in Signal Processing, 3(5), 821–831. https://doi.org/10.1109/JSTSP.2009.2029191
Ko, C.-H., & Wu, S.-H. (2022). A framework for proactive indoor positioning in densely deployed WiFi networks. IEEE Transactions on Mobile Computing, 21(1), 1–15. https://doi.org/10.1109/TMC.2020.3001127
Zhang, L., & Wang, H. (2019). 3D-WiFi: 3D localization with commodity WiFi. IEEE Sensors Journal, 19(13), 5141–5152. https://doi.org/10.1109/JSEN.2019.2900511
Wu, K., Xiao, J., Yi, Y., Chen, D., Luo, X., & Ni, L. M. (2013). CSI-based indoor localization. IEEE Transactions on Parallel and Distributed Systems, 24(7), 1300–1309. https://doi.org/10.1109/TPDS.2012.214
Youssef, M., & Agrawala, A. (2008). The Horus location determination system. Wireless Networks, 14(3), 357–374. https://doi.org/10.1007/s11276-006-0725-7
Feng, C., Au, W. S. A., Valaee, S., & Tan, Z. (2012). Received-signal-strength-based indoor positioning using compressive sensing. IEEE Transactions on Mobile Computing, 11(12), 1983–1993. https://doi.org/10.1109/TMC.2011.216
Wang, X., Wang, X., & Mao, S. (2020). Deep convolutional neural networks for indoor localization with CSI images. IEEE Transactions on Network Science and Engineering, 7(1), 316–327. https://doi.org/10.1109/TNSE.2018.2871165
Wang, X., Gao, L., Mao, S., & Pandey, S. (2015). DeepFi: Deep learning for indoor fingerprinting using channel state information. In: IEEE Wireless Communications and Networking Conference (WCNC), pp. 1666–1671. https://doi.org/10.1109/WCNC.2015.7127718
Wang, X., Wang, X., & Mao, S. (2017). ResLoc: Deep residual sharing learning for indoor localization with csi tensors. In: IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), pp. 1–6. https://doi.org/10.1109/PIMRC.2017.8292236
Liu, Y., Xiong, W., Zhu, Z., & Li, S. (2018). CSI-based high accuracy device free passive localization system. In: IEEE 88th Vehicular Technology Conference (VTC-Fall), pp. 1–5. https://doi.org/10.1109/VTCFall.2018.8690670
Dang, X., Si, X., Hao, Z., & Huang, Y. (2019). A novel passive indoor localization method by fusion CSI amplitude and phase information. Sensors, 19(4). https://doi.org/10.3390/s19040875
Chen, K. M., & Chang, R. Y. (2020). Semi-supervised learning with GANs for device-free fingerprinting indoor localization. In: IEEE Global Communications Conference (GLOBECOM), pp. 1–6. https://doi.org/10.1109/GLOBECOM42002.2020.9322456
Zhang, Y., Qu, C., & Wang, Y. (2020). An indoor positioning method based on CSI by using features optimization mechanism with LSTM. IEEE Sensors Journal, 20(9), 4868–4878. https://doi.org/10.1109/JSEN.2020.2965590
Hoang, M. T., Yuen, B., Dong, X., Lu, T., Westendorp, R., & Reddy, K. (2019). Recurrent neural networks for accurate RSSI indoor localization. IEEE Internet of Things Journal, 6(6), 10639–10651. https://doi.org/10.1109/JIOT.2019.2940368
Zhang, L., Gao, Q., Ma, X., Wang, J., Yang, T., & Wang, H. (2018). DeFi: Robust training-free device-free wireless localization with WiFi. IEEE Transactions on Vehicular Technology, 67(9), 8822–8831. https://doi.org/10.1109/TVT.2018.2850842
Goldsmith, A. (2005). Wireless Communications. Cambridge University Press, Cambridge, U.K. https://doi.org/10.1017/CBO9780511841224
Rappaport, T. S. (2002). Wireless Communications: Principles and Practice, 2nd edn. Prentice Hall, Hoboken, New Jersey, U.S. https://dl.acm.org/doi/10.5555/559977
OpenWrt Project: Welcome to the OpenWrt Project. (2004 (accessed March 1, 2020)). https://openwrt.org/
Xie, Y., Li, Z., & Li, M. (2015). Precise power delay profiling with commodity WiFi. In: Proceedings of the 21st Annual International Conference on Mobile Computing and Networking. MobiCom ’15, pp. 53–64. ACM, New York, NY, USA. https://doi.org/10.1145/2789168.2790124
IEEE standard for information technology– local and metropolitan area networks– specific requirements– part 11: Wireless lan medium access control (MAC) and physical layer (PHY) specifications amendment 5: Enhancements for higher throughput. Technical report, IEEE (2009). https://doi.org/10.1109/IEEESTD.2009.5307322
Chen, C., Chen, Y., Han, Y., Lai, H., & Liu, K. J. R. (2017). Achieving centimeter-accuracy indoor localization on WiFi platforms: A frequency hopping approach. IEEE Internet of Things Journal, 4(1), 111–121. https://doi.org/10.1109/JIOT.2016.2628701
Wang, X., Gao, L., & Mao, S. (2016). CSI phase fingerprinting for indoor localization with a deep learning approach. IEEE Internet of Things Journal, 3(6), 1113–1123. https://doi.org/10.1109/JIOT.2016.2558659
Tadayon, N., Rahman, M. T., Han, S., Valaee, S., & Yu, W. (2019). Decimeter ranging with channel state information. IEEE Transactions on Wireless Communications, 18(7), 3453–3468. https://doi.org/10.1109/TWC.2019.2914194
Zhang, F., Chen, C., Wang, B., Lai, H.-Q., Han, Y., & Liu, K. J. R. (2018). WiBall: A time-reversal focusing ball method for decimeter-accuracy indoor tracking. IEEE Internet of Things Journal, 5(5), 4031–4041. https://doi.org/10.1109/JIOT.2018.2854825
Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In: 3rd International Conference on Learning Representations (ICLR 2015). https://doi.org/10.48550/ARXIV.1409.0473
Schuster, M., & Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11), 2673–2681. https://doi.org/10.1109/78.650093
Acknowledgements
This research has been funded in part by the Ministry of Science and Technology (MOST), Taiwan, under the grants of 109-2221-E-009-103-MY3, 109-2218-E-009-002, 109-2221-E-009-025, 110-2221-E-009-031, 110-2224-E-A49-001, 111-2218-E-A49-024, and 111-3114-E-A49-001, in part by the Center for Open Intelligent Connectivity under the Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan, and is grateful for the support of The National Defense Science and Technology Academic Collaborative Research Project in 2023.
Funding
This research has been funded in part by the National Science and Technology Council (NSTC), Taiwan, under the grants of 109-2221-E-009 -103 -MY3, 111-2218-E-A49-024 and 111-3114-E-A49-001.
Author information
Authors and Affiliations
Contributions
Ching-Lan Chen conducted the algorithm developments, data measurements, program writing, and performance verifications for this work. Chun-Hsien Ko contributed to the data measurements and performance verifications, Sau-Hsuan Wu provided the original idea, algorithm development, and performance verification, Heng-Shih Tseng performed part of the data measurement and program writing, and Ronald Y. Chang did part of the algorithm development and performance verification of this work.
Corresponding author
Ethics declarations
Ethics Approval
N.A.
Conflict of Interest/Competing Interests
N.A.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Chen, CL., Ko, CH., Wu, SH. et al. Device-Free Target Following with Deep Spatial and Temporal Structures of CSI. J Sign Process Syst 95, 1327–1340 (2023). https://doi.org/10.1007/s11265-023-01862-y
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11265-023-01862-y