1 Introduction

The rapid development of smartphones has made it a carrier of location-based service (LBS), such as indoor localization, navigation, and tracking. In recent years, received signal strength (RSS) fingerprint-based Wi-Fi positioning methods have attracted the attention of many researchers because RSS can be easily obtained by a Wi-Fi-integrated mobile device without any additional hardware [1,2,3,4,5]. The performance of the fingerprint-based methods depends on the number of the reference points (RPs) in unit space. However, as RSS measurement is time-consuming and laborious, an increase in the number of RPs will increase positioning costs.

Figure 1 shows the RSS distribution of three different access points (APs) in the same space. It can be seen that the RSS distribution of different APs in the same space is different because of the differences of their place, so there is a corresponding relationship between the location and the RSS values, just like the fingerprints of persons. It is called “location fingerprint.”

Fig. 1
figure 1

RSS distributions for three different APs in the same space. a The RSS value received from the AP1 at the RPi.b The RSS value received from the AP2 at the RPi.c The RSS value received from the AP3 at the RPi

Fingerprint-based positioning requires first establishing a fingerprint database and then performing online matching in two stages of rough matching and precise matching in this paper. Wireless signals depend on the propagation environment, and the multi-path characteristics of different channels are different. During the transmission of wireless signals, unique signals related to the transmission environment are produced through reflection, refraction, and scattering. The multi-path feature is referred to as the “location fingerprint” [6]. However, since the radio frequency (RF) signals can vary over time and space due to obstacles and multi-path fading, this may degrade performance of using RSS fingerprints for positioning. It is therefore difficult to achieve good accuracy in most applications.

To achieve better indoor localization results, there are many fusion approaches to enhance the positioning accuracy [2, 7, 8]. However, if different types of sampling terminals are employed in offline stages and online stages, the positioning accuracy will be reduced directly [9]. Such device heterogeneity will affect the performance of indoor localization [10]. Therefore, the key of indoor localization technology is to design an accurate real-time indoor localization system without any hardware installation and modification, which can be readily deployed to the mobile devices.

A method by the existing Wi-Fi infrastructure, without the need of knowing the specific location of each AP, is proposed in this paper. The compressive sensing technology is used for reconstructing the position fingerprints, and the received signal strength difference (RSSD) is used for replacing the traditional RSS. The purpose is to improve the positioning accuracy without increasing the hardware investment, no matter which devices the users use.

The main contributions of this paper are as follows:

  • To present a method of indoor positioning through the fusion of RSSD and rompressive sensing (CS) without any hardware installation and modification. Replace the traditional RSS with RSSD to realize localization, which can undermine the influence of device heterogeneity on localization accuracy to some extent. At the same time, the compressive sensing algorithm is presented for location matching. The existing reference point data are used for restoring the complete fingerprint database so that every point in space has a corresponding fingerprint and the localization accuracy can be improved to a certain extent.

  • To present an improved fuzzy clustering method (IFCM) algorithm, unlike the traditional fuzzy clustering method (FCM) algorithm, it is not necessary to determine the clustering center in advance. It takes all sample points as the candidate clustering centers and gives each sample point a corresponding real number, which is called the biased parameter. The larger the deviation parameter is, the more likely to be determined as the clustering center.

The rest of this paper is organized as follows: Section 2 describes a summary of related work. Next, Section 3 describes the overall system architecture and RSSD-CD methods. The experimental results and the related discussion are presented in Section 4. Finally, Section 5 provides the conclusions of this paper.

2 Related work

Fingerprint-based location technology does not require prior knowledge of infrastructure location, nor does it need a propagation model. In general, fingerprint-based methods map RSS values to physical locations through pattern recognition technology [11]. This kind of RSS-localization mapping is obtained from the site survey, and it can be located by using different statistical modeling or machine learning algorithm to compare data in online measurement and pre-existing database [12, 13]. K-nearest neighbor (KNN) [14], multi-layer perceptron [15], neural network [16, 17], maximum likelihood [18, 19], support vector machine [20], nuclear methods [21,22,23], etc. can be used in indoor localization. Nuclear-based machine learning technology can also be applied to WLAN localization [21]. Indoor signal propagation environment, however, is quite complex. The RSS of a given location usually changes over time due to the reason of interference, reflection, refraction, multi-path fading, direction of equipment, hardware changes, temperature, humidity, and even attack [24,25,26,27,28,29]. Additional radio bandwidth will also change RSS in a bandwidth-constrained system [30, 31]. Such noise causes the mismatch between the online measuring data and the offline recorded data in the location fingerprint system. The previously stored fingerprinting map no longer reflects the statistical features of the current RSS, so the performance of the system decreases. In an actual indoor environment, this problem is inevitable. At the same time, the Wi-Fi chipsets on different smartphones may be sensitive to different Wi-Fi APs and channels. Therefore, the signal values and length of signal vector [32] are different. The operating systems of smartphones may support different APs. The detection rate and number of APs sensed can be altered [33]. Therefore, even the same device with different operating systems may still have heterogeneity.

In literature [34, 35], the authors proposed that RSS for different devices follow a linear model for the same signals. The signals of two devices are given, and the linear calibration model is obtained by regression. Offline training is required before the exact mapping relationships. However, manual data acquisition and calibration are inconvenient for deployment. To reduce such labor-intensive work, some crowd-sourced algorithms [36] can be implemented to provide efficient signal data for calibration. To reduce the workload of offline calibration, online calibration is adopted in literature [32, 37]. Hossain et al. have proposed a simple signal strength difference (SSD) algorithm based on the signal and antenna gain model. The constant factor of the antenna gain is derived. A similar idea is employed in the received signal strength indication (RSSI) ratio [37]. An AP measurement value is selected [37], and each AP signal is divided by this constant value. These methods are easy to realize on the online calibration of the signal vector.

Numerous researchers improve the accuracy of the indoor localization system from different aspects. Wang et al. [8] have used curve fitting (CF) technology to determine RSS-distance relation in an indoor environment, instead of using a logarithmic distance model and a series of linearly independent functions to build a generic RSS-distance model. The advantage of this approach is that the data can be extracted from a small number of RSS measurements. By dividing space and curve fitting in each partition, a more accurate RSS-distance relationship can be obtained for each AP. First, a subarea which a mobile device belongs to is determined, and then the sum of distance errors can be minimized to realize localization. However, the drawback of this scheme is that it only considers the regular layout of the room and not considers the diversity of terminals.

Fang et al. [38] have matched the RSS value of the mobile device and the fingerprint database by using the compressive sensing principle. Compressive sensing technology can be applied to compensate for the scarcity compressed signal or noise. This method uses a small amount of fingerprint sampling to reconstruct a complete fingerprinting map, it can reduce the workload of building a fingerprint database and also can improve the accuracy of positioning, but this method still does not take the diversity of terminals into account and has a higher computational complexity.

Wen et al. [39] have proposed a new algorithm to adjust the RSS fingerprint database by using the feedback information about the surrounding environment. The system uses different, updated measurements and offline fingerprint points according to the time and spatial intensity of the location. Because the adaptive radius is applied as the main factor of affecting positioning accuracy, this method can solve the problem of signal occlusion which were caused by moving obstacles in simulation and actual scenes.

Zheng et al. [40] have used a low-tubal-rank tensor to model Wi-Fi fingerprints of all reference points (RPs) and proposed an adaptive sampling scheme via approximate volume sampling to improve reconstruction accuracy of radio map with reduced expenditure. Meanwhile, they have provided a mathematical theory to analyze and derive the performance bounds of the proposed method in terms of sample complexity and reconstruction error.

References [41,42,43] are based on outdoor positioning. Wu et al. [41] proposed a mobile positioning system and a mobile positioning method based on recurrent neural networks to analyze the RSSIs from heterogeneous networks, which include cellular networks and Wi-Fi networks. The network signals from heterogeneous networks can be analyzed to improve the accuracies of the estimation of locations. Cheng et al. [42] proposed an intelligent positional approach for high-speed trains based on ant colony optimization and machine learning algorithms. The proposed methods can enhance the real-time performance in the online updating process on the premise of reducing the positioning error. Chen et al. [43] proposed a two-stage estimation algorithm based on variable projection method for GPS positioning. The proposed method can effectively mitigate multi-path interference.

Many researchers have used different methods to improve indoor localization performance. However, they do not consider the influence of both equipment heterogeneity and non-reference point factors on positioning accuracy. Compared to these works, our received signal strength difference and compressive sensing (RSSD-CS) method can restore complete signal space with limited RSS measurement. It can reduce noise interference and remove outliers to some extent. No matter what type of terminals are being used, a satisfactory positioning effect can be obtained.

3 Methods

Although locational fingerprinting technologies based on Wi-Fi have relatively high localization accuracy, collected RSS samples could be affected by indoor environmental factors, such as multi-path, shielding, and person’s movement. RSS values are likely to be different even at the same device, same location, and different time, which could affect localization accuracy significantly. Due to the differences in the type of chipsets in the devices, the RSS values measured by different devices at the same time, in the same location, are also different. It is impossible to establish a fingerprint database for each device in the database. Therefore, the error caused by the heterogeneity of devices cannot be ignored.

As can be seen from Fig. 2, the sampling of RSS values at a different time in the same location is constantly changing. The estimation of location only by absolute RSS value will produce a large error. It is necessary to select the mean value within continuous time sampling as the measured RSS values for calculation both online and offline states. Doing so can reduce the adverse impact of environmental factors, but it is impossible to eliminate it fundamentally.

Fig. 2
figure 2

RSS measured by the same device at the same location during one minute

The indoor propagation model of the wireless signal is [44]:

$$ P(d){\left|{}_{\mathrm{dB}\mathrm{m}}=P\left({d}_0\right)\right|}_{\mathrm{dB}\mathrm{m}}-10\beta l\mathrm{g}\left(\frac{d}{d_0}\right)+{X}_{\mathrm{dB}}=10\lg \left(\frac{P_{\mathrm{AP}}{G}_{\mathrm{AP}}{G}_{\mathrm{MT}}{\lambda}^2}{16{\pi}^2{d_0}^2L}\right)-10\beta \lg \left(\frac{d}{d_0}\right)+{X}_{\mathrm{dB}} $$
(1)

where PAP is the transmission power of AP, GAP is the antenna gain of AP, GMT is the antenna gain of mobile terminal (MT), L is system loss factor, λ is the wavelength of the wireless signal. Due to different Wi-Fi network interface controllers (NICs) embedded in smartphones, the antenna gain (GAP and GMT) may be different. This leads to RSS samples at the same time, and the same location by different devices are also different.

From formula (1), it can be seen that the RSS value at a distance d depends on the hardware parameter of MT. So,

$$ {\left.P\left({d}_1\right)\right|}_{\mathrm{dB}\mathrm{m}}=10\lg \left(\frac{P_{\mathrm{AP}1}{G}_{\mathrm{AP}1}{G}_{\mathrm{MT}}{\lambda_{\mathrm{AP}1}}^2}{16{\pi}^2{d_0}^2{L}_1}\right)-10{\beta}_1\lg \left(\frac{d_1}{d_0}\right)+{\left[{X}_1\right]}_{\mathrm{dB}} $$
(2)
$$ {\left.P\left({d}_2\right)\right|}_{\mathrm{dB}\mathrm{m}}=10\lg \left(\frac{P_{\mathrm{AP}2}{G}_{\mathrm{AP}2}{G}_{\mathrm{MT}}{\lambda_{\mathrm{AP}2}}^2}{16{\pi}^2{d_0}^2{L}_2}\right)-10{\beta}_2\lg \left(\frac{d_2}{d_0}\right)+{\left[{X}_2\right]}_{\mathrm{dB}} $$
(3)

Formula (2) minus formula (3) is:

$$ P\left({d}_1\right){\left|{}_{\mathrm{dBm}}-P\left({d}_2\right)\right|}_{\mathrm{dBm}}=10\lg \left(\frac{P_{\mathrm{AP}1}{G}_{\mathrm{AP}1}{\lambda_{\mathrm{AP}1}}^2{L}_2}{P_{\mathrm{AP}2}{G}_{\mathrm{AP}2}{\lambda_{\mathrm{AP}2}}^2{L}_1}\right)-10{\beta}_1\lg \left(\frac{d_1}{d_0}\right)+10{\beta}_2\lg \left(\frac{d_2}{d_0}\right)+{\left[{X}_1-{X}_2\right]}_{dB} $$
(4)

As can be seen from formula (4), the difference of two RSS values eliminates the item of GMT, that is, RSSD eliminates the impact of device heterogeneity. The difference among APs can be used as fingerprint features, for example, RSS differences, i.e., a RSSD value between two APs can be used as a fingerprint. RSSD can reflect the interval relationship between fingerprints at a specific time and location, and it is a more stable wireless signal feature than the absolute RSS value. But, for n APs, the values of independent RSSD can only be n − 1. This reduction in dimension may make RSSD slightly less accurate than RSS when using the same device in the offline and online stages. However, Kjargaard [45] has pointed out that the increase of RSS fingerprint dimension does not bring obvious improvement of localization accuracy when n is greater than 5. So, when using different devices, RSSD as the location fingerprint can bring significant improvement to accuracy.

WLAN-based fingerprint localization consists of two stages: the offline stage and the online stage. During the offline stage, RSS reading of each access point (AP) was collected at the uniformly distributed reference point (RP), and the RSS values were clustered and eliminated outliers. The most stable m APs and their corresponding RSS values of each cluster were extracted as the eigenvalues for the subsequent rough localization. During the online stage, first, the AP and corresponding RSS reading measured online was compared with the clustering eigenvalues in the offline stage to find the most similar clusters and roughly locate several subsets. Then, the orthogonalization is done on the set to meet the necessary conditions for compressive sensing principle. The processed set is calculated as the RSSD between APs. Finally, the compressive sensing theory is used for accurate localization. The flow chart of the proposed scheme is shown in Fig. 3

Fig. 3
figure 3

Block diagram of the proposed indoor localization system

3.1 Offline stage

In general, the localization application scenario is that the user holds a mobile device, measures RSS reading of some APs which can be sensed, and the user's location is shown on the map or illustrated as a message. The user's location estimation is done by comparing the similarity between the RSS values measured at the mobile and the fingerprint map. In this paper, the compressive sensing theory is used for reconstructing the fingerprint map. In the process of fingerprint matching, the RSSD value is used as the new fingerprint to compare the measured value to the database, to find the most matching fingerprint, and to realize the localization.

3.1.1 Fingerprint collection

In the offline phase, the original set of the RSS time samples collected from AP i in RP j are denoted as {ψi, j(τ), τ = 1, 2, …, q, q > 1}, where q represents the total number of time samples. The experiment result shows that the mean value of received signal strength tends to be stable when q ≥ 30 in the experimental environment. Then, the average of the RSS time samples is computed and stored in a database, known as a radio map. Such a radio map is used for describing the RSS spatial nature of the location area and can be represented by Ψ:

$$ \Psi =\left[\begin{array}{cccc}{\psi}_{1,1}& {\psi}_{1,2}& \cdots & {\psi}_{1,N}\\ {}{\psi}_{2,1}& {\psi}_{2,2}& \cdots & {\psi}_{2,N}\\ {}\vdots & \vdots & \ddots & \vdots \\ {}{\uppsi}_{L,1}& {\psi}_{L,2}& \cdots & {\psi}_{L,N}\end{array}\right] $$
(5)

where ψi, j is the average of RSS readings from AP i at RP j (unit: dBm), that is, \( {\psi}_{i,j}=1/q\cdotp {\sum}_{\tau =1}^q{\psi}_{i,j}\left(\tau \right),i=1,2,\cdots, L,j=1,2,\cdots, N \), L is the total number of APs that can be detected, and N is the number of RPs. Column vector \( {\overrightarrow{\psi}}_j \)in Ψ represents the average RSS vector that at the RP j received from L APs, is denoted as:

$$ {\overrightarrow{\psi}}_j={\left[{\psi}_{1,j},{\psi}_{2,j},\cdots, {\psi}_{L,j}\right]}^T,j\in \left\{1,2,\cdots, N\right\} $$
(6)

where the superscript T denotes the transposition. The variable values of the RSS time sampling sequence from each AP at all RPs are stored in the database to be used in fine localization by the AP’s selection mechanism subsequently. The variance vector corresponding to the RPj, j = 1, 2, ⋯, N is defined as:

$$ {\overrightarrow{\varDelta}}_j={\left[{\varDelta}_{1,j},{\varDelta}_{2,j},\cdots, {\varDelta}_{L,j}\right]}^T $$
(7)

where Δi, j is the unbiased estimated variance of RSS readings from APi at RPj.

\( {\varDelta}_{i,j}=1/\left(q-1\right)\cdotp {\sum}_{\tau =1}^q{\left({\psi}_{i,j}\left(\tau \right)-{\psi}_{i,j}\right)}^2,i=1,2,\cdots, L \). The table composed of \( \left\{\left({x}_j,{y}_j;{\overrightarrow{\psi}}_j,{\overrightarrow{\varDelta}}_j\right),j=1,2,\cdots, N\right\} \) forms a complete fingerprint database and is stored in the server.

3.1.2 Improved fuzzy clustering method (IFCM)

Due to the indoor complex environment and complex time-varying characteristics of radio wave propagation, the online RSS is different from that in the radio map. As can be seen from Fig. 2, there is a deviation of 2–20 dBm in the RSS reading of the same AP at the same sampling point over time. To reduce the impact of time variability and remove potential outliers, the reference points in the database can be clustered. The members of the same class have the characteristics of geographic proximity and RSS fingerprint similarity. These characteristics can be used for the rough localization mechanism of the subsequent online stage, and narrow the localization area to one or more classes. Clustering can weaken the influence of time-varying and spatial interference of partial RSS, improve the efficiency of accurate localization, and reduce the time complexity of the algorithm.

There are many clustering methods, for example, the K-means clustering algorithm, FCM algorithm, ant colony algorithm, and neural network algorithm. Wu et al. [46] proposed an efficient pixel clustering-based method for mining spatial sequential patterns from serial remote sensing images. To cluster pixels into a pixel-group rapidly, the images were converted to the run-length coding schema, from which images can be overlaid with each other efficiently, to produce pixels with a sequence list. Chen et al. [47] proposed a travel time prediction system based on data clustering for waste collection vehicles. The adaptive-based clustering (ABC) methods analyze the similarities or distances between the data and cluster centers and group these data into several clusters in accordance with a threshold. He et al. [48] proposed an evolutionary K-means (EKM) algorithm, which combines K-means and genetic algorithm, solves K-means initiation problem by selecting parameters automatically through the evolution of partitions. Two aggregated consensus matrices are defined to store the clustering tendency for each pair of instances. They store the tendency that a pair of data instances should group together or group apart, respectively. The matrices were used to evaluate partitions. Liao et al. [49] proposed a tensor factorization-based user cluster (TFUC) model. The latent influence users are identified by a neural network clustering model. This model can filter the marketing users with low influence before constructing tensor, which is proven to significantly enhance the recommendation effect. Wang et al. [50] proposed two efficient co-clustering algorithms via nonnegative matrix tri-factorization. First, an effective penalty nonnegative matrix three-factorization method is introduced for high-order orthogonality constraints. Secondly, the three-factorization of symmetric penalty nonnegative matrix is used to deal with the co-aggregation problem in the extraction of the sample similarity matrix.

FCM algorithm is the commonly used fuzzy clustering method, but this algorithm is sensitive to the initial value and easy to fall into local extreme value, it is difficult to get the global optimal solution. Unlike the traditional FCM algorithm, the IFCM clustering algorithm does not need to determine the clustering center in advance. It takes all n sample points as the candidate clustering center and gives each sample point a real value, which is called the biased parameter. The larger parameters are more likely to be used as clustering centers. At the same time, the selection of biased parameters will also affect the number of clustering. The algorithm establishes similarity information between each sample point and other sample points, maximizes the fitness function in the sample competition through the message iteration of the loop between samples, and finally forms the class center and class members. Because the algorithm does not need to randomly select the class center and the convergence speed is fast, it is applied to the data preprocessing in the offline phase of the localization system.

Sim(i,j) represents the similarity between the reference point RPj and the reference point RPi. The received signal strength vector at the reference point RPj can be approximately expressed as \( {\overrightarrow{\psi}}_j+{\varepsilon}_j \), where εj is the measurement noise, approximate Gaussian distribution under certain conditions. Therefore, Euclidean distance can be used as a judgment basis to measure the similarity between RSS vectors of reference points. The similarity between reference points is defined as:

$$ \mathrm{Sim}\left(i,j\right)=-\left\Vert {\overrightarrow{\psi}}_i\right.-{\left.{\overrightarrow{\psi}}_j\right\Vert}^2,\forall i,j\ne i\in \left\{1,2,\dots, N\right\} $$
(8)

where Sim(k, k), k = 1, 2, …, N is defined as the self-similarity function, and the value is assigned as the bias parameter to indicate the possibility of the reference point RPk becoming the class center. Since there is no prior knowledge, all reference points are likely to become class centers. The bias parameters of each reference point are defined as a function of the median similarity value, which can be expressed as:

$$ b=\alpha \cdotp \mathrm{median}\Big\{\mathrm{Sim}\left(i,j\right),\forall i,j\in \left\{1,2,\dots, N\right\},j\ne i $$
(9)

where α is the empirical value obtained during the experiment process, which is used for selecting a moderate number of classes.

The core of the improved FCM clustering algorithm is the transmission of messages between two reference points:

Each RP is assigned the same offset parameter to form the affiliation matrix a[N][N], whose initial value is a function of the median value of Sim(i,j).

Attractiveness messages are the adaptability of one reference point to be the class center of other reference points, and affiliation messages are the affiliation degree of a reference point, which becomes a class member centered on another reference point. IFCM clustering is to continuously search for evidence from data and carry out message transmission in a cyclic and iterative way, to produce a high-quality class center and assign a classification center to every member.

Attractiveness message a(i, j)

Attractiveness message a(i, j) is sent from the reference point RPi to RPj, reflecting the cumulative evidence which RPj as a class attracts to RPi under the influence of considering another candidate reference point besides RPj, is denoted as:

a(i, j) = Sim(i, j) − max {t(i, j) + Sim(i, j)}, j ≠ j, (10)

where t(i, j) are affiliation messages.

Affiliation message t(i, j)

Affiliation message t(i, j) is sent from candidate class center RPj to RPi, reflecting the accumulated evidence that under the influence of other class members taking RPj as the center, RPi believes that RPj is belonging to its class center, they are denoted as:

$$ t\left(i,j\right)=\min \left\{0,a\left(j,j\right)+{\sum}_{i^{\prime}\ne i,j}\max \left\{0,a\left({i}^{\prime },j\right)\right\}\right\},{i}^{\prime}\ne i $$
(11)

The self-affiliation message t(j, j) reflects the accumulated evidence of RPj as the class center after the positive attraction values are sent from other reference points besides RPj to RPj. It is denoted as:

$$ t\left(j,j\right)={\sum}_{i^{\prime}\ne j}\max \left\{0,a\left({i}^{\prime },j\right)\right\} $$
(12)

Class centers compete through passing two messages between the pairs of reference points. For every RPi, calculating j = argmaxj ∈ {1, 2, …, N}{t(i, j) + a(i, j)}, if j = i, then RPi is selected as the class center. On the contrary, RPj is going to be the class center of RPi. The messages are passed recursively between pairs of RPs within each radio map, and the above updating rules are followed until the appropriate number of class convergences, the corresponding class center is formed.

In general, the reference points located at the center of the cluster usually have a bigger sum of attraction to other reference points. Therefore, they are more likely to become the class center. On the contrary, for those located at the border of the cluster, the possibility of becoming the class center is low due to smaller sums of attraction.

3.2 Online stage

Define the RSS value measured by the user's handheld terminal device at any unknown location as:

$$ {\overrightarrow{\varphi}}_r={\left[{\varphi}_{1,r},\dots, {\varphi}_{L,r}\right]}^T $$
(13)

where φK, r is the online average RSS. The online localization stage is divided into a coarse localization stage and fine localization stage. The coarse localization first compares the similarity between the RSS vectors \( {\overrightarrow{\varphi}}_r \)and every class center in the offline database, and an appropriate threshold is set to narrow the localization area with a sub-set of the reference point, which not only reduces the computational complexity of the localization algorithm but also can remove the errors caused by outliers at a distance. Then, select the most powerful m APs from the RSS vector \( {\overrightarrow{\varphi}}_r \)measured online, conducting the AP selection of the data in the offline database and select the same m APs to participate in the calculation. To eliminate device heterogeneity, the subtraction between the first AP and the other m − 1 APs was performed to generate a new RSSD matrix for localization.

3.2.1 Coarse localization

According to the definition of similarity function, the set of matching class center is denoted as:

$$ \mathrm{Si}{\mathrm{m}}_{\mathrm{m}\mathrm{t}}=\left\{\mathrm{Sim}\left(r,j\right)>\beta, \kern0.5em j\in H\right\} $$
(14)

where H is the number of clusters, β is the threshold for controlling the number of matching classes, which is defined as the function of maximum similarity and minimum similarity, and is denoted as:

$$ \beta =\max \left\{\mathrm{Sim}\left(r,j\right)\right\}+{\beta}_1\min \left\{\mathrm{sim}\left(r,j\right)\right\} $$
(15)

where β1 is usually a real number between 0.3 and 0.5.

After the matching class is determined, the RSS vector set received by the members of the matching class is denoted as a \( L\times \overset{\sim }{N} \) matrix \( \overset{\sim }{\Psi} \):

$$ \overset{\sim }{\Psi}=\left[{\overset{\sim }{\psi}}_{i,j},i=1,2,\dots, L,\forall j\in C\right] $$
(16)

which means \( \overset{\sim }{\Psi}=\left[\begin{array}{cccc}{\overset{\sim }{\psi}}_{1,1}& {\overset{\sim }{\psi}}_{1,2}& \cdots & {\overset{\sim }{\psi}}_{1,\overset{\sim }{N}}\\ {}{\overset{\sim }{\psi}}_{2,1}& {\overset{\sim }{\psi}}_{2,2}& \cdots & {\overset{\sim }{\psi}}_{2,\overset{\sim }{N}}\\ {}\vdots & \vdots & \ddots & \vdots \\ {}{\overset{\sim }{\psi}}_{L,1}& {\overset{\sim }{\psi}}_{L,2}& \cdots & {\overset{\sim }{\psi}}_{L,\overset{\sim }{N}}\end{array}\right] \), where \( \overset{\sim }{N} \) is the number of matching class members, and means \( \overset{\sim }{N}=\left|C\right| \)

AP selection mechanism for the strongest RSS

According to the measured value of online RSS, m APs with the strongest RSS are selected for calculation. Sort the online RSS received from L APs in descending order, select the first M APs, and complete the matrix c line by line. Since the user receives different signals from different APs in different positions, the matrix \( \overset{\sim }{\overset{\sim }{\Psi}} \) needs to be filled in dynamically.

$$ \overset{\sim }{\overset{\sim }{\Psi}}=\left[\begin{array}{cccc}{\overset{\sim }{\overset{\sim }{\psi}}}_{1,1}& {\overset{\sim }{\overset{\sim }{\psi}}}_{1,2}& \cdots & {\overset{\sim }{\overset{\sim }{\psi}}}_{1,N}\\ {}{\overset{\sim }{\overset{\sim }{\psi}}}_{2,1}& {\overset{\sim }{\overset{\sim }{\psi}}}_{2,2}& \cdots & {\overset{\sim }{\overset{\sim }{\psi}}}_{2,N}\\ {}\vdots & \vdots & \ddots & \vdots \\ {}{\overset{\sim }{\overset{\sim }{\psi}}}_{M,1}& {\overset{\sim }{\overset{\sim }{\psi}}}_{M,2}& \cdots & {\overset{\sim }{\overset{\sim }{\psi}}}_{M,N}\end{array}\right] $$
(17)

Received signal strength difference (RSSD) conversion

After determining the matrix \( \overset{\sim }{\overset{\sim }{\Psi}} \), the subtraction between the other m − 1 APs and the first AP is calculated to form the matrix \( \overset{\sim }{\overset{\sim }{\Psi}}^{\prime } \).

$$ \overset{\sim }{\overset{\sim }{\Psi}}^{\prime }=\left[\begin{array}{cccc}\overset{\sim }{\overset{\sim }{\psi }}{\prime}_{1,1}& \overset{\sim }{\overset{\sim }{\psi }}{\prime}_{1,2}& \cdots & \overset{\sim }{\overset{\sim }{\psi }}{\prime}_{1,N}\\ {}\overset{\sim }{\overset{\sim }{\psi }}{\prime}_{2,1}& \overset{\sim }{\overset{\sim }{\psi }}{\prime}_{2,2}& \cdots & \overset{\sim }{\overset{\sim }{\psi }}{\prime}_{2,N}\\ {}\vdots & \vdots & \ddots & \vdots \\ {}\overset{\sim }{\overset{\sim }{\psi }}{\prime}_{M-1,1}& \overset{\sim }{\overset{\sim }{\psi }}{\prime}_{M-1,2}& \cdots & \overset{\sim }{\overset{\sim }{\psi }}{\prime}_{M-1,N}\end{array}\right] $$
(18)

where \( {\tilde{{\tilde{\psi}}^{\prime}}}_{i,j} \) is the difference between the signal strength received from APiand AP1 at the reference point RPj, and is denoted as:

$$ {\tilde{{\tilde{\psi}}^{\prime}}}_{i,j}={\tilde{\tilde{\psi}}}_{i,j}-{\tilde{\tilde{\psi}}}_{1,j} $$
(19)

Let M represent the total number of the strongest AP selected in the localization area and \( \overset{\sim }{N} \)represent the total number of reference points in the matching class.

The RSSD vector received online is defined as \( {{\overrightarrow{\varphi}}_r}^{\prime } \),

$$ {{\overrightarrow{\varphi}}_r}^{\prime }={\left[{\varphi}_{1,r}^{\prime },\dots, {\varphi}_{M-1,r}^{\prime}\right]}^T $$
(20)

where \( {\varphi}_{i,r}^{\prime } \) represents the RSSD between APi and AP1 at the request location, that is:

$$ {\varphi}_{i,r}^{\prime }={\varphi}_{i,r}-{\varphi}_{1,r} $$
(21)

3.2.2 Fine localization

Sparse characteristics and problem modeling

To apply compressive sensing, sparse characteristics and irrelevancy are required. Firstly, for the localization system, the spatial position of the user at a specific time is unique and sparse. Ideally, suppose that the user just stands on a RP, so θ can be defined as a vector of \( \overset{\sim }{N}\times 1 \), and each element represents a reference point in the spatial domain selected in rough localization. Thus, the user’s position can be accurately represented by a 1-dimensional sparse vector, that is, the corresponding reference point position that the user is standing on is 1, and the other positions are 0:

$$ \theta ={\left[0,\dots, 0,1,0,\dots, 0\right]}^T $$
(22)

Set y as the measurement vector of online RSS, yas the vector composed of the RSSD between APi and AP1, and y can be further described as:

$$ {y}^{\prime }={\overset{\sim }{\overset{\sim }{\Psi}}}^{\prime}\theta +\delta $$
(23)

where \( \overset{\sim }{\overset{\sim }{\Psi}}=\varPhi \overset{\sim }{\Psi} \); the definition of \( {\overset{\sim }{\overset{\sim }{\Psi}}}^{\prime } \) is shown in formulas (18) and (19). Φ is M × L matrix and is a selection on the measurement vector y. Each row of Φ is a 1 × L vector, only φ(r) = 1, where r is the index of selected AP. \( \overset{\sim }{\Psi} \) is a \( L\times \overset{\sim }{N} \)matrix, is a subset of the offline fingerprint library defined in (16), each column represents the RSS vector of a selected class member in the coarse localization region. δ is unknown environmental noise. In the ideal environment without considering noise, \( {{\overrightarrow{\varphi}}_r}^{\prime } \) is a column about \( {\overset{\sim }{\overset{\sim }{\Psi}}}^{\prime } \) corresponding to an index position of θ. In this case, the relationship between the measurement vector y and the online received signal strength vector φr is:

$$ y=\Phi {\varphi}_r $$
(24)

The purpose of localization is to reconstruct the 1-dimensional sparse vector θ from formula (23) according to the measurement vector difference y at the moment t, y is calculated by formulas (21) and (20) after the AP selection mechanism.

Orthogonalization

In addition to the sparse characteristics, the uncorrelated or weak correlation matrix \( {\overset{\sim }{\overset{\sim }{\Psi}}}^{\prime } \) is an important feature to ensure that the linear projection of signals can maintain the original structure of signals, which become another important guarantee for sparse signals reconstruction using the compressive sensing theory. The restricted isometry property (RIP) condition of the matrix \( {\overset{\sim }{\overset{\sim }{\Psi}}}^{\prime } \) is given and proved by Candes and Tao [51], and it is a necessary condition using the compressive sensing theory to realize sparse signals reconstruction through 1-minimization [52, 53] approximating 0-minimization. Therefore, orthogonalizing the matrix \( {\overset{\sim }{\overset{\sim }{\Psi}}}^{\prime } \)is necessary.

Orthogonalize y defined in formula (24), that is, z = Ty. Define T = QR+.\( R={\overset{\sim }{\overset{\sim }{\Psi}}}^{\prime } \)Q = orth(RT)T, represents the orthogonalization of the matrix R,R+ is the pseudo-inverse matrix of R. Therefore, the localization problem can be described as the minimization problem of the following 1-norm:

$$ \hat{\theta}=\arg \underset{\theta \in {R}^N}{\min }{\left\Vert \theta \right\Vert}_1,s.t.z= Q\theta +{\delta}^{\prime } $$
(25)

z = Ty = QR+ · ( + δ) = QR+ ·  + δ =  + δ\( {\overset{\sim }{\overset{\sim }{\Psi}}}^{\prime } \)is known, Q meets the RIP characteristic [54]. Since θ has a sparse nature, according to the theory of compressive sensing theory [53, 55,56,57,58], if the number of APs M meets \( M=O\left(\mathit{\log}\overset{\sim }{N}\right) \), the location indicator θ can be well recovered from z with very high probability by solving the following 1-minimization problem:

When the user happens to be located at a reference point, it is ideally reconstructed into a 1-dimensional sparse vector, the value of the index position is 1, while the other positions are 0. However, in practice, the user does not necessarily happen to be at a certain reference point, so the reconstruction of \( \hat{\theta} \)is not necessarily a 1-dimensional sparse vector. Some larger values are taken at the index positions of a few candidate reference points, and the remaining index positions are approximately 0. To some extent, these non-zero values between (0, 1) reflect the possibility of corresponding candidate reference points as position estimation. Therefore, the threshold σ is set, and the reference point corresponding to the coefficient greater than the threshold σ is taken as the second-level candidate reference point and is denoted as ϒ,

$$ \Upsilon =\left\{n|\hat{\theta}(n)>\sigma \right\} $$
(26)

Therefore, the linear combination of these second-order reference points with weights will be used as the final position estimation for accurate localization.

$$ \left(\hat{x},\hat{y}\right)={\sum}_{n\in \Upsilon}{\mu}_n\cdotp \left({\hat{x}}_n,{\hat{y}}_n\right) $$
(27)

where \( {\mu}_n=\hat{\theta}(n)/{\sum}_{i\in \varUpsilon}\hat{\theta}(i) \). The localization is completed.

4 Experimental results and discussion

4.1 RSSD robustness analysis

Thirty locations were randomly selected and sampled with two different mobile terminals for signal strength. RSS and RSSD values of each location were compared, as shown in Fig. 4. Figure 4a is the RSS sequence of the same AP at 30 sampling points of two different mobile terminals. Figure 4b is the RSSD sequence of the difference in signal strength for a pair of APs in two different terminals at 30 sampling points.

Fig. 4
figure 4

Robustness comparison between RSS and RSSD. a RSS sampling from different mobile terminals. b RSSD sampling from different mobile terminals.

It can be seen that at each sampling point, the RSS values obtained by different mobile terminals are different, with a maximum difference of 14.31 dBm, while the RSSD values obtained by different mobile terminals maintain a relative consistency. Although there is a difference between the two RSSD values at the same sampling point, the maximum difference is only 1.74 dBm. That RSSD had better robustness compared with RSS is consistent with the previous analysis.

4.2 Comparison of fingerprint location accuracy of RSS and RSSD

In practice, terminals used in localization are mostly different from that used in fingerprint database construction. At the same time, the number of AP used for localization is also a major factor affecting the localization accuracy. Therefore, in the process of the experiment, two cases of terminal homogeneity and heterogeneity were tested, respectively. Using the traditional KNN algorithm, when the same type of terminals is used for the offline fingerprint database construction and the online localization, the localization accuracy of RSSD is slightly lower than that of the RSS when 6 APs are selected. When 10 APs are selected, their performance is similar, and both of them can achieve the localization accuracy within 3 m in case of over 87%. However, in the case of different types of terminals, the performance of the RSS decreases significantly, and the localization accuracy within 3 m is only 32%, while the localization accuracy within 3 m of RSSD reaches 58%, 85% of the probability can reach the localization accuracy within 5 m, and the localization accuracy can be improved by more than 20% as shown in Fig. 5.

Fig. 5
figure 5

Comparison of RSS and RSSD localization performance. a Terminal homogeneity. b Terminal heterogeneous

It can be seen that the localization accuracy in the case of terminal heterogeneity is significantly lower than that in the case of homogeneity. RSS-based localization decreased by more than 50%, RSSD-based decreased by nearly 30%. But in the practical application, the situation of terminal heterogeneity is universal, so it is urgent to improve the performance of the localization system.

4.3 Using compressive sensing localization performance

In the case of terminal homogeneous, the RSS-based using compress sensing localization accuracy can reach 100% within 3 m, 80% within 1 m and 67% within 0.4 m, with the performance improved by nearly 20%. In the case of terminal heterogeneous, the localization accuracy within 3 m can be achieved at 40% and within 4 m at 60%, with the performance improved by nearly 3.8%. It can be seen that compressive sensing has no obvious effect on device heterogeneity, which is shown in Fig. 6.

Fig. 6
figure 6

Comparison of RSS-based using compressive sensing localization performance. a Terminal homogeneity. b Terminal heterogeneous

In the case of terminal homogeneous, the localization accuracy of the RSSD-based using compress sensing can reach 100% within 3 m, 79% within 1 m and 67% within 0.4 m, with the performance improved by nearly 20%. In the case of terminal heterogeneous, the localization accuracy within 3 m can be achieved at 73% and within 4 m at 92%. It can be seen that RSSD-CS fusion algorithm greatly improves the positioning accuracy in the case of homogeneous and heterogeneous terminals, which is shown in Fig. 7.

Fig. 7
figure 7

Comparison of RSSD-CS localization performance. a Terminal homogeneity.

b Terminal heterogeneous

The performance of RSSD-CS algorithm is verified through experiments, and the specific summary is shown in Tables 1 and 2.

Table 1 Different percentile values concerning the positioning error under terminal homogeneous
Table 2 Different percentile values concerning the positioning error under terminal heterogeneous

5 Conclusion

This paper proposes an RSSD-CS indoor localization algorithm to address the terminal heterogeneity, which could improve the indoor localization accuracy. Firstly, an IFCM algorithm is used for clustering map data to eliminate outliers in the offline phases, narrow the search range for online matching, and reduce time complexity. Then, during the online phases, RSSD values of APs sensed and selected at location points and matching matrices on the map are calculated orthogonally to meet the constraints of CS. Finally, CS theory is used for estimating the user’s location. The experimental results demonstrate that the proposed localization method leads to substantial improvements in localization accuracy over the widely used traditional fingerprinting methods. In particular, the positioning accuracy of this method is improved by 20.5% and 15.6% compared to SSD and received signal strength and compressive sensing (RSS-CS) algorithm.

Future research of this work may have three directions. First, the proposed IFCM algorithm has a higher time complexity, which requires a longer time to calculate when the number of samples is large. While this work only improves the shortcomings of the clustering algorithm in terms of its sensitivity to initial values, an in-depth research for other clustering algorithms could be conducted in the future. Second, the calculation of RSSD values that is conducted in online stages increases the computational complexity. If RSSD values of map data can be obtained offline in the future, the timeliness of location could be guaranteed. Third, the localization accuracy is also related to the network element layout. If the layout of network elements can be optimized in the future, which could ensure that 5–7 first-path signals can be received at any point in the positioning space, the effect of multi-path can be effectively avoided and the positioning accuracy can be further improved.