1 Introduction

Recently, various positioning techniques have been developed for accurate indoor geolocation. These techniques can be roughly divided into two classes. The first and most popular class is the traditional technique, which measures received signal strength (RSS), time of arrival (TOA) or other signal property metrics and calculates the distance between the transmitter and receiver. Triangulation is then applied to the measured distances to determine the location. The other class is based on pattern recognition, which is more intelligent and can improve the positioning performance when the measurements are not reliable [1, 2].

In a 3D indoor environment, the localization problem becomes extremely complicated. The first consideration is the difference between the techniques utilized in indoor and outdoor geolocation systems. The widely used GPS is no longer an option for the indoor environment since it requires a line of sight (LOS), which is blocked in most situations. Many new techniques have been explored recently. The most used one is the RF-based technique, since properties of RF signals such as Received Signal Strength (RSS) and Time-of-Arrival (ToA) can be utilized to determine the distance between the transmitter and receiver. Triangulation can then be applied to find the location of the mobile points (MPs) [2,3,4,5]. Recently, more sensors have been introduced into the area of indoor geolocation, such as the accelerometer, gyroscope, and barometer, an approach commonly described as sensor fusion [19, 6,7,8]. Moreover, since most buildings have cameras throughout for security purposes, image processing techniques can also be applied to indoor geolocation. Another consideration is the difference between the maps used in outdoor and indoor environments. Google Maps can be acquired easily when users are outdoors with highly accurate outdoor geolocation. But once one comes indoors, the outdoor map is no longer accurate enough, since outdoor maps cannot show the detailed structure of the indoor environment on different floors or the transitions between floors. So indoor maps should be used instead. The last consideration is 3D versus 2D geolocation schemes. A large amount of research has been conducted on 2D geolocation. But when it comes to 3D scenarios, the problem becomes more complex. For example, the WiFi signal suffers extra loss when passing through the ceilings between floors, and the path loss model should be changed accordingly. In 2D scenarios, every estimated location is on a certain floor. But in 3D scenarios, more information is needed, such as whether the user is inside or outside the building, which floor the user is on, and whether the user is in the elevator or on the stairs.

From the considerations above, it is crucial to look deeply into the 3D indoor geolocation problem from different aspects. The Cramer-Rao Lower Bound (CRLB) is a typical tool for evaluating the positioning performance of different techniques. It gives a lower bound on the error of a positioning scheme so that all the techniques can be compared accordingly. The CRLB has been well explored in previous research, but none of it considers the certainty of coverage. When a signal travels from the transmitter to the receiver, it suffers all kinds of fading and losses. There is a possibility that the received signal falls below the receiver’s sensitivity so that it cannot be detected. In this case, the location is out of coverage and the signal cannot be used for position detection. Considering the matter of coverage, it is essential to determine the probability that a location is within coverage before we start to calculate the CRLB [9, 10].

In this study, we explore how the coverage probability affects the CRLB calculation and consequently the performance evaluation. The calculation of coverage probability is derived specifically as the foundation of the subsequent CRLB calculation. Scenarios for the experiments are designed based on the infrastructure of the Atwater Kent Laboratory (AKL). Both 2D and 3D scenarios are designed and comparisons are made accordingly.

In the following section, we describe the methodology in detail. Section 3 elaborates on the results and analysis, while the conclusion is given in Sect. 4.

2 Related Work

Since smart phones are powerful, with various embedded sensors (barometer, gyroscope, etc.) and other applications (WiFi, GPS), approaches for intruder detection can be implemented in multiple ways, and some related work has been done on this topic. Smart devices have become essential parts of our daily life. Smart phone owners can not only use their phones to make phone calls, but also access a wide range of services and information. They can read breaking news, conduct transactions through online banking, and even get information about their health condition from these small but smart devices. Another important application of smart phones is localization and navigation. With various embedded sensors, location can be estimated by different techniques, which makes the smart phone a good candidate for both outdoor and indoor geolocation. GPS is reliable and accurate in outdoor localization. By acquiring line-of-sight signals from the satellites, the location can be calculated using triangulation. Other sensors can also be utilized for indoor localization, such as Wi-Fi, barometer, accelerometer, and gyroscope. Wi-Fi is the most popular technique used in indoor localization. From the received signal strength (RSS) or time of arrival (ToA), the distance from the access points (APs) can be calculated and triangulation can be applied to acquire the estimated location as well. The barometer is good at determining the height of the user inside the building, since altitude is closely related to the air pressure measured by the barometer. The accelerometer measures the acceleration of the movement. By calculating the second integral of the acceleration, the distance moved can be obtained. The gyroscope is helpful in detecting motion; by examining the data gathered from this sensor, motions can be detected and the location can be estimated accordingly [11, 12].

Although indoor geolocation has been explored for a couple of years, there are still challenging problems in this area, among which map selection and performance evaluation are the most critical. The commonly used Google Maps performs well in outdoor localization and navigation. Using Google Maps with GPS, one can be guided to a place quickly and accurately. However, Google Maps does not pay much attention to the indoor environment, in which the detailed layout should be displayed. So it is crucial to find out whether the smart phone is operating in an outdoor or indoor environment so that the correct map can be selected accordingly. Also, in multiple-floor buildings, the layout differs from floor to floor. Therefore, we should also find out which floor the smart phone user is currently located on so that the corresponding map can be displayed. The first part of the map selection problem can be described as outdoor-indoor transition detection and the second part as multi-floor transition detection. To solve these two detection problems, proper sensor selection is the very first step, after which scenarios and algorithms can be designed and experiments can be conducted [13,14,15].

For any indoor geolocation problem, it is equally important to evaluate the performance. A criterion is needed for designing algorithms and deploying the APs so that we can compare the different techniques used and choose the one with the best performance. But there are still some challenging issues related to this problem. A general and efficient method should be provided that can be used to analyze any indoor geolocation system. The approach should also be modifiable, since different localization schemes fuse different sensors and the modified scheme must be able to respond to any such change.

The work described in [16] presents an approach that detects intruders for WLAN access. Least Mean Square (LMS) and Prioritized Maximum Power (PMP) are used as two RSS-based matching algorithms. Their accuracy is compared in indoor and outdoor-indoor areas, and the PMP algorithm provides better performance than LMS in the positioning application.

An approach using fusion of sensors, WLAN signals and building information for indoor/campus localization is developed in [8]. This method shows the possibility of combining measurements from different sensors with building information to obtain accurate indoor localization, as well as the possibility that sensors can aid in intruder detection [17, 2].

Some indoor personal navigation applications are introduced in [2]. Map matching algorithms are implemented, which give the Pedestrian Navigation Module (PNM) the capability to provide localization results even with bad reception of GPS signals.

Another approach, described in [18, 19], fuses a dead reckoning (DR) algorithm, GPS, and RFID for pedestrian positioning. This method is implemented as a software module with web-based APIs on computing systems, and it shows that GPS and the active RFID tag system can seamlessly and effectively correct estimation errors in DR, as well as the possibilities for sensor fusion localization.

3 Probability of Coverage

To analyze the coverage certainty, we should start from the commonly used path loss model in decibels, which is given by:

$$\begin{aligned} L_p=L_0+10 \alpha \log _{10}r \end{aligned}$$
(1)

where \(L_p\) is the total path loss from the transmitter to the receiver, \(L_0\) is the normalized path loss, which is the power loss at 1 m, and \(\alpha\) is the gradient indicating the relation between distance and power. In office buildings, the building materials are brick, wood, metal, and other composites. These materials have different gradients from 2 to 6. In a large office area, \(\alpha\) varies with r, the distance from the transmitter to the receiver.

The transmitted signal is also expected to have different path losses in different directions, causing power variation when it reaches the receiver. This variation is commonly called shadow fading or large-scale fading since its cause is obstruction by objects around the receiver. It is not feasible to model shadow fading in a deterministic way, and therefore we usually use statistical models instead. We define l as the shadow fading in the radio propagation, which is a zero-mean normally distributed random variable with a standard deviation of \(\sigma\). The probability distribution function (PDF) for shadow fading can be written as:

$$\begin{aligned} f(l)=\frac{1}{\sqrt{2\pi }\sigma }e^{-l^2/2\sigma ^2} \end{aligned}$$
(2)

Every receiver has its own sensitivity, which is the minimum RSS that it can recognize. Given the PDF of the shadow fading, we can calculate the probability that the RSS at a location is lower than the sensitivity (Outage) as well as the probability that it is higher than the sensitivity (Coverage). The two probabilities sum to one, so we only need to calculate one of them. We denote s as the difference between the transmitted power and the sensitivity, which indicates the maximum power loss for effective transmission. Then the probability of coverage can be derived as follows:

$$\begin{aligned} Prob(Coverage)&=Prob(L_p+l<s)=Prob(l<s-L_p)\nonumber \\&=1-\int _{s-L_p}^{\infty }f(l) dl\nonumber \\&=1-\int _{s-L_p}^{\infty }\frac{1}{\sqrt{2\pi }\sigma } e^{-l^2/2\sigma ^2} dl\nonumber \\&=1-\frac{1}{2}erfc\left( \frac{s-L_p}{\sqrt{2}\sigma }\right) \end{aligned}$$
(3)

where erfc() is the complementary error function, and \(erfc(x)=\frac{2}{\sqrt{\pi }}\int _{x}^{\infty }e^{-t^2} dt\). Then we can replace \(L_p\) with Eq. (1), and the coverage probability is written as:

$$\begin{aligned} Prob(Coverage)=1-\frac{1}{2}erfc\left( \frac{s-L_0-10 \alpha \log _{10}r}{\sqrt{2}\sigma }\right) \end{aligned}$$
(4)

From Eq. (4), we can see that all the factors are constant except r, which means that the probability is a function of the distance between the transmitter and receiver.
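The dependence on distance can be illustrated numerically. The following is a minimal Python sketch of Eqs. (1) and (4); the function names and the parameter values (s = 80 dB, \(L_0\) = 40 dB, \(\alpha\) = 3, \(\sigma\) = 8 dB) are assumptions chosen only for illustration and are not taken from the experiments.

```python
import math


def path_loss_db(r_m, l0_db, alpha):
    """Eq. (1): path loss in dB at distance r_m (metres, r_m > 0)."""
    return l0_db + 10.0 * alpha * math.log10(r_m)


def coverage_probability(r_m, s_db, l0_db, alpha, sigma_db):
    """Eq. (4): probability that path loss plus shadow fading stays below s_db."""
    margin = s_db - path_loss_db(r_m, l0_db, alpha)
    return 1.0 - 0.5 * math.erfc(margin / (math.sqrt(2.0) * sigma_db))


# Assumed example values, for illustration only.
for r in (1.0, 10.0, 50.0, 100.0):
    p = coverage_probability(r, s_db=80.0, l0_db=40.0, alpha=3.0, sigma_db=8.0)
    print(f"r = {r:6.1f} m  Prob(Coverage) = {p:.4f}")
```

With fixed s, \(L_0\), \(\alpha\) and \(\sigma\), the probability decreases as r grows and equals 0.5 at the distance where \(L_p=s\).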

4 Channel Modelling with Probability of Coverage

Path loss models play an important role in designing localization algorithms and in performance evaluation. Numerous studies have focused on channel modelling for wide and local area networks. The empirical path loss model no longer fits the indoor 3D environment. In a multistory building, the power-distance gradient \(\alpha\) changes with distance, so the commonly used path loss model is given by:

$$\begin{aligned} L_p=L_0+ \left\{ \begin{array}{cr} 20log_{10} r, &{} (10 \ge r \ge 1m)\\ 20+30log_{10} \frac{r}{10}, &{} (20 \ge r> 10m)\\ 29+60log_{10} \frac{r}{20}, &{} (40 \ge r> 20m)\\ 47+120log_{10} \frac{r}{40}, &{} ( r > 40m)\\ \end{array}\right. \end{aligned}$$
(5)

From Eq. (5), it is clear that path loss becomes greater when the distance between the transmitter and receiver becomes larger. But the method that we use to calculate coverage certainty stays the same. Equation (35) can still be used in 3D scenarios; the only difference is that we should replace the empirical path loss model with the 3D distance-partitioned model, which yields a different \(L_p\).
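The distance-partitioned model of Eq. (5) can be written as a small function. This is a sketch only; the function name is hypothetical and the value of \(L_0\) used in the example is an assumption.

```python
import math


def partitioned_path_loss_db(r_m, l0_db):
    """Eq. (5): distance-partitioned path loss in dB, defined for r_m >= 1 m."""
    if r_m < 1.0:
        raise ValueError("Eq. (5) is defined for r >= 1 m")
    if r_m <= 10.0:
        excess = 20.0 * math.log10(r_m)
    elif r_m <= 20.0:
        excess = 20.0 + 30.0 * math.log10(r_m / 10.0)
    elif r_m <= 40.0:
        excess = 29.0 + 60.0 * math.log10(r_m / 20.0)
    else:
        excess = 47.0 + 120.0 * math.log10(r_m / 40.0)
    return l0_db + excess


# Assumed L0 = 40 dB, for illustration only.
for r in (1.0, 10.0, 20.0, 40.0, 80.0):
    print(r, round(partitioned_path_loss_db(r, l0_db=40.0), 2))
```

Because the offsets 29 and 47 in Eq. (5) are rounded, adjacent branches meet only approximately (within about 0.1 dB) at 20 m and 40 m. The returned value can be substituted for \(L_p\) in the coverage probability.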

However, in reality, the RSS sometimes cannot be detected in an indoor environment because of severe shadow fading. In this case, the typical method of channel modelling is not suitable, since the data are incomplete and we cannot find an optimized fit for the RSS data (Fig. 1).

By taking the probability of coverage into account, we can still obtain the expected RSS even if the interference from the surroundings is severe, and then we can apply the commonly used linear fit method to model the indoor channel.

Fig. 1 Channel modeling

Fig. 2 Probability of coverage in different transmitted power

5 CRLB with Bayes’ Theorem

Let’s assume that a mobile device is moving in an environment where N access points are deployed. These fixed access points can be any kind of IoT devices with different transmitted powers, frequencies and radio propagation characteristics. The derivation of the CRLB then starts from the empirical path loss model:

$$\begin{aligned} Lp_i=L_0+10\alpha _i log_{10}(r_i)+X(\sigma ) \end{aligned}$$
(6)

where the distance from the mobile device to the \(i\hbox {th}\) access point can be calculated as \(r_i=\sqrt{(x-x_i)^2+(y-y_i)^2}\). In this case, the parameter to be estimated is the coordinate of the mobile device’s location, which can be denoted as \(\theta =[x \quad y]^T\), and \((x_i, y_i)\) is the coordinate of the \(i\hbox {th}\) access point.

Then the probability distribution function of observation (pathloss) given certain estimator can be determined as:

$$\begin{aligned} p_i(Lp_i/\theta )=\frac{1}{\sqrt{2\pi } \sigma }exp\left( -\frac{1}{2\sigma ^2}[Lp_i-L_0-10\alpha _i log_{10}r_i]^2\right) \end{aligned}$$
(7)

We also have the probability of coverage as \(p_i(\theta )\). From Bayesian Theory, we also know that:

$$\begin{aligned} p_i(Lp_i;\theta )=p_i(Lp_i/\theta )\cdot p_i(\theta ) \end{aligned}$$
(8)

All the observations can be considered additive and independent of one another, thus the joint PDF for the observations can be derived as:

$$\begin{aligned} p(\varvec{Lp};\theta )= & {} \prod \limits _{i=1}^{N} p_i(Lp_i;\theta )\nonumber \\= & {} \prod \limits _{i=1}^{N} p_i(Lp_i/\theta )\cdot p_i(\theta ) \end{aligned}$$
(9)

From the theory of CRLB we also know that it is the inverse of Fisher Information Matrix, which is denoted as \(\varvec{I}(\theta )\) and can be calculated as follows:

$$\begin{aligned} \varvec{I}(\theta )=\left[ \begin{array}{cc} -E\left[ \frac{\partial ^2 ln (p(\varvec{Lp};\theta ))}{\partial x^2}\right] &{} -E\left[ \frac{\partial ^2 ln (p(\varvec{Lp};\theta ))}{\partial x \partial y}\right] \\ -E\left[ \frac{\partial ^2 ln (p(\varvec{Lp};\theta ))}{\partial x \partial y}\right] &{} -E\left[ \frac{\partial ^2 ln (p(\varvec{Lp};\theta ))}{\partial y^2}\right] \\ \end{array}\right] \end{aligned}$$

Since

$$\begin{aligned} ln(p(\varvec{Lp};\theta )) & {} = ln\left( \prod \limits _{i=1}^{N} p_i(Lp_i/\theta )\cdot p_i(\theta )\right) \\= & {} \sum \limits _{i=1}^{N} [ln(p_i(Lp_i/\theta ))+ln(p_i(\theta ))] \end{aligned}$$

We can rewrite the Fisher matrix as:

$$\begin{aligned} \varvec{I}(\theta )=\varvec{I}_1 (\theta )+\varvec{I}_2 (\theta ) \end{aligned}$$
(10)

where

$$\begin{aligned} \varvec{I}_1(\theta )=\sum \limits _{i=1}^{N}\left[ \begin{array}{cc} -E\left[ \frac{\partial ^2 ln (p_i(Lp_i/\theta ))}{\partial x^2}\right] &{} -E\left[ \frac{\partial ^2 ln (p_i(Lp_i/\theta ))}{\partial x \partial y}\right] \\ -E\left[ \frac{\partial ^2 ln (p_i(Lp_i/\theta ))}{\partial x \partial y}\right] &{} -E\left[ \frac{\partial ^2 ln (p_i(Lp_i/\theta ))}{\partial y^2}\right] \\ \end{array}\right] \end{aligned}$$

and

$$\begin{aligned} \varvec{I}_2(\theta )=\sum \limits _{i=1}^{N} \left[ \begin{array}{cc} -E\left[ \frac{\partial ^2 ln (p_i(\theta ))}{\partial x^2}\right] &{} -E\left[ \frac{\partial ^2 ln (p_i(\theta ))}{\partial x \partial y}\right] \\ -E\left[ \frac{\partial ^2 ln (p_i(\theta ))}{\partial x \partial y}\right] &{} -E\left[ \frac{\partial ^2 ln (p_i(\theta ))}{\partial y^2}\right] \\ \end{array}\right] \end{aligned}$$

So that

$$\begin{aligned} cov(dr)=\varvec{I}(\theta )^{-1}=\left[ \begin{array}{cc} \sigma _x^2 &{} \sigma _{xy}^2\\ \sigma _{xy}^2 &{} \sigma _y^2 \end{array}\right] \end{aligned}$$
(11)

CRLB is the trace of the covariance matrix, which means that \(\sigma _r^2=\sigma _x^2+\sigma _y^2\). In this way, the total CRLB can be calculated no matter how many APs are covered.

In fact, we can find the closed form of \(\varvec{I_1}(\theta )\) and \(\varvec{I_2}(\theta )\):

$$\begin{aligned} ln(p_i(Lp_i/\theta ))=ln\left( \frac{1}{\sqrt{2\pi }\sigma }\right) -\frac{(Lp_i-L_0-10\alpha _ilog_{10}(r_i))^2}{2\sigma ^2} \end{aligned}$$
(12)

Then:

$$\begin{aligned} \frac{\partial ln(p_i(Lp_i/\theta ))}{\partial x}=0-\frac{1}{2\sigma ^2} \cdot \frac{\partial [Lp_i-L_0-10\alpha _ilog_{10}(r_i)]^2}{\partial x} \end{aligned}$$
(13)

Let’s take \(\phi =Lp_i-L_0-10\alpha _ilog_{10}(r_i)\), so that:

$$\begin{aligned} \frac{\partial ln(p_i(Lp_i/\theta ))}{\partial x} =-\frac{1}{2\sigma ^2} \cdot \frac{\partial \phi ^2}{\partial x} \end{aligned}$$
(14)
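
As a sketch of how this step could be completed (this continuation is not spelled out in the text and assumes, as in Eq. (7), that \(\phi\) has zero mean given \(\theta\)), differentiating Eq. (14) once more and taking the expectation gives:

$$\begin{aligned} \frac{\partial \phi }{\partial x}&=-\frac{10\alpha _i}{ln10}\cdot \frac{x-x_i}{r_i^2}\\ \frac{\partial ^2 ln(p_i(Lp_i/\theta ))}{\partial x^2}&=-\frac{1}{\sigma ^2}\left[ \left( \frac{\partial \phi }{\partial x}\right) ^2+\phi \frac{\partial ^2 \phi }{\partial x^2}\right] \\ -E\left[ \frac{\partial ^2 ln(p_i(Lp_i/\theta ))}{\partial x^2}\right]&=\frac{1}{\sigma ^2}\left( \frac{10\alpha _i}{ln10}\right) ^2\frac{(x-x_i)^2}{r_i^4} \end{aligned}$$

The term containing \(\phi\) vanishes in expectation because \(E[\phi ]=0\). The \(y\) and cross terms follow in the same way, so under this assumption \(\varvec{I}_1(\theta )\) would equal \(\varvec{H}'\varvec{H}/\sigma ^2\), with the matrix \(\varvec{H}\) as defined for Eq. (20).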

6 CRLB with Probability of Coverage

In this section, we provide a general description of path loss models, from which the CRLB and the probability of coverage can be derived in multi-dimensional scenarios. Since the commonly used CRLB does not include the effect of transmitting power (shown in a later section), we combine the CRLB with the probability of coverage in pursuit of a more reliable performance evaluation.

6.1 Cramer-Rao Lower Bound

Cramer-Rao Lower Bound (CRLB) indicates the smallest estimation error under given observations and is frequently used in evaluating the performance of localization systems.

The matrix formulation is only applicable when there are more than 2 APs. If only one AP is available, another method should be applied instead.

In this case, the partial differential equation should be:

$$\begin{aligned} dL_{pi}(r)=\frac{10\alpha _i}{ln10}\frac{dr}{r}, i=1,2,...,N \end{aligned}$$
(15)

Then the location error can be estimated as follows:

$$\begin{aligned} dr=\frac{ln10\cdot r}{10\alpha _i}dL_{pi}(r), i=1,2,...,N \end{aligned}$$
(16)

And the covariance of dr can be derived:

$$\begin{aligned} cov(dr)= & {} \left( \frac{ln10\cdot r}{10\alpha _i}\right) ^2 cov(dL_{pi}(r))\nonumber \\= & {} \sigma ^2 \left( \frac{ln10\cdot r}{10\alpha _i}\right) ^2, i=1,2,...,N \end{aligned}$$
(17)

which is also the variance of dr, so the CRLB in this case is

$$\begin{aligned} \sigma _r=\frac{ln10\cdot r}{10\alpha _i} \sigma , i=1,2,...,N \end{aligned}$$
(18)
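
Equation (18) is simple enough to evaluate directly. The short sketch below does so; the function name and the example values (\(\alpha\) = 2, \(\sigma\) = 8 dB) are assumptions for illustration.

```python
import math


def single_ap_crlb_m(r_m, alpha, sigma_db):
    """Eq. (18): ranging error bound sigma_r (metres) for a single AP."""
    return math.log(10.0) * r_m * sigma_db / (10.0 * alpha)


# Assumed values: alpha = 2, sigma = 8 dB.
for r in (5.0, 10.0, 20.0):
    print(r, round(single_ap_crlb_m(r, alpha=2.0, sigma_db=8.0), 3))
```

The bound grows linearly with the distance r and with \(\sigma\), and shrinks as the gradient \(\alpha\) increases.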

In order to investigate the relation between the location error and signal strength error, we apply a differential operation to both sides of Eq. (33) with respect to two coordinates x and y, then we have:

$$\begin{aligned} dL_{pi}(x,y)=\frac{10\alpha _i}{ln10}\left( \frac{x-x_i}{r_i^2}dx +\frac{y-y_i}{r_i^2}dy\right) , i=1,2,...,N \end{aligned}$$
(19)

where \(L_{pi}\) is the total path loss from \(AP_i\) to the location \((x,y)\); \((x_i,y_i)\) is the coordinate of \(AP_i\); \(\alpha _i\) is the power-distance gradient for the signal coming from \(AP_i\); \(r_i\) is the distance between the receiver and \(AP_i\), and \(r_i=\sqrt{(x-x_i)^2+(y-y_i)^2}\); N is the number of APs.

The set of Eq. (19) can be written in matrix form as:

$$\begin{aligned} d\varvec{L_p}=\varvec{H}\cdot d\varvec{r} \end{aligned}$$
(20)

where

$$\begin{aligned} d\varvec{L_p}=\left[ \begin{array}{c} dL_{p1}\\ dL_{p2}\\ \vdots \\ dL_{pN} \end{array}\right] , d\varvec{r}=\left[ \begin{array}{c} dx\\ dy \end{array}\right] , \varvec{H}=\left[ \begin{array}{cc} \frac{10\alpha _1}{ln10}\frac{x-x_1}{r_1^2} &{} \frac{10\alpha _1}{ln10}\frac{y-y_1}{r_1^2} \\ \frac{10\alpha _2}{ln10}\frac{x-x_2}{r_2^2} &{} \frac{10\alpha _2}{ln10}\frac{y-y_2}{r_2^2} \\ \vdots &{} \vdots \\ \frac{10\alpha _N}{ln10}\frac{x-x_N}{r_N^2} &{} \frac{10\alpha _N}{ln10}\frac{y-y_N}{r_N^2} \\ \end{array}\right] \end{aligned}$$

From Eq. (20), we can estimate the location error.

$$\begin{aligned} d\varvec{r}=(\varvec{H}'\varvec{H})^{-1}\varvec{H}'d\varvec{L_p} \end{aligned}$$
(21)

Since the path loss estimation error is identical to the error caused by shadow fading, which has zero mean and variance \(\sigma ^2\), and the errors for different APs are independent of each other, we have the following two equations:

$$\begin{aligned} \varvec{E}[d\varvec{L_{pi}}]=0,cov(d\varvec{L_{pi}},d\varvec{L_{pj}})=\left\{ \begin{array}{c} \sigma ^2, i=j\\ 0, i \not = j\\ \end{array}i,j=1,2,...,N\right. \end{aligned}$$
(22)

Then the covariance matrix of the location error \(d\varvec{r}\) is given by

$$\begin{aligned} cov(d\varvec{r})=\sigma ^2(\varvec{H}'\varvec{H})^{-1}=\left[ \begin{array}{cc} \sigma _x^2 &{} \sigma _{xy}^2\\ \sigma _{xy}^2 &{} \sigma _y^2 \end{array}\right] \end{aligned}$$
(23)

The standard deviation of location error is finally derived as

$$\begin{aligned} \sigma _r=\sqrt{\sigma _x^2+\sigma _y^2} \end{aligned}$$
(24)

From Eq. (24), we can see that if the transmission environment is given, the location error relies only on the coordinates of the receiver \((x,y)\), and we can calculate the CRLB at any location accordingly.
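
To make the matrix computation of Eqs. (19)-(24) concrete, the following is a minimal pure-Python sketch for the 2D case. The function name, the AP layout (four corners of a 30 m × 30 m floor) and the values \(\alpha\) = 3 and \(\sigma\) = 8 dB are assumptions for illustration; the receiver must not coincide with an AP, since \(r_i=0\) makes the entries of \(\varvec{H}\) undefined.

```python
import math


def crlb_2d_m(x, y, aps, sigma_db):
    """Eqs. (19)-(24): sigma_r at (x, y); aps is a list of (x_i, y_i, alpha_i)."""
    k = 10.0 / math.log(10.0)
    a = b = c = 0.0  # entries of the symmetric matrix H'H = [[a, b], [b, c]]
    for xi, yi, alpha in aps:
        r2 = (x - xi) ** 2 + (y - yi) ** 2
        hx = k * alpha * (x - xi) / r2
        hy = k * alpha * (y - yi) / r2
        a += hx * hx
        b += hx * hy
        c += hy * hy
    det = a * c - b * b
    if det <= 1e-12 * a * c:
        raise ValueError("H'H is singular; the CRLB is undefined for this geometry")
    # trace of sigma^2 (H'H)^{-1} is sigma^2 (a + c) / det
    return sigma_db * math.sqrt((a + c) / det)


# Assumed layout: four APs at the corners of a 30 m x 30 m floor.
corners = [(0.0, 0.0, 3.0), (30.0, 0.0, 3.0), (0.0, 30.0, 3.0), (30.0, 30.0, 3.0)]
print(crlb_2d_m(15.0, 15.0, corners, sigma_db=8.0))
```

For a 2×2 matrix the trace of the inverse has a closed form, so no linear algebra library is needed here.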

The previous discussion focused on the 2D case. But in reality, 3D geolocation schemes are more important in indoor environments. Therefore, we should take a deeper look at how to extend our methods to the 3D environment.

The calculation for CRLB needs more expansion since the coordinate of every location becomes three dimensional. In 3D environment, we use similar method to start the derivation of CRLB.

To analyze the relation between RSS and the least location error (CRLB), we can apply partial differential to Eq. (25) [10, 19]. Then we have

$$\begin{aligned} dP_i(x,y,z)=-\frac{10\alpha }{ln10}\left( \frac{x-x_i}{r_i^2}dx +\frac{y-y_i}{r_i^2}dy+\frac{z-z_i}{r_i^2}dz\right) \end{aligned}$$
(25)

In this case, the matrix form should also be expanded to three dimensions, where

$$\begin{aligned} d\varvec{r}=\left[ \begin{array}{c} dx\\ dy\\ dz \end{array}\right] , \varvec{H}=\left[ \begin{array}{ccc} \frac{10\alpha _1}{ln10}\frac{x-x_1}{r_1^2} &{} \frac{10\alpha _1}{ln10}\frac{y-y_1}{r_1^2} &{} \frac{10\alpha _1}{ln10}\frac{z-z_1}{r_1^2}\\ \frac{10\alpha _2}{ln10}\frac{x-x_2}{r_2^2} &{} \frac{10\alpha _2}{ln10}\frac{y-y_2}{r_2^2} &{} \frac{10\alpha _2}{ln10}\frac{z-z_2}{r_2^2}\\ \vdots &{} \vdots &{} \vdots \\ \frac{10\alpha _N}{ln10}\frac{x-x_N}{r_N^2} &{} \frac{10\alpha _N}{ln10}\frac{y-y_N}{r_N^2} &{} \frac{10\alpha _N}{ln10}\frac{z-z_N}{r_N^2}\\ \end{array}\right] \end{aligned}$$

By using the same least-square estimation method we mentioned before, estimation of the location error can be evaluated:

$$\begin{aligned} d\varvec{r}=(\varvec{H}'\varvec{H})^{-1}\varvec{H}'d\varvec{P} \end{aligned}$$
(26)

and the covariance matrix of the location error is

$$\begin{aligned} cov(d\varvec{r})=\sigma ^2(\varvec{H}'\varvec{H})^{-1}=\left[ \begin{array}{ccc} \sigma _x^2 &{} \sigma _{xy}^2 &{} \sigma _{xz}^2\\ \sigma _{xy}^2 &{} \sigma _y^2 &{} \sigma _{yz}^2\\ \sigma _{xz}^2 &{} \sigma _{yz}^2 &{} \sigma _z^2 \end{array}\right] \end{aligned}$$
(27)

Then the CRLB can be calculated as follow:

$$\begin{aligned} \sigma _r=\sqrt{\sigma _x^2+\sigma _y^2+\sigma _z^2} \end{aligned}$$
(28)
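
The 3D bound can be evaluated in the same spirit. The sketch below builds the 3×3 matrix \(\varvec{H}'\varvec{H}\) from the 3D \(\varvec{H}\) given above and uses the cofactor form of the inverse to obtain \(\sigma _r\) of Eq. (28). The function name and the test geometry are assumptions for illustration; the sign in Eq. (25) does not affect the covariance.

```python
import math


def crlb_3d_m(pos, aps, sigma_db):
    """Eq. (28): sigma_r at pos = (x, y, z); aps is a list of (x_i, y_i, z_i, alpha_i)."""
    x, y, z = pos
    k = 10.0 / math.log(10.0)
    a = b = c = d = e = f = 0.0  # H'H = [[a, b, c], [b, d, e], [c, e, f]]
    for xi, yi, zi, alpha in aps:
        dx, dy, dz = x - xi, y - yi, z - zi
        r2 = dx * dx + dy * dy + dz * dz
        hx, hy, hz = (k * alpha * u / r2 for u in (dx, dy, dz))
        a += hx * hx
        b += hx * hy
        c += hx * hz
        d += hy * hy
        e += hy * hz
        f += hz * hz
    # Principal 2x2 minors; their sum divided by det is the trace of the inverse.
    m11 = d * f - e * e
    m22 = a * f - c * c
    m33 = a * d - b * b
    det = a * m11 - b * (b * f - c * e) + c * (b * e - c * d)
    if det <= 1e-12 * a * d * f:
        raise ValueError("H'H is singular; the CRLB is undefined for this geometry")
    return sigma_db * math.sqrt((m11 + m22 + m33) / det)


# Assumed geometry: six APs placed symmetrically 10 m from the receiver.
L = 10.0
aps = [(L, 0, 0, 3.0), (-L, 0, 0, 3.0), (0, L, 0, 3.0),
       (0, -L, 0, 3.0), (0, 0, L, 3.0), (0, 0, -L, 3.0)]
print(crlb_3d_m((0.0, 0.0, 0.0), aps, sigma_db=8.0))
```

When the receiver lies in the same plane as all APs, the third column of \(\varvec{H}\) is zero and the function raises an error rather than returning a bound.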

Since every coverage probability and CRLB is redefined here, the 3D CRLB can be calculated in the same way which is described in Eq. (32).

6.2 CRLB Concerning Probability of Coverage

From the previous section, we can calculate the probability that a location is covered by an AP as well as the CRLB, which shows the minimum location error under this condition. It is reasonable to calculate the CRLB with the effect of coverage certainty taken into account, so that the total CRLB will be more reliable and accurate.

We denote \(p_i\) as the probability that a certain location can be covered by \(AP_i\), which can be calculated by Eq. (4). Suppose there are N APs in total; the probability that k APs are covered can be calculated according to the probabilities we calculated before. The number of combinations C of selecting k elements out of N can be calculated as

$$\begin{aligned} C=\left( \begin{array}{c} N\\ k \end{array}\right) =\frac{N!}{k!(N-k)!}=\frac{N(N-1)\cdots (N-k+1)}{k(k-1)\cdots 1} \end{aligned}$$

To calculate the CRLB concerning coverage certainty, the probabilities of all the combinations should be explored, and the total CRLB should be the summation of every individual CRLB times its corresponding probability. For example, if only 1 AP is covered, then there are N combinations (\(C=N\)) in this case, one of which is that \(AP_1\) is the AP that is covered. In general, the probability that k APs are covered follows from probability theory as

$$\begin{aligned} Prob_k=\sum \limits _{i,j}^{C}\left[ \prod \limits _{i}^{k}p_i* \prod \limits _{j,j\not =i}^{N-k}(1-p_j)\right] \end{aligned}$$
(29)

where \(Prob_k\) is the probability that exactly k APs are covered while the others are not; for \(k=1\), one of the N terms is the probability that only \(AP_1\) is covered. Note that we should skip the situation when none of the APs is covered. In this condition, no location estimation can be made, since no information can be used to determine the location of the receiver. Therefore, it is useless to discuss this situation.

However, we should mention that when \(k=0,1,2\), which means that no AP or only one or two APs are covered, matrix \(H'H\) will be singular and the CRLB cannot be determined. This is reasonable since we are able to find the location estimate only when at least 3 APs are accessible. In this case, \(Prob_0\), \(Prob_1\) and \(Prob_2\) can no longer be used in computing the combined CRLB and we should normalize the probabilities by:

$$\begin{aligned} {Prob_k}'=\frac{Prob_k}{\sum \limits _{j=3}^{N}Prob_j}, \quad k=3,\ldots ,N \end{aligned}$$
(30)

And,

$$\begin{aligned} \sum \limits _{k=3}^{N}{Prob_k}'=1 \end{aligned}$$
(31)

Similarly, we can calculate the probabilities for all the other conditions (\(Prob_3, Prob_4, \cdots , Prob_N\)). Then the total CRLB can be calculated as follows

$$\begin{aligned} CRLB_{total}=\sum \limits _{k=3}^N CRLB_k\cdot {Prob_k}' \end{aligned}$$
(32)

In this way, we can calculate the total CRLB no matter how many APs are covered.
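
The weighting procedure can be sketched as follows. This is one possible reading, not necessarily the exact implementation used for the results: each subset of covered APs is weighted by its own probability, the weights are renormalized over the usable subsets, and `k_min` (an assumed parameter) is the smallest number of covered APs for which a CRLB is defined. All names are hypothetical, and `crlb_of_subset` stands for any function returning the CRLB of a given subset of AP indices.

```python
from itertools import combinations


def subset_probability(p, covered):
    """Probability that exactly the APs whose indices are in `covered` are heard."""
    prob = 1.0
    for i, p_i in enumerate(p):
        prob *= p_i if i in covered else (1.0 - p_i)
    return prob


def prob_k(p, k):
    """Eq. (29): probability that exactly k of the APs are covered."""
    return sum(subset_probability(p, set(s)) for s in combinations(range(len(p)), k))


def combined_crlb(p, crlb_of_subset, k_min=3):
    """Coverage-weighted CRLB over all subsets with at least k_min covered APs."""
    total = 0.0
    weight = 0.0
    for k in range(k_min, len(p) + 1):
        for s in combinations(range(len(p)), k):
            w = subset_probability(p, set(s))
            total += w * crlb_of_subset(s)
            weight += w
    if weight == 0.0:
        raise ValueError("no usable subset has non-zero probability")
    return total / weight  # division by `weight` is the normalization step


# Toy usage with assumed coverage probabilities and a placeholder CRLB function.
p = [0.9, 0.8, 0.7, 0.6]
print(combined_crlb(p, crlb_of_subset=lambda s: 10.0 / len(s)))
```

In practice `crlb_of_subset` would build \(\varvec{H}\) from the selected APs only and skip or reject subsets for which \(\varvec{H}'\varvec{H}\) is singular.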

6.3 Scenarios Design

To compare different geolocation systems, the very first step is to design test scenarios so that their performance can be evaluated in the same way.

We have designed 5 scenarios, which can be divided into two types: 2D scenarios and 3D scenarios. The first 3 scenarios are designed on the same floor, and we can compare them to see the effect of the number of APs. In Scenarios 4 and 5, APs are deployed on multiple floors, and we can compare the effect of 3D scenarios. The detailed scenario description is given as follows:

  • Scenario 1: 3 APs are placed on the ceiling of the same floor (at 3 of the 4 corners).

  • Scenario 2: 4 APs are placed on the ceiling of the same floor (at the 4 corners).

  • Scenario 3: 5 APs are placed on the ceiling of the same floor (at the 4 corners and the middle).

  • Scenario 4: 4 APs are placed on the ceiling of the 1st floor (at the 4 corners), and we calculate the total CRLBs of the three floors.

  • Scenario 5: 4 APs are placed on the ceiling of every floor (at the 4 corners, 12 APs in total), and we calculate the total CRLBs of the three floors.

We assume that every floor has an area of 30 m × 30 m and a height of 5 m. Every floor is sampled every 0.1 m (the 4 edges are not included), so we have 299 × 299 = 89401 samples in total for every scenario.
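
The sample count can be reproduced with a short sketch. The function name is hypothetical, and the coordinate origin at one corner of the floor is an assumption.

```python
def interior_grid(side_m=30.0, step_m=0.1):
    """Sample points of one floor on a regular grid, excluding the four edges."""
    n = round(side_m / step_m)                  # number of intervals per side
    coords = [i * step_m for i in range(1, n)]  # interior coordinates only
    return [(x, y) for x in coords for y in coords]


points = interior_grid()
print(len(points))
```

Each returned point can then be passed to a CRLB or coverage calculation to build a map over the floor.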

6.4 Limits and Challenges of Combined CRLB

There are two aspects that we should pay extra attention to.

First of all, increasing the number of access points dramatically increases the number of possible combinations as well as the amount of computation. For instance, when we have only 4 APs, there are \(\frac{4!}{3!1!}=4\) possibilities that 3 of them cover the mobile device. When the number increases to 5, we have \(\frac{5!}{3!2!}=10\) possibilities. When the number becomes 6, we have \(\frac{6!}{3!3!}=20\) possibilities. So the amount of computation increases greatly even if we add only one access point. The most reliable way is to limit the number of access points used in localization so that the computation stays within a reasonable range. This means we will pick the most useful points in estimating the receiver’s location, i.e.

The other aspect is the H matrix. For every deployment we can find a corresponding H matrix, but not all of these matrices are full rank. If a matrix is singular, we cannot find its inverse and therefore cannot calculate the CRLB from it. This happens when all the access points are located in the same plane, so that two of the columns in the H matrix have the same or opposite values. To avoid this situation, we have to remove the combinations that cause a singular H matrix from our possibility set. Also, we do not consider the situations when 0, 1, or 2 access points are covered, since the number of APs is too small to estimate the mobile device’s location from the available information. But note that these situations still occur with non-zero probability, thus when calculating the combined CRLB, we use the normalized probability instead of the one we calculated before.
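
The growth in the number of AP subsets noted at the start of this subsection can be checked with the standard library. The helper name is hypothetical, and the lower limit of 3 covered APs follows the exclusion of the 0, 1 and 2 AP cases described above.

```python
import math


def subsets_with_at_least(n_aps, k_min=3):
    """Number of AP subsets that contain at least k_min APs."""
    return sum(math.comb(n_aps, k) for k in range(k_min, n_aps + 1))


for n in (4, 5, 6, 12):
    print(n, math.comb(n, 3), subsets_with_at_least(n))
```

The total number of usable subsets grows roughly as \(2^N\), which is why limiting the number of APs used per location keeps the combined CRLB tractable.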

7 Probability-of-Coverage

The concept of Probability-of-Coverage (PoC) starts from the commonly used RF path loss model [13], which represents the distance-power relationship in different environments and scenarios of operation:

$$\begin{aligned} L_p=L_0+10 \alpha \log _{10}r \end{aligned}$$
(33)

where \(L_p\) is the total path loss from the transmitter to the receiver, \(L_0\) is the power loss at the first meter, r is the distance from the transmitter to the receiver, and \(\alpha\) is the gradient indicating the relation between distance and power. In office buildings, the building materials are brick, wood, metal, and other composites. These materials have different gradients from 2 to 6. Thus, \(\alpha\) varies with the environment, and various path loss models have been designed in the literature.

The transmitted signal is also affected by shadow fading or large-scale fading, denoted as \(X(\sigma )\), which is a zero-mean normally distributed additive random variable with a standard deviation of \(\sigma\). The probability distribution function (PDF) for shadow fading can be written as:

$$\begin{aligned} f(X(\sigma ))=\frac{1}{\sqrt{2\pi }\sigma }e^{-X(\sigma )^2/2\sigma ^2} \end{aligned}$$
(34)

Fig. 3 PoC in different transmitted power

Every device has its own sensitivity, which is the minimum RSS that it can recognize. Therefore, with the variation of shadow fading, the RSS at a certain location may fall below the sensitivity of the mobile device. Given the PDF of the shadow fading, we can calculate the probability that the RSS at a location is lower than the sensitivity (outage) as well as the probability that it is higher than the sensitivity (coverage). We denote by \(Lp_{max}\) the difference between the transmitted power and the sensitivity, which is the maximum power loss for effective transmission. Then the Probability-of-Coverage (PoC) can be derived as follows:

$$\begin{aligned} Prob(Coverage) & {} = Prob(L_p+X(\sigma )<Lp_{max})\nonumber \\ & {} = Prob(X(\sigma )<Lp_{max}-L_p)\nonumber \\ & {} = 1-\int _{Lp_{max}-L_p}^{\infty }f(X(\sigma )) dl\nonumber \\ & {} = 1-\int _{Lp_{max}-L_p}^{\infty }\frac{1}{\sqrt{2\pi }\sigma } e^{-X(\sigma )^2/2\sigma ^2} dl\nonumber \\ & {} = 1-\frac{1}{2}erfc\left( \frac{Lp_{max}-L_p}{\sqrt{2}\sigma }\right) \end{aligned}$$
(35)

where \(erfc(\cdot )\) is the complementary error function. From Eqs. (33) and (35), we can see that PoC is closely related to the specific path loss model, the standard deviation of shadow fading \(\sigma\), and the sensitivity and transmitted power. Note that although PoC is derived from the empirical path loss model here, we can replace \(L_p\) in Eq. (35) with any other path loss model that fits a specific scenario. Figure 3 shows the effect of transmitted power from \(-10\ \hbox {dBm}\) to 20 dBm at different distances, where the free-space path loss model is used (\(\alpha =2\)) and the standard deviation of shadow fading is set to 5 dB.
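Equations (33) and (35) translate directly into a few lines of code. The sketch below is illustrative only: \(\alpha =2\) and \(\sigma =5\) dB follow the Figure 3 setup, while the first-meter loss of 40 dB and the receiver sensitivity of −80 dBm are assumed values that the text does not specify, and the function names are hypothetical.

```python
import math


def path_loss_db(r_m, l0_db, alpha):
    """Eq. (33): path loss in dB at distance r_m (meters, r_m > 0)."""
    return l0_db + 10.0 * alpha * math.log10(r_m)


def prob_coverage(r_m, tx_power_dbm, sensitivity_dbm, l0_db, alpha, sigma_db):
    """Eq. (35): probability that the RSS stays above the receiver sensitivity."""
    lp_max = tx_power_dbm - sensitivity_dbm  # maximum tolerable path loss
    margin = lp_max - path_loss_db(r_m, l0_db, alpha)
    return 1.0 - 0.5 * math.erfc(margin / (math.sqrt(2.0) * sigma_db))


if __name__ == "__main__":
    L0_DB = 40.0            # assumed first-meter loss, not given in the text
    SENSITIVITY_DBM = -80.0  # assumed receiver sensitivity, not given in the text
    for tx_dbm in (-10, 0, 10, 20):
        row = [prob_coverage(r, tx_dbm, SENSITIVITY_DBM, L0_DB, 2.0, 5.0)
               for r in (5.0, 20.0, 50.0)]
        print(tx_dbm, [round(p, 3) for p in row])
```

When the fading margin \(Lp_{max}-L_p\) is exactly zero the function returns 0.5, which is a convenient sanity check on the sign convention.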

8 CRLB with PoC

The Cramer-Rao Lower Bound (CRLB) gives a lower bound on the estimation error achievable under given observations and is frequently used in evaluating the performance of localization systems [19, 20]. The derivation of CRLB has been widely used in the literature but, as described before, every such calculation assumes that the mobile device is covered by all the access points. This assumption does not hold in an environment full of low-energy IoT devices, whose coverage is not as wide as that of high-power wireless devices. In this section, we combine the CRLB with PoC to fit it to the IoT environment so that more realistic conclusions can be drawn.

Let’s assume that a mobile device is moving in an environment where N access points are deployed. These fixed access points can be any kind of IoT device with different transmitted powers, frequencies, and radio propagation characteristics. The derivation of the CRLB then starts from the empirical pathloss model:

$$\begin{aligned} Lp_i=L_0+10\alpha _i log_{10}(r_i)+X(\sigma ) \end{aligned}$$
(36)

where the distance from the mobile device to the \(i\hbox {th}\) access point is \(r_i=\sqrt{(x-x_i)^2+(y-y_i)^2}\). In this case, the parameter to be estimated is the coordinate of the mobile device’s location, denoted \(\theta =[x \quad y]^T\), and \((x_i, y_i)\) is the coordinate of the ith access point.

Then the probability density function of the observation (pathloss) given the parameter \(\theta\) can be written as:

$$\begin{aligned} p_i(Lp_i/\theta )=\frac{1}{\sqrt{2\pi }\sigma } exp\left( -\frac{1}{2\sigma ^2}[Lp_i-L_0-10\alpha _i log_{10}r_i]^2\right) \end{aligned}$$
(37)

We also have the probability of coverage as \(p_i(\theta )\). Then

$$\begin{aligned} p_i(Lp_i;\theta )=p_i(Lp_i/\theta )\cdot p_i(\theta ) \end{aligned}$$
(38)

The fading terms are additive and the observations can be considered independent of one another; thus the joint PDF of the observations can be derived as:

$$\begin{aligned} p(\varvec{Lp};\theta )= & {} \prod \limits _{i=1}^{N} p_i(Lp_i;\theta )\nonumber \\= & {} \prod \limits _{i=1}^{N} p_i(Lp_i/\theta )\cdot p_i(\theta ) \end{aligned}$$
(39)

From the theory of CRLB we also know that the bound is the inverse of the Fisher Information Matrix, which is denoted as \(\varvec{I}(\theta )\) and can be calculated as follows:

$$\begin{aligned} \varvec{I}(\theta )=\left[ \begin{array}{cc} -E\left[ \frac{\partial ^2 ln (p(\varvec{Lp};\theta ))}{\partial x^2}\right] &{} -E\left[ \frac{\partial ^2 ln (p(\varvec{Lp};\theta ))}{\partial x \partial y}\right] \\ -E\left[ \frac{\partial ^2 ln (p(\varvec{Lp};\theta ))}{\partial x \partial y}\right] &{} -E\left[ \frac{\partial ^2 ln (p(\varvec{Lp};\theta ))}{\partial y^2}\right] \\ \end{array}\right] \end{aligned}$$

Since

$$\begin{aligned} ln(p(\varvec{Lp};\theta ))= & {} ln\left( \prod \limits _{i=1}^{N} p_i(Lp_i/\theta )\cdot p_i(\theta )\right) \\= & {} \sum \limits _{i=1}^{N} [ln(p_i(Lp_i/\theta ))+ln(p_i(\theta ))] \end{aligned}$$

We can rewrite the Fisher matrix as:

$$\begin{aligned} \varvec{I}(\theta )=\varvec{I}_1 (\theta )+\varvec{I}_2 (\theta ) \end{aligned}$$
(40)

where

$$\begin{aligned} \varvec{I}_1(\theta )=\sum \limits _{i=1}^{N}\left[ \begin{array}{cc} -E\left[ \frac{\partial ^2 ln (p_i(Lp_i/\theta ))}{\partial x^2}\right] &{} -E\left[ \frac{\partial ^2 ln (p_i(Lp_i/\theta ))}{\partial x \partial y}\right] \\ -E\left[ \frac{\partial ^2 ln (p_i(Lp_i/\theta ))}{\partial x \partial y}\right] &{} -E\left[ \frac{\partial ^2 ln (p_i(Lp_i/\theta ))}{\partial y^2}\right] \\ \end{array}\right] \end{aligned}$$

and

$$\begin{aligned} \varvec{I}_2(\theta )=\sum \limits _{i=1}^{N} \left[ \begin{array}{cc} -E\left[ \frac{\partial ^2 ln (p_i(\theta ))}{\partial x^2}\right] &{} -E\left[ \frac{\partial ^2 ln (p_i(\theta ))}{\partial x \partial y}\right] \\ -E\left[ \frac{\partial ^2 ln (p_i(\theta ))}{\partial x \partial y}\right] &{} -E\left[ \frac{\partial ^2 ln (p_i(\theta ))}{\partial y^2}\right] \\ \end{array}\right] \end{aligned}$$

So that

$$\begin{aligned} cov(dr)=\varvec{I}(\theta )^{-1}=\left[ \begin{array}{cc} \sigma _x^2 &{} \sigma _{xy}^2\\ \sigma _{xy}^2 &{} \sigma _y^2 \end{array}\right] \end{aligned}$$
(41)

CRLB is the trace of the covariance matrix, which means that \(\sigma _r^2=\sigma _x^2+\sigma _y^2\). In this way, the total CRLB can be calculated no matter how many APs are covered.
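To make the last step concrete, here is a minimal sketch of the case \(p_i(\theta )=1\), in which only \(\varvec{I}_1(\theta )\) contributes. It relies on the standard result for a Gaussian observation whose mean depends on the parameter, \(\varvec{I}_1=\frac{1}{\sigma ^2}\sum _i \nabla \mu _i \nabla \mu _i^T\) with \(\mu _i=L_0+10\alpha _i log_{10}r_i\). The AP coordinates, parameter values, and function names are illustrative assumptions, and \(\varvec{I}_2(\theta )\) is deliberately left out.

```python
import math


def fisher_i1(x, y, aps, alphas, sigma_db):
    """Return (Ixx, Ixy, Iyy) of I_1 at (x, y) for the pathloss model of Eq. (36).

    aps      -- list of (x_i, y_i) AP coordinates in meters (illustrative)
    alphas   -- distance-power gradient alpha_i of each AP
    sigma_db -- shadow-fading standard deviation in dB
    """
    k = 10.0 / math.log(10.0)  # derivative of 10*log10(r) with respect to r is k / r
    ixx = ixy = iyy = 0.0
    for (xi, yi), alpha in zip(aps, alphas):
        dx, dy = x - xi, y - yi
        r2 = dx * dx + dy * dy
        if r2 == 0.0:
            raise ValueError("device coincides with an AP; the model is undefined at r = 0")
        gx = k * alpha * dx / r2  # partial derivative of the mean pathloss w.r.t. x
        gy = k * alpha * dy / r2  # partial derivative of the mean pathloss w.r.t. y
        ixx += gx * gx
        ixy += gx * gy
        iyy += gy * gy
    s2 = sigma_db ** 2
    return ixx / s2, ixy / s2, iyy / s2


def crlb(ixx, ixy, iyy):
    """Trace of the inverse of a symmetric 2x2 Fisher matrix: sigma_x^2 + sigma_y^2."""
    det = ixx * iyy - ixy * ixy
    if det <= 1e-12 * ixx * iyy:
        raise ValueError("Fisher matrix is singular or nearly singular")
    return (ixx + iyy) / det


if __name__ == "__main__":
    aps = [(0.0, 0.0), (23.0, 0.0), (0.0, 16.0)]  # illustrative AP positions
    info = fisher_i1(10.0, 8.0, aps, [2.0, 2.0, 2.0], 5.0)
    print("CRLB (m^2):", crlb(*info), "RMSE bound (m):", math.sqrt(crlb(*info)))
```

The singularity check reflects the degenerate geometry in which the device and all selected APs are collinear, so that no information is available in one direction.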

However, in the traditional derivation of CRLB, \(p_i(\theta )\) is always taken to be 1, which means that \(\varvec{I}_2(\theta )\) is zero and the key parameters that affect system performance are the deployment of access points and the pathloss model. In fact, when devices with various transmitted powers are deployed, they will not perform the same way even if they share the same propagation characteristics.

There are two aspects that deserve extra attention. First, from Eq. (40), increasing the number of access points adds more information to the Fisher matrix, so that when we take its inverse, the values in the covariance matrix become smaller, which indicates that the estimates deviate less. This makes sense: the more information is provided, the better the localization performance. In practice, however, once the number of access points reaches a certain value, densely deployed sensors can no longer improve the localization precision. They interfere with one another and affect the entire system’s performance. So it is crucial to explore how the number of access points affects the CRLB and what the best strategy is for deploying IoT devices.

Another issue concerns the other part of the Fisher matrix, namely how the PoC affects the performance of the entire system. From the same equation we can see that, when \(\varvec{I}_2(\theta )\) is included in the calculation of CRLB, it also leads to a decrease of the values in the covariance matrix. This is because, when PoC is considered, the Fisher matrix is weighted so that the information from APs with higher PoC counts more than that from APs with lower PoC, which yields more accurate location estimation. This is examined further by simulation in the following section.

9 Results and Analysis

In this section, we present the results and our analysis, from which conclusions can be drawn.

9.1 Contours of CRLB in Three Scenarios

We illustrate the contours of CRLB for the three scenarios in Figs. 2, 3 and 4. The figures clearly show the characteristics of the error performance.

Note that although the space is 30 m × 30 m, we do not include the observations on the edges. Consequently, the contour shows a 29.9 m × 29.9 m space instead of a 30 m × 30 m one.

9.2 CDFs of Different Scenarios

To explore the statistical characteristics of the performance, we illustrate the CDFs of the different scenarios under both the RSS-only and barometer-assisted CRLB, as shown in Figs. 5, 6, 7.

Fig. 4 Contour of CRLB in scenario 1

Fig. 5 Contour of CRLB in scenario 2

Fig. 6 Contour of CRLB in scenario 3

Fig. 7 CDFs for three scenarios

From the figure, we can see that the location error decreases when more information from other floors is applied. Moreover, if the barometer assists the calculation of the CRLB, the performance is greatly improved (Figs. 8, 9, 10, 11, 12 and 13).
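The CDF curves discussed here are empirical distributions of the CRLB values sampled over the floor. A minimal sketch of how such a curve can be built from a list of per-sample CRLB values is given below; the function names and the sample numbers are hypothetical and serve only as an illustration.

```python
def ecdf(values):
    """Return the empirical CDF as a list of (value, cumulative fraction) pairs."""
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("at least one sample is required")
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


def fraction_at_or_below(values, threshold):
    """Fraction of samples whose value does not exceed the threshold."""
    if not values:
        raise ValueError("at least one sample is required")
    return sum(1 for v in values if v <= threshold) / len(values)


if __name__ == "__main__":
    crlb_samples_m = [1.8, 0.9, 2.4, 1.2, 3.1]  # made-up values for illustration
    for value, prob in ecdf(crlb_samples_m):
        print(value, prob)
    print(fraction_at_or_below(crlb_samples_m, 2.0))
```

A curve that rises earlier (further to the left) corresponds to a scenario with smaller location error.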

9.3 CRLB and PCRLB Comparison

The maximum, minimum, and mean CRLB values of the three scenarios using these two methods are listed in Table 1. From the table, we find that the barometer-assisted method achieves improvements of 41.67%, 29.29%, and 19.20% under Scenarios 1, 2, and 3, respectively.

Table 1 Barometer-assisted method versus RSS-only method in error performance

Fig. 8 Contour for scenario 1

Fig. 9 Contour for scenario 2

Fig. 10 Contour for scenario 3

Fig. 11 CDFs for 3 scenarios in 2D

Fig. 12 Average CRLB in different transmitted power (linear)

Fig. 13 Average CRLB in different transmitted power

9.4 Scenarios Design

To compare different geolocation systems, we designed three scenarios that can be used to check the effect of multiple parameters. The detailed scenario descriptions are as follows:

  • Scenario 1: 3 High-Power (HP) APs are placed on the ceiling and 27 Low-Power (LP) APs are placed in a grid on the 3rd floor of Atwater Kent Laboratory, as shown in Fig. 14. The red dots represent the HP Wi-Fi routers while the black ones represent the LP RF devices, such as iBeacons or smart bumps. Only the \(N\) (3 to 8) APs with the smallest path loss are selected for localization instead of all the APs, since we find that the remaining APs hardly contribute to the localization precision;

  • Scenario 2: The deployment of the APs is the same as in Scenario 1. The transmitted power is set to different levels so that its influence on the localization precision can be investigated;

  • Scenario 3: The deployment of the HP APs stays the same while the LP APs are deployed randomly on the floor so that the effect of different deployments can be explored.
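The AP selection rule in Scenario 1 (keep only the N APs with the smallest path loss) can be sketched as below. The function name and the path-loss values are hypothetical, and ties are resolved by AP index, which is an assumption the text does not address.

```python
def select_aps(path_losses_db, n):
    """Return the indices of the n APs with the smallest path loss, smallest first."""
    if not 0 < n <= len(path_losses_db):
        raise ValueError("n must be between 1 and the number of APs")
    # sorted() is stable, so APs with equal path loss keep their index order.
    order = sorted(range(len(path_losses_db)), key=lambda i: path_losses_db[i])
    return order[:n]


if __name__ == "__main__":
    measured_db = [70.0, 55.0, 90.0, 60.0, 65.0]  # made-up path losses, one per AP
    print(select_aps(measured_db, 3))
```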

Fig. 14 Deployment of APs in AK building

The HP devices are set to a transmitted power of 20 dBm and the LP devices to between −10 and 20 dBm in different scenarios. The channel model used is the distance-partitioned model with an operating frequency of 2.4 GHz. The 3rd floor has an area of 23 m × 16 m and a height of 5 m. Samples are taken every 0.1 m (the 4 edges are not included), so we have 229 × 159 = 36411 samples in total for every scenario.
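The sample count can be checked with a short sketch that enumerates the interior grid points of the floor. The function name is hypothetical; the dimensions and step come from the text above.

```python
def interior_grid(width_m, depth_m, step_m):
    """Return the grid points strictly inside a width_m x depth_m rectangle."""
    nx = round(width_m / step_m) - 1  # interior points per row (edges excluded)
    ny = round(depth_m / step_m) - 1  # interior points per column (edges excluded)
    return [(i * step_m, j * step_m)
            for i in range(1, nx + 1)
            for j in range(1, ny + 1)]


if __name__ == "__main__":
    points = interior_grid(23.0, 16.0, 0.1)
    print(len(points))  # 229 * 159 samples
```

Excluding the edges also avoids evaluating the bound on the boundary of the floor plan, which is consistent with the stated total.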

9.5 CRLB with Different LP Device Number

The CDFs of CRLB with different numbers of selected APs are shown in Fig. 15. We can see that, as the number increases, the range becomes narrower, which indicates higher localization accuracy. One point deserves attention, however. As the number of selected APs increases, the computational complexity also increases sharply: as described before, the complexity is proportional to the factorial of the number N. Therefore, it is crucial to choose a suitable number so that the system achieves high efficiency, with short processing time and reasonable localization precision. Table 2 lists the mean CRLB for different numbers of APs against computational complexity. We can easily see that picking 5 or 6 APs is a wise choice, since the complexity becomes too large afterwards.
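To give a feel for the growth rate, the sketch below tabulates the factorial factor for N = 3 to 8 relative to N = 3. It shows only the N! term named in the text; the constant of proportionality and the values of Table 2 are not reproduced, and the function name is hypothetical.

```python
import math


def relative_complexity(n_values, baseline=3):
    """Return {N: N! / baseline!} as a rough indicator of relative cost."""
    base = math.factorial(baseline)
    return {n: math.factorial(n) / base for n in n_values}


if __name__ == "__main__":
    for n, factor in relative_complexity(range(3, 9)).items():
        print(n, factor)
```

Going from 6 to 8 selected APs multiplies the factor by 56, which illustrates why the cost becomes hard to justify beyond 5 or 6 APs.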

Table 2 CRLB versus computational complexity

It is also worth explaining why we do not increase the number of HP devices. In the IoT environment, WiFi routers are placed every several meters. So in our scenarios, three HP devices are enough to cover the entire area, which is exactly how our department deploys its routers. On the other hand, since LP devices cover comparatively smaller areas and are widely deployed, we investigate these devices further and see how the localization precision changes with the number of LP devices.

9.6 CRLB in Different Transmitted Power

We illustrate the CDFs of CRLB for Scenario 2 in Fig. 16. The figure clearly shows the characteristics of the error performance under different sets of transmitted power. Note that the CRLB increases as the power level increases. This is because, when the transmitted power becomes greater, more error from longer distances is counted, so the localization precision is worse than at lower power levels. Moreover, we can see that when the standard deviation \(\sigma\) increases, the CRLB also increases, indicating worse localization accuracy.

9.7 CRLB with Different LP Device Deployment

We also investigate how different deployment schemes affect the CRLB. As demonstrated in Fig. 17, random deployment degrades the system’s localization performance. It is still important to quantify how much worse random deployment is than grid deployment, since most of the time the LP devices are not fixed: their locations change and they can be anywhere, generating information in the IoT environment.

Fig. 15 CRLB with different number of selected APs

Fig. 16 Average CRLB in different transmitted power

Fig. 17 CRLB with grid versus random deployment

10 Conclusion

In this paper, we present an approach to improve the performance of a 3D RSS-based geolocation system by using the barometer in smart devices. A modified 3D path loss model is presented which takes the penalty of ceilings into consideration. Based on the pressure-height physical law, we characterize the vertical estimation and fit it to a Gaussian distribution. The calculation of the 3D CRLB is provided as an extension of the original 2D CRLB for performance evaluation. Moreover, we design 3 scenarios of different floors with various AP deployment strategies and conduct a series of experiments for comparison. The improvement is illustrated with contours and CDFs of the scenarios and quantified in a comparison table.

Future work includes expanding our system to buildings with more complicated architecture, which will bring the research closer to the real world. Fully combining the barometer and RSS signals should also be explored, so that smaller errors can be reached. It is also feasible to integrate our technique with other sensors in smartphones to see whether more improvement can be reached.