1 Introduction

The primary objective of an Indoor Localization System (ILS) is to estimate the location of a person or object within a multi-storied building. Indoor localization has a wide range of applications, including rescue operations, warehouse monitoring, asset tracking, indoor positioning, autonomous robot navigation, games, and many more. Hence, this domain has drawn the attention of many researchers over the last two decades.

GPS, the widely accepted and popular outdoor localization system, does not work properly in indoor environments because its signals are attenuated by obstacles such as walls, furniture, and human beings (Yassin et al. 2016). Thus, a number of commercial systems have started to emerge with the escalating demand for indoor localization. In late 2011, Google released Google Maps 6.0, which made indoor localization and navigation available at some shopping malls and airports in countries including the US and Japan. However, many building owners do not want to share their indoor floor plans publicly for privacy reasons. Some indoor navigation applications, such as the Tokyo station underground area navigation app, have emerged to help passengers navigate the large indoor spaces of major railway stations. These applications are not fine-grained and rely primarily on a manually created building map showing all important places of interest (for example, fare collection counters, ticket vending machines, etc.), which hinders their applicability to other stations.

Fig. 1 Overall categorization of existing techniques of Indoor Localization System

Over the years, several indoor positioning systems based on different technologies have been explored. Based on the need for dedicated infrastructure, these systems are categorized into two groups, as depicted in Fig. 1. Each group is further classified into two categories based on whether a single technology or multiple technologies are used. Early systems provided considerable accuracy with pre-configured dedicated infrastructure including infrared transmitters/receivers, ultrasound receivers, specialized hardware to emit RF/ultrasound beacons, etc. Besides, range-based localization techniques such as Angle of Arrival (AoA), Time of Arrival (ToA), Time Difference of Arrival (TDoA), Time of Flight (ToF), Return Time of Flight (RToF), and Phase of Arrival (PoA) have been used by many early systems. These techniques evaluate the distances from at least three transmitters and incorporate geometrical models for estimating locations. However, such systems require sophisticated antennas to provide considerable accuracy. To achieve wide-scale success, an ILS should be made ubiquitous by utilizing the common devices (such as smartphones) carried by people and the common infrastructure provided by public offices, airports, shopping malls, and so on.

Nowadays, most buildings (like universities, hospitals, and other public infrastructures) are covered by Wireless LAN-based network infrastructure. Thus, indoor localization based on WiFi fingerprinting can be made ubiquitous as no additional hardware is required, and considerable research effort has been devoted to this approach (Roy and Chowdhury 2018b, 2021a; Zhang et al. 2017b). This approach uses the Received Signal Strength (RSS) of WiFi Access Points (APs) to predict an unknown location. All smartphones and tablets available in the market have built-in WiFi chips to capture RSS values. A data collection application gathers RSS fingerprints of all candidate locations and sends these data to a remote server for storage and analysis. A test application then captures the RSS at a location (unknown to the user) and sends it to the server, which predicts the location by analyzing the stored fingerprint data. The accuracy of such methods mainly depends on the extent of the fingerprinting effort, as RSS samples are influenced by indoor ambient conditions like the opening/closing of doors/windows, the presence/absence of crowds or other interfering devices, and so on (Roy and Chowdhury 2021b).

To improve the accuracy of such systems, Bay et al. (2015) and Ruan et al. (2014) have proposed solutions based on Ultra Wide Band (UWB) radios for computing RSS variation on narrow channels. Other systems using Radio-Frequency Identification (RFID) (Yang et al. 2016), ZigBee (Gao et al. 2013), Bluetooth Low Energy (BLE) (Cooper et al. 2016), Visible Light Communication (VLC) (Liu et al. 2008), and wearable sensing (Ranjan and Whitehouse 2015) also require additional pre-installed hardware for localization, which makes such systems hard to adapt to large-scale indoor environments. Video or imaging cameras (Lu et al. 2016) can also be used for localization purposes. However, these techniques require adequate lighting, a direct line of sight, energy resources, and so on, which increases the installation complexity and cost. More importantly, user privacy may be compromised. The ubiquity of such a system is undermined if different areas of a building are covered by different technologies such as RFID, BLE, and UWB instead of the same technology in every location. Indoor navigation can be used conveniently by all citizens, including the visually impaired, if these types of applications depend on a ubiquitous technology.

In this regard, researchers have to focus on the reasons behind the vulnerability of the RSS coming from an AP and propose remedies without introducing any dedicated infrastructure such as a tracker device or a BLE device to send beacons. In fact, several research challenges emerge during implementation. First of all, to work with RSS fingerprints, signal behavior should be considered across three different domains, namely temporal, ambience, and device heterogeneity. A few works can be found on device heterogeneity, but the localization problem subject to all three domains, particularly for crowded public infrastructure, remains largely unexplored. Two important challenges arising out of it are detecting stable APs for effective localization and ensuring sustainable localization and navigation accuracy even when training and testing conditions differ.

To the best of our knowledge, a ubiquitous ILS with considerable accuracy is still beyond our reach. Even state-of-the-art survey articles concentrate mostly on the need for integrating different technologies and the cost-effectiveness of a solution. However, the research challenges arising for a ubiquitous ILS remain mostly unexplored.

Therefore, the main contribution of this paper is to thoroughly discuss the research challenges and probable solutions associated with implementing a ubiquitous ILS based on WiFi signals for smartphone users. A survey of existing works on the different phases of a WiFi-based indoor localization framework is also presented. We have shown the effect of the discussed challenges using an RSS fingerprint dataset.

This paper is further organized as follows. Existing surveys that highlight some challenges are reviewed in Sect. 2. Section 3 defines the problem to be addressed followed by a brief description and state-of-the-art works on the different phases of ILS in Sect. 4. Associated research challenges for developing a ubiquitous ILS and future directions are thoroughly discussed in Sect. 5. Section 6 highlights some open research issues in ubiquitous ILS. Finally, Sect. 7 concludes the paper.

2 Existing surveys on indoor localization systems

A timeline of existing survey articles of this domain is presented in Table 1. These survey articles (Yassin et al. 2016; Stojanović and Stojanović 2014; He and Chan 2016; Xiao et al. 2016; Roy and Chowdhury 2021b) have mainly focused on the indoor localization problem, different technologies, approaches, existing systems, applications, some common challenges, etc. Comparisons among the various methods and technologies according to metrics such as accuracy, privacy, and scalability have also been discussed. It has been highlighted that the integration of several localization systems as well as technologies has improved the quality of indoor location-based services. Moreover, choosing a suitable positioning approach with significant accuracy and granularity and installing costly equipment for combining various non-radio technologies (IMU, visual sensors, and so on) have also enhanced the localization accuracy (Stojanović and Stojanović 2014; He and Chan 2016; Xiao et al. 2016). Therefore, a cost-effective indoor positioning system developed using easily available technology is yet to be explored, as mentioned by Yassin et al. (2016).

Table 1 A brief comparison among the existing surveys of indoor localization

Liu et al. (2007) have presented an in-depth overview of the existing indoor positioning systems and state-of-the-art localization schemes like triangulation, scene analysis, and proximity. Fischer and Gellersen (2010) have discussed the indoor localization techniques that are useful for assisting emergency responders in challenging ambience such as darkness, smoke, fire outbreaks, power outages, and so on. Besides, Yang et al. (2015) have presented their views on the importance of mobility information, which can benefit smartphone-based ILS alongside wireless signals. Built-in sensors of the smartphone are used to identify mobility information such as step length, angular velocity, absolute direction, etc. Additionally, a survey on calibration-free indoor positioning systems has been introduced by Hossain and Soh (2015). They have also discussed the associated challenges of traditional fingerprinting, like time and manpower, unforeseen environmental changes, and device heterogeneity, and emphasized calibration-free performance metrics such as map requirement, need for additional sensors, and handling of device heterogeneity. Davidson and Piché (2017) have highlighted the lack of any standard procedure for evaluating the localization accuracy of various existing algorithms. As a possible solution, they have suggested designing a public benchmark dataset for evaluating state-of-the-art indoor positioning algorithms.

In this literature, certain research works rely on mathematical models to track a user, but fingerprinting-based techniques can better cope with changing ambient conditions. Though there are fingerprinting-based localization approaches that incorporate different technologies to improve localization accuracy, system ubiquity is not taken into account by most of these works. More specifically, a few survey articles mention the need for integrating different technologies and cost-effectiveness, but the research challenges arising for a ubiquitous ILS still remain mostly unexplored. This motivated us to identify and elaborate on the emerging issues of such a ubiquitous ILS from the perspective of practical implementation. Accordingly, the problem statement of indoor localization using WiFi signals is formulated first in the next section.

3 Problem statement

The two key parameters of a WiFi fingerprinting-based ubiquitous ILS are the Received Signal Strength Indicator (RSSI) and Channel State Information (CSI).

RSSI is a measurement of the signal power received at the receiver end. This signal power decreases with distance as the electromagnetic wave propagates through space, which is known as path loss. The path loss is a function of the distance between the transmitter and receiver, and this relationship between RSSI and distance can be represented using the Log Distance Path Loss Model as mentioned by Li et al. (2019), Zafari et al. (2019), and Wu et al. (2018). Generally, this model is expressed by the following equation.

$$\begin{aligned} {PL(d)=PL(d_0) + 10\lambda \log _{10}\frac{d}{d_0} + \xi _\sigma } \end{aligned}$$
(1)

where \(PL(d)\) is the path loss in decibels at distance d, \(\lambda\) is the path loss exponent, d is the distance between the transmitter and receiver, \(d_0\) is a reference distance, \(PL(d_0)\) is the path loss at the reference distance \(d_0\), and \(\xi _\sigma\) denotes a Gaussian random variable with standard deviation \(\sigma\).
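As a rough illustration of Eq. (1), the following Python sketch evaluates the log-distance path loss with an optional shadowing term. The function name, the default values (40 dB at a 1 m reference distance, exponent 3), and the zero-mean assumption for \(\xi _\sigma\) are illustrative assumptions, not values taken from the cited works.

```python
import math
import random


def path_loss_db(d, pl_d0=40.0, path_loss_exp=3.0, d0=1.0, sigma=0.0, rng=None):
    """Eq. (1): PL(d) = PL(d0) + 10 * lambda * log10(d / d0) + xi_sigma, in dB.

    pl_d0, path_loss_exp and sigma are illustrative defaults (assumptions).
    """
    if d <= 0 or d0 <= 0:
        raise ValueError("distances must be positive")
    rng = rng or random
    # xi_sigma: Gaussian shadowing term, assumed zero-mean with std. dev. sigma.
    shadowing = rng.gauss(0.0, sigma) if sigma > 0 else 0.0
    return pl_d0 + 10.0 * path_loss_exp * math.log10(d / d0) + shadowing
```

With `sigma=0` the function is deterministic, which makes it convenient for checking the distance term alone; a seeded `random.Random` instance can be passed as `rng` to obtain reproducible shadowing.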

The RSSI is defined as a ratio of the received power to a reference power as follows.

$$\begin{aligned} {RSSI= -(10\lambda \log _{10}d + A)} \end{aligned}$$
(2)

where \(\lambda\) is the propagation exponent (the path loss exponent of Eq. 1), d is the distance between the transmitter and receiver in meters, and A is the magnitude of the signal strength received at a distance of 1 m, so that \(RSSI = -A\) at \(d = 1\) m.
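Equation (2) can also be inverted to obtain a coarse range estimate from a measured RSSI. The sketch below shows both directions; the function names and the default values of A and \(\lambda\) are hypothetical and would have to be calibrated for a real deployment.

```python
import math


def rssi_from_distance(d, a=45.0, prop_exp=3.0):
    """Eq. (2): RSSI = -(10 * lambda * log10(d) + A), with d in metres.

    a and prop_exp are illustrative values (assumptions), not calibrated ones.
    """
    if d <= 0:
        raise ValueError("distance must be positive")
    return -(10.0 * prop_exp * math.log10(d) + a)


def distance_from_rssi(rssi, a=45.0, prop_exp=3.0):
    """Invert Eq. (2): d = 10 ** ((-RSSI - A) / (10 * lambda))."""
    return 10.0 ** ((-rssi - a) / (10.0 * prop_exp))
```

Because the relationship is logarithmic, a fluctuation of a few dB in RSSI translates into a large change in the estimated distance at long range, which is one reason why raw RSSI ranging is rarely used on its own.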

In addition, CSI is another aspect of wireless signal propagation. CSI denotes the characteristics of a communication channel that shows how the transmitted signal propagates through the communication channel between transmitters and receivers. It represents the combined effect of wireless communication, such as fading, scattering, shadowing, multipath, and power decay with distance. CSI is calculated at the receiver end as follows.

$$\begin{aligned} {\vec{Y}} = {\vec{H}}{\vec {X}} + {\vec{N}} \end{aligned}$$
(3)

where \({\vec{X}}\) and \({\vec{Y}}\) represent the transmitted signal vector and received signal vector, respectively, \({\vec{N}}\) denotes additive white Gaussian noise, and \({\vec{H}}\) is the channel frequency response, which is referred to as CSI. So, CSI is obtained from \({\vec{X}}\) and \({\vec{Y}}\) at the receiver end.
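A minimal sketch of how CSI could be estimated from Eq. (3) is given below. It assumes that the channel acts independently on each subcarrier (so that \({\vec{H}}\) reduces to one complex gain per subcarrier), that the transmitted symbols are known pilots, and that the noise term is ignored. These are simplifying assumptions for illustration only, and the function names are hypothetical.

```python
import cmath


def estimate_csi(tx_pilots, rx_symbols):
    """Per-subcarrier estimate H_k = Y_k / X_k from Eq. (3), ignoring noise N.

    tx_pilots and rx_symbols are equal-length sequences of complex numbers.
    """
    if len(tx_pilots) != len(rx_symbols):
        raise ValueError("pilot and received vectors must have equal length")
    return [y / x for x, y in zip(tx_pilots, rx_symbols)]


def amplitude_and_phase(csi):
    """Split each complex channel gain into (amplitude, phase in radians)."""
    return [(abs(h), cmath.phase(h)) for h in csi]
```

The amplitude and phase pairs returned by `amplitude_and_phase` correspond to the per-subcarrier information that makes CSI finer-grained than a single RSSI value per AP.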

Generally, in WiFi fingerprint-based localization systems, the train data are collected from every accessible Location Point (LP). Specifically, in the train dataset (TR), RSS fingerprints are received from a total of n APs. The ith fingerprint, collected from a location \(l_z\), is denoted as \(tr_i = \{rss_{i1}, rss_{i2}, \ldots , rss_{in}:l_z\}\). The symbol \(rss_{ij}\) denotes the RSS value of the jth AP present in the ith fingerprint, where \(1\le i \le m\) for m training fingerprints and \(1\le j \le n\). All location points and the corresponding train dataset are represented as \(L = \{l_1, l_2, \ldots , l_{g}\}^T\) and \(TR = \{tr_1, tr_2, \ldots , tr_{m}\}^T\), respectively, where \(l_z=\{x_z,y_z\}\) is the two-dimensional coordinate of a location in the experimental region. Similarly, the test dataset is represented as \(TE = \{te_1, te_2, \ldots , te_{m^{\prime }}\}^T\), where \(m^{\prime }\) denotes the number of test fingerprints and \(|TR|>|TE|\). Each test fingerprint \(te_i\) is represented as \(te_i = \{rss^{\prime }_{i1}, rss^{\prime }_{i2},\ldots , rss^{\prime }_{in}\}\).

In this regard, given a labeled train set, TR, and a test set, TE, the indoor localization problem is to predict the unknown location corresponding to each test fingerprint, \(te_i\in TE\).
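To make the notation concrete, the sketch below represents TR as a list of labeled fingerprints and predicts the location of a test fingerprint with a plain nearest-neighbor search in signal space. The type aliases and function names are illustrative assumptions, and the nearest-neighbor rule is only a simple baseline, not the method of any particular cited work.

```python
import math
from typing import List, Sequence, Tuple

Fingerprint = Sequence[float]                        # rss_i1 .. rss_in in dBm
Location = Tuple[float, float]                       # l_z = (x_z, y_z)
LabeledFingerprint = Tuple[Fingerprint, Location]    # tr_i = {rss_i1..rss_in : l_z}


def signal_distance(a: Fingerprint, b: Fingerprint) -> float:
    """Euclidean distance between two fingerprints over the same n APs."""
    if len(a) != len(b):
        raise ValueError("fingerprints must cover the same APs")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def predict_location(train: List[LabeledFingerprint], te: Fingerprint) -> Location:
    """Return the label of the training fingerprint closest to te."""
    _, best_location = min(train, key=lambda tr: signal_distance(tr[0], te))
    return best_location
```

For example, with two training fingerprints labeled (0, 0) and (5, 0), a test fingerprint whose RSS values resemble the first is assigned the location (0, 0).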

The following section describes the different steps of ILS.

4 Different phases of indoor localization system

This section discusses the general phases of an ILS as depicted in Fig. 2 and highlights the state-of-the-art works in each of these phases.

Fig. 2 Different phases of indoor localization system

4.1 Data collection

The two different modes of RSS fingerprint collection are described below.

4.1.1 Dedicated user-based data collection

Using smart devices, users who are willing to participate in the data collection task acquire the RSS of available APs at each location point or reference point. The location points are chosen according to the ground truth defined for the study. For instance, data may be collected only from meaningful location points of the experimental region. Torres-Sospedra et al. (2014) have collected RSS data from two locations per room, i.e., one position inside the room and one position just outside the room, for room-level localization. In order to provide fine-grained localization, Ghosh et al. (2016) have divided their experimental region into \(2\times 2\) sq. meter grids, and RSS fingerprints have been collected by the user from each grid. Generally, in this data collection mode, the collected fingerprints are tagged with proper labels, i.e., locations.

4.1.2 Crowdsourcing-based data collection

In this data collection process, users carrying smart devices walk around the building as usual for their daily activities. Their smart devices record RSS fingerprints and other relevant information from various positions along their movement path, together with the walking distances traveled (Wu et al. 2015; Lohan et al. 2017). Importantly, it is difficult to label crowdsourced data properly because, in many cases, the users are not even aware of their involvement in the data collection task. Hence, these fingerprints are only coarsely labeled, based on the feedback of the crowd. Wu et al. (2015) have collected a large volume of crowdsourced data and predicted the user’s current location based on the number of footsteps from a previously known location. These footsteps are obtained from the accelerometer sensor of smartphones. However, the number of footsteps, as well as the sensor readings, may vary from user to user. Lohan et al. (2017) have proposed another crowdsourcing approach where the current position of a user has been taken as manual input from the user. Thus, the labeling of a crowdsourced dataset may be incorrect as it depends on crowd behavior.

4.2 Data preprocessing

The raw data needs preprocessing before analysis. The WiFi data are preprocessed in the following manner.

  • Interpolating missing entries of unheard APs: Not all APs are heard at every location point, due to the limited coverage range of WiFi signals and other indoor environmental factors. These missing entries (\(rss_{i}, rss^{\prime }_{i}\)) of both TR and TE need to be interpolated before analysis. There are different ways of handling missing values, such as deleting the observation, discarding the feature, imputation with mean/median/mode, etc. Generally, RSS values lie between 0 and \(-120\) dBm. Wu et al. (2015) have assigned the signal strengths of unheard APs the value 0, whereas Cooper et al. (2016) have assigned them the minimum observed value of RSS.

  • Removal of inconsistent APs: Generally, APs of nearby buildings or even hotspots are heard during data collection. Those APs may not be available all the time and also have weak signal strength. A value close to 0 dBm indicates a strong signal, whereas a value below \(-80\) dBm indicates a weak signal (poor distance sensitivity), which may not be useful. Hence, those \(rss_{i}\) need to be discarded at the time of location prediction. Moreover, WiFi hotspots are movable and alive only for a short duration. Keeping these kinds of signal strengths for analysis introduces noise in location prediction.
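The two preprocessing steps above can be sketched as follows. Unheard APs are marked with `None`; by default they are filled with the minimum observed RSS (the choice attributed above to Cooper et al. 2016), and an AP is kept only if it is heard above the \(-80\) dBm threshold in a sufficient share of fingerprints. The function names and the `min_heard_ratio` parameter are hypothetical; only the \(-80\) dBm threshold comes from the text.

```python
def fill_unheard(fingerprints, fill_value=None):
    """Replace None (unheard AP) entries with fill_value.

    If fill_value is None, the minimum RSS observed in the dataset is used.
    """
    if fill_value is None:
        observed = [v for fp in fingerprints for v in fp if v is not None]
        if not observed:
            raise ValueError("no observed RSS values to derive a fill value")
        fill_value = min(observed)
    return [[fill_value if v is None else v for v in fp] for fp in fingerprints]


def drop_inconsistent_aps(fingerprints, weak_dbm=-80.0, min_heard_ratio=0.5):
    """Keep AP columns heard at or above weak_dbm in enough fingerprints.

    min_heard_ratio is an illustrative parameter (assumption).
    Returns (kept column indices, fingerprints restricted to those columns).
    """
    n_aps = len(fingerprints[0])
    keep = []
    for j in range(n_aps):
        strong = sum(
            1 for fp in fingerprints if fp[j] is not None and fp[j] >= weak_dbm
        )
        if strong / len(fingerprints) >= min_heard_ratio:
            keep.append(j)
    return keep, [[fp[j] for j in keep] for fp in fingerprints]
```

In this sketch the AP filter is meant to run on the raw data, before missing entries are filled, so that imputed values do not count as observations.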

4.3 Data analysis

The collected datasets are analyzed to find a meaningful distribution of RSS over different locations. Statistical approaches, as well as various machine learning algorithms, are used to discover the pattern of RSS to estimate an unknown location. In this context, a brief review of the existing literature based on statistical and machine learning approaches is presented below.

4.3.1 Statistical analysis techniques

The representative statistical approaches of indoor localization are depicted in Fig. 3. In this domain, a large number of research efforts have adopted the fingerprint approach as a basic scheme of location estimation. Over the years, fingerprinting has been applied to different technologies including WiFi, RFID, acoustic, visible light, magnetic field, and so on. However, WiFi-based fingerprinting is the most preferable one due to its ubiquity in indoor regions. In early 2000, RADAR (Bahl and Padmanabhan 2000) was proposed as an indoor position tracking system that uses the existing WLAN technology. It is one of the first significant works in this field. Along with the triangulation-based localization technique, this system incorporates the Rayleigh fading model and the Rician distribution model. Another well-known system is Horus (Youssef and Agrawala 2005), which requires less computational resources. This system has been implemented by many researchers over the last decades. To achieve better accuracy, different modules including Clustering, Discrete Space Estimator, Correlation Modelling and Handling, Continuous Space Estimator, and Small-Scale Compensator have been proposed in Horus to address different causes of wireless channel variations.

Fig. 3 Different statistical approaches for indoor localization system

Apart from these, some recent research efforts utilize inertial sensors for improving localization accuracy (Kang and Han 2015; Koroglu and Yilmaz 2017). Generally, inertial sensor-based localization systems use the Pedestrian Dead Reckoning (PDR) approach. In PDR, the distance traveled from a known or initial starting position is computed. Embedded inertial sensors of smartphones are used to track pedestrians. The displacement of a user is determined from complex human mobility information like step counting, stride length estimation, heading direction estimation, trajectory, and activity (walking, running, using stairs or an elevator, and so on). SmartPDR (Kang and Han 2015) is one of the well-known PDR approaches for smartphone users. SmartPDR uses various modules including step event detection, heading direction estimation, and step length estimation for location estimation. Generally, biases, bias instability, and thermo-mechanical noise are the most common errors that affect those inertial measurements.
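The core position update of PDR can be sketched in a few lines: each detected step advances the previous position by the estimated step length along the estimated heading. The function name and the heading convention (radians, measured clockwise from the +y axis taken as north) are assumptions made for this illustration; step detection and step length estimation themselves are outside the scope of the sketch.

```python
import math


def pdr_track(start, steps):
    """Dead-reckon a 2D path from a known start position.

    steps: iterable of (step_length_m, heading_rad) pairs, where heading is
    measured clockwise from north (+y axis). This convention is an assumption.
    Returns the list of positions, beginning with start.
    """
    x, y = start
    path = [(x, y)]
    for length, heading in steps:
        x += length * math.sin(heading)   # east component
        y += length * math.cos(heading)   # north component
        path.append((x, y))
    return path
```

Since every step adds its own length and heading error to the running position, the error of such a track grows with the distance walked, which is why PDR is usually combined with an absolute source such as WiFi fingerprints.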

In addition, CSI has been considered as a stable signature for achieving higher localization accuracy (Xiao et al. 2012; Sen et al. 2012; Zhang et al. 2020). CSI provides reliable and fine-grained information about the wireless channel. Zhang et al. (2020) have introduced the Cramer–Rao Lower Bound (CRLB) concept to analyze the localization errors of their proposed CSI-based indoor localization model. Their technique considers the relationship between the localization accuracy and the path loss, shadowing effect, multipath effect, and asynchronous effect in order to obtain the localization error due to pedestrian motion. However, these techniques require huge calibration effort for building a fingerprint database via wardriving.

Interestingly, the major drawbacks of the fingerprint approach, like the cost of time and manpower, have been reduced by the crowdsourcing technique, in which a fingerprint database is constructed while users move around normally with a smartphone. Rai et al. (2012) have developed a system called Zee to enable a zero-effort crowdsourcing approach while collecting WiFi signal strength and inertial sensor readings. In addition, Zee has incorporated the augmented particle filtering method to represent the uncertainty in location prediction. Similarly, in LiFS (Yang et al. 2012), the authors have leveraged user trajectories to construct a fingerprint database that maps between fingerprints and the floor plan. They have introduced the concept of a stress-free floor plan so that the geometrical distances between any two points in the high-dimensional space reflect the real walking distances of the users. So, they transform a floor plan into a stress-free floor plan, generate a fingerprint space, and map the fingerprints to the real locations. In the localization phase, LiFS uses the nearest neighbor algorithm to find the target location.

Besides the above-mentioned approaches, some geometrical models have been built to find an unknown location. Instead of searching previously stored fingerprint data, an unknown location is calculated using a model such as the Log-Distance Path Loss (LDPL) model (Li et al. 2018; Prasad and Bhargava 2021), the Weighted Path Loss model (Poulose and Han 2019), etc. Ji et al. (2006) and Lim et al. (2010) have deployed WiFi sniffers at known locations for collecting the RSS of various APs. Then an RSS map has been constructed using the LDPL model. Moreover, a more sophisticated model called the ray-tracing model has been used by Ji et al. (2006). However, these types of models require knowledge about the locations of APs. To address this issue, Chintalapudi et al. (2010) have used a genetic algorithm along with the LDPL model in their proposed system, EZ, for solving the RSS-distance equations. However, EZ depends on GPS information being available at some specific locations like the entrance of a room or near a window. In addition, EZ involves a complex computation process, and the physical localization method generates many misdetections in rooms. Recently, Prasad and Bhargava (2021) have designed a localization model from the RSS considering unknown transmit power and LDPL exponent.
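When the LDPL parameters are unknown, they can be estimated from calibration measurements. The sketch below fits A and \(\lambda\) of Eq. (2) by ordinary least squares on RSSI versus \(\log _{10} d\), assuming that a few measurements at known distances are available. This is a generic linear regression shown for illustration; it is not the genetic algorithm of EZ nor the model of Prasad and Bhargava (2021), and the function name is hypothetical.

```python
import math


def fit_ldpl(distances, rssi_values):
    """Least-squares fit of Eq. (2), RSSI = -(10 * lambda * log10(d) + A).

    Returns (A, lambda). Requires at least two distinct positive distances.
    """
    if len(distances) != len(rssi_values) or len(distances) < 2:
        raise ValueError("need two or more (distance, RSSI) pairs")
    u = [math.log10(d) for d in distances]
    n = len(u)
    mean_u = sum(u) / n
    mean_r = sum(rssi_values) / n
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    if sxx == 0:
        raise ValueError("need at least two distinct distances")
    sxy = sum((ui - mean_u) * (ri - mean_r) for ui, ri in zip(u, rssi_values))
    slope = sxy / sxx                      # equals -10 * lambda
    intercept = mean_r - slope * mean_u    # equals -A
    return -intercept, -slope / 10.0
```

The fitted pair can then be plugged back into Eq. (2) to convert new RSSI readings from the same AP and environment into range estimates.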

Apart from the RSS-based model, other geometric models based on the relationship between the transmitted and received signals have also been utilized in this domain. These types of systems include CUPID (Sen et al. 2013) based on Angle of Arrival (AoA), Guoguo (Liu et al. 2013) based on Time of Arrival (ToA), and Cricket (Priyantha et al. 2000) based on Time Difference of Arrival (TDoA). Wang et al. (2015) have proposed a novel WiFi-based scheme using Curve Fitting (CF) and location search techniques. The CF technique has been used to construct a fitted RSS-distance function for each AP in each subarea. The two-step online positioning phase has been designed to determine the subarea of a device and identify the appropriate location within the selected subarea using two location search algorithms. Yang et al. (2020) have proposed a novel RSS-based trilateration algorithm for indoor localization. First, they have preprocessed the raw data using a Gaussian filter to reduce the influence of measurement noise. Using a novel Least-Squares CF (LSCF) method, they have estimated the transmit power and the path loss exponent. However, these model-based approaches usually require the deployment of additional infrastructure, remodeling of some existing products, and knowledge about the hardware configurations.
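For reference, a generic trilateration step can be sketched as follows. Given at least three transmitters at known coordinates and a range estimate to each, the circle equations are linearized by subtracting the first one from the others, and the resulting linear system is solved in the least-squares sense. This is the textbook formulation, shown under the assumption that AP positions are known; it is not the specific algorithm of Yang et al. (2020), and the function name is hypothetical.

```python
def trilaterate(anchors, distances):
    """Least-squares 2D position from three or more (x, y) anchors and ranges."""
    if len(anchors) < 3 or len(anchors) != len(distances):
        raise ValueError("need three or more anchors with matching distances")
    (x1, y1), d1 = anchors[0], distances[0]
    # Subtracting circle 1 from circle i gives one linear equation per anchor:
    # 2(xi - x1) x + 2(yi - y1) y = d1^2 - di^2 + xi^2 - x1^2 + yi^2 - y1^2
    rows = []
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        a1 = 2.0 * (xi - x1)
        a2 = 2.0 * (yi - y1)
        b = d1 ** 2 - di ** 2 + xi ** 2 - x1 ** 2 + yi ** 2 - y1 ** 2
        rows.append((a1, a2, b))
    # Solve the 2x2 normal equations (A^T A) p = A^T b.
    s11 = sum(a1 * a1 for a1, _, _ in rows)
    s12 = sum(a1 * a2 for a1, a2, _ in rows)
    s22 = sum(a2 * a2 for _, a2, _ in rows)
    t1 = sum(a1 * b for a1, _, b in rows)
    t2 = sum(a2 * b for _, a2, b in rows)
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det
```

With noisy ranges the circles no longer intersect in a single point, and the least-squares solution returns the position that best satisfies the linearized equations.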

In the recent past, the Fresnel zone model has been used in this literature to determine the elliptic region of the target (Fei et al. 2020; Wu et al. 2021). There are multiple propagation paths in the indoor environment from transmitter to receiver due to NLoS propagation. Each Fresnel phase is the phase difference between the NLoS path and the LoS path. The Fresnel zone in which a target resides is then determined from calculations that use the difference of the Fresnel phases. Fei et al. (2020) have applied the Fresnel zone model to obtain the elliptic region of the target according to the phase of CSI. They have implemented their proposed model in two different multipath indoor areas to evaluate its feasibility. Wu et al. (2021) have considered two traditional sensing models, the Fresnel zone model and the CSI-ratio model, to extract some insightful properties for localization and a variety of device-free sensing applications.
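The geometry behind the Fresnel zone model can be illustrated with a short forward computation. The boundary of the nth Fresnel zone is the ellipse on which the reflected path exceeds the LoS path by n half-wavelengths, so the zone containing a point follows directly from the excess path length. The sketch assumes known transmitter, receiver, and target coordinates and a 5 GHz carrier (an illustrative value); the cited works solve the inverse problem from CSI phase, which is not reproduced here.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def fresnel_zone_index(tx, rx, target, freq_hz=5.0e9):
    """Index n of the Fresnel zone containing target.

    Zone n holds points whose excess path length (via the target, minus LoS)
    lies in ((n - 1) * wavelength / 2, n * wavelength / 2].
    freq_hz defaults to an illustrative 5 GHz carrier (assumption).
    """
    wavelength = SPEED_OF_LIGHT / freq_hz
    excess = math.dist(tx, target) + math.dist(target, rx) - math.dist(tx, rx)
    return max(1, math.ceil(2.0 * excess / wavelength))
```

Because the wavelength at WiFi frequencies is only a few centimeters, a target moving a short distance crosses several zone boundaries, which is what makes the phase difference informative for localization.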

4.3.2 Machine learning techniques

In machine learning, the collected dataset is divided into a train set, a validation set, and a test set; the train set is used to build the model and the validation set is used to validate it. The trained models are then used to predict an unknown location using the data recorded in the test phase as input. Labeling of data is an important issue here. Labeled train sets are suited for supervised learning algorithms. Specifically, proper data labels should be maintained in the labeled train set, or else significant accuracy cannot be obtained for the test data. Semi-supervised learning algorithms are chosen when a dataset contains few labeled data and a large amount of unlabeled data. Besides, unsupervised learning algorithms are applicable to unlabeled datasets.
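A minimal sketch of such a three-way split is shown below. The 60/20/20 ratios, the fixed seed, and the function name are illustrative assumptions; in practice the split is often stratified by location so that every location point appears in the train set.

```python
import random


def split_dataset(samples, train_ratio=0.6, val_ratio=0.2, seed=0):
    """Shuffle samples and split them into (train, validation, test) lists.

    The ratios and seed are illustrative defaults (assumptions).
    """
    if train_ratio <= 0 or val_ratio < 0 or train_ratio + val_ratio >= 1:
        raise ValueError("ratios must leave a non-empty share for the test set")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    n_train = round(len(shuffled) * train_ratio)
    n_val = round(len(shuffled) * val_ratio)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```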

The representative machine learning approaches of WLAN-based indoor localization that have been proposed and implemented in the last two decades are summarized in Fig. 4. Researchers have utilized well-known supervised learning approaches like k-Nearest Neighbor (kNN) (Kriz et al. 2016; Roy and Chowdhury 2018a), K* (Mascharka and Manley 2015), Support Vector Machine (SVM) (Yu et al. 2014; Rossi et al. 2013), Bayesian Network (BN) (Xu et al. 2017), Naive Bayes (Zhang et al. 2014), Decision Tree (Zia et al. 2018), Random Forest (Ramadan et al. 2018), and Neural Network (Meng et al. 2019; Roy and Chowdhury 2021a) in this domain. Kriz et al. (2016) have applied weighted kNN in their proposed technique, where BLE beacons along with WiFi RSS have been used for estimating unknown locations. Along with several other classifiers, Mascharka and Manley (2015) have used K* to analyze their collected dataset. Yu et al. (2014) have proposed an SVM-based algorithm to effectively minimize the fingerprint calibration effort while improving localization accuracy and stability. In order to develop a robust and accurate floor localization method, Xu et al. (2017) have designed a technique based on the Bayesian Network to identify the floor level of a pedestrian in a multi-storied building. Zhang et al. (2014) have investigated the performance of Bayes learning algorithms and have identified the common problem of zero probability caused by data incompleteness, which affects the localization accuracy. So, they have proposed an improved Naive Bayes algorithm to overcome this problem. Zia et al. (2018) have investigated the performance of several machine learning techniques, including Decision Tree, in order to localize an object in indoor spaces. In this literature, the Decision Tree algorithm has also been used as the building block of other techniques such as Gradient Boosted Decision Tree and Random Forest. To resolve the problem of Non-line-of-sight (NLoS) identification, Ramadan et al. (2018) have employed a Random Forest-based technique. Furthermore, aiming to design a stable fingerprint database by reducing the fluctuation of WiFi RSS, Meng et al. (2019) have considered a Radial Basis Function (RBF) Neural Network. Besides, Zhang et al. (2018) have addressed certain drawbacks of the existing PDR approaches, such as the accumulation of errors due to noisy sensors, by introducing a novel PDR-based ILS using Online Sequential Extreme Learning Machine (OS-ELM).

Fig. 4 Different machine learning approaches for indoor localization system

Apart from these, semi-supervised (Zhou et al. 2017; Wang et al. 2018) and unsupervised learning (Wu et al. 2013; Wang et al. 2019) algorithms that aim to reduce the effort of fingerprint collection have also been found in this literature. Zhou et al. (2017) have employed a semi-supervised manifold alignment approach where unlabeled samples along with timestamps have been used to construct the radio map. Besides, Wu et al. (2013) have also eliminated the effort of site surveys by designing an unsupervised approach using k-means clustering and a logical floor plan mapping method. Wang et al. (2019) have used DBSCAN (Density Based Spatial Clustering of Applications with Noise) to cluster the RSS fingerprint database and divide the entire region into several subregions based on the clustering results. They have claimed that, besides improving the localization accuracy, clustering the fingerprint database also reduces the computational complexity and location prediction time.
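To illustrate how unlabeled fingerprints can be grouped, the following sketch runs a basic k-means (Lloyd's algorithm) on RSS vectors. It uses a deterministic farthest-point initialization so that results are reproducible; this initialization, the fixed iteration count, and the function name are illustrative choices, and the sketch does not reproduce the full pipelines of Wu et al. (2013) or Wang et al. (2019).

```python
def kmeans(points, k, iterations=20):
    """Cluster equal-length RSS vectors into k groups.

    Returns (assignment, centroids), where assignment[i] is the cluster
    index of points[i]. Uses farthest-point initialization (assumption).
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # Start from the first point, then repeatedly add the point that is
    # farthest from all centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignment[i] = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids
```

Each resulting cluster can then be treated as a candidate subregion, so that an online query is compared only against the fingerprints of its nearest cluster.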

In addition, some existing works have used ensemble learning techniques like Bagging (Trawiński et al. 2013), Boosting (Cooper et al. 2016), and classifier fusion (Belmonte-Fernández et al. 2018). Those techniques have been found to outperform individual machine learning algorithms. Among them, Cooper et al. (2016) have utilized the AdaBoost technique to reduce computation complexity and improve runtime performance. They have used Bluetooth Low Energy (BLE) beacons along with WiFi signals for precise indoor positioning, and their system has achieved \(96.6\%\) room-level accuracy. Belmonte-Fernández et al. (2018) have designed an ensemble classifier based on the sum of probability estimates of six classifiers, namely Bayesian Network, kNN, Multi-Layer Perceptron (MLP), Random Forest, SVM, and Sequential Minimal Optimization (SMO), to estimate the user’s position.
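The sum-of-probabilities fusion rule mentioned above can be sketched independently of the underlying classifiers. Each classifier is assumed to output a mapping from candidate location labels to probability estimates; the fused decision is the label with the largest summed probability. The function name and the input format are assumptions made for this illustration, not the implementation of Belmonte-Fernández et al. (2018).

```python
def fuse_by_probability_sum(probability_estimates):
    """Fuse per-classifier probability estimates by summation.

    probability_estimates: list of dicts {location_label: probability},
    one dict per classifier. Returns (best_label, summed_scores).
    """
    totals = {}
    for estimate in probability_estimates:
        for label, p in estimate.items():
            totals[label] = totals.get(label, 0.0) + p
    if not totals:
        raise ValueError("no probability estimates supplied")
    return max(totals, key=totals.get), totals
```

Unlike majority voting, this rule lets one confident classifier outweigh several weakly confident ones: if three classifiers give room A the probabilities 0.6, 0.3, and 0.55, room B receives 0.4, 0.7, and 0.45, so B is selected even though two of the three classifiers individually prefer A.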

Recently, Transfer learning and Deep learning-based indoor localization schemes have become very popular among researchers. Based on the knowledge transferred from a related source environment to a related target environment, Liu et al. (2017) have proposed a Transfer learning-based framework to enhance the scalability of fingerprint-based localization by reducing the effort of fingerprint collection for the target indoor regions. Zhang et al. (2017a) have presented an indoor localization scheme using Deep Neural Network and Deep Belief Network in which WiFi RSS and magnetic field data have been fused to enhance the localization accuracy. However, the performance of their technique is closely related to the numbers of APs, location points, and labeled fingerprints in training sets. Koike-Akino et al. (2020) have used ResNet-inspired Deep Neural Network (DNN) to identify the location and orientation of a client.

The adoption of machine learning techniques in indoor positioning and their effectiveness in extracting knowledge, discovering, learning, and improving localization accuracy can be observed in the literature. These approaches are more effective than the traditional mathematical models for solving the problem of indoor localization. Due to the dynamic nature of indoor environments (such as fluctuation of RSS, changing ambience, and device heterogeneity), the localization problem becomes too complicated for handwritten rules and/or equations. Machine learning techniques can provide a scalable solution to the problem for large indoor spaces, as the classifiers can be easily tuned when the datasets are updated (Kim et al. 2018; Abbas et al. 2019; Zou et al. 2016). Moreover, unlike the traditional statistical approaches, machine learning techniques can be easily extended to provide stable performance under various ambient conditions (Abbas et al. 2019; Jiang et al. 2014; Zou et al. 2015b). The online learning ability of classifiers allows them to adapt incrementally to changing ambient scenarios, which is very difficult with traditional approaches (Zou et al. 2015b; Jiang et al. 2014). Even the knowledge gained through training for one ambient condition or one experimental region can be transferred to learn a new but related ambient condition or region, respectively, through transfer learning mechanisms (Liu et al. 2017; Zou et al. 2017). More importantly, in inertial sensing-based ILS, unsupervised learning techniques, more specifically clustering algorithms, are a very effective means of grouping similar user-contributed trajectories together (Shang et al. 2015; Lashkari et al. 2018; Luo et al. 2018). Machine learning techniques have also been integrated for step detection and inertial navigation (Pasricha et al. 2015). Furthermore, in geomagnetic sensing-based ILS, a Deep Neural Network has been incorporated to effectively classify sequences of magnetic patterns that are highly sensitive to the indoor ambience (Lee et al. 2018).

Thus, from the above discussion and to the best of our knowledge, hardly any research work focuses on the essential features of a ubiquitous solution to the problem of indoor localization. However, ubiquity is essential for the wide-scale commercial success of indoor localization-based applications. Consequently, the emerging research challenges for implementing a ubiquitous ILS and probable solutions are discussed in the following section.

5 Research challenges of a ubiquitous ILS and future directions

Common research challenges of indoor localization are: reducing the effort of fingerprint collection, fusion of proper technologies, selection of proper learning algorithm(s), elimination of unpredictable noise, improving reliability, and so on. In this regard, a comparison among the existing literature with respect to these research challenges is presented in Table 2.

Table 2 Comparison among the existing literature in terms of the various research challenges

However, in order to achieve a low-cost ubiquitous solution of indoor localization, apart from these general challenges we need to overcome the following emerging challenges.

  • Designing datasets and learning techniques to handle different contexts.

  • Identifying stable infrastructure in order to provide a ubiquitous solution.

5.1 Designing datasets and learning techniques to handle different contexts

This challenge calls for (i) designing a fine-grained dataset to analyze the RSS fingerprints collected from public infrastructure at different times in different ambient conditions using a number of devices having varying hardware configurations, and (ii) designing and testing localization techniques that work effectively even when the training and testing conditions vary.

5.1.1 Data acquisition subject to temporal, ambience and device heterogeneity for public infrastructure

A. Challenges faced:

The RSS of WiFi APs varies significantly from one device to another at any specific location point. The signal strength also varies with the time of day and with weather conditions. Moreover, signal strengths fluctuate with the indoor ambience, such as the presence or absence of human beings and other interfering devices, changes of furniture, and the presence of obstacles. Thus, indoor positioning techniques should be validated using datasets that contain RSS data of different contexts, namely temporal, ambience, and device heterogeneity. Additionally, proper location point labeling of the RSS data in the train set should be maintained for appropriate estimation of an unknown position. In this field, many researchers have proposed their techniques by explaining their experiments and providing results using their own datasets, which are not disclosed. In other research fields such as image processing, natural language processing, and bioinformatics, it is common practice to validate any newly proposed technique using the large number of datasets available in a public repository like the University of California Irvine (UCI) Machine Learning Repository. This domain has fewer publicly available datasets than those domains. Most of the available datasets have not considered the above-mentioned heterogeneous contexts. In addition, the available datasets have been taken either from university buildings or from office buildings. The existing techniques retain significant performance in such environments. In a public infrastructure (like shopping malls, railway stations, hospitals, etc.), the WiFi signal strengths become noisy due to the movement of the crowd, interference from nearby mobile devices, and many more factors. Thus, in order to develop a ubiquitous ILS, the existing or newly proposed techniques should be validated using datasets of public infrastructure. However, benchmark datasets for public infrastructure are still hard to find in public repositories. Moreover, comparing the performances of different techniques becomes very hard due to the following factors.

  • The size of a grid or cell, considered as location point, is not uniform. It varies from a small squared or rectangular area to the size of a room.

  • During train dataset preparation, the number of data samples collected per location point i.e. sampling rate varies a lot.

  • The units of RSS data are found to vary in state-of-the-art works. They are reported either on a linear scale or on a logarithmic scale such as dBm.
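The last factor can be handled with a conversion step before fingerprints from different sources are compared. The following is a minimal sketch of the standard relation between dBm and milliwatts, \(P_{mW} = 10^{P_{dBm}/10}\); the function names are illustrative and are not taken from any of the cited datasets.

```python
import math


def dbm_to_mw(rss_dbm: float) -> float:
    """Convert an RSS value on the logarithmic dBm scale to linear milliwatts."""
    return 10.0 ** (rss_dbm / 10.0)


def mw_to_dbm(power_mw: float) -> float:
    """Convert a linear power in milliwatts to dBm (power must be positive)."""
    if power_mw <= 0:
        raise ValueError("power must be positive to be expressed in dBm")
    return 10.0 * math.log10(power_mw)


# Example: 0 dBm corresponds to 1 mW, and -30 dBm to one microwatt.
reference_mw = dbm_to_mw(0.0)
weak_signal_mw = dbm_to_mw(-30.0)
```

Converting every dataset to one common scale before training keeps distance-based classifiers such as kNN from mixing incomparable values.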

Table 3 Main characteristics of the publicly available datasets for indoor localization

The main characteristics of some publicly available datasets are described in Table 3. The CRAWDAD (King et al. 2008), KIOS (Laoudias et al. 2013) and IPIN 2016 Tutorial (Montoliu et al. 2017) datasets contain RSS fingerprints of very small regions. In the CRAWDAD (King et al. 2008) dataset, the distance between any two location points is 1.5 m. However, in the KIOS (Laoudias et al. 2013) and IPIN 2016 Tutorial (Montoliu et al. 2017) datasets, the positioning granularity (i.e. size of each location point or cell) is not mentioned, and the distance between location points is not uniform. The UJIIndoorLoc (Torres-Sospedra et al. 2014) dataset contains RSS fingerprints of a vast indoor area of a university campus. Another dataset, UJIIndoorLoc-Mag (Torres-Sospedra et al. 2015), highlights the variations of the magnetic field. This dataset consists of 40,159 discrete captures containing inertial sensor data. However, these large-scale datasets (Torres-Sospedra et al. 2014, 2015) are neither fine-grained nor contain data for various ambient conditions. In the JUIndoorLoc (Roy et al. 2019) dataset, the RSS fingerprints have been taken at a granularity level of 1 sq. meter from a university building. Moreover, this dataset contains RSS fingerprints of different times, ambience (open/closed room, presence/absence of humans), and devices. In the recent past, a few crowdsourced indoor localization datasets (Mendoza-Silva et al. 2018; Lohan et al. 2017) have been published. Among them, the dataset described by Lohan et al. (2017) contains user traces of a significantly large area. However, one of the major drawbacks of the crowdsourcing approach is that many users are involved in constructing the radio map in the offline training phase. So, the labeling (i.e. tagging samples with locations) can vary from one user to another. Thus, localization solely based on crowdsourced RSS values can cause significant localization errors.

Fig. 5 Localization accuracies of state-of-the-art classifiers for various datasets

The localization accuracies of state-of-the-art classifiers for various publicly available datasets are shown in Fig. 5. According to this figure, for the UJIIndoorLoc (Torres-Sospedra et al. 2014) dataset, the accuracy of every classifier is higher than for the other datasets, since room-level accuracy is generally better than fine-grained accuracy. Moreover, the localization accuracy varies with ambient conditions and positioning granularity. The area covered by each cell is significant in measuring the localization error in distance metrics. Furthermore, significant localization accuracy can be achieved at a granularity level of 1 sq. meter or 2 sq. meter using sophisticated machine learning techniques.

B. Probable solution:

The RSS data need to be collected for various contexts to analyze the robustness of localization algorithms with heterogeneous fingerprints. The different contexts can be as follows.

  1. Temporal: Data can be recorded at different time slots in a day (say morning, afternoon, evening, night, etc.), to deal with the varying nature of RSS.

  2. Ambience: Various public places have different types of ambience. So, the data can be collected in various ambient conditions such as the presence of a dense crowd in a railway station, presence of heavy electrical appliances in a factory, semi-open spaces in the museum, railway stations, hospitals, and so on.

  3. Device: Smartphones with different configurations can be used for data collection to understand the variation of signal strength with respect to hardware change.

Fig. 6 Variation of RSS with different times at a specific location point

A few existing works (Torres-Sospedra et al. 2014, 2015; Lohan et al. 2017) consider device heterogeneity but do not consider temporal or ambience heterogeneity. In order to illustrate the impact of the emerging challenges discussed here, we have formed a small fingerprint dataset. The RSS data of the available WiFi APs have been collected from a faculty room with an area of approximately 36 sq. meter. The distance between any two neighboring location points is 1 m. Moreover, for understanding the dynamic nature of RSS, the data have been recorded for 20 days at different times. However, this dataset does not contain RSS for varying contexts throughout the entire floor or multiple floors due to operational problems. The variation of the signal strengths of 5 APs with respect to time, using the same device and at the same location point, is shown in Fig. 6. The RSS scan is repeated every 15 min from 11:00 AM to 07:00 PM. At each time instant, the average of the RSS values received from each AP during the scan duration is considered. It can be observed that the signal strengths in the morning (11:00 AM) and evening (07:00 PM) are almost the same, with a drop at around 02:00 PM to 03:00 PM for all the APs. In the morning and evening, the location is less crowded, so the number of nearby interfering devices is also smaller, while at around 02:00 PM to 03:00 PM the place is more crowded.
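The per-slot averaging described above can be sketched as follows. This is only an illustration under stated assumptions: the tuple layout `(slot, ap_id, rss_dbm)` and the function name are hypothetical, and the arithmetic mean is taken directly over dBm values.

```python
from collections import defaultdict
from statistics import mean


def average_rss_per_slot(scans):
    """Average the RSS of every AP within each time slot.

    scans is an iterable of (slot, ap_id, rss_dbm) tuples (assumed layout).
    Returns a dict of the form {slot: {ap_id: mean RSS in dBm}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for slot, ap_id, rss_dbm in scans:
        grouped[slot][ap_id].append(rss_dbm)
    return {
        slot: {ap_id: mean(values) for ap_id, values in per_ap.items()}
        for slot, per_ap in grouped.items()
    }


# Hypothetical readings of one AP in two 15-minute slots.
example_scans = [
    ("11:00", "AP1", -60),
    ("11:00", "AP1", -62),
    ("14:00", "AP1", -70),
]
example_averages = average_rss_per_slot(example_scans)
```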

C. Future direction: In the JUIndoorLoc (Roy et al. 2019) dataset, the WiFi fingerprints have been collected from different floors of a university building for varying devices and minor variations of ambient conditions. However, the university campus is not as crowded as public infrastructures, such as shopping malls or railway stations. Secondly, the floors do have a long corridor but do not contain the vast open spaces found in most airports, shopping malls, hospitals, and railway stations. A dataset containing RSS from such indoor/semi-indoor crowded open spaces is important, as the RSS behavior of such places would not match that collected from a room or a moderate-sized closed seminar hall. The semi-indoor spaces have unique characteristics, unlike closed indoor spaces. The available datasets mentioned in Table 3 are useful for testing commercial applications such as navigation in office/university buildings. Thus, new benchmark RSS datasets of such overcrowded places, subject to temporal, ambience, and device heterogeneity, are needed for conducting experimentation on critical services, such as emergency evacuation, especially for crowded public places. So, preparing datasets from such indoor/semi-indoor environments, which are very common in public places, is crucial for effective localization and navigation services.

5.1.2 Designing techniques for precise indoor localization and navigation, when training and testing contexts are different

A. Challenges faced: Considering the dynamic nature of the indoor environment, it may not be possible to collect the train set and the test set in the same context (collection time, indoor ambience, and scanning device). Notably, the training context is known but the testing context is not. The test set can be collected for an ambience or device for which no training instances are available. Moreover, the context may vary while a user moves around the experimental region collecting the test instances. Providing navigation service becomes even more challenging for fine-grained localization because of labeling ambiguity. If the cells are small (such as \(1\times 1\) sq. meter), a user moving in an indoor region passes from one cell to another during the collection of test instances. Thus, sufficient instances from one cell may not be present in the test set, and precise labeling of data also becomes very difficult. Hence, a supervised machine learning classifier may not provide satisfactory localization and navigation accuracy under different conditions. In this domain, a few existing works have utilized BLE beacons (Cooper et al. 2016) and RFID signals (Calderoni et al. 2015) for precise indoor positioning. The main advantages of those signals are good distance sensitivity at close range and low power consumption. However, the ubiquity of the ILS is not maintained, as such additional devices need to be deployed as part of the infrastructure.

B. Modified problem formulation: Given a labeled train set, \(TR^{(t,am,d)}=\{tr_1^{(t,am,d)}, tr_2^{(t,am,d)}, \ldots, tr_m^{(t,am,d)}\}^T\) (where \(m\) is the number of training fingerprints), of known contextual heterogeneity subject to temporal (\(t\)), ambience (\(am\)), and device (\(d\)), and a test set, \(TE^{(uc)}=\{te_1^{(uc)}, te_2^{(uc)}, \ldots, te_{m^{\prime}}^{(uc)}\}^T\) (where \(m^{\prime}\) is the number of test fingerprints), of an unknown context (\(uc\)), the objective of a machine learning classifier is to predict an unknown location, \(l_z \in LP\), of each \(te_i^{(uc)} \in TE^{(uc)}\) with considerable localization accuracy.

C. Probable solution: Some experiments have been conducted to show the effect of different training and testing conditions with respect to the dataset mentioned in Sect. 5.1.1. Table 4 highlights the performance of the individual classifiers for predicting an unknown location when the device on which the classifiers are trained and the device used for collecting test data have different configurations. Each subset of the train dataset, \(TR(d_g,d_h)\), contains RSS of all the available APs from the corresponding locations taken by the smartphones, say \(d_g\) and \(d_h\) (where \(g,h=1\) to 4 and \(g \ne h\)), at different times in a day. As can be observed from Table 4, the decisions of the individual classifiers are not reliable enough to estimate a location. Thus, it is difficult for an individual classifier to retain generality while maintaining precision. In such a case, an ensemble of different condition-specific classifiers would be a better choice, where a classifier is tuned separately for each of the conditions as shown in Table 5. The test dataset considered in Table 5 contains instances from all four devices, as does the train dataset. However, the same instances are not present in the train and test datasets. By integrating the prediction results of these condition-specific individual base classifiers using the majority voting method, the localization accuracy can be improved to \(92\%\) as shown in Table 5. Hence, the unified decision of all the individual classifiers is able to cover all the conditions. The condition-specific classifiers can be based on temporal, ambience, and device-specific data or a combination of those data.

This technique has been proposed by Ghosh et al. (2016) and implemented on a relatively small experimental zone. The localization accuracy of each individual classifier ranges from 58% to 85%. However, by applying the majority voting method, the accuracy increased to almost 96% on their dataset. Roy et al. (2021) have presented a weighted voting algorithm based on Dempster-Shafer belief theory to address this issue subject to device heterogeneity on the JUIndoorLoc dataset (Roy et al. 2019).
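The idea of combining condition-specific base classifiers by majority voting can be illustrated with a small sketch. It is not the implementation of Ghosh et al. (2016); the data layout, the function names, and the toy fingerprints are assumptions made for illustration, and a plain kNN with Euclidean distance is used as the base classifier.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a location label with plain kNN on RSS vectors.

    train is a list of (rss_vector, location_label) pairs (assumed layout).
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # Ties are resolved in favour of the label met first among the neighbours.
    return votes.most_common(1)[0][0]


def ensemble_predict(condition_train_sets, query, k=3):
    """Majority vote over one kNN base classifier per training condition."""
    base_predictions = [
        knn_predict(train, query, k) for train in condition_train_sets.values()
    ]
    return Counter(base_predictions).most_common(1)[0][0]


# Hypothetical fingerprints (two APs) recorded under three conditions.
condition_train_sets = {
    "device_d1": [((-40, -70), "L1"), ((-70, -40), "L2")],
    "device_d2": [((-42, -68), "L1"), ((-72, -38), "L2")],
    "device_d3": [((-40, -70), "L2"), ((-70, -40), "L1")],  # disagrees on purpose
}
predicted = ensemble_predict(condition_train_sets, (-41, -69), k=1)
```

In this toy setting two of the three base classifiers agree, so the vote overrides the dissenting one. A weighted vote, as in Roy et al. (2021), would replace the plain count with per-classifier weights.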

Table 4 Localization accuracies in % obtained by training and testing of different condition-specific datasets using four classifiers
Table 5 Localization accuracies in % obtained by each condition-specific base classifier and the Ensemble method using kNN with the test dataset featuring instances of all devices, \(d_1\) to \(d_4\)

More importantly, a condition-based ensemble classifier performs effectively when the condition in which the test dataset is collected is similar to one or more of the training conditions. However, a huge number of smart devices with different hardware configurations are available in the market, and in reality many other contexts may appear as well. Hence, it is infeasible to record data for all conditions using every device. If many contexts are taken into consideration, the number of base classifiers increases exponentially with the probable combinations of contexts. Moreover, the signal strength variation between neighboring location points at 1 sq. meter granularity is negligible, whereas the variation across contexts is significantly high, which may result in lower classification accuracy. Thus, a condition-based ensemble classifier may not always provide accurate results. Hence, it is a vital challenge for researchers to explore a suitable technique to mitigate this problem.

D. Future direction: Interestingly, Transfer Learning is a new paradigm for improving the learning performance when the train and test data are collected for different conditions. In such a case, transferring the knowledge gained from the source environment may effectively improve the learning performance of the new environment or target domain (Liu et al. 2017). Hence, the overhead of site-survey for the target domain gets reduced and the scalability of the system is enhanced. Generally, metric learning and metric transfer are the main phases of any Transfer Learning-based framework. In metric learning, the distance metric from the source domain is learned by maximizing the statistical dependency between the WiFi signal features and corresponding location labels. The metric transfer phase then determines the most appropriate metric for the target domain by minimizing the inconsistency between the two domains. In this context, the source environment should have sufficient labeled fingerprints to achieve satisfactory localization performance. However, getting a sufficient amount of labeled fingerprints for all representative WiFi signal features is difficult. The dependency relation between the features may also vary with time, ambience, and device.

Moreover, it is difficult to locate a user precisely while on the move due to the inherent labeling ambiguity induced by the movement. Here, data can only be grossly labeled. However, these individual instances with a gross label cause ambiguity. Multiple Instance Learning (MIL) is a semi-supervised learning technique that can be used to solve this issue. In MIL, a label is assigned not to an individual instance but to a bag of instances, with the requirement that at least one instance of the bag belongs to that label. Thus, a location-specific bag contains at least one RSS instance that is actually collected from that location and some other grossly labeled instances collected on the move. In this way, MIL techniques can be explored for indoor localization with a few accurately labeled and other grossly labeled RSS instances collected using different devices.
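The bag-level labeling described above can be sketched with a simple nearest-bag rule. This is one possible illustration rather than a method from the cited works: the distance between two bags is taken as the minimal Hausdorff distance (the smallest distance between any pair of instances), which matches the assumption that at least one instance in a bag truly belongs to its label. All names and fingerprints below are hypothetical.

```python
import math


def min_hausdorff(bag_a, bag_b):
    """Smallest Euclidean distance between any instance of bag_a and bag_b."""
    return min(math.dist(a, b) for a in bag_a for b in bag_b)


def predict_bag_label(train_bags, query_bag):
    """Return the label of the training bag closest to the query bag.

    train_bags is a list of (bag, location_label) pairs, where a bag is a
    list of RSS vectors that share one gross location label.
    """
    _, label = min(
        train_bags, key=lambda item: min_hausdorff(item[0], query_bag)
    )
    return label


# Each bag holds one accurately collected instance and one taken on the move.
train_bags = [
    ([(-40, -70), (-55, -60)], "L1"),
    ([(-70, -40), (-56, -59)], "L2"),
]
query_bag = [(-41, -71), (-50, -65)]
predicted_location = predict_bag_label(train_bags, query_bag)
```

Because only the closest pair of instances decides the distance, the ambiguous on-the-move instances in a bag do not dominate the decision.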

5.2 Selecting stable infrastructure in order to provide a stable solution

A. Challenges faced: The localization accuracy mainly depends on the RSS fingerprints of all available APs. Due to signal fluctuations, the signal strength of a certain AP at a location (say \(l_1\)) collected in some ambience (say \(am_1\)) may match the signal strength of that AP at another nearby location (say \(l_2\)) for a different ambience (say \(am_2\)). As a result, an incorrect location can be predicted at the time of classification if the ambience of the fingerprint data is not considered. For smaller grid sizes, that is, fine-grained location points, this effect is even more pronounced. Interestingly, APs that exhibit strong signal strength at a location are found to be stable across ambience. However, a common set of APs cannot exhibit strong signal strength across the entire coverage area. Thus, building properties need to be considered along with AP signal variations for steady coverage.

Jiang et al. (2015) have selected the important APs using the signal feature-based MaxMean approach. Similarly, in order to identify important APs, Lin et al. (2014) and Xue et al. (2019) have used the signal feature-based approaches Group Discriminant and Access Point Discrimination Criterion, respectively. They reported improved accuracy over other existing methods. However, the physical distribution of the APs across the entire region is not considered in those methods. Thus, the localization error may increase in some locations, as the selected important APs are not evenly distributed throughout the experimental region. Another drawback of the signal feature-based approach is the need to determine the threshold value used to select only the relevant APs and exclude the irrelevant ones. Moreover, information theory-based approaches like Information gain (Zhou et al. 2013; Zou et al. 2014) and Mutual information (Zou et al. 2015a) have been used in this domain for AP selection. However, the information theory-based approaches select the important APs in a univariate way, so they cannot handle redundant APs. Besides, Kim et al. (2017) have divided a target area into several rectangular clusters, and each cluster has been divided into eight subzones to uniformly distribute important APs. However, this type of region division technique may not provide sustainable results for all experimental regions.

B. Modified problem formulation: Given a reduced train set, \(RTR^{(t,am,d)}=\{rtr_1^{(t,am,d)}, rtr_2^{(t,am,d)}, \ldots, rtr_m^{(t,am,d)}\}^T\), of \(n^{\prime}\) APs (where \(n^{\prime}=|AP_{min}|\) and \(AP_{min}\) is a minimal set of stable APs) and a reduced test set, \(RTE^{(uc)}=\{rte_1^{(uc)}, rte_2^{(uc)}, \ldots, rte_{m^{\prime}}^{(uc)}\}^T\), the objective is to select \(AP_{min}\) in such a way that the unknown locations of the test set, \(RTE^{(uc)}\), are predicted by a machine learning classifier with considerable localization accuracy.

C. Probable solution: Machine learning-based feature selection techniques such as Correlation Attribute Evaluation and Information Gain evaluate the relevance of the features in a dataset. Moreover, in order to take care of the stability of the selected APs, the mean and standard deviation of the RSS of each AP at every location point should be considered. Before applying feature selection, the stable APs can be short-listed based on mean and standard deviation. Following this mechanism, experiments are conducted using state-of-the-art classifiers through cross-validation on our collected dataset. The obtained localization accuracies are depicted in Fig. 7. If, instead of all the APs, a minimum number of stable APs obtained from the feature selection technique is taken into consideration, the performance of the classifiers improves significantly, as shown in Fig. 7. However, the feature selection method is not the only solution for stable AP selection. The ranking of APs may change with different times, ambience, device configuration, and more importantly with the zone where localization is considered. An AP may show stable signal strength for one region while in another region it may not show that much stability. Thus, another related question regarding stable AP selection is how to divide the region into different zones and identify the stable APs per zone.
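The short-listing step based on mean and standard deviation can be sketched as below. The aggregation rule and both thresholds are assumptions made for illustration, since the text does not fix them: an AP is kept when the average of its per-location mean RSS is high enough and its largest per-location standard deviation is small enough.

```python
from statistics import mean, pstdev


def shortlist_stable_aps(readings, min_mean_dbm=-75.0, max_std_db=4.0):
    """Short-list APs whose RSS is both strong and steady.

    readings maps ap_id -> {location_point: [rss_dbm, ...]} (assumed layout).
    min_mean_dbm and max_std_db are hypothetical thresholds.
    """
    stable = []
    for ap_id, per_location in readings.items():
        means = [mean(samples) for samples in per_location.values()]
        stds = [pstdev(samples) for samples in per_location.values()]
        if mean(means) >= min_mean_dbm and max(stds) <= max_std_db:
            stable.append(ap_id)
    return sorted(stable)


# Hypothetical readings: AP2 fluctuates heavily and AP3 is too weak.
readings = {
    "AP1": {"l1": [-50, -52, -51], "l2": [-60, -61, -59]},
    "AP2": {"l1": [-50, -80, -65], "l2": [-60, -62, -61]},
    "AP3": {"l1": [-90, -91, -89], "l2": [-88, -87, -89]},
}
stable_aps = shortlist_stable_aps(readings)
```

A feature selection technique such as Information Gain could then be applied only to the short-listed APs.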

Fig. 7 Localization accuracies in % of different classifiers for the different number of highest ranked APs

D. Future direction: In public indoor regions like airports, railway stations, shopping malls, and museums, the APs are not deployed with the aim of providing localization. Moreover, in those places, more WiFi hotspots than pre-deployed WiFi APs are active. The signal strength of the hotspots can degrade the performance of localization, as they are mobile and active only for a short duration. Thus, besides the selection of stable APs, the zones must be identified in order to deploy the APs in such a way that they cover the entire indoor region. In addition, their signal strength must be strong enough in the target region in order to show stability under heterogeneous conditions. Identifying stable APs not only reduces the cost of maintenance of APs and the dimension of the location classification problem but also ensures sustainable localization performance.

However, according to the building properties, the indoor ambient properties vary, and consequently, the localization capability of the APs differs in various regions of an experimental area. Therefore, the whole region can be divided into clusters having similar signal properties of different APs. At the time of cluster formation, it can be observed that some APs may provide distinguishing characteristics for the two adjoining clusters while some APs give no predictive information for localization. Consequently, a proper technique should be explored for the identification of the optimal number and size of clusters based on the similar RSS behavior and the selection of important and stable AP set that distinguishes among the clusters.

Moreover, AP selection is a combinatorial optimization problem in which an optimal set of relevant APs is selected from a large set of APs. Evaluating the performance of all possible subsets of APs from a large search space is generally infeasible in practice, as it requires huge computational effort. Metaheuristic techniques find a near-optimal solution to an optimization problem like AP selection by exploring and exploiting a large search space. Hence, metaheuristic techniques should be explored to select stable and important APs that generate a robust model for indoor localization.
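As an illustration only, the search over AP subsets can be sketched with simulated annealing, one common metaheuristic chosen here for brevity and not taken from the cited works. The `fitness` callable is hypothetical; in practice it could be the cross-validated accuracy of a classifier trained on the selected APs, and the toy fitness below merely stands in for it.

```python
import math
import random


def anneal_ap_subset(n_aps, fitness, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Search for a high-fitness AP subset, encoded as a list of booleans."""
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_aps)]
    current_fit = fitness(current)
    best, best_fit = current[:], current_fit
    for step in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        candidate = current[:]
        flip = rng.randrange(n_aps)
        candidate[flip] = not candidate[flip]  # add or drop one AP
        candidate_fit = fitness(candidate)
        delta = candidate_fit - current_fit
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_fit = candidate, candidate_fit
            if current_fit > best_fit:
                best, best_fit = current[:], current_fit
    return best, best_fit


def toy_fitness(mask, useful=(0, 2, 5), cost_per_ap=0.2):
    """Stand-in objective: reward assumed-useful APs, penalise every AP kept."""
    gain = sum(1.0 for index in useful if mask[index])
    return gain - cost_per_ap * sum(mask)


best_mask, best_score = anneal_ap_subset(8, toy_fitness)
selected_aps = [index for index, keep in enumerate(best_mask) if keep]
```

With \(n\) APs there are \(2^n\) candidate subsets, so the sketch evaluates only a small sample of them; replacing the toy objective with a real accuracy estimate makes each evaluation the dominant cost.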

The important APs that are less susceptible to changes of context can be identified by Deep Neural Networks with multiple hidden layers. Such networks can also be applied for extracting useful features from a dataset having instances from different devices subject to varying ambient conditions at different times in a day. Generally, they hierarchically learn multiple levels of representation corresponding to different levels of abstraction.

6 Discussion on open research issues in ubiquitous ILS

Despite the above-mentioned research challenges and existing research efforts, some open issues still exist. These are discussed below.

  • Adaptation with the change in feature space: The performance of a localization system often gets affected by significant changes in the existing WiFi network infrastructure like addition, replacement, drop-off, and shifting of WiFi APs. In emergency conditions, like fire outbreaks, some WiFi APs of a region may become inactive. In such scenarios, the training model needs to be updated to provide localization services. Thus, a feature space mapping technique needs to be explored in order to adapt the old training model to the new feature space.

  • Adaptation with the unlabeled data: The collection of a huge number of training fingerprints is a very laborious and time-consuming task. Therefore, an investigation is needed to determine the minimum volume of fingerprint collection that covers the entire region. Accordingly, the system needs to be made effective and accurate by collecting unlabeled fingerprints from anonymous users. Moreover, an extensive investigation is also required to enhance the system’s performance.

  • Avoidance of crowd in public places: Crowd identification and mitigation are important concerns for public health and safety, especially during a pandemic period, in order to contain the spread of infectious diseases (such as Covid-19). Indoor localization techniques can be used to identify the mobility pattern of the crowd. Accordingly, crowd mitigation strategies can be investigated and possible crowd formation spots could be identified. For example, a dense crowd can form in front of an LED display screen at public places like some regions of railway stations, airports, shopping malls, etc. By analyzing the movement or formation of the crowd, it may be possible to avoid crowd formation by placing another LED display screen in the nearby area.

  • Maintaining a trade-off among various performance metrics: The localization or tracking accuracy is the most vital requirement of any localization system, and a system is generally considered better if it has high localization accuracy. In order to increase the accuracy, other characteristics including scalability, robustness, energy efficiency, and cost often get overlooked. So, a proper trade-off between accuracy and the other characteristics needs to be maintained to develop an efficient system. If a large localization area becomes very crowded, the wireless signal channels get more congested. Hence, more calculations or analyses may be required for localization. Thus, a localization system should be scalable so that it maintains its usual localization performance when the localization scope increases. Moreover, a localization algorithm should be less complex and be executed at the server end, because the client-end mobile device lacks strong processing power and long battery life. Therefore, a system needs to be energy efficient so that it consumes less power. ILS is mostly used for live-location tracking of users and real-time navigation. Thus, an efficient system with low network latency is required. To achieve this, only a small volume of (pre-processed) data should be transferred between the server and the client.

7 Conclusion

The aim of this research is to provide the motivation for a ubiquitous WiFi-based ILS and to present the emerging research challenges associated with it. A brief discussion about the different phases of an ILS, the problem definition, and a review of previous works are presented. Our key contribution is a detailed categorization of research challenges in ILS when system ubiquity is the primary concern. This is needed for almost all applications of indoor localization including indoor navigation, asset tracking, emergency evacuation, etc.

The associated research challenges and possible future scope are studied thoroughly. Designing a fine-grained comprehensive dataset for public infrastructure, designing techniques for precise indoor positioning with the train and test dataset of various conditions, and identifying minimal but more importantly, stable infrastructure are the prime challenges as detailed in Sect. 5. These issues need to be fixed for developing a wide-scale ILS that provides significant localization accuracy in crowded indoor spaces.