Is Localization of Wireless Sensor Networks in Irregular Fields a Challenge?

Wireless sensor networks are considered an emerging technology for numerous applications of cyber-physical systems. These applications often require the deployment of sensor nodes in various anisotropic fields. Localization in anisotropic fields is a challenge because of factors such as non-line-of-sight communications, irregularities of terrain, and network holes. Traditional localization techniques, when applied to anisotropic or irregular fields, result in large location estimation errors. To improve location estimations, this paper presents a comparative analysis of available localization techniques based on a taxonomy framework. A detailed discussion of the importance of localizing sensor nodes in irregular fields, drawn from reported real-life applications, is presented along with the challenges faced by existing localization techniques. Further, a taxonomy based on the techniques adopted by localization methods to address the effects of irregular fields on location estimations is reported. Finally, using the designed taxonomy framework, a comparative analysis of different localization techniques addressing irregularities is presented, and directions towards the development of an optimal localization technique are discussed.


Introduction
The emergence of the Internet of Things (IoT) and intelligent applications like smart cities, smart homes, smart healthcare services, and intelligent vehicular monitoring has increased the need to connect the physical environment with the digital world [1,2]. Wireless sensor networks (WSNs) play a vital role in such applications because of the ease with which they can connect the physical and digital worlds [1]. Advances in Micro-Electro-Mechanical System (MEMS) technology, wireless communication, and digital electronics have led to the proliferation of WSNs [3]. With the advancement of sensor technologies, WSNs promise a wide variety of services in many fields, such as search-and-rescue operations, environmental monitoring, precision agriculture, smart transportation, military surveillance, and event detection (fires, floods, and hailstorms), to name a few [4-8].
WSNs are a network of sensor nodes which are tiny and cost effective devices. Sensor nodes consist of one or many sensors to monitor phenomena around them, Analog to Digital Converter (ADC) unit to convert monitored data to digital format, a small memory unit, a microcontroller for data processing, a transceiver unit to exchange data and a power supply which is usually a battery [9]. Nodes transmit the monitored data to a gateway node or base station through single hop routing or multi-hop routing [10,11]. Gateway node, which is usually a powerful node, will process the data and take necessary actions [12].
WSNs are formed by deploying sensor nodes in the field of interest. An efficient deployment maximizes the quality of the network in terms of coverage, connectivity, and lifetime [13,14]. Deployment of sensor nodes can be carried out in a deterministic or random fashion, as represented in Fig. 1. In a deterministic deployment, the location of each sensor node is predetermined. By knowing the field geometry and the radio propagation pattern of nodes, the minimum number of required nodes and their positions can be determined to achieve the required degree of coverage and connectivity [15,16]. However, these methods are usually not suitable for large scale networks of hundreds to thousands of nodes, as they demand the placement of a huge number of nodes in predetermined places. Uniform random deployment is more suitable for networks consisting of a huge number of sensor nodes [17,18]. The disadvantage of this method is that, because it involves randomness, it does not guarantee complete coverage and connectivity: a few parts of the field may be left uncovered or disconnected.
After deployment, sensor nodes start monitoring phenomena around them and start collecting information on events. The information collected by each sensor node can be represented in the form (D, L, T), where D is the measurement data, L is the location of the measuring node, and T is the time at which the measurement was done [12]. It is important to know the location of nodes along with the measurement data and time of measurement. Location information of nodes in the network is essential to provide location stamps for the observed events, act on sensed data, locate and track target objects, determine the quality of coverage, facilitate geographic routing algorithms based on node locations, support position aware data processing, etc. Studies have revealed that existing location finding techniques like the Global Positioning System (GPS) are not suitable for use in WSNs because of higher power consumption and reduced accuracy in indoor and urban areas [5,6,19-22].
Localization in WSN is the process of determining the location of sensor nodes. The existing traditional localization techniques in WSN utilize connectivity information, distance, and angle measurements between sensor nodes to estimate the location of nodes [23,24]. Centroid [25,26], weighted centroid [10,25], Approximate Point in Triangle (APIT) [19,27] and Distance Vector hop (DV-hop) [6,28] are a few of the existing localization techniques which make use of connectivity information between nodes. Received Signal Strength (RSS) [18,29-32], Time of Arrival (TOA) [7,33,34] and Time Difference of Arrival (TDOA) [34] based localization techniques utilize distance measurements between sensor nodes. RSS based techniques make use of the change in signal strength from the transmitting node to the receiving node to estimate the distance between nodes. TDOA and TOA based techniques make use of the time a signal takes to travel between two nodes. Angle of Arrival (AOA) based localization methods make use of angular estimations between nodes [4,35].
In the existing literature, a number of localization algorithms have been reviewed in the area of estimation of location for sensor nodes. Authors in [36,37] have conducted a review of generic WSN localization techniques. Recent advances on localization techniques in WSNs were presented in [36] by considering a wide variety of factors and categorizing them in terms of sparse and dense node density, anchor based and anchor free algorithms, indoor and outdoor operating environment, static and mobile nodes, etc. In this review, it is observed that to handle a variety of applications in different scenarios, WSNs would need to be equipped with a combination of techniques and context to estimate locations accurately. A study of research problems associated with node localization in WSN was provided by [37]. In this paper, sensor network localization problems were described in terms of detection and estimation framework with emphasis on anchor based localization measurements. A review of rigid graph based localization of WSNs was conducted in [38]. Rigid graphs can sustain a different kind of deformations due to translation, rotation, and reflection. Concepts of rigid graphs were found to be more useful for determining correct location coordinates from error prone distance measurements.
Authors in [39,40] have conducted a review of mobility assisted localization techniques. In [39], authors have reported a review on various mobile anchor node assisted localization techniques. Discussions of various classification algorithms based on the mobility model and path planning schemes were conducted. In [40], authors have presented a survey on mobility-assisted localization techniques by focusing on the algorithmic approaches of these techniques. In this survey, along with algorithmic approaches, error refinement mechanisms adopted were highlighted, and also mobile anchor trajectories presented in existing works were reviewed.
Location finding is more challenging in practical scenarios where nodes are affected by non-line of sight (NLOS) communications, irregularities of terrains, and hardware malfunctioning. Hence, a detailed review of the available localization techniques that try to overcome the effects of irregularities is presented. This review is restricted to a network of static sensor nodes. In this review, an overview of the research done in the field of WSN localization techniques, evaluation, and comparison to existing localization techniques for irregular fields is presented.
The paper is organized as follows. Section 2 introduces different representations of irregular fields and evaluation criteria for localization techniques. Section 3 discusses the motivations and challenges of WSN localization in irregular fields. Classification of localization techniques using different classification criteria and a taxonomy framework is provided in Sect. 4. Finally, conclusions and future research directions are summarized in Sect. 5.

Basics of Localization
This section discusses basics of localization by defining localization problem, irregularities of field, various representations of irregularities and evaluation criteria of localization techniques.

Localization Problem
Localization is the process of determining the locations of location unaware sensor nodes by making use of proximity measures to reference nodes. Proximity is a quantitative measure which is proportional to the geographic distance. An example of a proximity measure is RSS. Here, by measuring the decay in signal strength from location unaware nodes to location aware reference nodes, the geographic distance between the nodes can be estimated. TOA is another example of a proximity measure. Here, by making use of the time a signal takes to travel from location unaware nodes to reference nodes, the geographic distance between the nodes can be estimated. In [41], authors have defined the localization problem as follows.
Consider a sensor network with $N$ nodes. In this network, let $M$ nodes be reference nodes denoted as $X_1, X_2, \dots, X_M$ and $N-M$ be location unaware nodes with unknown positions denoted as $X_{M+1}, X_{M+2}, \dots, X_N$. The Euclidean distance, which is a measure of geographic distance between two nodes $X_i$ and $X_j$, is defined as

$$d_{ij} = \sqrt{\sum_{k=1}^{d} (X_{ik} - X_{jk})^2}$$

where $X_{ik}$ and $X_{jk}$ are the $k$th coordinates of $X_i$ and $X_j$ respectively in $d$-dimensional space.
The localization problem can be stated as: given $X_i$, $X_j$, and $p_{ij}$ for $i, j = 1, \dots, M$, and $p_{si}$ for $s = M+1, \dots, N$, estimate $X_s$, where $X_i$, $X_j$ are reference nodes, $p_{ij}$ is the proximity between any two reference nodes, which can be in terms of RSS, hop count, TOA, AOA, etc., and $p_{si}$ is the proximity between a location unaware sensor node and any reference node. In other words, localization is the problem of estimating the locations of $N-M$ nodes with the help of $M$ reference nodes.

Definition of Irregularity
Let $f_p$ be the function that maps the coordinates of two sensor nodes $X_i$ and $X_j$ to $p_{ij}$, where $p_{ij}$ is the proximity between $X_i$ and $X_j$. Proximity $p_{ij}$ can be in terms of minimum hop counts, the difference between transmitted and received signal powers between nodes $X_i$ and $X_j$, or any other measure. As defined in [41], if the measured proximity $p_{ij}$ for a pair of sensor nodes $X_i$ and $X_j$, written as $p_{ij} = f_p(X_i, X_j)$, is a function of the Euclidean distance between $X_i$ and $X_j$, written as $g_p(d_{ij})$, the sensor network is said to be regular or isotropic.
Irregular or anisotropic fields are the ones where proximities measured by a sensor node to others vary with direction and hence cannot be represented as a function of Euclidean distance. In other words, there does not exist a function $g_p$ which can map Euclidean distance $d_{ij}$ to proximity $p_{ij}$.

Different Representations of Irregularity
Irregularities in the fields can be represented in terms of irregular radio propagation pattern of nodes, noise in the environment, and by considering network holes and irregularly shaped fields.

Irregular Radio Propagation Pattern of Nodes
In [42], authors have used Degree of Irregularity (DOI) parameter to denote the irregularity of a radio pattern. In a field with randomly scattered obstacles of different sizes, radio signal gets attenuated with different magnitudes at different directions. The DOI is defined as the change in maximum path loss percentage per unit degree change in the radio propagation direction. As shown in Figs. 2 and 3, when there are no obstacles, DOI is 0 and Radio Propagation Pattern (RPP) is a perfect sphere. As the irregularity of the field and number of obstacles increase, radio propagation pattern becomes more and more irregular [43].
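The effect of DOI on a radio pattern can be illustrated with a minimal sketch. It assumes a simplified per-degree random walk in which the path loss coefficient changes by at most DOI between adjacent directions; the function names are hypothetical, and the sketch does not enforce that the pattern closes smoothly between 359 and 0 degrees, so it should be read as an illustration rather than as the model used in [42].

```python
import random


def doi_coefficients(doi, seed=None):
    """Return 360 per-degree path loss coefficients (illustrative model).

    The coefficient starts at 1.0 for direction 0 and changes by a random
    amount of at most `doi` for every one-degree step. With doi = 0 the
    pattern is perfectly regular (all coefficients equal 1.0).
    """
    rng = random.Random(seed)
    coefficients = [1.0]
    for _ in range(1, 360):
        step = rng.uniform(0.0, doi)
        sign = rng.choice((-1.0, 1.0))
        coefficients.append(coefficients[-1] + sign * step)
    return coefficients


def directional_path_loss(nominal_loss_db, coefficients, angle_deg):
    """Scale a nominal path loss by the coefficient of the given direction."""
    return nominal_loss_db * coefficients[int(angle_deg) % 360]


if __name__ == "__main__":
    k = doi_coefficients(0.02, seed=7)
    for angle in (0, 90, 180, 270):
        print(angle, round(directional_path_loss(80.0, k, angle), 2))
```

With DOI set to 0 every direction sees the same path loss, which corresponds to the regular pattern described above; larger values make the loss depend increasingly on direction.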

Noise
Irregularity in fields can arise from the presence of random obstacles, which cause reflection, diffraction, absorption, and scattering of the radio signals from nodes. This causes NLOS communication among nodes, which can be represented by approximating the NLOS error with different distributions, such as Gaussian, uniform, and exponential distributions, under different conditions [44]. According to [45], the measured distance between two nodes $i$ and $j$ can be represented as

$$\hat{d}(i,j) = d(i,j) + n$$

where $d(i,j)$ is the true Euclidean distance between nodes $i$ and $j$, and $n$ is the noise factor, given by

$$n = v_n + b_{nlos}$$

where $v_n$ is the measurement noise and $b_{nlos}$ is the NLOS error, which can follow different distributions.
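A minimal sketch of how such noisy distance measurements could be simulated is given here. It assumes an additive model in which zero-mean Gaussian measurement noise and, under NLOS conditions, an exponentially distributed positive bias are added to the true distance. The function name, its parameters, and the choice of distributions are illustrative assumptions and are not taken from [44] or [45].

```python
import random


def noisy_distance(true_d, sigma, nlos=False, nlos_mean=0.0, rng=None):
    """Simulate one distance measurement under an assumed additive model.

    true_d    : true Euclidean distance between the two nodes
    sigma     : standard deviation of the zero-mean Gaussian measurement noise
    nlos      : whether the link is non-line-of-sight
    nlos_mean : mean of the exponential NLOS bias (used only when nlos is True)
    """
    rng = rng or random
    v_n = rng.gauss(0.0, sigma)
    b_nlos = rng.expovariate(1.0 / nlos_mean) if nlos and nlos_mean > 0 else 0.0
    return true_d + v_n + b_nlos


if __name__ == "__main__":
    rng = random.Random(0)
    los = [noisy_distance(10.0, 0.5, rng=rng) for _ in range(1000)]
    nlos = [noisy_distance(10.0, 0.5, True, 2.0, rng) for _ in range(1000)]
    print("mean LOS measurement :", sum(los) / len(los))
    print("mean NLOS measurement:", sum(nlos) / len(nlos))
```

Because the assumed NLOS bias is never negative, the NLOS measurements are shifted upwards on average relative to the LOS ones.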

Network Holes
Network holes are created by irregularities in the field where, in some parts of the field, signal propagation is completely blocked. This may be due to large obstacles like rocks, rivers, or buildings. Larger holes force signal propagation to take a longer path, resulting in large deviations from the actual distance between nodes [46]. As shown in Fig. 4a, signal propagation from node s to node t is close to a straight line and hence proportional to the actual distance, whereas in Fig. 4b the path is curved to bypass the hole. This results in a higher bias of the shortest path from the actual distance between nodes s and t.
Fig. 4 Signal propagation from node s to node t: in a, close to a straight line; in b, taking a longer route because of the obstacle
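The detour caused by a hole can be made concrete with a small sketch. It assumes nodes placed on a hypothetical 7 × 7 grid with four-neighbour connectivity and counts the hops of the shortest path with a breadth-first search; the grid, the hole, and the function name are illustrative assumptions only.

```python
from collections import deque


def hop_distance(width, height, blocked, start, goal):
    """Shortest hop count between two grid cells, avoiding blocked cells.

    Returns None when the goal cannot be reached.
    """
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (x, y), hops = queue.popleft()
        if (x, y) == goal:
            return hops
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            cell = (nx, ny)
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and cell not in blocked and cell not in seen:
                seen.add(cell)
                queue.append((cell, hops + 1))
    return None


if __name__ == "__main__":
    s, t = (0, 3), (6, 3)
    hole = {(3, y) for y in range(1, 6)}  # a wall-shaped hole between s and t
    print("hops without hole:", hop_distance(7, 7, set(), s, t))
    print("hops with hole   :", hop_distance(7, 7, hole, s, t))
```

The straight-line distance between s and t is the same in both cases, but the path around the hole needs more hops, so any distance derived from the path length is biased upwards.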

Evaluation Criteria
The quality of a localization technique for WSNs depends on whether it can satisfy the accuracy requirements while keeping the resource consumption of sensor nodes minimal. Localization techniques are evaluated for accuracy and success percentage at different node densities and reference node counts. These techniques are also evaluated for memory requirements, communication costs, time taken for communication, and complexity of algorithms [47]. Localization accuracy and other characteristics are evaluated in fields of different irregular shapes with network holes and obstacles. They are also evaluated for multi-dimensional cases and for heterogeneous or homogeneous sets of nodes [48,49]. Accuracy is usually evaluated as the average localization error (ALE), which is calculated as

$$\mathrm{ALE} = \frac{1}{n} \sum_{i=1}^{n} \sqrt{(x_{i,est} - x_{i,act})^2 + (y_{i,est} - y_{i,act})^2 + (z_{i,est} - z_{i,act})^2}$$

where $n$ is the number of location unaware nodes, $(x_{i,est}, y_{i,est}, z_{i,est})$ is the estimated coordinate of the $i$th node, and $(x_{i,act}, y_{i,act}, z_{i,act})$ is the actual coordinate of the $i$th node.
Success percentage is the percentage of nodes successfully localized, which can be calculated as [47]

$$\text{Success percentage} = \frac{\text{Number of target nodes localized}}{\text{Total number of target nodes}} \times 100$$
Success percentage is a way of measuring the quality of a localization algorithm. Higher value of success percentage indicates larger number of nodes localized by the algorithm and smaller value indicates only a few nodes in the network are localized.
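The two evaluation metrics described above can be sketched as follows. The function names are hypothetical, and the average localization error is assumed to be the plain mean of the Euclidean errors without any normalization by the communication range.

```python
import math


def average_localization_error(estimated, actual):
    """Mean Euclidean distance between estimated and actual coordinates."""
    if not estimated or len(estimated) != len(actual):
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for est, act in zip(estimated, actual):
        total += math.sqrt(sum((e - a) ** 2 for e, a in zip(est, act)))
    return total / len(estimated)


def success_percentage(num_localized, num_targets):
    """Percentage of target nodes that the algorithm managed to localize."""
    return 100.0 * num_localized / num_targets


if __name__ == "__main__":
    est = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    act = [(3.0, 4.0, 0.0), (1.0, 1.0, 1.0)]
    print("ALE:", average_localization_error(est, act))
    print("success percentage:", success_percentage(45, 50))
```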

Motivation
Motivations for localization in irregular fields with different sized obstacles, complex shaped fields and terrains are discussed here.
Consider, for example, battlefield surveillance or intelligent vehicular monitoring. These applications involve fields with obstacles of different sizes, where sensors can rarely be deployed uniformly over the field. The obstacles degrade radio communication among nodes and create holes in the network. Secondly, even in isotropic or regular fields, uneven power consumption among sensor nodes may create network holes. Lastly, external interference like rain, sand storms, etc. may cause communication failures which result in holes in the network [46].
Intelligent environmental applications
Localization is important in environmental applications like forest fire monitoring and precision agriculture. Forest fires are one of the commonly occurring problems across the world. WSNs can provide solutions by gathering sensory data values like humidity and temperature from sensor nodes deployed in the field. With efficient localization of the sensor nodes deployed in forest regions, the place of a fire can be known accurately. Based on this information, necessary actions can be taken immediately. However, as forest areas are usually not flat, uniform fields, considering the irregularities of the field increases localization accuracy [50,51]. WSNs are also used in precision agriculture applications, such as pest management and pH sensing in large farms [30]. Accurate localization helps in executing the required actions at the right places.
Industrial monitoring
WSNs are used in a few industrial monitoring applications, like monitoring of high temperature processes, toxic gas monitoring, etc. One such application is in the construction industry, which needs monitoring of carbon monoxide (CO) concentrations. At over-standard concentrations, CO may result in burns or explosions. A WSN with CO sensors is deployed for this purpose. This requires placement of nodes amidst obstacles of different types, like machines, humans, and furniture. Localization that considers these obstacles will correctly locate the place of over-standard concentration of CO [52].
Smart cities
WSNs are an essential part of smart cities and smart roads for intelligent monitoring and control of events. One such application is optimizing highway lighting for energy saving in smart cities. This is done through smart lighting techniques. WSNs are used to detect the presence of vehicles along the road and to control lamps accordingly. Efficient localization that considers obstacles like vehicles and trees enhances the performance of such systems [53].

Challenges
In practice, a sensor network is deployed in geographic regions of varying shapes and terrains, containing obstacles of different sizes and shapes that cause network holes and NLOS communications. These irregularities in the network result in variation of the distance-to-proximity mapping across the field of interest [41]. A few examples of the challenges that irregularities pose for some popular localization approaches are provided here.

RSS based localization techniques
They work under the assumption that by measuring the RSS value, distance between the transmitting and receiving node can be estimated.
Can RSS based localization techniques be used to localize nodes in irregular fields? Experiments conducted in [42] show that, in the presence of radio irregularities, RSS values between a pair of transmitter and receiver at fixed distance varied when the receiver was placed at different propagation directions from the transmitter. Hence, RSS based distance estimations result in huge deviations in irregular fields [54].
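The sensitivity of RSS based ranging to direction-dependent attenuation can be illustrated with a short sketch. It assumes the widely used log-distance path loss model, which is not named in the text above; the reference RSS, path loss exponent, and function name are hypothetical values chosen for illustration.

```python
def rss_to_distance(rss_dbm, rss_at_d0_dbm, path_loss_exponent, d0=1.0):
    """Invert the log-distance model RSS(d) = RSS(d0) - 10 * n * log10(d / d0)."""
    exponent = (rss_at_d0_dbm - rss_dbm) / (10.0 * path_loss_exponent)
    return d0 * 10.0 ** exponent


if __name__ == "__main__":
    # Two receivers at the same true distance from the transmitter. The second
    # one sits in a direction with 3 dB of additional attenuation.
    print("estimate, clear direction   :", rss_to_distance(-60.0, -40.0, 2.0))
    print("estimate, obstructed direction:", rss_to_distance(-63.0, -40.0, 2.0))
```

Under these assumptions a few decibels of extra attenuation in one direction already changes the estimated distance noticeably, although the true distance is unchanged.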
Hop count based techniques
These algorithms make use of hop distance and hop counts to estimate locations. They work under the assumption that the hop distance is the same throughout the network. The average distance per hop is calculated by measuring the hops between reference nodes and mapping them to the known distances between reference nodes. Using the average distance per hop and the number of hops, location unaware nodes estimate their distances to reference nodes. This information is used to estimate the locations of unknown nodes [55].
Can hop count based localization techniques be used to localize nodes in irregular fields?
In irregular topologies, communication range will have different values in different propagation directions. This results in different hop sizes in the network depending on irregularity. So, measuring node distance as the product of hop counts and average hop distance gives high location errors. Also, the presence of network holes results in bent paths and hence overestimated distances between nodes, which adds to the localization errors [56,57].
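The hop count based distance estimate described above can be sketched as follows. For brevity the sketch assumes a single network-wide average hop distance computed over all pairs of reference nodes, which is a simplification; the function names and the example coordinates are hypothetical.

```python
import math


def average_hop_distance(ref_positions, ref_hops):
    """Average distance per hop from reference-to-reference measurements.

    ref_positions : dict mapping reference id -> (x, y)
    ref_hops      : dict mapping (id_a, id_b) -> hop count between them
    """
    total_distance = 0.0
    total_hops = 0
    for (a, b), hops in ref_hops.items():
        pa, pb = ref_positions[a], ref_positions[b]
        total_distance += math.sqrt(sum((u - v) ** 2 for u, v in zip(pa, pb)))
        total_hops += hops
    return total_distance / total_hops


def estimate_distance(avg_hop_distance, hops_to_reference):
    """Distance of a location unaware node to a reference node."""
    return avg_hop_distance * hops_to_reference


if __name__ == "__main__":
    refs = {"A": (0.0, 0.0), "B": (30.0, 0.0), "C": (0.0, 40.0)}
    hops = {("A", "B"): 3, ("A", "C"): 4, ("B", "C"): 5}
    avg = average_hop_distance(refs, hops)
    print("average hop distance:", avg)
    print("estimated distance for 2 hops:", estimate_distance(avg, 2))
```

If a hole forces the path between two reference nodes to bend, its hop count grows while the straight-line distance does not, so the average hop distance and every estimate derived from it become distorted.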
Centroid based localization techniques
The location of a node is estimated as the geographic centroid of all reference nodes that are within its communication range [58].
Can centroid based localization techniques be used to localize nodes in irregular fields?
In the presence of obstacles and holes, nodes cannot be uniformly distributed. As shown in [42], in irregular fields, the node that can hear from N reference nodes need not be necessarily at the geographic center of these nodes. Hence, estimating the location of a node as centroid of other nodes results in large estimation errors. The performance gets worse in case of nodes on the borders of network holes and fields.
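A minimal sketch of the centroid estimate is given here; the function name and the example coordinates are illustrative assumptions.

```python
def centroid_estimate(heard_refs):
    """Estimate a node position as the centroid of the reference nodes it hears."""
    if not heard_refs:
        raise ValueError("at least one reference node is required")
    count = len(heard_refs)
    dims = len(heard_refs[0])
    return tuple(sum(ref[k] for ref in heard_refs) / count for k in range(dims))


if __name__ == "__main__":
    # Reference nodes surround the node evenly: the centroid is a fair estimate.
    print(centroid_estimate([(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]))
    # Near a hole or border only the references on one side are heard, so the
    # estimate is pulled towards them.
    print(centroid_estimate([(10.0, 0.0), (10.0, 10.0)]))
```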
TOA and TDOA based localization techniques
These algorithms estimate distances between nodes by making use of the time it takes for a signal to travel between them [59].
Can TOA and TDOA based localization techniques be used to localize nodes in irregular fields?
Generally, in irregular fields, obstacles force radio signal propagation between two nodes to take a longer, indirect route. The resulting NLOS error is therefore usually regarded as a positive bias on the TOA and TDOA measurements [60]. This positive bias results in overestimation of the distance between nodes and larger location estimation errors.
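The positive bias can be seen in a short sketch of TOA ranging. It assumes a radio signal travelling at the speed of light and perfectly synchronized clocks; the function name and the example path lengths are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def toa_distance(t_send, t_receive, speed=SPEED_OF_LIGHT):
    """Distance estimate from the time of flight of a signal."""
    return (t_receive - t_send) * speed


if __name__ == "__main__":
    direct_path = 30.0   # metres, line of sight
    detour_path = 50.0   # metres, signal bends around an obstacle
    t_los = direct_path / SPEED_OF_LIGHT
    t_nlos = detour_path / SPEED_OF_LIGHT
    print("LOS estimate :", toa_distance(0.0, t_los))
    print("NLOS estimate:", toa_distance(0.0, t_nlos))
```

The estimate follows the travelled path rather than the straight-line separation of the nodes, so any detour shows up as an overestimated distance.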
To overcome the above mentioned challenges, researchers have further enhanced localization algorithms which will be discussed in the next sections.

Classification of Localization Algorithms in Irregular Fields
Localization algorithms make use of different techniques to improve the positioning accuracy in anisotropic fields. Localization techniques can be classified based on different parameters. Here, we have discussed various classification criteria and reported a technique based taxonomy framework. Then, a comparative analysis of localization techniques is carried out.

Network Topology
Localization designs employ different techniques to improve location estimations while reducing power consumption, computation, and communication overhead [61]. One such design choice is whether the localization is performed in a centralized or a distributed manner.

Centralized technique
In centralized localization techniques, most of the computations are performed at a central node with higher capabilities. These types of algorithms collect all the information necessary for estimating locations from the network and process the collected information centrally. This approach reduces the computational overhead on sensor nodes. As the computations are executed at a node with higher capacity, these types of localization methods can be implemented with complex algorithms. However, since multiple communications are needed to exchange data between the sensor nodes and the central node, the communication cost of these methods is generally high.
Distributed technique
Localization methods which fall under this category have the algorithms running on sensor nodes in a distributed manner. Nodes exchange information with surrounding nodes and estimate their locations on their own. As the computations are performed at node level, algorithms employing complex techniques cannot be run. These types of algorithms are expected to run with simple computations, using the minimal computational capabilities and storage space of a node.

Transmission Modalities
Based on the mode of transmission used, localization algorithms can be classified as range based and range free methods [25].
Range based methods
These techniques make use of distance or angle measurements between nodes to estimate the location of nodes. A few of the popular ranging measurements are based on RSS, TOA, TDOA and AOA. These methods are observed to provide higher accuracies. The drawback is that they need special high-cost hardware to estimate point-to-point distances and angles.
Range free methods
These techniques make use of connectivity information between nodes to estimate locations. DV-hop, APIT, centroid, and weighted centroid are a few of the popular range free techniques reported in the literature. Unlike range based methods, they do not need high-cost hardware on the nodes. These are simple methods with lower computational cost and power consumption. However, these advantages come at the cost of reduced accuracy.

Dimensionality
Based on the dimension of field considered for designing and evaluating localization algorithms, dimensionality can be 2D or 3D.
2D
In a 2D field, all the nodes are assumed to be deployed on a flat plane. Altitude information of the nodes is ignored for ease of computation.
3D
Here, the field is assumed to be more realistic, with different altitudes as in forests, mountains, buildings, etc., and nodes are deployed at different places in this field. Sensor node location estimation techniques are designed to estimate node locations in a three-coordinate system. However, 3D node localization is much more complex in terms of computations [48].

Network Type
Based on transmission range of nodes, networks can be classified as homogeneous and heterogeneous networks.
Homogeneous networks
Here, all the sensor nodes are assumed to have the same transmission range. The effects of differences in hardware configuration and battery status on transmission ranges are often ignored.
Heterogeneous networks
These types of networks consider sensor nodes with different transmission ranges. Usually, in a WSN, a few sensor nodes participate more in the communication process and their batteries get depleted quickly. Also, sensor nodes can have different hardware configurations. For these reasons, sensor nodes will have different transmission ranges, resulting in a heterogeneous network of nodes.

Technique Based Taxonomy Framework for Localization
In this section, a technique-based taxonomy framework to categorize localization algorithms designed for WSNs considering field irregularities is discussed. Using the taxonomy framework presented in Fig. 5, localization algorithms are classified into multiple categories. The criterion for classification is the discipline from which they have adopted ideas to solve the localization problem.

Optimization-Based Approaches
Optimization-based approaches make use of nature-inspired algorithms, which mimic nature to solve hard and complex problems. Nature exhibits extremely diverse, dynamic, robust, complex, and fascinating phenomena, and it always finds the optimal solution to its problems while maintaining perfect balance among its components. This is the thrust behind bio-inspired computing [62]. Different optimization techniques and the localization algorithms that make use of them are discussed here.
Fig. 5 Taxonomy of localization techniques for WSNs in irregular fields
Particle swarm optimization (PSO)
PSO models the social behavior of a flock of birds. It can be used for optimization of non-linear functions. It consists of a swarm of candidate solutions called particles, which explore an n-dimensional hyperspace in search of the global solution [63,64].
PSO is a popular algorithm used by many localization methods to improve the localization accuracies in anisotropic fields [47,54,56,59,65]. To reduce the effect of noise parameters, two PSO based localization algorithms namely Dimensionality based Particle Swarm Optimization (DPSO) and Hybrid Dimensionality based Particle Swarm Optimization (HDPSO) were developed by authors in [47]. These algorithms followed swarm based method by considering each dimension individually for particle deployment to obtain the optimized values. The personal best position value (pbest) within the swarm and global best (gbest) value for each dimension were calculated. The location coordinates were obtained by consolidating the gbest values. HDPSO differed from DPSO with its enhanced grouping strategy. Algorithms were evaluated in a 3D field of 20 m × 20 m × 20 m with random deployment of 100 reference nodes and 200 location unaware nodes. ALE was observed at 0.2511 m for DPSO and 0.1449 m for HDPSO. HDPSO attained improved localization accuracy, but with higher computational cost. In the evaluation of the derived algorithms, anisotropy was limited with noise percentage fixed at 0.02.
In [54], authors reported another PSO based localization algorithm called Node Segmentation with Improved Particle Swarm Optimization (NS-IPSO). This algorithm divides nodes into segments and then uses an enhanced PSO to improve the accuracy of the estimated distances between nodes. The results showed that the algorithm yields better accuracy in different shapes of fields.
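To show how a swarm of particles can search for a node position, a generic PSO sketch is given here. It is not DPSO, HDPSO, or NS-IPSO: it is a plain global-best PSO that minimizes the squared difference between measured and candidate distances to the reference nodes. The cost function, the parameter values (inertia weight 0.7, acceleration coefficients 1.5), and all names are assumptions made for illustration.

```python
import math
import random


def euclid(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def range_residual(pos, refs, dists):
    """Sum of squared differences between candidate and measured distances."""
    return sum((euclid(pos, r) - d) ** 2 for r, d in zip(refs, dists))


def pso_localize(refs, dists, bounds, n_particles=30, iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO returning (best position, best cost)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [range_residual(p, refs, dists) for p in pos]
    best = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[best][:], pbest_cost[best]
    for _ in range(iters):
        for i in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][k] = (w * vel[i][k]
                             + c1 * r1 * (pbest[i][k] - pos[i][k])
                             + c2 * r2 * (gbest[k] - pos[i][k]))
                lo, hi = bounds[k]
                # Keep the particle inside the deployment field.
                pos[i][k] = min(max(pos[i][k] + vel[i][k], lo), hi)
            cost = range_residual(pos[i], refs, dists)
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
    return tuple(gbest), gbest_cost


if __name__ == "__main__":
    refs = [(0.0, 0.0), (20.0, 0.0), (0.0, 20.0), (20.0, 20.0)]
    true_pos = (7.0, 12.0)
    dists = [euclid(true_pos, r) for r in refs]  # noise-free for illustration
    estimate, cost = pso_localize(refs, dists, [(0.0, 20.0), (0.0, 20.0)])
    print("estimate:", estimate, "cost:", cost)
```

In an anisotropic field the measured distances would carry the biases discussed earlier, and the algorithms reviewed above add further steps, such as per-dimension swarms or node segmentation, to cope with them.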
Bacterial foraging optimization (BFO)
The BFO algorithm is inspired by the social foraging behavior of the bacterium Escherichia coli, commonly abbreviated as E. coli. The main advantages of this optimization are parallel distributed processing, insensitivity to initial values, and global optimization [48,66].
A BFO based range free localization using fuzzy (RFBFO + Fuzzy) algorithm for anisotropic environment was developed in [48] by making use of RSS information between nodes, fuzzy logic system and BFO. The non-linearity induced by anisotropic environment between RSS and distance was overcome by modeling the edge weights between location unaware nodes and reference nodes using Mamdani type fuzzy model. Five membership functions VLOW, LOW, MEDIUM, HIGH, and VHIGH were used to map input variable RSS of fuzzy model. The edge weights were further optimized by conducting 50 independent trials. Weighted centroid method was used to estimate location of target nodes. The algorithm was evaluated in a 3D field of 150 m × 150 m × 150 m with 150 nodes, by considering DOI of 0.02. With ratio of reference nodes at 10%, ALE of 2.269 m was observed. Performance was not evaluated in the presence of network holes.
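The final step of such a scheme, the weighted centroid estimate, can be sketched as follows. The edge weights are taken as given inputs here; how they are derived from RSS by the fuzzy model and tuned by BFO in [48] is not reproduced, and the function name and example values are hypothetical.

```python
def weighted_centroid(refs, weights):
    """Weighted centroid of reference node positions.

    refs    : list of reference coordinates, e.g. [(x, y, z), ...]
    weights : one non-negative edge weight per reference node
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dims = len(refs[0])
    return tuple(
        sum(w * ref[k] for ref, w in zip(refs, weights)) / total
        for k in range(dims)
    )


if __name__ == "__main__":
    refs = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    # A larger weight pulls the estimate towards the first reference node.
    print(weighted_centroid(refs, [3.0, 1.0]))
```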

Invasive weed optimization (IWO)
This is a stochastic optimization algorithm inspired by colonizing weeds. It attempts to mimic the robustness, adaptation, and randomness of colonizing weeds in a simple and effective optimization algorithm [67].
A novel IWO based localization method using fuzzy logic system (RFIWO + Fuzzy), which is similar to the previously discussed RFBFO+Fuzzy was developed in [48]. This algorithm gave better localization accuracy than RFBFO+Fuzzy. But convergence rate of RFIWO+Fuzzy was reported to be slower than that of RFBFO+Fuzzy. Based on application demand, suitable algorithm needs to be chosen. The algorithm was evaluated in a 3D field of 150 m × 150 m × 150 m with 150 nodes, by considering DOI of 0.02. With ratio of reference nodes at 10%, ALE of 2.213 m was observed. Effect of network hole on the performance of the system was not discussed.
Firefly algorithm
The firefly algorithm is a stochastic optimization algorithm which is developed by idealizing some of the flashing characteristics of fireflies [68].
A NLOS node localization algorithm that utilizes the firefly algorithm was presented in [69]. This method was developed by considering prior error statistics of zero mean normal distribution for observation noise and exponential distribution for NLOS error. The objective function was derived according to the approximate maximum likelihood method. The non-linear objective function was solved by firefly algorithm. The algorithm was evaluated in a 2D field of 40 m × 40 m dimension with randomly deployed obstacles. Reference nodes were located at (0, 0), (0, 40), (40, 0), (40,40) and (10,30). With number of LOS measurements 3 and NLOS measurements 2, noise standard deviation at 5/m, root mean square error was observed at around 13 m.
In [23], authors presented a range-free 3D node localization method RFFA+ Fuzzy using the application of firefly algorithm. In a WSN with randomly distributed sensors, few location aware sensor nodes were used as reference nodes. The co-ordinates of the remaining sensor nodes were calculated by collecting RSS information received from reference sensor nodes and assigning edge weights based on RSS voltage. Edge weights were modeled using fuzzy logic system to reduce computational complexity and further optimized by firefly algorithm to minimize location error. The algorithm was evaluated by considering random deployment of 40 location unaware nodes and 20 reference nodes in a field of 10 L × 10 L × 10 L. Noise variance of 0.02 was considered. ALE of 0.0283 L was observed. The research can be enhanced by considering more realistic representation of irregular fields.

Machine Learning (ML) Based approaches
ML is an application of artificial intelligence that helps systems learn from experience and act without human intervention. These approaches help in solving localization problems by extracting useful information from the large amount of data collected by sensor nodes [70]. The application of different machine learning algorithms to localization is discussed here.

Fuzzy logic system (FLS)
In general, FLS is a nonlinear system that maps an input feature vector into a scalar output. FLS consists of four components: a fuzzifier, fuzzy rules, a fuzzy inference engine, and a defuzzifier [43,71].
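The fuzzifier, rules, inference engine, and defuzzifier can be made concrete with a small sketch. It assumes a single input (an RSS value normalized to [0, 1]) rather than a full feature vector, three hypothetical triangular fuzzy sets, one rule per set with a constant output, and a weighted-average defuzzifier. None of these choices is taken from [43,71]; they only show how the four parts fit together.

```python
def tri(x, left, peak, right):
    """Triangular membership: 0 at or beyond left/right, 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical fuzzy sets over a normalized RSS in [0, 1].
SETS = {"low": (-0.5, 0.0, 0.5), "medium": (0.0, 0.5, 1.0), "high": (0.5, 1.0, 1.5)}
# Hypothetical rule base: IF rss IS <set> THEN weight = <constant>.
RULE_OUTPUT = {"low": 0.1, "medium": 0.5, "high": 0.9}

def edge_weight(rss):
    """Map a normalized RSS value to a scalar edge weight."""
    rss = min(max(rss, 0.0), 1.0)
    # Fuzzifier: degree of membership of the crisp input in each fuzzy set.
    degrees = {name: tri(rss, *abc) for name, abc in SETS.items()}
    # Rules and inference: each rule fires with the degree of its antecedent set.
    fired = [(degrees[name], RULE_OUTPUT[name]) for name in SETS]
    # Defuzzifier: weighted average of the rule outputs gives the scalar result.
    total = sum(strength for strength, _ in fired)
    return sum(strength * out for strength, out in fired) / total

weights = [edge_weight(v) for v in (0.0, 0.25, 0.5, 0.75, 1.0)]
```

With these sets, an input of 0.25 belongs equally to "low" and "medium", so the output falls halfway between the two rule constants.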
FLS can be used to solve the localization problem in anisotropic fields [43,72]. In [43], the authors investigated the integration of two soft-computing techniques, FLS and Extreme Learning Machine (ELM), with the goal of enhancing localization precision under varying node densities, sensing coverage conditions, and irregular topologies. Centroid-based Fuzzy Logic (FL) was found to give better results for nodes with low signal coverage and for low node densities, whereas ELM gave more accurate estimations under high coverage and high node density. A hybrid scheme based on these two models, Hybrid Fuzzy Deep-ELM Localization (HF-DELM), was developed to improve estimation accuracy. First, centroid-based FL was enhanced to overcome the variations in RSS caused by irregular topologies by using signal and weight normalizations and FLS. ELM was enhanced by using deep learning in the training stage and was further enhanced by applying the concept of a force vector. By applying experimental design results to the FLS and constructing fuzzy rules, the fuzzy output was derived and used as the hybrid weight. The developed technique was evaluated in a 2D field of 100 m × 100 m with 300 nodes deployed randomly. With 15% of the nodes as reference nodes and a signal radius of 20 m, an ALE of 0.23 m was observed at a DOI of 0.02. The irregularity of the field was considered only in terms of DOI.
A centroid-based localization method, HVP-FELM, which hybridizes FLS and ELM, was developed in [73]. In this technique, efficiency in heterogeneous topologies was improved by compensating for the increased estimation error near the borders of the field or of a hole; a vector concept was applied to determine the proper direction in which to move the location approximation. In addition, PSO was used to determine the best solution under a given set of constraints. The performance was evaluated in a 2D field of 100 m × 100 m by considering scenarios with one hole and with five holes.

Artificial neural network (ANN)
ANNs are inspired by biological neural networks. ANNs consist of groups of interconnected artificial neurons. Researchers from many scientific disciplines have been using artificial neural networks to solve a variety of problems in pattern recognition, prediction, optimization, associative memory, control, etc. [74][75][76].
A neural-network-based localization algorithm called LPSONN was reported in [49], with both localization accuracy and storage overhead considered in the objective function. This is a centralized algorithm in which a head reference node collects hop count information from the network and trains a neural network using the received information. As the performance was found to vary with the number of neurons in the hidden layers, PSO was used to optimize the number of hidden-layer neurons in each module of the neural network. Performance was evaluated with 400 nodes deployed in a 2D field of 60 m × 60 m, in the presence of 5 network holes and for a 'C' shaped field. An ALE of around 3 m was observed at 15% reference node density. Performance of the algorithm for smaller obstacles causing irregular RPPs was not reported.
A range-free ANN-based localization technique was presented in [77]. This paper reports a distance estimation method in which the estimate depends solely on the idealistic transmission range of all nodes and the number of hops between any two nodes k and i. To account for the effects of irregular nodes, Multi-Layer Perceptron (MLP)-type feedforward backpropagation ANNs were used. The distances between reference nodes estimated with the new method were fed as input, and the true distances between reference nodes were used as the output. This model was then used to estimate the locations of other nodes. The developed algorithm was evaluated in a 2D field of 10 m × 10 m. At a node count of 200 and 10% reference node density, a normalized root mean square error of 0.6 m was observed when DOI was fixed at 0.06. Performance in the presence of network holes was not discussed.

Support vector machine (SVM)
SVM is a type of machine learning algorithm that is often used to solve both linear and nonlinear classification problems from training data. The idea of SVM is to find an optimal hyperplane in the feature space that separates the two classes of data with the largest margin [78,79].
A novel range-free localization approach based on Multidimensional Support Vector Regression (MSVR) was reported in [80]. In this work, the localization problem was mapped to a non-linear regression problem with multiple dimensions. The localization procedure began with all the sensors and reference nodes communicating with each other to collect connectivity measurement information. This information was forwarded to the sink node. The sink node estimated the optimal MSVR and broadcast its parameter vector to the sensor nodes, and the locations of all sensor nodes were estimated using these parameters. The developed technique was evaluated in a 2D field of 100 m × 100 m with 'X' and 'C' shapes and in a 3D field of 100 m × 100 m × 100 m with DOI at 0.14. With 200 nodes deployed, 25 of them reference nodes, a location estimation error of 3.7-5.6 m was observed for the 2D field and 11.1-14.8 m for the 3D field. Evaluation was not performed for irregular shaped 3D fields.

Kernel partial least squares (KPLS)
A kernel version of partial least squares, called KPLS, is commonly used to construct nonlinear regression models in possibly high-dimensional feature spaces. It is more suitable for moderately sized problems, with the advantages of simple implementation, low training cost, and easier setting of parameters [1].
A KPLS-based localization method called Location Estimation-Kernel Partial Least Squares (LE-KPLS) was developed in [1]. The offline phase consisted of training a model that maps hop counts to physical distances using the location-aware reference nodes. In the online phase, the trained model was used to estimate node locations from the hop counts between the location-unaware nodes and the reference nodes. In a 2D 'C' shaped field of 300 m × 300 m with 300 nodes and 28 reference nodes, a root mean square error of 3.95 m was observed. In a 3D regular shaped field with a DOI of 0.02, a root mean square error of 5.6 m was observed. The algorithm was not evaluated for irregular shaped 3D fields.

Cluster Based Approaches
Clustering-based approaches are widely used in applications with large volumes of data to group data instances of similar behavior. A few localization techniques make use of clustering techniques to estimate locations.

Segmentation based approaches
Network segmentation is used to solve optimization problem by dividing the network while minimizing or maximizing some given criteria or property [57,81].
A segmentation-based localization method, DisLoc, was developed in [57]. In this algorithm, the complex shaped network was first divided into several simple sub-networks by applying approximate convex partitioning. Then, each sub-network was accurately localized using a multidimensional scaling-based algorithm. In the last step, all the partitions were merged to create the global map of the network. Performance was evaluated on four topologies representative of real-world applications: a 5-shaped coal mine tunnel, a Fei-shaped topology representing a railway station, an H-shaped terminal building, and a C-shaped ordinary building entrance. At a network connectivity of 14, a localization error of about 59% was observed. Localization accuracy was improved along with lower computational overhead.

Data clustering based approaches
Here, few of the techniques that divide data into different groups based on some characteristic of data for improving the performance of localization accuracy are discussed.
The DV-maxHop localization scheme was reported in [82] to minimize localization errors while keeping the number of transmissions during localization to a minimum. In this algorithm, a control parameter, MaxHop, was introduced into the DV-Hop algorithm. The MaxHop parameter sets an upper limit on the number of hops over which reference node information is forwarded. Nodes considered only the position information from reference nodes that were within MaxHop hops. In an anisotropic environment, this excluded estimations from distant reference nodes, which would probably cause errors. It also reduced the number of transmissions between neighbors, resulting in faster convergence and lower energy cost. The value of MaxHop was preselected based on network density, anchor ratio, network shape, etc. Performance was evaluated in 'O', 'C', and 'S' shaped 2D fields. In a 'C' shaped field with 324 nodes, 10% of which had location information, an ALE of 3.99 m was observed. In a regular shaped field with a DOI of 0.4, an ALE of 3 m was observed. In the future, the algorithm can be enhanced and evaluated by considering both network holes and DOI in the same field.
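The effect of the MaxHop limit on which reference nodes a node hears can be sketched with a hop-limited breadth-first flood. This is an illustrative reconstruction of the flooding step only, not the implementation of [82]; the function names, the adjacency-list representation, and the six-node chain are hypothetical.

```python
from collections import deque

def limited_hop_counts(adjacency, anchor, max_hop):
    """Hop counts from one anchor to every node reachable within max_hop hops."""
    hops = {anchor: 0}
    queue = deque([anchor])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hop:
            continue  # the anchor's message is not forwarded beyond MaxHop
        for neighbor in adjacency[node]:
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops

def anchors_heard(adjacency, anchors, max_hop):
    """For each node, the anchors within max_hop hops and their hop counts."""
    table = {node: {} for node in adjacency}
    for anchor in anchors:
        for node, h in limited_hop_counts(adjacency, anchor, max_hop).items():
            if node != anchor:
                table[node][anchor] = h
    return table

# Hypothetical six-node chain 0-1-2-3-4-5 with anchors at both ends.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
table = anchors_heard(chain, anchors=[0, 5], max_hop=3)
```

In this chain, node 1 keeps only anchor 0 (anchor 5 is four hops away), while nodes 2 and 3 keep both anchors. Messages are also dropped once they have travelled MaxHop hops, which is where the saving in transmissions comes from.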
A mixture of LOS and NLOS links creates a high probability of inaccurate estimations. To overcome this, the authors in [83] presented cloud-based self-organizing localization (cloud-based SOL). In this technique, all sensor nodes sent their neighbor node ID lists to a central node, which forwarded the lists to the cloud computing environment, where the localization algorithm ran. There, a virtual WSN was constructed using the neighbor node lists of all the nodes. Modification of the node location with any hop node was done by giving priority to the estimation of global geometry in the early stages and local geometry in later stages. Later, an angle-based judgment was used to detect a bent estimated topology. The algorithm was tested in a 2D field of 1 m × 1 m with an obstacle of 0.5 m × 0.4 m. 50 nodes were deployed with 3 reference nodes, and the coverage of each node was 0.2 m. An ALE of 0.25 m was observed. Performance evaluation was restricted to 2D fields with complex shapes.
A Heuristic Multidimensional Scaling (HMDS) algorithm to improve the accuracy of node localization in anisotropic WSNs with holes was developed in [84]. The nodes that communicate across the holes were identified, and the Euclidean distances between these nodes were recalculated using virtual nodes. Other nodes calculated Euclidean distances using Dijkstra shortest paths. These distance estimations, when applied to the Multidimensional Scaling algorithm, improved localization accuracy. The algorithm was evaluated in different irregular shapes such as semi 'C', 'O', multiple 'O', and concave fields. In a semi 'C' shaped 2D field of 100 m × 100 m with a 60 m × 60 m hole, 800 nodes were deployed with 5 nodes as reference nodes. At an average connectivity of 10.7, an ALE of 8 m was observed. The effect of smaller obstacles causing irregular RPPs was not discussed.
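Why distances across a hole need special treatment can be seen by comparing a Dijkstra shortest-path length with the true Euclidean distance. The sketch below covers only that comparison and does not reproduce the virtual-node correction of [84]. The seven-node 'C' shaped layout, the radio range of 1.1, and the function names are assumptions made for the example.

```python
import heapq
import math

def connectivity_graph(positions, radio_range):
    """Link every pair of nodes that lies within radio range of each other."""
    graph = {name: {} for name in positions}
    for u in positions:
        for v in positions:
            if u != v:
                d = math.dist(positions[u], positions[v])
                if d <= radio_range:
                    graph[u][v] = d
    return graph

def shortest_path_lengths(graph, source):
    """Dijkstra shortest-path length from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical nodes placed along a 'C'; the gap between A and G is the hole.
positions = {"A": (2, 0), "B": (1, 0), "C": (0, 0), "D": (0, 1),
             "E": (0, 2), "F": (1, 2), "G": (2, 2)}
graph = connectivity_graph(positions, radio_range=1.1)
path_len = shortest_path_lengths(graph, "A")["G"]        # detour around the hole
euclid = math.dist(positions["A"], positions["G"])       # straight-line distance
```

Here the shortest path from A to G is three times the straight-line distance, because the only route runs around the opening of the 'C'. Feeding such inflated distances directly into multidimensional scaling distorts the recovered layout.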
In [85,86], the authors developed Extended Kalman Filter Multidimensional Scaling (EKF-MDS). Distance estimation was obtained using the concept of a virtual node. Location coordinates were obtained by the Multidimensional Scaling-MAP (MDS-MAP) algorithm. The results were then refined using the Extended Kalman Filter (EKF) algorithm. In [85], tests were conducted in a 2D field of 100 m × 100 m with a 70 m × 70 m hole. When 300 nodes were deployed with 3 reference nodes, at an average connectivity of 6.28, an ALE of 10 m was observed. The research can be further enhanced by considering irregular RPPs and 3D fields.
Distributed Hybrid Particle/FIR Filtering (DHPFF), based on distributed filtering to mitigate NLOS effects and localization failures, was developed in [60]. In this method, TOA measurements were distributed among several local hybrid particle finite impulse response filters for processing. Distributed filtering and data association techniques were used to separate reliable estimates from NLOS-affected estimates. The designed technique was observed to work correctly when one of the four measurements was affected by NLOS. Performance was evaluated in a 2D field of 20 m × 20 m. An obstacle of radius 2.5 m located at the center produced an ALE of 0.094 m, and an obstacle of radius 5 m produced an ALE of 2.86 m. However, in harsher situations with two or more NLOS receivers, DHPFF failed to produce an estimate, although it could recover from failures for the next position estimation.
To reduce the adverse influence of multipath effects on distance estimation, [87] reported an Optimal Multi-Channel Trilateration positioning algorithm (OMCT). This algorithm first uses an adaptive Kalman filter to remove the RSS measurement noise; the optimal node position estimates are then obtained from a multi-objective evolutionary algorithm. Improved localization results were observed under channel diversity. A few other algorithms that use data clustering techniques are the Modified Joint Probabilistic Data Association (MJPDA) algorithm [88] and the Enhanced Least-Square Algorithm based on improved Bayesian (ELSAB) [89]. MJPDA divides the measurements into LOS and NLOS classes using virtual points obtained by grouping the measurements, whereas ELSAB uses Bayesian classifiers for the measurement data. These algorithms classify the obtained data into different categories to improve position estimations.
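The RSS denoising step can be illustrated with a scalar Kalman filter. The sketch below is a plain filter with fixed noise variances, not the adaptive filter of [87]. The static-signal model, the variance values, and the sample readings are assumptions made for the example.

```python
def kalman_smooth(readings, process_var=1e-3, meas_var=4.0):
    """Smooth a sequence of RSS readings (dBm) with a scalar Kalman filter."""
    estimate = readings[0]      # initialise from the first reading
    error_var = meas_var        # initial uncertainty of the estimate
    smoothed = [estimate]
    for z in readings[1:]:
        # Predict: the RSS is modelled as nearly constant, so only the
        # uncertainty grows by the process variance.
        error_var += process_var
        # Update: blend the prediction and the new reading by the Kalman gain.
        gain = error_var / (error_var + meas_var)
        estimate += gain * (z - estimate)
        error_var *= (1.0 - gain)
        smoothed.append(estimate)
    return smoothed

# Hypothetical noisy RSS samples from a single link.
raw = [-60.0, -64.0, -58.0, -61.0, -59.0]
smooth = kalman_smooth(raw)
```

Each output is a convex combination of the previous estimate and the new reading, so the smoothed series always stays within the range of the raw samples. An adaptive variant would additionally re-estimate `meas_var` or `process_var` from the residuals as readings arrive.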

Comparative Analysis of Localization Techniques
A comparison of localization techniques is presented in Table 1, using the classification criteria discussed in Sect. 4.1 and the taxonomy framework in Sect. 4.2. Table 1 shows the characteristics of localization techniques developed especially for WSNs in irregular fields. From the table, it can be observed that different algorithms have considered different types of anisotropies. While a few algorithms have represented anisotropies by including a noise factor, some have considered different shapes of fields, and some have modeled node-level anisotropies using the DOI parameter. Hop-based techniques are the most popular among range-free methods, and RSS-based localization is the most popular among range-based methods. It can also be observed that a distributed network topology is the most popular approach for localization, because distributed algorithms reduce unnecessary communications to a central location and perform computations locally. This table can be used in the analysis of various localization methods.

Evaluation of WSN Localization Techniques
In summary, a number of localization algorithms have been developed to work efficiently in the presence of field irregularities. From the above description of existing algorithms, it can be observed that different algorithms have utilized different techniques to overcome the nonlinearity that irregular fields introduce into proximity measures. These algorithms work under the assumption that nodes do not move after deployment. Most of them are also developed under the assumption that all nodes have the same transmission range and that there are no hardware failures of nodes. The algorithms were evaluated in terms of localization accuracy by considering complex shaped fields, network holes, and irregular RPPs. Though many algorithms provide good location estimations in 2D fields, localization errors are large in the case of 3D fields. Performance has also been analyzed in terms of computation and communication complexities. However, the feasibility of implementing these algorithms on existing sensor nodes with limited computational capabilities remains an open question.

Conclusions
In this review of WSN localization algorithms in irregular fields, existing localization techniques designed for irregular fields were analyzed using the taxonomy framework. Furthermore, a comparison table was created to compare these techniques in terms of network topology, dimensionality, range measurement, and network type. From this review, the current research challenges and the important research areas to focus on can be summarized as follows.

Irregularities in the field
Even though a number of localization algorithms have been developed considering irregularities of the field, extensive research on this has not yet been reported. Algorithms need to be tested in a more realistic representation of fields. Existing algorithms try to model the nonlinearity that obstacles introduce into proximity information by measuring proximity between location-aware reference nodes in the field. They are evaluated either by considering a few network holes or by considering a fixed DOI. In a real scenario, however, the field is complex in shape, with network holes of different sizes and DOIs that vary across the field. For example, in a forest environment, small obstacles such as trees and small rocks cause irregular radio propagation patterns, whereas bigger obstacles such as large rocks and water bodies cause network holes. The communication environment causes different path losses in different parts of the field, so the degree of anisotropy varies from place to place. Developing a localization algorithm that accounts for varying types and degrees of irregularities in different parts of the field is an important research area to focus on.

Heterogeneous sensor nodes
Most of the existing localization techniques assume a homogeneous set of sensor nodes, i.e., sensors are assumed to have identical transmission ranges. However, the transmission ranges of any two nodes may vary due to differences in hardware configuration and battery status. For example, nodes that are closer to sink nodes carry more communication overhead in the case of multi-hop routing, and nodes in places where events occur more frequently participate more in data forwarding. These factors cause a few nodes to be used more frequently, thus reducing their battery power. Hence, it is more realistic to consider a heterogeneous combination of sensor nodes with different transmission ranges. In existing hop-based localization techniques, the hop length of a node with lower battery power may differ greatly from the average hop length, and in RSS-based localization, a node with reduced battery power may be mistaken for one affected by NLOS communication. Localization accuracy is greatly affected by heterogeneous sensor nodes. Hence, developing future localization techniques that consider a heterogeneous set of nodes in terms of transmission coverage will improve performance in practical use.

Fault analysis
Existing techniques are developed by making use of a few location-aware reference nodes. The system is modeled by measuring the actual and estimated distances between reference nodes using any of the measurement techniques, and the locations of other nodes are estimated based on this model. Hence, it is most important to model the system accurately. However, a few of the nodes used to model the system may turn faulty due to hardware malfunctions, rough environments, or security attacks. A more detailed analysis in this regard would provide a more stable localization technique that can work accurately even in the presence of a few faulty reference nodes. As other nodes also participate in the later phase of localization, fault analysis on other nodes also needs to be carried out.

Dimensionality
Many of the existing techniques focus mainly on a 2D plane. In real-life applications, nodes are often deployed in 3D space, for example in forests, buildings, and mountains. 3D node localization is more complex in terms of computation, while sensor nodes are devices with limited computational capability and power. Future research on localization can develop power-efficient algorithms for 3D environments that work within the limited computational capabilities of sensor nodes.