Distributed Spatial Clustering in Sensor Networks

Abstract

Sensor networks monitor physical phenomena over large geographic regions. Scientists can gain valuable insight into these phenomena if they understand the underlying data distribution. Such data characteristics can be extracted efficiently through spatial clustering, which partitions the network into a set of spatial regions with similar observations. The goal of this paper is to perform such a spatial clustering, specifically δ-clustering, where the data dissimilarity between any two nodes inside a cluster is at most δ. We present an in-network clustering algorithm, ELink, that generates good δ-clusterings for both synchronous and asynchronous networks in \(O(\sqrt{N} \log N)\) time and with \(O(N)\) message complexity, where N denotes the network size. Experimental results on both real-world and synthetic data sets show that ELink's clustering quality is comparable to that of a centralized algorithm and superior to that of alternative distributed techniques. Furthermore, in terms of communication cost, ELink performs 10 times better than the centralized algorithm and 3-4 times better than the distributed alternatives. We also develop a distributed index structure, built from the generated clusters, that can be used to answer range queries and path queries. The query algorithms direct the spatial search to the relevant clusters, leading to performance gains of up to a factor of 5 over competing techniques.
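The δ-clustering condition stated in the abstract can be illustrated with a minimal sketch. This is not ELink itself; it only checks whether a given partition satisfies the pairwise bound. The function name `is_delta_clustering`, the scalar readings, and the use of absolute difference as the dissimilarity measure are assumptions made for illustration, and the sketch does not check that clusters are spatially contiguous.

```python
from itertools import combinations


def is_delta_clustering(clusters, readings, delta, dissimilarity=None):
    """Return True if every pair of nodes within a cluster differs by at most delta.

    clusters: iterable of node-id collections (hypothetical representation).
    readings: dict mapping node id -> observation.
    dissimilarity: assumed to be absolute difference unless supplied.
    """
    if dissimilarity is None:
        dissimilarity = lambda a, b: abs(a - b)
    for cluster in clusters:
        # Check the bound for every unordered pair inside the cluster.
        for u, v in combinations(cluster, 2):
            if dissimilarity(readings[u], readings[v]) > delta:
                return False
    return True


# Hypothetical temperature readings at four sensor nodes.
readings = {"a": 20.0, "b": 20.4, "c": 23.1, "d": 23.3}
good = [["a", "b"], ["c", "d"]]   # each pair differs by less than 0.5
bad = [["a", "b", "c"], ["d"]]    # "a" and "c" differ by about 3.1

valid_good = is_delta_clustering(good, readings, delta=0.5)
valid_bad = is_delta_clustering(bad, readings, delta=0.5)
```

A check of this kind examines every pair in each cluster, so it is quadratic in cluster size; it is meant only to make the definition concrete, not to reflect how clusters are built in-network.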