D-GridMST: Clustering Large Distributed Spatial Databases

Abstract

In this paper, we propose a novel distributable clustering algorithm, called Distributed-GridMST (D-GridMST for short), for large distributed spatial databases. D-GridMST uses a grid to partition the data space and applies density criteria to extract representative points from the spatial databases. A global minimum spanning tree (MST) is then constructed over these representatives. This MST is partitioned according to the user's clustering specification and is then used to label the data points in each distributed spatial database. D-GridMST is characterized by fast speed, a low space requirement, and small network transfer overhead. Experimental results show that D-GridMST is effective: it produces exactly the same clustering result as the centralized paradigm, making it a promising tool for clustering large distributed spatial databases.
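
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, since the abstract does not give the details. Everything specific here is an assumption made for illustration: the function names, the fixed `cell_size`, the `min_points` density threshold, the use of cell centres as representatives, and the reading of the user's clustering specification as a target number of clusters `k`, reached by cutting the `k - 1` longest MST edges. Each site sends only its per-cell counts to the coordinator and receives back a cell-to-cluster map, which is one plausible way to keep network transfer small.

```python
import math
from collections import defaultdict

def cell_of(point, cell_size):
    """Map a point to the integer coordinates of its grid cell."""
    return tuple(math.floor(c / cell_size) for c in point)

def local_cell_counts(points, cell_size):
    """Run at each site: count points per cell (the only data sent out)."""
    counts = defaultdict(int)
    for p in points:
        counts[cell_of(p, cell_size)] += 1
    return counts

def dense_representatives(all_site_counts, cell_size, min_points):
    """Coordinator: merge counts; keep the centre of each dense cell."""
    total = defaultdict(int)
    for counts in all_site_counts:
        for cell, n in counts.items():
            total[cell] += n
    return {cell: tuple((i + 0.5) * cell_size for i in cell)
            for cell, n in total.items() if n >= min_points}

def mst_edges(reps):
    """Prim's algorithm on the complete graph of representatives."""
    cells = sorted(reps)
    if not cells:
        return []
    # best[c] = (distance to the tree, nearest tree node) for nodes outside it
    best = {c: (math.dist(reps[cells[0]], reps[c]), cells[0])
            for c in cells[1:]}
    edges = []
    while best:
        c = min(best, key=lambda key: best[key][0])
        w, parent = best.pop(c)
        edges.append((w, parent, c))
        for other in best:
            d = math.dist(reps[c], reps[other])
            if d < best[other][0]:
                best[other] = (d, c)
    return edges

def cut_into_clusters(reps, edges, k):
    """Drop the k - 1 longest MST edges; label the remaining components."""
    kept = sorted(edges)[:max(len(edges) - (k - 1), 0)]
    parent = {c: c for c in reps}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for _, a, b in kept:
        parent[find(a)] = find(b)
    roots = {}
    return {c: roots.setdefault(find(c), len(roots)) for c in sorted(reps)}

def label_local(points, cell_size, cell_labels):
    """Run at each site: label points by their cell; -1 marks sparse cells."""
    return [cell_labels.get(cell_of(p, cell_size), -1) for p in points]
```

In this sketch the labels depend only on the merged cell counts, so labelling each site separately gives the same result as labelling the pooled data, which mirrors the equivalence with the centralized paradigm claimed in the abstract.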