Classification and Clustering for Knowledge Discovery

Volume 4 of the series Studies in Computational Intelligence pp 61-72


D-GridMST: Clustering Large Distributed Spatial Databases

  • Ji Zhang, Department of Computer Science, University of Toronto
  • Han Liu, Department of Computer Science, University of Toronto



In this paper, we propose a novel distributable clustering algorithm, Distributed-GridMST (D-GridMST for short), for large distributed spatial databases. D-GridMST partitions the data space with a grid and uses density criteria to extract representative points from the spatial databases; a global minimum spanning tree (MST) is then constructed over these representatives. This MST is partitioned according to the user's clustering specification and then used to label the data points in each distributed spatial database. D-GridMST is fast, has a low space requirement, and incurs little network transfer overhead. Experimental results show that D-GridMST is effective: it produces exactly the same clustering result as the centralized approach, making it a promising tool for clustering large distributed spatial databases.
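
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify the details, so several choices here are assumptions: the density criterion is a simple point-count threshold per grid cell (`min_points`), each dense cell is represented by its center, the user's clustering specification is a target number of clusters `k` obtained by dropping the longest MST edges, and points in sparse cells are labeled -1. All function names are hypothetical.

```python
from collections import defaultdict
from math import dist, floor

def grid_cell(point, cell_size):
    """Map a point to the integer index of the grid cell that contains it."""
    return tuple(floor(c / cell_size) for c in point)

def local_cell_counts(points, cell_size):
    """Per-site summary: how many local points fall in each cell.
    Only this summary (not the raw points) would be sent over the network."""
    counts = defaultdict(int)
    for p in points:
        counts[grid_cell(p, cell_size)] += 1
    return counts

def dense_cells(site_counts, min_points):
    """Merge site summaries; keep cells whose global count meets the
    (assumed) density threshold."""
    total = defaultdict(int)
    for counts in site_counts:
        for cell, n in counts.items():
            total[cell] += n
    return sorted(cell for cell, n in total.items() if n >= min_points)

def mst_edges(reps):
    """Prim's algorithm (O(n^2)) over representative points.
    Returns a list of (length, u, v) edges."""
    n = len(reps)
    in_tree, best, parent = [False] * n, [float("inf")] * n, [-1] * n
    edges = []
    if n:
        best[0] = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = dist(reps[u], reps[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def cut_into_clusters(n, edges, k):
    """Keep the n - k shortest MST edges (i.e. drop the k - 1 longest)
    and return a cluster label for each representative."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for _, a, b in sorted(edges)[:max(n - k, 0)]:
        parent[find(a)] = find(b)
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]

def cluster_sites(sites, cell_size, min_points, k):
    """Run the whole sketch; returns one label list per site."""
    summaries = [local_cell_counts(pts, cell_size) for pts in sites]
    cells = dense_cells(summaries, min_points)
    # Assumption: the representative of a dense cell is its center.
    reps = [tuple((i + 0.5) * cell_size for i in cell) for cell in cells]
    labels = cut_into_clusters(len(reps), mst_edges(reps), k)
    cell_label = dict(zip(cells, labels))
    # Each site labels its own points locally from the shared cell labels.
    return [[cell_label.get(grid_cell(p, cell_size), -1) for p in pts]
            for pts in sites]

site_a = [(0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (9.5, 9.5)]
site_b = [(0.4, 0.4), (1.3, 0.4), (9.1, 9.2), (9.6, 9.7), (5.5, 5.5)]
result = cluster_sites([site_a, site_b], cell_size=1.0, min_points=2, k=2)
print(result)
```

In this sketch, each site exchanges only per-cell counts and receives per-cell labels, which is one plausible reading of the small network transfer overhead claimed above. Because the cell counts are summed across sites before thresholding, the result does not depend on how the points are split among sites, which mirrors the claim that the distributed result matches the centralized one.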