Abstract

Efficient clustering in dynamic spatial databases is an open problem with many potential applications. Most traditional spatial clustering algorithms are inadequate because they do not efficiently support incremental clustering. In this paper, we propose DClust, a novel clustering technique for dynamic spatial databases. DClust provides a multi-resolution view of the clusters, generates arbitrarily shaped clusters in the presence of noise, produces clusters that are insensitive to the ordering of the input data, and supports incremental clustering efficiently. DClust uses a density criterion that captures arbitrary cluster shapes and sizes to select a number of representative points, and builds the Minimum Spanning Tree (MST) of these representative points, called the R-MST. After the initial clustering, a summary of the cluster structure is built. This summary makes it possible to quickly localize the effect of data updates on the current set of clusters. Our experimental results show that DClust outperforms existing spatial clustering methods such as DBSCAN, C2P, DENCLUE, Incremental DBSCAN and BIRCH in terms of clustering time and accuracy of the clusters found.
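The abstract does not spell out how the representative points are chosen or how clusters are read off the R-MST, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. The function names, the `radius` and `min_neighbors` parameters, the greedy selection rule, and the edge-length cutoff `max_edge` are all hypothetical. Unlike DClust as described, this greedy selection depends on the order of the input points.

```python
import math
from collections import defaultdict

def select_representatives(points, radius, min_neighbors):
    """Greedy density-based selection (an assumed rule, not the paper's).

    A point becomes a representative if it has at least min_neighbors
    other points within radius and no earlier representative lies
    within radius of it. Points in sparse regions are skipped as noise.
    """
    reps = []
    for p in points:
        neighbors = sum(
            1 for q in points if q is not p and math.dist(p, q) <= radius
        )
        if neighbors < min_neighbors:
            continue  # sparse region: treated as noise
        if all(math.dist(p, r) > radius for r in reps):
            reps.append(p)
    return reps

def minimum_spanning_tree(nodes):
    """Prim's algorithm on the complete Euclidean graph over nodes.

    Returns a list of (i, j, weight) edges indexed into nodes.
    """
    n = len(nodes)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge into the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(nodes[u], nodes[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges

def clusters_from_mst(n, edges, max_edge):
    """Drop MST edges longer than max_edge; return connected components."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j, w in edges:
        if w <= max_edge:
            parent[find(i)] = find(j)
    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return list(groups.values())

# Two dense 2x2 blocks plus one isolated point (illustrative data only).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 50)]
reps = select_representatives(points, radius=1.1, min_neighbors=2)
mst = minimum_spanning_tree(reps)
clusters = clusters_from_mst(len(reps), mst, max_edge=5.0)
print(reps)
print(clusters)
```

In this sketch, changing `max_edge` changes how coarsely the representatives are grouped, which is one simple way a tree over representatives could expose clusters at several resolutions. Whether DClust obtains its multi-resolution view this way is not stated in the abstract.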


References

  • Berchtold, S., Keim, D.A., and Kriegel, H. (1996). The X-tree: An Index Structure for High-Dimensional Data. In Proc. 22nd International Conference on Very Large Data Base (VLDB’96) (pp. 28–39). Mumbai, India.

  • Can, F. (1993). Incremental Clustering for Dynamic Information Processing. ACM Transactions on Information Systems, 11(2), 143–164.


  • Ester, M., Kriegel, H.-P., Sander, J., and Xu, X. (1996). A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proc. 2nd International Conference on Knowledge Discovery and Data Mining (KDD’96) (pp. 226–231). Portland, USA.

  • Ester, M., Kriegel, H.-P., Sander, J., Wimmer, M., and Xu, X. (1998). Incremental Clustering for Mining in a Data Warehouse Environment. In Proc. 24th International Conference on Very Large Data Base (VLDB’98) (pp. 323–333). New York, USA.

  • Fisher, D.H. (1987). Knowledge Acquisition via Incremental Conceptual Clustering. Machine Learning, 2(2), 139–172.


  • Ganti, V., Gehrke, J., and Ramakrishnan, R. (2001). DEMON: Mining and Monitoring Evolving Data. IEEE Transactions on Knowledge and Data Engineering, 13(1).

  • Ganti, V., Ramakrishnan, R., Gehrke, J., Powell, A., and French, J. (1999). Clustering Large Datasets in Arbitrary Metric Spaces. In Proc. 15th International Conference on Data Engineering (ICDE’99) (pp. 502–511). Sydney, Australia.

  • Guha, S., Rastogi, R., and Shim, K. (1998). CURE: An Efficient Clustering Algorithm for Large Databases. In Proc. 1998 ACM SIGMOD International Conference on Management of Data (SIGMOD’98) (pp. 73–84). Seattle, WA, USA.

  • Hinneburg, A. and Keim, D.A. (1998). An Efficient Approach to Clustering in Large Multimedia Databases with Noise. In Proc. 4th International Conference on Knowledge Discovery and Data Mining (KDD’98) (pp. 58–65). New York City, USA.

  • MacQueen, J. (1967). Some Methods for Classification and Analysis of Multivariate Observations. In Proc. 5th Berkeley Symposium on Math, Statistics and Probability, vol. 1 (pp. 281–297).

  • Nanopoulos, A., Theodoridis, Y., and Manolopoulos, Y. (2001). C2P: Clustering Based on Closest Pairs. In Proc. 27th International Conference on Very Large Data Base (VLDB’01) (pp. 331–340). Roma, Italy.

  • Ng, R. and Han, J. (1994). Efficient and Effective Clustering Methods for Spatial Data Mining. In Proc. 20th International Conference on Very Large Data Base (VLDB’94) (pp. 144–155). Santiago, Chile.

  • O’Callaghan, L., Mishra, N., Meyerson, A., Guha, S., and Motwani, R. (2002). Streaming-Data Algorithms For High-Quality Clustering. In Proc. 18th International Conference on Data Engineering (ICDE’02) (pp. 685–694). San Jose, California, USA.

  • Sheikholeslami, G., Chatterjee, S., and Zhang, A. (1999). WaveCluster: A Wavelet based Clustering Approach for Spatial Data in Very Large Database. VLDB Journal, 8(3/4), 289–304.


  • Utgoff, P.E. (1989). Incremental Induction of Decision Trees. Machine Learning, 4, 161–186.


  • Wang, W., Yang, J., and Muntz, R. (1997). STING: A Statistical Information Grid Approach to Spatial Data Mining. In Proc. 23rd International Conference on Very Large Data Base (VLDB’97) (pp. 186–195). Athens, Greece.

  • Zhang, T., Ramakrishnan, R., and Livny, M. (1996). BIRCH: An Efficient Data Clustering Method for Very Large Databases. In Proc. 1996 ACM SIGMOD International Conference on Management of Data (SIGMOD’96) (pp. 103–114). Montreal, Canada.


Author information

Corresponding author

Correspondence to Mong Li Lee.


About this article

Cite this article

Zhang, J., Hsu, W. & Li Lee, M. Clustering in Dynamic Spatial Databases. J Intell Inf Syst 24, 5–27 (2005). https://doi.org/10.1007/s10844-005-0265-0


  • DOI: https://doi.org/10.1007/s10844-005-0265-0
