Clustering in Dynamic Spatial Databases

Zhang, Ji; Hsu, Wynne; Lee, Mong Li

doi:10.1007/s10844-005-0265-0

Published: January 2005

Journal of Intelligent Information Systems, Volume 24, pages 5–27 (2005)

Abstract

Efficient clustering in dynamic spatial databases is currently an open problem with many potential applications. Most traditional spatial clustering algorithms are inadequate because they do not support incremental clustering efficiently. In this paper, we propose DClust, a novel clustering technique for dynamic spatial databases. DClust provides a multi-resolution view of the clusters, generates arbitrarily shaped clusters in the presence of noise, generates clusters that are insensitive to the ordering of the input data, and supports incremental clustering efficiently. DClust uses a density criterion that captures arbitrary cluster shapes and sizes to select a number of representative points, and builds the Minimum Spanning Tree (MST) of these representative points, called the R-MST. After the initial clustering, a summary of the cluster structure is built. This summary enables quick localization of the effect of data updates on the current set of clusters. Our experimental results show that DClust outperforms existing spatial clustering methods such as DBSCAN, C2P, DENCLUE, Incremental DBSCAN and BIRCH in terms of clustering time and accuracy of clusters found.
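
The abstract gives only the outline of the pipeline (density-based selection of representative points, then an MST over them), so the following is a minimal sketch of that general idea and not the DClust algorithm itself. Everything concrete in it is an assumption made for illustration: the grid-cell density test, the use of cell centroids as representatives, the edge-length cut, and the names `pick_representatives`, `mst_edges`, `clusters_from_mst`, `cell`, `min_pts` and `max_edge`.

```python
import math
from collections import defaultdict

def pick_representatives(points, cell=1.0, min_pts=3):
    """Assumed density criterion: keep the centroid of each grid cell
    that holds at least min_pts points; sparser cells count as noise."""
    cells = defaultdict(list)
    for x, y in points:
        cells[(math.floor(x / cell), math.floor(y / cell))].append((x, y))
    reps = []
    for members in cells.values():
        if len(members) >= min_pts:
            cx = sum(p[0] for p in members) / len(members)
            cy = sum(p[1] for p in members) / len(members)
            reps.append((cx, cy))
    return reps

def mst_edges(reps):
    """Prim's algorithm on the complete Euclidean graph over the
    representatives; returns a list of (i, j, length) tree edges."""
    n = len(reps)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known link into the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(reps[u], reps[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges

def clusters_from_mst(n, edges, max_edge):
    """Ignore tree edges longer than max_edge (an assumed cut rule);
    the remaining connected components are the clusters."""
    label = list(range(n))
    def find(i):
        while label[i] != i:
            label[i] = label[label[i]]  # path halving
            i = label[i]
        return i
    for i, j, w in edges:
        if w <= max_edge:
            label[find(i)] = find(j)
    return [find(i) for i in range(n)]

# Toy data: two dense regions plus one isolated noise point.
points = [
    (0.1, 0.1), (0.2, 0.3), (0.4, 0.4), (0.6, 0.2),
    (1.2, 0.3), (1.4, 0.5), (1.6, 0.6),
    (10.1, 10.2), (10.3, 10.4), (10.5, 10.5),
    (5.5, 5.5),
]
reps = pick_representatives(points, cell=1.0, min_pts=3)
edges = mst_edges(reps)
coarse = clusters_from_mst(len(reps), edges, max_edge=20.0)
fine = clusters_from_mst(len(reps), edges, max_edge=2.0)
print(len(set(coarse)), len(set(fine)))  # prints "1 2"
```

Because the tree is built once and only the cut threshold changes, the same structure yields coarser or finer groupings, which is one plausible reading of the multi-resolution view mentioned in the abstract. The incremental-update summary described in the abstract is not modeled here.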

References

Berchtold, S., Keim, D.A., and Kriegel, H. (1996). The X-tree: An Index Structure for High-Dimensional Data. In Proc. 22nd International Conference on Very Large Data Base (VLDB’96) (pp. 28–39). Mumbai, India.
Can, F. (1993). Incremental Clustering for Dynamic Information Processing. ACM Transactions on Information Systems, 11(2), 143–164.
Ester, M., Kriegel, H.-P., Sander, J., and Xu, X. (1996). A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proc. 2nd International Conference on Knowledge Discovery and Data Mining (KDD’96) (pp. 226–231). Portland, USA.
Ester, M., Kriegel, H.-P., Sander, J., Wimmer, M., and Xu, X. (1998). Incremental Clustering for Mining in a Data Warehouse Environment. In Proc. 24th International Conference on Very Large Data Base (VLDB’98) (pp. 323–333). New York, USA.
Fisher, D.H. (1987). Knowledge Acquisition via Incremental Conceptual Clustering. Machine Learning, 2(2), 139–172.
Ganti, V., Gehrke, J., and Ramakrishnan, R. (2001). DEMON: Mining and Monitoring Evolving Data. IEEE Transactions on Knowledge and Data Engineering, 13(1).
Ganti, V., Ramakrishnan, R., Gehrke, J., Powell, A., and French, J. (1999). Clustering Large Datasets in Arbitrary Metric Spaces. In Proc. 15th International Conference on Data Engineering (ICDE’99) (pp. 502–511). Sydney, Australia.
Guha, S., Rastogi, R., and Shim, K. (1998). CURE: An Efficient Clustering Algorithm for Large Databases. In Proc. 1998 ACM SIGMOD International Conference on Management of Data (SIGMOD’98) (pp. 73–84). Seattle, WA, USA.
Hinneburg, A. and Keim, D.A. (1998). An Efficient Approach to Clustering in Large Multimedia Databases with Noise. In Proc. 4th International Conference on Knowledge Discovery and Data Mining (KDD’98) (pp. 58–65). New York City, USA.
MacQueen, J. (1967). Some Methods for Classification and Analysis of Multivariate Observations. In Proc. 5th Berkeley Symposium on Math, Statistics and Probability, vol. 1 (pp. 281–297).
Nanopoulos, A., Theodoridis, Y., and Manolopoulos, Y. (2001). C2P: Clustering Based on Closest Pairs. In Proc. 27th International Conference on Very Large Data Base (VLDB’01) (pp. 331–340). Roma, Italy.
Ng, R. and Han, J. (1994). Efficient and Effective Clustering Methods for Spatial Data Mining. In Proc. 20th International Conference on Very Large Data Base (VLDB’94) (pp. 144–155). Santiago, Chile.
O’Callaghan, L., Mishra, N., Meyerson, A., Guha, S., and Motwani, R. (2002). Streaming-Data Algorithms For High-Quality Clustering. In Proc. 18th International Conference on Data Engineering (ICDE’02) (pp. 685–694). San Jose, California, USA.
Sheikholeslami, G., Chatterjee, S., and Zhang, A. (1999). WaveCluster: A Wavelet based Clustering Approach for Spatial Data in Very Large Database. VLDB Journal, 8(3/4), 289–304.
Utgoff, P.E. (1989). Incremental Induction of Decision Trees. Machine Learning, 4, 161–186.
Wang, W., Yang, J., and Muntz, R. (1997). STING: A Statistical Information Grid Approach to Spatial Data Mining. In Proc. 23rd International Conference on Very Large Data Base (VLDB’97) (pp. 186–195). Athens, Greece.
Zhang, T., Ramakrishnan, R., and Livny, M. (1996). BIRCH: An Efficient Data Clustering Method for Very Large Databases. In Proc. 1996 ACM SIGMOD International Conference on Management of Data (SIGMOD’96) (pp. 103–114). Montreal, Canada.

Author information

Authors and Affiliations

School of Computing, National University of Singapore, Singapore, 117543
Ji Zhang, Wynne Hsu & Mong Li Lee

Corresponding author

Correspondence to Mong Li Lee.

About this article

Cite this article

Zhang, J., Hsu, W. & Lee, M.L. Clustering in Dynamic Spatial Databases. J Intell Inf Syst 24, 5–27 (2005). https://doi.org/10.1007/s10844-005-0265-0

Received: 22 October 2002
Revised: 09 February 2004
Accepted: 23 February 2004
Issue Date: January 2005
DOI: https://doi.org/10.1007/s10844-005-0265-0
