Density-Based Clustering in Spatial Databases: The Algorithm GDBSCAN and Its Applications

Data Mining and Knowledge Discovery

Abstract

The clustering algorithm DBSCAN relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape as well as to distinguish noise. In this paper, we generalize this algorithm in two important directions. The generalized algorithm, called GDBSCAN, can cluster point objects as well as spatially extended objects according to both their spatial and their nonspatial attributes. In addition, four applications using 2D points (astronomy), 3D points (biology), 5D points (earth science) and 2D polygons (geography) are presented, demonstrating the applicability of GDBSCAN to real-world problems.
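The abstract does not spell out the two directions of generalization. In the usual description of GDBSCAN, the distance-based neighborhood of DBSCAN is replaced by an arbitrary reflexive and symmetric neighborhood predicate, and the simple count of neighbors is replaced by a weighted cardinality that is compared against a minimum weight. The following is a minimal Python sketch under those assumptions, not the authors' implementation. The names `gdbscan`, `npred`, `wcard` and `min_weight` are illustrative and do not appear in the text above, and the linear-scan neighborhood query stands in for the spatial index a real system would use.

```python
import math
from collections import deque

NOISE = -1           # label for objects that belong to no cluster
UNCLASSIFIED = None  # label for objects not yet visited


def gdbscan(objects, npred, wcard, min_weight):
    """Return one label per object: a cluster id (0, 1, ...) or NOISE.

    npred(a, b)  -> bool : neighborhood predicate (assumed reflexive, symmetric)
    wcard(objs)  -> float: weighted cardinality of a list of objects
    min_weight           : threshold a neighborhood must reach to be "dense"
    """
    n = len(objects)
    labels = [UNCLASSIFIED] * n

    def neighbors(i):
        # Linear scan over all objects (illustrative only).
        return [j for j in range(n) if npred(objects[i], objects[j])]

    def is_core(idx):
        return wcard([objects[j] for j in idx]) >= min_weight

    cluster_id = 0
    for i in range(n):
        if labels[i] is not UNCLASSIFIED:
            continue
        seeds = neighbors(i)
        if not is_core(seeds):
            labels[i] = NOISE  # may later be relabeled as a border object
            continue
        # Object i is a core object: grow a new cluster from it.
        labels[i] = cluster_id
        queue = deque(j for j in seeds if j != i)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border object, not expanded further
                continue
            if labels[j] is not UNCLASSIFIED:
                continue
            labels[j] = cluster_id
            nb = neighbors(j)
            if is_core(nb):
                queue.extend(k for k in nb
                             if labels[k] is UNCLASSIFIED or labels[k] == NOISE)
        cluster_id += 1
    return labels


# Plain DBSCAN as a special case: distance threshold plus neighbor count.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
dbscan_labels = gdbscan(points,
                        npred=lambda a, b: math.dist(a, b) <= 1.5,
                        wcard=len,
                        min_weight=3)

# A predicate mixing spatial and nonspatial attributes: objects are
# neighbors only if they are close AND share the same category.
typed = [(0, 0, "a"), (0, 1, "a"), (1, 0, "a"), (1, 1, "b")]
typed_labels = gdbscan(typed,
                       npred=lambda a, b: math.dist(a[:2], b[:2]) <= 1.5
                       and a[2] == b[2],
                       wcard=len,
                       min_weight=3)
```

With `wcard=len` and a distance-threshold predicate, the sketch behaves like ordinary DBSCAN. Swapping in a predicate such as polygon intersection, or a `wcard` that sums an attribute such as area, is what would let the same loop handle spatially extended objects and nonspatial attributes.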

About this article

Cite this article

Sander, J., Ester, M., Kriegel, HP. et al. Density-Based Clustering in Spatial Databases: The Algorithm GDBSCAN and Its Applications. Data Mining and Knowledge Discovery 2, 169–194 (1998). https://doi.org/10.1023/A:1009745219419

  • DOI: https://doi.org/10.1023/A:1009745219419
