Data Mining and Knowledge Discovery, Volume 4, Issue 2–3, pp 193–216

Spatial Data Mining: Database Primitives, Algorithms and Efficient DBMS Support

  • Martin Ester
  • Alexander Frommelt
  • Hans-Peter Kriegel
  • Jörg Sander

Abstract

Spatial data mining algorithms depend heavily on the efficient processing of neighborhood relations, since the neighbors of many objects have to be investigated in a single run of a typical algorithm. Therefore, providing general concepts for neighborhood relations, together with an efficient implementation of these concepts, allows a tight integration of spatial data mining algorithms with a spatial database management system. This speeds up both the development and the execution of spatial data mining algorithms. In this paper, we define neighborhood graphs and paths and a small set of database primitives for their manipulation. We show that typical spatial data mining algorithms are well supported by the proposed basic operations. For finding significant spatial patterns, only certain classes of paths “leading away” from a starting object are relevant. We discuss filters that admit only such neighborhood paths, which significantly reduces the search space for spatial data mining algorithms. Furthermore, we introduce neighborhood indices to speed up the processing of our database primitives. We implemented the database primitives on top of a commercial spatial database management system. The effectiveness and efficiency of the proposed approach were evaluated using an analytical cost model and an extensive experimental study on a geographic database.
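The notions of neighborhood graph, neighborhood path and path filter named in the abstract can be illustrated with a small sketch. This is not the paper's formal definition or its DBMS implementation: it assumes point objects, a neighbor relation "within distance eps", and a hypothetical "leading away" filter that only accepts a next object lying farther from the start than the current end of the path. All function names are illustrative.

```python
from math import dist


def neighborhood_graph(objects, eps):
    """Map each object id to the ids of all other objects within distance eps.

    Assumption: the neighbor relation is a plain distance threshold.
    """
    return {
        a: [b for b in objects if b != a and dist(objects[a], objects[b]) <= eps]
        for a in objects
    }


def leads_away(objects, path, candidate):
    """Hypothetical filter: candidate must be farther from the start than the path's end."""
    start, end = objects[path[0]], objects[path[-1]]
    return dist(start, objects[candidate]) > dist(start, end)


def extend(graph, objects, paths, accept=leads_away):
    """Extend each path by one neighbor, keeping cycle-free extensions that pass the filter."""
    return [
        path + [n]
        for path in paths
        for n in graph[path[-1]]
        if n not in path and accept(objects, path, n)
    ]


def neighborhood_paths(graph, objects, start, max_length):
    """Collect all filtered paths that begin at start and have up to max_length edges."""
    frontier, result = [[start]], []
    for _ in range(max_length):
        frontier = extend(graph, objects, frontier)
        result.extend(frontier)
    return result


# Toy data: four points in the plane (made up for illustration).
objects = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (1, 1)}
graph = neighborhood_graph(objects, eps=1.0)
paths = neighborhood_paths(graph, objects, "a", max_length=2)
print(paths)
```

In this sketch the filter prunes any extension that moves back toward the starting object, so the number of candidate paths grows more slowly than with unrestricted path extension. Computing `neighborhood_graph` once and reusing it loosely mirrors the idea of materializing neighborhood relations in an index, although the indices proposed in the paper are not reproduced here.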

mining spatial data, database primitives for KDD

Copyright information

© Kluwer Academic Publishers 2000

Authors and Affiliations

  • Martin Ester (1)
  • Alexander Frommelt (2)
  • Hans-Peter Kriegel (3)
  • Jörg Sander (4)
  1. Institute for Computer Science, University of Munich, München, Germany
  2. Institute for Computer Science, University of Munich, München, Germany
  3. Institute for Computer Science, University of Munich, München, Germany
  4. Institute for Computer Science, University of Munich, München, Germany
