Divide-and-Conquer Algorithm for Computing Set Containment Joins

  • Sergey Melnik
  • Hector Garcia-Molina
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2287)

Abstract

A set containment join is a join between set-valued attributes of two relations, whose join condition is specified using the subset (⫅) operator. Set containment joins are used in a variety of database applications. In this paper, we propose a novel partitioning algorithm called Divide-and-Conquer Set Join (DCJ) for computing set containment joins efficiently. We show that the divide-and-conquer approach outperforms previously suggested algorithms over a wide range of data sets. We present a detailed analysis of DCJ and previously known algorithms and describe their behavior in an implemented testbed.
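To make the definition concrete, the following is a minimal sketch of the join's semantics as a naive nested-loop join. It is not the DCJ algorithm proposed in the paper, and the function name, the relation layout (lists of `(id, set)` pairs), and the sample data are all assumptions made for illustration.

```python
from typing import FrozenSet, Hashable, List, Tuple

# Hypothetical layout: each tuple is (tuple_id, set-valued attribute).
Relation = List[Tuple[str, FrozenSet[Hashable]]]


def set_containment_join(r: Relation, s: Relation) -> List[Tuple[str, str]]:
    """Return all (r_id, s_id) pairs whose R-side set is a subset of the S-side set.

    Naive nested-loop baseline: compares every pair, so it costs
    |R| * |S| subset tests. It only illustrates the join condition.
    """
    result = []
    for r_id, r_set in r:
        for s_id, s_set in s:
            if r_set <= s_set:  # subset test on Python sets
                result.append((r_id, s_id))
    return result


# Illustrative data, not taken from the paper.
R: Relation = [("r1", frozenset({1, 2})), ("r2", frozenset({3}))]
S: Relation = [("s1", frozenset({1, 2, 3})), ("s2", frozenset({2, 3}))]

print(set_containment_join(R, S))
# [('r1', 's1'), ('r2', 's1'), ('r2', 's2')]
```

Because every pair of tuples is compared, the cost of this baseline grows with the product of the two relation sizes, which is the kind of cost that partitioning algorithms such as DCJ aim to reduce.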

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Sergey Melnik (1)
  • Hector Garcia-Molina (1)

  1. Stanford University, USA
