Divide-and-Conquer Algorithm for Computing Set Containment Joins
A set containment join is a join between set-valued attributes of two relations, whose join condition is specified using the subset (⊆) operator. Set containment joins are used in a variety of database applications. In this paper, we propose a novel partitioning algorithm called Divide-and-Conquer Set Join (DCJ) for computing set containment joins efficiently. We show that the divide-and-conquer approach outperforms previously suggested algorithms over a wide range of data sets. We present a detailed analysis of DCJ and previously known algorithms and describe their behavior in an implemented testbed.
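To make the join condition concrete, the following is a minimal sketch of a set containment join computed by a naive nested loop. It is not the DCJ algorithm, which the abstract does not spell out; it only illustrates the subset predicate. The function name `containment_join_naive`, the relation layout (a dictionary from tuple identifier to set), and the sample data are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple


def containment_join_naive(
    r: Dict[str, FrozenSet[int]],
    s: Dict[str, FrozenSet[int]],
) -> List[Tuple[str, str]]:
    """Return every (r_id, s_id) pair whose R set is a subset of the S set.

    Hypothetical baseline: compares all |R| * |S| pairs, so it shows what
    the join computes, not how to compute it efficiently.
    """
    result = []
    for r_id, r_set in r.items():
        for s_id, s_set in s.items():
            # The join condition: r_set ⊆ s_set.
            if r_set <= s_set:
                result.append((r_id, s_id))
    return result


# Illustrative relations (invented data, not from the paper).
R = {"r1": frozenset({1, 2}), "r2": frozenset({3})}
S = {"s1": frozenset({1, 2, 3}), "s2": frozenset({2, 3})}

pairs = containment_join_naive(R, S)
print(pairs)  # [('r1', 's1'), ('r2', 's1'), ('r2', 's2')]
```

Here `r1` joins only with `s1` because element 1 is missing from `s2`, while `r2` is contained in both S sets. The quadratic number of subset tests in this baseline is the cost that partitioning algorithms such as DCJ aim to reduce.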