Reference Work Entry

Encyclopedia of Database Systems

pp 1288-1289

Hash Join

  • Jingren Zhou (Microsoft Research)

Synonyms

Hash join

Definition

The hash join is a common join algorithm in database systems that uses hashing to match tuples. It requires the join predicate to be an equality predicate. The classic algorithm consists of two phases: the “build” phase and the “probe” phase. In the “build” phase, the algorithm builds a hash table on the smaller relation, say R, by applying a hash function to the join attribute of each tuple. In the “probe” phase, the algorithm applies the same hash function to the join attribute of each tuple of the larger relation, say S, and probes the hash table to find matches.
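The two phases can be illustrated with a minimal Python sketch. It assumes both relations are in-memory lists of tuples and that the join attribute is identified by its position in the tuple; the function name `hash_join` and the parameters `r_key` and `s_key` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def hash_join(r, s, r_key, s_key):
    """Join r (the smaller relation) with s on r[r_key] == s[s_key]."""
    # Build phase: hash every tuple of R on its join attribute.
    table = defaultdict(list)
    for r_tuple in r:
        table[r_tuple[r_key]].append(r_tuple)

    # Probe phase: look up each tuple of S in the hash table.
    result = []
    for s_tuple in s:
        for r_tuple in table.get(s_tuple[s_key], ()):
            result.append(r_tuple + s_tuple)
    return result


# Illustrative data: join on the first attribute of each relation.
R = [(1, "a"), (2, "b")]
S = [(1, "x"), (1, "y"), (3, "z")]
joined = hash_join(R, S, 0, 0)
```

Here Python's built-in dictionary plays the role of the hash table, so the hash function is applied implicitly. Only R is held in the table, which is why the smaller relation is chosen for the build phase.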

Key Points

The classic algorithm is simple, but it requires that the smaller join relation fit into memory. If there is not enough memory to hold all the tuples in R, an additional “partition” phase is required. There are several variants of the classic hash join algorithm. They differ in how they use memory and how they handle overflow.
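One way to picture a partition phase is the following Python sketch. It assumes both relations are split with the same hash function on the join attribute, so that matching tuples always land in partitions with the same index, and each pair of partitions can then be joined with the classic build and probe phases. The names `partition`, `partitioned_hash_join`, and `n_partitions` are hypothetical, and in-memory lists stand in for the partition files a real system would write to disk.

```python
from collections import defaultdict


def partition(relation, key, n_partitions):
    """Split a relation into n_partitions buckets by hashing the join attribute."""
    parts = [[] for _ in range(n_partitions)]
    for t in relation:
        parts[hash(t[key]) % n_partitions].append(t)
    return parts


def partitioned_hash_join(r, s, r_key, s_key, n_partitions=4):
    # Partition phase: the same hash function is applied to both relations.
    r_parts = partition(r, r_key, n_partitions)
    s_parts = partition(s, s_key, n_partitions)

    result = []
    # Join each pair of corresponding partitions independently.
    for r_part, s_part in zip(r_parts, s_parts):
        # Build phase on the partition of R (assumed to fit in memory).
        table = defaultdict(list)
        for r_tuple in r_part:
            table[r_tuple[r_key]].append(r_tuple)
        # Probe phase with the matching partition of S.
        for s_tuple in s_part:
            for r_tuple in table.get(s_tuple[s_key], ()):
                result.append(r_tuple + s_tuple)
    return result


# Illustrative data: join on the first attribute of each relation.
R = [(1, "a"), (2, "b"), (5, "c")]
S = [(1, "x"), (5, "y"), (5, "z"), (7, "w")]
joined = partitioned_hash_join(R, S, 0, 0)
```

The sketch assumes each partition of R is small enough for its hash table to fit in memory; it does not handle the overflow case in which a partition is still too large.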

Grace Hash Join. The idea behind Grace hash join is to hash partition both relations on the join attribute, using th ...
