Reference Work Entry

Encyclopedia of Database Systems

pp 2673-2674

Sort-Merge Join

  • Jingren ZhouAffiliated withMicrosoft Research

Synonyms

Merge join

Definition

The sort-merge join is a common join algorithm in database systems using sorting. The join predicate needs to be an equality join predicate. The algorithm sorts both relations on the join attribute and then merges the sorted relations by scanning them sequentially and looking for qualifying tuples.

Key Points

The sorting step groups all tuples with the same value in the join attribute together. Such groups are sorted based on the value in the join attribute so that it is easy to locate groups from the two relations with the same attribute value. Sorting operation can be fairly expensive. If the size of the relation is larger than the available memory, external sorting algorithm is required. However, if one input relation is already clustered (sorted) on the join attribute, sorting can be completely avoided. That is why the sort-merge join looks attractive if any of the input relations is sorted on the join attribute.

The merging step ...

This is an excerpt from the content