, Volume 18, Issue 1, pp 157-179

B-tries for disk-based string management

Purchase on Springer.com

$39.95 / €34.95 / £29.95*

Rent the article at a discount

Rent now

* Final gross prices may vary according to local VAT.

Get Access

Abstract

A wide range of applications require that large quantities of data be maintained in sort order on disk. The B-tree, and its variants, are an efficient general-purpose disk-based data structure that is almost universally used for this task. The B-trie has the potential to be a competitive alternative for the storage of data where strings are used as keys, but has not previously been thoroughly described or tested. We propose new algorithms for the insertion, deletion, and equality search of variable-length strings in a disk-resident B-trie, as well as novel splitting strategies which are a critical element of a practical implementation. We experimentally compare the B-trie against variants of B-tree on several large sets of strings with a range of characteristics. Our results demonstrate that, although the B-trie uses more memory, it is faster, more scalable, and requires less disk space.