Chapter

Combinatorial Pattern Matching

Volume 1264 of the series Lecture Notes in Computer Science pp 116-129

Direct construction of compact directed acyclic word graphs

  • Maxime Crochemore, Institut Gaspard Monge, Université de Marne-La-Vallée
  • Renaud Vérin, Institut Gaspard Monge, Université de Marne-La-Vallée

Abstract

The Directed Acyclic Word Graph (DAWG) is an efficient data structure to treat and analyze repetitions in a text, especially in DNA genomic sequences. Here, we consider the Compact Directed Acyclic Word Graph of a word. We give the first direct algorithm to construct it. It runs in time linear in the length of the string on a fixed alphabet. Our implementation requires half the memory space used by DAWGs.
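For readers unfamiliar with the structure, the following is a minimal sketch of the classical online construction of a (non-compact) DAWG, also known as a suffix automaton. It is not the direct CDAWG algorithm announced in the abstract, which is not described on this page. The names `build_dawg` and `contains` are hypothetical and chosen only for this illustration.

```python
def build_dawg(text):
    """Build the suffix automaton (DAWG) of `text` with the classical
    online algorithm. Illustrative sketch only, not the paper's method."""
    trans = [{}]    # trans[s][ch] -> target state
    link = [-1]     # suffix link of each state
    length = [0]    # length of the longest word reaching each state
    last = 0        # state that recognises the whole text read so far
    for ch in text:
        cur = len(trans)
        trans.append({})
        link.append(-1)
        length.append(length[last] + 1)
        p = last
        # Add the new transition along the suffix-link chain.
        while p != -1 and ch not in trans[p]:
            trans[p][ch] = cur
            p = link[p]
        if p == -1:
            link[cur] = 0
        else:
            q = trans[p][ch]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                # Split q: the clone takes over the shorter words.
                clone = len(trans)
                trans.append(dict(trans[q]))
                link.append(link[q])
                length.append(length[p] + 1)
                while p != -1 and trans[p].get(ch) == q:
                    trans[p][ch] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur
    return trans, link, length, last


def contains(trans, word):
    """Return True if `word` is a factor (substring) of the indexed text."""
    state = 0
    for ch in word:
        if ch not in trans[state]:
            return False
        state = trans[state][ch]
    return True
```

Every path from the initial state spells a factor of the text, so `contains` answers substring queries in time proportional to the query length. A compact DAWG would additionally remove non-terminal states with a single outgoing transition and label edges with factors of the text; that compaction step is not shown here.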

Keywords

pattern matching algorithm, suffix automaton, DAWG, Compact DAWG, suffix tree, index on text