The Enhanced Suffix Array and Its Applications to Genome Analysis

  • Mohamed Ibrahim Abouelhoda
  • Stefan Kurtz
  • Enno Ohlebusch
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2452)

Abstract

In large-scale applications such as computational genome analysis, the space requirement of the suffix tree is a severe drawback. In this paper, we present a uniform framework that enables us to systematically replace every string processing algorithm based on a bottom-up traversal of a suffix tree with a corresponding algorithm based on an enhanced suffix array (a suffix array enhanced with the lcp-table). Within this framework, we show how maximal, supermaximal, and tandem repeats, as well as maximal unique matches, can be computed efficiently. Because enhanced suffix arrays require much less space than suffix trees, very large genomes can now be indexed and analyzed, a task that was not feasible before. Experimental results demonstrate that our programs require not only less space but also much less time than other programs developed for the same tasks.
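To make the idea of an enhanced suffix array concrete, the following is a minimal Python sketch. It is not the authors' implementation. It builds a suffix array by naive sorting (an assumption made for brevity, and far slower than the constructions used in practice), computes the lcp-table with a standard linear-time scan, and then simulates a bottom-up suffix tree traversal with a stack over the lcp-table. All function names (`suffix_array`, `lcp_table`, `lcp_intervals`, `repeated_substrings`) are hypothetical, and the input is assumed to end with a unique sentinel character such as `$` that sorts before every other character.

```python
def suffix_array(s):
    """Start positions of the suffixes of s in lexicographic order.

    Naive sort-based construction, chosen here only for clarity.
    """
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_table(s, sa):
    """lcp[r] = length of the longest common prefix of the suffixes
    at sa[r-1] and sa[r]; lcp[0] is defined as 0."""
    n = len(s)
    rank = [0] * n
    for r, p in enumerate(sa):
        rank[p] = r
    lcp = [0] * n
    h = 0
    for p in range(n):  # process suffixes in text order
        r = rank[p]
        if r == 0:
            h = 0
            continue
        q = sa[r - 1]  # predecessor of suffix p in sorted order
        while p + h < n and q + h < n and s[p + h] == s[q + h]:
            h += 1
        lcp[r] = h
        if h > 0:
            h -= 1  # the next suffix shares at least h-1 characters
    return lcp


def lcp_intervals(lcp):
    """Enumerate (lcp_value, left, right) intervals bottom-up.

    Each interval corresponds to an internal node of the suffix tree;
    children are reported before their parents. The root interval
    (lcp value 0) is deliberately not reported in this sketch.
    """
    n = len(lcp)
    stack = [(0, 0)]  # (lcp value, left boundary); root stays on the stack
    out = []
    for i in range(1, n + 1):
        cur = lcp[i] if i < n else 0  # a final 0 flushes open intervals
        lb = i - 1
        while cur < stack[-1][0]:
            value, lb = stack.pop()
            out.append((value, lb, i - 1))
        if cur > stack[-1][0]:
            stack.append((cur, lb))
    return out


def repeated_substrings(s):
    """Repeated substrings of s with their occurrence counts,
    one per lcp-interval, in bottom-up order."""
    sa = suffix_array(s)
    result = []
    for value, lb, rb in lcp_intervals(lcp_table(s, sa)):
        start = sa[lb]
        result.append((s[start:start + value], rb - lb + 1))
    return result


if __name__ == "__main__":
    print(repeated_substrings("banana$"))
```

For the input `banana$`, the suffix array is `[6, 5, 3, 1, 0, 4, 2]` and the lcp-table is `[0, 0, 1, 3, 0, 0, 2]`; the sketch reports `ana`, `a`, and `na` in that order, so the child interval for `ana` is visited before its parent interval for `a`. The point of the design is that the stack over the lcp-table takes the place of the explicit tree, so no node or pointer structure is stored.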

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Mohamed Ibrahim Abouelhoda (1)
  • Stefan Kurtz (1)
  • Enno Ohlebusch (1)
  1. Faculty of Technology, University of Bielefeld, Bielefeld, Germany
