Algorithms in Bioinformatics

Volume 2452 of the series Lecture Notes in Computer Science pp 449-463


The Enhanced Suffix Array and Its Applications to Genome Analysis

  • Mohamed Ibrahim Abouelhoda, Faculty of Technology, University of Bielefeld
  • Stefan Kurtz, Faculty of Technology, University of Bielefeld
  • Enno Ohlebusch, Faculty of Technology, University of Bielefeld

In large-scale applications such as computational genome analysis, the space requirement of the suffix tree is a severe drawback. In this paper, we present a uniform framework that enables us to systematically replace every string processing algorithm that is based on a bottom-up traversal of a suffix tree by a corresponding algorithm based on an enhanced suffix array (a suffix array enhanced with the lcp-table). In this framework, we show how maximal, supermaximal, and tandem repeats, as well as maximal unique matches, can be computed efficiently. Because enhanced suffix arrays require much less space than suffix trees, very large genomes can now be indexed and analyzed, a task that was not feasible before. Experimental results demonstrate that our programs require not only less space but also much less time than other programs developed for the same tasks.
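
To make the idea of a suffix array enhanced with the lcp-table concrete, the following is a minimal Python sketch, not the authors' implementation. It rests on several assumptions: the suffix array is built by naively sorting suffixes (far too slow for genomes, but easy to check), the lcp-table is filled with the Kasai et al. linear-time method, no sentinel character is appended to the text, and the function names `suffix_array`, `lcp_table`, and `lcp_intervals` are hypothetical. The last function illustrates how a stack can enumerate lcp-intervals, which play the role of the suffix tree's internal nodes, in bottom-up order.

```python
def suffix_array(text):
    """Start positions of all suffixes of text, in lexicographic order.

    Assumption: naive sort-based construction, for illustration only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_table(text, sa):
    """lcp[i] = length of the longest common prefix of the suffixes
    starting at sa[i-1] and sa[i]; lcp[0] is set to 0.

    Assumption: uses the Kasai et al. technique, chosen for brevity.
    """
    n = len(text)
    rank = [0] * n
    for pos, start in enumerate(sa):
        rank[start] = pos
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]  # suffix preceding suffix i in sorted order
            while i + h < n and j + h < n and text[i + h] == text[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1  # the next suffix shares at least h - 1 characters
        else:
            h = 0
    return lcp


def lcp_intervals(lcp):
    """Enumerate lcp-intervals as (lcp value, left, right), children first.

    Each interval corresponds to an internal node of the suffix tree;
    the root interval (value 0) is reported last.
    """
    n = len(lcp)
    result = []
    stack = [(0, 0)]  # entries are (lcp value, left boundary)
    for i in range(1, n + 1):
        cur = lcp[i] if i < n else -1  # -1 flushes the stack at the end
        lb = i - 1
        while stack and cur < stack[-1][0]:
            value, lb = stack.pop()
            result.append((value, lb, i - 1))
        if i < n and cur > stack[-1][0]:
            stack.append((cur, lb))
    return result
```

For the text `banana`, the intervals with a positive lcp value cover the suffix array ranges whose suffixes share the prefixes `ana`, `a`, and `na`, that is, the repeated substrings of the text. Repeat-finding algorithms of the kind the abstract lists would attach their own processing to each interval as it is popped from the stack.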