Algorithmica, Volume 67, Issue 4, pp 529–546

Distribution-Aware Compressed Full-Text Indexes

DOI: 10.1007/s00453-013-9782-3

Cite this article as:
Ferragina, P., Sirén, J. & Venturini, R. Algorithmica (2013) 67: 529. doi:10.1007/s00453-013-9782-3

Abstract

In this paper we address the problem of building a compressed self-index that, given a distribution for the pattern queries and a bound on the space occupancy, minimizes the expected query time within that index space bound. We solve this problem by exploiting a reduction to the problem of finding a minimum weight K-link path in a properly designed Directed Acyclic Graph. Interestingly enough, our solution can be used with any compressed index based on the Burrows-Wheeler transform. Our experiments compare this optimal strategy with several other known approaches, showing its effectiveness in practice.
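The minimum weight K-link path problem named in the abstract can be illustrated with a textbook dynamic program. The sketch below is a generic O(K·n²) formulation and not necessarily the algorithm used in the paper. The abstract does not describe how the graph or its edge weights are built, so the function name `min_weight_k_link_path`, the node numbering, and the `weight` callback are hypothetical placeholders.

```python
from math import inf


def min_weight_k_link_path(n, k, weight):
    """Cheapest path from node 0 to node n-1 that uses exactly k edges.

    Assumptions (illustrative, not taken from the paper):
    nodes are numbered 0..n-1 in topological order, and weight(i, j)
    returns the cost of edge i -> j for i < j (or inf if it is absent).
    Returns (cost, path); (inf, []) if no such path exists.
    """
    # best[l][j] = minimum cost of reaching node j from node 0 with l edges
    best = [[inf] * n for _ in range(k + 1)]
    # prev[l][j] = predecessor of j on that cheapest l-edge path
    prev = [[None] * n for _ in range(k + 1)]
    best[0][0] = 0

    for links in range(1, k + 1):
        for j in range(1, n):
            for i in range(j):
                if best[links - 1][i] == inf:
                    continue  # node i is not reachable with links-1 edges
                cost = best[links - 1][i] + weight(i, j)
                if cost < best[links][j]:
                    best[links][j] = cost
                    prev[links][j] = i

    if best[k][n - 1] == inf:
        return inf, []

    # Walk the predecessor table backwards to recover the path.
    path = [n - 1]
    j = n - 1
    for links in range(k, 0, -1):
        j = prev[links][j]
        path.append(j)
    path.reverse()
    return best[k][n - 1], path
```

For instance, with five nodes and a toy cost of `(j - i) ** 2` per edge, calling `min_weight_k_link_path(5, 2, lambda i, j: (j - i) ** 2)` selects the two-edge path through the middle node. In the indexing setting, the number of links would presumably be tied to the space bound and the edge weights to expected query cost, but that mapping is an assumption here.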

Keywords

Full-text indexing; Compressed full-text indexes; Succinct data structures; Dynamic programming

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Paolo Ferragina (1)
  • Jouni Sirén (2)
  • Rossano Venturini (1)

  1. Dipartimento di Informatica, University of Pisa, Pisa, Italy
  2. Department of Computer Science, University of Helsinki, Helsinki, Finland