Distribution-Aware Compressed Full-Text Indexes

  • Paolo Ferragina
  • Jouni Sirén
  • Rossano Venturini
Conference paper

DOI: 10.1007/978-3-642-23719-5_64

Part of the Lecture Notes in Computer Science book series (LNCS, volume 6942)
Cite this paper as:
Ferragina P., Sirén J., Venturini R. (2011) Distribution-Aware Compressed Full-Text Indexes. In: Demetrescu C., Halldórsson M.M. (eds) Algorithms – ESA 2011. ESA 2011. Lecture Notes in Computer Science, vol 6942. Springer, Berlin, Heidelberg

Abstract

In this paper we address the problem of building a compressed self-index that, given a distribution for the pattern queries and a bound on the space occupancy, minimizes the expected query-time within that index-space bound. We solve this problem by exploiting a reduction to the problem of finding a minimum weight K-link path in a particular Directed Acyclic Graph. Interestingly enough, our solution is independent of the underlying compressed index in use. Our experiments compare this optimal strategy with several other standard approaches, showing its effectiveness in practice.
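To make the reduction target concrete, the following is a minimal sketch of a textbook dynamic program for the minimum weight K-link path problem. It is not the authors' algorithm, and it does not reproduce the paper's graph construction or edge weights. It assumes the nodes are numbered 0..n-1 in topological order, that every pair i < j is an edge, and that the edge cost comes from a caller-supplied function; the names `min_weight_k_link_path` and `weight` are hypothetical.

```python
from math import inf
from typing import Callable, List, Tuple


def min_weight_k_link_path(
    n: int, k: int, weight: Callable[[int, int], float]
) -> Tuple[float, List[int]]:
    """Cheapest path from node 0 to node n-1 that uses exactly k edges.

    Assumption (illustrative only): nodes are 0..n-1 in topological order
    and every pair i < j is an edge of cost weight(i, j).
    Runs in O(k * n^2) time with this naive recurrence.
    """
    # best[l][j] = minimum cost of reaching node j from node 0 with exactly l edges
    best = [[inf] * n for _ in range(k + 1)]
    # prev[l][j] = predecessor of j on that cheapest l-edge path
    prev = [[-1] * n for _ in range(k + 1)]
    best[0][0] = 0.0

    for l in range(1, k + 1):
        for j in range(1, n):
            for i in range(j):
                if best[l - 1][i] == inf:
                    continue  # node i is not reachable with l-1 edges
                cost = best[l - 1][i] + weight(i, j)
                if cost < best[l][j]:
                    best[l][j] = cost
                    prev[l][j] = i

    if best[k][n - 1] == inf:
        return inf, []  # no path with exactly k edges exists

    # Walk the predecessor table backwards to recover the path.
    path = [n - 1]
    node = n - 1
    for l in range(k, 0, -1):
        node = prev[l][node]
        path.append(node)
    path.reverse()
    return best[k][n - 1], path


# Toy example: 5 nodes, quadratic edge cost, exactly 2 links.
print(min_weight_k_link_path(5, 2, lambda i, j: (j - i) ** 2))  # (8.0, [0, 2, 4])
```

In this toy instance the number of links K plays the role of a budget: a larger K allows more intermediate nodes and a cheaper total path, which loosely mirrors trading index space for expected query time under a space bound.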

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Paolo Ferragina (1)
  • Jouni Sirén (2)
  • Rossano Venturini (3)
  1. Dept. of Computer Science, Univ. of Pisa, Italy
  2. Dept. of Computer Science, Univ. of Helsinki, Finland
  3. ISTI-CNR, Pisa, Italy
