Fixed Block Compression Boosting in FM-Indexes

  • Juha Kärkkäinen
  • Simon J. Puglisi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7024)

Abstract

A compressed full-text self-index occupies space close to that of the compressed text and simultaneously allows fast pattern matching and random access to the underlying text. Among the best compressed self-indexes, in theory and in practice, are several members of the FM-index family. In this paper, we describe new FM-index variants that combine nice theoretical properties, simple implementation and improved practical performance. Our main result is a new technique called fixed block compression boosting, which is a simpler and faster alternative to optimal compression boosting and implicit compression boosting used in previous FM-indexes.
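As a rough illustration of the pattern matching an FM-index supports, the following minimal sketch counts pattern occurrences by backward search over the Burrows-Wheeler transform. It is not the fixed block compression boosting technique of the paper: the text is stored uncompressed, the suffix sort is naive, and rank queries use simple checkpoints every `block` positions. All function names and the `block` parameter are assumptions made for this example.

```python
def bwt(text):
    """Burrows-Wheeler transform via a naive suffix sort (short inputs only)."""
    s = text + "\0"  # sentinel assumed smaller than every character in text
    sa = sorted(range(len(s)), key=lambda i: s[i:])
    return "".join(s[i - 1] for i in sa)


def build_index(text, block=4):
    """Build an uncompressed, illustrative FM-index over text."""
    L = bwt(text)
    # C[c] = number of characters in L that are strictly smaller than c
    C, total = {}, 0
    for c in sorted(set(L)):
        C[c] = total
        total += L.count(c)
    # checkpoints[k][c] = occurrences of c in L[:k * block]
    checkpoints, running = [{}], {}
    for i, c in enumerate(L, 1):
        running[c] = running.get(c, 0) + 1
        if i % block == 0:
            checkpoints.append(dict(running))
    return L, C, checkpoints, block


def rank(index, c, i):
    """Number of occurrences of c in L[:i]."""
    L, _, checkpoints, block = index
    k = i // block
    # Start from the nearest checkpoint, then scan the rest of the block.
    return checkpoints[k].get(c, 0) + L[k * block:i].count(c)


def count(index, pattern):
    """Count occurrences of pattern in the indexed text by backward search."""
    L, C, _, _ = index
    lo, hi = 0, len(L)  # current suffix array interval [lo, hi)
    for c in reversed(pattern):
        if c not in C:
            return 0
        lo = C[c] + rank(index, c, lo)
        hi = C[c] + rank(index, c, hi)
        if lo >= hi:
            return 0
    return hi - lo


index = build_index("abracadabra")
print(count(index, "abra"))
```

Each pattern character costs two rank queries, so the speed and space of the rank structure over the transformed text largely determine the performance of the index; practical FM-indexes replace the plain string and checkpoints used here with compressed rank structures.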

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Juha Kärkkäinen (1)
  • Simon J. Puglisi (2)
  1. Department of Computer Science, University of Helsinki, Finland
  2. Department of Informatics, King’s College London, London, United Kingdom
