Relative FM-Indexes

  • Djamal Belazzougui
  • Travis Gagie
  • Simon Gog
  • Giovanni Manzini
  • Jouni Sirén
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8799)

Abstract

Intuitively, if two strings S1 and S2 are sufficiently similar and we already have an FM-index for S1 then, by storing a little extra information, we should be able to reuse parts of that index in an FM-index for S2. We formalize this intuition and show that it can lead to significant space savings in practice, as well as to some interesting theoretical problems.
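
As a rough illustration of this intuition (a minimal sketch under stated assumptions, not necessarily the construction used in the paper), suppose the index for S1 exposes the Burrows-Wheeler transform of S1, and that for S2 we store only the characters of its BWT that fall outside a common subsequence of the two BWTs, plus marks recording which positions are shared. A rank query on the BWT of S2 can then be answered from the BWT of S1 plus these leftovers. The names RelativeRank, extra1 and extra2 are hypothetical, difflib is used here only as a convenient way to find a common subsequence, and plain Python lists stand in for the compressed bitvectors a real index would use.

```python
from difflib import SequenceMatcher


def bwt(s):
    """Burrows-Wheeler transform via sorted rotations ('$' is the terminator)."""
    s += "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


class RelativeRank:
    """Answers rank/count queries for S2 using BWT(S1) plus the differences."""

    def __init__(self, bwt1, bwt2):
        self.bwt1 = bwt1  # assumed to be available from the existing index
        self.n = len(bwt2)
        # Mark the positions of each BWT that lie in a common subsequence.
        self.in1 = [False] * len(bwt1)
        self.in2 = [False] * len(bwt2)
        matcher = SequenceMatcher(None, bwt1, bwt2, autojunk=False)
        for blk in matcher.get_matching_blocks():
            for k in range(blk.size):
                self.in1[blk.a + k] = True
                self.in2[blk.b + k] = True
        # Characters outside the common subsequence (the "extra information").
        self.extra1 = "".join(c for c, m in zip(bwt1, self.in1) if not m)
        self.extra2 = "".join(c for c, m in zip(bwt2, self.in2) if not m)
        # C[c] = number of characters in BWT(S2) that are smaller than c.
        counts = {}
        for c in bwt2:
            counts[c] = counts.get(c, 0) + 1
        self.C, total = {}, 0
        for c in sorted(counts):
            self.C[c] = total
            total += counts[c]

    def rank(self, c, i):
        """Number of occurrences of c in the first i characters of BWT(S2)."""
        shared = sum(self.in2[:i])  # shared characters among the first i
        own = i - shared            # characters specific to BWT(S2)
        # Find the prefix of BWT(S1) holding exactly `shared` shared characters
        # (a linear scan here; a real index would use a select structure).
        j = seen = 0
        while seen < shared:
            seen += self.in1[j]
            j += 1
        in_shared = self.bwt1[:j].count(c) - self.extra1[:j - shared].count(c)
        return in_shared + self.extra2[:own].count(c)

    def count(self, pattern):
        """Backward search: number of occurrences of pattern in S2."""
        lo, hi = 0, self.n
        for c in reversed(pattern):
            if c not in self.C:
                return 0
            lo = self.C[c] + self.rank(c, lo)
            hi = self.C[c] + self.rank(c, hi)
            if lo >= hi:
                return 0
        return hi - lo


# Two similar strings that differ by a single substitution.
s1 = "GATTACAGATTACAGGATCCAGATTACA"
s2 = "GATTACAGATTGCAGGATCCAGATTACA"
rel = RelativeRank(bwt(s1), bwt(s2))
print(len(rel.extra2), "of", rel.n, "characters of BWT(S2) stored explicitly")
print("occurrences of ATTA in S2:", rel.count("ATTA"))
```

In this sketch the space devoted to S2 is the two mark arrays plus extra1 and extra2, which stay small when the two BWTs share a long common subsequence; how small they can be made in practice is the question the paper addresses.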

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Djamal Belazzougui (1, 2)
  • Travis Gagie (1, 2)
  • Simon Gog (3)
  • Giovanni Manzini (4)
  • Jouni Sirén (5)

  1. University of Helsinki, Finland
  2. Helsinki Institute for Information Technology, Finland
  3. Karlsruhe Institute of Technology, Germany
  4. University of Eastern Piedmont, Italy
  5. University of Chile, Chile
