Versatile Succinct Representations of the Bidirectional Burrows-Wheeler Transform

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 8125)

Abstract

We describe succinct and compact representations of the bidirectional BWT of a string s ∈ Σ* which provide increasing navigation power and a number of space-time tradeoffs. One such representation allows a substring of s to be extended by one character from the left or from the right in constant time, taking O(|s| log |Σ|) bits of space. We then match the functions supported by each representation to a number of algorithms that traverse the nodes of the suffix tree of s, exploiting connections between the BWT and the suffix-link tree. This results in near-linear time algorithms for many sequence analysis problems (e.g. maximal unique matches), for the first time in succinct space.
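To make the left and right extension operations concrete, here is a minimal sketch in Python of a naive bidirectional BWT index. It is not the succinct representation described in the paper: it stores both BWTs as plain strings and spends linear time per extension, where the paper achieves constant time. The class name `BidirectionalBWT`, its methods, and the `$` sentinel are assumptions made for illustration only. A state is a pair of half-open intervals: the suffix-array interval of a substring W in s$ and the interval of the reverse of W in reverse(s)$.

```python
class BidirectionalBWT:
    """Naive bidirectional BWT index (illustrative, not succinct)."""

    def __init__(self, s, sentinel="$"):
        # Assumption: the sentinel is smaller than every character of s.
        if any(ch <= sentinel for ch in s):
            raise ValueError("sentinel must be smaller than all characters of s")
        text = s + sentinel
        self.n = len(text)
        self.fwd = self._bwt(text)                # BWT of s$
        self.rev = self._bwt(s[::-1] + sentinel)  # BWT of reverse(s)$
        # C[c] = number of characters in the text that are smaller than c.
        self.C = {c: sum(x < c for x in text) for c in set(text)}

    @staticmethod
    def _bwt(t):
        # Sort suffixes explicitly (quadratic space; fine for a sketch).
        sa = sorted(range(len(t)), key=lambda i: t[i:])
        return "".join(t[i - 1] for i in sa)

    def start(self):
        """State for the empty string: both intervals cover everything."""
        return (0, self.n, 0, self.n)

    def _step(self, bwt, lo, hi, other_lo, c):
        # Backward-search step on `bwt`, plus the synchronized update
        # of the interval in the other direction.
        if c not in self.C:
            return (0, 0, 0, 0)
        seg = bwt[lo:hi]
        new_lo = self.C[c] + bwt[:lo].count(c)
        new_hi = new_lo + seg.count(c)
        # Occurrences extended by a smaller character sort before c.
        new_other_lo = other_lo + sum(x < c for x in seg)
        return (new_lo, new_hi, new_other_lo, new_other_lo + new_hi - new_lo)

    def extend_left(self, state, c):
        """From the state of W, return the state of cW."""
        lo, hi, rlo, _ = state
        return self._step(self.fwd, lo, hi, rlo, c)

    def extend_right(self, state, c):
        """From the state of W, return the state of Wc."""
        lo, _, rlo, rhi = state
        new_rlo, new_rhi, new_lo, new_hi = self._step(self.rev, rlo, rhi, lo, c)
        return (new_lo, new_hi, new_rlo, new_rhi)


index = BidirectionalBWT("abracadabra")
state = index.start()
state = index.extend_left(state, "r")   # "r"
state = index.extend_left(state, "b")   # "br"
state = index.extend_right(state, "a")  # "bra"
occurrences = state[1] - state[0]       # number of occurrences of "bra"
```

The width of either interval gives the number of occurrences of the current substring. Keeping the two intervals synchronized is what allows a search to switch direction at any step.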

Keywords

  • Maximal Repeat
  • Wavelet Tree
  • Short Read Alignment
  • Bidirectional Search
  • Absent Word

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

This work was partially supported by Academy of Finland under grants 250345 (CoECGR) and 118653 (ALGODAN).

  • DOI: 10.1007/978-3-642-40450-4_12
  • Chapter length: 12 pages
Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Belazzougui, D., Cunial, F., Kärkkäinen, J., Mäkinen, V. (2013). Versatile Succinct Representations of the Bidirectional Burrows-Wheeler Transform. In: Bodlaender, H.L., Italiano, G.F. (eds) Algorithms – ESA 2013. ESA 2013. Lecture Notes in Computer Science, vol 8125. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40450-4_12

  • DOI: https://doi.org/10.1007/978-3-642-40450-4_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40449-8

  • Online ISBN: 978-3-642-40450-4

  • eBook Packages: Computer Science, Computer Science (R0)