Abstract
We describe succinct and compact representations of the bidirectional bwt of a string s ∈ Σ* which provide increasing navigation power and a number of space-time tradeoffs. One such representation allows to extend a substring of s by one character from the left and from the right in constant time, taking O(|s| log |Σ|) bits of space. We then match the functions supported by each representation to a number of algorithms that traverse the nodes of the suffix tree of s, exploiting connections between the bwt and the suffix-link tree. This results in near-linear time algorithms for many sequence analysis problems (e.g. maximal unique matches), for the first time in succinct space.
This work was partially supported by Academy of Finland under grants 250345 (CoECGR) and 118653 (ALGODAN).
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Apostolico, A.: The myriad virtues of subword trees. Technical Report 85–540, Department of Computer Science, Purdue University (1985)
Apostolico, A., Bock, M.E., Lonardi, S.: Monotony of surprise and large-scale quest for unusual words. In: RECOMB 2002, pp. 22–31 (2002)
Apostolico, A., Bock, M.E., Lonardi, S., Xu, X.: Efficient detection of unusual words. J. Comput. Biol. 7(1-2), 71–94 (2000)
Belazzougui, D., Boldi, P., Pagh, R., Vigna, S.: Monotone minimal perfect hashing: searching a sorted table with o(1) accesses. In: SODA 2009, pp. 785–794 (2009)
Belazzougui, D., Navarro, G.: Alphabet-independent compressed text indexing. ACM Trans. Alg. (to appear, 2013)
Beller, T., Berger, K., Ohlebusch, E.: Space-efficient computation of maximal and supermaximal repeats in genome sequences. In: Calderón-Benavides, L., González-Caro, C., Chávez, E., Ziviani, N. (eds.) SPIRE 2012. LNCS, vol. 7608, pp. 99–110. Springer, Heidelberg (2012)
Breslauer, D.: An on-line string superprimitivity test. Inform. Process. Lett. 44(6), 345–347 (1992)
Ferragina, P., Manzini, G., Mäkinen, V., Navarro, G.: Compressed representations of sequences and full-text indexes. ACM T. Alg. 3(2), 20 (2007)
Fischer, J.: Optimal succinctness for range minimum queries. In: López-Ortiz, A. (ed.) LATIN 2010. LNCS, vol. 6034, pp. 158–169. Springer, Heidelberg (2010)
Fischer, J., Mäkinen, V., Navarro, G.: Faster entropy-bounded compressed suffix trees. Theor. Comput. Sci. 410(51), 5354–5364 (2009)
Gusfield, D.: Algorithms on strings, trees and sequences: computer science and computational biology. Cambridge University Press (1997)
Hoare, C.A.R.: Quicksort. The Computer Journal 5(1), 10–16 (1962)
Hon, W.-K., Sadakane, K.: Space-economical algorithms for finding maximal unique matches. In: Apostolico, A., Takeda, M. (eds.) CPM 2002. LNCS, vol. 2373, pp. 144–152. Springer, Heidelberg (2002)
Hon, W.-K., Sadakane, K., Sung, W.-K.: Breaking a time-and-space barrier in constructing full-text indices. SIAM J. Comput. 38(6), 2162–2178 (2009)
Kulekci, O., Vitter, J.S., Xu, B.: Efficient maximal repeat finding using the Burrows-Wheeler transform and wavelet tree. TCBB 9(2), 421–429 (2012)
Lam, T.W., Li, R., Tam, A., Wong, S., Wu, E., Yiu, S.: High throughput short read alignment via bi-directional BWT. In: BIBM 2009, pp. 31–36 (2009)
Li, H., Homer, N.: A survey of sequence alignment algorithms for next-generation sequencing. Brief. Bioinform. 11(5), 473–483 (2010)
Li, R., Yu, C., Li, Y., Lam, T.W., Yiu, S.-M., Kristiansen, K., Wang, J.: Soap2: An improved ultrafast tool for short read alignment. Bioinformatics 25(15), 1966–1967 (2009)
Ohlebusch, E., Gog, S., Kügel, A.: Computing matching statistics and maximal exact matches on compressed full-text indexes. In: Chavez, E., Lonardi, S. (eds.) SPIRE 2010. LNCS, vol. 6393, pp. 347–358. Springer, Heidelberg (2010)
Pinho, A.J., Ferreira, P.J.S.G., Garcia, S.P., Rodrigues, J.M.O.S.: On finding minimal absent words. BMC Bioinformatics 10(1), 137 (2009)
Raman, R., Raman, V., Satti, S.R.: Succinct indexable dictionaries with applications to encoding k-ary trees, prefix sums and multisets. ACM T. Alg. 3(4) (2007)
Russo, L.M.S., Navarro, G., Oliveira, A.L.: Dynamic fully-compressed suffix trees. In: Ferragina, P., Landau, G.M. (eds.) CPM 2008. LNCS, vol. 5029, pp. 191–203. Springer, Heidelberg (2008)
Russo, L.M.S., Navarro, G., Oliveira, A.L.: Fully compressed suffix trees. ACM Trans. Alg. 7(4), 53 (2011)
Sadakane, K.: Compressed suffix trees with full functionality. Theor. Comput. Syst. 41(4), 589–607 (2007)
Sadakane, K.: Succinct data structures for flexible text retrieval systems. J. Discrete Alg. 5(1), 12–22 (2007)
Sadakane, K., Navarro, G.: Fully-functional succinct trees. In: SODA 2010, pp. 134–149 (2010)
Schnattinger, T., Ohlebusch, E., Gog, S.: Bidirectional search in a string with wavelet trees. In: Amir, A., Parida, L. (eds.) CPM 2010. LNCS, vol. 6129, pp. 40–50. Springer, Heidelberg (2010)
Schnattinger, T., Ohlebusch, E., Gog, S.: Bidirectional search in a string with wavelet trees and bidirectional matching statistics. Inform. Comput. 213, 13–22 (2012)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Belazzougui, D., Cunial, F., Kärkkäinen, J., Mäkinen, V. (2013). Versatile Succinct Representations of the Bidirectional Burrows-Wheeler Transform. In: Bodlaender, H.L., Italiano, G.F. (eds) Algorithms – ESA 2013. ESA 2013. Lecture Notes in Computer Science, vol 8125. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40450-4_12
Download citation
DOI: https://doi.org/10.1007/978-3-642-40450-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40449-8
Online ISBN: 978-3-642-40450-4
eBook Packages: Computer ScienceComputer Science (R0)