A Constant-Space Comparison-Based Algorithm for Computing the Burrows–Wheeler Transform

  • Maxime Crochemore
  • Roberto Grossi
  • Juha Kärkkäinen
  • Gad M. Landau
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7922)

Abstract

We introduce the problem of computing the Burrows–Wheeler Transform (\(\small\mathrm{BWT}\)) using just \(O(1)\) additional space. Our in-place algorithm does not need explicit storage for the suffix sort array and the output array, as typically required in previous work. It relies on the combinatorial properties of the \(\small\mathrm{BWT}\), and runs in \(O(n^2)\) time in the comparison model using \(O(1)\) extra memory cells, apart from the array of \(n\) cells storing the \(n\) characters of the input text. We also discuss some time-space trade-offs for the inverse algorithm to obtain the text from the given \(\small\mathrm{BWT}\).
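
For orientation, the conventional approach that the abstract contrasts against can be sketched as follows. This is not the paper's in-place algorithm: it is a naive reference version that builds an explicit suffix array and a separate output string, so it needs at least linear extra space. The function names, the `$` end-of-text sentinel, and the suffix-based (rather than circular-shift) formulation are assumptions made for illustration only.

```python
def bwt_naive(text: str, sentinel: str = "$") -> str:
    """Naive BWT via an explicit suffix array (NOT the in-place algorithm)."""
    # Assumption: a unique sentinel marks the end of the text.
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    # Explicit suffix array: exactly the extra storage an in-place method avoids.
    suffix_array = sorted(range(len(s)), key=lambda i: s[i:])
    # Each output character is the one preceding the sorted suffix (cyclically).
    return "".join(s[i - 1] for i in suffix_array)


def inverse_bwt_naive(bwt: str, sentinel: str = "$") -> str:
    """Recover the text from its BWT with the LF mapping (linear extra space)."""
    n = len(bwt)
    # A stable sort of positions by character reconstructs the first column.
    order = sorted(range(n), key=lambda i: bwt[i])
    # lf[i] is the row whose suffix starts with the character at bwt[i].
    lf = [0] * n
    for rank, i in enumerate(order):
        lf[i] = rank
    # Start at the row of the suffix consisting of the sentinel alone.
    row = lf[bwt.index(sentinel)]
    out = []
    for _ in range(n - 1):
        out.append(bwt[row])  # character preceding the current suffix
        row = lf[row]         # move to the suffix that starts one position earlier
    return "".join(reversed(out))


if __name__ == "__main__":
    transformed = bwt_naive("banana")
    print(transformed)
    print(inverse_bwt_naive(transformed))
```

Both helpers allocate arrays of size \(n\) (and the sort key copies suffixes), which is the kind of working storage the paper's \(O(1)\)-space result is designed to eliminate.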

Keywords

  • Comparison Model
  • Input Text
  • Circular Shift
  • Lossless Data Compression
  • String Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Maxime Crochemore (1)
  • Roberto Grossi (2)
  • Juha Kärkkäinen (3)
  • Gad M. Landau (4, 5)

  1. King’s College London, UK
  2. Dipartimento di Informatica, Università di Pisa, Italy
  3. Department of Computer Science, University of Helsinki, Finland
  4. Department of Computer Science, University of Haifa, Israel
  5. Department of Computer Science and Engineering, NYU-Poly, Brooklyn, USA
