An Extension of the Burrows Wheeler Transform and Applications to Sequence Comparison and Data Compression

  • Conference paper
Combinatorial Pattern Matching (CPM 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3537)

Abstract

We introduce a generalization of the Burrows-Wheeler Transform (BWT) that can be applied to a multiset of words. The extended transformation, denoted by E, is reversible and, unlike the BWT, also surjective. The transformation E allows us to define a distance between two sequences, which we apply here to the problem of whole mitochondrial genome phylogeny. Moreover, we give some considerations on compressing a set of words by using the transformation E as a preprocessing step.
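
The abstract does not spell out how E is computed, so the following is only a minimal sketch under stated assumptions. It assumes the commonly described formulation: take every rotation (conjugate) of every word in the multiset, sort the rotations by comparing their infinite repetitions, and output the last character of each sorted rotation together with the positions of the original words. The function name `ebwt`, the helper `omega_key`, and the choice to truncate comparisons at twice the longest word length are illustrative choices, not taken from the paper. The input words are assumed to be non-empty and primitive (not a power of a shorter word).

```python
def ebwt(words):
    """Sketch of an extended BWT on a list of non-empty, primitive words.

    Returns (output_string, indices), where indices are the 1-based
    positions of the original words in the sorted list of rotations.
    """
    max_len = max(len(w) for w in words)
    # Assumption: comparing prefixes of length 2 * max_len is enough to
    # order the infinite repetitions of any two rotations (Fine and Wilf).
    horizon = 2 * max_len

    def omega_key(rotation):
        # Prefix of rotation repeated forever, cut at the horizon.
        repeats = -(-horizon // len(rotation))  # ceiling division
        return (rotation * repeats)[:horizon]

    rows = []
    for w in words:
        for i in range(len(w)):
            rotation = w[i:] + w[:i]
            # i == 0 marks the rotation that is the original word itself.
            rows.append((omega_key(rotation), rotation, i == 0))

    # list.sort is stable, so equal keys keep their insertion order.
    rows.sort(key=lambda row: row[0])

    output = "".join(rotation[-1] for _, rotation, _ in rows)
    indices = [pos for pos, row in enumerate(rows, start=1) if row[2]]
    return output, indices


if __name__ == "__main__":
    print(ebwt(["abac", "cbab", "bca", "cba"]))
```

For a single word this reduces to the classical rotation-based BWT, since all rotations have the same length and the comparison is ordinary lexicographic order. The sketch is quadratic in memory for long inputs and is meant only to illustrate the idea, not to serve as an efficient implementation.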

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mantaci, S., Restivo, A., Rosone, G., Sciortino, M. (2005). An Extension of the Burrows Wheeler Transform and Applications to Sequence Comparison and Data Compression. In: Apostolico, A., Crochemore, M., Park, K. (eds) Combinatorial Pattern Matching. CPM 2005. Lecture Notes in Computer Science, vol 3537. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11496656_16

  • DOI: https://doi.org/10.1007/11496656_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26201-5

  • Online ISBN: 978-3-540-31562-9

  • eBook Packages: Computer Science, Computer Science (R0)
