
Weighted Burrows–Wheeler Compression


A weight-based dynamic compression method has recently been proposed that is especially suitable for encoding files with locally skewed distributions. Its main idea is to assign larger weights to symbols closer to the position currently being encoded, by means of an increasing weight function, rather than weighting every position in the text equally. A well-known transformation that tends to convert input files into files with a more skewed distribution is the Burrows–Wheeler Transform (BWT). This paper proposes applying the weighted approach to Burrows–Wheeler transformed files. It is shown that, for static and adaptive arithmetic coding, compression performance is unaffected by any permutation of the symbols, and hence in particular by the BWT. Nevertheless, empirical evidence is provided for the efficiency of combining the BWT with the weighted approach.
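The permutation argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names (`bwt`, `static_bits`, `weighted_bits`), the `\0` sentinel, and the exponential `decay` parameter are all assumptions made for illustration, and the weighted model here looks backward at already coded symbols, which may differ from the weight function used in the paper. The sketch computes ideal code lengths in bits rather than running an actual arithmetic coder.

```python
import math
from collections import Counter


def bwt(text, sentinel="\0"):
    """Naive BWT: last column of the sorted rotations of text + sentinel."""
    s = text + sentinel  # assumption: sentinel is absent from text and sorts first
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def static_bits(s):
    """Ideal static code length: each symbol costs -log2 of its frequency."""
    n = len(s)
    return -sum(c * math.log2(c / n) for c in Counter(s).values())


def weighted_bits(s, decay=0.9):
    """Ideal adaptive code length with exponentially decayed counts.

    Hypothetical model: after each symbol is coded, all accumulated counts
    are multiplied by `decay` and the coded symbol's count grows by one, so
    recent symbols weigh more. decay=1.0 gives plain adaptive coding with
    add-one smoothing.
    """
    alphabet = set(s)
    weight = dict.fromkeys(alphabet, 0.0)
    total, bits = 0.0, 0.0
    for ch in s:
        # Add-one smoothing keeps every probability strictly positive.
        p = (weight[ch] + 1.0) / (total + len(alphabet))
        bits -= math.log2(p)
        for a in alphabet:
            weight[a] *= decay
        weight[ch] += 1.0
        total = total * decay + 1.0
    return bits


if __name__ == "__main__":
    text = "abracadabra" * 20
    original, transformed = text + "\0", bwt(text)
    for name, s in (("original", original), ("BWT", transformed)):
        print(name,
              round(static_bits(s), 1),
              round(weighted_bits(s, decay=1.0), 1),
              round(weighted_bits(s, decay=0.9), 1))
```

In this sketch the static cost and the unweighted adaptive cost (`decay=1.0`) depend only on the symbol counts, so they agree for the original and the transformed string up to floating-point rounding. Only the weighted cost (`decay < 1`) is sensitive to symbol order, which is where the local skew produced by the BWT can matter.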


Data availability

All data generated or analysed during this study are included in this published article.





Author information


Corresponding author

Correspondence to Dana Shapira.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information


This article is part of the topical collection “String Processing and Combinatorial Algorithms” guest edited by Simone Faro.


About this article


Cite this article

Fruchtman, A., Gross, Y., Klein, S.T. et al. Weighted Burrows–Wheeler Compression. SN COMPUT. SCI. 4, 265 (2023).



  • Adaptive compression
  • Huffman code
  • Arithmetic code
  • Burrows-Wheeler Transform