Grammar Precompression Speeds Up Burrows–Wheeler Compression
Text compression algorithms based on the Burrows–Wheeler transform (BWT) typically achieve a good compression ratio but are slow compared to Lempel–Ziv-type compression algorithms. The main culprit is the time needed to compute the BWT during compression and its inverse during decompression. We propose to speed up BWT-based compression by performing a grammar-based precompression before the transform. The idea is to reduce the amount of data that the BWT and its inverse have to process. We have developed a very fast grammar precompressor using pair replacement. Experiments show a substantial speedup in practice without a significant effect on compression ratio.
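To illustrate the general idea of pair replacement, the following is a minimal sketch and not the precompressor described in the paper. It assumes a simple greedy scheme in which the most frequent adjacent pair of symbols is repeatedly replaced by a fresh nonterminal. The function names `pair_replace` and `expand` and the parameter `min_count` are hypothetical, chosen only for this example. The shortened sequence is what a subsequent BWT stage would have to process.

```python
from collections import Counter


def pair_replace(data: bytes, min_count: int = 2):
    """Greedy pair replacement (illustrative sketch only).

    Returns the shortened symbol sequence and the grammar rules,
    where each rule maps a new symbol (>= 256) to the pair it replaces.
    """
    seq = list(data)          # terminals are byte values 0..255
    rules = {}
    next_symbol = 256         # first nonterminal
    while len(seq) >= 2:
        # Count adjacent pairs (overlapping occurrences are counted too).
        counts = Counter(zip(seq, seq[1:]))
        pair, freq = counts.most_common(1)[0]
        if freq < min_count:
            break
        # Replace non-overlapping occurrences left to right.
        out = []
        i = 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(next_symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        rules[next_symbol] = pair
        next_symbol += 1
        seq = out
    return seq, rules


def expand(seq, rules) -> bytes:
    """Invert pair_replace by expanding nonterminals with a stack."""
    out = bytearray()
    stack = list(reversed(seq))
    while stack:
        symbol = stack.pop()
        if symbol in rules:
            left, right = rules[symbol]
            stack.append(right)   # push right first so left is expanded first
            stack.append(left)
        else:
            out.append(symbol)
    return bytes(out)


text = b"abracadabra abracadabra abracadabra"
seq, rules = pair_replace(text)
assert expand(seq, rules) == text
print(len(text), "symbols reduced to", len(seq), "with", len(rules), "rules")
```

This sketch recounts all pairs in every round, so it is far slower than a practical precompressor. It is meant only to show how replacing frequent pairs shrinks the input handed to the transform while remaining exactly invertible.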
Keywords: Compression Ratio, Compression Algorithm, Compression Rate, Pair Replacement, Alphabet Size