Abstract

Compression is most important when space is in short supply, so compression algorithms are often implemented in limited memory. Most analyses, however, treat memory constraints as an implementation detail, creating a gap between theory and practice. In this paper we consider the effect of memory limitations on compression algorithms. In the first part of the paper we assume the available memory is fixed and prove nearly tight upper and lower bounds on how much memory is needed to compress a string to close to its k-th order entropy. In the second part we assume the available memory grows (slowly) as more and more characters are read. In this setting we show that the rate of growth of the available memory determines the speed at which the compression ratio approaches the entropy. In particular, we establish a relationship between the rate of growth of the sliding window in the LZ77 algorithm and its convergence rate.
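
As an illustration of the quantity the abstract refers to, the sketch below computes a k-th order empirical entropy in Python. It assumes the standard definition from the compression literature, in which each length-k context contributes the zeroth-order entropy of the characters that follow it, weighted by how often it occurs; the paper may define the term differently, and the function names `zeroth_order_entropy` and `empirical_entropy` are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def zeroth_order_entropy(counts, total):
    """Entropy in bits per character of a distribution given as symbol counts."""
    return -sum((c / total) * log2(c / total) for c in counts.values())


def empirical_entropy(s, k):
    """k-th order empirical entropy of s, in bits per character.

    Assumed definition: H_k(s) = (1/n) * sum over length-k contexts w of
    |w_s| * H_0(w_s), where w_s collects the characters that follow w in s.
    """
    n = len(s)
    if n == 0:
        return 0.0
    if k == 0:
        return zeroth_order_entropy(Counter(s), n)

    # For every length-k context, count the characters that follow it.
    followers = defaultdict(Counter)
    for i in range(k, n):
        followers[s[i - k:i]][s[i]] += 1

    total_bits = 0.0
    for ctx_counts in followers.values():
        m = sum(ctx_counts.values())
        total_bits += m * zeroth_order_entropy(ctx_counts, m)
    return total_bits / n
```

For the string "abababab", for instance, the order-0 value is 1 bit per character, while the order-1 value is 0 because each character is fully determined by its predecessor.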

Keywords

Compression Ratio, Compression Algorithm, Kolmogorov Complexity, Input String, Arithmetic Code
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Travis Gagie (1)
  • Giovanni Manzini (1)
  1. Dipartimento di Informatica, Università del Piemonte Orientale
