Ziv-Lempel Compressors with Deferred-Innovation

  • Martin Cohn
Part of The Kluwer International Series in Engineering and Computer Science book series (SECS, volume 176)

Abstract

The noiseless data-compression algorithms introduced by Ziv and Lempel [ZL77, ZL78] parse an input data string into successive substrings, each consisting of two parts: the citation, namely the longest prefix that has appeared earlier in the input, and the innovation, the symbol immediately following the citation. Thus the citation has appeared earlier, but was not then followed by the innovation symbol. In “extremal” versions of the LZ algorithm the citation may have begun anywhere in the input; in “incremental” versions it must have begun a previously parsed substring. Originally the citation and the innovation were encoded, individually or jointly, into an output word to be transmitted or stored. Subsequently, several authors [MW85, ZL78, SS82, W84] speculated that the cost of this encoding might be excessive because the coded innovation contributes roughly lg(α) bits, where α is the size of the input alphabet, regardless of the compressibility of the source. To remedy the possible excess, these authors suggested storing the parsed substring as usual, but encoding for output only the citation, and deferring the encoding of the innovation so that it becomes the first symbol of the next parsed substring. Thus the innovation might participate in whatever compression that substring enjoyed. We call this strategy deferred innovation. It is exemplified in the algorithm described by Welch [W84] and implemented in UNIX compress and its progeny.
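
The contrast between the two strategies can be illustrated with a minimal sketch. It is not taken from the chapter: the function names lz78_parse and lzw_parse, the dictionary layout, and the integer indices are all assumptions made for illustration, and the second function assumes every input symbol belongs to the supplied alphabet. The first function emits a (citation, innovation) pair per parsed substring, in the incremental style of [ZL78]; the second emits only the citation and lets the innovation start the next substring, in the style described by Welch [W84].

```python
def lz78_parse(data):
    """Incremental parsing: emit (citation index, innovation symbol) pairs."""
    dictionary = {"": 0}          # index 0 stands for the empty citation
    output = []
    phrase = ""
    for symbol in data:
        if phrase + symbol in dictionary:
            phrase += symbol      # keep extending the citation
        else:
            # The citation has been seen before; the innovation is new after it.
            output.append((dictionary[phrase], symbol))
            dictionary[phrase + symbol] = len(dictionary)
            phrase = ""           # the next substring starts from scratch
    if phrase:
        # Input ended inside a citation, so there is no innovation to emit.
        output.append((dictionary[phrase], None))
    return output


def lzw_parse(data, alphabet):
    """Deferred innovation: emit citation indices only.

    Assumes every symbol of `data` occurs in `alphabet` (illustrative only).
    """
    dictionary = {s: i for i, s in enumerate(alphabet)}
    output = []
    phrase = ""
    for symbol in data:
        if phrase + symbol in dictionary:
            phrase += symbol
        else:
            output.append(dictionary[phrase])
            # The parsed substring is still stored as usual ...
            dictionary[phrase + symbol] = len(dictionary)
            # ... but the innovation is deferred: it begins the next substring.
            phrase = symbol
    if phrase:
        output.append(dictionary[phrase])
    return output


pairs = lz78_parse("abababab")
codes = lzw_parse("abababab", "ab")
```

On the input "abababab" both sketches produce five output words, but each word from lz78_parse carries an explicit innovation symbol in addition to the index, whereas lzw_parse transmits indices alone.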

Keywords

Compressibility Librium Prefix 

References

  1. F50.
    Feller, William, An Introduction to Probability Theory and its Applications, Vol. 1. New York, John Wiley & Sons, Inc. (1950).
  2. MW85.
    Miller, V.S. and Wegman, M.N., Variations on a theme by Lempel and Ziv. Combinatorial Algorithms on Words, Springer-Verlag (A. Apostolico and Z. Galil, editors) (1985) 131–140.
  3. R58.
    Riordan, John, An Introduction to Combinatorial Analysis. New York, John Wiley & Sons, Inc. (1958).
  4. SS82.
    Storer, J.A. and Szymanski, T.G., Data compression via textual substitution. J. ACM 29, 4 (1982) 928–951.
  5. W84.
    Welch, T.A., A technique for high-performance data compression. IEEE Computer 17, 6 (1984) 8–19.
  6. ZL77.
    Ziv, J. and Lempel, A., A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory IT-23, 3 (1977) 337–343.
  7. ZL78.
    Ziv, J. and Lempel, A., Compression of individual sequences via variable-rate coding. IEEE Trans. Inf. Theory IT-24, 5 (1978) 530–536.

Copyright information

© Springer Science+Business Media New York 1992

Authors and Affiliations

  • Martin Cohn (1)
  1. Computer Science Department, Brandeis University, Waltham