Encyclopedia of Algorithms

2008 Edition
Editors: Ming-Yang Kao

Dictionary-Based Data Compression

1977; Ziv, Lempel
  • Travis Gagie
  • Giovanni Manzini
Reference work entry
DOI: https://doi.org/10.1007/978-0-387-30162-4_108

Keywords and Synonyms

LZ compression; Ziv–Lempel compression; Parsing-based compression

Problem Definition

Lossless data compression is the problem of compactly representing data in a format that admits the faithful recovery of the original information. It is achieved by exploiting the redundancy that is often present in data generated by either humans or machines.
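To make the requirement of faithful recovery concrete, the following is a minimal sketch of a dictionary-based parser in the style of the incremental parsing usually called LZ78 (reference [16]). The function names `lz78_encode` and `lz78_decode`, the use of index 0 for the empty phrase, and the convention of an empty final character for a leftover phrase are assumptions made for this illustration; the sketch emits a list of pairs and does not encode them into bits, so it shows only the parsing and the round trip, not an actual space saving.

```python
def lz78_encode(text):
    """Parse text into (phrase index, next character) pairs.

    Index 0 denotes the empty phrase. Each emitted pair defines a new
    dictionary phrase: the referenced phrase extended by one character.
    """
    dictionary = {"": 0}
    phrase = ""
    pairs = []
    for ch in text:
        if phrase + ch in dictionary:
            # Keep extending the current match while it is a known phrase.
            phrase += ch
        else:
            pairs.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        # Input ended inside a known phrase: emit it with an empty character
        # (an illustrative convention, not part of any standard format).
        pairs.append((dictionary[phrase], ""))
    return pairs


def lz78_decode(pairs):
    """Rebuild the original text by replaying the dictionary construction."""
    phrases = [""]
    pieces = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        pieces.append(phrase)
        phrases.append(phrase)
    return "".join(pieces)


if __name__ == "__main__":
    sample = "abababababab"
    parsed = lz78_encode(sample)
    print(parsed)
    print(lz78_decode(parsed) == sample)
```

On the repetitive 12-character input `abababababab` the parser emits six pairs, because later phrases reuse earlier ones; this reuse is how a dictionary-based method takes advantage of redundancy, while the decoder recovers the input exactly.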

Dictionary-based data compression has been “the solution” to the problem of lossless data compression for nearly 15 years. This technique originated in two theoretical papers of Ziv and Lempel [15,16] and gained popularity in the 1980s with the introduction of the Unix tool compress (1986) and of the GIF image format (1987). Although today there are alternative solutions to the problem of lossless data compression (e.g., Burrows–Wheeler compression and Prediction by Partial Matching), dictionary-based compression is still widely used in everyday applications:...



Recommended Reading

  1. Arroyuelo, D., Navarro, G., Sadakane, K.: Reducing the space requirement of LZ-index. In: Proc. 17th Combinatorial Pattern Matching conference (CPM), LNCS no. 4009, pp. 318–329, Springer (2006)
  2. Charikar, M., Lehman, E., Liu, D., Panigrahy, R., Prabhakaran, M., Sahai, A., Shelat, A.: The smallest grammar problem. IEEE Trans. Inf. Theor. 51, 2554–2576 (2005)
  3. Cormode, G., Muthukrishnan, S.: Substring compression problems. In: Proc. 16th ACM-SIAM Symposium on Discrete Algorithms (SODA '05), pp. 321–330 (2005)
  4. Crochemore, M., Landau, G., Ziv-Ukelson, M.: A subquadratic sequence alignment algorithm for unrestricted scoring matrices. SIAM J. Comput. 32, 1654–1673 (2003)
  5. Ferragina, P., Manzini, G.: Indexing compressed text. J. ACM 52, 552–581 (2005)
  6. Kosaraju, R., Manzini, G.: Compression of low entropy strings with Lempel–Ziv algorithms. SIAM J. Comput. 29, 893–911 (1999)
  7. Krishnan, P., Vitter, J.: Optimal prediction for prefetching in the worst case. SIAM J. Comput. 27, 1617–1636 (1998)
  8. Lifshits, Y., Mozes, S., Weimann, O., Ziv-Ukelson, M.: Speeding up HMM decoding and training by exploiting sequence repetitions. Algorithmica, to appear. doi:10.1007/s00453-007-9128-0
  9. Matias, Y., Şahinalp, C.: On the optimality of parsing in dynamic dictionary based data compression. In: Proceedings 10th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '99), pp. 943–944 (1999)
  10. Navarro, G.: Indexing text using the Ziv–Lempel trie. J. Discret. Algorithms 2, 87–114 (2004)
  11. Navarro, G., Tarhio, J.: LZgrep: A Boyer-Moore string matching tool for Ziv–Lempel compressed text. Softw. Pract. Exp. 35, 1107–1130 (2005)
  12. Şahinalp, C., Rajpoot, N.: Dictionary-based data compression: An algorithmic perspective. In: Sayood, K. (ed.) Lossless Compression Handbook, pp. 153–167. Academic Press, USA (2003)
  13. Salomon, D.: Data Compression: the Complete Reference, 4th edn. Springer, London (2007)
  14. Savari, S.: Redundancy of the Lempel–Ziv incremental parsing rule. IEEE Trans. Inf. Theor. 43, 9–21 (1997)
  15. Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inf. Theor. 23, 337–343 (1977)
  16. Ziv, J., Lempel, A.: Compression of individual sequences via variable-length coding. IEEE Trans. Inf. Theor. 24, 530–536 (1978)

Copyright information

© Springer-Verlag 2008

Authors and Affiliations

  • Travis Gagie (1)
  • Giovanni Manzini (1)
  1. Department of Computer Science, University of Eastern Piedmont, Alessandria, Italy