Dictionary-Based Data Compression

Gagie, Travis; Manzini, Giovanni

doi:10.1007/978-0-387-30162-4_108

1977; Ziv, Lempel

Reference work entry

Keywords and Synonyms

LZ compression ; Ziv–Lempel compression ; Parsing-based compression

Problem Definition

Lossless data compression is the problem of compactly representing data in a format that admits the faithful recovery of the original information. It is achieved by taking advantage of the redundancy that is often present in data generated by either humans or machines.
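As a minimal sketch of what "faithful recovery" and "taking advantage of redundancy" can mean in practice, the following Python fragment parses a string into (phrase index, next character) pairs in the style of the incremental parsing usually called LZ78, and then rebuilds the string from those pairs. The function names `lz78_compress` and `lz78_decompress` are hypothetical, chosen for this illustration only, and the sketch omits the bit-level encoding of the pairs that a real compressor would need.

```python
def lz78_compress(text):
    """Parse text into (dictionary index, next character) pairs."""
    dictionary = {"": 0}  # phrase -> index; index 0 is the empty phrase
    output = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            # Keep extending the longest phrase already in the dictionary.
            phrase += ch
        else:
            # Emit the known phrase plus one new character, then learn it.
            output.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        # Input ended inside a known phrase: emit it with an empty extension.
        output.append((dictionary[phrase], ""))
    return output


def lz78_decompress(pairs):
    """Rebuild the original text from the pairs produced above."""
    phrases = [""]  # phrases[i] is the phrase with dictionary index i
    pieces = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        pieces.append(phrase)
        phrases.append(phrase)
    return "".join(pieces)


text = "abababababab"
pairs = lz78_compress(text)
# Lossless: decoding the pairs must give back exactly the input.
assert lz78_decompress(pairs) == text
```

On a repetitive input such as the one above, the number of pairs is smaller than the number of characters, because each new phrase reuses a phrase seen earlier; on input with little repetition the parse yields no such saving.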

Dictionary-based data compression has been “the solution” to the problem of lossless data compression for nearly 15 years. This technique originated in two theoretical papers of Ziv and Lempel [15,16] and gained popularity in the 1980s with the introduction of the Unix tool compress (1986) and of the GIF image format (1987). Although today there are alternative solutions to the problem of lossless data compression (e.g., Burrows–Wheeler compression and Prediction by Partial Matching), dictionary-based compression is still widely used in everyday applications: consider...

Recommended Reading

1. Arroyuelo, D., Navarro, G., Sadakane, K.: Reducing the space requirement of LZ-index. In: Proc. 17th Combinatorial Pattern Matching conference (CPM), LNCS no. 4009, pp. 318–329, Springer (2006)
2. Charikar, M., Lehman, E., Liu, D., Panigrahy, R., Prabhakaran, M., Sahai, A., Shelat, A.: The smallest grammar problem. IEEE Trans. Inf. Theor. 51, 2554–2576 (2005)
3. Cormode, G., Muthukrishnan, S.: Substring compression problems. In: Proc. 16th ACM-SIAM Symposium on Discrete Algorithms (SODA '05), pp. 321–330 (2005)
4. Crochemore, M., Landau, G., Ziv-Ukelson, M.: A subquadratic sequence alignment algorithm for unrestricted scoring matrices. SIAM J. Comput. 32, 1654–1673 (2003)
5. Ferragina, P., Manzini, G.: Indexing compressed text. J. ACM 52, 552–581 (2005)
6. Kosaraju, R., Manzini, G.: Compression of low entropy strings with Lempel–Ziv algorithms. SIAM J. Comput. 29, 893–911 (1999)
7. Krishnan, P., Vitter, J.: Optimal prediction for prefetching in the worst case. SIAM J. Comput. 27, 1617–1636 (1998)
8. Lifshits, Y., Mozes, S., Weimann, O., Ziv-Ukelson, M.: Speeding up HMM decoding and training by exploiting sequence repetitions. Algorithmica, to appear. doi:10.1007/s00453-007-9128-0
9. Matias, Y., Şahinalp, C.: On the optimality of parsing in dynamic dictionary based data compression. In: Proceedings 10th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '99), pp. 943–944 (1999)
10. Navarro, G.: Indexing text using the Ziv–Lempel trie. J. Discret. Algorithms 2, 87–114 (2004)
11. Navarro, G., Tarhio, J.: LZgrep: A Boyer-Moore string matching tool for Ziv–Lempel compressed text. Softw. Pract. Exp. 35, 1107–1130 (2005)
12. Şahinalp, C., Rajpoot, N.: Dictionary-based data compression: An algorithmic perspective. In: Sayood, K. (ed.) Lossless Compression Handbook, pp. 153–167. Academic Press, USA (2003)
13. Salomon, D.: Data Compression: the Complete Reference, 4th edn. Springer, London (2007)
14. Savari, S.: Redundancy of the Lempel–Ziv incremental parsing rule. IEEE Trans. Inf. Theor. 43, 9–21 (1997)
15. Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inf. Theor. 23, 337–343 (1977)
16. Ziv, J., Lempel, A.: Compression of individual sequences via variable-length coding. IEEE Trans. Inf. Theor. 24, 530–536 (1978)

Author information

Authors and Affiliations

Department of Computer Science, University of Eastern Piedmont, Alessandria, Italy
Travis Gagie & Giovanni Manzini

Editor information

Editors and Affiliations

Department of Electrical Engineering and Computer Science, McCormick School of Engineering and Applied Science, Northwestern University, Evanston, IL, 60208, USA
Ming-Yang Kao Professor of Computer Science

Cite this entry

Gagie, T., Manzini, G. (2008). Dictionary-Based Data Compression. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30162-4_108

DOI: https://doi.org/10.1007/978-0-387-30162-4_108
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-30770-1
Online ISBN: 978-0-387-30162-4
eBook Packages: Computer Science; Reference Module Computer Science and Engineering
