Encyclopedia of Big Data Technologies

2019 Edition
Editors: Sherif Sakr, Albert Y. Zomaya

Computing the Cost of Compressed Data

  • Alistair Moffat
  • Matthias Petri
Reference work entry
DOI: https://doi.org/10.1007/978-3-319-77525-8_57

Synonyms

Definitions

Compression mechanisms reduce the storage cost of retained data. In the extreme case of data that must be retained indefinitely, the initial cost of performing the compression transformation can be amortized down to zero, since the savings in storage space continue to accrue without limit, albeit at decreasing rates as time goes by and disk storage becomes cheaper. A more typical scenario arises when a fixed data retention period must be supported, after which the stored data is no longer required, and when a certain level of access operations to the stored data can be expected as part of a regulatory or compliance environment. In this second scenario, the total cost of retention (TCR) is a function of multiple competing factors, and the compression regime that provides the most compact storage might not be the one that provides the smallest TCR. This entry summarizes recent work in the area of cost models for...
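
The trade-off described above can be illustrated with a minimal sketch. It is not the cost model of the entry or of the cited work; it simply assumes that TCR is the sum of a one-off compression cost, a storage cost that declines geometrically each year, and a per-access decompression cost. The names `Regime` and `total_cost_of_retention`, and every price and ratio below, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Regime:
    """A hypothetical compression regime (all figures are illustrative)."""
    name: str
    ratio: float               # compressed size / original size
    encode_cost_per_gb: float  # one-off compute cost, dollars per GB of input
    decode_cost_per_gb: float  # compute cost, dollars per GB of data accessed


def total_cost_of_retention(regime, data_gb, years, storage_price,
                            price_decline, accessed_gb_per_year):
    """Assumed TCR: encoding + storage over the period + access costs.

    storage_price is dollars per GB per year in year 0 and is assumed to
    fall by the fraction price_decline every year.
    """
    encode = regime.encode_cost_per_gb * data_gb
    stored_gb = regime.ratio * data_gb
    storage = sum(stored_gb * storage_price * (1 - price_decline) ** year
                  for year in range(years))
    access = regime.decode_cost_per_gb * accessed_gb_per_year * years
    return encode + storage + access


# Two hypothetical regimes: one compact but slow, one larger but fast.
compact = Regime("compact", ratio=0.20,
                 encode_cost_per_gb=0.05, decode_cost_per_gb=0.02)
fast = Regime("fast", ratio=0.30,
              encode_cost_per_gb=0.005, decode_cost_per_gb=0.002)

# Assumed scenario: 1000 GB kept for 5 years, 500 GB accessed per year.
scenario = dict(data_gb=1000, years=5, storage_price=0.25,
                price_decline=0.2, accessed_gb_per_year=500)

tcr_compact = total_cost_of_retention(compact, **scenario)
tcr_fast = total_cost_of_retention(fast, **scenario)

for regime, tcr in ((compact, tcr_compact), (fast, tcr_fast)):
    print(f"{regime.name}: TCR = ${tcr:.2f}")
```

Under these assumed numbers the regime with the smaller compressed size has the larger TCR, because its higher encoding and decoding costs outweigh its storage savings over a five-year period. Different prices, retention periods, or access rates can reverse the ordering.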

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. School of Computing and Information Systems, The University of Melbourne, Melbourne, Australia