Space- and time-efficient decoding with canonical Huffman trees

Extended abstract
  • Shmuel T. Klein
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1264)

Abstract

A new data structure is investigated, which allows fast decoding of texts encoded by canonical Huffman codes. The storage requirements are much lower than for conventional Huffman trees, O(log² n) for trees of depth O(log n), and decoding is faster, because a part of the bit-comparisons necessary for the decoding may be saved. Empirical results on large real-life distributions show a reduction of up to 50% and more in the number of bit operations.
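For orientation, the following is a minimal sketch of ordinary table-driven decoding of a canonical Huffman code. It is not the data structure investigated in the paper, which the abstract does not describe in detail. It assumes one common canonical convention (shorter codewords come first, and codewords of equal length are consecutive binary numbers); the function names and the example codeword lengths are hypothetical.

```python
from collections import Counter


def canonical_codes(lengths):
    """Assign canonical codewords from a {symbol: codeword length} map.

    Assumed convention: symbols are sorted by (length, symbol) and
    codewords of the same length are consecutive integers.
    """
    ordered = sorted(lengths, key=lambda s: (lengths[s], s))
    codes = {}
    code = 0
    prev_len = lengths[ordered[0]]
    for sym in ordered:
        # Moving to a longer length appends zero bits to the running code.
        code <<= lengths[sym] - prev_len
        prev_len = lengths[sym]
        codes[sym] = format(code, "0{}b".format(prev_len))
        code += 1
    return codes, ordered


def build_tables(lengths):
    """Build per-length tables: first codeword value, count, symbol offset."""
    codes, ordered = canonical_codes(lengths)
    count = Counter(lengths.values())
    first_code, offset = {}, {}
    for index, sym in enumerate(ordered):
        length = lengths[sym]
        if length not in first_code:
            first_code[length] = int(codes[sym], 2)
            offset[length] = index
    return ordered, count, first_code, offset


def encode(text, lengths):
    """Concatenate the canonical codewords of the symbols in text."""
    codes, _ = canonical_codes(lengths)
    return "".join(codes[sym] for sym in text)


def decode(bits, lengths):
    """Decode a string of '0'/'1' characters without building a tree."""
    ordered, count, first_code, offset = build_tables(lengths)
    out = []
    value, length = 0, 0
    for bit in bits:
        value = (value << 1) | int(bit)
        length += 1
        if length in first_code:
            rank = value - first_code[length]
            # The bits read so far form a codeword of this length exactly
            # when their value falls inside the block for this length.
            if 0 <= rank < count[length]:
                out.append(ordered[offset[length] + rank])
                value, length = 0, 0
    return out


# Hypothetical example: four symbols with codeword lengths 1, 2, 3, 3.
example_lengths = {"a": 1, "b": 2, "c": 3, "d": 3}
example_bits = encode("abacd", example_lengths)
example_text = "".join(decode(example_bits, example_lengths))
```

In this sketch the decoder keeps only one table entry per distinct codeword length plus the sorted symbol list, rather than an explicit tree with one node per codeword prefix.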

Keywords

Internal Node, Binary String, Huffman Code, Prefix Code, Canonical Tree
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • Shmuel T. Klein, Department of Mathematics and Computer Science, Bar Ilan University, Ramat-Gan, Israel
