Compression of clustered inverted files

  • O. Nevalainen
  • M. Jakobsson
  • R. Berg
Part of the Lecture Notes in Computer Science book series (LNCS, volume 64)


One way to save memory space in inverted file organizations is to map each address list to a bit-vector and compress it with a suitable technique. This study discusses eight such techniques for nonuniformly distributed bit-vectors. Occurrences of clusters with high 1-bit densities are simulated using an n-state bit-vector generating process. Experiments with a real-life file are also reported.
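The idea in the abstract can be illustrated with a minimal sketch. It does not reproduce any of the eight techniques studied in the paper, which the abstract does not name. It assumes plain run-length coding of 0-runs as a stand-in compression step, and a two-state generator as the simplest case of an n-state process. All function names and parameters (`address_list_to_bits`, `zero_run_lengths`, `decode_runs`, `clustered_bits`, `p_enter`, `p_stay`, `density`) are hypothetical and chosen for illustration only.

```python
import random


def address_list_to_bits(addresses, file_size):
    """Map a list of record addresses to a bit-vector of length file_size."""
    bits = [0] * file_size
    for a in addresses:
        bits[a] = 1
    return bits


def zero_run_lengths(bits):
    """Return the length of the 0-run before each 1-bit, then the trailing 0-run."""
    runs, run = [], 0
    for b in bits:
        if b:
            runs.append(run)
            run = 0
        else:
            run += 1
    runs.append(run)  # trailing zeros (possibly 0 of them)
    return runs


def decode_runs(runs):
    """Invert zero_run_lengths: rebuild the bit-vector from its run lengths."""
    bits = []
    for r in runs[:-1]:
        bits.extend([0] * r)
        bits.append(1)
    bits.extend([0] * runs[-1])
    return bits


def clustered_bits(length, p_enter, p_stay, density, seed=0):
    """Two-state generator (assumed, illustrative): a 'sparse' state emits 0,
    a 'cluster' state emits 1 with probability `density`."""
    rng = random.Random(seed)
    in_cluster = False
    bits = []
    for _ in range(length):
        # State transition first, then emission from the new state.
        if in_cluster:
            in_cluster = rng.random() < p_stay
        else:
            in_cluster = rng.random() < p_enter
        bits.append(1 if in_cluster and rng.random() < density else 0)
    return bits


if __name__ == "__main__":
    bits = address_list_to_bits([2, 3, 7], 10)
    runs = zero_run_lengths(bits)
    print(bits, runs)
    sample = clustered_bits(200, p_enter=0.02, p_stay=0.9, density=0.8)
    print(len(zero_run_lengths(sample)), "run lengths for", sum(sample), "one-bits")
```

In clustered vectors many of the run lengths are 0 or very small, while the gaps between clusters are long; how well a given code handles this skew is the kind of question the study examines.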


Keywords: Compression Technique, Code Word, Code Technique, Inverted List, Compression Gain
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 1978

Authors and Affiliations

  • O. Nevalainen (1)
  • M. Jakobsson (1)
  • R. Berg (1)
  1. Department of Computer Science, University of Turku, Turku, Finland
