Compression of clustered inverted files
One way to save memory in inverted file organizations is to map each address list to a bit-vector and compress that vector with a suitable compression technique. This study discusses eight such techniques for nonuniformly distributed bit-vectors. Occurrences of clusters with high 1-bit densities are simulated using an n-state bit-vector generating process. Experiments with a real-life file are also reported.
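The mapping described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`to_bit_vector`, `run_lengths`, `decode`, `clustered_bits`) are hypothetical, zero-run-length coding is used only as one familiar example of a bit-vector compression technique and is not claimed to be one of the eight studied, and the generator is a simple two-state Markov chain standing in for the more general n-state process.

```python
import random


def to_bit_vector(addresses, n_records):
    """Map an address list (0-based record numbers) to a bit-vector."""
    bits = [0] * n_records
    for a in addresses:
        bits[a] = 1
    return bits


def run_lengths(bits):
    """Return the zero-run length before each 1-bit, plus the trailing zero-run."""
    runs, zeros = [], 0
    for b in bits:
        if b:
            runs.append(zeros)
            zeros = 0
        else:
            zeros += 1
    return runs, zeros


def decode(runs, trailing):
    """Rebuild the bit-vector from the output of run_lengths."""
    bits = []
    for r in runs:
        bits.extend([0] * r + [1])
    bits.extend([0] * trailing)
    return bits


def clustered_bits(n, p_enter, p_stay, seed=0):
    """Two-state Markov generator (an assumed stand-in for an n-state process).

    p_enter: probability of emitting a 1 after a 0 (a cluster starts).
    p_stay:  probability of emitting a 1 after a 1 (a cluster continues).
    """
    rng = random.Random(seed)
    state, out = 0, []
    for _ in range(n):
        p_one = p_stay if state == 1 else p_enter
        state = 1 if rng.random() < p_one else 0
        out.append(state)
    return out


bits = to_bit_vector([2, 3, 4, 9], 12)
runs, trailing = run_lengths(bits)
assert decode(runs, trailing) == bits  # the coding is lossless
```

With a small `p_enter` and a large `p_stay`, `clustered_bits` produces long runs of zeros broken by dense bursts of ones, which is the kind of nonuniform distribution the abstract refers to. Inside a cluster the zero-runs have length 0, so a scheme tuned for uniformly sparse vectors may code them inefficiently.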
Keywords: Compression Technique, Code Word, Code Technique, Inverted List, Compression Gain