On Inverted Index Compression for Search Engine Efficiency

  • Matteo Catena
  • Craig Macdonald
  • Iadh Ounis
Conference paper

DOI: 10.1007/978-3-319-06028-6_30

Part of the Lecture Notes in Computer Science book series (LNCS, volume 8416)
Cite this paper as:
Catena M., Macdonald C., Ounis I. (2014) On Inverted Index Compression for Search Engine Efficiency. In: de Rijke M. et al. (eds) Advances in Information Retrieval. ECIR 2014. Lecture Notes in Computer Science, vol 8416. Springer, Cham

Abstract

Efficient access to the inverted index data structure is a key aspect for a search engine to achieve fast response times to users’ queries. While the performance of an information retrieval (IR) system can be enhanced through the compression of its posting lists, there is little recent work in the literature that thoroughly compares and analyses the performance of modern integer compression schemes across different types of posting information (document ids, frequencies, positions). In this paper, we experiment with different modern integer compression algorithms, integrating these into a modern IR system. Through comprehensive experiments conducted on two large, widely used document corpora and large query sets, our results show the benefit of compression for different types of posting information to the space- and time-efficiency of the search engine. Overall, we find that the simple Frame of Reference compression scheme results in the best query response times for all types of posting information. Moreover, we observe that the frequency and position posting information in Web corpora that have large volumes of anchor text are more challenging to compress, yet compression is beneficial in reducing average query response times.
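As a rough illustration of the Frame of Reference idea named in the abstract, the sketch below encodes one block of document-id gaps as a base value plus fixed-width offsets. It is a minimal sketch and not the implementation evaluated in the paper: the function names (`to_gaps`, `for_encode_block`, `for_decode_block`, `from_gaps`) are hypothetical, a Python integer stands in for a packed bit buffer, and the input is assumed to be a non-empty, strictly increasing list of document ids.

```python
from typing import List, Tuple


def to_gaps(doc_ids: List[int]) -> List[int]:
    """Replace each docid (after the first) by its distance to the previous one."""
    return [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]


def from_gaps(gaps: List[int]) -> List[int]:
    """Invert to_gaps by taking a running sum."""
    doc_ids, total = [], 0
    for gap in gaps:
        total += gap
        doc_ids.append(total)
    return doc_ids


def for_encode_block(values: List[int]) -> Tuple[int, int, int, int]:
    """Encode a non-empty block as (base, bit width, count, packed offsets).

    Every value is stored as an offset from the block minimum, and all
    offsets share the bit width needed by the largest one.
    """
    base = min(values)
    offsets = [v - base for v in values]
    width = max(offsets).bit_length()  # 0 when all values are equal
    packed = 0
    for i, offset in enumerate(offsets):
        packed |= offset << (i * width)
    return base, width, len(values), packed


def for_decode_block(base: int, width: int, count: int, packed: int) -> List[int]:
    """Recover the original values from a block produced by for_encode_block."""
    mask = (1 << width) - 1
    return [base + ((packed >> (i * width)) & mask) for i in range(count)]


doc_ids = [3, 7, 8, 15, 16, 40]
block = for_encode_block(to_gaps(doc_ids))
restored = from_gaps(for_decode_block(*block))
```

Because every offset in a block uses the same width, decoding needs only shifts and masks, which is one plausible reason such a simple scheme can decode quickly. Production implementations typically work on fixed-size blocks and pack offsets into machine words rather than a single large integer.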

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Matteo Catena (1)
  • Craig Macdonald (2)
  • Iadh Ounis (2)
  1. GSSI - Gran Sasso Science Institute, INFN, L’Aquila, Italy
  2. School of Computing Science, University of Glasgow, Glasgow, UK
