Performance Improvements for Search Systems Using an Integrated Cache of Lists+Intersections

  • Gabriel Tolosa
  • Luca Becchetti
  • Esteban Feuerstein
  • Alberto Marchetti-Spaccamela
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8799)

Abstract

Modern information retrieval systems use several levels of caching to speed up computation by exploiting frequent, recent, or costly data used in the past. In this study, we propose and evaluate a static cache that works simultaneously as a list cache and an intersection cache, offering a more efficient way of handling cache space. In addition, we propose effective strategies to select the term pairs that should populate the cache. Simulations using two datasets and a real query log reveal that the proposed approach improves overall performance in terms of total processing time, achieving savings of up to 40% in the best case.
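As a rough illustration of how a single cache can serve both posting lists and intersections, the sketch below stores each cached term pair as three disjoint parts: documents containing only the first term, only the second term, and both. This layout is an assumption made for illustration, not a description taken from the abstract; the names `PairEntry`, `IntegratedCache`, `add_pair`, `get_list`, and `get_intersection` are hypothetical, and each term is assumed to appear in at most one cached pair.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairEntry:
    """Hypothetical cache entry for a term pair, split into three sorted parts."""
    only_first: tuple   # doc IDs containing the first term but not the second
    only_second: tuple  # doc IDs containing the second term but not the first
    both: tuple         # doc IDs containing both terms (the intersection)


def build_pair_entry(postings_first, postings_second):
    """Split two posting lists into the three disjoint parts of a PairEntry."""
    first, second = set(postings_first), set(postings_second)
    return PairEntry(
        only_first=tuple(sorted(first - second)),
        only_second=tuple(sorted(second - first)),
        both=tuple(sorted(first & second)),
    )


class IntegratedCache:
    """Static cache sketch: one entry answers list and intersection lookups.

    Assumption: each term belongs to at most one cached pair.
    """

    def __init__(self):
        self._pairs = {}    # frozenset({t1, t2}) -> (t1, PairEntry)
        self._by_term = {}  # term -> key of the pair that contains it

    def add_pair(self, t1, t2, postings1, postings2):
        if t1 == t2:
            raise ValueError("a pair needs two distinct terms")
        key = frozenset((t1, t2))
        self._pairs[key] = (t1, build_pair_entry(postings1, postings2))
        self._by_term[t1] = key
        self._by_term[t2] = key

    def get_intersection(self, t1, t2):
        """Return the cached intersection, or None on a cache miss."""
        hit = self._pairs.get(frozenset((t1, t2)))
        return None if hit is None else list(hit[1].both)

    def get_list(self, term):
        """Rebuild the full posting list of a term, or None on a cache miss."""
        key = self._by_term.get(term)
        if key is None:
            return None
        first_term, entry = self._pairs[key]
        own = entry.only_first if term == first_term else entry.only_second
        return sorted(own + entry.both)


cache = IntegratedCache()
cache.add_pair("new", "york", [1, 3, 5, 8], [3, 4, 8, 9])
print(cache.get_intersection("york", "new"))  # [3, 8]
print(cache.get_list("new"))                  # [1, 3, 5, 8]
```

Under this assumed layout, the shared document IDs are stored once rather than twice, which is one way the same cache space could answer both kinds of request.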

Keywords

Cache Size, Inverted Index, Posting List, Inverted List, Search Node
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Gabriel Tolosa (1, 2)
  • Luca Becchetti (3)
  • Esteban Feuerstein (1)
  • Alberto Marchetti-Spaccamela (3)

  1. University of Buenos Aires, Argentina
  2. National University of Luján, Argentina
  3. Sapienza University of Rome, Italy