Modeling Static Caching in Web Search Engines

  • Ricardo Baeza-Yates
  • Simon Jonassen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7224)

Abstract

In this paper we model a two-level cache of a Web search engine such that, given the available memory resources, we find the optimal fraction to allocate to each cache, results and index. The final result is very simple and requires computing only five parameters that depend on the input data and the performance of the search engine. The model is validated through extensive experimental results and is motivated by capacity planning and the overall optimization of the search architecture.
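
To make the idea of a split fraction concrete, the following is a minimal sketch and not the paper's model: the abstract does not give the formula or name its five parameters. It assumes hypothetical hit-ratio curves (`toy_results_hit`, `toy_index_hit`) and hypothetical per-query costs (`t_cached`, `t_index`, `t_disk`), and simply searches a grid for the fraction of memory given to the results cache that minimizes expected query cost.

```python
import math

def expected_cost(x, memory, results_hit, index_hit, t_cached, t_index, t_disk):
    """Expected per-query cost when fraction x of memory goes to the results cache.

    All parameters are illustrative assumptions, not taken from the paper.
    """
    h_r = results_hit(x * memory)          # results-cache hit ratio
    h_i = index_hit((1.0 - x) * memory)    # index-cache hit ratio
    # On a results-cache miss the query is processed from the cached
    # index when possible, otherwise from disk.
    miss_cost = h_i * t_index + (1.0 - h_i) * t_disk
    return h_r * t_cached + (1.0 - h_r) * miss_cost

def best_split(memory, results_hit, index_hit, t_cached, t_index, t_disk, steps=1000):
    """Grid search over x in [0, 1] for the split with the lowest expected cost."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda x: expected_cost(
            x, memory, results_hit, index_hit, t_cached, t_index, t_disk
        ),
    )

# Hypothetical saturating hit-ratio curves; real curves would be
# measured from a query log and an index.
def toy_results_hit(m):
    return 0.5 * (1.0 - math.exp(-m / 20.0))

def toy_index_hit(m):
    return 0.9 * (1.0 - math.exp(-m / 50.0))

MEMORY = 100.0  # arbitrary memory units
COSTS = {"t_cached": 1.0, "t_index": 20.0, "t_disk": 100.0}  # arbitrary time units

x_star = best_split(MEMORY, toy_results_hit, toy_index_hit, **COSTS)
print(f"fraction of memory for the results cache: {x_star:.3f}")
```

A grid search is used here only because it makes no assumption about the shape of the curves; a closed-form result such as the one the abstract describes would replace the search with a direct computation.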

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ricardo Baeza-Yates (1)
  • Simon Jonassen (2)

  1. Yahoo! Research Barcelona, Barcelona, Spain
  2. Norwegian University of Science and Technology, Trondheim, Norway
