Abstract
In this paper we model a two-level cache of a Web search engine: given the available memory resources, we find the optimal fraction to allocate to each cache, results and index. The final result is very simple and requires computing just five parameters that depend on the input data and the performance of the search engine. The model is validated through extensive experimental results and is motivated by capacity planning and the overall optimization of the search architecture.
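As a rough illustration of the kind of trade-off the abstract describes, the sketch below splits a memory budget between a results cache and an index cache and picks the split with the lowest expected query cost by grid search. It is not the model from the paper: the cost function, the hit-ratio curves `hit_results` and `hit_index`, and the timings `t_hit`, `t_mem` and `t_disk` are all hypothetical assumptions chosen only to make the example runnable.

```python
def expected_time(x, hit_results, hit_index, t_hit, t_mem, t_disk):
    """Average cost per query when fraction x of memory goes to the results cache.

    Assumed (hypothetical) cost structure: a results-cache hit costs t_hit;
    on a miss the query is processed either from the in-memory index cache
    (t_mem) or from disk (t_disk).
    """
    hr = hit_results(x)          # hit ratio of the results cache
    hi = hit_index(1.0 - x)      # hit ratio of the index cache
    miss_cost = hi * t_mem + (1.0 - hi) * t_disk
    return hr * t_hit + (1.0 - hr) * miss_cost


def best_split(hit_results, hit_index, t_hit, t_mem, t_disk, steps=1000):
    """Grid search over x in [0, 1] for the split with the lowest expected cost."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda x: expected_time(x, hit_results, hit_index, t_hit, t_mem, t_disk),
    )


# Hypothetical concave hit-ratio curves; real curves would be measured
# from a query log and the index, not assumed.
def hit_results(fraction):
    return 0.5 * fraction ** 0.5


def hit_index(fraction):
    return 0.9 * fraction ** 0.3


x_star = best_split(hit_results, hit_index, t_hit=1.0, t_mem=50.0, t_disk=500.0)
print(f"fraction of memory for the results cache: {x_star:.3f}")
```

A grid search is used here only because it makes no assumption about the shape of the curves; a closed-form solution such as the one the abstract refers to would replace it once the curves are parameterized.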
This work was done while the second author was an intern at Yahoo! Research and supported by the iAd Centre (http://iad-centre.no) funded by the Research Council of Norway and the Norwegian University of Science and Technology.
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Baeza-Yates, R., Jonassen, S. (2012). Modeling Static Caching in Web Search Engines. In: Baeza-Yates, R., et al. Advances in Information Retrieval. ECIR 2012. Lecture Notes in Computer Science, vol 7224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28997-2_37
DOI: https://doi.org/10.1007/978-3-642-28997-2_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28996-5
Online ISBN: 978-3-642-28997-2
eBook Packages: Computer Science (R0)