Abstract
This paper proposes and compares scheduling algorithms for load balancing the query traffic on distributed inverted files. We focus on queries that require the intersection of posting lists, a very demanding case for the term partitioned inverted file and one that the document partitioned inverted file used by current search engines handles very efficiently. We show that, with proper scheduling of queries, the term partitioned approach can outperform the document partitioned approach.
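To make the two index layouts concrete, the following is a minimal sketch of an intersection query under each of them. It is an illustration only, not the scheduling algorithms the paper compares: the toy collection `DOCS`, the processor count `P`, the round-robin assignment of documents and terms, and the `postings_shipped` counter are all assumptions introduced here.

```python
from collections import defaultdict

# Hypothetical toy collection: document id -> set of terms.
DOCS = {
    0: {"parallel", "search"},
    1: {"parallel", "index"},
    2: {"search", "index", "parallel"},
    3: {"search"},
    4: {"parallel", "search", "query"},
    5: {"index", "query"},
}
P = 2  # assumed number of processors


def build_index(docs):
    """Return a mapping term -> sorted list of document ids (posting list)."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for term in docs[doc_id]:
            index[term].append(doc_id)
    return dict(index)


def intersect(a, b):
    """Merge-style intersection of two sorted posting lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def document_partitioned(docs, terms, p=P):
    """Every processor indexes its own documents and intersects locally."""
    result = []
    for proc in range(p):
        local_docs = {d: t for d, t in docs.items() if d % p == proc}
        local_index = build_index(local_docs)
        hits = local_index.get(terms[0], [])
        for term in terms[1:]:
            hits = intersect(hits, local_index.get(term, []))
        result.extend(hits)  # partial answers are simply merged
    return sorted(result)


def term_partitioned(docs, terms, p=P):
    """Every processor holds whole posting lists for a subset of terms.

    Returns (answer, postings_shipped), where postings_shipped counts the
    partial-result entries moved between processors (an assumed cost proxy).
    """
    index = build_index(docs)
    owner = {t: i % p for i, t in enumerate(sorted(index))}
    hits = index.get(terms[0], [])
    current = owner.get(terms[0], 0)
    shipped = 0
    for term in terms[1:]:
        nxt = owner.get(term, current)
        if nxt != current:
            shipped += len(hits)  # partial result travels to the next owner
            current = nxt
        hits = intersect(hits, index.get(term, []))
    return hits, shipped


if __name__ == "__main__":
    print(document_partitioned(DOCS, ["parallel", "index"]))
    print(term_partitioned(DOCS, ["parallel", "index"]))
```

In this sketch the document partitioned layout never moves postings between processors, because each processor intersects its own sublists. The term partitioned layout may have to ship a partial result whenever the query terms live on different processors, which is the kind of cost that query scheduling has to keep balanced.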
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Marin, M., Gomez-Pantoja, C., Gonzalez, S., Gil-Costa, V. (2008). Scheduling Intersection Queries in Term Partitioned Inverted Files. In: Luque, E., Margalef, T., Benítez, D. (eds) Euro-Par 2008 – Parallel Processing. Euro-Par 2008. Lecture Notes in Computer Science, vol 5168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85451-7_47
DOI: https://doi.org/10.1007/978-3-540-85451-7_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85450-0
Online ISBN: 978-3-540-85451-7
eBook Packages: Computer Science, Computer Science (R0)