Intra-query Concurrent Pipelined Processing for Distributed Full-Text Retrieval

  • Simon Jonassen
  • Svein Erik Bratsberg
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7224)


Pipelined query processing over a term-wise distributed inverted index has superior throughput at high query multiprogramming levels. However, because of its long query latencies, this approach is inefficient at lower levels. In this paper we explore two types of intra-query parallelism within the pipelined approach: parallel execution of a query on different nodes and concurrent execution on the same node. According to the experimental results, our approach reaches the throughput of the state-of-the-art method at about half of the latency. In the single-query case, the observed latency improvement is up to 2.6 times.
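
As a rough illustration of the baseline idea only, the sketch below shows a query passing through the nodes of a term-wise partitioned index one after another, with each node adding its partial scores to a shared accumulator set. It is a toy, in-process model, not the authors' system: the node names, posting lists, integer scores, and the helper functions `build_route`, `process_on_node`, and `pipelined_query` are all assumptions made for the example, and the intra-query parallelism studied in the paper is not modelled.

```python
from collections import defaultdict

# Hypothetical toy index: every node stores the posting lists of a disjoint
# subset of terms (term-wise partitioning). A posting list maps a document id
# to an assumed, precomputed integer score contribution.
NODES = {
    "node_a": {"query": {1: 5, 2: 2}, "index": {2: 4, 3: 1}},
    "node_b": {"pipeline": {1: 3, 3: 6}},
}


def build_route(terms):
    """Return the nodes that hold at least one query term, in first-seen order."""
    route = []
    for term in terms:
        for node, lists in NODES.items():
            if term in lists and node not in route:
                route.append(node)
    return route


def process_on_node(node, terms, accumulators):
    """Add this node's partial scores for the query terms it stores."""
    for term in terms:
        for doc_id, score in NODES[node].get(term, {}).items():
            accumulators[doc_id] += score
    return accumulators


def pipelined_query(terms, k=2):
    """Visit the route node by node, forwarding the accumulators each time."""
    accumulators = defaultdict(int)
    for node in build_route(terms):
        # In a real deployment this hand-off would be a network message, so
        # every extra node on the route adds to the latency of the query.
        accumulators = process_on_node(node, terms, accumulators)
    ranked = sorted(accumulators.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


if __name__ == "__main__":
    print(pipelined_query(["query", "pipeline", "index"]))
```

Because the nodes are visited strictly in sequence here, a single query occupies only one node at a time, which is the latency limitation the abstract refers to.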


Keywords: Query Processing, Priority Queue, Query Evaluation, Inverted Index, Inverted List
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Simon Jonassen (1)
  • Svein Erik Bratsberg (1)
  1. Norwegian University of Science and Technology, Trondheim, Norway
