On the Feasibility of Peer-to-Peer Web Indexing and Search

  • Jinyang Li
  • Boon Thau Loo
  • Joseph M. Hellerstein
  • M. Frans Kaashoek
  • David R. Karger
  • Robert Morris
Conference paper

DOI: 10.1007/978-3-540-45172-3_19

Volume 2735 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Li J., Loo B.T., Hellerstein J.M., Kaashoek M.F., Karger D.R., Morris R. (2003) On the Feasibility of Peer-to-Peer Web Indexing and Search. In: Kaashoek M.F., Stoica I. (eds) Peer-to-Peer Systems II. IPTPS 2003. Lecture Notes in Computer Science, vol 2735. Springer, Berlin, Heidelberg

Abstract

This paper discusses the feasibility of peer-to-peer full-text keyword search of the Web. Two classes of keyword search techniques are in use or have been proposed: flooding of queries over an overlay network (as in Gnutella), and intersection of index lists stored in a distributed hash table. We present a simple feasibility analysis based on the resource constraints and search workload. Our study suggests that the peer-to-peer network does not have enough capacity to make naive use of either search technique attractive for Web search. The paper presents a number of existing and novel optimizations for P2P search based on distributed hash tables, estimates their effects on performance, and concludes that in combination these optimizations would bring the problem to within an order of magnitude of feasibility. The paper suggests a number of compromises that might achieve the last order of magnitude.
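
The second technique named in the abstract, intersection of index lists stored in a distributed hash table, can be illustrated with a minimal sketch. This is not the authors' implementation: the class `ToyDHT`, its methods, and the sample documents are hypothetical, and the "network" is simulated in memory, so no bandwidth cost is modeled. Each keyword is hashed to one node, which holds that keyword's list of document IDs; a multi-keyword query fetches the list for each keyword and intersects them.

```python
import hashlib


class ToyDHT:
    """In-memory stand-in for a distributed hash table (hypothetical)."""

    def __init__(self, num_nodes):
        # Each simulated node maps keyword -> set of document IDs.
        self.nodes = [dict() for _ in range(num_nodes)]

    def _node_for(self, keyword):
        # Assumption: the responsible node is chosen by hashing the keyword.
        digest = hashlib.sha1(keyword.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest, "big") % len(self.nodes)]

    def publish(self, doc_id, text):
        # Add the document ID to the index list of every distinct word.
        for word in set(text.lower().split()):
            self._node_for(word).setdefault(word, set()).add(doc_id)

    def search(self, query):
        words = query.lower().split()
        if not words:
            return set()
        # Fetch one index list per keyword, then intersect them,
        # starting from the shortest list.
        lists = [self._node_for(w).get(w, set()) for w in words]
        lists.sort(key=len)
        result = set(lists[0])
        for postings in lists[1:]:
            result &= postings
        return result


dht = ToyDHT(num_nodes=4)
dht.publish(1, "peer to peer web search")
dht.publish(2, "web indexing")
dht.publish(3, "peer to peer indexing")
print(sorted(dht.search("peer web")))  # [1]
```

In a real deployment the index lists for different keywords would usually live on different machines, so each intersection implies shipping list data across the network; that transfer is the cost a feasibility analysis of this technique has to account for.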

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Jinyang Li (1)
  • Boon Thau Loo (2)
  • Joseph M. Hellerstein (2)
  • M. Frans Kaashoek (1)
  • David R. Karger (1)
  • Robert Morris (1)

  1. MIT Lab for Computer Science
  2. UC Berkeley