Efficient Peer-to-Peer Keyword Searching

  • Patrick Reynolds
  • Amin Vahdat
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2672)


Recent file storage applications built on top of peer-to-peer distributed hash tables lack search capabilities. We believe that search is an important part of any document publication system. To that end, we have designed and analyzed a distributed search engine based on a distributed hash table. Our simulation results predict that our search engine can answer an average query in under one second, using under one kilobyte of bandwidth.


Keywords: search, distributed hash table, peer-to-peer, Bloom filter, caching



Copyright information

© IFIP International Federation for Information Processing 2003

Authors and Affiliations

  • Patrick Reynolds (1)
  • Amin Vahdat (1)

  1. Department of Computer Science, Duke University, USA
