
Achieving Effective Multi-term Queries for Fast DHT Information Retrieval

  • Quanqing Xu
  • Heng Tao Shen
  • Yafei Dai
  • Bin Cui
  • Xiaofang Zhou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5175)

Abstract

Distributed Hash Tables (DHTs) are well suited to exact-match lookups using unique identifiers, but they do not directly support multi-term queries. Related research on query expansion has shown that adding new terms to a query via ad hoc feedback improves the retrieval effectiveness of that query. In this paper, we propose an effective multi-term query processing algorithm for information retrieval in DHT systems. Given the significance of the first term in a multi-term query, the query is sent to the peers containing the first term. To enhance query effectiveness, we design two query expansion mechanisms and an implicit relevance feedback approach based on users’ behaviors. Additionally, we record the query log and the expansion terms for each query, which can accelerate future queries and improve query accuracy. Experimental results show that our query methods yield substantial improvements in retrieval effectiveness in three respects: recall, precision at 10 standard recall levels, and precision histograms.
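The first-term routing idea in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the `ToyDHT` class, its `publish` and `query` methods, the SHA-1 modulo placement of terms on peers, and the choice to store each document's full term set with its posting so the first-term peer can check the remaining terms locally. Query expansion, relevance feedback, and the query log are not modeled.

```python
import hashlib
from collections import defaultdict


class ToyDHT:
    """Hypothetical in-memory stand-in for a DHT: each term hashes to one peer."""

    def __init__(self, num_peers):
        self.num_peers = num_peers
        # One index per peer: term -> {doc_id: frozenset of the document's terms}
        self.peers = [defaultdict(dict) for _ in range(num_peers)]

    def peer_for(self, term):
        # Assumed placement rule: SHA-1 of the term, reduced modulo the peer count.
        digest = hashlib.sha1(term.encode("utf-8")).hexdigest()
        return int(digest, 16) % self.num_peers

    def publish(self, doc_id, terms):
        # Index the document under every one of its terms.
        # Assumption: the full term set travels with each posting.
        term_set = frozenset(terms)
        for term in term_set:
            self.peers[self.peer_for(term)][term][doc_id] = term_set

    def query(self, terms):
        # Route the whole query to the peer responsible for the FIRST term,
        # then filter that peer's postings by the remaining terms locally.
        if not terms:
            return []
        first, rest = terms[0], set(terms[1:])
        postings = self.peers[self.peer_for(first)].get(first, {})
        return sorted(
            doc_id for doc_id, doc_terms in postings.items() if rest <= doc_terms
        )


dht = ToyDHT(num_peers=8)
dht.publish("d1", ["dht", "query", "expansion"])
dht.publish("d2", ["dht", "bloom", "filter"])
dht.publish("d3", ["query", "expansion"])
print(dht.query(["dht", "query"]))        # ['d1']
print(dht.query(["query", "expansion"]))  # ['d1', 'd3']
```

In this sketch a multi-term query contacts only one peer, whichever holds the first term, so no posting lists need to be intersected across the network. The cost is that every posting carries extra per-document data; whether the paper makes the same trade-off cannot be determined from the abstract.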

Keywords

Relevance Feedback, Query Term, Query Expansion, Bloom Filter, Expansion Term
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Quanqing Xu (1)
  • Heng Tao Shen (2)
  • Yafei Dai (1)
  • Bin Cui (1)
  • Xiaofang Zhou (2)

  1. Department of Computer Science and Technology, Peking University, Beijing, China
  2. School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Australia
