Collaborative Ranking and Profiling: Exploiting the Wisdom of Crowds in Tailored Web Search

  • Pascal Felber
  • Peter Kropf
  • Lorenzo Leonini
  • Toan Luu
  • Martin Rajman
  • Etienne Rivière
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6115)


Popular search engines essentially rely on information about the structure of the graph of linked elements to find the most relevant results for a given query. While this approach is satisfactory for popular interest domains or when the user expectations follow the main trend, it handles ambiguous queries poorly, that is, queries whose answers span several different domains. Elements pertaining to an implicitly targeted interest domain with low popularity are usually ranked lower than the user expects. This is a consequence of the poor usage of user-centric information in search engines. Leveraging semantic information can help avoid such situations by proposing complementary results that are carefully tailored to match user interests. This paper proposes a collaborative search companion system, CoFeed, that collects user search queries and access feedback to build user- and document-centric profiling information. Over time, the system constructs ranked collections of elements that maintain the required information diversity and enhance the user search experience by presenting additional results tailored to the user interest space. This collaborative search companion requires a supporting architecture adapted to large user populations generating high request loads. To that end, it integrates mechanisms for ensuring scalability and load balancing of the service under varying loads and user interest distributions. Experiments with a deployed prototype highlight the efficiency of the system by analyzing the improvement in search relevance, computational cost, scalability, and load balance.


Keywords: Load Balance, Distributed Hash Table, User Interest, Load Balance Mechanism, Ambiguous Query
These keywords were added by machine and not by the authors.



Copyright information

© IFIP International Federation for Information Processing 2010

Authors and Affiliations

  • Pascal Felber (1)
  • Peter Kropf (1)
  • Lorenzo Leonini (1)
  • Toan Luu (2)
  • Martin Rajman (2)
  • Etienne Rivière (1)

  1. University of Neuchâtel, Switzerland
  2. EPFL, Switzerland
