MinervaDL: An Architecture for Information Retrieval and Filtering in Distributed Digital Libraries

  • Christian Zimmer
  • Christos Tryfonopoulos
  • Gerhard Weikum
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4675)

Abstract

We present MinervaDL, a digital library architecture that supports approximate information retrieval and filtering functionality under a single unifying framework. The architecture of MinervaDL is based on the peer-to-peer search engine Minerva, and is able to handle huge amounts of data provided by digital libraries in a distributed and self-organizing way. The two-tier architecture and the use of a distributed hash table as the routing substrate provide an infrastructure for creating large networks of digital libraries with minimal administration costs. We discuss the main components of this architecture, present the protocols that regulate node interactions, and experimentally evaluate our approach.
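
To make the idea of a distributed hash table as routing substrate concrete, the following is a minimal sketch of a generic hash ring, not the MinervaDL protocols themselves, which the abstract does not spell out. Everything here is an assumption made for illustration: the names `DirectoryRing`, `ring_id`, `publish` and `lookup`, the 16-bit identifier space, and the notion that libraries post per-term document counts to the node responsible for a term.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 16  # small identifier space, chosen only for illustration


def ring_id(key: str) -> int:
    """Hash a string onto the identifier ring [0, 2**ID_BITS)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)


class DirectoryRing:
    """Hypothetical upper tier: directory nodes placed on a hash ring."""

    def __init__(self, node_names):
        # Sort nodes by ring position so successor lookup is a binary search.
        self._nodes = sorted((ring_id(n), n) for n in node_names)
        self._ids = [i for i, _ in self._nodes]
        self._postings = {n: {} for n in node_names}

    def responsible_node(self, term: str) -> str:
        """Return the first node whose id is >= the term's id, wrapping around."""
        idx = bisect_left(self._ids, ring_id(term)) % len(self._ids)
        return self._nodes[idx][1]

    def publish(self, library: str, term: str, doc_count: int) -> None:
        """A library (assumed lower tier) posts a per-term statistic."""
        node = self.responsible_node(term)
        self._postings[node].setdefault(term, {})[library] = doc_count

    def lookup(self, term: str) -> dict:
        """Fetch the statistics that libraries published for a term."""
        node = self.responsible_node(term)
        return dict(self._postings[node].get(term, {}))


# Usage: two hypothetical libraries publish statistics for the same term.
ring = DirectoryRing(["dir-a", "dir-b", "dir-c"])
ring.publish("library-1", "retrieval", 120)
ring.publish("library-2", "retrieval", 45)
print(ring.responsible_node("retrieval"), ring.lookup("retrieval"))
```

Because every participant hashes a term to the same ring position, a publisher and a later querier reach the same node without any central index, which is one plausible reading of how such a substrate keeps administration costs low.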

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Christian Zimmer (1)
  • Christos Tryfonopoulos (1)
  • Gerhard Weikum (1)
  1. Department for Databases and Information Systems, Max-Planck-Institute for Informatics, 66123 Saarbrücken, Germany
