Approximate Information Filtering in Peer-to-Peer Networks

  • Christian Zimmer
  • Christos Tryfonopoulos
  • Klaus Berberich
  • Manolis Koubarakis
  • Gerhard Weikum
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5175)


Most approaches to information filtering taken so far rest on the assumption that notifications from every information producer may have to be delivered to subscribers. This exact publish/subscribe model creates an efficiency and scalability bottleneck, and might not even be desirable in certain applications. The work presented here puts forward MAPS, a novel approach that supports approximate information filtering in a peer-to-peer environment. In MAPS, a user subscribes to and monitors only carefully selected data sources, and receives notifications about interesting events from these sources only. Scalability is thus enhanced by trading recall for lower message traffic. We define the protocols of a peer-to-peer architecture designed specifically for approximate information filtering, and introduce new node selection strategies based on time series analysis techniques to improve data source selection. Our experimental evaluation shows that MAPS is scalable: it achieves high recall while monitoring only a few data sources.
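
The idea of choosing data sources by time series analysis can be illustrated with a minimal sketch. The abstract does not say which time series technique MAPS uses, so the code below assumes double exponential smoothing (Holt's linear method) as a stand-in. The function names `holt_forecast` and `select_sources`, the smoothing parameters, and the per-node publication counts are all hypothetical and are not taken from the paper. Each node's history of matching publications is extrapolated one step ahead, and only the k nodes with the highest forecast are monitored.

```python
from typing import Dict, List


def holt_forecast(series: List[float], alpha: float = 0.5, beta: float = 0.5) -> float:
    """One-step-ahead forecast with double exponential smoothing.

    Assumption: this is an illustrative technique, not necessarily the
    predictor used by MAPS. alpha smooths the level, beta smooths the trend.
    """
    if not series:
        return 0.0
    if len(series) == 1:
        return series[0]
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        # Blend the new observation with the previous extrapolation.
        level = alpha * value + (1 - alpha) * (level + trend)
        # Blend the observed change in level with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + trend


def select_sources(history: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k nodes with the highest forecast publication rate.

    history maps a (hypothetical) node id to its past counts of
    publications matching the subscription, oldest first.
    """
    scores = {node: holt_forecast(counts) for node, counts in history.items()}
    # Sort by descending score; break ties by node id for determinism.
    ranked = sorted(scores, key=lambda node: (-scores[node], node))
    return ranked[:k]


if __name__ == "__main__":
    # Invented example data: "b" published the most in the past but is declining.
    history = {
        "a": [1, 2, 3, 4],
        "b": [10, 8, 6, 4],
        "c": [3, 3, 3, 3],
    }
    print(select_sources(history, k=2))
```

In this invented example, node "b" has the largest total count but a falling trend, so a trend-aware forecast ranks it below the other two. Monitoring only the selected nodes is what trades some recall for lower message traffic.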


Keywords: Cost Ratio, Query Term, Distribute Hash Table, Resource Selection, Node Selection
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Christian Zimmer (1)
  • Christos Tryfonopoulos (1)
  • Klaus Berberich (1)
  • Manolis Koubarakis (2)
  • Gerhard Weikum (1)
  1. Max-Planck-Institute for Informatics, Saarbrücken, Germany
  2. National and Kapodistrian University of Athens, Greece
