RoSeS: A Continuous Content-Based Query Engine for RSS Feeds

  • Jordi Creus Tomàs
  • Bernd Amann
  • Nicolas Travers
  • Dan Vodislav
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6861)

Abstract

In this paper we present RoSeS (Really Open Simple and Efficient Syndication), a generic framework for content-based RSS feed querying and aggregation. RoSeS takes a data-centric approach, combining standard database concepts such as declarative query languages, views, and multi-query optimization. Users create personalized feeds by defining and composing content-based filtering and aggregation queries on collections of RSS feeds. Publishing such a query corresponds to defining a view, which can then be used to build new queries and feeds. This naturally reflects the publish-subscribe nature of RSS applications. The contributions presented in this paper are a declarative RSS feed aggregation language, an extensible stream algebra for building efficient continuous multi-query execution plans for RSS aggregation views, a multi-query optimization strategy for these plans, and a running prototype based on a multi-threaded asynchronous execution engine.
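
The idea of composing filtering and aggregation queries into reusable views can be illustrated with a minimal Python sketch. It is not the RoSeS language, algebra, or engine: the names `Item`, `source`, `union`, `where`, and `contains` are hypothetical, the feeds are small in-memory lists, and evaluation is on demand rather than continuous.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Item:
    """A simplified feed item (hypothetical; real RSS items carry more fields)."""
    feed: str
    title: str
    description: str


# A feed, query, or published view is modelled as a callable that yields items.
Feed = Callable[[], Iterator[Item]]
Predicate = Callable[[Item], bool]


def source(items: Iterable[Item]) -> Feed:
    """Wrap a fixed collection of items as a feed (stands in for a fetched RSS feed)."""
    stored = list(items)
    return lambda: iter(stored)


def union(*feeds: Feed) -> Feed:
    """Aggregate several feeds into one by concatenating their items."""
    def run() -> Iterator[Item]:
        for feed in feeds:
            yield from feed()
    return run


def where(feed: Feed, predicate: Predicate) -> Feed:
    """Content-based filter: keep only the items that satisfy the predicate."""
    return lambda: (item for item in feed() if predicate(item))


def contains(word: str) -> Predicate:
    """Case-insensitive keyword test on an item's title and description."""
    needle = word.lower()
    return lambda item: (needle in item.title.lower()
                         or needle in item.description.lower())


news = source([
    Item("news", "Database conference opens", "VLDB starts today"),
    Item("news", "Football results", "Weekend scores"),
])
blog = source([
    Item("blog", "Stream processing notes", "Continuous queries over a database"),
])

# "Publishing" a query yields a view that later queries can build on.
db_view = where(union(news, blog), contains("database"))
db_stream_view = where(db_view, contains("stream"))

db_titles = [item.title for item in db_view()]
stream_titles = [item.title for item in db_stream_view()]
```

Because a view has the same type as a feed, `db_stream_view` is defined on top of `db_view` exactly as `db_view` is defined on top of the source feeds. A continuous engine would push newly arriving items through such a plan instead of re-evaluating it on request.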



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Jordi Creus Tomàs (1)
  • Bernd Amann (1)
  • Nicolas Travers (2)
  • Dan Vodislav (3)

  1. LIP6, CNRS - Université Pierre et Marie Curie, Paris, France
  2. Cedric/CNAM - Conservatoire National des Arts et Métiers, Paris, France
  3. ETIS, CNRS - University of Cergy-Pontoise, Cergy, France
