DRSS: Distributed RDF SPARQL Streaming

  • Amadou Fall Dia
  • Zakia Kazi-Aoul
  • Aliou Boly
  • Elisabeth Métais
Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 722)

Abstract

In this work, we present DRSS, a distributed and scalable engine for RDF stream processing. DRSS proposes a new query syntax for continuous querying of RDF data streams. The system includes, among others, three efficient algorithms for (1) rewriting continuous queries that share common sub-structures, (2) partitioning SPARQL queries across multiple compute nodes according to an efficient distribution strategy, and (3) query-based data distribution, which lets sub-queries be processed locally and minimizes the data exchanged across nodes. Our system combines the processing of real-time data from multiple sources with the processing of stored RDF data. DRSS and all its algorithms are implemented on the real-time data processing platform Storm, which provides mechanisms for parallelizing query operators. DRSS is evaluated on a real dataset containing up to 1 million RDF graphs. The experimental results confirm the scalability and effectiveness of our system.
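
The idea behind query-based data distribution can be illustrated with a minimal sketch. It is not the DRSS algorithm itself, whose details the abstract does not give; it only assumes that each sub-query is a set of triple patterns, that each sub-query has been placed on one node, and that an incoming triple is forwarded only to nodes whose patterns it can match. The sub-query names, node names, and `ex:` predicates are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sub-queries: each is a list of (subject, predicate, object)
# triple patterns, where None stands for a variable.
SUB_QUERIES = {
    "q_flow": [(None, "ex:hasFlow", None)],
    "q_pressure": [(None, "ex:hasPressure", None), (None, "ex:locatedIn", None)],
}

# Hypothetical placement of sub-queries on nodes (assumed to be the output
# of a separate query-partitioning step).
PLACEMENT = {"q_flow": "node-1", "q_pressure": "node-2"}


def matches(pattern, triple):
    """A pattern matches when every non-variable position equals the triple's term."""
    return all(p is None or p == t for p, t in zip(pattern, triple))


def route(triple):
    """Return the sorted list of nodes hosting a sub-query that can use this triple."""
    nodes = {
        PLACEMENT[name]
        for name, patterns in SUB_QUERIES.items()
        if any(matches(p, triple) for p in patterns)
    }
    return sorted(nodes)


def distribute(stream):
    """Group the triples of a stream by destination node; unmatched triples are dropped."""
    batches = defaultdict(list)
    for triple in stream:
        for node in route(triple):
            batches[node].append(triple)
    return dict(batches)
```

Under these assumptions, a triple that no sub-query can use is never sent over the network, and each node receives only the data its local sub-queries need, which is the effect the abstract attributes to the third algorithm.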

Keywords

DRSS, RDF graph streams, Distributed SPARQL

Notes

Acknowledgements

This work was performed under the FUI Waves project, which aims to design and develop a distributed platform for processing massive data streams. The case study concerns the real-time monitoring of a drinking water distribution network.

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Amadou Fall Dia (1)
  • Zakia Kazi-Aoul (1)
  • Aliou Boly (2)
  • Elisabeth Métais (3)

  1. LISITE Lab, ISEP, Paris, France
  2. LID Lab, UCAD, Dakar-Fann, Senegal
  3. CEDRIC Lab, CNAM, Paris, France