C-SPARQL Extension for Sampling RDF Graphs Streams

Advances in Knowledge Discovery and Management

Part of the book series: Studies in Computational Intelligence (SCI, volume 732)

Abstract

Our daily use of the Internet and related technologies continuously generates large amounts of heterogeneous data flows. Several RDF Stream Processing (RSP) systems have been proposed; they combine the advantages of semantic web technologies with those of traditional data flow management systems. Examples include C-SPARQL, CQELS, SPARQL\(_{stream}\), EP-SPARQL, and Sparkwave, all of which extend the semantic query language SPARQL. Because storing and processing all of these streams is expensive, we propose a solution that reduces the load while preserving data semantics and optimizing processing. In this paper, we extend C-SPARQL to continuously generate samples over streams of RDF graphs. We add three sampling operators (UNIFORM, RESERVOIR and CHAIN) to the C-SPARQL query syntax. These operators have been implemented in Esper, C-SPARQL's data flow management module. Experiments evaluate the performance of our extension in terms of execution time and preservation of data semantics.
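To make the idea behind the RESERVOIR operator concrete, the following is a minimal Python sketch of classic reservoir sampling (Algorithm R, as described in Vitter 1985). It is not the authors' Esper implementation, and it does not reproduce the extended C-SPARQL syntax. The function name `reservoir_sample`, the `Triple` tuple representation, and the parameter names are assumptions made for illustration only.

```python
import random
from typing import Iterable, List, Optional, Tuple

# Hypothetical representation of a stream item: (subject, predicate, object).
# The chapter samples RDF graphs; any hashable item type works the same way.
Triple = Tuple[str, str, str]


def reservoir_sample(
    stream: Iterable[Triple],
    k: int,
    rng: Optional[random.Random] = None,
) -> List[Triple]:
    """Return a uniform random sample of at most k items from a stream.

    Sketch of Algorithm R: the stream is read once and only k items
    are held in memory at any time.
    """
    rng = rng or random.Random()
    reservoir: List[Triple] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item number i + 1 replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    triples = [(f"sensor{i}", "hasValue", str(i)) for i in range(1000)]
    print(reservoir_sample(triples, 5, random.Random(42)))
```

Because each incoming item is kept with probability k divided by the number of items seen so far, every item of the stream ends up in the sample with equal probability, and memory use stays bounded by k regardless of stream length.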

Notes

  1. http://www.espertech.com/esper/.

  2. https://jena.apache.org/.

  3. http://rdf4j.org/sesame/.

References

  • Anicic, D., Fodor, P., Rudolph, S., & Stojanovic, N. (2011a). EP-SPARQL: A unified language for event processing and stream reasoning. In Proceedings of the 20th international conference on World wide web (pp. 635–644). ACM.

  • Anicic, D., Fodor, P., Rudolph, S., Stuhmer, R., Stojanovic, N., & Studer, R. (2011b). ETALIS: Rule-based reasoning in event processing. Reasoning in Event-Based Distributed Systems, 347, 99.

  • Arasu, A., Babcock, B., Babu, S., Cieslewicz, J., Datar, M., Ito, K., et al. (2004a). STREAM: The Stanford data stream management system. Book chapter.

  • Arasu, A., Babu, S., & Widom, J. (2004b). CQL: A language for continuous queries over streams and relations. In Database Programming Languages (pp. 1–19). Springer.

  • Babcock, B., Datar, M., & Motwani, R. (2002). Sampling from a moving window over streaming data. In Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms (pp. 633–634). Society for Industrial and Applied Mathematics.

  • Barbieri, D. F., Braga, D., Ceri, S., & Grossniklaus, M. (2010). An execution environment for C-SPARQL queries. In Proceedings of the 13th International Conference on Extending Database Technology (pp. 441–452). ACM.

  • Bolles, A., Grawunder, M., & Jacobi, J. (2008). Streaming SPARQL - extending SPARQL to process data streams. In The semantic web: Research and applications (pp. 448–462).

  • Calbimonte, J.-P., Corcho, O., & Gray, A. J. (2010). Enabling ontology-based access to streaming data sources. In The Semantic Web–ISWC 2010 (pp. 96–111). Springer.

  • Cao, J., Zhang, W., & Tan, W. (2012). Dynamic control of data streaming and processing in a virtualized environment. IEEE Transactions on Automation Science and Engineering, 9(2), 365–376.

  • Cochran, W. G. (2007). Sampling techniques. Wiley.

  • Komazec, S., Cerri, D., & Fensel, D. (2012). Sparkwave: Continuous schema-enhanced pattern matching over RDF data streams. In Proceedings of the 6th ACM International Conference on Distributed Event-Based Systems (pp. 58–68).

  • Le-Phuoc, D., Dao-Tran, M., Parreira, J. X., & Hauswirth, M. (2011). A native and adaptive approach for unified processing of linked streams and linked data. In The Semantic Web–ISWC 2011 (pp. 370–388). Springer.

  • Forgy, C. L. (1982). Rete: A fast algorithm for the many pattern/many object pattern match problem. Artificial Intelligence, 19(1), 17–37.

  • Vijayakumar, S., Zhu, Q., & Agrawal, G. (2010). Dynamic resource provisioning for data streaming applications in a cloud environment. In Cloud Computing Technology and Science (CloudCom), 2010 IEEE Second International Conference on (pp. 441–448). IEEE.

  • Vitter, J. S. (1985). Random sampling with a reservoir. ACM Transactions on Mathematical Software (TOMS), 11(1), 37–57.

Acknowledgements

This work was performed under the FUI Waves project. This project aims to design and develop a distributed processing platform of massive data streams. The case study concerns the real-time monitoring of a drinking water distribution network.

Author information

Corresponding author

Correspondence to Amadou Fall Dia.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Dia, A.F., Kazi-Aoul, Z., Boly, A., Chabchoub, Y. (2018). C-SPARQL Extension for Sampling RDF Graphs Streams. In: Pinaud, B., Guillet, F., Cremilleux, B., de Runz, C. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 732. Springer, Cham. https://doi.org/10.1007/978-3-319-65406-5_2

  • DOI: https://doi.org/10.1007/978-3-319-65406-5_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-65405-8

  • Online ISBN: 978-3-319-65406-5

  • eBook Packages: Engineering, Engineering (R0)
