Abstract
Our daily use of the Internet and related technologies continuously generates large amounts of heterogeneous data streams. Several RDF Stream Processing (RSP) systems have been proposed; they combine the advantages of semantic web technologies with those of traditional data stream management systems. Examples include C-SPARQL, CQELS, SPARQL\(_{stream}\), EP-SPARQL, and Sparkwave, all of which extend the semantic query language SPARQL. Because storing and processing all of these streams is expensive, we propose a solution that reduces the load while preserving data semantics and optimizing processing. In this paper, we extend C-SPARQL to continuously generate samples over RDF graphs. We add three sampling operators (UNIFORM, RESERVOIR and CHAIN) to the C-SPARQL query syntax. These operators have been implemented in Esper, C-SPARQL's data stream management module. Experiments show the performance of our extension in terms of execution time and preservation of data semantics.
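To make the idea behind the RESERVOIR operator concrete, the following is a minimal Python sketch of classic reservoir sampling (Algorithm R from Vitter, 1985) over a finite stream. It is an illustration only and not the authors' Esper implementation: the function name `reservoir_sample`, the `Triple` alias, and the `ex:` identifiers are hypothetical, and individual triples stand in for the RDF graphs that the chapter samples.

```python
import random
from typing import Iterable, List, Optional, Tuple

# Hypothetical stand-in for a stream item: one (subject, predicate, object) triple.
Triple = Tuple[str, str, str]


def reservoir_sample(
    stream: Iterable[Triple], k: int, rng: Optional[random.Random] = None
) -> List[Triple]:
    """Return a uniform random sample of at most k items from the stream.

    Uses O(k) memory and a single pass (Algorithm R).
    """
    rng = rng or random.Random()
    reservoir: List[Triple] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Usage: sample 5 items from 100 made-up sensor readings.
triples = [(f"ex:sensor{i}", "ex:hasFlow", str(i)) for i in range(100)]
sample = reservoir_sample(triples, 5, random.Random(42))
```

After the whole stream has been read, every item has the same probability of being in the sample, which is why this scheme suits streams whose length is not known in advance.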
Acknowledgements
This work was performed under the FUI Waves project, which aims to design and develop a distributed platform for processing massive data streams. The case study concerns the real-time monitoring of a drinking water distribution network.
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Dia, A.F., Kazi-Aoul, Z., Boly, A., Chabchoub, Y. (2018). C-SPARQL Extension for Sampling RDF Graphs Streams. In: Pinaud, B., Guillet, F., Cremilleux, B., de Runz, C. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 732. Springer, Cham. https://doi.org/10.1007/978-3-319-65406-5_2
DOI: https://doi.org/10.1007/978-3-319-65406-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65405-8
Online ISBN: 978-3-319-65406-5
eBook Packages: Engineering, Engineering (R0)