Sampling Semantic Data Stream: Resolving Overload and Limited Storage Issues

  • Naman Jain
  • Manuel Pozo
  • Raja Chiky
  • Zakia Kazi-Aoul
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 285)


Semantic Web technologies are increasingly used to exploit relations between data. In addition, real-time sources such as social networks, sensors, cameras and weather services continuously generate data. As a result, both the data and the links between them are becoming extremely large. Such a huge quantity of data needs to be analyzed, processed and, if necessary, stored. In this paper, we propose sampling operators that drop RDF triples from the incoming data, thereby reducing the load on existing engines such as CQELS and C-SPARQL, which are able to deal with big and linked data. Hence, the processing effort, the processing time and the required storage space are reduced considerably. We propose Uniform Random Sampling, Reservoir Sampling and Chain Sampling operators, which may be implemented depending on the application.
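To make the idea of a sampling operator concrete, here is a minimal Python sketch of two of the three named techniques. It is not the authors' implementation. It assumes that an RDF triple can be modelled as a plain `(subject, predicate, object)` tuple, that "uniform random sampling" means keeping each triple independently with a fixed probability, and that reservoir sampling follows the classic single-pass scheme (Algorithm R). The names `Triple`, `uniform_sample` and `reservoir_sample` are hypothetical, and Chain Sampling is omitted because it needs additional sliding-window bookkeeping.

```python
import random
from typing import Iterable, Iterator, List, Tuple

# Assumption: a triple is modelled as a (subject, predicate, object) tuple.
Triple = Tuple[str, str, str]


def uniform_sample(stream: Iterable[Triple], rate: float,
                   rng: random.Random) -> Iterator[Triple]:
    """Keep each incoming triple independently with probability `rate`."""
    for triple in stream:
        # rng.random() is in [0.0, 1.0), so rate=1.0 keeps everything
        # and rate=0.0 drops everything.
        if rng.random() < rate:
            yield triple


def reservoir_sample(stream: Iterable[Triple], k: int,
                     rng: random.Random) -> List[Triple]:
    """Return a uniform sample of at most k triples in one pass (Algorithm R)."""
    reservoir: List[Triple] = []
    for i, triple in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k triples.
            reservoir.append(triple)
        else:
            # Pick a slot in [0, i]; the new triple replaces an old one
            # only if the slot falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = triple
    return reservoir


if __name__ == "__main__":
    # Hypothetical sensor stream, used only for illustration.
    stream = [(f"sensor{i}", "hasValue", str(i)) for i in range(1000)]
    rng = random.Random(42)
    kept = list(uniform_sample(stream, 0.1, rng))
    fixed = reservoir_sample(stream, 10, rng)
    print(len(kept), len(fixed))
```

The two operators trade off differently: the uniform operator bounds the expected fraction of triples forwarded to the engine, while the reservoir bounds the memory used regardless of how long the stream runs.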


Big data · Linked data-stream · Processing time · Sampling





Copyright information

© Springer Science+Business Media Singapore 2014

Authors and Affiliations

  • Naman Jain (1)
  • Manuel Pozo (2)
  • Raja Chiky (2)
  • Zakia Kazi-Aoul (2)
  1. VIT University, Vellore, India
  2. ISEP-LISITE, Paris, France
