POL: A Pattern Oriented Load-Shedding for Semantic Data Stream Processing

  • Fethi Belghaouti
  • Amel Bouzeghoub (Email author)
  • Zakia Kazi-Aoul
  • Raja Chiky
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10042)


Large volumes of data are now generated and published at very high velocity, producing heterogeneous data streams. This has led researchers to propose new systems, named RDF Stream Processors (RSP), to handle this kind of stream. Unfortunately, these systems become unreliable when their maximum supported input rate is reached, especially in environments with limited system resources. Recent work addresses this problem by reducing the volume of RDF data streams through compression or load-shedding techniques, most of which are probabilistic. In this paper we propose POL, a Pattern Oriented approach to Load-shedding data from RDF streams that is deterministic rather than probabilistic. Running as a pre-processing task in a single pass, the approach extracts from the stream exactly the semantic data that is needed. Experiments conducted on publicly available datasets demonstrate the effectiveness of our approach.
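To make the idea of deterministic, pattern-driven load-shedding concrete, the following is a minimal sketch and not the authors' algorithm, whose details are not given here. It assumes that the "needed" data is described by simple triple patterns in which `None` stands for a variable; the names `matches` and `shed` are hypothetical. Each triple is inspected once and kept only if it matches at least one pattern, so the outcome does not depend on random sampling.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

# An RDF triple as (subject, predicate, object) strings.
Triple = Tuple[str, str, str]
# A triple pattern; None in a position acts as a wildcard (assumption).
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def matches(triple: Triple, pattern: Pattern) -> bool:
    """Return True if every non-wildcard position of the pattern equals the triple's term."""
    return all(p is None or p == t for t, p in zip(triple, pattern))


def shed(stream: Iterable[Triple], patterns: List[Pattern]) -> Iterator[Triple]:
    """Single pass over the stream: keep a triple only if some pattern matches it."""
    for triple in stream:
        if any(matches(triple, p) for p in patterns):
            yield triple


if __name__ == "__main__":
    # Hypothetical sensor readings used only for illustration.
    stream = [
        ("sensor1", "hasTemperature", "21"),
        ("sensor1", "hasHumidity", "40"),
        ("sensor2", "hasTemperature", "19"),
    ]
    patterns = [(None, "hasTemperature", None)]
    for kept in shed(stream, patterns):
        print(kept)
```

Because `shed` is a generator, it consumes the stream lazily and holds no more than one triple at a time, which matches the single-pass, pre-processing role described above. Matching whole graph patterns rather than isolated triples would need additional state and is outside this sketch.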


Keywords: BigData, Semantic data stream, Graph patterns detection, Load-shedding



This work is partially funded by the French National Research Agency (ANR) project CAIR (ANR-14-CE23-0006).



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Fethi Belghaouti (1)
  • Amel Bouzeghoub (1, Email author)
  • Zakia Kazi-Aoul (2)
  • Raja Chiky (2)
  1. SAMOVAR, Telecom SudParis, CNRS, Universite Paris-Saclay, Evry Cedex, France
  2. Institut Superieur d’Electronique de Paris, Paris, France
