SRBench: A Streaming RDF/SPARQL Benchmark

  • Ying Zhang
  • Pham Minh Duc
  • Oscar Corcho
  • Jean-Paul Calbimonte
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7649)

Abstract

We introduce SRBench, a general-purpose benchmark primarily designed for streaming RDF/SPARQL engines and based entirely on real-world data sets from the Linked Open Data cloud. As the volume of streaming data grows faster than the tools for gaining knowledge from it, researchers have sought solutions in which Semantic Web technologies are adapted and extended for publishing, sharing, analysing and understanding streaming data. To help researchers and users compare streaming RDF/SPARQL (strRS) engines in a standardised application scenario, we have designed SRBench, with which one can assess the ability of a strRS engine to cope with a broad range of use cases typically encountered in real-world scenarios. The data sets used in the benchmark have been carefully chosen so that they represent a realistic and relevant usage of streaming data. The benchmark defines a concise yet comprehensive set of queries that cover the major aspects of strRS processing. Finally, our work is complemented with a functional evaluation of three representative strRS engines: SPARQLStream, C-SPARQL and CQELS. The presented results are meant to give a first baseline and illustrate the state of the art.

Keywords

Streaming Data, SPARQL Query, Continuous Query, Triple Pattern, Link Open Data Cloud

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ying Zhang (1)
  • Pham Minh Duc (1)
  • Oscar Corcho (2)
  • Jean-Paul Calbimonte (2)

  1. Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
  2. Universidad Politécnica de Madrid, Spain
