SP2Bench: A SPARQL Performance Benchmark

  • Michael Schmidt
  • Thomas Hornung
  • Michael Meier
  • Christoph Pinkel
  • Georg Lausen
Chapter

Abstract

A meaningful analysis and comparison of existing storage schemes for RDF data and of evaluation approaches for SPARQL queries requires a comprehensive and universal benchmark platform. We present SP2Bench, a publicly available, language-specific performance benchmark for the SPARQL query language. SP2Bench is set in the DBLP scenario and comprises a data generator for creating arbitrarily large DBLP-like documents and a set of carefully designed benchmark queries. The generated documents mirror key characteristics and social-world distributions encountered in the original DBLP data set, while the queries implement meaningful requests on top of this data, covering a variety of SPARQL operator constellations and RDF access patterns. In this chapter, we discuss requirements and desiderata for SPARQL benchmarks and present the SP2Bench framework, including its data generator, benchmark queries, and performance metrics.
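
To make the idea of a generator for DBLP-like documents concrete, the following is a minimal sketch and not the SP2Bench generator itself. It assumes a hypothetical `http://example.org/` namespace, a hypothetical function name `generate_triples`, and a simple skewed (1/rank) choice of authors as a stand-in for the real-world distributions mentioned above; the actual generator models the DBLP data in far more detail.

```python
import random

# Well-known vocabulary URIs (RDF and Dublin Core).
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# Hypothetical namespace used only for this illustration.
BASE = "http://example.org/"


def generate_triples(num_articles, seed=0):
    """Yield N-Triples lines for a toy DBLP-like document.

    Each article gets a type, a title, and one creator. Authors are
    drawn with weight 1/rank, so a few authors write many articles
    (an assumed, simplified skew; not the SP2Bench model).
    """
    rng = random.Random(seed)  # seeded for reproducible documents
    num_authors = max(2, num_articles // 2)
    authors = [f"{BASE}author/{i}" for i in range(1, num_authors + 1)]
    weights = [1.0 / rank for rank in range(1, num_authors + 1)]

    for n in range(1, num_articles + 1):
        article = f"<{BASE}article/{n}>"
        creator = rng.choices(authors, weights=weights, k=1)[0]
        yield f"{article} <{RDF_TYPE}> <{BASE}Article> ."
        yield f'{article} <{DC_TITLE}> "Title {n}" .'
        yield f"{article} <{DC_CREATOR}> <{creator}> ."
```

Calling `list(generate_triples(1000))` produces three triples per article, so the document size scales linearly with the requested number of articles, and a fixed seed yields the same document on every run.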

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Michael Schmidt
    • 1
  • Thomas Hornung
    • 1
  • Michael Meier
    • 1
  • Christoph Pinkel
    • 2
  • Georg Lausen
    • 1
  1. Albert-Ludwigs-Universität Freiburg, Freiburg, Germany
  2. MTC Infomedia OHG, Saarbrücken, Germany
