DBpedia SPARQL Benchmark – Performance Assessment with Real Queries on Real Data

Morsey, Mohamed; Lehmann, Jens; Auer, Sören; Ngonga Ngomo, Axel-Cyrille

doi:10.1007/978-3-642-25073-6_29

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7031)

Included in the following conference series:

International Semantic Web Conference

Abstract

Triple stores are the backbone of increasingly many Data Web applications. It is thus evident that the performance of those stores is mission critical for individual projects as well as for data integration on the Data Web in general. Consequently, it is of central importance during the implementation of any of these applications to have a clear picture of the weaknesses and strengths of current triple store implementations. In this paper, we propose a generic SPARQL benchmark creation procedure, which we apply to the DBpedia knowledge base. Previous approaches often compared relational and triple stores and, thus, settled on measuring performance against a relational database which had been converted to RDF by using SQL-like queries. In contrast to those approaches, our benchmark is based on queries that were actually issued by humans and applications against existing RDF data not resembling a relational schema. Our generic procedure for benchmark creation is based on query-log mining, clustering and SPARQL feature analysis. We argue that a pure SPARQL benchmark is more useful to compare existing triple stores and provide results for the popular triple store implementations Virtuoso, Sesame, Jena-TDB, and BigOWLIM. The subsequent comparison of our results with other benchmark results indicates that the performance of triple stores is by far less homogeneous than suggested by previous benchmarks.

This work was supported by a grant from the European Union’s 7th Framework Programme provided for the project LOD2 (GA no. 257943).
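
The feature-analysis and grouping steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword list, the function names `sparql_features` and `group_by_features`, and the use of identical feature sets as a stand-in for clustering are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical feature list chosen for illustration; the paper's actual
# feature set is not given in the abstract.
FEATURES = ("OPTIONAL", "UNION", "FILTER", "DISTINCT")


def sparql_features(query):
    """Return the listed SPARQL keywords that occur in a query string.

    Naive by design: variables such as ?filter are skipped, but IRIs and
    string literals are not stripped before matching.
    """
    upper = query.upper()
    return frozenset(
        kw for kw in FEATURES
        # The lookbehind rejects keywords that are part of a variable
        # name (?x, $x) or of a longer identifier.
        if re.search(rf"(?<![?$\w]){kw}\b", upper)
    )


def group_by_features(queries):
    """Group queries that share exactly the same feature set.

    This is a crude stand-in for the clustering step, not the algorithm
    used in the paper.
    """
    groups = defaultdict(list)
    for query in queries:
        groups[sparql_features(query)].append(query)
    return dict(groups)


if __name__ == "__main__":
    log = [
        "SELECT DISTINCT ?s WHERE { ?s ?p ?o FILTER(?o > 3) }",
        "SELECT ?s WHERE { { ?s a ?x } UNION { ?s a ?y } }",
    ]
    for features, members in group_by_features(log).items():
        print(sorted(features), len(members))
```

Grouping by exact feature set keeps the sketch short. A real query log would more plausibly call for a similarity measure between queries and a proper clustering algorithm.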

Author information

Authors and Affiliations

Department of Computer Science, University of Leipzig, Johannisgasse 26, 04103, Leipzig, Germany
Mohamed Morsey, Jens Lehmann, Sören Auer & Axel-Cyrille Ngonga Ngomo

Editor information

Editors and Affiliations

Computer Science Dept., VU University Amsterdam, De Boelelaan 1081, 1081 HV, Amsterdam, The Netherlands
Lora Aroyo
IBM Research, 10598, Yorktown Heights, NY, USA
Chris Welty
The Open University, Walton Hall, MK7 6AA, Milton Keynes, UK
Harith Alani
Google, USA
Jamie Taylor
University of Zurich, Binzmuehlestrasse 14, 8050, Zurich, Switzerland
Abraham Bernstein
Massachusetts Institute of Technology, 32 Vassar Street, 02139, Cambridge, MA, USA
Lalana Kagal
Stanford University, 94305, Stanford, CA, USA
Natasha Noy
Linköping University, 581 83, Linköping, Sweden
Eva Blomqvist

About this paper

Cite this paper

Morsey, M., Lehmann, J., Auer, S., Ngonga Ngomo, A.-C. (2011). DBpedia SPARQL Benchmark – Performance Assessment with Real Queries on Real Data. In: Aroyo, L., et al. (eds.) The Semantic Web – ISWC 2011. Lecture Notes in Computer Science, vol 7031. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25073-6_29

DOI: https://doi.org/10.1007/978-3-642-25073-6_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25072-9
Online ISBN: 978-3-642-25073-6
eBook Packages: Computer Science, Computer Science (R0)
