ASPG: Generating OLAP Queries for SPARQL Benchmarking

Wang, Xin; Staab, Steffen; Tiropanis, Thanassis

doi:10.1007/978-3-319-50112-3_13

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10055)

Included in the following conference series:

Joint International Semantic Technology Conference

Abstract

The increasing use of data analytics on Linked Data requires SPARQL engines to execute Online Analytical Processing (OLAP) queries efficiently. While SPARQL 1.1 provides the basic constructs, further work on optimising OLAP queries is held back by the lack of benchmarks that mimic the data distributions found in Linked Data. Existing work on OLAP benchmarking for SPARQL has usually adopted queries and data from relational databases, which may not represent Linked Data well. We propose an approach that maps typical OLAP operations to SPARQL, and a tool named ASPG that automatically generates OLAP queries from real-world Linked Data. We evaluate ASPG by constructing a benchmark called DBOB from the online DBpedia endpoint, and use DBOB to measure the performance of the Virtuoso engine.

Notes

1. http://virtuoso.openlinksw.com/.
2. Mapping an OLAP data point to a subject is just one intuitive approach; an OLAP data point can be mapped to any RDF term.
3. It is enough to GROUP BY a subset of the variables that uniquely identifies an entity. Variables excluded from GROUP BY can still be selected using the SAMPLE aggregate.
4. SPARQL 1.1 does not provide a way to define new functions, so cat should be read as a macro in Query 3.
5. This requires calculating the position of an item in a linked list and identifying the maximum item in a set. Refer to https://git.io/vwP0t for more details.
6. The complexity of a BGP is also affected by the number of intermediate results in each join. However, the latter requires detailed statistics to estimate, which are not always available.
7. http://wifo5-03.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark/spec/BenchmarkRules/index.html#datagenerator.
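
As a rough illustration of Note 3, the Python sketch below assembles a SPARQL 1.1 aggregate query in which only the identifying variables appear in GROUP BY, while the remaining variables are projected through SAMPLE. The function name build_group_by_query, the ex: prefix, and the city/name/population example are hypothetical and are not taken from the paper or from ASPG. The SUM aggregate is also chosen purely for illustration.

```python
def build_group_by_query(key_vars, sampled_vars, measure_var, patterns, prefixes=None):
    """Return a SPARQL 1.1 aggregate query as a string.

    key_vars     -- variables assumed to uniquely identify an entity (GROUP BY keys)
    sampled_vars -- variables assumed to be determined by the keys; projected via SAMPLE
    measure_var  -- variable aggregated with SUM (an illustrative choice)
    patterns     -- triple patterns for the WHERE clause, without the trailing dot
    prefixes     -- optional mapping of prefix name to IRI
    """
    lines = [f"PREFIX {name}: <{iri}>" for name, iri in (prefixes or {}).items()]

    # Keys are projected directly; every other variable needs an aggregate.
    projection = [f"?{v}" for v in key_vars]
    # The AS target must be a fresh variable, hence the _sample suffix.
    projection += [f"(SAMPLE(?{v}) AS ?{v}_sample)" for v in sampled_vars]
    projection.append(f"(SUM(?{measure_var}) AS ?total)")

    lines.append("SELECT " + " ".join(projection))
    lines.append("WHERE {")
    lines += [f"  {p} ." for p in patterns]
    lines.append("}")
    # Only the identifying subset of variables is listed in GROUP BY.
    lines.append("GROUP BY " + " ".join(f"?{v}" for v in key_vars))
    return "\n".join(lines)


# Hypothetical example: ?city identifies the entity, ?name is carried along via SAMPLE.
example_query = build_group_by_query(
    key_vars=["city"],
    sampled_vars=["name"],
    measure_var="population",
    patterns=["?city ex:name ?name", "?city ex:population ?population"],
    prefixes={"ex": "http://example.org/"},
)
print(example_query)
```

Grouping by ?city alone keeps the GROUP BY clause short. The sketch simply assumes that ?name is determined by ?city; if that assumption does not hold, SAMPLE returns an arbitrary one of the bound values.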

Author information

Authors and Affiliations

Web and Internet Science Group, University of Southampton, Southampton, UK
Xin Wang, Steffen Staab & Thanassis Tiropanis
Institute for Web Science and Technology, University of Koblenz-Landau, Mainz, Germany
Steffen Staab

Corresponding author

Correspondence to Xin Wang.

Editor information

Editors and Affiliations

Information Technology, Monash University, Melbourne, Victoria, Australia
Yuan-Fang Li
Computer Science and Technology, Nanjing University, Nanjing, China
Wei Hu
Computer Science, National University of Singapore, Singapore, Singapore
Jin Song Dong
University of Huddersfield, Huddersfield, United Kingdom
Grigoris Antoniou
Information and Communication Technology, Griffith University, Brisbane, Queensland, Australia
Zhe Wang
ISTD, Singapore University of Technology and Design, Singapore, Singapore
Jun Sun
Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
Yang Liu

About this paper

Cite this paper

Wang, X., Staab, S., Tiropanis, T. (2016). ASPG: Generating OLAP Queries for SPARQL Benchmarking. In: Li, YF., et al. Semantic Technology. JIST 2016. Lecture Notes in Computer Science, vol 10055. Springer, Cham. https://doi.org/10.1007/978-3-319-50112-3_13

DOI: https://doi.org/10.1007/978-3-319-50112-3_13
Published: 27 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50111-6
Online ISBN: 978-3-319-50112-3
eBook Packages: Computer Science, Computer Science (R0)