
Exploiting Emergent Schemas to Make RDF Systems More Efficient

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9981)

Abstract

We build on our earlier finding that more than 95% of the triples in actual RDF triple graphs have a remarkably tabular structure, whose schema does not necessarily follow from explicit metadata such as ontologies, but which an RDF store can automatically derive by looking at the data using so-called “emergent schema” detection techniques. In this paper we investigate how computers, and in particular RDF stores, can take advantage of this emergent schema to store RDF data more compactly and to optimize and execute SPARQL queries more efficiently. To this end, we contribute techniques for efficient emergent-schema-aware RDF storage and new query operator algorithms for emergent-schema-aware scans and joins. In all, these techniques allow RDF schema processors to fully catch up with relational database techniques in terms of rich physical database design options and efficiency, without requiring a rigid upfront schema structure definition.
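To give a rough intuition for what "tabular structure" in RDF data can mean, the following minimal sketch groups subjects by the exact set of properties they use and lays each group out as a table. It is an illustrative simplification only and not the detection technique of the paper, which the abstract does not describe; the function name `emergent_tables`, the `ex:` identifiers, and the assumption that every property is single-valued are all hypothetical.

```python
from collections import defaultdict


def emergent_tables(triples):
    """Group subjects that share the same property set into one table.

    Hypothetical helper for illustration; assumes each (subject, property)
    pair has at most one object, which real RDF data does not guarantee.
    """
    props_by_subject = defaultdict(dict)
    for s, p, o in triples:
        props_by_subject[s][p] = o

    tables = defaultdict(list)
    for s, props in props_by_subject.items():
        columns = tuple(sorted(props))  # the property set acts as the schema
        row = (s,) + tuple(props[c] for c in columns)
        tables[columns].append(row)
    return dict(tables)


# Made-up sample data: two "book-like" subjects and one "author-like" subject.
triples = [
    ("ex:b1", "ex:title", "Dune"),
    ("ex:b1", "ex:author", "ex:a1"),
    ("ex:b2", "ex:title", "Emma"),
    ("ex:b2", "ex:author", "ex:a2"),
    ("ex:a1", "ex:name", "Herbert"),
]
tables = emergent_tables(triples)
for columns, rows in tables.items():
    print(columns, rows)
```

In this toy input the two book-like subjects land in one two-column table and the author-like subject in another, without any ontology being consulted. Real data would additionally need handling for multi-valued properties and for subjects with slightly differing property sets, which this sketch ignores.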

Keywords

Emergent Strategy SPARQL Query Optimization Emergency Table Triple Patterns MonetDB 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. CWI, Amsterdam, The Netherlands
