
Selectivity Estimation for SPARQL Triple Patterns with Shape Expressions

  • Abdullah Abbas
  • Pierre Genevès
  • Cécile Roisin
  • Nabil Layaïda
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10845)

Abstract

We optimize the evaluation of conjunctive SPARQL queries on big RDF graphs by taking advantage of ShEx schema constraints. Our optimization is based on computing ranks for the triple patterns of a query, which indicate their order of execution. We first define a set of well-formed ShEx schemas that possess interesting characteristics for SPARQL query optimization. We then define our optimization method, which exploits information extracted from a ShEx schema. Our experiments show the advantages of applying our optimization on top of an existing state-of-the-art query evaluation system.
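The abstract does not spell out how ranks are computed, so the following is only a minimal sketch of the general idea of rank-based ordering of triple patterns, not the authors' method. The per-predicate bounds, the divide-by-ten rule for constant terms, and the names `TriplePattern`, `rank`, and `order_patterns` are all assumptions made for illustration; in a schema-driven setting such bounds might be derived from the cardinality constraints of a ShEx schema.

```python
from typing import Dict, List, NamedTuple


class TriplePattern(NamedTuple):
    subject: str
    predicate: str
    obj: str


def is_variable(term: str) -> bool:
    """SPARQL variables start with '?'."""
    return term.startswith("?")


# Hypothetical upper bounds on the number of triples per predicate.
# These numbers are invented for the example, not taken from the paper.
ASSUMED_PREDICATE_BOUNDS: Dict[str, int] = {
    "foaf:name": 1000,
    "ex:worksFor": 200,
    "ex:locatedIn": 20,
}


def rank(pattern: TriplePattern, bounds: Dict[str, int],
         default: int = 10**6) -> float:
    """Lower rank = assumed more selective = evaluated earlier."""
    if is_variable(pattern.predicate):
        estimate = float(default)
    else:
        estimate = float(bounds.get(pattern.predicate, default))
    # Assumed heuristic: each constant subject or object narrows the
    # estimate by a factor of ten.
    for term in (pattern.subject, pattern.obj):
        if not is_variable(term):
            estimate /= 10
    return estimate


def order_patterns(patterns: List[TriplePattern],
                   bounds: Dict[str, int]) -> List[TriplePattern]:
    """Sort triple patterns by ascending rank (stable for ties)."""
    return sorted(patterns, key=lambda p: rank(p, bounds))


query = [
    TriplePattern("?p", "foaf:name", "?n"),
    TriplePattern("?p", "ex:worksFor", "?c"),
    TriplePattern("?c", "ex:locatedIn", "ex:Grenoble"),
]

for tp in order_patterns(query, ASSUMED_PREDICATE_BOUNDS):
    print(tp.subject, tp.predicate, tp.obj)
```

In this sketch the pattern with a constant object and the smallest assumed bound is evaluated first. A real optimizer would also need to account for join connectivity between patterns, which this simple sort ignores.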

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Abdullah Abbas (1)
  • Pierre Genevès (1)
  • Cécile Roisin (1)
  • Nabil Layaïda (1)
  1. Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG, Grenoble, France
