SPARQL Query Containment Under Schema

Abstract

Query containment is defined as the problem of determining if the result of a query is included in the result of another query for any dataset. It has major applications in query optimization and knowledge base verification. To date, testing query containment has been performed using different techniques: containment mapping, canonical databases, automata theory techniques and through a reduction to the validity problem in logic. Here, we use the latter technique to test containment of SPARQL queries using an expressive modal logic called \(\mu \)-calculus. For that purpose, we define an RDF graph encoding as a transition system which preserves its characteristics. In addition, queries and schema axioms are encoded as \(\mu \)-calculus formulae. Thereby, query containment can be reduced to testing validity in the logic. We identify various fragments of SPARQL and description logic schema languages for which containment is decidable. Additionally, we provide theoretically and experimentally proven procedures to check containment of these decidable fragments. Finally, we propose a benchmark for containment solvers which is used to test and compare the current state-of-the-art containment solvers.

This is a preview of subscription content, access via your institution.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5

Notes

  1. 1.

    The transition graph is similar to the tuple-graph used in [13] to detect the dependency among variables.

  2. 2.

    http://tyrex.inria.fr/websolver/.

  3. 3.

    DBpedia 3.5.1 logs (ftp://download.openlinksw.com/support/dbpedia/) contain 3 210 368 queries between 30/04/2010 and 20/07/2010 and 378 530 queries of 13/07/2010 only.

References

  1. 1.

    Abiteboul S, Hull R, Vianu V (1995) Foundations of databases, vol 8. Addison-Wesley, Boston

    Google Scholar 

  2. 2.

    Aho AV, Sagiv Y, Ullman JD (1979) Equivalences among relational expressions. SIAM J Comput 8(2):218–246

    MathSciNet  Article  MATH  Google Scholar 

  3. 3.

    Alkhateeb F, Baget JF, Euzenat J (2009) Extending SPARQL with regular expression patterns (for querying RDF). Web Semant Sci Serv Agents World Wide Web 7(2):57–73

    Article  Google Scholar 

  4. 4.

    Angles R, Gutierrez C (2008) The expressive power of SPARQL. In: Proceedings of ISWC. Springer, pp 114–129

  5. 5.

    Baader F, Calvanese D, McGuinness D, Nardi D, Patel-Schneider P (eds) (2007) The description logic handbook: theory, implementation, and applications. Cambridge University Press, Cambridge. ISBN 9780511717383

  6. 6.

    Baget JF (2005) RDF entailment as a graph homomorphism. Semant Web-ISWC 2005:82–96

    Google Scholar 

  7. 7.

    Barcelo P, Libkin L, Lin AW, Wood PT (2012) Expressive languages for path queries over graph-structured data. ACM Trans Database Syst 37:31

    Article  Google Scholar 

  8. 8.

    Bizer C, Schultz A (2008) Benchmarking the performance of storage systems that expose SPARQL endpoints. In: Proceedings of 4th international workshop on scalable semantic web knowledge base systems (SSWS)

  9. 9.

    Bizer C, Schultz A (2009) The Berlin SPARQL benchmark. Int J Semant Web Inf Syst 5(2):1–24

    Article  Google Scholar 

  10. 10.

    Blackburn P, van Benthem JF, Wolter F (2007) Handbook of modal logic, vol 3. Elsevier, New York

    Google Scholar 

  11. 11.

    Calvanese D, De Giacomo G, Lenzerini M, Vardi MY (2000) Containment of conjunctive regular path queries with inverse. In: Proceedings of 7th international conference on the principles of knowledge representation and reasoning (KR 2000), pp 176–185

  12. 12.

    Calvanese D, De Giacomo G, Lenzerini M, Vardi MY (2003) Reasoning on regular path queries. SIGMOD Rec 32(4):83–92

    Article  Google Scholar 

  13. 13.

    Calvanese D, Giacomo GD, Lenzerini M (2008) Conjunctive query containment and answering under description logic constraints. ACM Trans Comput Log 9(3):22

    MathSciNet  Article  MATH  Google Scholar 

  14. 14.

    Calvanese D, Ortiz M, Simkus M (2011) Containment of regular path queries under description logic constraints. In: IJCAI proceedings-international joint conference on artificial intelligence, vol 22, p 805

  15. 15.

    Calvanese D, Rosati R (2003) Answering recursive queries under keys and foreign keys is undecidable. In: Proceedings of 10th international workshop on knowledge representation meets databases (KRDB 2003), vol 79, pp 3–14

  16. 16.

    Chandra AK, Merlin PM (1977) Optimal implementation of conjunctive queries in relational data bases. In: Proceedings of the ninth annual ACM symposium on theory of computing. ACM, pp 77–90

  17. 17.

    Chebotko A, Lu S, Jamil HM, Fotouhi F (2006) Semantics preserving SPARQL-to-SQL query translation for optional graph patterns. Technical report

  18. 18.

    Chekol MW (2012) Static analysis of semantic web queries. Ph.D. thesis, Université de Grenoble. Thesis

  19. 19.

    Chekol MW (2016) On the containment of SPARQL queries under entailment regimes. In: Proceedings of the thirtieth AAAI conference on artificial intelligence, February 12–17, 2016, Phoenix, Arizona, USA, pp 936–942

  20. 20.

    Chekol MW, Euzenat J, Genevès P, Layaïda N (2011) PSPARQL query containment. In: Proceedings of DBPL

  21. 21.

    Chekol MW, Euzenat J, Genevès P, Layaïda N (2012) SPARQL query containment under RDFS entailment regime. In: Proceedings of IJCAR, pp 134–148

  22. 22.

    Chekol MW, Euzenat J, Genevès P, Layaïda N (2012) SPARQL query containment under SHI axioms. In: Proceedings of AAAI, pp 10–16

  23. 23.

    Chekol MW, Euzenat J, Genevès P, Layaïda N (2013) Evaluating and benchmarking SPARQL query containment solvers. Int Semant Web Conf 2:408–423

    MATH  Google Scholar 

  24. 24.

    Cyganiak R (2005) A relational algebra for SPARQL. Digital Media Systems Laboratory HP Laboratories Bristol. HPL-2005-170, p 35

  25. 25.

    Florescu D, Levy A, Suciu D (1998) Query containment for conjunctive queries with regular expressions. In: Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on principles of database systems. ACM, pp 139–148

  26. 26.

    Genevès P, Layaïda N (2006) A system for the static analysis of XPath. ACM Trans Inf Syst 24(4):475–502

    Article  Google Scholar 

  27. 27.

    Genevès P, Layaïda N, Schmitt A (2007) Efficient static analysis of XML paths and types. In: Proceedings of PLDI. ACM, pp 342–351

  28. 28.

    Glimm B (2011) Using SPARQL with RDFS and OWL entailment. In: Polleres A et al (eds) Reasoning web. Semantic technologies for the web of data. Springer, Berlin, pp 137–201

    Google Scholar 

  29. 29.

    Groppe J, Groppe S, Kolbaum J (2009) Optimization of SPARQL by using coreSPARQL. In: Proceedings of ICEIS, pp 107–112

  30. 30.

    Gutierrez C, Hurtado C, Mendelzon AO (2004) Foundations of semantic web databases. In: Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems. ACM, pp 95–106

  31. 31.

    Hayes P (2004) RDF semantics. W3C Recommendation

  32. 32.

    Hitzler P, Krötzsch M, Parsia B, Patel-Schneider PF, Rudolph S (2009) OWL 2 web ontology language primer. W3C Recommendation 27(1)

  33. 33.

    Ioannidis YE, Ramakrishnan R (1995) Containment of conjunctive queries: beyond relations as sets. ACM Trans Database Syst 20(3):288–324

    Article  Google Scholar 

  34. 34.

    Johnson DS, Klug A (1984) Testing containment of conjunctive queries under functional and inclusion dependencies. J Comput Syst Sci 28(1):167–189

    MathSciNet  Article  MATH  Google Scholar 

  35. 35.

    Kostylev EV, Cuenca Grau B (2014) On the semantics of SPARQL queries with optional matching under entailment regimes. In: The semantic web ISWC 2014. Lecture notes in computer science, vol 8797. Springer, pp 374–389

  36. 36.

    Kostylev EV, Reutter JL, Romero M, Vrgoč D (2015) SPARQL with property paths. In: International semantic web conference. Springer, pp 3–18

  37. 37.

    Kozen D (1983) Results on the propositional \(\mu \)-calculus. Theor Comput Sci 27(3):333–354

    MathSciNet  Article  MATH  Google Scholar 

  38. 38.

    Letelier A, Pérez J, Pichler R, Skritek S (2012) Static analysis and optimization of semantic web queries. In: Proceedings of PODS. ACM, pp. 89–100

  39. 39.

    Manola F, Miller E (2004) Resource description framework primer. W3C recommendation

  40. 40.

    Mateescu R, Meriot S, Rampacek S (2009) Extending SPARQL with temporal logic. INRIA Research Report, RR-7056. http://hal.archives-ouvertes.fr/inria-00404761/en/

  41. 41.

    Munoz S, Pérez J, Gutierrez C (2007) Minimal deductive systems for RDF. In: Franconi E, Kifer M, May W (eds) The semantic web: research and applications, vol 4519. Springer, Berlin, pp 53–67

    Google Scholar 

  42. 42.

    Pérez J, Arenas M, Gutierrez C (2009) Semantics and complexity of SPARQL. ACM Trans Database Syst 34(3):16

    Article  Google Scholar 

  43. 43.

    Pichler R, Skritek S (2014) Containment and equivalence of well-designed SPARQL. In: Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems, ACM, pp 39–50

  44. 44.

    Polleres A (2007) From SPARQL to rules (and back). In: Proceedings of the 16th international conference on World Wide Web. ACM, pp 787–796

  45. 45.

    PrudHommeaux E, Seaborne A (2008) SPARQL query language for RDF. W3C recommendation 15

  46. 46.

    Schmidt M, Hornung T, Lausen G, Pinkel C (2009) SP\(^2\)Bench: a SPARQL performance benchmark. In: Proceedings of ICDE, pp 222–233

  47. 47.

    Schmidt M, Meier M, Lausen G (2010) Foundations of SPARQL query optimization. In: Proceedings of ICDT. ACM, pp 4–33

  48. 48.

    Serfiotis G, Koffina I, Christophides V, Tannen V (2005) Containment and minimization of RDF/S query patterns. In: Gil Y, Motta E, Benjamins VR, Musen MA (eds) The semantic web–ISWC 2005. LNCS, vol 3729. Springer, Berlin, pp 607–623

    Google Scholar 

  49. 49.

    Stocker M, Seaborne A, Bernstein A, Kiefer C, Reynolds D (2008) SPARQL basic graph pattern optimization using selectivity estimation. In: Proceedings of WWW conference, pp 595–604

  50. 50.

    Tanabe Y, Takahashi K, Hagiya M (2008) A decision procedure for alternation-free modal \(\mu \)-calculi. Adv Modal Logic 7:341–362

    MathSciNet  MATH  Google Scholar 

  51. 51.

    Tanabe Y, Takahashi K, Yamamoto M, Tozawa A, Hagiya M (2005) A decision procedure for the alternation-free two-way modal \(\mu \)-calculus. In: TABLEAUX, pp 277–291

Download references

Acknowledgements

This research was partially supported by the ANR project CLEAR (ANR-16-CE25-0010).

Author information

Affiliations

Authors

Corresponding author

Correspondence to Melisachew Wudage Chekol.

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Chekol, M.W., Euzenat, J., Genevès, P. et al. SPARQL Query Containment Under Schema. J Data Semant 7, 133–154 (2018). https://doi.org/10.1007/s13740-018-0087-1

Download citation

Keywords

  • SPARQL
  • Query containment
  • \(\mu \)-calculus