Efficiently Pinpointing SPARQL Query Containments

  • Claus Stadler
  • Muhammad Saleem
  • Axel-Cyrille Ngonga Ngomo
  • Jens Lehmann
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10845)


Query containment is a fundamental problem in database research and is relevant for many tasks, such as query optimisation, view maintenance and query rewriting. For example, recent SPARQL engines built on Big Data frameworks that precompute solutions to frequently requested query patterns are conceptually an application of query containment. We present an approach for solving the query containment problem for SPARQL, the W3C standard query language for RDF datasets. Solving the query containment problem can be reduced to deciding whether a subgraph isomorphism exists between the normalized algebra expressions of two queries.

Several state-of-the-art methods can only match two queries at a time and return only a Boolean answer as to whether a containment relation holds. In contrast, our approach is suited to view selection use cases: it can efficiently enumerate all containment mappings among a set of queries. Furthermore, it reports how the algebra expression trees of two queries correspond under each containment mapping. All of our source code and experimental results are openly available.
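To make the notion of a containment mapping concrete, the following is a minimal sketch and not the approach of the paper. It assumes queries that consist of a single basic graph pattern (a list of triple patterns), set semantics and no projection, and it uses a brute-force search instead of subgraph isomorphism over normalized algebra expressions. Under those assumptions, a query is contained in a view if every triple pattern of the view can be mapped onto some triple pattern of the query by one consistent variable substitution. The function names, view names and example patterns are hypothetical.

```python
def is_var(term):
    """SPARQL-style variables are written with a leading '?'."""
    return term.startswith("?")


def extend(mapping, pattern, target):
    """Extend `mapping` so that `pattern` maps onto `target`, or return None."""
    extended = dict(mapping)
    for p, t in zip(pattern, target):
        if is_var(p):
            # A variable must always be mapped to the same term.
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            # Constants (IRIs, literals) must match exactly.
            return None
    return extended


def containment_mappings(view, query):
    """Yield every mapping that sends each triple pattern of `view`
    onto some triple pattern of `query` (brute-force backtracking)."""
    def search(i, mapping):
        if i == len(view):
            yield mapping
            return
        for target in query:
            extended = extend(mapping, view[i], target)
            if extended is not None:
                yield from search(i + 1, extended)

    yield from search(0, {})


# Hypothetical example: one query matched against a set of candidate views.
query = [("?s", "rdf:type", "foaf:Person"), ("?s", "foaf:name", "?n")]
views = {
    "v_type": [("?x", "rdf:type", "?t")],
    "v_named_person": [("?a", "foaf:name", "?b"), ("?a", "rdf:type", "foaf:Person")],
    "v_knows": [("?a", "foaf:knows", "?b")],
}

# For each view, collect all mappings; an empty list means no containment.
result = {name: list(containment_mappings(v, query)) for name, v in views.items()}
```

Unlike a Boolean test, the returned mappings show which variable of a view corresponds to which term of the query. In this sketch `v_type` and `v_named_person` each yield one mapping and `v_knows` yields none.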



This work was partly supported by a grant from the European Union’s Horizon 2020 research and innovation programme for the projects HOBBIT (GA no. 688227), QROWD (GA no. 732194) and WDAqua (GA no. 642795).



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Claus Stadler (1)
  • Muhammad Saleem (1)
  • Axel-Cyrille Ngonga Ngomo (2)
  • Jens Lehmann (3, 4)

  1. Computer Science Institute, University of Leipzig, Leipzig, Germany
  2. University of Paderborn, Paderborn, Germany
  3. Smart Data Analytics Group, Computer Science Institute III, University of Bonn, Bonn, Germany
  4. Enterprise Information Systems Department, Fraunhofer IAIS, Sankt Augustin, Germany
