MULDER: Querying the Linked Data Web by Bridging RDF Molecule Templates

  • Kemele M. Endris
  • Mikhail Galkin
  • Ioanna Lytra
  • Mohamed Nadjib Mami
  • Maria-Esther Vidal
  • Sören Auer
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10438)

Abstract

The increasing number of RDF data sources that allow for querying Linked Data via Web services form the basis for federated SPARQL query processing. Federated SPARQL query engines provide a unified view of a federation of RDF data sources, and rely on source descriptions for selecting the data sources over which unified queries will be executed. Albeit efficient, existing federated SPARQL query engines usually ignore the meaning of data accessible from a data source, and describe sources only in terms of the vocabularies utilized in the data source. Lack of source description may conduce to the erroneous selection of data sources for a query, thus affecting the performance of query processing over the federation. We tackle the problem of federated SPARQL query processing and devise MULDER, a query engine for federations of RDF data sources. MULDER describes data sources in terms of RDF molecule templates, i.e., abstract descriptions of entities belonging to the same RDF class. Moreover, MULDER utilizes RDF molecule templates for source selection, and query decomposition and optimization. We empirically study the performance of MULDER on existing benchmarks, and compare MULDER performance with state-of-the-art federated SPARQL query engines. Experimental results suggest that RDF molecule templates empower MULDER federated query processing, and allow for the selection of RDF data sources that not only reduce execution time, but also increase answer completeness.

References

  1. 1.
    Acosta, M., Vidal, M.-E., Lampo, T., Castillo, J., Ruckhaus, E.: ANAPSID: an adaptive query processing engine for SPARQL endpoints. In: Aroyo, L., Welty, C., Alani, H., Taylor, J., Bernstein, A., Kagal, L., Noy, N., Blomqvist, E. (eds.) ISWC 2011. LNCS, vol. 7031, pp. 18–34. Springer, Heidelberg (2011). doi:10.1007/978-3-642-25073-6_2 CrossRefGoogle Scholar
  2. 2.
    Alexander, K., Hausenblas, M.: Describing linked datasets-on the design and usage of voiD, the ‘vocabulary of interlinked datasets’. In: LDOW (2009)Google Scholar
  3. 3.
    Basca, C., Bernstein, A.: Querying a messy web of data with avalanche. J. Web Semant. 26, 1–28 (2014)CrossRefGoogle Scholar
  4. 4.
    Görlitz, O., Staab, S.: SPLENDID: SPARQL endpoint federation exploiting VOID descriptions. In: COLD (2011)Google Scholar
  5. 5.
    Karypis, G., Kumar, V.: A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20(1), 359–392 (1998)MathSciNetCrossRefMATHGoogle Scholar
  6. 6.
    Montoya, G., Skaf-Molli, H., Molli, P., Vidal, M.: Decomposing federated queries in presence of replicated fragments. J. Web Semant. 42, 1–18 (2017)CrossRefGoogle Scholar
  7. 7.
    Palma, G., Vidal, M.-E., Raschid, L.: Drug-target interaction prediction using semantic similarity and edge partitioning. In: Mika, P., et al. (eds.) ISWC 2014. LNCS, vol. 8796, pp. 131–146. Springer, Cham (2014). doi:10.1007/978-3-319-11964-9_9 Google Scholar
  8. 8.
    Saleem, M., Khan, Y., Hasnain, A., Ermilov, I., Ngomo, A.N.: A fine-grained evaluation of SPARQL endpoint federation systems. Semant. Web 7(5), 493–518 (2015)CrossRefGoogle Scholar
  9. 9.
    Saleem, M., Ngonga Ngomo, A.-C.: HiBISCuS: hypergraph-based source selection for SPARQL endpoint federation. In: Presutti, V., d’Amato, C., Gandon, F., d’Aquin, M., Staab, S., Tordai, A. (eds.) ESWC 2014. LNCS, vol. 8465, pp. 176–191. Springer, Cham (2014). doi:10.1007/978-3-319-07443-6_13 CrossRefGoogle Scholar
  10. 10.
    Scheufele, W., Moerkotte, G.: On the complexity of generating optimal plans with cross products. In: 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 238–248 (1997)Google Scholar
  11. 11.
    Schmidt, M., Görlitz, O., Haase, P., Ladwig, G., Schwarte, A., Tran, T.: FedBench: a benchmark suite for federated semantic data query processing. In: Aroyo, L., Welty, C., Alani, H., Taylor, J., Bernstein, A., Kagal, L., Noy, N., Blomqvist, E. (eds.) ISWC 2011. LNCS, vol. 7031, pp. 585–600. Springer, Heidelberg (2011). doi:10.1007/978-3-642-25073-6_37 CrossRefGoogle Scholar
  12. 12.
    Schwarte, A., Haase, P., Hose, K., Schenkel, R., Schmidt, M.: FedX: optimization techniques for federated query processing on linked data. In: Aroyo, L., Welty, C., Alani, H., Taylor, J., Bernstein, A., Kagal, L., Noy, N., Blomqvist, E. (eds.) ISWC 2011. LNCS, vol. 7031, pp. 601–616. Springer, Heidelberg (2011). doi:10.1007/978-3-642-25073-6_38 CrossRefGoogle Scholar
  13. 13.
    Verborgh, R., Sande, M.V., Hartig, O., Herwegen, J.V., Vocht, L.D., Meester, B.D., Haesendonck, G., Colpaert, P.: Triple pattern fragments: a low-cost knowledge graph interface for the web. J. Web Semant. 37, 184–206 (2016)CrossRefGoogle Scholar
  14. 14.
    Vidal, M., Castillo, S., Acosta, M., Montoya, G., Palma, G.: On the selection of SPARQL endpoints to efficiently execute federated SPARQL queries. Trans. Large-Scale Data Knowl.-Cent. Syst. 25, 109–149 (2016)CrossRefGoogle Scholar
  15. 15.
    Wylot, M., Cudré-Mauroux, P.: DiploCloud: efficient and scalable management of RDF data in the cloud. IEEE Trans. Knowl. Data Eng. 28(3), 659–674 (2016)CrossRefGoogle Scholar

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Kemele M. Endris
    • 1
  • Mikhail Galkin
    • 1
    • 2
    • 3
  • Ioanna Lytra
    • 1
    • 2
  • Mohamed Nadjib Mami
    • 1
    • 2
  • Maria-Esther Vidal
    • 2
  • Sören Auer
    • 1
    • 2
  1. 1.Enterprise Information Systems (EIS)University of BonnBonnGermany
  2. 2.Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS)Sankt AugustinGermany
  3. 3.ITMO UniversitySaint PetersburgRussia

Personalised recommendations