FedX: A Federation Layer for Distributed Query Processing on Linked Open Data

  • Andreas Schwarte
  • Peter Haase
  • Katja Hose
  • Ralf Schenkel
  • Michael Schmidt
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6644)

Abstract

Driven by the success of the Linked Open Data initiative, today’s Semantic Web is best characterized as a Web of interlinked datasets. Hand in hand with this structure, new challenges to query processing are arising. In particular, queries for which more than one data source can contribute results require advanced optimization and evaluation approaches. The major challenge lies in the nature of distribution: heterogeneous data sources have to be integrated into a federation so that they appear globally as a single repository. On the query level, though, techniques have to be developed to meet the requirements of efficient query computation in the distributed setting. We present FedX, a project which extends the Sesame Framework with a federation layer that enables efficient query processing on distributed Linked Open Data sources. We discuss key insights into its architecture and summarize our optimization techniques for the federated setting. The practicability of our system will be demonstrated in various scenarios using the Information Workbench.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Andreas Schwarte (1)
  • Peter Haase (1)
  • Katja Hose (2)
  • Ralf Schenkel (2)
  • Michael Schmidt (1)

  1. fluid Operations AG, Walldorf, Germany
  2. Max-Planck Institute for Informatics, Saarbrücken, Germany
