The Semantic Web - ISWC 2009

Volume 5823 of the series Lecture Notes in Computer Science pp 293-309

Executing SPARQL Queries over the Web of Linked Data

  • Olaf Hartig (affiliated with Carnegie Mellon University; Humboldt-Universität zu Berlin)
  • Christian Bizer (affiliated with Carnegie Mellon University; Freie Universität Berlin)
  • Johann-Christoph Freytag (affiliated with Carnegie Mellon University; Humboldt-Universität zu Berlin)

The Web of Linked Data forms a single, globally distributed dataspace. Because this dataspace is open, it is not possible to know in advance all data sources that might be relevant for answering a query. This openness poses a new challenge that traditional research on federated query processing does not address. In this paper we present an approach to executing SPARQL queries over the Web of Linked Data. The main idea of our approach is to discover data that might be relevant for answering a query during the query execution itself. This discovery is driven by following RDF links between data sources, based on URIs in the query and in partial results. The URIs are resolved over HTTP into RDF data, which is continuously added to the queried dataset. This paper describes concepts and algorithms to implement our approach using an iterator-based pipeline. We introduce a formalization of the pipelining approach and show that classical iterators may cause blocking due to the latency of HTTP requests. To avoid blocking, we propose an extension of the iterator paradigm. The evaluation of our approach shows its strengths as well as the challenges that remain.
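
The core idea of the abstract, discovering data during query execution by looking up URIs from the query and from partial results, can be illustrated with a minimal Python sketch. It is not the paper's implementation: it uses a simple recursive generator rather than the iterator pipeline or its non-blocking extension, and it replaces HTTP with an in-memory dictionary. All names (FAKE_WEB, Dataset, evaluate) and the ex: and foaf: terms are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical stand-in for the Web: URI -> triples an HTTP lookup would return.
FAKE_WEB: Dict[str, List[Triple]] = {
    "ex:alice": [("ex:alice", "foaf:knows", "ex:bob")],
    "ex:bob": [("ex:bob", "foaf:name", '"Bob"')],
}


def is_var(term: str) -> bool:
    return term.startswith("?")


def is_uri(term: str) -> bool:
    # In this toy encoding, literals are wrapped in double quotes.
    return not is_var(term) and not term.startswith('"')


class Dataset:
    """Queried dataset that grows as URIs are looked up."""

    def __init__(self, lookup: Callable[[str], List[Triple]]) -> None:
        self.triples: Set[Triple] = set()
        self.seen: Set[str] = set()
        self.lookup = lookup  # assumed to play the role of an HTTP GET

    def dereference(self, uri: str) -> None:
        if uri in self.seen:  # look up each URI at most once
            return
        self.seen.add(uri)
        self.triples.update(self.lookup(uri))


def match(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend binding so that pattern equals triple, or return None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def evaluate(patterns: List[Triple], dataset: Dataset,
             binding: Optional[Binding] = None) -> Iterator[Binding]:
    """Evaluate triple patterns left to right, looking up URIs on the way."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    # Apply the partial result, then look up every URI in the pattern.
    pattern = tuple(binding.get(term, term) for term in patterns[0])
    for term in pattern:
        if is_uri(term):
            dataset.dereference(term)
    # Iterate over a snapshot because deeper calls may add more triples.
    for triple in list(dataset.triples):
        extended = match(pattern, triple, binding)
        if extended is not None:
            yield from evaluate(patterns[1:], dataset, extended)
```

Starting from an empty dataset, evaluating the two patterns ("ex:alice", "foaf:knows", "?f") and ("?f", "foaf:name", "?n") first looks up ex:alice, binds ?f to ex:bob, and only then looks up ex:bob, which is how data not known in advance enters the queried dataset. Each lookup here blocks the whole evaluation, which mirrors the latency problem the abstract attributes to classical iterators.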