The VLDB Journal, Volume 25, Issue 2, pp 243–268

Processing SPARQL queries over distributed RDF graphs

  • Peng Peng
  • Lei Zou
  • M. Tamer Özsu
  • Lei Chen
  • Dongyan Zhao
Regular Paper

DOI: 10.1007/s00778-015-0415-0

Cite this article as:
Peng, P., Zou, L., Özsu, M.T. et al. The VLDB Journal (2016) 25: 243. doi:10.1007/s00778-015-0415-0

Abstract

We propose techniques for processing SPARQL queries over a large RDF graph in a distributed environment. We adopt a “partial evaluation and assembly” framework. Answering a SPARQL query Q is equivalent to finding subgraph matches of the query graph Q over the RDF graph G. Based on properties of subgraph matching over a distributed graph, we introduce local partial matches as partial answers in each fragment of the RDF graph G. For assembly, we propose two methods: centralized and distributed assembly. We analyze our algorithms both theoretically and experimentally. Extensive experiments over both real and benchmark RDF repositories of billions of triples confirm that our method is superior to the state-of-the-art methods in both performance and scalability.
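To make the statement that answering a query Q amounts to finding subgraph matches of Q over G more concrete, the following is a minimal, centralized sketch in Python. It is not the partial evaluation and assembly method of the paper: it ignores fragments and local partial matches entirely, and only shows how a set of triple patterns can be matched against a small in-memory set of triples. The names `is_var`, `unify`, and `match`, the `?`-prefix convention for variables, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Binding = Dict[str, str]        # variable name -> matched term


def is_var(term: str) -> bool:
    """Assumed convention: query variables start with '?'."""
    return term.startswith("?")


def unify(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Try to extend `binding` so that `pattern` maps onto `triple`."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable must keep the same value everywhere it occurs.
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            # Constants must match exactly.
            return None
    return extended


def match(patterns: List[Triple], graph: Set[Triple],
          binding: Optional[Binding] = None) -> Iterator[Binding]:
    """Enumerate all bindings under which every pattern occurs in the graph."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        extended = unify(first, triple, binding)
        if extended is not None:
            yield from match(rest, graph, extended)


# Toy RDF graph G (hypothetical data).
graph = {
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("bob", "worksAt", "acme"),
}

# Query graph Q: ?x knows ?y, and ?y works at ?z.
query = [("?x", "knows", "?y"), ("?y", "worksAt", "?z")]

results = list(match(query, graph))
print(results)
```

In a distributed setting G would be split into fragments held at different sites, so a match may cross fragment boundaries; handling that case is what the local partial matches and the assembly step described in the abstract are for, and it is deliberately left out of this sketch.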

Keywords

RDF, SPARQL, RDF graph, Distributed queries

Supplementary material

Supplementary material 1: 778_2015_415_MOESM1_ESM.pdf (PDF, 109 KB)

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. Institute of Computer Science and Technology, Peking University, Beijing, China
  2. David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada
  3. Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clear Water Bay, China
