The Journal of Supercomputing, Volume 71, Issue 10, pp 3695–3725

SigMR: MapReduce-based SPARQL query processing by signature encoding and multi-way join

Article

DOI: 10.1007/s11227-015-1459-z

Cite this article as:
Ahn, J., Im, DH. & Kim, HG. J Supercomput (2015) 71: 3695. doi:10.1007/s11227-015-1459-z

Abstract

Linked Data contains large numbers of Resource Description Framework (RDF) triples, and their number can grow exponentially, which makes SPARQL query processing on a single machine infeasible. To address this scalability issue, SPARQL engines based on the MapReduce framework have been proposed, but these methods are limited in how they evaluate joins. The two-way join-based approach evaluates joins via a sequence of binary multiplications that require multiple MapReduce jobs, which incurs costly disk accesses between jobs. The multi-way join-based approach combines multiple two-way join operations so that the joins are evaluated simultaneously in one MapReduce job. However, the size of the data for that job might increase exponentially if a complex query is given. In this study, we propose SigMR, a pruning method for multi-way join-based SPARQL query processing in MapReduce. In the proposed approach, a SPARQL query can be evaluated in a single MapReduce job, where the size of the data is reduced dramatically by pruning based on our signature encoding technique, thereby overcoming the weaknesses of the previous approaches. In experiments, the query processing time was lower with our approach than with existing MapReduce-based methods.
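To make the idea of a pruned multi-way join concrete, the following is a minimal in-memory sketch in Python, not the SigMR implementation. The abstract does not describe the signature encoding, so the per-subject predicate bitmask used here is a hypothetical stand-in, and the toy triples, the star-shaped query, and all function names are assumptions made for illustration. The map step emits every matching triple keyed by the shared variable `?x`, and a single reduce step joins all patterns at once; pruning drops triples whose subject cannot satisfy every pattern before they are emitted.

```python
from collections import defaultdict
from itertools import product

# Toy RDF triples (subject, predicate, object); all data here is made up.
TRIPLES = [
    ("alice", "type", "Student"),
    ("alice", "memberOf", "cs"),
    ("alice", "advisor", "carol"),
    ("bob", "type", "Student"),
    ("bob", "memberOf", "math"),
    ("carol", "type", "Professor"),
    ("carol", "memberOf", "cs"),
]

# Star-shaped query joined on ?x; terms starting with "?" are variables.
PATTERNS = [
    ("?x", "type", "Student"),
    ("?x", "memberOf", "?d"),
    ("?x", "advisor", "?a"),
]

# Hypothetical signature: one bit per predicate (an assumption, not the paper's encoding).
PREDICATE_BIT = {"type": 1, "memberOf": 2, "advisor": 4}


def build_signatures(triples):
    """OR together the predicate bits of every triple that shares a subject."""
    signatures = defaultdict(int)
    for subject, predicate, _ in triples:
        signatures[subject] |= PREDICATE_BIT[predicate]
    return signatures


def match(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    binding = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if binding.get(term, value) != value:
                return None
            binding[term] = value
        elif term != value:
            return None
    return binding


def map_phase(triples, patterns, signatures, query_sig):
    """Emit (join key, (pattern index, binding)) pairs, skipping pruned subjects."""
    emitted = []
    for triple in triples:
        if signatures[triple[0]] & query_sig != query_sig:
            continue  # subject lacks a required predicate, so it cannot join
        for index, pattern in enumerate(patterns):
            binding = match(pattern, triple)
            if binding is not None:
                emitted.append((binding["?x"], (index, binding)))
    return emitted


def reduce_phase(emitted, n_patterns):
    """Join all patterns for each key in one pass (valid for star queries on ?x)."""
    groups = defaultdict(lambda: defaultdict(list))
    for key, (index, binding) in emitted:
        groups[key][index].append(binding)
    results = []
    for by_pattern in groups.values():
        if len(by_pattern) < n_patterns:
            continue  # at least one pattern has no match for this key
        for combo in product(*(by_pattern[i] for i in range(n_patterns))):
            row = {}
            for binding in combo:
                row.update(binding)
            results.append(row)
    return results


def run_query(triples, patterns, prune=True):
    """Return (result rows, number of records emitted by the map phase)."""
    signatures = build_signatures(triples)
    query_sig = 0
    if prune:
        for _, predicate, _ in patterns:
            query_sig |= PREDICATE_BIT[predicate]
    emitted = map_phase(triples, patterns, signatures, query_sig)
    return reduce_phase(emitted, len(patterns)), len(emitted)
```

Calling `run_query(TRIPLES, PATTERNS, prune=False)` and `run_query(TRIPLES, PATTERNS, prune=True)` on this toy data returns the same result rows, while the pruned run emits fewer intermediate records to the reduce step, which is the effect the abstract attributes to pruning.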

Keywords

Hadoop, MapReduce, Multi-way join, Signature encoding, SigMR, SPARQL

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Biomedical Knowledge Engineering Laboratory, Dental Research Institute, Seoul National University, Seoul, Republic of Korea
  2. Department of Computer and Information Engineering, Hoseo University, Asan, Republic of Korea
  3. Institute of Human-Environment Interface Biology, Seoul National University, Seoul, Republic of Korea