Search Computing

Volume 7538 of the series Lecture Notes in Computer Science pp 143-156

Extending SPARQL Algebra to Support Efficient Evaluation of Top-K SPARQL Queries

  • Alessandro Bozzon (Politecnico of Milano)
  • Emanuele Della Valle (Politecnico of Milano)
  • Sara Magliacane (Politecnico of Milano; VU University Amsterdam)

Abstract

With the widespread adoption of Linked Data, the efficient processing of SPARQL queries gains importance. A crucial category of queries that is amenable to optimization is “top-k” queries, i.e., queries returning the top k results ordered by a specified ranking function. Top-k queries can be expressed in SPARQL by appending the ORDER BY and LIMIT clauses to a SELECT query; these clauses impose a sorting order on the result set and limit the number of results. However, the ORDER BY and LIMIT clauses in SPARQL algebra are result modifiers, i.e., they are evaluated only after the other query clauses. In SPARQL engines, evaluating ORDER BY and LIMIT typically requires processing all the matching solutions (possibly thousands), followed by a monolithic computation of the ranking function for each solution, even if only a limited number of them (e.g., k = 10) were requested, thus leading to poor performance.
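The cost of treating ordering and limiting as result modifiers can be illustrated with a minimal Python sketch. It is not taken from any SPARQL engine: the solution data, the `score` function, and the name `top_k_as_result_modifier` are hypothetical and only mimic the evaluation order described above.

```python
from typing import Callable, Dict, List

Solution = Dict[str, int]

def top_k_as_result_modifier(
    solutions: List[Solution],
    score: Callable[[Solution], int],
    k: int,
) -> List[Solution]:
    """Score every matching solution, sort them all, then keep the first k."""
    # The ranking function is evaluated once per solution, regardless of k.
    scored = [(score(s), s) for s in solutions]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]

# Hypothetical solutions standing in for the bindings matched by the query.
solutions = [{"rating": i % 7, "price": i % 5} for i in range(1000)]

calls = 0

def score(s: Solution) -> int:
    """Hypothetical ranking function; counts how often it is evaluated."""
    global calls
    calls += 1
    return s["rating"] - s["price"]

top10 = top_k_as_result_modifier(solutions, score, 10)
print(len(top10), calls)
```

Only ten results are returned, yet the ranking function runs for all 1000 solutions and the whole list is sorted.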

In this paper, we present \(\mathcal{S}\)PARQL-\(\mathcal{R}{\rm ANK}\), an extension of the SPARQL algebra and execution model that supports ranking as a first-class SPARQL construct. The new algebra and execution model allow for splitting the ranking function and interleaving it with other operations. We also provide a prototype open source implementation of \(\mathcal{S}\)PARQL-\(\mathcal{R}{\rm ANK}\) based on ARQ, and we carry out a series of preliminary experiments.
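To give an intuition for why splitting a ranking function can help, the following Python sketch shows a generic early-termination top-k scan. It is not the algebra or execution model of the paper. It assumes a hypothetical additive score `first + second`, input rows already sorted by `first` in descending order, and a known upper bound `max_second` on the second component. All names and data are illustrative.

```python
import heapq
from typing import Iterable, List, Tuple

def top_k_early_stop(
    rows: Iterable[Tuple[int, int]],
    k: int,
    max_second: int,
) -> Tuple[List[int], int]:
    """Return the k best scores (descending) and the number of rows read.

    Assumes rows are sorted by their first component, descending, and that
    every second component is at most max_second.
    """
    heap: List[int] = []  # min-heap holding the k best scores seen so far
    read = 0
    for first, second in rows:
        # first + max_second bounds the score of this row and all later rows.
        if len(heap) == k and heap[0] >= first + max_second:
            break  # no remaining row can improve the current top k
        read += 1
        s = first + second
        if len(heap) < k:
            heapq.heappush(heap, s)
        elif s > heap[0]:
            heapq.heapreplace(heap, s)
    return sorted(heap, reverse=True), read

# Hypothetical input: first component descending, second component in [0, 99].
rows = [(1000 - i, (i * 37) % 100) for i in range(1000)]
best, read = top_k_early_stop(rows, k=3, max_second=99)
print(best, read)
```

Because part of the score is known from the sort order, the scan can stop as soon as the bound shows that unread rows cannot enter the top k, so only a small prefix of the input is read.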