Efficient Execution of Top-K SPARQL Queries

  • Sara Magliacane
  • Alessandro Bozzon
  • Emanuele Della Valle
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7649)


Top-k queries, i.e. queries returning the top k results ordered by a user-defined scoring function, are an important category of queries. Order is an important property of data that can be exploited to speed up query processing. State-of-the-art SPARQL engines underuse order, and top-k queries are mostly managed with a materialize-then-sort processing scheme that computes all the matching solutions (e.g. thousands) even if only a limited number k (e.g. ten) are requested. The \(\mathcal{S}\)PARQL-\(\mathcal{R}\)ANK algebra is an extended SPARQL algebra that treats order as a first-class citizen, enabling efficient split-and-interleave processing schemes that can be adopted to improve the performance of top-k SPARQL queries. In this paper we propose an incremental execution model for \(\mathcal{S}\)PARQL-\(\mathcal{R}\)ANK queries, compare the performance of alternative physical operators, and propose a rank-aware join algorithm optimized for native RDF stores. Experiments conducted with an open-source implementation of a \(\mathcal{S}\)PARQL-\(\mathcal{R}\)ANK query engine based on ARQ show that the evaluation of top-k queries can be sped up by orders of magnitude.
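
The contrast between materialize-then-sort and incremental, order-aware evaluation can be illustrated with a minimal Python sketch. It is not the paper's algorithm or the ARQ API. It assumes a scoring function that sums two criteria, an input already sorted in descending order of the first criterion, and a known upper bound (`max_second`) on the second. All names (`top_k_materialize_then_sort`, `top_k_incremental`, `second_score`, the `item…` bindings) are hypothetical.

```python
import heapq
from typing import Callable, List, Tuple

Solution = Tuple[str, float]  # (binding, first-criterion score in [0, 1])
Scored = Tuple[float, str]    # (total score, binding)


def top_k_materialize_then_sort(
    solutions: List[Solution], second_score: Callable[[str], float], k: int
) -> List[Scored]:
    """Score every solution, sort all of them, then keep the first k."""
    scored = [(s1 + second_score(b), b) for b, s1 in solutions]
    scored.sort(reverse=True)
    return scored[:k]


def top_k_incremental(
    solutions_sorted_desc: List[Solution],
    second_score: Callable[[str], float],
    k: int,
    max_second: float = 1.0,  # assumed upper bound on second_score
) -> Tuple[List[Scored], int]:
    """Consume the sorted input lazily and stop once no unseen solution can
    beat the current k-th best. Returns (top-k, number of solutions scored)."""
    heap: List[Scored] = []  # min-heap holding the k best results so far
    evaluated = 0
    for binding, s1 in solutions_sorted_desc:
        # Every unseen solution scores at most s1 + max_second, because the
        # input is sorted by s1 in descending order.
        if len(heap) == k and heap[0][0] >= s1 + max_second:
            break
        evaluated += 1
        total = s1 + second_score(binding)
        if len(heap) < k:
            heapq.heappush(heap, (total, binding))
        elif total > heap[0][0]:
            heapq.heapreplace(heap, (total, binding))
    return sorted(heap, reverse=True), evaluated


# Synthetic data: 1000 solutions sorted by the first criterion (descending).
solutions = [(f"item{i}", 1 - i / 1000) for i in range(1000)]
second = {f"item{i}": ((i * 37) % 100) / 100 for i in range(1000)}

baseline = top_k_materialize_then_sort(solutions, second.__getitem__, 10)
incremental, evaluated = top_k_incremental(solutions, second.__getitem__, 10)
print(f"same top-10: {baseline == incremental}; "
      f"scored {evaluated} of {len(solutions)} solutions")
```

Both functions return the same top-k on this data, but the incremental variant stops reading its input as soon as the bound shows that no remaining solution can enter the result. That is the kind of saving an order-aware execution model aims for.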


Keywords: Random Access, Rank Operator, Graph Pattern, Execution Plan, Query Optimization



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Sara Magliacane (1, 2)
  • Alessandro Bozzon (1)
  • Emanuele Della Valle (1)

  1. Politecnico of Milano, Milano, Italy
  2. VU University Amsterdam, The Netherlands
