Extending SPARQL Algebra to Support Efficient Evaluation of Top-K SPARQL Queries

  • Alessandro Bozzon
  • Emanuele Della Valle
  • Sara Magliacane
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7538)

Abstract

With the widespread adoption of Linked Data, the efficient processing of SPARQL queries is gaining importance. A crucial category of queries that is amenable to optimization is “top-k” queries, i.e. queries returning the top k results ordered by a specified ranking function. Top-k queries can be expressed in SPARQL by appending the ORDER BY and LIMIT clauses to a SELECT query; these clauses impose a sorting order on the result set and limit the number of results. However, in the SPARQL algebra the ORDER BY and LIMIT clauses are result modifiers, i.e. they are evaluated only after the other query clauses. Their evaluation in SPARQL engines typically requires processing all the matching solutions (possibly thousands), followed by a monolithic computation of the ranking function for each solution, even if only a limited number of them (e.g. k = 10) were requested, which leads to poor performance.
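To make the cost argument concrete, the following minimal sketch contrasts the result-modifier style of evaluation (score and sort every solution, then cut to k) with a bounded selection that only ever retains k candidates. It is an illustration in plain Python, not the evaluation strategy of any particular SPARQL engine; the query text, the `ex:` prefix, the sample solutions, and all function names are assumptions introduced here.

```python
import heapq

# Hypothetical top-k query, for illustration only (prefix and property are assumed).
QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?product ?rating
WHERE { ?product ex:rating ?rating }
ORDER BY DESC(?rating)
LIMIT 2
"""

def top_k_full_sort(solutions, score, k):
    """Result-modifier style: rank and sort all solutions, then keep the first k."""
    return sorted(solutions, key=score, reverse=True)[:k]

def top_k_bounded_heap(solutions, score, k):
    """Single pass that keeps at most k candidates in memory at any time."""
    return heapq.nlargest(k, solutions, key=score)

def by_rating(solution):
    """Ranking function matching ORDER BY DESC(?rating) in the assumed query."""
    return solution["rating"]

# Assumed sample of matching solutions (variable bindings).
solutions = [
    {"product": "p1", "rating": 3.5},
    {"product": "p2", "rating": 4.0},
    {"product": "p3", "rating": 4.8},
    {"product": "p4", "rating": 1.2},
]

if __name__ == "__main__":
    print(top_k_full_sort(solutions, by_rating, 2))
    print(top_k_bounded_heap(solutions, by_rating, 2))
```

Both functions return the same answer; the second avoids a total sort but still computes the ranking function for every matching solution, which is the remaining cost that a rank-aware algebra aims to reduce.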

In this paper, we present \(\mathcal{S}\)PARQL-\(\mathcal{R}{\rm ANK}\), an extension of the SPARQL algebra and execution model that supports ranking as a first-class SPARQL construct. The new algebra and execution model allow the ranking function to be split and interleaved with other operations. We also provide a prototype open-source implementation of \(\mathcal{S}\)PARQL-\(\mathcal{R}{\rm ANK}\) based on ARQ, and we carry out a series of preliminary experiments.
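The general idea of splitting a ranking function can be sketched as follows. This is not the algebra or the ARQ-based implementation described above; it is a simplified illustration under stated assumptions: the ranking function is the sum of two criteria, each returning a value in [0, 1], one cheap to evaluate and one expensive. The cheap criterion yields an upper bound on the final score, so the expensive one is evaluated only for solutions that could still enter the top k. All names and data are hypothetical.

```python
import heapq

def top_k_interleaved(solutions, cheap, expensive, k):
    """Top-k by cheap(s) + expensive(s), assuming both criteria lie in [0, 1].

    Returns (top-k solutions in descending score order, number of expensive calls).
    """
    # Upper bound on the final score when only the cheap criterion is known.
    # Bounds are negated to obtain a max-heap; the index i breaks ties so that
    # the solution objects themselves are never compared.
    pending = []
    for i, s in enumerate(solutions):
        c = cheap(s)
        pending.append((-(c + 1.0), i, c, s))
    heapq.heapify(pending)

    best = []    # min-heap of (final score, index, solution), size <= k
    probes = 0   # how many times the expensive criterion was evaluated
    while pending:
        neg_bound, i, c, s = heapq.heappop(pending)
        if len(best) == k and best[0][0] >= -neg_bound:
            break  # no remaining solution can exceed the current k-th score
        probes += 1
        total = c + expensive(s)
        if len(best) < k:
            heapq.heappush(best, (total, i, s))
        else:
            heapq.heappushpop(best, (total, i, s))

    ranked = sorted(best, key=lambda entry: entry[0], reverse=True)
    return [s for _, _, s in ranked], probes

# Assumed sample data: each solution carries two normalised partial scores.
offers = [
    {"name": "a", "price_score": 0.9, "review_score": 0.9},
    {"name": "b", "price_score": 0.8, "review_score": 0.5},
    {"name": "c", "price_score": 0.1, "review_score": 0.2},
    {"name": "d", "price_score": 0.05, "review_score": 0.1},
]

def price_score(s):
    return s["price_score"]

def review_score(s):
    return s["review_score"]

if __name__ == "__main__":
    print(top_k_interleaved(offers, price_score, review_score, 2))
```

With the sample data and k = 2, the expensive criterion is evaluated for two of the four solutions, because the upper bounds of the remaining ones fall below the current second-best score. The early stop relies on the ranking function being monotone in its criteria and on the criteria having known maxima.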

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Alessandro Bozzon (1)
  • Emanuele Della Valle (1)
  • Sara Magliacane (1, 2)

  1. Politecnico di Milano, Milano, Italy
  2. VU University Amsterdam, The Netherlands
