Top-k Linked Data Query Processing

  • Andreas Wagner
  • Thanh Tran Duc
  • Günter Ladwig
  • Andreas Harth
  • Rudi Studer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7295)

Abstract

In recent years, top-k query processing has attracted much attention in large-scale scenarios, where computing only the k “best” results is often sufficient. One line of research targets the so-called top-k join problem, where the k best final results are obtained through joining partial results. In this paper, we study the top-k join problem in a Linked Data setting, where partial results are located at different sources and can only be accessed via URI lookups. We show how existing work on top-k join processing can be adapted to the Linked Data setting. Further, we elaborate on strategies for a better estimation of scores of unprocessed join results (to obtain tighter bounds for early termination) and for an aggressive pruning of partial results. Based on experiments on real-world Linked Data, we show that the proposed top-k join processing technique substantially improves runtime performance.
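The early-termination idea mentioned above can be illustrated with a minimal sketch of a generic rank join from the top-k literature. This is not the technique proposed in the paper. It assumes two in-memory lists stand in for partial results that would really be retrieved via URI lookups, that each list is sorted by descending score, and that a joined result is scored by the sum of its parts. The names `top_k_join`, `bound`, `left`, and `right` are hypothetical.

```python
import heapq
from collections import defaultdict
from math import inf


def top_k_join(left, right, k):
    """Join two score-sorted inputs on their key and return the k best results.

    left, right: lists of (join_key, score), sorted by score descending
    (assumed stand-ins for partial results fetched from remote sources).
    Returns (top-k list of (combined_score, join_key), number of tuples read).
    """
    inputs = (left, right)
    seen = (defaultdict(list), defaultdict(list))  # tuples read so far, by key
    pos = [0, 0]                                   # read position per input
    results = []                                   # all join results found so far
    turn = 0

    def bound(unseen, other):
        # Upper bound on the score of a join result that uses a
        # not-yet-read tuple of input `unseen`.
        if pos[unseen] >= len(inputs[unseen]) or not inputs[other]:
            return -inf  # nothing left to read, or nothing to join with
        if pos[unseen] == 0 or pos[other] == 0:
            return inf   # no score information yet
        last_read = inputs[unseen][pos[unseen] - 1][1]
        best_other = inputs[other][0][1]
        return last_read + best_other

    while True:
        # Alternate between the inputs, skipping an exhausted one.
        if pos[turn] >= len(inputs[turn]):
            turn = 1 - turn
            if pos[turn] >= len(inputs[turn]):
                break
        key, score = inputs[turn][pos[turn]]
        pos[turn] += 1
        seen[turn][key].append(score)
        for other_score in seen[1 - turn].get(key, ()):
            results.append((score + other_score, key))
        turn = 1 - turn

        # Stop early once k results are at least as good as anything unseen.
        threshold = max(bound(0, 1), bound(1, 0))
        if sum(1 for s, _ in results if s >= threshold) >= k:
            break

    return heapq.nlargest(k, results), sum(pos)


left = [("a", 90), ("b", 80), ("c", 30)]
right = [("b", 95), ("a", 60), ("c", 50)]
print(top_k_join(left, right, 1))  # ([(175, 'b')], 4)
```

With k = 1, the sketch stops after reading four of the six tuples, because no unread tuple can produce a result scoring above 175. A tighter bound would allow termination even earlier, which is the motivation for the better score estimation described in the abstract.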

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Andreas Wagner (1)
  • Thanh Tran Duc (1)
  • Günter Ladwig (1)
  • Andreas Harth (1)
  • Rudi Studer (1)

  1. Institute AIFB, Karlsruhe Institute of Technology, Germany