Walking Without a Map: Ranking-Based Traversal for Querying Linked Data

  • Olaf Hartig
  • M. Tamer Özsu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9981)


The traversal-based approach to executing queries over Linked Data on the WWW fetches data by traversing data links and is thus able to make use of up-to-date data from initially unknown data sources. While the downside of this approach is the delay before the query engine completes a query execution, user-perceived response time may be improved significantly by returning as many elements of the result set as early as possible. To this end, the query engine requires a traversal strategy that enables it to fetch result-relevant data as early as possible. The challenge for such a strategy is that the query engine does not know a priori which of the data sources discovered during the query execution will contain result-relevant data. In this paper, we investigate 14 different approaches to ranking traversal steps, each of which yields a different traversal strategy. We experimentally study their impact on response times and compare them to a baseline that resembles a breadth-first traversal. While our experiments show that some of the approaches can achieve noteworthy improvements over the baseline in a significant number of cases, we also observe that for every approach, there is a non-negligible chance of response times that are worse than the baseline.
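The difference between a breadth-first baseline and a ranking-based traversal can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the in-memory `TOY_WEB`, the `fetch_links` stand-in for an HTTP lookup, and the caller-supplied `score` function are hypothetical and do not correspond to any of the 14 approaches studied in the paper. Scores are computed once, when a URI is discovered, and are never updated.

```python
import heapq
from collections import deque
from itertools import count

# Hypothetical toy web: each URI maps to the URIs its document links to.
TOY_WEB = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["e"],
    "d": [],
    "e": [],
}


def fetch_links(uri):
    """Stand-in for an HTTP lookup; returns the links found in the document."""
    return TOY_WEB.get(uri, [])


def traverse_breadth_first(seed):
    """Baseline: look up discovered URIs in FIFO order."""
    order, seen, queue = [], {seed}, deque([seed])
    while queue:
        uri = queue.popleft()
        order.append(uri)
        for link in fetch_links(uri):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


def traverse_ranked(seed, score):
    """Ranked: always look up the pending URI with the highest score next."""
    tie = count()  # insertion counter keeps FIFO order among equal scores
    order, seen = [], {seed}
    heap = [(-score(seed), next(tie), seed)]  # negate: heapq is a min-heap
    while heap:
        _, _, uri = heapq.heappop(heap)
        order.append(uri)
        for link in fetch_links(uri):
            if link not in seen:
                seen.add(link)
                heapq.heappush(heap, (-score(link), next(tie), link))
    return order
```

With a score function that happens to favour the branch through `c` and `e`, for example `lambda uri: {"c": 2, "e": 3}.get(uri, 0)`, the ranked traversal reaches `e` as its third lookup, whereas the breadth-first baseline reaches it last. A score function that favours the wrong branch would delay it instead, which mirrors the observation that a ranking can also perform worse than the baseline.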



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Department of Computer and Information Science (IDA), Linköping University, Linköping, Sweden
  2. Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada
