Link Traversal Query Processing Over Decentralized Environments with Structural Assumptions

  • Conference paper
The Semantic Web – ISWC 2023 (ISWC 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14265)

Abstract

To counter societal and economic problems caused by data silos on the Web, efforts such as Solid strive to reclaim private data by storing it in permissioned documents over a large number of personal vaults across the Web. Building applications on top of such a decentralized Knowledge Graph involves significant technical challenges: centralized aggregation prior to query processing is impossible for legal reasons, and current federated querying techniques cannot handle this large scale of distribution at the expected performance. We propose an extension to Link Traversal Query Processing (LTQP) that incorporates structural properties within decentralized environments to tackle their unprecedented scale. In this article, we analyze the structural properties of the Solid decentralization ecosystem that are relevant for query execution, introduce novel LTQP algorithms that leverage these properties, and evaluate their effectiveness. Our experiments indicate that these new algorithms obtain correct results on the order of seconds, which existing algorithms cannot achieve. This work reveals that a traversal-based querying method using structural assumptions can be effective for large-scale decentralization, but that advances are needed in the area of query planning for LTQP to handle more complex queries. These insights open the door to query-driven decentralized applications, in which declarative queries shield developers from the inherent complexity of a decentralized landscape.
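
As a rough illustration of the idea summarized above (not the authors' algorithms), the following minimal sketch runs a breadth-first link traversal over a hypothetical in-memory web and compares two link-following policies: following every link, and following only links that match an assumed vault structure. The URLs, the relation labels `storage`, `contains` and `seeAlso`, and all function names are assumptions made for this example.

```python
from collections import deque

# Hypothetical toy web: URL -> triples in that document plus labelled outgoing links.
TOY_WEB = {
    "https://alice.example/profile": {
        "triples": [("alice", "name", "Alice")],
        "links": [("storage", "https://alice.example/pod/"),
                  ("seeAlso", "https://elsewhere.example/blog")],
    },
    "https://alice.example/pod/": {
        "triples": [],
        "links": [("contains", "https://alice.example/pod/posts")],
    },
    "https://alice.example/pod/posts": {
        "triples": [("post1", "creator", "alice"), ("post2", "creator", "alice")],
        "links": [],
    },
    "https://elsewhere.example/blog": {
        "triples": [("blog", "title", "Unrelated")],
        "links": [("seeAlso", "https://elsewhere.example/archive")],
    },
    "https://elsewhere.example/archive": {
        "triples": [("archive", "title", "Old")],
        "links": [],
    },
}


def traverse(seed, follow):
    """Breadth-first traversal from seed; follow(relation, target) decides which links to queue."""
    queue, seen = deque([seed]), {seed}
    triples, fetched = [], []
    while queue:
        url = queue.popleft()
        doc = TOY_WEB.get(url)  # stands in for an HTTP GET plus RDF parsing
        if doc is None:
            continue
        fetched.append(url)
        triples.extend(doc["triples"])
        for relation, target in doc["links"]:
            if target not in seen and follow(relation, target):
                seen.add(target)
                queue.append(target)
    return triples, fetched


def match(triples, pattern):
    """Return the triples matching a pattern in which None acts as a variable."""
    return [t for t in triples
            if all(p is None or p == v for p, v in zip(pattern, t))]


def follow_all(relation, target):
    """Baseline policy: follow every discovered link."""
    return True


def follow_structural(relation, target):
    """Assumed structural policy: only follow links that stay inside the vault hierarchy."""
    return relation in {"storage", "contains"}


SEED = "https://alice.example/profile"
PATTERN = (None, "creator", "alice")

all_triples, all_docs = traverse(SEED, follow_all)
pod_triples, pod_docs = traverse(SEED, follow_structural)

print(len(all_docs), len(pod_docs))  # 5 3
print(match(pod_triples, PATTERN) == match(all_triples, PATTERN))  # True
```

In this toy setting the structural policy fetches three documents instead of five and still finds both matching triples. Whether that holds in practice depends on how well the assumed structure reflects where the data actually lives.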

Canonical version: https://comunica.github.io/Article-ISWC2023-SolidQuery/

Acknowledgements

This work is supported by SolidLab Vlaanderen (Flemish Government, EWI and RRF project VV023/10). Ruben Taelman is a postdoctoral fellow of the Research Foundation – Flanders (FWO) (1274521N).

Author information

Corresponding author

Correspondence to Ruben Taelman.

Supplemental Material Statement

  • Implementation: https://github.com/comunica/comunica-feature-link-traversal
  • Experiments: https://github.com/comunica/Experiments-Solid-Link-Traversal
  • Benchmark: https://github.com/SolidBench/SolidBench.js

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Taelman, R., Verborgh, R. (2023). Link Traversal Query Processing Over Decentralized Environments with Structural Assumptions. In: Payne, T.R., et al. The Semantic Web – ISWC 2023. ISWC 2023. Lecture Notes in Computer Science, vol 14265. Springer, Cham. https://doi.org/10.1007/978-3-031-47240-4_1

  • DOI: https://doi.org/10.1007/978-3-031-47240-4_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-47239-8

  • Online ISBN: 978-3-031-47240-4

  • eBook Packages: Computer Science (R0)
