Using an Existing Website as a Queryable Low-Cost LOD Publishing Interface

  • Brecht Van de Vyvere
  • Ruben Taelman
  • Pieter Colpaert
  • Ruben Verborgh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11762)

Abstract

Maintaining an Open Dataset incurs an extra recurring cost when it is published through a dedicated Web interface. As publishing a dataset publicly rarely yields a direct financial return, these extra costs need to be minimized. We therefore explore reusing existing infrastructure by enriching existing websites with Linked Data. In this demonstrator, we advised the data owner to annotate a digital heritage website with JSON-LD snippets, resulting in a dataset of more than three million triples that is now available and officially maintained. The website itself is paged, so Hydra partial collection view controls were added to the snippets. We then extended the modular query engine Comunica to follow page controls and extract data from HTML documents while querying. This way, a SPARQL or GraphQL query over multiple heterogeneous data sources can power automated data reuse. While the query performance on such an interface is visibly poor, it becomes easy to create composite data dumps. As a result of implementing these building blocks in Comunica, any paged collection and enriched HTML page now becomes queryable by the query engine. This enables heterogeneous data interfaces to share functionality and become technically interoperable.
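
The two building blocks named in the abstract, extracting JSON-LD snippets from HTML and following Hydra page controls, can be illustrated with a minimal Python sketch. This is not Comunica's implementation (Comunica is a JavaScript engine). The example pages, the `example.org` URLs, and the helper names `JsonLdExtractor`, `next_page_url`, and `crawl` are all hypothetical. The sketch also assumes the snippets use the compact keys `hydra:view`, `hydra:next`, and `hydra:member` literally, whereas a real client would run a JSON-LD processor to expand terms first.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.snippets = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.snippets.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def next_page_url(snippet):
    """Returns the URL of the next page from a Hydra view, or None (assumes compact keys)."""
    view = snippet.get("hydra:view") or {}
    nxt = view.get("hydra:next")
    if isinstance(nxt, dict):  # the link may be written as {"@id": "..."}
        nxt = nxt.get("@id")
    return nxt


def crawl(start_url, fetch):
    """Follows next-page links from start_url and gathers all collection members.

    `fetch` is any callable mapping a URL to an HTML string (an assumption made
    so the sketch runs without network access).
    """
    members, url, seen = [], start_url, set()
    while url and url not in seen:
        seen.add(url)  # guards against cyclic page links
        parser = JsonLdExtractor()
        parser.feed(fetch(url))
        url = None
        for snippet in parser.snippets:
            members.extend(snippet.get("hydra:member", []))
            url = url or next_page_url(snippet)
    return members


def make_page(member_id, next_url=None):
    """Builds a hypothetical HTML page with one embedded JSON-LD snippet."""
    snippet = {
        "@id": "https://example.org/objects",
        "hydra:member": [{"@id": member_id}],
        "hydra:view": {"@type": "hydra:PartialCollectionView"},
    }
    if next_url:
        snippet["hydra:view"]["hydra:next"] = next_url
    return (
        '<html><head><script type="application/ld+json">'
        + json.dumps(snippet)
        + "</script></head><body>Listing</body></html>"
    )


# Two in-memory pages standing in for a paged website.
PAGES = {
    "https://example.org/objects?page=1": make_page(
        "https://example.org/object/1", "https://example.org/objects?page=2"
    ),
    "https://example.org/objects?page=2": make_page("https://example.org/object/2"),
}

if __name__ == "__main__":
    for member in crawl("https://example.org/objects?page=1", PAGES.__getitem__):
        print(member["@id"])
```

Passing `fetch` as a parameter keeps page traversal separate from HTTP handling. Because every page must be fetched in sequence before a query can be answered completely, the sketch also hints at why query performance over such an interface is poor while producing a full dump stays simple.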

This is a print-version of a paper first written for the Web. The Web-version is available at https://brechtvdv.github.io/Article-Using-an-existing-website-as-a-queryable-low-cost-LOD-publishing-interface/.

Keywords

JSON-LD data snippets, Hypermedia web APIs, Intelligent agents, Digital humanities, Linked Open Data, Semantic web

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. IDLab (imec – Ghent University), Ghent, Belgium