Merging and Enriching DCAT Feeds to Improve Discoverability of Datasets

  • Pieter Heyvaert
  • Pieter Colpaert
  • Ruben Verborgh
  • Erik Mannens
  • Rik Van de Walle
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9341)

Abstract

Data Catalog Vocabulary (DCAT) is a W3C specification for describing datasets published on the Web. However, these catalogs are not easily discoverable based on a user’s needs. In this paper, we introduce the Node.js module ‘dcat-merger’, which allows a user agent to download and semantically merge different DCAT feeds from the Web into one DCAT feed that can be republished. After merging, the input feeds are enriched. Besides determining the subjects of the datasets using DBpedia Spotlight, two extensions were built: one categorizes the datasets according to a taxonomy, and the other adds spatial properties to the datasets. These extensions require information available through DBpedia’s SPARQL endpoint. Because public SPARQL endpoints often suffer from low availability, DBpedia’s Triple Pattern Fragments alternative is used instead. Finally, the need for DCAT Merger sparks the discussion about higher-level functionality to improve a catalog’s discoverability.
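
The merge-then-enrich flow described in the abstract can be illustrated with a minimal sketch. It is not the dcat-merger API: the function names (mergeFeeds, enrichWithThemes), the plain { s, p, o } triple objects, and the idea of re-pointing dcat:dataset links to a single merged catalog are all assumptions made for illustration, and the theme lookup is a hypothetical stand-in for the subjects that a service such as DBpedia Spotlight would return. Only the dcat:dataset and dcat:theme property IRIs come from the DCAT vocabulary itself.

```javascript
// Illustrative sketch only; this is NOT the dcat-merger API.
// A triple is modelled as a plain { s, p, o } object of strings (an assumption).

const DCAT_DATASET = "http://www.w3.org/ns/dcat#dataset";
const DCAT_THEME = "http://www.w3.org/ns/dcat#theme";

// Build a unique key so identical triples from different feeds collapse into one.
function tripleKey({ s, p, o }) {
  return JSON.stringify([s, p, o]);
}

// Merge several feeds (arrays of triples) into one deduplicated feed.
// Assumption: every dcat:dataset link is re-pointed to one merged catalog IRI.
function mergeFeeds(feeds, mergedCatalogIri) {
  const merged = new Map();
  for (const feed of feeds) {
    for (const triple of feed) {
      const t =
        triple.p === DCAT_DATASET ? { ...triple, s: mergedCatalogIri } : triple;
      merged.set(tripleKey(t), t);
    }
  }
  return [...merged.values()];
}

// Enrich the merged feed with dcat:theme triples.
// themesByDataset is a hypothetical lookup (dataset IRI -> array of subject IRIs)
// standing in for the output of an entity-annotation service.
function enrichWithThemes(triples, themesByDataset) {
  const out = new Map(triples.map((t) => [tripleKey(t), t]));
  for (const [dataset, themes] of Object.entries(themesByDataset)) {
    for (const theme of themes) {
      const t = { s: dataset, p: DCAT_THEME, o: theme };
      out.set(tripleKey(t), t);
    }
  }
  return [...out.values()];
}
```

Keying triples by their serialized subject, predicate, and object keeps both steps idempotent: merging the same feed twice or adding the same theme twice leaves the result unchanged.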

Keywords

Data publishing, DCAT, Triple pattern fragments, Linked open data, Open data, Smart cities

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Pieter Heyvaert (1)
  • Pieter Colpaert (1)
  • Ruben Verborgh (1)
  • Erik Mannens (1)
  • Rik Van de Walle (1)

  1. Ghent University - iMinds - Multimedia Lab, Ghent, Belgium