Declarative Data Transformations for Linked Data Generation: The Case of DBpedia

  • Ben De Meester
  • Wouter Maroy
  • Anastasia Dimou
  • Ruben Verborgh
  • Erik Mannens
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10250)

Abstract

Mapping languages allow us to define how Linked Data is generated from raw data, but only if the raw data values can be used as is to form the desired Linked Data. Since complex data transformations remain out of scope for mapping languages, these steps are often implemented as custom solutions, or with systems separate from the mapping process. The former data transformations remain case-specific, often coupled with the mapping, whereas the latter are not reusable across systems. In this paper, we propose an approach where data transformations (i) are defined declaratively and (ii) are aligned with the mapping languages. We employ an alignment of data transformations described using the Function Ontology (FnO) and mapping of data to Linked Data described using the RDF Mapping Language (RML). We validate that our approach can map and transform DBpedia in a declaratively defined and aligned way. Our approach is not case-specific: data transformations are independent of their implementation and thus interoperable, while the functions are decoupled and reusable. This allows developers to improve the generation framework, whilst contributors can focus on the actual Linked Data, as there are no more dependencies, neither between the transformations and the generation framework nor their implementations.
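The decoupling described in the abstract can be illustrated with a minimal sketch. It is not the RML or FnO syntax and not the implementation used in the paper: the mapping is a plain Python dictionary standing in for a declarative mapping document, and the function IRI, the `example.org` namespace, the `IMPLEMENTATIONS` registry and the `generate_triples` helper are all hypothetical names introduced for illustration. The point it shows is that the mapping refers to a transformation only by identifier, so the implementation bound to that identifier can be replaced without touching the mapping.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical function identifier; an assumption, not taken from FnO or the paper.
TO_TITLE_CASE = "http://example.org/fn#toTitleCase"

# Implementation registry: binds a declared function identifier to concrete code.
# Swapping the callable here does not require any change to the mapping below.
IMPLEMENTATIONS: Dict[str, Callable[[str], str]] = {
    TO_TITLE_CASE: lambda value: value.strip().title(),
}

# Declarative mapping (a stand-in for a mapping document): it names the
# predicate, the input field, and optionally the function identifier to apply.
MAPPING = {
    "subject_template": "http://example.org/person/{id}",
    "predicate_object_maps": [
        {
            "predicate": "http://xmlns.com/foaf/0.1/name",
            "reference": "name",
            "function": TO_TITLE_CASE,
        },
        {
            "predicate": "http://example.org/ns#birthYear",
            "reference": "born",
            "function": None,  # raw value is used as is
        },
    ],
}

Triple = Tuple[str, str, str]


def generate_triples(
    record: Dict[str, str],
    mapping: dict,
    implementations: Dict[str, Callable[[str], str]],
) -> List[Triple]:
    """Apply the mapping to one record, resolving functions via the registry."""
    subject = mapping["subject_template"].format(**record)
    triples: List[Triple] = []
    for pom in mapping["predicate_object_maps"]:
        value = record[pom["reference"]]
        function_id: Optional[str] = pom["function"]
        if function_id is not None:
            # The mapping only knows the identifier; the code comes from the registry.
            value = implementations[function_id](value)
        triples.append((subject, pom["predicate"], value))
    return triples


if __name__ == "__main__":
    row = {"id": "42", "name": "  ada lovelace ", "born": "1815"}
    for triple in generate_triples(row, MAPPING, IMPLEMENTATIONS):
        print(triple)
```

In this sketch the generation step stays generic: adding a new transformation means registering one more identifier and referencing it from the mapping, which mirrors the separation between framework developers and mapping contributors that the abstract argues for.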

Keywords

Data transformations, FnO, Linked Data generation, RML

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Ben De Meester (1)
  • Wouter Maroy (1)
  • Anastasia Dimou (1)
  • Ruben Verborgh (1)
  • Erik Mannens (1)

  1. Ghent University - imec - IDLab, Ghent, Belgium
