International Semantic Web Conference

The Semantic Web - ISWC 2015, pp. 199-216

Ontology-Based Integration of Cross-Linked Datasets

  • Diego Calvanese
  • Martin Giese
  • Dag Hovland
  • Martin Rezk
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9366)

Abstract

In this paper, we tackle the problem of answering SPARQL queries over virtually integrated databases. We assume that the entity resolution problem has already been solved and explicit information is available about which records in the different databases refer to the same real-world entity. Surprisingly, to the best of our knowledge, there has been no attempt to extend the standard Ontology-Based Data Access (OBDA) setting to take into account these DB links for SPARQL query answering and consistency checking. This is partly because the OWL built-in owl:sameAs property, the most natural representation of links between datasets, is not included in OWL 2 QL, the de facto ontology language for OBDA. We formally treat several fundamental questions in this context: how links over database identifiers can be represented in terms of owl:sameAs statements, how to recover rewritability of SPARQL into SQL (lost because of owl:sameAs statements), and how to check consistency. Moreover, we investigate how our solution can be made to scale up to large enterprise datasets. We have implemented the approach and carried out an extensive set of experiments showing its scalability.
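To make the role of links between database identifiers concrete, the following is a minimal sketch of the semantics that owl:sameAs statements impose on query answering. It is not the approach of the paper: it naively materializes equivalence classes in memory, whereas the paper is concerned with rewriting SPARQL into SQL. The identifiers, the toy tables, and the function names (`same_as_classes`, `names_with_depths`) are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: two sources describe the same real-world entities
# under different local identifiers.
DB1_NAMES = {"db1:w1": "Well A", "db1:w2": "Well B"}
DB2_DEPTHS = {"db2:x9": 3120, "db2:x7": 2450}

# Hypothetical link table: each pair is read as an owl:sameAs statement.
LINKS = [("db1:w1", "db2:x9"), ("db1:w2", "db2:x7")]


def same_as_classes(links):
    """Map each identifier to its equivalence class under the links.

    owl:sameAs is reflexive, symmetric, and transitive, so the classes are
    the connected components of the link graph (computed with union-find).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for x in list(parent):
        groups[find(x)].add(x)
    return {m: frozenset(g) for g in groups.values() for m in g}


def names_with_depths(names, depths, links):
    """Join name and depth facts across sources, modulo the sameAs classes."""
    classes = same_as_classes(links)
    results = set()
    for name_id, name in names.items():
        # An identifier without links is only equivalent to itself.
        for other in classes.get(name_id, frozenset({name_id})):
            if other in depths:
                results.add((name, depths[other]))
    return sorted(results)


if __name__ == "__main__":
    print(names_with_depths(DB1_NAMES, DB2_DEPTHS, LINKS))
```

Without the link table the join returns nothing, because no identifier occurs in both sources. With it, each name is paired with the depth recorded under the linked identifier.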



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Diego Calvanese (1)
  • Martin Giese (2)
  • Dag Hovland (2)
  • Martin Rezk (1)
  1. Free University of Bozen-Bolzano, Bolzano, Italy
  2. University of Oslo, Oslo, Norway
