Datalog Revisited for Reasoning in Linked Data

  • Marie-Christine Rousset (corresponding author)
  • Manuel Atencia
  • Jerome David
  • Fabrice Jouanot
  • Olivier Palombi
  • Federico Ulliana
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10370)


Linked Data provides access to huge, continuously growing amounts of open data and ontologies in RDF format that describe entities, links and properties on those entities. Equipping Linked Data with inference paves the way to making the Semantic Web a reality. In this survey, we describe a unifying framework for RDF ontologies and databases that we call deductive RDF triplestores. It consists of equipping RDF triplestores with Datalog inference rules. This rule language makes it possible to capture in a uniform manner OWL constraints that are useful in practice, such as property transitivity or symmetry, as well as domain-specific rules with practical relevance for users in many domains of interest. The expressivity and genericity of this framework are illustrated for modeling Linked Data applications and for developing inference algorithms. In particular, we show how it allows the problem of data linkage in Linked Data to be modeled as a reasoning problem on possibly decentralized data. We also explain how it makes it possible to efficiently extract expressive modules from Semantic Web ontologies and databases with formal guarantees, while effectively controlling their succinctness. Experiments conducted on real-world datasets have demonstrated the feasibility of this approach and its usefulness in practice for data integration and information extraction.
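
To make the idea of a deductive RDF triplestore concrete, the following minimal sketch applies two Datalog rules (transitivity and symmetry of a property) to a small set of triples until no new fact can be derived. It uses textbook naive bottom-up evaluation, which is not necessarily the inference algorithm used by the authors. The function names `match` and `saturate`, the `ex:` prefix, and the sample facts are all hypothetical and chosen only for illustration.

```python
def match(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` equals `triple`.

    Terms starting with '?' are variables; all other terms are constants.
    Returns the extended binding, or None if the match fails.
    """
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Bind the variable, or check consistency with an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def saturate(facts, rules):
    """Naive bottom-up evaluation: apply all rules until a fixpoint is reached.

    Each rule is a pair (body, head), where body is a list of triple patterns
    and head is a single triple pattern whose variables all occur in the body.
    """
    facts = set(facts)
    while True:
        derived = set()
        for body, head in rules:
            bindings = [{}]
            for pattern in body:
                # Join the current partial bindings with every known fact.
                bindings = [
                    b2
                    for b in bindings
                    for f in facts
                    if (b2 := match(pattern, f, b)) is not None
                ]
            for b in bindings:
                derived.add(tuple(b.get(term, term) for term in head))
        if derived <= facts:
            return facts
        facts |= derived


# Hypothetical rules: ex:partOf is transitive, ex:adjacentTo is symmetric.
RULES = [
    ([("?x", "ex:partOf", "?y"), ("?y", "ex:partOf", "?z")],
     ("?x", "ex:partOf", "?z")),
    ([("?x", "ex:adjacentTo", "?y")],
     ("?y", "ex:adjacentTo", "?x")),
]

# Hypothetical facts, written as (subject, property, object) triples.
FACTS = {
    ("ex:hand", "ex:partOf", "ex:arm"),
    ("ex:arm", "ex:partOf", "ex:body"),
    ("ex:arm", "ex:adjacentTo", "ex:shoulder"),
}

closure = saturate(FACTS, RULES)
```

Under these assumptions, `closure` contains the three explicit triples plus the two inferred ones: `ex:hand ex:partOf ex:body` and `ex:shoulder ex:adjacentTo ex:arm`. Because the rules only recombine existing constants, the set of derivable triples is finite and the loop always terminates.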




Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Marie-Christine Rousset (1, 2)
  • Manuel Atencia (1)
  • Jerome David (1)
  • Fabrice Jouanot (1)
  • Olivier Palombi (3, 4)
  • Federico Ulliana (5)

  1. Université Grenoble Alpes, Grenoble INP, CNRS, Inria, LIG, Grenoble, France
  2. Institut universitaire de France, Paris, France
  3. Université Grenoble Alpes, Grenoble INP, CNRS, Inria, LJK, Grenoble, France
  4. Université Grenoble Alpes, LADAF, CHU Grenoble, Grenoble, France
  5. Université de Montpellier, CNRS, Inria, LIRMM, Montpellier, France
