Link Representation and Discovery

  • Philipp Cimiano
  • Christian Chiarcos
  • John P. McCrae
  • Jorge Gracia


In this chapter we address the question of how links can be discovered between different datasets published as Linguistic Linked Open Data. We describe common patterns for representing links, both between data in the same language (monolingual scenario) and between data in different languages (cross-lingual scenario). We then describe techniques that can be used to discover links between datasets automatically. As most of these techniques rely on computing similarities between data elements, we briefly review the most common techniques for computing syntactic and semantic similarity. Finally, we provide a brief overview of tools and frameworks that can be used to discover links between language resources semi-automatically.





Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Semantic Computing Group, Bielefeld University, Bielefeld, Germany
  2. Angewandte Computerlinguistik, Goethe-University, Frankfurt am Main, Germany
  3. Insight Centre for Data Analytics, National University of Ireland, Galway, Ireland
  4. Aragon Institute of Engineering Research (I3A), University of Zaragoza, Zaragoza, Spain
