Normalized Semantic Web Distance

  • Tom De Nies
  • Christian Beecks
  • Fréderic Godin
  • Wesley De Neve
  • Grzegorz Stepien
  • Dörthe Arndt
  • Laurens De Vocht
  • Ruben Verborgh
  • Thomas Seidl
  • Erik Mannens
  • Rik Van de Walle
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9678)


In this paper, we investigate the Normalized Semantic Web Distance (NSWD), a semantics-aware distance measure between two concepts in a knowledge graph. Our measure extends the Normalized Web Distance, a recently established distance between two textual terms, to make it more semantically aware. In addition to the theoretical foundations of the NSWD, we examine its properties with respect to computation and implementation. We study three variants of the NSWD that make use of all semantic properties of nodes in a knowledge graph. Our performance evaluation on the Miller-Charles benchmark shows that the NSWD correlates with human similarity assessments on both the Freebase and DBpedia knowledge graphs, with correlation values of up to 0.69. Moreover, we verified the semantic awareness of the NSWD on a set of 20 unambiguous concept-pairs. We conclude that the NSWD is a promising measure with (1) a reusable implementation across knowledge graphs, (2) sufficient correlation with human assessments, and (3) awareness of semantic differences between ambiguous concepts.
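As a rough illustration of the measure that the NSWD builds on, the sketch below computes the Normalized Web Distance of Cilibrasi and Vitányi from occurrence counts, NWD(x, y) = (max{log f(x), log f(y)} − log f(x, y)) / (log N − min{log f(x), log f(y)}). The toy graph, the function names, and the choice to use a node's neighbor set as its "occurrences" are assumptions made for this example only; they are not the three NSWD variants studied in the paper.

```python
import math


def normalized_web_distance(fx, fy, fxy, n):
    """Normalized Web Distance from raw counts.

    fx, fy: number of contexts containing x and y, respectively
    fxy:    number of contexts containing both x and y
    n:      total number of contexts (must exceed min(fx, fy))
    """
    if fx <= 0 or fy <= 0:
        raise ValueError("both terms must occur at least once")
    if n <= min(fx, fy):
        raise ValueError("n must be larger than the smaller count")
    if fxy == 0:
        # No shared context: the distance is unbounded.
        return math.inf
    log_fx, log_fy = math.log(fx), math.log(fy)
    return (max(log_fx, log_fy) - math.log(fxy)) / (math.log(n) - min(log_fx, log_fy))


# Hypothetical toy knowledge graph: each concept maps to the set of
# nodes it is linked to. Real graphs such as Freebase or DBpedia would
# be queried instead; this dictionary is purely illustrative.
TOY_GRAPH = {
    "car": {"vehicle", "engine", "wheel", "road"},
    "automobile": {"vehicle", "engine", "wheel"},
    "banana": {"fruit", "plant"},
}


def graph_distance(graph, x, y):
    """Apply the NWD formula with neighbor sets standing in for contexts.

    Assumption: f(x) is the number of neighbors of x, f(x, y) is the
    number of shared neighbors, and N is the number of distinct nodes.
    """
    all_nodes = set(graph) | set().union(*graph.values())
    nx, ny = graph[x], graph[y]
    return normalized_web_distance(len(nx), len(ny), len(nx & ny), len(all_nodes))


if __name__ == "__main__":
    print(graph_distance(TOY_GRAPH, "car", "automobile"))
    print(graph_distance(TOY_GRAPH, "car", "banana"))
```

In this toy setting, concepts that share most of their neighbors receive a small distance, while concepts with no shared neighbors receive an infinite one. A practical implementation would also need to decide how to bound or smooth that infinite case.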


Semantic Distance · Jaccard Similarity · Semantic Context · Kolmogorov Complexity · Knowledge Graph
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



The research activities in this paper were funded by Ghent University, iMinds (by the Flemish Government), IWT Flanders, FWO-Flanders, the European Union, and RWTH Aachen University.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Tom De Nies (1), email author
  • Christian Beecks (2)
  • Fréderic Godin (1)
  • Wesley De Neve (1, 3)
  • Grzegorz Stepien (2)
  • Dörthe Arndt (1)
  • Laurens De Vocht (1)
  • Ruben Verborgh (1)
  • Thomas Seidl (2)
  • Erik Mannens (1)
  • Rik Van de Walle (1)

  1. iMinds – Data Science Lab, Ghent University, Ghent, Belgium
  2. DME Group, RWTH Aachen University, Aachen, Germany
  3. IVY Lab, KAIST, Daejeon, Republic of Korea
