A Feature and Information Theoretic Framework for Semantic Similarity and Relatedness

  • Giuseppe Pirró
  • Jérôme Euzenat
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6496)


Semantic similarity and relatedness measures between ontology concepts are useful in many research areas. While similarity only considers subsumption relations to assess how two objects are alike, relatedness takes into account a broader range of relations (e.g., part-of). In this paper, we present a framework that maps the feature-based model of similarity into the information-theoretic domain. A new way of computing information content (IC) values directly from an ontology structure is also introduced. This new model, called Extended Information Content (eIC), takes into account the whole set of semantic relations defined in an ontology. The proposed framework makes it possible to rewrite existing similarity measures so that they can be augmented to compute semantic relatedness. Building on this framework, a new measure called FaITH (Feature and Information THeoretic) has been devised. Extensive experimental evaluations confirmed the suitability of the framework.
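To make the idea of computing IC from the ontology structure and plugging it into a feature-style ratio more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration only: the toy taxonomy and the function names are hypothetical, the IC formula is an intrinsic IC in the style of Seco, Veale and Hayes (based on hyponym counts), and the ratio of shared to total information is one plausible reading of a feature-based model expressed in IC terms. The abstract does not state the exact FaITH or eIC formulas, and eIC, which also uses non-taxonomic relations, is not modelled here.

```python
import math

# Hypothetical toy taxonomy: child -> parent (is-a links only).
PARENT = {
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "oak": "plant",
}
CONCEPTS = set(PARENT) | set(PARENT.values())


def ancestors(concept):
    """Return the concept together with all of its subsumers."""
    chain = [concept]
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def hyponym_count(concept):
    """Number of concepts strictly below `concept` in the taxonomy."""
    return sum(1 for d in CONCEPTS if d != concept and concept in ancestors(d))


def intrinsic_ic(concept):
    """Assumed intrinsic IC: leaves score 1, the root scores 0."""
    return 1.0 - math.log(hyponym_count(concept) + 1) / math.log(len(CONCEPTS))


def most_specific_common_ancestor(c1, c2):
    """Common subsumer carrying the most information."""
    common = set(ancestors(c1)) & set(ancestors(c2))
    return max(common, key=intrinsic_ic)


def ratio_similarity(c1, c2):
    """Shared information divided by total information (assumed form)."""
    shared = intrinsic_ic(most_specific_common_ancestor(c1, c2))
    total = intrinsic_ic(c1) + intrinsic_ic(c2) - shared
    return shared / total if total > 0 else 1.0


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "oak"), ("dog", "dog")]:
        print(pair, round(ratio_similarity(*pair), 3))
```

In this sketch the shared information plays the role of common features and the remaining IC of each concept plays the role of its distinguishing features, so concepts under the same parent score higher than concepts that only meet at the root.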


Keywords: Semantic Similarity, Feature Based Similarity, Ontologies



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Giuseppe Pirró (1)
  • Jérôme Euzenat (1)
  1. INRIA Rhône-Alpes, Montbonnot, France
