Exploiting Taxonomical Knowledge to Compute Semantic Similarity: An Evaluation in the Biomedical Domain
Determining the semantic similarity between concept pairs is an important task in many language-related problems. In the biomedical field, several approaches have been proposed that assess the semantic similarity between concepts by exploiting the knowledge provided by a domain ontology. In this paper, we study some of those approaches, exploiting the taxonomical structure of a biomedical ontology (SNOMED-CT). We then present a new measure based on the amount of overlapping and non-overlapping taxonomical knowledge between concept pairs. The performance of our proposal is compared against related measures using a set of standard benchmarks of manually ranked terms. The correlation between the results obtained by the computerized approaches and the manual ranking shows that our proposal clearly outperforms previous work.
Keywords: Semantic similarity, Ontologies, Biomedicine, Data mining
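The abstract describes the proposed measure only in words, as a comparison of overlapping and non-overlapping taxonomical knowledge between two concepts. The following is a minimal sketch of that general idea, not the paper's formula: it assumes that the "taxonomical knowledge" of a concept can be approximated by the concept together with its ancestors, and it scores a pair by the ratio of shared ancestors to all ancestors (a Jaccard-style ratio chosen here purely for illustration). The toy taxonomy `TOY_PARENTS` and the function names are hypothetical and do not come from SNOMED-CT.

```python
# Hypothetical toy taxonomy: concept -> set of direct parents.
# This is illustrative data, not an excerpt of SNOMED-CT.
TOY_PARENTS = {
    "disease": set(),
    "heart disease": {"disease"},
    "lung disease": {"disease"},
    "myocarditis": {"heart disease"},
    "pneumonia": {"lung disease"},
}


def ancestors_and_self(concept, parents):
    """Return the concept together with all of its taxonomical ancestors."""
    seen = set()
    stack = [concept]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def overlap_ratio(c1, c2, parents):
    """Assumed illustrative score: shared ancestors divided by all ancestors.

    Returns 1.0 for identical concepts and approaches 0 as the two
    concepts share less of their taxonomical knowledge.
    """
    a = ancestors_and_self(c1, parents)
    b = ancestors_and_self(c2, parents)
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    print(overlap_ratio("myocarditis", "pneumonia", TOY_PARENTS))
    print(overlap_ratio("myocarditis", "heart disease", TOY_PARENTS))
```

In this toy example, concepts that share only the root score lower than a concept compared with its own parent, which is the qualitative behaviour the abstract's description suggests. The exact weighting of overlapping versus non-overlapping knowledge used in the paper is not stated in the abstract and is not reproduced here.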