Extended Tversky Similarity for Resolving Terminological Heterogeneities across Ontologies
We propose a novel method for computing the similarity between concepts from different ontologies, based on how much of the information content of their labels overlaps. We extend Tversky’s similarity measure by using the information content of each term within an ontology label both for the similarity computation and for assigning weights to tokens. The approach is suitable for handling compound labels. In our experiments, it outperforms existing terminological similarity measures on the ontology matching task.
Keywords: Similarity Measure, Weight Assignment, Ontology Match, Heterogeneity Type, Concept Label
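The abstract does not give the exact formula, so the following is only a minimal sketch of one way an information-content-weighted Tversky ratio over label tokens could look. Several choices here are assumptions rather than the paper's definitions: the tokenizer, the estimate IC(t) = -log p(t) from token frequencies within each ontology's labels, the averaging of the two ontologies' IC values for shared tokens, the fallback weight for unseen tokens, and the default values of `alpha` and `beta`. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(label):
    """Split a compound label ('ConferencePaper', 'has_author') into lowercase tokens."""
    # Insert a space at lowercase/digit -> uppercase boundaries (camelCase).
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def information_content(labels):
    """Assumed estimate: IC(t) = -log p(t), with p(t) the relative token
    frequency over all labels of a single ontology."""
    counts = Counter(tok for label in labels for tok in tokenize(label))
    total = sum(counts.values())
    return {tok: -math.log(n / total) for tok, n in counts.items()}


def _ic(tok, ic):
    # Assumption: a token unseen in the ontology gets the largest known IC.
    return ic.get(tok, max(ic.values(), default=0.0))


def weighted_tversky(label_a, label_b, ic_a, ic_b, alpha=0.5, beta=0.5):
    """Tversky ratio in which each token contributes its information content."""
    a, b = set(tokenize(label_a)), set(tokenize(label_b))
    # Shared tokens: average of the IC values from both ontologies (assumption).
    common = sum((_ic(t, ic_a) + _ic(t, ic_b)) / 2 for t in a & b)
    only_a = sum(_ic(t, ic_a) for t in a - b)
    only_b = sum(_ic(t, ic_b) for t in b - a)
    denom = common + alpha * only_a + beta * only_b
    if denom == 0:
        # All weights vanished: fall back to plain token-set equality.
        return 1.0 if a and a == b else 0.0
    return common / denom


# Toy labels standing in for two ontologies (illustrative data only).
labels_a = ["ConferencePaper", "Paper", "Author", "ConferenceMember"]
labels_b = ["conference_paper", "paper_author", "reviewer"]
ic_a = information_content(labels_a)
ic_b = information_content(labels_b)

same = weighted_tversky("ConferencePaper", "conference_paper", ic_a, ic_b)
partial = weighted_tversky("ConferencePaper", "paper_author", ic_a, ic_b)
disjoint = weighted_tversky("Author", "reviewer", ic_a, ic_b)
```

In this sketch, labels with identical token sets score 1, labels with no shared tokens score 0, and a partial overlap falls in between, with rare (high-IC) tokens counting for more than frequent ones.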