Applied Intelligence, Volume 38, Issue 1, pp 29–44

Semantic similarity estimation from multiple ontologies

  • Montserrat Batet
  • David Sánchez
  • Aida Valls
  • Karina Gibert

Abstract

Estimating the semantic similarity between words is an important task in many language-related applications. Several approaches have been proposed that assess similarity by evaluating the knowledge modelled in an ontology. However, in many domains, knowledge is dispersed across several partial and/or overlapping ontologies. Because most previous work on semantic similarity supports only a single input ontology, we propose a method that enables similarity estimation across multiple ontologies. Our method distinguishes different cases according to the ontology or ontologies to which the input terms belong. We propose several heuristics to deal with each case, aiming to fill in missing values when only partial knowledge is available and, when knowledge overlaps, to capture the strongest semantic evidence, which results in the most accurate similarity assessment. We evaluate and compare our method using several general-purpose and biomedical benchmarks of word pairs whose similarity has been assessed by human experts, and several general-purpose (WordNet) and biomedical (SNOMED CT and MeSH) ontologies. Results show that our method improves the accuracy of similarity estimation compared with single-ontology approaches and with state-of-the-art related work on multi-ontology similarity assessment.
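The case split described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the two toy taxonomies, the edge-counting measure, and the function names are all assumptions made for illustration. It covers only the cases where both terms appear together in at least one ontology, and it takes the highest score across the ontologies that contain both as a stand-in for "strongest semantic evidence". The case where the terms never share an ontology is left unhandled.

```python
from typing import Dict, List, Optional

# Toy taxonomies mapping each concept to its parent (None marks the root).
# Both are hypothetical and are NOT excerpts of WordNet, SNOMED CT, or MeSH.
Taxonomy = Dict[str, Optional[str]]

ONTO_A: Taxonomy = {
    "entity": None,
    "disease": "entity",
    "infection": "disease",
    "flu": "infection",
    "cold": "infection",
}
ONTO_B: Taxonomy = {
    "entity": None,
    "disease": "entity",
    "viral_disease": "disease",
    "flu": "viral_disease",
    "cold": "disease",
    "fracture": "entity",
}


def ancestors(term: str, onto: Taxonomy) -> List[str]:
    """Return the path from term up to the root, term included."""
    path: List[str] = []
    node: Optional[str] = term
    while node is not None:
        path.append(node)
        node = onto[node]
    return path


def path_similarity(a: str, b: str, onto: Taxonomy) -> float:
    """Edge-counting similarity: 1 / (1 + edges on the path through the
    least common subsumer). Chosen for simplicity, not taken from the paper."""
    path_a = ancestors(a, onto)
    path_b = ancestors(b, onto)
    common = next(node for node in path_a if node in path_b)
    edges = path_a.index(common) + path_b.index(common)
    return 1.0 / (1.0 + edges)


def multi_ontology_similarity(
    a: str, b: str, ontologies: List[Taxonomy]
) -> Optional[float]:
    """Assumed heuristic: score the pair in every ontology that contains
    both terms and keep the highest score."""
    shared = [onto for onto in ontologies if a in onto and b in onto]
    if not shared:
        # Terms never co-occur in one ontology; this sketch does not
        # attempt to bridge separate ontologies.
        return None
    return max(path_similarity(a, b, onto) for onto in shared)


if __name__ == "__main__":
    sources = [ONTO_A, ONTO_B]
    print(multi_ontology_similarity("flu", "cold", sources))
    print(multi_ontology_similarity("flu", "fracture", sources))
    print(multi_ontology_similarity("infection", "fracture", sources))
```

In this toy setup, "flu" and "cold" appear in both taxonomies, so the closer of the two placements determines the score. "flu" and "fracture" co-occur only in the second taxonomy, which therefore supplies the value that a single-ontology approach restricted to the first would be missing.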

Keywords

Semantic similarity; Ontologies; Knowledge representation; WordNet; MeSH; SNOMED


Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Montserrat Batet (1)
  • David Sánchez (1)
  • Aida Valls (1)
  • Karina Gibert (2)

  1. Departament d’Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, Tarragona, Spain
  2. Department of Statistics and Operational Research, Universitat Politècnica de Catalunya, Barcelona, Spain
