DIS-C: conceptual distance in ontologies, a graph-based approach

Abstract

This paper presents the DIS-C approach, a novel method to assess the conceptual distance between concepts within an ontology. DIS-C is graph-based in the sense that the whole topology of the ontology is considered when computing the weight of the relationships between concepts. The methodology is composed of two main steps. First, in order to take advantage of previous knowledge, an expert in the ontology domain assigns initial weight values to each of the relations in the ontology. Then, an automatic method for computing the conceptual relations refines the weights assigned to each relation until reaching a stable state. We introduce a metric called generality, which evaluates the accessibility of each concept by treating the ontology as a strongly connected graph. Unlike most previous approaches, the DIS-C algorithm computes similarity between concepts in ontologies that are not necessarily represented in a hierarchical or taxonomic structure. DIS-C is therefore capable of incorporating a wide variety of relationships between concepts, such as meronymy, antonymy, functionality and causality.
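
The two-step procedure described in the abstract can be sketched roughly as follows. This is an illustration only: the abstract does not state the formulas, so the `generality` score and the update rule in `refine_weights` are hypothetical stand-ins, as are the toy concepts and expert weights. Only the overall shape (expert-seeded weights on a strongly connected graph, refined repeatedly until they stop changing) follows the text.

```python
import math

def all_pairs_shortest(nodes, weights):
    """Floyd-Warshall on a directed graph given as {(source, target): weight}."""
    dist = {(u, v): (0.0 if u == v else weights.get((u, v), math.inf))
            for u in nodes for v in nodes}
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i, k] + dist[k, j] < dist[i, j]:
                    dist[i, j] = dist[i, k] + dist[k, j]
    return dist

def generality(nodes, dist):
    """Hypothetical score in [0, 1] (NOT the paper's definition):
    the share of a concept's total distance that is outgoing."""
    scores = {}
    for x in nodes:
        out_d = sum(dist[x, y] for y in nodes if y != x)
        in_d = sum(dist[y, x] for y in nodes if y != x)
        scores[x] = out_d / (out_d + in_d)
    return scores

def refine_weights(nodes, expert, tol=1e-6, max_iter=200):
    """Start from the expert weights and iterate until the largest change is below tol."""
    weights = dict(expert)
    for _ in range(max_iter):
        g = generality(nodes, all_pairs_shortest(nodes, weights))
        # Placeholder rule: scale each expert weight by the generality of its source.
        target = {(u, v): w * (0.5 + g[u]) for (u, v), w in expert.items()}
        # Damped update, so every weight stays within [0.5, 1.5] times its expert value.
        updated = {e: 0.5 * weights[e] + 0.5 * target[e] for e in weights}
        change = max(abs(updated[e] - weights[e]) for e in weights)
        weights = updated
        if change < tol:
            return weights, True
    return weights, False

# Toy, strongly connected ontology with made-up expert weights per relation.
nodes = ["animal", "dog", "tail"]
expert = {
    ("dog", "animal"): 1.0,   # is-a
    ("animal", "dog"): 2.0,   # inverse of is-a
    ("tail", "dog"): 1.0,     # part-of (meronymy)
    ("dog", "tail"): 1.5,     # has-part
}
weights, converged = refine_weights(nodes, expert)
print(converged, weights)
```

The `converged` flag reports whether the stable state was reached within `max_iter` rounds; the placeholder rule carries no convergence guarantee of its own.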

Notes

  1. Formally, the output is not a distance, since some conditions are not met, such as symmetry and triangle inequality.

  2. The distance is inversely proportional to the absolute value of the correlation.

  3. Simple rounding.

  4. As mentioned, the conceptual distance is not symmetric (\(\exists a,b\in C \mid \Delta _{K}(a,b)\ne \Delta _{K}(b,a)\)). We therefore report the conceptual distance from word A to word B (column DIS-C(to)), from word B to word A (column DIS-C(from)), the average of these two distances (column DIS-C(avg)), the minimum (column DIS-C(min)) and the maximum (column DIS-C(max)).
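
The five columns described in note 4 can be derived from any asymmetric distance function, as in the minimal sketch below. The function name `distance_columns`, the word pair and the two distance values are invented for illustration and are not results from the paper.

```python
def distance_columns(delta, a, b):
    """Summarize an asymmetric distance delta for the word pair (a, b)."""
    to = delta(a, b)      # distance from word A to word B
    frm = delta(b, a)     # distance from word B to word A
    return {
        "DIS-C(to)": to,
        "DIS-C(from)": frm,
        "DIS-C(avg)": (to + frm) / 2,
        "DIS-C(min)": min(to, frm),
        "DIS-C(max)": max(to, frm),
    }

# Hypothetical asymmetric distances for one word pair.
toy = {("car", "journey"): 0.25, ("journey", "car"): 0.75}
columns = distance_columns(lambda x, y: toy[x, y], "car", "journey")
print(columns)
```

Because the average, minimum and maximum are symmetric in the two directions, swapping the words changes only the first two columns.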

Acknowledgements

This work was partially sponsored by Instituto Politécnico Nacional and SIP-IPN under Grants 20182159, 20180308, 20180409, 20180773, 20180839 and 20181568. It was also sponsored by Consejo Nacional de Ciencia y Tecnología (CONACyT) under Grant PN-2016/2110. We are thankful to the reviewers for their invaluable and constructive feedback, which helped improve the quality of this paper.

Author information

Corresponding author

Correspondence to Rolando Quintero.

About this article

Cite this article

Quintero, R., Torres-Ruiz, M., Menchaca-Mendez, R. et al. DIS-C: conceptual distance in ontologies, a graph-based approach. Knowl Inf Syst 59, 33–65 (2019). https://doi.org/10.1007/s10115-018-1200-3

  • DOI: https://doi.org/10.1007/s10115-018-1200-3

Keywords

  • Conceptual distance
  • Semantic similarity
  • Ontology
  • Graph