Combining Semantic and Lexical Measures to Evaluate Medical Terms Similarity

  • Silvio Domingos Cardoso
  • Marcos Da Silveira
  • Ying-Chi Lin
  • Victor Christen
  • Erhard Rahm
  • Chantal Reynaud-Delaître
  • Cédric Pruski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11371)

Abstract

Similarity measures are a cornerstone of tasks in many domains, ranging from ontology alignment to information retrieval. Existing metrics can be classified into several categories, among which the lexical and semantic families of similarity measures predominate, yet the two have rarely been combined to carry out these tasks. In this paper, we propose an original approach combining lexical and ontology-based semantic similarity measures to improve the evaluation of term relatedness. We validate our approach through a set of experiments based on a reference corpus constructed by domain experts in the medical field, and we further evaluate the impact of ontology evolution on the semantic similarity measures used.

Keywords

Similarity measures; Ontology evolution; Semantic Web; Medical terminologies

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Silvio Domingos Cardoso (1, 2)
  • Marcos Da Silveira (1)
  • Ying-Chi Lin (3)
  • Victor Christen (3)
  • Erhard Rahm (3)
  • Chantal Reynaud-Delaître (2)
  • Cédric Pruski (1)

  1. LIST, Luxembourg Institute of Science and Technology, Esch-sur-Alzette, Luxembourg
  2. LRI, University of Paris-Sud XI, Gif-sur-Yvette, France
  3. Department of Computer Science, Universität Leipzig, Leipzig, Germany
