Combining Semantic and Lexical Measures to Evaluate Medical Terms Similarity

  • Conference paper
Data Integration in the Life Sciences (DILS 2018)

Abstract

Similarity measures are a cornerstone of tasks ranging from ontology alignment to information retrieval. Existing metrics fall into several categories, among which the lexical and semantic families predominate, yet the two have rarely been combined to carry out these tasks. In this paper, we propose an original approach that combines lexical and ontology-based semantic similarity measures to improve the evaluation of term relatedness. We validate our approach through a set of experiments based on a reference corpus constructed by domain experts in the medical field, and we further evaluate the impact of ontology evolution on the semantic similarity measures used.
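As a rough illustration of what combining a lexical and an ontology-based semantic measure can look like, the sketch below blends two scores with a weighted mean. The abstract does not say which measures or which combination scheme the paper uses, so every choice here is an assumption made for illustration only: the Dice coefficient over character bigrams as the lexical measure, a Wu-Palmer style depth ratio as the semantic measure, the weight `alpha`, the toy taxonomy `TOY_PARENT`, and all function names.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy taxonomy (child -> parent); not taken from any real ontology.
TOY_PARENT: Dict[str, Optional[str]] = {
    "disease": None,
    "cardiovascular disease": "disease",
    "myocardial infarction": "cardiovascular disease",
    "cardiac arrest": "cardiovascular disease",
    "respiratory disease": "disease",
    "asthma": "respiratory disease",
}


def bigrams(term: str) -> Set[str]:
    """Return the set of character bigrams of a lower-cased term."""
    t = term.lower()
    return {t[i:i + 2] for i in range(len(t) - 1)}


def lexical_similarity(a: str, b: str) -> float:
    """Dice coefficient over character bigrams (assumed lexical measure)."""
    ba, bb = bigrams(a), bigrams(b)
    if not ba and not bb:
        # Terms shorter than two characters: fall back to exact match.
        return 1.0 if a.lower() == b.lower() else 0.0
    return 2 * len(ba & bb) / (len(ba) + len(bb))


def path_to_root(concept: str) -> List[str]:
    """List the concept and its ancestors, from the concept up to the root."""
    path = [concept]
    while TOY_PARENT[path[-1]] is not None:
        path.append(TOY_PARENT[path[-1]])
    return path


def semantic_similarity(a: str, b: str) -> float:
    """Wu-Palmer style ratio (assumed semantic measure); the root has depth 1."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_a = set(path_a)
    # The first node on b's path that is also an ancestor of a is the
    # deepest common subsumer.
    for node in path_b:
        if node in ancestors_a:
            depth_lcs = len(path_to_root(node))
            return 2 * depth_lcs / (len(path_a) + len(path_b))
    raise ValueError("concepts do not share a common ancestor")


def combined_similarity(a: str, b: str, alpha: float = 0.5) -> float:
    """Weighted mean of the lexical and semantic scores (assumed combination)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * lexical_similarity(a, b) + (1 - alpha) * semantic_similarity(a, b)


if __name__ == "__main__":
    pair = ("myocardial infarction", "cardiac arrest")
    print("lexical :", lexical_similarity(*pair))
    print("semantic:", semantic_similarity(*pair))
    print("combined:", combined_similarity(*pair, alpha=0.5))
```

In this sketch, setting `alpha` to 1 reduces the score to the lexical measure alone and setting it to 0 reduces it to the semantic measure alone, which makes it easy to compare the combined score against each family on its own.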

This work is supported by FNR Luxembourg and DFG Germany through the ELISA project.

Author information

Corresponding author

Correspondence to Silvio Domingos Cardoso.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Cardoso, S.D. et al. (2019). Combining Semantic and Lexical Measures to Evaluate Medical Terms Similarity. In: Auer, S., Vidal, ME. (eds) Data Integration in the Life Sciences. DILS 2018. Lecture Notes in Computer Science, vol 11371. Springer, Cham. https://doi.org/10.1007/978-3-030-06016-9_2

  • DOI: https://doi.org/10.1007/978-3-030-06016-9_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-06015-2

  • Online ISBN: 978-3-030-06016-9

  • eBook Packages: Computer Science, Computer Science (R0)
