
Annotating Medical Forms Using UMLS

  • Conference paper
Data Integration in the Life Sciences (DILS 2015)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9162)


Medical forms are frequently used to document patient data or to collect relevant data for clinical trials. It is crucial to harmonize medical forms in order to improve interoperability and data integration between medical applications. Here we propose a (semi-)automatic annotation of medical forms with concepts of the Unified Medical Language System (UMLS). Our annotation workflow encompasses a novel semantic blocking, sophisticated match techniques, and post-processing steps to select reasonable annotations. We evaluate our methods against reference mappings between medical forms and UMLS, and further manually validate the recommended annotations.
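
The workflow named above (blocking, matching, post-processing) can be illustrated with a minimal Python sketch. It is not the authors' method, which this page does not describe: it substitutes a shared-token filter for the semantic blocking, token-set Jaccard similarity for the match techniques, and a threshold plus top-k cut for the post-processing. The concept table, its `TOY:` identifiers, and the names `annotate`, `threshold`, and `top_k` are all hypothetical.

```python
import re

# Toy stand-in for a UMLS subset; identifiers and names are invented for illustration.
TOY_CONCEPTS = {
    "TOY:0001": "body weight",
    "TOY:0002": "body height",
    "TOY:0003": "systolic blood pressure",
    "TOY:0004": "diastolic blood pressure",
}


def tokens(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def annotate(question, concepts, threshold=0.3, top_k=2):
    """Return up to top_k (concept_id, score) pairs for one form question."""
    q = tokens(question)
    scored = []
    for cid, name in concepts.items():
        c = tokens(name)
        # Crude blocking (assumption): skip concepts sharing no token with the question.
        if not q & c:
            continue
        # Matching (assumption): token-set Jaccard similarity.
        score = jaccard(q, c)
        if score >= threshold:
            scored.append((cid, score))
    # Post-processing (assumption): best score first, ties broken by identifier.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


print(annotate("Patient body weight in kg", TOY_CONCEPTS))
```

For the question "Patient body weight in kg", only the toy concept "body weight" passes the threshold (2 shared tokens out of 5 in the union, score 0.4); "body height" shares one token but scores below 0.3 and is discarded.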




This work is funded by the German Research Foundation (DFG) (grant RA 497/22-1, “ELISA - Evolution of Semantic Annotations”).

Author information


Corresponding author

Correspondence to Victor Christen.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Christen, V., Groß, A., Varghese, J., Dugas, M., Rahm, E. (2015). Annotating Medical Forms Using UMLS. In: Ashish, N., Ambite, JL. (eds) Data Integration in the Life Sciences. DILS 2015. Lecture Notes in Computer Science, vol 9162. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-21842-7

  • Online ISBN: 978-3-319-21843-4

  • eBook Packages: Computer Science, Computer Science (R0)
