Building a Very Large Ontology from Medical Thesauri

  • Chapter
Handbook on Ontologies

Part of the book series: International Handbooks on Information Systems (INFOSYS)

Summary

We report on a large-scale knowledge conversion and curation case study. Medical knowledge from the UMLS, a comprehensive though semantically shallow terminological repository, is transformed into a formally rigorous, expressive description logics format. In this way, the broad coverage of the UMLS is combined with inference mechanisms for consistency and cycle checking. These mechanisms are key not only to properly cleansing the knowledge imported directly from the UMLS, but also to subsequently updating and refining large amounts of rich and complex terminological knowledge structures. The emerging biomedical ontology currently comprises more than 240,000 conceptual entities and hence constitutes one of the largest formal knowledge bases ever built.
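
To make the notion of cycle checking concrete, the following is a minimal sketch of how a circular hierarchical relationship could be detected among imported child/parent pairs. It is not the method used in the chapter, which relies on a description logics classifier; it is a plain depth-first search, and the function name `find_cycle` and the concept identifiers are hypothetical placeholders.

```python
from collections import defaultdict

# Node states for the depth-first search.
UNVISITED, IN_PROGRESS, DONE = 0, 1, 2


def find_cycle(edges):
    """Return one cycle as a list of concept names, or None if there is none.

    `edges` is an iterable of (child, parent) pairs, an assumed stand-in
    for hierarchical links imported from a thesaurus.
    """
    parents = defaultdict(list)
    for child, parent in edges:
        parents[child].append(parent)

    state = defaultdict(int)  # every concept starts as UNVISITED

    def visit(concept, path):
        # Recursive for brevity; a very deep hierarchy would need an
        # explicit stack to stay within Python's recursion limit.
        state[concept] = IN_PROGRESS
        path.append(concept)
        for parent in parents[concept]:
            if state[parent] == IN_PROGRESS:
                # The parent is already on the current path: a cycle.
                return path[path.index(parent):] + [parent]
            if state[parent] == UNVISITED:
                found = visit(parent, path)
                if found:
                    return found
        path.pop()
        state[concept] = DONE
        return None

    for concept in list(parents):
        if state[concept] == UNVISITED:
            found = visit(concept, [])
            if found:
                return found
    return None


# Hypothetical concept identifiers, for illustration only.
cyclic = [("concept_a", "concept_b"),
          ("concept_b", "concept_c"),
          ("concept_c", "concept_a")]
acyclic = [("concept_a", "concept_b"), ("concept_b", "concept_c")]

print(find_cycle(cyclic))
print(find_cycle(acyclic))
```

A check of this kind only finds structural loops in the link graph; the consistency checking mentioned above additionally depends on the formal semantics of the description logics representation.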

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Hahn, U., Schulz, S. (2004). Building a Very Large Ontology from Medical Thesauri. In: Staab, S., Studer, R. (eds) Handbook on Ontologies. International Handbooks on Information Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24750-0_7

  • DOI: https://doi.org/10.1007/978-3-540-24750-0_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-11957-0

  • Online ISBN: 978-3-540-24750-0

  • eBook Packages: Springer Book Archive
