Advanced Clustering Technique for Medical Data Using Semantic Information

  • Kwangcheol Shin
  • Sang-Yong Han
  • Alexander Gelbukh
Conference paper

DOI: 10.1007/978-3-540-24694-7_33

Part of the Lecture Notes in Computer Science book series (LNCS, volume 2972)
Cite this paper as:
Shin K., Han SY., Gelbukh A. (2004) Advanced Clustering Technique for Medical Data Using Semantic Information. In: Monroy R., Arroyo-Figueroa G., Sucar L.E., Sossa H. (eds) MICAI 2004: Advances in Artificial Intelligence. MICAI 2004. Lecture Notes in Computer Science, vol 2972. Springer, Berlin, Heidelberg

Abstract

MEDLINE is a representative collection of medical documents supplied with original full-text natural-language abstracts as well as with representative keywords (called MeSH-terms) manually selected by expert annotators from a pre-defined ontology and structured according to their relation to the document. We show how these structured, manually assigned semantic descriptions can be combined with the original full-text abstracts to improve the quality of clustering the documents into a small number of clusters. As baselines, we use clustering based only on abstracts and clustering based only on MeSH-terms. Our experiments show 36% to 47% higher cluster coherence, as well as more refined keywords for the produced clusters.
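The abstract does not say how the two sources are combined, so the following is only an illustrative sketch and not the authors' method. It assumes plain term-frequency vectors, a hypothetical blending parameter `mesh_weight`, and cosine similarity as the document-to-document measure. All function names are invented for this example.

```python
import math
from collections import Counter


def term_vector(tokens):
    """Return an L2-normalised term-frequency vector as a dict."""
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def combine(abstract_tokens, mesh_terms, mesh_weight=0.5):
    """Blend abstract and MeSH vectors into one document vector.

    mesh_weight is a hypothetical knob (assumption, not from the paper):
    0.0 uses only the abstract, 1.0 uses only the MeSH-terms.
    """
    combined = {}
    for term, weight in term_vector(abstract_tokens).items():
        combined[term] = (1.0 - mesh_weight) * weight
    for term, weight in term_vector(mesh_terms).items():
        # Prefix MeSH-terms so they never collide with abstract words.
        combined["mesh:" + term] = mesh_weight * weight
    return combined


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Two toy documents with no abstract words in common but one shared MeSH-term.
doc_a = combine(["heart", "attack"], ["Myocardial Infarction"])
doc_b = combine(["cardiac", "arrest"], ["Myocardial Infarction"])
similarity = cosine(doc_a, doc_b)
```

In this toy case the abstracts alone give a similarity of zero, while the blended vectors are pulled together by the shared MeSH-term. That is the kind of effect a clustering algorithm could exploit when both sources are available.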

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Kwangcheol Shin (1)
  • Sang-Yong Han (1)
  • Alexander Gelbukh (1, 2)
  1. School of Computer Science and Engineering, Chung-Ang University, Seoul, Korea
  2. Center for Computing Research, National Polytechnic Institute, Mexico City, Mexico