Advanced Clustering Technique for Medical Data Using Semantic Information
- Cite this paper as:
- Shin K., Han SY., Gelbukh A. (2004) Advanced Clustering Technique for Medical Data Using Semantic Information. In: Monroy R., Arroyo-Figueroa G., Sucar L.E., Sossa H. (eds) MICAI 2004: Advances in Artificial Intelligence. MICAI 2004. Lecture Notes in Computer Science, vol 2972. Springer, Berlin, Heidelberg
MEDLINE is a representative collection of medical documents, each supplied with its original full-text natural-language abstract and with representative keywords (called MeSH-terms) that expert annotators select manually from a pre-defined ontology and structure according to their relation to the document. We show how these structured, manually assigned semantic descriptions can be combined with the original full-text abstracts to improve the quality of clustering the documents into a small number of clusters. As a baseline, we compare our results with clustering that uses only the abstracts or only the MeSH-terms. Our experiments show 36% to 47% higher cluster coherence, as well as more refined keywords for the produced clusters.
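The general idea of combining abstract text with MeSH-terms in one document representation can be illustrated with a minimal sketch. This is not the paper's method: the bag-of-words vectors, the `mesh_weight` parameter, the cosine similarity, the function names, and the three toy documents are all assumptions made here for illustration only.

```python
import math
from collections import Counter


def combined_vector(abstract, mesh_terms, mesh_weight=2.0):
    """Build one sparse vector from abstract tokens and MeSH-terms.

    Hypothetical helper: abstract words count 1.0 each, MeSH-terms count
    `mesh_weight` each. Prefixes keep the two vocabularies separate.
    """
    vec = Counter()
    for token in abstract.lower().split():
        vec["w:" + token] += 1.0
    for term in mesh_terms:
        vec["m:" + term.lower()] += mesh_weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy documents (invented for this sketch, not taken from MEDLINE).
docs = [
    ("insulin therapy in diabetic patients", ["Diabetes Mellitus", "Insulin"]),
    ("glucose control with insulin", ["Diabetes Mellitus", "Blood Glucose"]),
    ("surgical repair of heart valve", ["Heart Valve Diseases", "Surgery"]),
]

vectors = [combined_vector(abstract, mesh) for abstract, mesh in docs]
sim_01 = cosine(vectors[0], vectors[1])  # two diabetes documents
sim_02 = cosine(vectors[0], vectors[2])  # diabetes vs. cardiac surgery
```

A clustering algorithm could then group documents by such pairwise similarities. Setting `mesh_weight` to zero, or passing an empty abstract, would roughly correspond to the abstract-only and MeSH-only baselines mentioned above.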