A Lightweight Approach for Extracting Disease-Symptom Relation with MetaMap toward Automated Generation of Disease Knowledge Base
Diagnostic decision support systems require a disease knowledge base, and building it can account for the dominant share of such a system's total development cost. Toward automated generation of a disease knowledge base, we therefore conducted a preliminary study on efficient extraction of symptomatic expressions using MetaMap, a tool that assigns UMLS (Unified Medical Language System) semantic tags to phrases in medical literature text.
We first used several tags in the MetaMap output that relate to symptoms and findings to extract symptomatic terms. This straightforward approach yielded a recall of 82% and a precision of 64%. We then applied a heuristic that exploits certain tag-sequence patterns that frequently appear in typical symptomatic expressions. This simple addition achieved a 7% gain in recall without sacrificing precision.
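The two steps above can be illustrated with a minimal sketch. It does not call MetaMap itself and is not the study's implementation: it assumes the text has already been split into phrases, each carrying a set of UMLS semantic-type abbreviations. The tag set (`sosy`, `fndg`), the single sequence pattern (`bpoc` followed by `qlco`), the function names, and the toy data are all assumptions chosen for illustration; the abstract does not state which tags or patterns were actually used.

```python
from typing import Iterable, List, Set, Tuple

# One tagged phrase: (text, set of UMLS semantic-type abbreviations).
Tagged = Tuple[str, Set[str]]

# Assumed tag set: "sosy" (Sign or Symptom) and "fndg" (Finding).
SYMPTOM_TAGS: Set[str] = {"sosy", "fndg"}

# Hypothetical tag-sequence pattern: a body part ("bpoc") immediately
# followed by a qualitative concept ("qlco"), e.g. "liver enlargement".
PATTERNS: List[Tuple[str, str]] = [("bpoc", "qlco")]


def extract_by_tags(phrases: List[Tagged]) -> List[str]:
    """Step 1: keep phrases that carry at least one symptom-related tag."""
    return [text for text, tags in phrases if tags & SYMPTOM_TAGS]


def extract_by_patterns(phrases: List[Tagged]) -> List[str]:
    """Step 2: join adjacent phrases whose tags match a sequence pattern."""
    hits = []
    for (text1, tags1), (text2, tags2) in zip(phrases, phrases[1:]):
        if any(a in tags1 and b in tags2 for a, b in PATTERNS):
            hits.append(f"{text1} {text2}")
    return hits


def precision_recall(extracted: Iterable[str],
                     gold: Iterable[str]) -> Tuple[float, float]:
    """Score extracted expressions against a manually built gold list."""
    ex, gd = set(extracted), set(gold)
    true_positives = len(ex & gd)
    precision = true_positives / len(ex) if ex else 0.0
    recall = true_positives / len(gd) if gd else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy input with made-up tags; real MetaMap output would differ.
    phrases = [
        ("fever", {"sosy"}),
        ("liver", {"bpoc"}),
        ("enlargement", {"qlco"}),
        ("family history", {"fndg"}),
    ]
    gold = ["fever", "liver enlargement", "seizures"]

    baseline = extract_by_tags(phrases)
    combined = baseline + extract_by_patterns(phrases)
    print("tags only:      ", precision_recall(baseline, gold))
    print("tags + patterns:", precision_recall(combined, gold))
```

In this toy case the pattern step recovers a multi-phrase expression that tag filtering alone misses, which is the kind of recall gain the heuristic is meant to provide. The false positive ("family history") remains, so the output would still need manual inspection.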
Although the extracted information still requires manual inspection, the study suggests that this simple approach can extract symptomatic expressions at very low cost. We also performed a failure analysis of the output to guide further performance improvements.
Keywords: Symptomatic Expression, Free Text Format, Lightweight Approach, Clinical Synopsis, Diagnostic Decision Support System