Compositional Information Extraction Methodology from Medical Reports

  • Pratibha Rani
  • Raghunath Reddy
  • Devika Mathur
  • Subhadip Bandyopadhyay
  • Arijit Laha
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6588)


The health care industry is currently expanding on many fronts, and advances in Clinical Informatics (CI) are an important part of this expansion. One goal of CI is to apply information technology to improve patient care through two major applications: electronic health care data management and information extraction from medical documents. In this paper we focus on the second application. To manage information well and use it productively, the relevant information buried in a large corpus of unstructured text must be segregated according to context. Information Extraction (IE) from unstructured text therefore becomes a key technology in CI. It covers sub-topics such as extraction of biomedical entities and relations, passage- and paragraph-level information extraction, ontological study of diseases and treatments, summarization, and topic identification. Although the literature reports promising results for individual IE tasks, an integrated approach to contextually relevant IE from medical documents is not readily apparent. To this end, we propose a compositional approach that integrates contextually constructed (domain-specific) IE modules to improve knowledge support for patient care activity. The input to this composite system is free-format medical case reports containing stage-wise information that follows the evolution path of a patient care activity. The output is a compilation of the extracted information organized under tags such as past medical history, sign/symptoms, test and test results, diseases, treatment and follow up. The outcome is intended to help health care professionals explore a large corpus of medical case studies and select only the component-level information relevant to their needs or interests.
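As a rough illustration of the compositional idea, the sketch below wires one small extraction module per output tag into a single pipeline. It is not the authors' implementation: the cue phrases, the naive sentence splitter, and every function name (`split_sentences`, `make_cue_extractor`, `build_pipeline`, `extract_report`) are assumptions made for this example. Only the tag names come from the abstract.

```python
from typing import Callable, Dict, List

# Output tags named in the abstract.
TAGS = [
    "past medical history",
    "sign/symptoms",
    "test and test results",
    "diseases",
    "treatment",
    "follow up",
]

# Hypothetical cue phrases per tag (an assumption for illustration only;
# a real domain-specific module would be far richer than keyword matching).
CUES: Dict[str, List[str]] = {
    "past medical history": ["history of", "previously"],
    "sign/symptoms": ["complained of", "presented with"],
    "test and test results": ["revealed", "showed"],
    "diseases": ["diagnosed with", "diagnosis"],
    "treatment": ["treated with", "was started on"],
    "follow up": ["follow-up", "follow up"],
}

# An extraction module maps the report's sentences to the ones it selects.
Extractor = Callable[[List[str]], List[str]]


def split_sentences(report: str) -> List[str]:
    """Naive splitter on full stops; stands in for a real segmenter."""
    return [s.strip() for s in report.split(".") if s.strip()]


def make_cue_extractor(cues: List[str]) -> Extractor:
    """Build a toy module that keeps sentences containing any cue phrase."""
    def extract(sentences: List[str]) -> List[str]:
        return [s for s in sentences if any(c in s.lower() for c in cues)]
    return extract


def build_pipeline() -> Dict[str, Extractor]:
    """Compose one independent module per tag."""
    return {tag: make_cue_extractor(CUES[tag]) for tag in TAGS}


def extract_report(report: str, pipeline: Dict[str, Extractor]) -> Dict[str, List[str]]:
    """Run every module over the same report and collect results by tag."""
    sentences = split_sentences(report)
    return {tag: module(sentences) for tag, module in pipeline.items()}
```

Because each module is just a function from sentences to selected sentences, any one of them could be swapped for a stronger extractor without touching the others, which is the property the compositional design is meant to provide.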


Information Extraction; Medical document mining; Health care application; Clinical Informatics





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Pratibha Rani (1)
  • Raghunath Reddy (1)
  • Devika Mathur (2)
  • Subhadip Bandyopadhyay (2)
  • Arijit Laha (2)
  1. International Institute of Information Technology, Hyderabad, India
  2. SETLabs, Infosys Technologies Ltd., Hyderabad, India
