A Survey of Text Mining Approaches, Techniques, and Tools on Discharge Summaries

  • Conference paper
Advances in Computational Intelligence and Communication Technology

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1086)

Abstract

A discharge summary contains a large volume of information about the patient, such as history, symptoms, investigations, treatment, and medication. Although discharge summaries follow a general structure, they are still not structured in a way that clinical systems can process. Various natural language processing (NLP) and machine learning techniques have been applied to discharge summaries to extract different kinds of information. Text mining has been carried out on both public and private discharge summaries. This survey discusses the tasks performed on discharge summaries and the existing tools that have been explored. It also discusses the major dataset used in existing research, outlines a common system architecture found across studies, and details the major challenges in extracting information from discharge summaries.

Author information

Corresponding author

Correspondence to Deepa Gupta.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Nair, P.C., Gupta, D., Devi, B.I. (2021). A Survey of Text Mining Approaches, Techniques, and Tools on Discharge Summaries. In: Gao, XZ., Tiwari, S., Trivedi, M., Mishra, K. (eds) Advances in Computational Intelligence and Communication Technology. Advances in Intelligent Systems and Computing, vol 1086. Springer, Singapore. https://doi.org/10.1007/978-981-15-1275-9_27
