Analysis of Medical Documents with Text Mining and Association Rule Mining

  • Conference paper
Information Technology and Systems (ICITS 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 918)

Abstract

Text mining techniques extract meaningful information from large amounts of semi-structured and unstructured text. In this work, the MetaMap tool was used to extract medical entities such as diseases and syndromes from discharge summaries. Association rule mining algorithms such as Apriori and FP-Growth were then applied to the extracted entities to find associations between them. The dataset consists of 1237 discharge summaries obtained from the 2008 i2b2 Obesity Challenge. The rules that have a principal diagnosis as antecedent showed that cardiac disease frequently occurred with other diseases such as hypertension and diabetes. Most of the rules describe associations between diabetes and other diseases such as hypertension, dyslipidemia, nephropathy, heart disease, lung diseases, and arthritis. These rules have a confidence above 0.5.
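
To make the role of the confidence threshold concrete, the following is a minimal sketch of how support and confidence are computed for single-antecedent rules over sets of disease entities. It is not the authors' pipeline and not a full Apriori or FP-Growth implementation: it only scores pairs of items. The toy records, the function names `support` and `pair_rules`, and the `min_support` value are assumptions made for illustration; only the confidence cut-off of 0.5 comes from the abstract.

```python
from itertools import combinations

# Hypothetical toy data: each set stands for the disease entities extracted
# from one discharge summary. These records are illustrative only and are
# not taken from the i2b2 dataset.
records = [
    {"diabetes", "hypertension", "heart disease"},
    {"diabetes", "hypertension"},
    {"diabetes", "dyslipidemia"},
    {"hypertension", "heart disease"},
    {"diabetes", "hypertension", "dyslipidemia"},
]


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= record for record in records) / len(records)


def pair_rules(records, min_support=0.2, min_confidence=0.5):
    """Score rules of the form {a} -> {b} over all item pairs.

    Returns (antecedent, consequent, support, confidence) tuples whose
    pair support is at least min_support (an assumed value) and whose
    confidence is strictly above min_confidence.
    """
    items = sorted(set().union(*records))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, records)
        if pair_support < min_support:
            continue  # infrequent pair: neither direction is scored
        for antecedent, consequent in ((a, b), (b, a)):
            # confidence(A -> B) = support(A and B) / support(A)
            confidence = pair_support / support({antecedent}, records)
            if confidence > min_confidence:
                rules.append((antecedent, consequent, pair_support, confidence))
    return rules


rules = pair_rules(records)
for antecedent, consequent, s, c in rules:
    print(f"{antecedent} -> {consequent}  support={s:.2f}  confidence={c:.2f}")
```

In this toy data, a rule such as dyslipidemia -> diabetes passes the threshold while its reverse does not, which illustrates that confidence is directional: it depends on how common the antecedent is, not only on how often the two diseases co-occur.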

Notes

  1. http://www.cs.waikato.ac.nz/ml/weka/.

  2. https://uts.nlm.nih.gov/home.html.

  3. https://www.r-project.org/. R is free software for statistical computing and graphics. Also, this tool has a list of packages to work with text mining and natural language processing.

Author information

Corresponding author

Correspondence to Ruth Reátegui.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Reátegui, R., Ratté, S. (2019). Analysis of Medical Documents with Text Mining and Association Rule Mining. In: Rocha, Á., Ferrás, C., Paredes, M. (eds) Information Technology and Systems. ICITS 2019. Advances in Intelligent Systems and Computing, vol 918. Springer, Cham. https://doi.org/10.1007/978-3-030-11890-7_70
