Knowledge Discovery in Clinical Data

  • Chapter
Machine Learning for Health Informatics

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9605)

Abstract

The implementation of electronic health care records has surged recently. These patient records contain valuable medical information, including patient demographic data, diagnoses, therapeutic approaches, and patient outcomes. Analyzing patterns within these records is important for treating individuals more effectively. In this paper, we present a method for identifying themes and patterns within patient data. The methodology extracts the main themes or patterns in the data and links them back to the corpus from which they were generated. We built graphs from terms gathered from electronic medical records and partitioned those graphs. We used two data sets drawn from primary disease categories: eight charts and ten case studies. The Electronic Medical Records (EMRs) and case studies were modeled as networks of interacting terms, where the interactions were captured by the terms' co-occurrences in the documents. A greedy algorithm was used to find communities with high modularity. Finally, we compared our method with probabilistic topic modeling algorithms and evaluated its efficacy using recall and precision measures.
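The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the abstract does not say which greedy variant was used, so a simple agglomerative merge that maximizes modularity gain is assumed here, and the toy term lists, function names, and the "relevant" term set are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(docs):
    """Weighted term graph: edge weight = number of documents containing both terms."""
    weights = defaultdict(int)
    for terms in docs:
        for a, b in combinations(sorted(set(terms)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def modularity(weights, communities):
    """Q = sum over communities of (internal weight / m) - (degree sum / 2m)^2."""
    m = sum(weights.values())
    label = {t: i for i, c in enumerate(communities) for t in c}
    internal = defaultdict(float)
    degree = defaultdict(float)
    for (a, b), w in weights.items():
        degree[label[a]] += w
        degree[label[b]] += w
        if label[a] == label[b]:
            internal[label[a]] += w
    return sum(internal[i] / m - (degree[i] / (2 * m)) ** 2
               for i in range(len(communities)))


def greedy_communities(weights):
    """Assumed greedy scheme: repeatedly merge the pair of communities that
    raises modularity the most; stop when no merge improves it."""
    nodes = sorted({t for edge in weights for t in edge})
    communities = [frozenset([n]) for n in nodes]
    best_q = modularity(weights, communities)
    while len(communities) > 1:
        best_merge = None
        for i, j in combinations(range(len(communities)), 2):
            merged = [c for k, c in enumerate(communities) if k not in (i, j)]
            merged.append(communities[i] | communities[j])
            q = modularity(weights, merged)
            if q > best_q + 1e-12:  # keep the best strictly improving merge
                best_q, best_merge = q, merged
        if best_merge is None:
            break
        communities = best_merge
    return communities, best_q


def precision_recall(found, relevant):
    """Precision and recall of a discovered term set against a reference set."""
    found, relevant = set(found), set(relevant)
    hits = len(found & relevant)
    precision = hits / len(found) if found else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy corpus: each document is reduced to a list of terms.
docs = [
    ["fever", "cough", "pneumonia"],
    ["fever", "cough", "pneumonia"],
    ["fracture", "spine", "surgery"],
    ["fracture", "spine", "surgery"],
    ["cough", "fracture"],  # one weak link between the two themes
]
graph = cooccurrence_graph(docs)
communities, q = greedy_communities(graph)
for community in communities:
    print(sorted(community))
```

Each resulting community plays the role of a theme, and because every term traces back to the documents it came from, a theme can be linked to its source records. The exhaustive pairwise scan recomputes modularity for every candidate merge, which is fine for a handful of terms but would need incremental gain updates on a realistic vocabulary.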

Author information

Corresponding author

Correspondence to Aryya Gangopadhyay.

Copyright information

© 2016 Springer International Publishing AG

About this chapter

Cite this chapter

Gangopadhyay, A., Yesha, R., Siegel, E. (2016). Knowledge Discovery in Clinical Data. In: Holzinger, A. (eds) Machine Learning for Health Informatics. Lecture Notes in Computer Science, vol 9605. Springer, Cham. https://doi.org/10.1007/978-3-319-50478-0_17

  • DOI: https://doi.org/10.1007/978-3-319-50478-0_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50477-3

  • Online ISBN: 978-3-319-50478-0

  • eBook Packages: Computer Science, Computer Science (R0)
