Knowledge Discovery in Clinical Data

  • Aryya Gangopadhyay
  • Rose Yesha
  • Eliot Siegel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9605)


The adoption of electronic health care records has surged in recent years. These patient records contain valuable medical information, including patient demographic data, diagnoses, therapeutic approaches, and patient outcomes. Analyzing patterns within these records is important for treating individuals more effectively. In this paper, we present a method for identifying themes and patterns within patient data. The methodology extracts the main themes or patterns in the data and links those themes back to the corpus from which they were generated. We used two data sets drawn from primary disease categories: eight charts and ten case studies. The Electronic Medical Records (EMRs) and case studies were modeled as networks of interacting terms, where the interactions were captured by the terms' co-occurrences in the documents. We partitioned the resulting graphs, using a greedy algorithm to find communities with high modularity. Finally, we compared our method with probabilistic topic modeling algorithms and evaluated its efficacy using recall and precision measures.
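The pipeline described in the abstract (term co-occurrence network, greedy modularity-based community detection, precision and recall) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the preprocessing, the edge weighting, or the exact greedy variant, so the sketch uses document-level co-occurrence counts as edge weights and a simple agglomerative merge that stops when no merge raises modularity. The function names and the toy terms are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(docs):
    """Edge weight = number of documents in which two terms co-occur (assumed weighting)."""
    edges = Counter()
    for terms in docs:
        for u, v in combinations(sorted(set(terms)), 2):
            edges[(u, v)] += 1
    return edges


def modularity(edges, community_of):
    """Newman modularity Q = sum_c [ L_c / m - (d_c / 2m)^2 ] for a weighted graph."""
    total = sum(edges.values())
    if total == 0:
        return 0.0
    internal = Counter()  # edge weight inside each community (L_c)
    degree = Counter()    # summed weighted degree of each community (d_c)
    for (u, v), w in edges.items():
        degree[community_of[u]] += w
        degree[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    return sum(internal[c] / total - (degree[c] / (2 * total)) ** 2 for c in degree)


def greedy_communities(edges):
    """Start from singletons; repeatedly apply the merge that increases Q the most."""
    nodes = sorted({n for edge in edges for n in edge})
    community_of = {n: i for i, n in enumerate(nodes)}
    best_q = modularity(edges, community_of)
    while True:
        best_merge = None
        for a, b in combinations(sorted(set(community_of.values())), 2):
            trial = {n: (a if c == b else c) for n, c in community_of.items()}
            q = modularity(edges, trial)
            if q > best_q + 1e-12:
                best_q, best_merge = q, trial
        if best_merge is None:  # no merge improves modularity: stop
            break
        community_of = best_merge
    groups = {}
    for n, c in community_of.items():
        groups.setdefault(c, set()).add(n)
    return sorted(groups.values(), key=sorted), best_q


def precision_recall(found, relevant):
    """Precision and recall of a found term set against a reference term set."""
    hits = len(set(found) & set(relevant))
    precision = hits / len(found) if found else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical toy corpus: each document is already reduced to a list of terms.
    docs = [
        ["stroke", "infarction", "artery"],
        ["stroke", "infarction", "artery"],
        ["carcinoma", "breast", "metastatic"],
        ["carcinoma", "breast", "metastatic"],
        ["artery", "carcinoma"],
    ]
    themes, q = greedy_communities(cooccurrence_graph(docs))
    print(themes, round(q, 3))
```

The sketch recomputes modularity from scratch for every candidate merge, which is only practical for small term networks; an implementation for larger corpora would update the modularity gain incrementally.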


Electronic medical records, Visualization, Analytics, Pattern recognition



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Department of Information Systems, University of Maryland, Baltimore County, Baltimore, USA
  2. University of Maryland School of Medicine, VA Maryland Healthcare System, Baltimore, USA
