Data Driven Analytics for Personalized Healthcare

  • Jianying Hu
  • Adam Perer
  • Fei Wang
Part of the Health Informatics book series (HI)


The concept of Learning Health Systems (LHS) is gaining momentum as electronic healthcare data becomes increasingly accessible. The core idea is to enable learning from the collective experience of a care delivery network, as recorded in observational data, in order to iteratively improve care quality while care is being provided in a real-world setting. In line with this vision, much recent research has been devoted to exploring machine learning, data mining, and data visualization methodologies that can derive real-world evidence from diverse sources of healthcare data and provide personalized decision support for care delivery and care management. In this chapter, we give an overview of a wide range of analytics and visualization components we have developed, examples of clinical insights reached with these components, and some new directions we are taking.


Data driven healthcare analytics; Learning health system; Practice based evidence; Real world evidence; Clinical decision support; Machine learning; Data mining; Data visualization



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Healthcare Analytics Research Group, IBM T.J. Watson Research Center, Yorktown Heights, USA
  2. Computer Science and Engineering, University of Connecticut, Storrs, USA
