Data Mining and Clinical Decision Support Systems

  • Bunyamin Ozaydin
  • J. Michael Hardin
  • David C. Chhieng
Part of the Health Informatics book series (HI)


Data mining is the process of discovering patterns and relationships within large sets of data. Given the large volume of data generated in healthcare settings, it is not surprising that healthcare organizations have been interested in data mining to enhance physician practices, disease management, and resource utilization. This chapter discusses a variety of data mining techniques that have been used to develop clinical decision support systems, including decision trees, neural networks, logistic regression, and nearest neighbor classifiers. It also discusses genetic algorithms, biologic and quantum computing, and big data analytics, as well as methods of evaluating and comparing the different approaches.


Statistical pattern recognition; Data mining; Neural networks; Decision trees; Genetic algorithms; Big data analytics; Quantum computing


Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Bunyamin Ozaydin (1)
  • J. Michael Hardin (2)
  • David C. Chhieng (3)
  1. School of Health Professions, Department of Health Services Administration, University of Alabama at Birmingham, Birmingham, USA
  2. Academic Affairs, Samford University, Birmingham, USA
  3. Pathology, Mount Sinai Health System, New York, USA
