Abstract
Data mining is the process of discovering patterns and relationships in large data sets. Given the volume of data generated in healthcare settings, healthcare organizations have naturally turned to data mining to improve physician practice, disease management, and resource utilization. This chapter discusses data mining techniques that have been used to develop clinical decision support systems, including decision trees, neural networks, logistic regression, and nearest neighbor classifiers. It also covers genetic algorithms, biologic and quantum computing, and big data analytics, as well as methods for evaluating and comparing the different approaches.
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Ozaydin, B., Hardin, J.M., Chhieng, D.C. (2016). Data Mining and Clinical Decision Support Systems. In: Berner, E. (ed.) Clinical Decision Support Systems. Health Informatics. Springer, Cham. https://doi.org/10.1007/978-3-319-31913-1_3
DOI: https://doi.org/10.1007/978-3-319-31913-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31911-7
Online ISBN: 978-3-319-31913-1
eBook Packages: Medicine (R0)