Data Mining in Healthcare and Biomedicine: A Survey of the Literature

Yoo, Illhoi; Alafaireet, Patricia; Marinov, Miroslav; Pena-Hernandez, Keila; Gopidi, Rajitha; Chang, Jia-Fu; Hua, Lei

doi:10.1007/s10916-011-9710-5

ORIGINAL PAPER
Published: 03 May 2011

Journal of Medical Systems, Volume 36, pages 2431–2448 (2012)

Illhoi Yoo¹,³,
Patricia Alafaireet²,
Miroslav Marinov³,
Keila Pena-Hernandez³,
Rajitha Gopidi³,
Jia-Fu Chang³ &
Lei Hua³

Abstract

Data mining, a concept that emerged in the mid-1990s, can help researchers gain novel and deep insights into large biomedical datasets. It can uncover new biomedical and healthcare knowledge for clinical and administrative decision making and generate scientific hypotheses from large experimental data, clinical databases, and/or biomedical literature. This review first introduces data mining in general (its background, definition, and process), discusses the major differences between statistics and data mining, and then addresses the uniqueness of data mining in the biomedical and healthcare fields. It then briefly summarizes the data mining algorithms used for classification, clustering, and association, along with their respective advantages and drawbacks. Guidelines on how to use data mining algorithms in each of these three areas are offered, along with three examples of how data mining has been used in the healthcare industry. Health-related organizations have successfully applied data mining to predict health insurance fraud and under-diagnosed patients, and to identify and classify at-risk people with the goal of reducing healthcare costs. Building on these applications, we describe how data mining technologies in each area (classification, clustering, and association) have been used for a multitude of purposes, including research in the biomedical and healthcare fields. We discuss the technologies that enable the prediction of healthcare costs (including length of hospital stay), disease diagnosis and prognosis, and the discovery of hidden biomedical and healthcare patterns from related databases. We also discuss the use of data mining to discover relationships such as those between health conditions and a disease, among diseases, and among drugs. The article concludes with a discussion of the problems that hamper the clinical use of data mining by health professionals.

Notes

MeSH is the National Library of Medicine (NLM)'s controlled vocabulary, used for indexing MEDLINE articles.
For example, if a hierarchical algorithm takes 60 s to cluster 1000 objects (records), it takes 1620 s (= (3000/1000)³ × 60) to cluster 3000 objects, provided there is enough system memory.
Some classification algorithms can mine only numeric data or only categorical data.
Clustering accuracy can be measured only if a class label (i.e., a dependent variable) is available.
http://www.ncbi.nlm.nih.gov/mesh
http://www.usrds.org/atlas.htm
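
The runtime figure in the note on hierarchical clustering follows from assuming that running time grows with the cube of the number of objects. A minimal sketch of that arithmetic is given below; the function name `scaled_runtime` and its parameters are illustrative choices, not taken from the article, and the cubic exponent is only the assumption stated in the note, not a measured property of any particular implementation.

```python
def scaled_runtime(base_seconds: float, base_n: int, target_n: int,
                   exponent: int = 3) -> float:
    """Extrapolate a runtime, assuming time grows as n**exponent.

    base_seconds: measured time for base_n objects (hypothetical input)
    target_n:     number of objects to extrapolate to
    exponent:     assumed growth order (3 matches the note's example)
    """
    if base_n <= 0 or target_n <= 0:
        raise ValueError("object counts must be positive")
    return base_seconds * (target_n / base_n) ** exponent


if __name__ == "__main__":
    # Reproduces the note's example: 60 s for 1000 objects.
    for n in (1000, 2000, 3000):
        print(n, scaled_runtime(60, 1000, n))
    # prints:
    # 1000 60.0
    # 2000 480.0
    # 3000 1620.0
```

Tripling the number of objects multiplies the estimated time by 27, which is why the note adds the caveat about system memory: the estimate covers computation only.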

Author information

Authors and Affiliations

Health Management and Informatics Department, University of Missouri School of Medicine, CE 718, DC006.00, Columbia, MO, 65212, USA
Illhoi Yoo
Health Management and Informatics Department, University of Missouri School of Medicine, CE 734, DC006.00, Columbia, MO, 65212, USA
Patricia Alafaireet
Informatics Institute, University of Missouri, Columbia, MO, 65211, USA
Illhoi Yoo, Miroslav Marinov, Keila Pena-Hernandez, Rajitha Gopidi, Jia-Fu Chang & Lei Hua

Corresponding author

Correspondence to Illhoi Yoo.

About this article

Cite this article

Yoo, I., Alafaireet, P., Marinov, M. et al. Data Mining in Healthcare and Biomedicine: A Survey of the Literature. J Med Syst 36, 2431–2448 (2012). https://doi.org/10.1007/s10916-011-9710-5

Received: 07 February 2011
Accepted: 07 April 2011
Published: 03 May 2011
Issue Date: August 2012
DOI: https://doi.org/10.1007/s10916-011-9710-5
