Journal of Medical Systems, Volume 36, Issue 4, pp 2431–2448

Data Mining in Healthcare and Biomedicine: A Survey of the Literature

  • Illhoi Yoo
  • Patricia Alafaireet
  • Miroslav Marinov
  • Keila Pena-Hernandez
  • Rajitha Gopidi
  • Jia-Fu Chang
  • Lei Hua


Data mining, a concept that emerged in the mid-1990s, can help researchers gain novel and deep insights into large biomedical datasets. It can uncover new biomedical and healthcare knowledge for clinical and administrative decision making, and it can generate scientific hypotheses from large experimental data, clinical databases, and/or biomedical literature. This review first introduces data mining in general (its background, definition, and process), discusses the major differences between statistics and data mining, and then addresses what makes data mining unique in the biomedical and healthcare fields. It briefly summarizes the data mining algorithms used for classification, clustering, and association, together with their respective advantages and drawbacks. It offers guidelines on how to use data mining algorithms in each of these three areas, along with three examples of how data mining has been used in the healthcare industry. Health-related organizations have successfully applied data mining to predict health insurance fraud and under-diagnosed patients, and to identify and classify people at health risk with the goal of reducing healthcare costs. Given that success, we introduce how data mining technologies (in classification, clustering, and association) have been used for a multitude of purposes, including research in the biomedical and healthcare fields. We discuss the technologies that enable the prediction of healthcare costs (including length of hospital stay), disease diagnosis and prognosis, and the discovery of hidden biomedical and healthcare patterns from related databases. We also discuss the use of data mining to discover relationships between health conditions and a disease, relationships among diseases, and relationships among drugs. The article concludes with a discussion of the problems that hamper the clinical use of data mining by health professionals.


Keywords: Data mining; Review; Healthcare; Biomedicine



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Illhoi Yoo (1, 3; corresponding author)
  • Patricia Alafaireet (2)
  • Miroslav Marinov (3)
  • Keila Pena-Hernandez (3)
  • Rajitha Gopidi (3)
  • Jia-Fu Chang (3)
  • Lei Hua (3)

  1. Health Management and Informatics Department, University of Missouri School of Medicine, Columbia, USA
  2. Health Management and Informatics Department, University of Missouri School of Medicine, Columbia, USA
  3. Informatics Institute, University of Missouri, Columbia, USA
