Data Mining Techniques in Health Informatics: A Case Study from Breast Cancer Research

  • Conference paper
Information Technology in Bio- and Medical Informatics (ITBAM 2015)

Abstract

This paper presents a case study of using data mining techniques in the analysis of diagnosis and treatment events related to breast cancer. Data from over 16,000 patients were pre-processed, and several data mining techniques were implemented using Weka (Waikato Environment for Knowledge Analysis). In particular, Generalized Sequential Patterns (GSP) mining was used to discover frequent patterns in disease event sequence profiles for groups of living and deceased patients. Furthermore, five classification models were evaluated with the objective of classifying patients based on selected attributes. This research showcases the data mining process and the techniques used to transform large amounts of patient data into useful information and potentially valuable patterns that help in understanding cancer outcomes.
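
To make the idea of frequent pattern discovery in event sequences more concrete, the following is a minimal sketch in plain Python of a level-wise, GSP-style search. It is not the Weka implementation used in the paper, and it simplifies the setting by assuming each patient profile is an ordered list of single events with no time constraints. The event names, the toy `profiles` data, and the `min_support` threshold are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(pattern: Pattern, sequence: Pattern) -> bool:
    """True if the events of `pattern` occur in `sequence` in the same order."""
    remaining = iter(sequence)
    # `event in remaining` consumes the iterator up to the first match,
    # so later events must appear after earlier ones.
    return all(event in remaining for event in pattern)


def support(pattern: Pattern, sequences: List[Pattern]) -> int:
    """Number of sequences that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences)


def frequent_sequences(sequences: List[Pattern], min_support: int) -> Dict[Pattern, int]:
    """Level-wise search: grow frequent patterns one event at a time."""
    frequent: Dict[Pattern, int] = {}
    events = sorted({e for s in sequences for e in s})

    # Level 1: frequent single events.
    current: List[Pattern] = []
    for e in events:
        count = support((e,), sequences)
        if count >= min_support:
            frequent[(e,)] = count
            current.append((e,))

    while current:
        known = set(current)
        # Join step: a and b combine when a without its first event
        # equals b without its last event.
        candidates = {a + (b[-1],) for a in current for b in current if a[1:] == b[:-1]}
        # Prune step: every pattern obtained by dropping one event must be frequent.
        candidates = {
            c for c in candidates
            if all(c[:i] + c[i + 1:] in known for i in range(len(c)))
        }
        current = []
        for c in sorted(candidates):
            count = support(c, sequences)
            if count >= min_support:
                frequent[c] = count
                current.append(c)
    return frequent


# Hypothetical toy profiles; these are NOT data from the study.
profiles: List[Pattern] = [
    ("diagnosis", "surgery", "radiotherapy"),
    ("diagnosis", "chemotherapy", "surgery"),
    ("diagnosis", "surgery", "chemotherapy"),
    ("diagnosis", "surgery"),
]

result = frequent_sequences(profiles, min_support=3)
for pattern, count in result.items():
    print(" -> ".join(pattern), count)
```

Running the same search separately on two groups of profiles (for example, living and deceased patients) would give two sets of frequent patterns that can then be compared, which is the kind of analysis the abstract describes.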

Author information

Corresponding author

Correspondence to Jing Lu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Lu, J. et al. (2015). Data Mining Techniques in Health Informatics: A Case Study from Breast Cancer Research. In: Renda, M., Bursa, M., Holzinger, A., Khuri, S. (eds) Information Technology in Bio- and Medical Informatics. ITBAM 2015. Lecture Notes in Computer Science, vol 9267. Springer, Cham. https://doi.org/10.1007/978-3-319-22741-2_6

  • DOI: https://doi.org/10.1007/978-3-319-22741-2_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22740-5

  • Online ISBN: 978-3-319-22741-2

  • eBook Packages: Computer Science, Computer Science (R0)
