Construction the Model on the Breast Cancer Survival Analysis Use Support Vector Machine, Logistic Regression and Decision Tree

  • Patient Facing Systems
  • Journal of Medical Systems

Abstract

This paper uses data mining to build classification models of breast cancer survival patterns, offering a reference for treatment decisions concerning the survival of women diagnosed with breast cancer in Taiwan. We studied patients with breast cancer at a hospital in central Taiwan and obtained 1,340 data records. We used a support vector machine (SVM), logistic regression, and a C5.0 decision tree to construct models that classify patient survival, and evaluated the models with 10-fold cross-validation. The resulting models achieved an average accuracy of more than 90 %, and the SVM performed best of the three methods. These results show that all three methods classify the survival of women diagnosed with breast cancer with high accuracy and could serve as a reference when building a medical decision-making framework.
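
The 10-fold cross-validation procedure named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses only the Python standard library, synthetic two-feature data in place of the hospital records, and a hand-written logistic regression as the single learner (the SVM and C5.0 tree are not reproduced). The names `k_fold_indices`, `cross_val_accuracy`, `fit_logistic`, `fit_majority`, and `make_synthetic` are hypothetical, and the folds are shuffled but not stratified.

```python
import math
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(fit, X, y, k=10, seed=0):
    """Average held-out accuracy of the learner `fit` over k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        # Train on k-1 folds, then score on the fold that was left out.
        predict = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / len(scores)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Logistic regression fitted by batch gradient descent (assumed settings)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return lambda x: int(sum(wj * xj for wj, xj in zip(w, x)) + b > 0)


def fit_majority(X, y):
    """Baseline learner: always predict the most common training label."""
    label = max(set(y), key=y.count)
    return lambda x: label


def make_synthetic(n=200, seed=42):
    """Two Gaussian clusters standing in for 'survived' / 'did not survive'."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 1.5 if label else -1.5
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


X, y = make_synthetic()
acc_logistic = cross_val_accuracy(fit_logistic, X, y, k=10)
acc_majority = cross_val_accuracy(fit_majority, X, y, k=10)
print(f"logistic regression: {acc_logistic:.3f}")
print(f"majority baseline:   {acc_majority:.3f}")
```

Because `cross_val_accuracy` accepts any function that takes training data and returns a predictor, other learners could be compared on the same folds by passing a different `fit`, which is the kind of side-by-side comparison the abstract describes.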

Acknowledgments

This research was performed under the auspices of Taiwan’s National Science Council (NSC 99-2221-E-224-033-MY2).

Author information

Corresponding author

Correspondence to Ya-Wen Yu.

Additional information

This article is part of the Topical Collection on Patient Facing Systems

About this article

Cite this article

Chao, CM., Yu, YW., Cheng, BW. et al. Construction the Model on the Breast Cancer Survival Analysis Use Support Vector Machine, Logistic Regression and Decision Tree. J Med Syst 38, 106 (2014). https://doi.org/10.1007/s10916-014-0106-1

  • DOI: https://doi.org/10.1007/s10916-014-0106-1
