Knowledge discovery from a breast cancer database
We report on the use of various machine learning algorithms on an electronic database of breast cancer patients. The task was to predict breast cancer recurrence using a small subset of clinical attributes, such as tumor presence, tumor size, invasive nature of the tumor, number of lymph nodes involved, severity of lymphedema and stage of the tumor. The predictive accuracies over fifty runs, each evaluated on test sets not used to build the model, were 63.4% (CART), 63.9% (C4.5), 62.5% (C4.5rules), 66.4% (FOCL) and 68.3% (Naive Bayes). An extension of the model using additional features and larger datasets is contemplated.
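The evaluation protocol described above (fifty runs, each scored on records withheld from model building) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the train/test proportions or the attribute encoding, so the 30% test fraction, the Laplace smoothing, the function names and the synthetic records below are all assumptions made for illustration. Only a simple categorical Naive Bayes is shown; CART, C4.5 and FOCL are not reproduced.

```python
import math
import random
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class and per-attribute value frequencies for categorical data."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def predict(model, row):
    """Return the class with the highest Laplace-smoothed log posterior."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for i, value in enumerate(row):
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(values_seen[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def repeated_holdout(rows, labels, runs=50, test_fraction=0.3, seed=0):
    """Accuracy on a fresh random held-out split for each run.

    test_fraction is an assumed value; the abstract does not state it.
    """
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    n_test = max(1, int(len(rows) * test_fraction))
    accuracies = []
    for _ in range(runs):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = train_naive_bayes([rows[i] for i in train_idx],
                                  [labels[i] for i in train_idx])
        correct = sum(predict(model, rows[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / n_test)
    return accuracies


def make_toy_data(n=200, seed=1):
    """Synthetic placeholder records; NOT the paper's patient data."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        size = rng.choice(["small", "medium", "large"])
        nodes = rng.choice(["0", "1-3", "4+"])
        stage = rng.choice(["I", "II", "III"])
        risk = 0.2 + 0.2 * (size == "large") + 0.3 * (nodes == "4+")
        labels.append("recurrence" if rng.random() < risk else "no-recurrence")
        rows.append((size, nodes, stage))
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_toy_data()
    accs = repeated_holdout(rows, labels)
    print(f"mean accuracy over {len(accs)} runs: {sum(accs) / len(accs):.3f}")
```

Averaging over many random splits reduces the dependence of the reported figure on any single partition. Any number printed by this sketch reflects only the synthetic data and says nothing about the accuracies reported above.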
Keywords: Breast Cancer; Machine Learning Algorithm; Optimal Treatment Plan; Medical Record Database; Learning Bayesian Networks