An Innovative Application of a Constrained-Syntax Genetic Programming System to the Problem of Predicting Survival of Patients
This paper proposes a constrained-syntax genetic programming (GP) algorithm for discovering classification rules in medical data sets. The proposed GP enforces several syntactic constraints through a disjunctive normal form representation, so that each individual represents a valid rule set that is easy to interpret. The GP is compared with C4.5 on a real-world medical data set. This data set poses a difficult classification problem, and a new preprocessing method was devised for mining it.
Keywords: Genetic Programming; Classification Rule; Disjunctive Normal Form; Classification Experiment; Innovative Application
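The abstract does not spell out the syntactic constraints, so the following is only a minimal sketch of what a disjunctive normal form (DNF) rule with syntax checks could look like. The names `Condition`, `ALLOWED_OPS`, `is_valid`, and `covers`, the attribute names `age` and `stage`, and the particular operator restrictions are all assumptions made for illustration, not the paper's actual design.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

Value = Union[float, str]

# Assumed constraint: each attribute type admits only certain comparison operators.
ALLOWED_OPS = {"numeric": {"<=", ">"}, "categorical": {"=", "!="}}


@dataclass(frozen=True)
class Condition:
    """A single attribute test such as `age <= 60` (hypothetical structure)."""
    attribute: str
    op: str
    value: Value

    def holds(self, record: Dict[str, Value]) -> bool:
        x = record[self.attribute]
        if self.op == "<=":
            return x <= self.value
        if self.op == ">":
            return x > self.value
        if self.op == "=":
            return x == self.value
        return x != self.value  # remaining case: "!="


# A DNF rule antecedent: an OR over conjunctions, each an AND over conditions.
DNFRule = List[List[Condition]]


def is_valid(rule: DNFRule, schema: Dict[str, str]) -> bool:
    """Check the assumed syntactic constraints against an attribute schema."""
    if not rule:
        return False
    for conjunction in rule:
        if not conjunction:
            return False
        for cond in conjunction:
            kind = schema.get(cond.attribute)
            if kind is None or cond.op not in ALLOWED_OPS[kind]:
                return False
            is_number = isinstance(cond.value, (int, float)) and not isinstance(cond.value, bool)
            if (kind == "numeric") != is_number:
                return False
    return True


def covers(rule: DNFRule, record: Dict[str, Value]) -> bool:
    """A record satisfies the rule if at least one conjunction holds entirely."""
    return any(all(c.holds(record) for c in conj) for conj in rule)


# Hypothetical schema and rule: (age <= 60 AND stage = I) OR (age > 80)
schema = {"age": "numeric", "stage": "categorical"}
rule = [
    [Condition("age", "<=", 60.0), Condition("stage", "=", "I")],
    [Condition("age", ">", 80.0)],
]
bad_rule = [[Condition("stage", "<=", "I")]]  # numeric operator on a categorical attribute
```

Under these assumptions, a GP system would apply a check like `is_valid` after crossover or mutation (or restrict those operators so that invalid trees are never produced), which keeps every individual readable as an "IF (A AND B) OR (C) THEN class" rule.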