Structural Risk Minimization on Decision Trees Using an Evolutionary Multiobjective Optimization
Inducing decision trees is a popular method in machine learning. The information gain computed for each attribute and its threshold helps to find a small number of rules for data classification. However, there has been little research on how many rules are appropriate for a given data set. In this paper, an evolutionary multiobjective optimization approach with genetic programming is applied to the data classification problem in order to find the minimum error rate for each decision tree size. Following the structural risk minimization principle suggested by Vapnik, we can then determine a desirable number of rules with the best generalization performance. The approach provides a hierarchy of decision trees for classification performance, which is compared with the results of C4.5.
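The idea of keeping the minimum error rate for each tree size and then choosing a size by generalization performance can be illustrated with a minimal sketch. It is not the authors' implementation: the genetic programming search is omitted, tree size is assumed to be counted in leaf nodes, held-out error is used as a simple stand-in for the structural risk minimization step, and the function names `pareto_front` and `select_size` as well as all numbers are hypothetical.

```python
from typing import Dict, List, Tuple

# (tree size, training error rate); size is assumed to be the number of leaf nodes
Candidate = Tuple[int, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep trees that no smaller-or-equal tree matches or beats on error."""
    # Lowest error seen for each distinct tree size.
    best_per_size: Dict[int, float] = {}
    for size, err in candidates:
        if size not in best_per_size or err < best_per_size[size]:
            best_per_size[size] = err

    # Walk sizes in increasing order; a larger tree survives only if it
    # strictly improves on the best error of every smaller tree.
    front: List[Candidate] = []
    best_err = float("inf")
    for size in sorted(best_per_size):
        err = best_per_size[size]
        if err < best_err:
            front.append((size, err))
            best_err = err
    return front


def select_size(validation_error: Dict[int, float]) -> int:
    """Pick the size with the lowest held-out error; ties go to the smaller tree."""
    return min(sorted(validation_error), key=lambda s: validation_error[s])


# Illustrative, made-up values (not results from the paper).
candidates = [(2, 0.30), (2, 0.25), (3, 0.25), (4, 0.10), (5, 0.12), (6, 0.05)]
front = pareto_front(candidates)
chosen = select_size({2: 0.28, 4: 0.15, 6: 0.15})
```

In this sketch the front records one non-dominated tree per useful size, and the final choice prefers the smallest tree among those with equal held-out error, which mirrors the preference for simpler structures at equal risk.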
Keywords: Decision Tree, Leaf Node, Information Gain, Tree Size, Generalization Error