Abstract
This paper describes a genetic programming (GP) approach to binary classification problems with class imbalance. The approach is evaluated on two benchmark and two synthetic data sets. The results show that when overall classification accuracy is used as the fitness function, the GP system is strongly biased toward the majority class. Two new fitness functions are developed to address this bias. The experimental results show that both substantially improve performance on the minority class and yield much more balanced performance across the majority and minority classes.
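The bias of an accuracy-based fitness function can be illustrated with a small sketch. The abstract does not define the paper's two new fitness functions, so the `balanced_fitness` function below is only an assumed stand-in (the mean of the two per-class accuracies), and the 95/5 class split and all function names are hypothetical choices made for illustration.

```python
# Illustrative sketch only: these are NOT the paper's fitness functions.
# Labels: 0 = majority class, 1 = minority class (assumed encoding).

def overall_accuracy(y_true, y_pred):
    """Fraction of all examples classified correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def per_class_accuracy(y_true, y_pred, label):
    """Fraction of examples of one class that are classified correctly."""
    indices = [i for i, t in enumerate(y_true) if t == label]
    hits = sum(1 for i in indices if y_pred[i] == label)
    return hits / len(indices)

def balanced_fitness(y_true, y_pred):
    """Assumed stand-in fitness: mean of the two per-class accuracies."""
    majority = per_class_accuracy(y_true, y_pred, 0)
    minority = per_class_accuracy(y_true, y_pred, 1)
    return 0.5 * (majority + minority)

# Hypothetical unbalanced data set: 95 majority and 5 minority examples.
y_true = [0] * 95 + [1] * 5

# A degenerate classifier that always predicts the majority class.
always_majority = [0] * 100

acc = overall_accuracy(y_true, always_majority)
bal = balanced_fitness(y_true, always_majority)
print(f"overall accuracy: {acc:.2f}")
print(f"balanced fitness: {bal:.2f}")
```

Under these assumptions, the degenerate classifier scores highly on overall accuracy while misclassifying every minority example, whereas the per-class average penalizes it. This is the kind of imbalance that a fitness function for unbalanced data has to counteract.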
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Patterson, G., Zhang, M. (2007). Fitness Functions in Genetic Programming for Classification with Unbalanced Data. In: Orgun, M.A., Thornton, J. (eds) AI 2007: Advances in Artificial Intelligence. AI 2007. Lecture Notes in Computer Science(), vol 4830. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76928-6_90
DOI: https://doi.org/10.1007/978-3-540-76928-6_90
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76926-2
Online ISBN: 978-3-540-76928-6
eBook Packages: Computer Science, Computer Science (R0)