Improving Fitness Functions in Genetic Programming for Classification on Unbalanced Credit Card Data
Credit card classification based on machine learning has attracted considerable interest from the research community. One of the most important challenges in this area is handling the class imbalance in credit card data. On such data, classifiers tend to yield poor accuracy on the minority class despite achieving high overall accuracy, because the majority class dominates traditional training criteria. In this paper, we apply genetic programming (GP) to address this issue by adapting existing fitness functions. We examine two fitness functions from previous studies and develop two new fitness functions to evolve GP classifiers with better accuracy both on the minority class and overall. Two UCI credit card datasets are used to evaluate the effectiveness of the proposed fitness functions. The results demonstrate that the proposed fitness functions improve GP classifiers, encouraging fitter solutions on both the minority and the majority classes.
Keywords: Class imbalance, Credit card data, Fitness functions
This work is funded by Vietnam International Education Development (VIED) and by agreement with the Irish Universities Association.
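The abstract's central observation, that high overall accuracy can hide poor minority-class accuracy, can be illustrated with a small sketch. The code below is not one of the fitness functions examined or proposed in the paper, which this summary does not specify. It is a generic comparison between overall accuracy and the mean of the per-class accuracies. The function names, the label convention (1 for the minority class, 0 for the majority class), and the toy data are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def per_class_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (minority_accuracy, majority_accuracy).

    Assumes label 1 marks the minority class and 0 the majority class,
    and that both classes occur at least once in y_true.
    """
    minority_total = sum(1 for t in y_true if t == 1)
    majority_total = len(y_true) - minority_total
    minority_hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    majority_hits = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return minority_hits / minority_total, majority_hits / majority_total


def overall_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of all examples classified correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def balanced_fitness(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Hypothetical fitness: unweighted mean of the two per-class accuracies."""
    minority_acc, majority_acc = per_class_accuracy(y_true, y_pred)
    return 0.5 * (minority_acc + majority_acc)


# Toy data (invented for illustration): 95 majority and 5 minority examples.
y_true = [0] * 95 + [1] * 5

# A trivial classifier that always predicts the majority class.
always_majority = [0] * 100

print(overall_accuracy(y_true, always_majority))  # 0.95
print(balanced_fitness(y_true, always_majority))  # 0.5
```

The trivial classifier scores 0.95 on overall accuracy but only 0.5 on the class-averaged measure, because it never identifies a minority example. A fitness function built on overall accuracy would therefore reward it, which is the kind of bias that imbalance-aware fitness functions are meant to counter.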