Improving Fitness Functions in Genetic Programming for Classification on Unbalanced Credit Card Data

  • Van Loi Cao
  • Nhien-An Le-Khac
  • Michael O’Neill
  • Miguel Nicolau
  • James McDermott
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9597)

Abstract

Credit card classification based on machine learning has attracted considerable interest from the research community. One of the most important challenges in this area is enabling classifiers to handle the imbalance in credit card data. In this scenario, classifiers tend to yield poor accuracy on the minority class despite achieving high overall accuracy, because the majority class dominates traditional training criteria. In this paper, we apply genetic programming (GP) to address this issue by adapting existing fitness functions. We examine two fitness functions from previous studies and develop two new fitness functions to evolve GP classifiers with superior accuracy on the minority class and overall. Two UCI credit card datasets are used to evaluate the effectiveness of the proposed fitness functions. The results demonstrate that the proposed fitness functions improve GP classifiers, encouraging fitter solutions on both the minority and the majority classes.
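
The gap between overall accuracy and minority-class accuracy can be illustrated with a small sketch. This is not one of the fitness functions studied in the paper, which the abstract does not specify. The 95/5 class split, the always-majority classifier, and the function names `overall_accuracy` and `balanced_accuracy` are assumptions chosen for illustration only. The sketch compares plain accuracy with the mean of the per-class accuracies, a class-balanced measure commonly used on unbalanced data.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of all examples classified correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred, minority=1, majority=0):
    """Mean of the two per-class accuracies, weighting both classes equally."""
    def class_accuracy(label):
        # Correctness flags for the examples whose true class is `label`.
        hits = [p == label for t, p in zip(y_true, y_pred) if t == label]
        return sum(hits) / len(hits)
    return 0.5 * (class_accuracy(minority) + class_accuracy(majority))


# Toy data (assumed): 95 majority-class examples (0), 5 minority-class examples (1).
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

acc = overall_accuracy(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)
print(acc, bal)
```

Under these assumptions the degenerate classifier scores high on overall accuracy while misclassifying every minority example, and the class-balanced measure exposes this. A fitness function based only on overall accuracy would reward such a classifier, which is the kind of bias the abstract describes.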

Keywords

  • Class imbalance
  • Credit card data
  • Fitness functions

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Van Loi Cao (1)
  • Nhien-An Le-Khac (1)
  • Michael O’Neill (1)
  • Miguel Nicolau (1)
  • James McDermott (1)

  1. Natural Computing Research and Application Group, University College Dublin, Dublin, Ireland
