A Proposal of Evolutionary Prototype Selection for Class Imbalance Problems
Unbalanced data arise in a classification problem when some classes have many more instances than others. Several solutions have been proposed that address this problem at the data level by under-sampling. This work proposes evolutionary prototype selection algorithms that tackle unbalanced data by means of a new fitness function. The results show that balancing the data by evolutionary under-sampling outperforms previously proposed under-sampling methods in classification accuracy, while yielding reduced subsets and a well-balanced class distribution.
Keywords: Geometric Mean, Class Distribution, Minority Class, Balance Accuracy, Class Imbalance Problem
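The abstract does not define the new fitness function, so the following is only a rough sketch of how a candidate subset might be scored in evolutionary under-sampling. It assumes a binary problem, a binary mask as the chromosome (1 keeps an instance, 0 drops it), a leave-one-out 1-NN classifier, and the geometric mean of the per-class accuracies as the score. The names `g_mean`, `predict_1nn`, and `fitness` are hypothetical and are not taken from the paper.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of the accuracy on the positive (minority) class
    and the accuracy on the negative (majority) class."""
    n_pos = sum(1 for t in y_true if t == positive)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        return 0.0
    # A prediction counts only when it matches the true label exactly,
    # so a missing prediction (None) is never counted as correct.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == t)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == t)
    return math.sqrt((tp / n_pos) * (tn / n_neg))


def predict_1nn(X, y, mask, i):
    """Label of the selected instance nearest to X[i], excluding i itself
    (leave-one-out). Returns None if no other instance is selected."""
    best_label, best_dist = None, math.inf
    for j, keep in enumerate(mask):
        if not keep or j == i:
            continue
        dist = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
        if dist < best_dist:
            best_label, best_dist = y[j], dist
    return best_label


def fitness(mask, X, y, positive=1):
    """Assumed fitness: g-mean of 1-NN over the subset encoded by `mask`,
    evaluated on every training instance."""
    y_pred = [predict_1nn(X, y, mask, i) for i in range(len(X))]
    return g_mean(y, y_pred, positive)
```

Under these assumptions, a mask that discards every minority instance scores 0 because no minority instance can be classified correctly, whereas plain accuracy would still reward it. This is the usual motivation for preferring the geometric mean on unbalanced data.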