Abstract
In a concept learning problem, imbalances in the distribution of the data can occur either between the two classes or within a single class. Yet, although both types of imbalance are known to negatively affect the performance of standard classifiers, methods for dealing with the class imbalance problem usually focus on rectifying the between-class imbalance, neglecting the imbalance occurring within each class. The purpose of this paper is to extend the simplest proposed approach for dealing with the between-class imbalance problem, random re-sampling, so that it deals with the two problems simultaneously. Although re-sampling is not necessarily the best way to deal with problems of imbalance, the results reported in this paper suggest that addressing both problems simultaneously is beneficial and should be done by more sophisticated techniques as well.
This research was supported by an NSERC Research Grant.
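The abstract does not spell out how random re-sampling is extended, so the following is only a minimal sketch of one way the idea could look, not the paper's procedure. It assumes each example already carries a subcluster identifier within its class (how subclusters are found is left open here), and it uses over-sampling with replacement for both steps. The names `oversample` and `resample_both` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def oversample(items, target, rng):
    """Return items plus random duplicates (drawn with replacement) up to target."""
    extra = [rng.choice(items) for _ in range(target - len(items))]
    return list(items) + extra


def resample_both(examples, seed=0):
    """Balance subclusters within each class, then balance the classes.

    examples: list of (features, class_label, subcluster_id) triples.
    The subcluster ids are assumed to be given; this is an illustrative
    assumption, not something stated in the abstract.
    """
    rng = random.Random(seed)

    # Group examples by (class, subcluster).
    groups = defaultdict(list)
    for ex in examples:
        _, label, cluster = ex
        groups[(label, cluster)].append(ex)

    # Within-class step: grow every subcluster to the size of the largest one.
    target = max(len(members) for members in groups.values())
    by_class = defaultdict(list)
    for (label, _), members in sorted(groups.items()):
        by_class[label].extend(oversample(members, target, rng))

    # Between-class step: grow every class to the size of the largest class.
    class_target = max(len(members) for members in by_class.values())
    balanced = []
    for label in sorted(by_class):
        balanced.extend(oversample(by_class[label], class_target, rng))
    return balanced


# Toy data: "pos" has two subclusters of sizes 2 and 1, "neg" has one of size 6.
data = [((0.0,), "pos", "a"), ((0.1,), "pos", "a"), ((5.0,), "pos", "b")]
data += [((float(i),), "neg", "c") for i in range(6)]

balanced = resample_both(data)
label_counts = Counter(label for _, label, _ in balanced)
print(label_counts)  # both classes end up with 12 examples
```

In this toy run each "pos" subcluster is grown to 6 examples, and "neg" is then grown to 12 to match. When a class with several subclusters has to be grown in the second step, the extra draws come from its pooled examples, so its subclusters stay only approximately balanced.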
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Japkowicz, N. (2001). Concept-Learning in the Presence of Between-Class and Within-Class Imbalances. In: Stroulia, E., Matwin, S. (eds) Advances in Artificial Intelligence. Canadian AI 2001. Lecture Notes in Computer Science, vol 2056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45153-6_7
DOI: https://doi.org/10.1007/3-540-45153-6_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42144-3
Online ISBN: 978-3-540-45153-2
eBook Packages: Springer Book Archive