Concept-Learning in the Presence of Between-Class and Within-Class Imbalances

Japkowicz, Nathalie

doi:10.1007/3-540-45153-6_7

Nathalie Japkowicz

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2056)

Included in the following conference series:

Conference of the Canadian Society for Computational Studies of Intelligence

Abstract

In a concept-learning problem, imbalances in the distribution of the data can occur either between the two classes or within a single class. Although both types of imbalance are known to degrade the performance of standard classifiers, methods for dealing with the class imbalance problem usually focus on rectifying the between-class imbalance and neglect the imbalance occurring within each class. The purpose of this paper is to extend the simplest proposed approach for dealing with the between-class imbalance problem, random re-sampling, so that it deals with the two problems simultaneously. Although re-sampling is not necessarily the best way to deal with problems of imbalance, the results reported in this paper suggest that addressing both problems simultaneously is beneficial and should be done by more sophisticated techniques as well.
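
The abstract does not spell out the re-sampling procedure, so the following is only a minimal sketch of one way random over-sampling could address both kinds of imbalance at once. It assumes that each example already carries a subcluster identifier within its class (in practice this would have to come from some clustering step, which is not shown), and the names `random_oversample` and `balance_between_and_within` are hypothetical, not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def random_oversample(items, target, rng):
    """Return the items plus random duplicates (drawn with replacement)
    so that the result has at least `target` elements."""
    extra = [rng.choice(items) for _ in range(max(0, target - len(items)))]
    return list(items) + extra


def balance_between_and_within(examples, seed=0):
    """Over-sample (features, label, subcluster) triples.

    Assumption: the subcluster ids are given; how they are obtained
    is outside the scope of this sketch.
    """
    rng = random.Random(seed)
    groups = defaultdict(lambda: defaultdict(list))
    for example in examples:
        _, label, cluster = example
        groups[label][cluster].append(example)

    # Within-class step: grow every subcluster of a class to the size
    # of that class's largest subcluster.
    per_class = {}
    for label, clusters in groups.items():
        largest = max(len(members) for members in clusters.values())
        per_class[label] = {
            cluster: random_oversample(members, largest, rng)
            for cluster, members in clusters.items()
        }

    # Between-class step: grow every class to (roughly) the size of the
    # largest class, spreading the increase evenly over its subclusters.
    class_size = max(
        sum(len(members) for members in clusters.values())
        for clusters in per_class.values()
    )
    balanced = []
    for label, clusters in per_class.items():
        quota = -(-class_size // len(clusters))  # ceiling division
        for members in clusters.values():
            balanced.extend(random_oversample(members, quota, rng))
    return balanced


# Toy data: class "pos" has subclusters of size 6 and 2,
# class "neg" has subclusters of size 3, 1 and 1.
data = (
    [((i,), "pos", "p0") for i in range(6)]
    + [((i,), "pos", "p1") for i in range(2)]
    + [((i,), "neg", "n0") for i in range(3)]
    + [((i,), "neg", "n1") for i in range(1)]
    + [((i,), "neg", "n2") for i in range(1)]
)
balanced = balance_between_and_within(data)
label_counts = Counter(label for _, label, _ in balanced)
cluster_counts = Counter(cluster for _, _, cluster in balanced)
```

In this sketch no original example is discarded; only duplicates are added. Because the per-subcluster quota is rounded up, class sizes can differ slightly when the largest class size is not divisible by a class's number of subclusters.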

This research was supported by an NSERC Research Grant.

Author information

Authors and Affiliations

School of Information Technology and Engineering, University of Ottawa, 150 Louis Pasteur, P.O. Box 450, Stn. A Ottawa, Ontario, Canada, K1N 6N5
Nathalie Japkowicz

Editor information

Editors and Affiliations

Department of Computer Science, University of Alberta, Edmonton, AB, Canada, T6G 2E8
Eleni Stroulia
School of Information Technology and Engineering, University of Ottawa, Ottawa, ON, Canada, K1N 6N5
Stan Matwin

About this paper

Cite this paper

Japkowicz, N. (2001). Concept-Learning in the Presence of Between-Class and Within-Class Imbalances. In: Stroulia, E., Matwin, S. (eds) Advances in Artificial Intelligence. Canadian AI 2001. Lecture Notes in Computer Science, vol 2056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45153-6_7

DOI: https://doi.org/10.1007/3-540-45153-6_7
Published: 16 May 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42144-3
Online ISBN: 978-3-540-45153-2
eBook Packages: Springer Book Archive
