Concept-Learning in the Presence of Between-Class and Within-Class Imbalances

  • Conference paper
Advances in Artificial Intelligence (Canadian AI 2001)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2056)

Abstract

In a concept-learning problem, imbalances in the distribution of the data can occur either between the two classes or within a single class. Although both types of imbalance are known to negatively affect the performance of standard classifiers, methods for dealing with the class imbalance problem usually focus on rectifying the between-class imbalance and neglect the imbalance occurring within each class. The purpose of this paper is to extend the simplest proposed approach for dealing with the between-class imbalance problem, random re-sampling, in order to deal simultaneously with the two problems. Although re-sampling is not necessarily the best way to deal with problems of imbalance, the results reported in this paper suggest that addressing both problems simultaneously is beneficial and should be done by more sophisticated techniques as well.
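
The abstract does not spell out the re-sampling procedure, so the following is only a minimal sketch of one way random re-sampling could be extended to address both kinds of imbalance at once. It assumes each example already carries a subcluster identifier within its class (for instance from a prior clustering step); the function names, the triple layout, and the equal-share-per-subcluster rule are illustrative assumptions, not the paper's method.

```python
import math
import random
from collections import defaultdict


def oversample_to(items, target, rng):
    """Return items plus random duplicates (drawn with replacement) up to target."""
    extra = [rng.choice(items) for _ in range(max(0, target - len(items)))]
    return list(items) + extra


def resample_between_and_within(examples, rng):
    """Randomly oversample so classes and their subclusters are balanced.

    examples: list of (features, label, subcluster_id) triples. The
    subcluster ids are assumed to be given; they are not computed here.
    """
    # Group examples by class, then by subcluster inside each class.
    clusters = defaultdict(lambda: defaultdict(list))
    for ex in examples:
        _, label, cluster_id = ex
        clusters[label][cluster_id].append(ex)

    # Size each class would reach if every subcluster matched its largest one.
    class_size = {
        label: len(groups) * max(len(g) for g in groups.values())
        for label, groups in clusters.items()
    }
    target_total = max(class_size.values())

    resampled = []
    for label, groups in clusters.items():
        # Give every subcluster of this class an equal share of the target.
        # With ceil, class totals can differ by fewer than len(groups) examples.
        per_cluster = math.ceil(target_total / len(groups))
        for members in groups.values():
            resampled.extend(oversample_to(members, per_cluster, rng))
    return resampled
```

Plain between-class re-sampling corresponds to the special case where each class is treated as a single subcluster. Passing a seeded `random.Random` instance keeps the duplication reproducible.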

This research was supported by an NSERC Research Grant.




Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Japkowicz, N. (2001). Concept-Learning in the Presence of Between-Class and Within-Class Imbalances. In: Stroulia, E., Matwin, S. (eds) Advances in Artificial Intelligence. Canadian AI 2001. Lecture Notes in Computer Science, vol 2056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45153-6_7

  • DOI: https://doi.org/10.1007/3-540-45153-6_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42144-3

  • Online ISBN: 978-3-540-45153-2

  • eBook Packages: Springer Book Archive
