An Approach to Imbalanced Data Sets Based on Changing Rule Strength

Grzymala-Busse, Jerzy W.; Goodwin, Linda K.; Grzymala-Busse, Witold J.; Zheng, Xinqun

doi:10.1007/978-3-642-18859-6_21

Part of the book series: Cognitive Technologies (COGTECH)

Summary

This chapter describes experiments with a challenging data set on preterm births. The data set, collected at the Duke University Medical Center, was large, but many of its attribute values were missing. The main problem, however, was that only 20.7% of the cases belonged to the important preterm birth class, so the data set was imbalanced. For comparison, we also report results of experiments on another imbalanced data set, the well-known breast cancer data set.
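To make the consequence of the 20.7% figure concrete, the following minimal sketch computes the metrics of a trivial classifier that always predicts the majority class. It is an illustration only, not the chapter's method: the function name `majority_baseline` and the case counts (1000 cases, 207 of them preterm) are hypothetical, chosen solely to match the 20.7% proportion, since the actual data set size is not given here.

```python
def majority_baseline(n_cases: int, n_minority: int) -> dict:
    """Metrics for a classifier that always predicts the majority class.

    Hypothetical helper for illustration; the minority class is treated
    as the positive class (here, preterm birth).
    """
    true_pos = 0                        # no minority case is ever predicted
    false_neg = n_minority              # every minority case is missed
    true_neg = n_cases - n_minority     # every majority case is correct
    false_pos = 0
    return {
        "accuracy": (true_pos + true_neg) / n_cases,
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
    }


# Assumed counts: 207 of 1000 cases in the minority class (20.7%).
metrics = majority_baseline(n_cases=1000, n_minority=207)
print(metrics)
```

Under these assumed counts the trivial classifier is right on every majority case yet detects no preterm birth at all, which is why overall accuracy alone is a poor guide on imbalanced data and sensitivity on the minority class matters.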

Author information

Authors and Affiliations

Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, USA
Jerzy W. Grzymala-Busse & Xinqun Zheng
Department of Information Services and the School of Nursing, Duke University, Durham, NC 27710, USA
Linda K. Goodwin
RS Systems, Inc., Lawrence, KS 66047, USA
Witold J. Grzymala-Busse

Editor information

Editors and Affiliations

Indian Statistical Institute, Machine Intelligence Unit, 203 Barrackpore Trunk Road, Calcutta, 700 035, India
Sankar K. Pal
Department of Mathematics and Computer Science, University of Warmia and Mazury, Zolnierska 14, Olsztyn, Poland
Lech Polkowski
Institute of Mathematics, Warsaw University, Banacha 2, 02-097, Warsaw, Poland
Andrzej Skowron

About this chapter

Cite this chapter

Grzymala-Busse, J.W., Goodwin, L.K., Grzymala-Busse, W.J., Zheng, X. (2004). An Approach to Imbalanced Data Sets Based on Changing Rule Strength. In: Pal, S.K., Polkowski, L., Skowron, A. (eds) Rough-Neural Computing. Cognitive Technologies. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18859-6_21

DOI: https://doi.org/10.1007/978-3-642-18859-6_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-62328-8
Online ISBN: 978-3-642-18859-6
eBook Packages: Springer Book Archive