Summary
This chapter describes experiments with a challenging data set on preterm births. The data set, collected at the Duke University Medical Center, was large, but many of its attribute values were missing. The main problem, however, was that only 20.7% of the cases belonged to the important preterm birth class, so the data set was imbalanced. For comparison, we include results of experiments on another imbalanced data set, the well-known breast cancer data set.
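To illustrate why a 20.7% share of the important class is a problem, the following minimal sketch computes the metrics of a trivial classifier that always predicts the majority class. It is not the method used in the chapter; the function name `majority_baseline_metrics` and the cohort size of 1000 cases are hypothetical, chosen only for the arithmetic.

```python
def majority_baseline_metrics(positive_fraction: float, n_cases: int = 1000):
    """Metrics for a classifier that always predicts the majority (negative) class.

    n_cases is a hypothetical cohort size used only for illustration.
    """
    positives = round(positive_fraction * n_cases)
    negatives = n_cases - positives

    # Always predicting "negative": every positive is missed,
    # every negative is classified correctly.
    true_pos, false_neg = 0, positives
    true_neg, false_pos = negatives, 0

    accuracy = (true_pos + true_neg) / n_cases
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return accuracy, sensitivity, specificity


# 20.7% is the preterm birth share stated in the summary.
accuracy, sensitivity, specificity = majority_baseline_metrics(0.207)
print(f"accuracy={accuracy:.3f} sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
# prints: accuracy=0.793 sensitivity=0.000 specificity=1.000
```

Under these assumptions the trivial classifier is right in 79.3% of cases yet detects no preterm births at all, which is why overall accuracy alone is a poor guide on imbalanced data and sensitivity has to be considered separately.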
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Grzymala-Busse, J.W., Goodwin, L.K., Grzymala-Busse, W.J., Zheng, X. (2004). An Approach to Imbalanced Data Sets Based on Changing Rule Strength. In: Pal, S.K., Polkowski, L., Skowron, A. (eds) Rough-Neural Computing. Cognitive Technologies. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18859-6_21
DOI: https://doi.org/10.1007/978-3-642-18859-6_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-62328-8
Online ISBN: 978-3-642-18859-6
eBook Packages: Springer Book Archive