Handling Missing Attribute Values in Preterm Birth Data Sets

  • Jerzy W. Grzymala-Busse
  • Linda K. Goodwin
  • Witold J. Grzymala-Busse
  • Xinqun Zheng
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3642)


The objective of our research was to find the best approach to handling missing attribute values in data sets describing preterm birth, provided by Duke University. Five strategies were used for filling in missing attribute values, based on most common values and closest fit for symbolic attributes, averages for numerical attributes, and a special approach that induces only certain rules from specified information using the MLEM2 approach. We conclude that the best strategy was to use the global most common method for symbolic attributes and the global average method for numerical attributes.
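As an illustration of the strategy the abstract reports as best, the following minimal sketch fills each missing symbolic value with the attribute's most common known value over the whole data set, and each missing numerical value with the attribute's average over the whole data set. It is not the authors' implementation: the function name `fill_missing`, the `"?"` marker for a missing value, and the attribute names `age` and `smoker` are all assumptions made for this example.

```python
from collections import Counter
from statistics import mean

MISSING = "?"  # assumed marker for a missing attribute value


def fill_missing(rows, numeric_attrs):
    """Fill missing values with global statistics computed over all rows.

    Numerical attributes get the global average of the known values;
    symbolic attributes get the global most common known value.
    """
    fill = {}
    for attr in rows[0].keys():
        known = [r[attr] for r in rows if r[attr] != MISSING]
        if not known:
            continue  # no known value to learn from; leave the attribute as is
        if attr in numeric_attrs:
            fill[attr] = mean(float(v) for v in known)
        else:
            fill[attr] = Counter(known).most_common(1)[0][0]
    # Build new rows so the input data is not modified in place.
    return [
        {a: (fill.get(a, v) if v == MISSING else v) for a, v in r.items()}
        for r in rows
    ]


# Hypothetical toy records, not taken from the Duke University data.
records = [
    {"age": 30, "smoker": "no"},
    {"age": MISSING, "smoker": "no"},
    {"age": 20, "smoker": MISSING},
    {"age": 25, "smoker": "yes"},
]
filled = fill_missing(records, numeric_attrs={"age"})
```

Here the missing `age` would be replaced by the average of the three known ages, and the missing `smoker` value by the most frequent known value. The "global" variants ignore the decision (class) value of each case; a concept-restricted variant would compute the same statistics only over cases with the same decision value.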





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Jerzy W. Grzymala-Busse (1, 2)
  • Linda K. Goodwin (3)
  • Witold J. Grzymala-Busse (4)
  • Xinqun Zheng (5)

  1. Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, USA
  2. Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
  3. Nursing Informatics Program, Duke University, Durham, USA
  4. Filterlogix, Lawrence, USA
  5. PC Sprint, Overland Park, USA
