
Discretization and Fuzzification of Numerical Attributes in Attribute-Based Learning

  • Chapter
Fuzzy Systems in Medicine

Part of the book series: Studies in Fuzziness and Soft Computing ((STUDFUZZ,volume 41))

Abstract

Machine learning (ML) algorithms have traditionally been capable of processing only symbolic, categorical data. Real-world problems, particularly in medicine, comprise not only symbolic but also numerical attributes. There are several approaches to discretizing (categorizing) numerical attributes. This article describes two recent algorithms for such discretization.

The first was designed and implemented in KEX (Knowledge Explorer) as its preprocessing procedure. The other discretization procedure was designed for the CN4 algorithm, a substantial extension of the well-known CN2. The discretization procedure in CN4 works on-line, i.e., it discretizes numerical attributes dynamically, during induction.
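The procedures themselves are not part of this preview, so as a generic illustration of what an off-line (preprocessing) discretization step looks like, here is a minimal equal-frequency binning sketch. It is an illustrative stand-in, not the actual KEX procedure; all function names are hypothetical:

```python
def equal_frequency_cuts(values, n_bins):
    """Return cut points that split the sorted values into n_bins
    roughly equal-sized intervals (a generic off-line scheme)."""
    xs = sorted(values)
    cuts = []
    for i in range(1, n_bins):
        idx = i * len(xs) // n_bins
        # Place each cut midway between the two neighbouring values.
        cuts.append((xs[idx - 1] + xs[idx]) / 2.0)
    return cuts

def discretize(x, cuts):
    """Map a numeric value to the index of its (sharp) interval."""
    for i, c in enumerate(cuts):
        if x < c:
            return i
    return len(cuts)
```

An on-line procedure such as CN4's would instead choose cut points during rule induction, using the class labels of the covered examples rather than a fixed preprocessing pass.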

A major drawback of these discretization procedures, whether off-line or on-line, is that they generate sharp bounds between intervals. One way to eliminate the impurity around interval borders is to fuzzify them. Here we introduce the newest empirical procedures for fuzzification, both off-line (within KEX) and on-line (within CN4).
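To illustrate the idea of fuzzifying a sharp bound, the sketch below replaces the 0/1 membership step at a cut point with a linear ramp over a band around it. This is a generic illustration under assumed names and an assumed `width` parameter, not the chapter's actual fuzzification procedure:

```python
def fuzzy_membership(x, cut, width):
    """Degree to which x belongs to the interval *below* a cut point.
    Instead of a sharp 0/1 boundary, membership falls linearly from
    1 to 0 across a band of the given width centred on the cut."""
    lo, hi = cut - width / 2.0, cut + width / 2.0
    if x <= lo:
        return 1.0   # clearly below the border region
    if x >= hi:
        return 0.0   # clearly above the border region
    return (hi - x) / width  # linear ramp inside the border band
```

A value exactly at the cut now belongs to both neighbouring intervals with degree 0.5, so examples near a border no longer flip categories on a tiny change in the attribute value.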

This chapter first surveys the methodology of empirical machine learning (Section 1), then attribute-based rule-inducing learning from examples (Section 2). Section 3 briefly introduces the KEX algorithm and Section 4 surveys CN4. The final section focuses on the discretization and fuzzification procedures, includes empirical results comparing the performance of KEX, CN4, and other well-known machine learning algorithms with respect to discretization and fuzzification, and concludes with an analysis.



References

  1. P. Berka, I. Bruha (1995): Various discretizing procedures of numerical attributes: Empirical comparisons. European Conf. on Machine Learning, Workshop on Statistics, Machine Learning, and Knowledge Discovery in Databases, Heraklion, Crete, 136–141.

  2. P. Berka, J. Ivanek (1994): Automated knowledge acquisition for PROSPECTOR-like expert systems. ECML-94, Springer-Verlag, 339–342.

  3. P.B. Brazdil, I. Bruha (1992): A note on processing missing attribute values: A modified technique. Workshop on Machine Learning, Canadian Conf. AI, Vancouver.

  4. D. Biggs, B. de Ville, E. Suen (1991): A method of choosing multiway partitions for classification and decision trees. J. Applied Statistics, 18, 1, 49–62.

  5. I. Bruha, S. Kockova (1993): Quality of decision rules: Empirical and statistical approaches. Informatica, 17, 233–243.

  6. I. Bruha, S. Kockova (1993): A covering learning algorithm for cost-sensitive and noisy environments. European Conf. on Machine Learning, Workshop on Learning Robots, Vienna.

  7. I. Bruha (1996): Quality of decision rules: Definitions and classification schemes for multiple rules. In: G. Nakhaeizadeh, C.C. Taylor (eds.): Machine Learning and Statistics: The Interface. John Wiley, 107–131.

  8. I. Bruha, S. Kockova (1994): A support for decision making: Cost-sensitive learning system. Artificial Intelligence in Medicine, 6, 67–82.

  9. J. Catlett (1991): On changing continuous attributes into ordered discrete attributes. EWSL-91, Porto, Springer-Verlag, 164–178.

  10. B. Cestnik, I. Kononenko, I. Bratko (1988): Assistant 86: A knowledge-elicitation tool for sophisticated users. In: I. Bratko, N. Lavrac (eds.): Progress in Machine Learning. Proc. EWSL-88, Sigma Press, 31–46.

  11. P. Clark, R. Boswell (1991): Rule induction with CN2: Some recent improvements. EWSL-91, Porto.

  12. J.G. Carbonell, R.S. Michalski, T.M. Mitchell (1983): An overview of machine learning. In [19].

  13. P. Clark, T. Niblett (1989): The CN2 induction algorithm. Machine Learning, 3, 261–283.

  14. R.O. Duda, J.G. Gaschnig (1979): Model design in the PROSPECTOR consultant system for mineral exploration. In: D. Michie (ed.): Expert Systems in the Micro-Electronic Age, Edinburgh University Press, UK.

  15. P. Hajek (1985): Combining functions for certainty factors in consulting systems. Int. J. Man-Machine Studies, 22, 59–76.

  16. C. Lee, D. Shin (1994): A context-sensitive discretization of numeric attributes for classification learning. ECAI-94, Amsterdam, John Wiley, 428–432.

  17. R.S. Michalski (1980): Pattern recognition as rule-guided inductive inference. IEEE Trans. PAMI-2, 4, 349–361.

  18. R.S. Michalski et al. (1986): The multi-purpose incremental learning system AQ15 and its testing application to three medical domains. Proc. 5th AAAI, 1041–1045.

  19. R.S. Michalski, J.G. Carbonell, T.M. Mitchell (eds.) (1983): Machine Learning: An Artificial Intelligence Approach, I. Tioga Publ.

  20. M. Nunez (1988): Economic induction: A case study. EWSL-88, Glasgow, 139–145.

  21. J.R. Quinlan (1986): Induction of decision trees. Machine Learning, 1, 81–106.

  22. J.R. Quinlan (1987): Simplifying decision trees. Int. J. Man-Machine Studies, 27, 221–234.

  23. J.R. Quinlan (1989): Unknown attribute values in ID3. Int'l Conf. on Machine Learning, 164–168.

  24. J.R. Quinlan (1994): C4.5: Programs for Machine Learning. Morgan Kaufmann Publ.

  25. H.A. Simon (1983): Why should machines learn? In [19].

  26. M. Tan, J.C. Schlimmer (1990): Two case studies in cost-sensitive concept acquisition. 8th Conf. AI.

  27. J. Zeidler, M. Schlosser (1995): Fuzzy handling of continuous-valued attributes in decision trees. 8th European Conf. on Machine Learning, Workshop on Statistics, Machine Learning, and Knowledge Discovery in Databases, Heraklion, Crete, 41–46.


Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Bruha, I., Berka, P. (2000). Discretization and Fuzzification of Numerical Attributes in Attribute-Based Learning. In: Szczepaniak, P.S., Lisboa, P.J.G., Kacprzyk, J. (eds) Fuzzy Systems in Medicine. Studies in Fuzziness and Soft Computing, vol 41. Physica, Heidelberg. https://doi.org/10.1007/978-3-7908-1859-8_6


  • DOI: https://doi.org/10.1007/978-3-7908-1859-8_6

  • Publisher Name: Physica, Heidelberg

  • Print ISBN: 978-3-662-00395-4

  • Online ISBN: 978-3-7908-1859-8

  • eBook Packages: Springer Book Archive
