Abstract
Machine learning (ML) algorithms have been capable of processing only symbolic, categorical data. Real-world problems, particularly in medicine, comprise not only symbolic but also numerical attributes. There are several approaches to discretizing (categorizing) numerical attributes. This chapter describes two newer algorithms for such discretization.
The first one was designed and implemented in KEX (Knowledge Explorer) as its preprocessing procedure. The other discretization procedure was designed for the CN4 algorithm, a substantial extension of the well-known CN2. The discretization procedure in CN4 works on-line, i.e., it discretizes numerical attributes dynamically, during induction.
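The off-line regime described above can be illustrated with a generic preprocessing step. The sketch below uses plain equal-frequency binning; it is illustrative only, since the actual KEX procedure is class-sensitive and is not reproduced here, and the function name and default bin count are our own choices:

```python
def discretize_offline(values, n_bins=4):
    """Off-line (preprocessing) discretization by equal-frequency binning.

    Illustrative sketch only: the real KEX preprocessing procedure is
    class-sensitive; `n_bins` and the binning criterion are assumptions.
    """
    ordered = sorted(values)
    # Choose n_bins - 1 cut points at (roughly) equal-frequency positions
    cuts = [ordered[(i * len(ordered)) // n_bins] for i in range(1, n_bins)]

    def interval(x):
        # Index of the interval that x falls into (0 .. n_bins - 1)
        return sum(x >= c for c in cuts)

    # Replace each numeric value by its interval index
    return [interval(x) for x in values], cuts
```

An on-line procedure such as the one in CN4 would instead recompute candidate cut points during rule induction, using the training examples still covered at each step.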
A significant drawback of these discretization procedures, whether off-line or on-line, is that they generate sharp bounds between intervals. One way to eliminate the impurity around the interval borders is to fuzzify them. Here we introduce the newest empirical procedures for fuzzification, both off-line (within KEX) and on-line (within CN4).
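The idea of fuzzifying a sharp bound can be sketched with a trapezoidal membership function: inside the interval an example belongs to it fully, and near each border the membership degree falls off linearly instead of dropping abruptly. This is a minimal sketch under our own assumptions (a symmetric linear ramp of width `slope`); the chapter's actual fuzzification procedures are empirical and not reproduced here:

```python
def fuzzy_membership(x, left, right, slope):
    """Degree to which x belongs to the interval [left, right] when its
    sharp bounds are fuzzified by a linear ramp of width `slope`.

    Returns 1.0 well inside the interval, 0.0 well outside, and a value
    in between near either border.
    """
    if x <= left - slope or x >= right + slope:
        return 0.0          # clearly outside the fuzzified interval
    if left <= x <= right:
        return 1.0          # clearly inside the crisp interval
    if x < left:
        return (x - (left - slope)) / slope   # rising ramp at the left border
    return ((right + slope) - x) / slope      # falling ramp at the right border
```

With sharp bounds, an example just beyond a cut point contributes nothing to the interval; with the ramp above, it still contributes with a reduced weight, which is what mitigates the impurity around the borders.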
This chapter first surveys the methodology of empirical machine learning (Section 1), then attribute-based rule-inducing learning from examples (Section 2). Section 3 briefly introduces the KEX algorithm and Section 4 surveys CN4. The final section focuses on the discretization and fuzzification procedures, includes empirical results comparing the performance of KEX, CN4, and other well-known machine learning algorithms with respect to discretization and fuzzification, and concludes with an analysis.
References
P. Berka, I. Bruha (1995): Various discretizing procedures of numerical attributes: Empirical comparisons. European Conf. on Machine Learning, Workshop on Statistics, Machine Learning, and Knowledge Discovery in Databases, Heraklion, Crete, 136–141.
P. Berka, J. Ivanek (1994): Automated knowledge acquisition for PROSPECTOR-like expert systems. ECML-94, Springer-Verlag, 339–342.
P.B. Brazdil, I. Bruha (1992): A note on processing missing attribute values: A modified technique. Workshop on Machine Learning, Canadian Conf. on AI, Vancouver.
D. Biggs, B. de Ville, E. Suen (1991): A method of choosing multiway partitions for classification and decision trees. J. Applied Statistics, 18, 1, 49–62.
I. Bruha, S. Kockova (1993): Quality of decision rules: Empirical and statistical approaches. Informatica, 17, 233–243.
I. Bruha, S. Kockova (1993): A covering learning algorithm for cost-sensitive and noisy environments. European Conf. on Machine Learning, Workshop on Learning Robots, Vienna.
I. Bruha (1996): Quality of decision rules: Definitions and classification schemes for multiple rules. In: G. Nakhaeizadeh, C.C. Taylor (eds.): Machine Learning and Statistics: The Interface. John Wiley, 107–131.
I. Bruha, S. Kockova (1994): A support for decision making: Cost-sensitive learning system. Artificial Intelligence in Medicine, 6, 67–82.
J. Catlett (1991): On changing continuous attributes into ordered discrete attributes. EWSL-91, Porto, Springer-Verlag, 164–178.
B. Cestnik, I. Kononenko, I. Bratko (1988): Assistant 86: A knowledge-elicitation tool for sophisticated users. In: I. Bratko, N. Lavrac (eds.): Progress in Machine Learning. Proc. EWSL'88, Sigma Press, 31–46.
P. Clark, R. Boswell (1991): Rule induction with CN2: Some recent improvements. EWSL-91, Porto.
J.G. Carbonell, R.S. Michalski, T.M. Mitchell (1983): An overview of machine learning. In [19].
P. Clark, T. Niblett (1989): The CN2 induction algorithm. Machine Learning, 3, 261–283.
R.O. Duda, J. Gaschnig (1979): Model design in the PROSPECTOR consultant system for mineral exploration. In: D. Michie (ed.): Expert Systems in the Micro Electronic Age, Edinburgh University Press, UK.
P. Hajek (1985): Combining functions for certainty factors in consulting systems. Int. J. Man-Machine Studies, 22, 59–76.
C. Lee, D. Shin (1994): A context-sensitive discretization of numeric attributes for classification learning. ECAI-94, Amsterdam, John Wiley, 428–432.
R.S. Michalski (1980): Pattern recognition as rule-guided inductive inference. IEEE Trans. PAMI-2, 4, 349–361.
R.S. Michalski et al. (1986): The multi-purpose incremental learning system AQ15 and its testing application to three medical domains. Proc. 5th AAAI, 1041–1045.
R.S. Michalski, J.G. Carbonell, T.M. Mitchell (eds.) (1983): Machine Learning: An Artificial Intelligence Approach, I. Tioga Publ.
M. Nunez (1988): Economic induction: A case study. EWSL'88, Glasgow, 139–145.
J.R. Quinlan (1986): Induction of decision trees. Machine Learning, 1, 81–106.
J.R. Quinlan (1987): Simplifying decision trees. Int. J. Man-Machine Studies, 27, 221–234.
J.R. Quinlan (1989): Unknown attribute values in ID3. Int'l Conf. on Machine Learning, 164–168.
J.R. Quinlan (1994): C4.5: Programs for Machine Learning. Morgan Kaufmann Publ.
H.A. Simon (1983): Why should machines learn? In [19].
M. Tan, J.C. Schlimmer (1990): Two case studies in cost-sensitive concept acquisition. 8th Conf. on AI.
J. Zeidler, M. Schlosser (1995): Fuzzy handling of continuous-valued attributes in decision trees. 8th European Conf. on Machine Learning, Workshop on Statistics, Machine Learning, and Knowledge Discovery in Databases, Heraklion, Crete, 41–46.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
Cite this chapter
Bruha, I., Berka, P. (2000). Discretization and Fuzzification of Numerical Attributes in Attribute-Based Learning. In: Szczepaniak, P.S., Lisboa, P.J.G., Kacprzyk, J. (eds) Fuzzy Systems in Medicine. Studies in Fuzziness and Soft Computing, vol 41. Physica, Heidelberg. https://doi.org/10.1007/978-3-7908-1859-8_6
Publisher Name: Physica, Heidelberg
Print ISBN: 978-3-662-00395-4
Online ISBN: 978-3-7908-1859-8