Abstract
Real-world data usually contain a certain percentage of unknown (missing) attribute values. Efficient, robust data mining algorithms should therefore include routines for processing these unknown values. The paper [5] found that each dataset has, more or less, its own ‘favourite’ routine for processing unknown attribute values; the choice evidently depends on the magnitude of noise and the source of unknownness in each dataset. This paper presents one possible solution to the problem of selecting the right routine for processing unknown attribute values for a given database. The covering machine learning algorithm CN4 processes a given database independently with each of six routines for unknown attribute values. Afterwards, a meta-learner (meta-combiner) is used to derive a meta-classifier that makes the overall (final) decision about the class of unseen input objects.
Results of experiments on real-world data with various percentages of unknown attribute values are presented, and the performance of the meta-classifier is compared with that of the six base classifiers.
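The combiner scheme described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions: two stand-in routines for unknown values (`make_mode_filler`, `make_token_filler`) replace the six used in the paper, a tiny naive Bayes learner stands in for CN4, and the meta-classifier is a simple lookup table from base predictions to the most frequent true class. All names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

UNKNOWN = None  # marker for an unknown attribute value in this sketch

def make_mode_filler(train_rows):
    """Stand-in routine A: replace an unknown by the attribute's most common known value."""
    modes = []
    for j in range(len(train_rows[0])):
        known = [r[j] for r in train_rows if r[j] is not UNKNOWN]
        modes.append(Counter(known).most_common(1)[0][0] if known else "?")
    return lambda row: tuple(modes[j] if v is UNKNOWN else v for j, v in enumerate(row))

def make_token_filler(train_rows):
    """Stand-in routine B: treat 'unknown' as an ordinary extra attribute value."""
    return lambda row: tuple("?" if v is UNKNOWN else v for v in row)

class NaiveBayes:
    """Tiny categorical naive Bayes with add-one smoothing (a stand-in for CN4)."""
    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
        self.values = defaultdict(set)            # attribute index -> values seen in training
        for row, y in zip(rows, labels):
            for j, v in enumerate(row):
                self.value_counts[(j, y)][v] += 1
                self.values[j].add(v)
        return self

    def predict(self, row):
        best, best_score = None, -1.0
        for y, cy in sorted(self.class_counts.items()):  # sorted for deterministic ties
            score = float(cy)
            for j, v in enumerate(row):
                # The extra +1 in the denominator reserves mass for unseen values.
                score *= (self.value_counts[(j, y)][v] + 1) / (cy + len(self.values[j]) + 1)
            if score > best_score:
                best, best_score = y, score
        return best

def train_combiner(train, valid, routine_factories):
    """Train one base classifier per routine, then a table-based meta-classifier."""
    train_rows, train_labels = train
    valid_rows, valid_labels = valid
    base = []
    for factory in routine_factories:
        routine = factory(train_rows)
        model = NaiveBayes().fit([routine(r) for r in train_rows], train_labels)
        base.append((routine, model))

    def base_predictions(row):
        return tuple(model.predict(routine(row)) for routine, model in base)

    # Meta-level training set: base predictions on held-out rows, paired with the true class.
    table = defaultdict(Counter)
    for row, y in zip(valid_rows, valid_labels):
        table[base_predictions(row)][y] += 1

    def classify(row):
        preds = base_predictions(row)
        if preds in table:
            return table[preds].most_common(1)[0][0]
        return Counter(preds).most_common(1)[0][0]  # unseen pattern: plain vote
    return classify

# Toy data (hypothetical): attributes are (outlook, windy); None marks an unknown value.
train = ([("sunny", "no"), ("sunny", None), ("rain", "yes"),
          ("rain", None), ("sunny", "no"), ("rain", "yes")],
         ["play", "play", "stay", "stay", "play", "stay"])
valid = ([("sunny", None), ("rain", "yes"), (None, "no"), ("rain", None)],
         ["play", "stay", "play", "stay"])

classify = train_combiner(train, valid, [make_mode_filler, make_token_filler])
print(classify(("rain", None)))
```

The meta-level table is built from held-out rows rather than from the rows used to train the base classifiers, so the meta-classifier sees base predictions that were not fitted to the same examples. Other meta-learners could be substituted for the lookup table without changing the overall structure.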
Keywords
- Base Classifier
- Base Learner
- Numerical Attribute
- Beam Search
- Average Classification Accuracy
References
Berka, P. and Bruha, I.: Various discretizing procedures of numerical attributes: Empirical comparisons. 8th European Conference Machine Learning, Workshop Statistics, Machine Learning, and Knowledge Discovery in Databases, Heraklion, Crete (1995), 136–141
Boswell, R.: Manual for CN2, version 4.1. Turing Institute, Techn. Rept. P-2145/Rab/4/1.3 (1990)
Brazdil, P.B. and Bruha, I.: A note on processing missing attribute values: A modified technique. Workshop on Machine learning, Canadian Conference AI, Vancouver (1992)
Bruha, I.: Unknown attribute values processing utilizing expert knowledge on attribute hierarchy. 8th European Conference on Machine Learning, Workshop Statistics, Machine Learning, and Knowledge Discovery in Databases, Heraklion, Crete (1995), 130–135
Bruha, I. and Franek, F.: Comparison of various routines for unknown attribute value processing: Covering paradigm. International Journal Pattern Recognition and Artificial Intelligence, 10,8 (1996), 939–955
Bruha, I. and Kockova, S.: Quality of decision rules: Empirical and statistical approaches. Informatica, 17 (1993), 233–243
Bruha, I. and Kockova, S.: A support for decision making: Cost-sensitive learning system. Artificial Intelligence in Medicine, 6 (1994), 67–82
Cestnik, B.: Estimating probabilities: A crucial task in machine learning. ECAI-90 (1990)
Cestnik, B., Kononenko, I., Bratko, I.: Assistant 86: A knowledge-elicitation tool for sophisticated users. In: Bratko, I. and Lavrac, N. (eds.): Progress in machine learning. Proc. EWSL’87, Sigma Press (1987)
Clark, P. and Boswell, R.: Rule induction with CN2: Some recent improvements. EWSL’91, Porto (1991), 151–163
Clark, P. and Niblett, T.: The CN2 induction algorithm. Machine Learning, 3 (1989), 261–283
Fan, D.W., Chan, P.K., Stolfo, S.J.: A comparative evaluation of combiner and stacked generalization. Workshop Integrating Multiple Learning Models, AAAI, Portland (1996)
Kononenko, I. and Bratko, I.: Information-based evaluation criterion for classifier’s performance. Machine Learning, 6 (1991), 67–80
Quinlan, J.R.: Induction of decision trees. Machine Learning, 1 (1986), 81–106
Quinlan, J.R.: Unknown attribute values in ID3. International Conference ML (1989), 164–168
Quinlan, J.R.: C4.5 programs for machine learning. Morgan Kaufmann (1992)
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bruha, I. (2002). Unknown Attribute Values Processing by Meta-learner. In: Hacid, MS., Raś, Z.W., Zighed, D.A., Kodratoff, Y. (eds) Foundations of Intelligent Systems. ISMIS 2002. Lecture Notes in Computer Science, vol 2366. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48050-1_49
DOI: https://doi.org/10.1007/3-540-48050-1_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43785-7
Online ISBN: 978-3-540-48050-1
eBook Packages: Springer Book Archive
