Abstract
Using multiple classifiers to increase learning accuracy is an active research area. In this paper we present a new, general method for combining classifiers. The basic idea of Cascade Generalization is to run a set of classifiers sequentially, at each step extending the original data set with new attributes. The new attributes are derived from the class probability distribution produced by a base classifier. This constructive step extends the representational language of the higher-level classifiers, relaxing their bias. Cascade Generalization produces a single but structured model of the data that combines the model class representations of the base classifiers. We have performed an empirical evaluation of Cascade compositions of three well-known classifiers: Naive Bayes, Linear Discriminant, and C4.5. The composite models show an increase in performance, sometimes impressive, over the corresponding single models, with significant statistical confidence levels.
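A minimal sketch of the cascade step described above, assuming scikit-learn estimators as stand-ins for the paper's learners (GaussianNB for Naive Bayes, LinearDiscriminantAnalysis for the linear discriminant, and DecisionTreeClassifier in place of C4.5); the function names and the toy dataset are illustrative assumptions, not the paper's implementation.

```python
# Sketch of Cascade Generalization: each learner is fitted, and its predicted
# class probability distribution is appended to the data as new attributes
# before the next learner is trained.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.tree import DecisionTreeClassifier  # stand-in for C4.5


def cascade_fit(X, y, learners):
    """Fit each learner in turn, extending the data with its class probabilities."""
    fitted = []
    X_ext = X
    for learner in learners:
        learner.fit(X_ext, y)
        fitted.append(learner)
        # Constructive step: new attributes = predicted class distribution.
        X_ext = np.hstack([X_ext, learner.predict_proba(X_ext)])
    return fitted


def cascade_predict(X, fitted):
    """Replay the same attribute extensions; the last learner makes the decision."""
    X_ext = X
    for learner in fitted[:-1]:
        X_ext = np.hstack([X_ext, learner.predict_proba(X_ext)])
    return fitted[-1].predict(X_ext)


if __name__ == "__main__":
    # Toy usage on a standard dataset, for illustration only.
    from sklearn.datasets import load_iris
    X, y = load_iris(return_X_y=True)
    models = cascade_fit(X, y, [GaussianNB(),
                                LinearDiscriminantAnalysis(),
                                DecisionTreeClassifier()])
    print(cascade_predict(X[:5], models))
```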
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gama, J. (1998). Combining classifiers by constructive induction. In: Nédellec, C., Rouveirol, C. (eds) Machine Learning: ECML-98. ECML 1998. Lecture Notes in Computer Science, vol 1398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0026688
DOI: https://doi.org/10.1007/BFb0026688
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64417-0
Online ISBN: 978-3-540-69781-7
eBook Packages: Springer Book Archive