Abstract
Recent research has demonstrated the benefits of using an ensemble of diverse and accurate base classifiers for classification problems. In this paper, the focus is on producing diverse ensembles with the aid of three feature selection heuristics based on two approaches: correlation-based and contextual merit-based. We have developed an algorithm and experimented with it to evaluate and compare the three feature selection heuristics on ten data sets from the UCI Repository. On average, the simple correlation-based ensemble achieves the highest accuracy. The contextual merit-based heuristics seem to include too many features in the initial ensembles, and the iterations were most successful with them.
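The abstract does not spell out the heuristics, so the following is only a minimal sketch of the general idea of correlation-guided ensemble feature selection, not the authors' algorithm. It assumes Pearson correlation between each feature and a numeric class label as the ranking score, random sampling from the top-ranked features to make the members differ, a 1-nearest-neighbour base classifier, and majority voting; all function names and parameters (`rank_features`, `build_ensemble`, `pool_size`, `subset_size`) are hypothetical.

```python
import math
import random
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rank_features(X, y):
    """Return feature indices ordered by |correlation with the class|, strongest first."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def build_ensemble(X, y, n_members, pool_size, subset_size, seed=0):
    """Assumed diversity scheme: each member gets a random subset of the top-ranked features."""
    rng = random.Random(seed)
    pool = rank_features(X, y)[:pool_size]
    return [sorted(rng.sample(pool, subset_size)) for _ in range(n_members)]


def predict_1nn(X, y, features, query):
    """1-nearest-neighbour prediction restricted to the given feature subset."""
    def dist(row):
        return sum((row[j] - query[j]) ** 2 for j in features)
    nearest = min(range(len(X)), key=lambda i: dist(X[i]))
    return y[nearest]


def predict_ensemble(X, y, ensemble, query):
    """Combine the members' predictions by simple majority vote."""
    votes = [predict_1nn(X, y, features, query) for features in ensemble]
    return Counter(votes).most_common(1)[0][0]
```

A call such as `build_ensemble(X, y, n_members=3, pool_size=2, subset_size=1)` returns one list of feature indices per member, which `predict_ensemble` then uses to classify a query instance. Restricting the sampling to a correlation-ranked pool is one simple way to trade off member accuracy against diversity; the paper's iterative refinement of the initial ensembles is not modelled here.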
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Puuronen, S., Tsymbal, A., Skrypnyk, I. (2001). Correlation-Based and Contextual Merit-Based Ensemble Feature Selection. In: Hoffmann, F., Hand, D.J., Adams, N., Fisher, D., Guimaraes, G. (eds) Advances in Intelligent Data Analysis. IDA 2001. Lecture Notes in Computer Science, vol 2189. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44816-0_14
DOI: https://doi.org/10.1007/3-540-44816-0_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42581-6
Online ISBN: 978-3-540-44816-7
eBook Packages: Springer Book Archive