Abstract
In machine learning systems, especially in medical applications, clinical datasets usually contain high dimensional feature spaces with relatively few samples that lead to poor classifier performance. To overcome this problem, feature selection and ensemble classification are applied in order to improve accuracy and stability. This research presents an analysis of the effect of removing irrelevant and redundant features with ensemble classifiers using five datasets and compared with floating search method. Eliminating redundant features provides better accuracy and computational time than removing irrelevant features of the ensemble.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Bellman, R.E.: Adaptive Control Processes: A Guided Tour. Princeton University Press, Princeton (1961)
Liu, H., Dougherty, E., Dy, J., Torkkola, K., Tuv, E., Peng, H., Ding, C., Long, F., Berens, M., Parsons, L., Zhao, Z., Yu, L., Forman, G.: Evolving feature selection. IEEE Intelligent Systems 20(6), 64–76 (2005)
Liu, H., Yu, L.: Toward integrating feature selection algorithms for classification and clustering. IEEE Transactions on Knowledge and Data Engineering 17(4), 491–502 (2005)
Saeys, Y., Inza, I., Larranaga, P.: A review of feature selection techniques in bioinformatics. Bioinformatics 23(19), 2507–2517 (2007)
Windeatt, T.: Ensemble MLP Classifier Design. LNCS, vol. 137, pp. 133–147. Springer, Heidelberg (2008)
Windeatt, T.: Accuracy/diversity and ensemble MLP classifier design. IEEE Transactions on Neural Networks 17(5), 1194–1211 (2006)
Breiman, L.: Bagging predictors. Machine Learning 24(2), 123–140 (1996)
Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm. In: Proceedings of the Thirteenth International Conference on Machine Learning, pp. 148–156. Morgan Kaufmann, San Francisco (1996)
Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005)
Almuallim, H., Dietterich, T.G.: Learning with many irrelevant features. In: Proceedings of the Ninth National Conference on Artificial Intelligence, pp. 547–552. AAAI Press, Menlo Park (1991)
Hall, M.A.: Correlation-based feature selection for discrete and numeric class machine learning. In: Proceeding of the 17th International Conference on Machine Learning, pp. 359–366. Morgan Kaufmann, San Francisco (2000)
Yu, L., Liu, H.: Efficient feature selection via analysis of relevance and redundancy. J. Mach. Learn. Res. 5, 1205–1224 (2004)
Malarvili, M., Mesbah, M.: Hrv feature selection based on discriminant and redundancy analysis for neonatal seizure detection. In: 6th International Conference on Information, Communications and Signal Processing, p. 15 (2007)
Deisy, C., Subbulakshmi, B., Baskar, S., Ramaraj, N.: Efficient dimensionality reduction approaches for feature selection. In: International Conference on Computational Intelligence and Multimedia Applications, vol. 2, pp. 121–127 (2007)
Chou, T., Yen, K., Luo, J., Pissinou, N., Makki, K.: Correlation-based feature selection for intrusion detection design. In: IEEE on Military Communications Conference, MILCOM 2007, pp. 1–7 (2007)
Biesiada, J., Duch, W.: Feature Selection for High- Dimensional Data - A Pearson Redundancy Based Filter, vol. 45, pp. 242–249. Springer, Heidelberg (2008)
Biesiada, J., Duch, W.: A Kolmogorov-Smirnov Correlation-Based Filter for Microarray Data. In: Ishikawa, M., Doya, K., Miyamoto, H., Yamakawa, T. (eds.) ICONIP 2007, Part II. LNCS, vol. 4985, pp. 285–294. Springer, Heidelberg (2008)
Kudo, M., Sklansky, J.: Comparison of algorithms that select features for pattern classifiers. Pattern Recognition 33, 25–41 (2000)
Zhang, H., Sun, G.: Feature Selection using Tabu search. Pattern Recognition 35, 701–711 (2002)
Pudil, P., Novovicova, J., Kitler, J.: Floating Search Methods in Feature Selection. Pattern Recognition Letters 15, 1119–1125 (1994)
Asuncion, A., Newman, D.: UCI machine learning repository (2007), http://www.ics.uci.edu/mlearn/MLRepository.html
John, G., Kohavi, R., Pfleger, K.: Irrelevant features and the subset selection problem, pp. 121–129. Morgan Kaufmann, San Francisco (1994)
Polat, K., Gunes, S.: A hybrid approach to medical decision support systems: Combining feature selection, fuzzy weighted pre-processing and airs. Computer Methods and Program in Biomedicine 88(2), 164–174 (2007)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Duangsoithong, R., Windeatt, T. (2009). Relevance and Redundancy Analysis for Ensemble Classifiers. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2009. Lecture Notes in Computer Science(), vol 5632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03070-3_16
Download citation
DOI: https://doi.org/10.1007/978-3-642-03070-3_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03069-7
Online ISBN: 978-3-642-03070-3
eBook Packages: Computer ScienceComputer Science (R0)